Skip to content

[Bug]: Can not use Avro over 1.8.2 with Beam 2.52.0 #29413

@bvolpato

Description

@bvolpato

What happened?

After #27851, user code that depends on versions newer than Avro 1.8.2 are having problems running on Dataflow.

For example in https://github.com/GoogleCloudPlatform/DataflowTemplates, where we moved on to Avro 1.11.3, there were incompatibility errors:

Caused by: java.io.InvalidClassException: org.apache.avro.specific.SpecificRecordBase; local class incompatible: stream classdesc serialVersionUID = -1463700717714793795, local class serialVersionUID = 189988654766568477

and

Caused by: java.lang.NoSuchMethodError: 'boolean org.apache.avro.generic.GenericRecord.hasField(java.lang.String)' com.google.cloud.teleport.v2.transforms.FormatDatastreamRecordToJson.getMetadataIsDeleted(FormatDatastreamRecordToJson.java:258) com.google.cloud.teleport.v2.transforms.FormatDatastreamRecordToJson.apply(FormatDatastreamRecordToJson.java:123) com.google.cloud.teleport.v2.transforms.FormatDatastreamRecordToJson.apply(FormatDatastreamRecordToJson.java:51) org.apache.beam.sdk.extensions.avro.io.AvroSource$AvroBlock.readNextRecord(AvroSource.java:610)

The root cause is that Avro classes are now being shipped along with the /opt/apache/beam/jars/beam-sdks-java-harness.jar, which wasn't the case before.

Tried to relocate in #29407 but got some test failures.

Next step is marking Avro as provided in that JAR, since it's apparently not used.

Issue Priority

Priority: 1 (data loss / total loss of function)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions