[Feature Request][prism]: Add support for the multi-chunk iterable protocol #27762

@lostluck

What would you like to happen?

Prism's goal is to simplify testing all the bits of Beam, and it has its first real use for this!

#23043 revealed that the Spark runner uses the "multi-chunk" iterable protocol, which isn't commonly used outside of the state-backed iterable path.

https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/org/apache/beam/model/pipeline/v1/beam_runner_api.proto#L859

This protocol wasn't properly tested against the singleIterate path in the Go SDK. When the iterator was left undrained, the reader ended up in the wrong position for the next element, which caused the bug.

Prism should implement the multi-chunk iterable behavior and provide a way to configure it through a passed-in pipeline option.

Validation will come from tests in the Go SDK's exec package, confirming that those paths are covered.

Issue Priority

Priority: 2 (default / most feature requests should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
