[prism] Preprocess failure - Expected Runner Flatten Node - but wasn't #31992

@lostluck

Six tests in the Java ValidatesRunner SplittableDoFnTest suite are failing with:

java.lang.RuntimeException: The Runner experienced the following error during execution:
jobFailed job-188[splittabledofntest0testwindowedsideinputwithcheckpointsbounded-damondouglas-0709161343-776ac8a0]: preprocess validation failure of stage 8: expected runner flatten node, but wasn't: [eParDo-SDFWithMultipleOutputsPerBlockAndSideInputBounded--ParMultiDo-SDFWithMultipleOutputsPerBlockAn_processandsplit] -- map[nParDo-SDFWithMultipleOutputsPerBlockAndSideInputBounded--ParMultiDo-SDFWithMultipleOutputsPerBlockAn_splitnsized:nParDo-SDFWithMultipleOutputsPerBlockAndSideInputBounded--ParMultiDo-SDFWithMultipleOutputsPerBlockAn_splitnsized singleton/Combine.GloballyAsSingletonView/CombineValues/Values/Values/Map/ParMultiDo(Anonymous).output:singleton/Combine.GloballyAsSingletonView/CombineValues/Values/Values/Map/ParMultiDo(Anonymous).output]

The message "expected runner flatten node, but wasn't" indicates that the stage was somehow fused with multiple parallel inputs, which isn't permitted.

Error is from here:
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/runners/prism/internal/preprocess.go#L484

An initial change should make the error state what the problem actually is: "stage requires multiple parallel inputs". The only stage permitted to have multiple parallel inputs is a runner-side Flatten. The error should also label what's being printed and include explicit counts for each, so the message can be read unambiguously (e.g. "stage has %v main inputs and isn't a runner side flatten. transforms %d %v, inputs %v" or similar).
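As a rough illustration of the clearer error proposed above, here is a minimal sketch. It is not the prism code: `stageInfo`, its fields, and `checkParallelInputs` are hypothetical names invented for this example, and the message wording is only one possible reading of the suggestion.

```go
package main

import (
	"fmt"
	"sort"
)

// stageInfo is a hypothetical stand-in for prism's internal stage type.
// All field names here are assumptions made for illustration.
type stageInfo struct {
	ID              int
	IsRunnerFlatten bool
	Transforms      []string          // transform IDs fused into the stage
	MainInputs      map[string]string // parallel inputs: local name -> PCollection ID
}

// checkParallelInputs returns a descriptive error when a stage that is not
// a runner side flatten has more than one parallel (main) input.
func checkParallelInputs(s stageInfo) error {
	if s.IsRunnerFlatten || len(s.MainInputs) <= 1 {
		return nil
	}
	// Sort the PCollection IDs so the message is stable between runs.
	inputs := make([]string, 0, len(s.MainInputs))
	for _, pcol := range s.MainInputs {
		inputs = append(inputs, pcol)
	}
	sort.Strings(inputs)
	return fmt.Errorf(
		"stage %d requires multiple parallel inputs: stage has %d main inputs and isn't a runner side flatten. transforms %d %v, inputs %v",
		s.ID, len(inputs), len(s.Transforms), s.Transforms, inputs)
}

func main() {
	// Shortened, made-up IDs loosely modelled on the failing stage.
	s := stageInfo{
		ID:         8,
		Transforms: []string{"e_processandsplit"},
		MainInputs: map[string]string{
			"n_splitnsized": "n_splitnsized",
			"side.output":   "side.output",
		},
	}
	fmt.Println(checkParallelInputs(s))
}
```

Printing the counts next to each list makes it possible to tell at a glance which bracketed group is the transforms and which is the inputs.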

Iterate against a locally running prism instance:

TEST=org.apache.beam.sdk.transforms.SplittableDoFnTest 
./gradlew :runners:portability:java:ulrLoopbackValidatesRunnerTests -PjobEndpoint=localhost:8073 --tests="$TEST"

There are six failing tests.


Offhand, it appears that after the SplittableDoFn expansion there's a single transform in the stage, but fusion has somehow led to multiple inputs for the stage.

All of the tests also use side inputs, so the problem is likely in the fusion code, in how it handles side inputs for the synthetic SDF components: https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/runners/prism/internal/preprocess.go#L525
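To make the suspicion concrete, the sketch below shows the distinction the fusion step has to preserve: a transform's side inputs must not be counted among its parallel (main) inputs. This is an assumption-laden illustration, not the prism implementation and not a confirmed fix; `transformInfo` and `parallelInputs` are hypothetical names, and the only Beam-specific idea borrowed is that side inputs are identified by the transform's local input names.

```go
package main

import (
	"fmt"
	"sort"
)

// transformInfo is a hypothetical, simplified view of a ParDo-like transform.
type transformInfo struct {
	Inputs     map[string]string // local input name -> PCollection ID
	SideInputs map[string]bool   // local input names declared as side inputs
}

// parallelInputs returns the PCollection IDs the transform consumes as
// main inputs, skipping every input that is declared as a side input.
func parallelInputs(t transformInfo) []string {
	var mains []string
	for local, pcol := range t.Inputs {
		if t.SideInputs[local] {
			continue // side inputs are not parallel inputs
		}
		mains = append(mains, pcol)
	}
	sort.Strings(mains) // stable order for messages and comparisons
	return mains
}

func main() {
	// Made-up names loosely modelled on the failing stage: one main input
	// from the split-and-size step, plus one singleton side input.
	processAndSplit := transformInfo{
		Inputs: map[string]string{
			"main":      "n_splitnsized",
			"singleton": "singleton.output",
		},
		SideInputs: map[string]bool{"singleton": true},
	}
	fmt.Println(parallelInputs(processAndSplit))
}
```

If the synthetic SDF components lose the side input declaration, both inputs in this example would be treated as parallel inputs, which would match the two-entry map in the error above.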
