
[Failing Test]: SparkVersionTest 3.4.0+ fails on Java11 #32207

@Abacn

Description


What happened?

After migrating to Java 11 for building and testing Beam (#31677), the PostCommit Spark Versions workflow started to fail.

https://github.com/apache/beam/actions/workflows/beam_PreCommit_Java_Spark3_Versions.yml?query=event%3Aschedule

Older Spark versions still pass, but the Spark 3.4.0 test hangs indefinitely.

The executor logs show the following error:

org.apache.beam.runners.spark.SparkPipelineStateTest STANDARD_ERROR
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/Users/.../.gradle/caches/modules-2/files-2.1/org.slf4j/slf4j-log4j12/1.7.30/c21f55139d8141d2231214fb1feaf50a1edca95e/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/Users/.../.gradle/caches/modules-2/files-2.1/org.slf4j/slf4j-simple/1.7.30/e606eac955f55ecf1d8edcccba04eb8ac98088dd/slf4j-simple-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

org.apache.beam.runners.spark.SparkPipelineStateTest > testStreamingPipelineRunningState STANDARD_ERROR
    Exception in thread "Executor task launch worker for task 0.0 in stage 0.0 (TID 0)" java.lang.NoClassDefFoundError: Could not initialize class org.slf4j.MDC
        at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$cleanMDCForTask(Executor.scala:838)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:763)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
    Exception in thread "Executor task launch worker-1" java.lang.NoClassDefFoundError: Could not initialize class org.slf4j.MDC
        at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$setMDCForTask(Executor.scala:829)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:488)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)

This is likely because Spark 3.4.0 upgraded to SLF4J 2, which Beam does not support; see https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.12/3.4.0
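
SLF4J 1.x locates its backend through org.slf4j.impl.StaticLoggerBinder, whereas SLF4J 2.x switched to a ServiceLoader-based org.slf4j.spi.SLF4JServiceProvider SPI, so a 2.x slf4j-api paired with 1.x-only bindings can leave org.slf4j.MDC unable to initialize, which matches the NoClassDefFoundError above. A minimal Kotlin classpath probe along these lines (a sketch, not part of Beam's tests) can confirm which SLF4J generation each test JVM actually sees:

    // Classpath probe (sketch, not part of Beam): reports which SLF4J
    // generation is visible and where each class was loaded from.
    fun main() {
        val probes = listOf(
            "org.slf4j.impl.StaticLoggerBinder",  // present only with 1.x bindings
            "org.slf4j.spi.SLF4JServiceProvider", // present only with slf4j-api 2.x
            "org.slf4j.MDC",                      // the class failing to initialize above
        )
        for (name in probes) {
            runCatching { Class.forName(name) }
                .onSuccess { println("$name -> ${it.protectionDomain.codeSource?.location}") }
                .onFailure { println("$name -> $it") }
        }
    }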

However, given that this passes on Java 8, the dependency-management tweaks that enable testing against different Spark versions may not be working properly on newer Java versions.
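
For reference, pinning of that kind usually looks roughly like the following in Gradle. This is a minimal sketch in the Kotlin DSL, assuming the intended fix is to keep a single SLF4J generation on the test classpath; it is not Beam's actual build configuration, and the versions shown are placeholders:

    // build.gradle.kts (sketch only; versions are placeholders, not Beam's real build)
    configurations.all {
        resolutionStrategy {
            // Keep the API and the binding in the same SLF4J generation.
            force("org.slf4j:slf4j-api:1.7.30")
            force("org.slf4j:slf4j-log4j12:1.7.30")
        }
        // Avoid the "multiple SLF4J bindings" warning by dropping the extra binding.
        exclude(group = "org.slf4j", module = "slf4j-simple")
    }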

Issue Failure

Failure: Test is continually failing

Issue Priority

Priority: 2 (backlog / disabled test but we think the product is healthy)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
