[Bug]: Issues observed in Maven to Gradle conversion example #30992

@rajkgupt

What happened?

Below are a few issues and observations:

  1. I ran the example from [2], and the BOM does not appear to be included in the generated build.gradle.kts file [6].
    Please clarify or fix the following two points.

    • a. Is it intentional to omit the BOM dependency, such as the <libraries-bom.version>26.32.0</libraries-bom.version> property found in the corresponding pom.xml file?
    • b. The generated Gradle file only has a dependency on org.apache.beam:beam-runners-google-cloud-dataflow-java:2.55.1, but [1] shows that two others, beam-sdks-java-io-google-cloud-platform and beam-sdks-java-core, are also required. I was able to run the pipeline without these two dependencies, but could you confirm whether they are needed?
  2. The sample Dataflow runner command in [2] needs to be updated to add --runner=DataflowRunner inside --args.

  3. After following the steps on the Apache Beam site [3] and running command [4], I get the error: Exception in thread "main" java.lang.IllegalArgumentException: Unknown 'runner' specified 'DataflowRunner', supported pipeline runners [DirectRunner, PortableRunner, TestUniversalRunner]

    I then ran the Maven package command with the dataflow-runner profile [5], and it worked.

    Also, the output option needs to be added to the command, for example --output=gs://<YOUR_GCS_BUCKET>/output

[1] https://cloud.google.com/dataflow/docs/guides/manage-dependencies#java-management
[2] https://beam.apache.org/get-started/quickstart-java/#optional-convert-from-maven-to-gradle
[3] https://beam.apache.org/documentation/runners/dataflow/
[4] mvn package
[5] mvn package -Pdataflow-runner

[6] Generated Gradle file contents
/*
 * This file was generated by the Gradle 'init' task.
 */
plugins {
    `java-library`
    `maven-publish`
}

repositories {
    mavenCentral()
    maven {
        url = uri("https://repository.apache.org/content/repositories/snapshots/")
    }

    maven {
        url = uri("https://repo.maven.apache.org/maven2/")
    }
    maven {
        url = uri("https://packages.confluent.io/maven/")
    }
}

dependencies {
    api("org.apache.beam:beam-sdks-java-core:2.55.1")
    api("org.apache.beam:beam-sdks-java-io-google-cloud-platform:2.55.1")
    api("org.apache.beam:beam-sdks-java-extensions-python:2.55.1")
    api("org.apache.beam:beam-sdks-java-io-kafka:2.55.1")
    api("com.google.api-client:google-api-client:2.0.0")
    api("com.google.apis:google-api-services-bigquery:v2-rev20240124-2.0.0")
    api("com.google.http-client:google-http-client:1.43.3")
    api("com.google.apis:google-api-services-pubsub:v1-rev20220904-2.0.0")
    api("joda-time:joda-time:2.10.10")
    api("org.apache.kafka:kafka-clients:2.4.1")
    api("org.slf4j:slf4j-api:1.7.30")
    api("org.hamcrest:hamcrest-core:2.1")
    api("org.hamcrest:hamcrest-library:2.1")
    api("junit:junit:4.13.1")
    runtimeOnly("org.slf4j:slf4j-jdk14:1.7.30")
    runtimeOnly("org.apache.beam:beam-runners-direct-java:2.55.1")
    runtimeOnly("org.apache.beam:beam-runners-portability-java:2.55.1")
    testImplementation("org.mockito:mockito-core:3.7.7")
}

group = "org.example"
version = "0.1"
description = "word-count-beam"
java.sourceCompatibility = JavaVersion.VERSION_1_8

publishing {
    publications.create<MavenPublication>("maven") {
        from(components["java"])
    }
}
if (project.hasProperty("dataflow-runner")) {
    dependencies {
        runtimeOnly("org.apache.beam:beam-runners-google-cloud-dataflow-java:2.55.1")
    }
}
task("execute", JavaExec::class) {
    classpath = sourceSets["main"].runtimeClasspath
    mainClass.set(System.getProperty("mainClass"))
}
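
For points 1a and 1b, here is a minimal sketch of how the BOM could be declared in the Kotlin DSL. It is not part of the generated file above. It assumes the BOM referenced by <libraries-bom.version> in pom.xml is com.google.cloud:libraries-bom, and that importing it through Gradle's platform() is an acceptable equivalent of the Maven import. Whether the conversion should emit this is exactly what the issue asks.

```kotlin
// Sketch only: a possible dependencies block, not output of `gradle init`.
dependencies {
    // Import the BOM as a Gradle platform so that its version constraints
    // take part in dependency resolution.
    // Assumption: 26.32.0 is taken from <libraries-bom.version> in pom.xml.
    api(platform("com.google.cloud:libraries-bom:26.32.0"))

    // The two SDK dependencies named in point 1b; versions match the generated file.
    api("org.apache.beam:beam-sdks-java-core:2.55.1")
    api("org.apache.beam:beam-sdks-java-io-google-cloud-platform:2.55.1")

    // Dataflow runner, declared unconditionally here for illustration
    // (the generated file adds it only when -Pdataflow-runner is passed).
    runtimeOnly("org.apache.beam:beam-runners-google-cloud-dataflow-java:2.55.1")
}
```

A non-enforced platform() only recommends versions, so explicitly declared versions can still win during resolution. Using enforcedPlatform() instead would be a stricter design choice.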

Issue Priority

Priority: 3 (minor)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
