@youngoli
Contributor

youngoli commented Jun 14, 2022

The expansion service now reads the Project pipeline option so that during expansion it can perform GCP operations on the correct project. Also un-sickbay a test that was blocked on this issue.

Bug: #21761



@youngoli
Contributor Author

Run XVR_GoUsingJava_Dataflow PostCommit

@youngoli
Contributor Author

R: @chamikaramj

@github-actions
Contributor

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

@youngoli
Contributor Author

It seems the best approach would be to read this extra pipeline option only when using the GCP expansion service, but I'm not sure there's a straightforward way to alter the behavior for just that expansion service.

youngoli requested a review from chamikaramj June 14, 2022 19:32
@youngoli
Contributor Author

I've changed it so that now GcpExpansionService is a derived class of the original ExpansionService, and only that class receives the project pipeline option. So only the GCP expansion service jar has grown.
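
The subclass arrangement described above might look roughly like the following. This is a self-contained sketch, not the code in this PR: `BaseExpansionServiceSketch`, `GcpExpansionServiceSketch`, the `configureOptions` hook, and the plain string maps standing in for Beam's pipeline options are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the base expansion service. It knows nothing
// about GCP; it only exposes a hook that subclasses can override.
class BaseExpansionServiceSketch {
  // Builds the options used during expansion from the options in the request.
  final Map<String, String> buildExpansionOptions(Map<String, String> requestOptions) {
    Map<String, String> expansionOptions = new HashMap<>();
    configureOptions(requestOptions, expansionOptions);
    return expansionOptions;
  }

  // Hook for subclasses; the base service copies nothing by default.
  protected void configureOptions(
      Map<String, String> requestOptions, Map<String, String> expansionOptions) {}
}

// Hypothetical GCP-specific service: the only place that reads "project".
class GcpExpansionServiceSketch extends BaseExpansionServiceSketch {
  @Override
  protected void configureOptions(
      Map<String, String> requestOptions, Map<String, String> expansionOptions) {
    super.configureOptions(requestOptions, expansionOptions);
    String project = requestOptions.get("project");
    if (project != null) {
      // Forward the project so GCP calls made during expansion can use it.
      expansionOptions.put("project", project);
    }
  }
}
```

Keeping the hook in the base class and the project handling in the subclass is what limits the extra behavior to the GCP expansion service.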

@youngoli
Contributor Author

Run Go PostCommit

@youngoli
Contributor Author

Run XVR_GoUsingJava_Dataflow PostCommit

@youngoli
Contributor Author

Looks like the test is still failing, which means my local environment isn't replicating the bug. I'll look into this tomorrow.

damccorm mentioned this pull request Jun 16, 2022
@youngoli
Contributor Author

Run XVR_GoUsingJava_Dataflow PostCommit

@youngoli
Contributor Author

Run XVR_GoUsingJava_Dataflow PostCommit

youngoli added 3 commits June 22, 2022 13:56
The expansion service now reads the Project pipeline option so that during expansion it can perform GCP operations on the correct project. Also un-sickbay a test that was blocked on this issue.
This makes some changes to ExpansionService.java to allow performing additional configuration in subclasses.
@youngoli
Contributor Author

Run XVR_GoUsingJava_Dataflow PostCommit

@lostluck
Contributor

lostluck commented Jul 3, 2022

Run XVR_GoUsingJava_Dataflow PostCommit

@lostluck
Contributor

Run XVR_GoUsingJava_Dataflow PostCommit

@lostluck
Contributor

It looks like expansion still fails to get the project ID, though the cause may just be how the Go SDK provides it in the pipeline options (or fails to).

14:35:37 Caused by: java.lang.NullPointerException: Required parameter projectId must be specified.

14:35:37 --- FAIL: TestBigQueryIO_BasicWriteQueryRead (313.80s)
14:35:37 panic: 	tried cross-language for beam:transform:org.apache.beam:schemaio_bigquery_read:v1 against localhost:37161 and failed
14:35:37 	expanding external transform
14:35:37 	expanding transform with ExpansionRequest: components:{environments:{key:"go"  value:{}}}  transform:{unique_name:"External"  spec:{urn:"beam:transform:org.apache.beam:schemaio_bigquery_read:v1"  payload:"\nX\n\x0e\n\x08location\x1a\x02\x10\x07\n\x0c\n\x06config\x1a\x02\x10\t\n\x12\n\ndataSchema\x1a\x04\x08\x01\x10\t\x12$3fc24beb-ef0b-4fd0-b491-64340b33ca1e\x12k\x03\x01\x04\x00f\x04\x01\rbSELECT * FROM `apache-beam-testing.beam_bigquery_io_test_temp.go_bqio_it_temp_1660164801805738010`"}  environment_id:"go"}  namespace:"VkgVLcucsq"
14:35:37 expansion failed
14:35:37 	caused by:
14:35:37 org.apache.beam.sdk.io.gcp.bigquery.BigQuerySchemaRetrievalException: Exception while trying to retrieve schema of query
14:35:37 	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryQuerySourceDef.getBeamSchema(BigQueryQuerySourceDef.java:179)
14:35:37 	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO$TypedRead.expand(BigQueryIO.java:1226)
14:35:37 	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO$TypedRead.expand(BigQueryIO.java:747)
14:35:37 	at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:548)
14:35:37 	at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:482)
14:35:37 	at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:44)
14:35:37 	at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySchemaIOProvider$BigQuerySchemaIO$1.expand(BigQuerySchemaIOProvider.java:183)
14:35:37 	at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySchemaIOProvider$BigQuerySchemaIO$1.expand(BigQuerySchemaIOProvider.java:165)
14:35:37 	at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:548)
14:35:37 	at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:499)
14:35:37 	at org.apache.beam.sdk.expansion.service.ExpansionService$TransformProvider.apply(ExpansionService.java:396)
14:35:37 	at org.apache.beam.sdk.expansion.service.ExpansionService.expand(ExpansionService.java:516)
14:35:37 	at org.apache.beam.sdk.expansion.service.ExpansionService.expand(ExpansionService.java:603)
14:35:37 	at org.apache.beam.model.expansion.v1.ExpansionServiceGrpc$MethodHandlers.invoke(ExpansionServiceGrpc.java:220)
14:35:37 	at org.apache.beam.vendor.grpc.v1p48p1.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:182)
14:35:37 	at org.apache.beam.vendor.grpc.v1p48p1.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:354)
14:35:37 	at org.apache.beam.vendor.grpc.v1p48p1.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:866)
14:35:37 	at org.apache.beam.vendor.grpc.v1p48p1.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
14:35:37 	at org.apache.beam.vendor.grpc.v1p48p1.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
14:35:37 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
14:35:37 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
14:35:37 	at java.lang.Thread.run(Thread.java:750)
14:35:37 Caused by: java.lang.NullPointerException: Required parameter projectId must be specified.
14:35:37 	at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:907)
14:35:37 	at com.google.api.client.util.Preconditions.checkNotNull(Preconditions.java:138)
14:35:37 	at com.google.api.services.bigquery.Bigquery$Jobs$Insert.<init>(Bigquery.java:1737)
14:35:37 	at com.google.api.services.bigquery.Bigquery$Jobs.insert(Bigquery.java:1687)
14:35:37 	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.dryRunQuery(BigQueryServicesImpl.java:473)
14:35:37 	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryQueryHelper.dryRunQueryIfNeeded(BigQueryQueryHelper.java:73)
14:35:37 	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryQuerySourceDef.getBeamSchema(BigQueryQuerySourceDef.java:168)
14:35:37 	... 21 more
14:35:37  [recovered]
14:35:37 	panic: 	tried cross-language for beam:transform:org.apache.beam:schemaio_bigquery_read:v1 against localhost:37161 and failed
14:35:37 	expanding external transform
14:35:37 	expanding transform with ExpansionRequest: components:{environments:{key:"go"  value:{}}}  transform:{unique_name:"External"  spec:{urn:"beam:transform:org.apache.beam:schemaio_bigquery_read:v1"  payload:"\nX\n\x0e\n\x08location\x1a\x02\x10\x07\n\x0c\n\x06config\x1a\x02\x10\t\n\x12\n\ndataSchema\x1a\x04\x08\x01\x10\t\x12$3fc24beb-ef0b-4fd0-b491-64340b33ca1e\x12k\x03\x01\x04\x00f\x04\x01\rbSELECT * FROM `apache-beam-testing.beam_bigquery_io_test_temp.go_bqio_it_temp_1660164801805738010`"}  environment_id:"go"}  namespace:"VkgVLcucsq"
14:35:37 expansion failed
14:35:37 	caused by:
14:35:37 org.apache.beam.sdk.io.gcp.bigquery.BigQuerySchemaRetrievalException: Exception while trying to retrieve schema of query
14:35:37 	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryQuerySourceDef.getBeamSchema(BigQueryQuerySourceDef.java:179)
14:35:37 	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO$TypedRead.expand(BigQueryIO.java:1226)
14:35:37 	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO$TypedRead.expand(BigQueryIO.java:747)
14:35:37 	at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:548)
14:35:37 	at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:482)
14:35:37 	at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:44)
14:35:37 	at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySchemaIOProvider$BigQuerySchemaIO$1.expand(BigQuerySchemaIOProvider.java:183)
14:35:37 	at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySchemaIOProvider$BigQuerySchemaIO$1.expand(BigQuerySchemaIOProvider.java:165)
14:35:37 	at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:548)
14:35:37 	at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:499)
14:35:37 	at org.apache.beam.sdk.expansion.service.ExpansionService$TransformProvider.apply(ExpansionService.java:396)
14:35:37 	at org.apache.beam.sdk.expansion.service.ExpansionService.expand(ExpansionService.java:516)
14:35:37 	at org.apache.beam.sdk.expansion.service.ExpansionService.expand(ExpansionService.java:603)
14:35:37 	at org.apache.beam.model.expansion.v1.ExpansionServiceGrpc$MethodHandlers.invoke(ExpansionServiceGrpc.java:220)
14:35:37 	at org.apache.beam.vendor.grpc.v1p48p1.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:182)
14:35:37 	at org.apache.beam.vendor.grpc.v1p48p1.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:354)
14:35:37 	at org.apache.beam.vendor.grpc.v1p48p1.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:866)
14:35:37 	at org.apache.beam.vendor.grpc.v1p48p1.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
14:35:37 	at org.apache.beam.vendor.grpc.v1p48p1.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
14:35:37 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
14:35:37 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
14:35:37 	at java.lang.Thread.run(Thread.java:750)
14:35:37 Caused by: java.lang.NullPointerException: Required parameter projectId must be specified.
14:35:37 	at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:907)
14:35:37 	at com.google.api.client.util.Preconditions.checkNotNull(Preconditions.java:138)
14:35:37 	at com.google.api.services.bigquery.Bigquery$Jobs$Insert.<init>(Bigquery.java:1737)
14:35:37 	at com.google.api.services.bigquery.Bigquery$Jobs.insert(Bigquery.java:1687)
14:35:37 	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.dryRunQuery(BigQueryServicesImpl.java:473)
14:35:37 	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryQueryHelper.dryRunQueryIfNeeded(BigQueryQueryHelper.java:73)
14:35:37 	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryQuerySourceDef.getBeamSchema(BigQueryQuerySourceDef.java:168)
14:35:37 	... 21 more
14:35:37 
14:35:37 
14:35:37 goroutine 646 [running]:
14:35:37 testing.tRunner.func1.2({0xeaf720, 0xc000716e40})
14:35:37 	/home/jenkins/sdk/go1.18.1/src/testing/testing.go:1389 +0x24e
14:35:37 testing.tRunner.func1()
14:35:37 	/home/jenkins/sdk/go1.18.1/src/testing/testing.go:1392 +0x39f
14:35:37 panic({0xeaf720, 0xc000716e40})
14:35:37 	/home/jenkins/sdk/go1.18.1/src/runtime/panic.go:838 +0x207
14:35:37 github.com/apache/beam/sdks/v2/go/pkg/beam.CrossLanguage({0xc0004801a0?, 0xc0002e20a0?}, {0x104e6c0, 0x38}, {0xc00020a820?, 0xc0000c7d30?, 0x1?}, {0xc000040b80, 0xf}, 0x0, ...)
14:35:37 	/home/jenkins/jenkins-slave/workspace/beam_PostCommit_XVR_GoUsingJava_Dataflow_PR/src/sdks/go/pkg/beam/xlang.go:162 +0x136
14:35:37 github.com/apache/beam/sdks/v2/go/pkg/beam/io/xlang/bigqueryio.Read({0xc000480120?, 0xc0002e20a0?}, {0x11bd640, 0xea8660}, {0xc0000c7e98, 0x2, 0xc000010458?})
14:35:37 	/home/jenkins/jenkins-slave/workspace/beam_PostCommit_XVR_GoUsingJava_Dataflow_PR/src/sdks/go/pkg/beam/io/xlang/bigqueryio/bigquery.go:158 +0x3b7
14:35:37 github.com/apache/beam/sdks/v2/go/test/integration/io/xlang/bigquery.ReadFromQueryPipeline({0xc000040b80, 0xf}, {0xc000118c60, 0x52}, {0xe007c0, 0xc000cc4540})
14:35:37 	/home/jenkins/jenkins-slave/workspace/beam_PostCommit_XVR_GoUsingJava_Dataflow_PR/src/sdks/go/test/integration/io/xlang/bigquery/bigquery_test.go:191 +0x2d4
14:35:37 github.com/apache/beam/sdks/v2/go/test/integration/io/xlang/bigquery.TestBigQueryIO_BasicWriteQueryRead(0xc0008b21a0)
14:35:37 	/home/jenkins/jenkins-slave/workspace/beam_PostCommit_XVR_GoUsingJava_Dataflow_PR/src/sdks/go/test/integration/io/xlang/bigquery/bigquery_test.go:246 +0x225
14:35:37 testing.tRunner(0xc0008b21a0, 0x107cfa0)
14:35:37 	/home/jenkins/sdk/go1.18.1/src/testing/testing.go:1439 +0x102
14:35:37 created by testing.(*T).Run
14:35:37 	/home/jenkins/sdk/go1.18.1/src/testing/testing.go:1486 +0x35f
14:35:37 FAIL	github.com/apache/beam/sdks/v2/go/test/integration/io/xlang/bigquery	979.320s
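
The root cause in the trace above is a null project ID reaching a not-null precondition. Below is a minimal sketch of that failure mode, along with a check that could be run earlier. The `ProjectOptionCheckSketch` class, its method names, and the plain string map standing in for pipeline options are hypothetical; `java.util.Objects.requireNonNull` is used here in place of the `Preconditions.checkNotNull` call seen in the trace.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical helper, not part of Beam or of this PR.
final class ProjectOptionCheckSketch {
  private ProjectOptionCheckSketch() {}

  // Mimics the precondition in the trace: a null project ID becomes a
  // NullPointerException carrying the same message.
  static String requireProjectId(String projectId) {
    return Objects.requireNonNull(
        projectId, "Required parameter projectId must be specified.");
  }

  // Reports whether the options map (a stand-in for pipeline options)
  // carries a non-empty "project" entry.
  static boolean hasProject(Map<String, String> options) {
    String project = options.get("project");
    return project != null && !project.isEmpty();
  }
}
```

Calling something like `hasProject` on the options received in the expansion request would show whether the project is missing before expansion starts, rather than deep inside the BigQuery dry run.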

@chamikaramj
Contributor

@lostluck, should someone take this over from @youngoli?

@github-actions
Contributor

github-actions bot commented Mar 6, 2023

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@beam.apache.org list. Thank you for your contributions.

github-actions bot added the stale label Mar 6, 2023
@github-actions
Contributor

This pull request has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

github-actions bot closed this Mar 14, 2023
Development

Successfully merging this pull request may close these issues.

[Feature Request]: Allow project to be passed to expansion services.
