
[Bug]: [Java BQ FILE_LOADS] When streaming to dynamic destinations with copy jobs and CREATE_IF_NEEDED, only the first destination's table is created #28309

@ahmedabu98


What happened?

While testing FILE_LOADS streaming writes, I found that when dynamic destinations are set, copy jobs are used (i.e. large data), and CREATE_IF_NEEDED is set, only the first table is created. For example, if I'm writing to two tables A and B, which table gets created depends on a race: whichever copy job is seen first in the pipeline. If the copy job to table A is performed first, then table A is created and all subsequent copy jobs to table B fail with an error similar to the following:

WARNING: Load job beam_bq_job_COPY_testpipelineahmedabualsaud0905154941c26eb941_17a94e2694554455aa31cca9f9389b49_4e2479b4160d04b56a8075645f4974e1_00003-0 failed, will retry: {
  "errorResult" : {
    "message" : "Not found: Table <project>:<dataset>.mytable_B",
    "reason" : "notFound"
  },
  "errors" : [ {
    "message" : "Not found: Table <project>:<dataset>.mytable_B",
    "reason" : "notFound"
  } ],
  "state" : "DONE"
}. Next job id beam_bq_job_COPY_testpipelineahmedabualsaud0905154941c26eb941_17a94e2694554455aa31cca9f9389b49_4e2479b4160d04b56a8075645f4974e1_00003-1

What we would expect instead is for all tables to be created.
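
To make the reported and expected behavior concrete, here is a minimal toy model in plain Java. It is not Beam's code and makes no claim about the actual root cause: `FakeBigQuery`, `runReported`, and `runExpected` are hypothetical names, and the single `createdOnce` flag is only an assumption used to reproduce the symptom (one table created, later copy jobs to other destinations failing with "Not found").

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the symptom in this issue; NOT Beam's actual implementation. */
public class CopyJobCreateModel {

  /** Hypothetical stand-in for BigQuery: a copy job fails if its destination table is missing. */
  static final class FakeBigQuery {
    final Set<String> tables = new HashSet<>();
    final List<String> failures = new ArrayList<>();

    void createTable(String table) {
      tables.add(table);
    }

    void copyInto(String table) {
      if (!tables.contains(table)) {
        failures.add("Not found: Table " + table);
      }
    }
  }

  /**
   * Models the reported behavior: only the destination of the first copy job
   * seen is created. The single flag is an assumption made for illustration.
   */
  static FakeBigQuery runReported(List<String> copyJobDestinations) {
    FakeBigQuery bq = new FakeBigQuery();
    boolean createdOnce = false;
    for (String destination : copyJobDestinations) {
      if (!createdOnce) {
        bq.createTable(destination);
        createdOnce = true;
      }
      bq.copyInto(destination);
    }
    return bq;
  }

  /** Models the expected behavior: every destination is created before its first copy job. */
  static FakeBigQuery runExpected(List<String> copyJobDestinations) {
    FakeBigQuery bq = new FakeBigQuery();
    for (String destination : copyJobDestinations) {
      if (!bq.tables.contains(destination)) {
        bq.createTable(destination);
      }
      bq.copyInto(destination);
    }
    return bq;
  }

  public static void main(String[] args) {
    // The copy job for table A happens to be seen first, as in the example above.
    List<String> jobs = List.of("mytable_A", "mytable_B", "mytable_B");
    System.out.println("reported: " + runReported(jobs).failures);
    System.out.println("expected: " + runExpected(jobs).failures);
  }
}
```

With the copy job for table A seen first, the "reported" run records a failure for every copy job into `mytable_B`, while the "expected" run records none. Reversing the order of the list moves the failures to `mytable_A`, which mirrors the race described above.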

P.S. I'm not seeing this behavior in batch mode.

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
