What happened?
While testing FILE_LOADS streaming writes, I found that when dynamic destinations are set, copy jobs are used (i.e. large data), and CREATE_IF_NEEDED is set, only the first table is created. For example, when writing to two tables A and B, which table gets created depends on a race: whichever copy job the pipeline sees first. If a copy job to table A runs first, table A is created, and all subsequent copy jobs to table B fail with an error similar to the following:
WARNING: Load job beam_bq_job_COPY_testpipelineahmedabualsaud0905154941c26eb941_17a94e2694554455aa31cca9f9389b49_4e2479b4160d04b56a8075645f4974e1_00003-0 failed, will retry: {
"errorResult" : {
"message" : "Not found: Table <project>:<dataset>.mytable_B",
"reason" : "notFound"
},
"errors" : [ {
"message" : "Not found: Table <project>:<dataset>.mytable_B",
"reason" : "notFound"
} ],
"state" : "DONE"
}. Next job id beam_bq_job_COPY_testpipelineahmedabualsaud0905154941c26eb941_17a94e2694554455aa31cca9f9389b49_4e2479b4160d04b56a8075645f4974e1_00003-1
What we would expect instead is for all tables to be created.
P.S. I am not seeing this behavior in batch mode.
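To make the symptom concrete, here is a toy model of the behavior described above. It does not use Beam or BigQuery and makes no claim about the actual cause inside BigQueryIO. The function `run_copy_jobs` and its `create_first_only` flag are hypothetical names invented for this illustration; the table names are taken from the log above.

```python
from typing import Iterable, List, Optional, Set, Tuple


def run_copy_jobs(
    jobs: Iterable[str], create_first_only: bool
) -> Tuple[Set[str], List[str]]:
    """Process simulated copy jobs in arrival order.

    jobs: destination table of each copy job, in the order the jobs are seen.
    create_first_only: True models the reported symptom (only the first
        destination seen is ever created); False models the expected
        CREATE_IF_NEEDED behavior (every missing destination is created).
    Returns (tables that exist afterwards, destinations whose copy failed).
    """
    existing: Set[str] = set()
    failed: List[str] = []
    first_destination: Optional[str] = None

    for dest in jobs:
        if first_destination is None:
            first_destination = dest

        if dest not in existing:
            may_create = (not create_first_only) or dest == first_destination
            if not may_create:
                # Stands in for the "Not found: Table ..." error in the log.
                failed.append(dest)
                continue
            existing.add(dest)
        # At this point the destination exists, so the copy would succeed.

    return existing, failed


arrival_order = ["mytable_A", "mytable_B", "mytable_A", "mytable_B"]

# Reported symptom: only the first destination seen is created.
observed_tables, observed_failures = run_copy_jobs(arrival_order, True)

# Expected behavior: every destination is created on first use.
expected_tables, expected_failures = run_copy_jobs(arrival_order, False)
```

With this arrival order, the symptom model leaves only `mytable_A` in place and records both `mytable_B` copies as failed, while the expected model creates both tables with no failures. Reversing the arrival order flips which table survives, which is the race described above.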
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components