Storage+BigQuery.import notFound errors #1177

@jakeorr

I am uploading files to Google Cloud Storage and then importing them into BigQuery, and I am seeing an intermittent notFound error during the import job.

Here is a simplified version of how I am streaming files to GCS:

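// rs is a readable stream of CSV data; pipe() returns the write stream,
// so the 'finish' handler is attached to the GCS upload stream.
rs.pipe(bucket.file('[filename]').createWriteStream({
  metadata: {
    contentType: 'text/csv',
    metadata: {
      custom: 'metadata'
    }
  }
})).on('finish', function () {});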
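One way to make the "finished" signal explicit is to wrap the pipe in a helper that calls back exactly once, on either an error or 'finish'. This is only a sketch: `pipeAndWait` is a hypothetical name, and the streams below are plain Node.js stand-ins rather than a real GCS write stream, so it says nothing about when the object becomes visible to BigQuery.

```javascript
const { PassThrough, Writable } = require('stream');

// Pipes `readable` into `writable` and calls `done` exactly once:
// with an error if either stream fails, or with no arguments on 'finish'.
function pipeAndWait(readable, writable, done) {
  let called = false;
  function once(err) {
    if (called) return;
    called = true;
    done(err);
  }
  readable.on('error', once);
  writable.on('error', once);
  writable.on('finish', function () { once(); });
  readable.pipe(writable);
}

// Stand-in streams so the sketch runs without GCS credentials.
const chunks = [];
const source = new PassThrough();
const sink = new Writable({
  write(chunk, encoding, callback) {
    chunks.push(chunk.toString());
    callback();
  }
});

pipeAndWait(source, sink, function (err) {
  if (err) return console.error('upload failed:', err.message);
  console.log('finished writing:', chunks.join(''));
  // The BigQuery import would be started from here.
});

source.end('a,b\n1,2\n');
```

In the real flow, `sink` would be the stream returned by `createWriteStream`, and the import call would move into the callback.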

After file streaming has finished, the next step is importing to BigQuery:

bigQuery
  .dataset([datasetId])
  .table([tableId])
  .import([bucket.file('[filename]')], {
    createDisposition: 'CREATE_IF_NEEDED',
    writeDisposition: 'WRITE_TRUNCATE',
    sourceFormat: 'CSV',
    schema: {
      fields: [fields array]
    }
  }, function (err, bqJob) {
    // Executes successfully
  });

I am seeing the error in the response I get from job.getMetadata():

bigQuery.job(bqJob.id).getMetadata(function (err, response) {});

response.status.errorResult.message: Not found: URI gs://[bucket]/[filename]
response.status.errorResult.reason: notFound
response.status.errorResult.errors: [{"reason":"notFound","message":"Not found: URI gs://[bucket]/[filename]"}]
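For handling this programmatically, a small check on the job metadata can separate the notFound case from other failures. The sketch below assumes only the response shape shown above; `isNotFoundFailure` is a hypothetical helper, not part of the library.

```javascript
// Returns true when job metadata (shaped like the response above) reports a
// notFound failure, either on errorResult itself or in an errors list.
function isNotFoundFailure(response) {
  const status = (response && response.status) || {};
  const result = status.errorResult;
  if (!result) return false;
  if (result.reason === 'notFound') return true;
  const errors = status.errors || result.errors || [];
  return errors.some(function (e) { return e.reason === 'notFound'; });
}

// Example using the values reported in this issue.
const sample = {
  status: {
    errorResult: {
      reason: 'notFound',
      message: 'Not found: URI gs://[bucket]/[filename]'
    }
  }
};

console.log(isNotFoundFailure(sample)); // true
console.log(isNotFoundFailure({ status: { state: 'DONE' } })); // false
```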

After receiving this error, I confirmed that the file in question does exist in GCS. Can there be some lag between when a stream has finished and when the file is actually available? If so, is there a better way to know when the file(s) are ready for import into BigQuery?

To be clear, this flow of events usually completes successfully, but I intermittently see the notFound error. I would like to track it down or at least find a way to avoid it. Is this a known issue? (I haven't been able to find any mention of it yet.) Is it possible that rate limiting is occurring and causing this error?
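As a possible stopgap while the cause is unknown, the import could be retried with a growing delay when it fails with notFound. This is a sketch of a workaround, not a fix for the underlying problem: `retryWithBackoff` and `fakeImport` are hypothetical names, and `fakeImport` only simulates an import that fails twice before succeeding.

```javascript
// Calls `attempt(callback)` up to `maxTries` times. After a retryable error
// it waits delayMs, then 2 * delayMs, and so on before trying again.
// `done` receives (err, result, tries) from the last attempt.
function retryWithBackoff(attempt, shouldRetry, maxTries, delayMs, done) {
  let tries = 0;
  function run() {
    tries += 1;
    attempt(function (err, result) {
      if (err && tries < maxTries && shouldRetry(err)) {
        setTimeout(run, delayMs * Math.pow(2, tries - 1));
        return;
      }
      done(err, result, tries);
    });
  }
  run();
}

// Simulated import: fails twice with a notFound-style error, then succeeds.
let calls = 0;
function fakeImport(callback) {
  calls += 1;
  if (calls < 3) {
    const err = new Error('Not found: URI gs://[bucket]/[filename]');
    err.reason = 'notFound';
    return callback(err);
  }
  callback(null, 'job done');
}

function isNotFound(err) {
  return err.reason === 'notFound';
}

retryWithBackoff(fakeImport, isNotFound, 5, 10, function (err, result, tries) {
  console.log(err ? 'gave up: ' + err.message : result + ' after ' + tries + ' tries');
});
// prints "job done after 3 tries"
```

In the real flow, `attempt` would start the import, fetch the job metadata once the job is done, and call back with an error built from `status.errorResult` when one is present.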

Labels

api: bigquery (Issues related to the BigQuery API.)
api: storage (Issues related to the Cloud Storage API.)