BigQuery: provide option to omit 'insertId' for 'insert_rows' and 'insert_rows_json'. #9539

@mryerse

Description

Streaming inserts to BigQuery can achieve 1 GB per second, but only when insertId is omitted. Currently, both methods in the google-cloud-bigquery client that implement the BigQuery insertAll REST API method, insert_rows and insert_rows_json, automatically add an insertId to every row, which prevents anyone from making use of the higher throughput limits.

I propose adding an argument to one or both of these methods, such as "insertId=True" (defaulting to True). When it is False, uuid would not be used to add insert IDs. The documentation should also be updated to state clearly that omitting insertId can result in duplicate records when failed API calls are retried, according to this guide.
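
To illustrate the proposed behavior, here is a minimal sketch of how the rows portion of an insertAll request body could be built with an optional insert ID. The helper name `build_insert_all_body` and the `insert_id` parameter are hypothetical and are not part of google-cloud-bigquery; this only shows the intended effect of the flag, not how the client is actually implemented.

```python
import uuid


def build_insert_all_body(json_rows, insert_id=True):
    """Hypothetical helper: build the body of a tabledata.insertAll request.

    When insert_id is True, each row gets a uuid4-based insertId, which
    BigQuery uses for best-effort de-duplication. When it is False, the
    insertId key is left out entirely, so a retried call can produce
    duplicate records.
    """
    rows = []
    for row in json_rows:
        entry = {"json": row}
        if insert_id:
            # Assumption: one random UUID string per row.
            entry["insertId"] = str(uuid.uuid4())
        rows.append(entry)
    return {"rows": rows}


example_rows = [{"name": "a", "value": 1}, {"name": "b", "value": 2}]
with_ids = build_insert_all_body(example_rows)
without_ids = build_insert_all_body(example_rows, insert_id=False)
```

With the default, every entry carries both "json" and "insertId"; with insert_id=False, each entry contains only "json", which is the request shape that qualifies for the higher throughput limits.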

Metadata

Labels

api: bigquery (Issues related to the BigQuery API.)
type: feature request (‘Nice-to-have’ improvement, new feature or different behavior or design.)
