BigQuery: load_table_from_dataframe automatically generate schema for known dtypes #9044

@tswast

Is your feature request related to a problem? Please describe.

I want to upload a DataFrame without specifying a schema. My DataFrame has well-defined dtypes for all columns, so it should be safe to serialize it without worrying about ambiguous cases such as ARRAY / STRUCT columns.

With #9042, I may get a warning in this case, though.

Describe the solution you'd like

I'd like the BigQuery client to automatically generate a schema based on the dtypes in the DataFrame. Ideally, this generated schema could be merged with the partial schema proposed in #8140.

If the number of columns in the generated schema plus the partial schema matches the number of columns in the DataFrame (that is, every object column is covered by the explicit schema), no warning message should be printed.
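The proposal above can be sketched roughly as follows. This is a minimal illustration, not the client library's implementation: the `DTYPE_TO_BQ` table, the `generate_schema` function, and the choice of BigQuery type names are all assumptions made for this example. It takes (column name, dtype name) pairs rather than a DataFrame so that it runs without pandas installed.

```python
import warnings

# Hypothetical mapping from pandas dtype names to BigQuery type names.
# The entries are an assumption for this sketch, not the library's table.
DTYPE_TO_BQ = {
    "bool": "BOOLEAN",
    "int64": "INTEGER",
    "float64": "FLOAT",
    "datetime64[ns]": "DATETIME",
    "datetime64[ns, UTC]": "TIMESTAMP",
}


def generate_schema(column_dtypes, partial_schema=None):
    """Build a list of (column name, BigQuery type) pairs in column order.

    column_dtypes: iterable of (column name, dtype name) pairs.
    partial_schema: optional dict of column name -> BigQuery type.
        Explicit entries take precedence over dtype detection.
    """
    partial_schema = partial_schema or {}
    schema = []
    unresolved = []
    for name, dtype_name in column_dtypes:
        if name in partial_schema:
            # The user supplied this column explicitly (e.g. an object column).
            schema.append((name, partial_schema[name]))
        elif dtype_name in DTYPE_TO_BQ:
            # Known dtype: the type can be generated automatically.
            schema.append((name, DTYPE_TO_BQ[dtype_name]))
        else:
            unresolved.append(name)
    # Warn only when some column is covered by neither source.
    if unresolved:
        warnings.warn(
            "Could not determine a BigQuery type for columns: "
            + ", ".join(unresolved)
        )
    return schema
```

With pandas, the input pairs could be produced with `[(name, str(dtype)) for name, dtype in df.dtypes.items()]`. Giving the explicit partial schema precedence is a design choice of this sketch: it lets a caller override a detected type as well as fill in object columns.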

Describe alternatives you've considered

  • Status quo: let pandas/pyarrow/fastparquet figure out what to do.
  • Introspect the actual objects in the object column to guess the correct type. (Slow!)
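For comparison, the second alternative (introspecting the objects in an object column) might look like the sketch below. The function name `guess_object_column_type` and the type names it returns are hypothetical and chosen only for illustration.

```python
import datetime
import decimal


def guess_object_column_type(values):
    """Guess a BigQuery type name from the first non-missing value.

    Returns None when every value is missing or the value's Python type
    is not recognized.
    """
    for value in values:
        # Skip missing values: None, or float NaN (NaN is not equal to itself).
        if value is None or (isinstance(value, float) and value != value):
            continue
        if isinstance(value, str):
            return "STRING"
        if isinstance(value, bytes):
            return "BYTES"
        if isinstance(value, decimal.Decimal):
            return "NUMERIC"
        # datetime.datetime is a subclass of datetime.date, so test it first.
        if isinstance(value, datetime.datetime):
            return "DATETIME"
        if isinstance(value, datetime.date):
            return "DATE"
        if isinstance(value, datetime.time):
            return "TIME"
        return None
    return None
```

Checking a single value is cheap but cannot confirm that the column is homogeneous. Verifying every value requires a full pass over the column in Python, which is where the cost noted above comes from.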

/CC @plamut

Labels: api: bigquery, type: feature request