Is your feature request related to a problem? Please describe.
I want to upload a DataFrame without specifying a schema. My DataFrame has well-defined dtypes for all columns, so it should be safe to serialize it without worrying about cases such as ARRAY / STRUCT columns.
With #9042, though, I may still get a warning in this case.
Describe the solution you'd like
I'd like BigQuery to automatically generate a schema based on the dtypes in the DataFrame. Ideally, this generated schema could be merged with the partial schema from #8140.
If the merged schema (generated + partial) covers every column in the DataFrame, meaning all the object columns were handled by an explicit schema, no warning message should be printed.
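To make the request concrete, here is a rough sketch of the merge logic I have in mind. Everything in it is hypothetical: the `DTYPE_TO_BQ_TYPE` table, the `generate_schema` name, and the use of plain `(name, type)` tuples instead of real schema field objects are assumptions for illustration, not the library's API, and the dtype-to-type mapping is only a guess at sensible defaults.

```python
# Hypothetical mapping from pandas dtype names to BigQuery type names.
# The exact pairs are an assumption; "object" is deliberately left out
# because its BigQuery type cannot be inferred from the dtype alone.
DTYPE_TO_BQ_TYPE = {
    "bool": "BOOLEAN",
    "int64": "INTEGER",
    "float64": "FLOAT",
    "datetime64[ns]": "DATETIME",
    "datetime64[ns, UTC]": "TIMESTAMP",
}


def generate_schema(column_dtypes, partial_schema=()):
    """Merge a dtype-derived schema with a user-supplied partial schema.

    column_dtypes: iterable of (column_name, dtype_name) pairs.
    partial_schema: iterable of (column_name, bq_type) pairs; these
        take precedence over anything derived from the dtype.

    Returns (schema, unresolved), where schema is a list of
    (column_name, bq_type) pairs in column order and unresolved lists
    the columns whose type could not be determined.
    """
    explicit = dict(partial_schema)
    schema = []
    unresolved = []
    for name, dtype_name in column_dtypes:
        bq_type = explicit.get(name) or DTYPE_TO_BQ_TYPE.get(dtype_name)
        if bq_type is None:
            unresolved.append(name)
        else:
            schema.append((name, bq_type))
    return schema, unresolved
```

With a real DataFrame `df`, the first argument could be built as `[(name, str(dtype)) for name, dtype in df.dtypes.items()]`. A warning would only be needed when `unresolved` is non-empty, which is the "column counts match" condition described above.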
Describe alternatives you've considered
- Status quo: let pandas/pyarrow/fastparquet figure out what to do.
- Introspect the actual objects in the object column to guess the correct type. (Slow!)
/CC @plamut