Description
Versions:
`google-cloud-bigquery==1.19.0`
We were initially using 1.18.0, but I noticed #9044 was included in 1.19.0, so we tried that as well; it made no difference.
We're using a pandas DataFrame read from Parquet. Example data:

```
   id status               created_at      execution_date
0   1    NEW  2018-09-12 09:24:31.291 2019-05-10 17:40:00
1   2    NEW  2018-09-12 09:26:45.890 2019-05-10 17:40:00

[628 rows x 4 columns]
```
`df.dtypes` shows:

```
id                        object
status                    object
created_at        datetime64[ns]
execution_date    datetime64[ns]
dtype: object
```
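For reference, a minimal sketch that reproduces the same column dtypes (the values are made up to mirror the example above, not our real data):

```python
import pandas as pd

# Build a tiny frame shaped like the example above; values are illustrative.
df = pd.DataFrame(
    {
        "id": ["1", "2"],  # object dtype, as in the issue
        "status": ["NEW", "NEW"],
        "created_at": pd.to_datetime(
            ["2018-09-12 09:24:31.291", "2018-09-12 09:26:45.890"]
        ),
        "execution_date": pd.to_datetime(["2019-05-10 17:40:00"] * 2),
    }
)

print(df.dtypes)  # created_at and execution_date are datetime64[ns]
```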
When trying to load this into BigQuery using `load_table_from_dataframe()`, with the job config's `time_partitioning` set to `bigquery.table.TimePartitioning(field="execution_date")`, we get the following error:

```
The field specified for time partitioning can only be of type TIMESTAMP or DATE. The type found is: INTEGER.
```

This doesn't really make sense, since the field is clearly `datetime64[ns]`.
The job config shown in the console looks correct (i.e. it's partitioned by day on the correct field).
edit:
It seems the cause is that DataFrame columns of type `datetime64[ns]` are being converted to type INTEGER instead of DATE (or TIMESTAMP? I'm not sure which one would be the correct type in BigQuery).
edit2:
Could it be this mapping is wrong and it should be DATE instead of DATETIME?

```python
"datetime64[ns]": "DATETIME",
```
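To illustrate the suspicion, here is a hedged sketch of such a dtype-to-BigQuery-type lookup. This is my reconstruction for illustration, not the library's actual code; only the `"datetime64[ns]": "DATETIME"` entry is quoted from the source, and the other entries and the `bq_type_for` helper are assumptions:

```python
# Hypothetical reconstruction of a pandas-dtype -> BigQuery-type mapping.
# Only the datetime64[ns] entry is quoted from the library source; the
# rest are plausible assumptions for illustration.
DTYPE_TO_BQ = {
    "int64": "INTEGER",
    "float64": "FLOAT",
    "bool": "BOOLEAN",
    "datetime64[ns]": "DATETIME",  # the quoted line; partitioning wants TIMESTAMP or DATE
}

def bq_type_for(dtype_name: str) -> str:
    """Look up the BigQuery column type for a pandas dtype name (illustrative)."""
    return DTYPE_TO_BQ.get(dtype_name, "STRING")

print(bq_type_for("datetime64[ns]"))  # DATETIME
```

If the loader really does route `datetime64[ns]` through an entry like this, a DATETIME (or wrongly serialized INTEGER) column would explain why the partitioning check rejects the field.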