Closed
Description
pandas 0.24.0, currently unreleased, will add an extension dtype for nullable integers: http://pandas-docs.github.io/pandas-docs-travis/integer_na.html#integer-na. Unfortunately, pandas-gbq won't pick it up with our current logic of deferring to the DataFrame constructor for type inference. From the pandas docs:
> It [Int64, nullable integer] is not the default dtype for integers, and will not be inferred; you must explicitly pass the dtype into `array()` or `Series`.
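To make the inference problem concrete, here is a minimal sketch, assuming pandas 0.24.0 or later is installed. It only shows pandas behavior and does not touch pandas-gbq; the variable names are illustrative.

```python
import pandas as pd

# Integers with a missing value: default inference falls back to float64,
# because NaN is the only missing-value marker available to that dtype.
inferred = pd.Series([1, 2, None])

# The nullable integer dtype is used only when requested explicitly.
explicit = pd.Series([1, 2, None], dtype="Int64")

print(inferred.dtype)
print(explicit.dtype)
```

Since the DataFrame constructor relies on the same inference, a nullable integer column read from BigQuery would keep coming back as float unless pandas-gbq passes the dtype itself.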
The question is: how can we support this dtype in pandas-gbq? I see a few options.
- Use `pd.Int64Dtype()` by default for nullable integer columns, similar to how pandas-gbq previously defaulted to `string` for integer columns.
  - Con: ties new versions of pandas-gbq to pandas 0.24.0+.
- Use `pd.Int64Dtype()` for nullable integer columns when pandas 0.24.0+ is installed.
  - Con: inconsistent with pandas.
  - Con: unable to turn this feature off when float is desired (perhaps for performance reasons).
- Add an argument to `read_gbq` which is a map of column names to dtypes, overriding the dtype of any column present.
  - Con: float isn't the safest default for nullable integer columns, but at least it's consistent with pandas.
  - Con: will require reading rows into separate Series before constructing a DataFrame, as the DataFrame constructor only accepts a single dtype.
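The third option could look roughly like the sketch below. It is only an illustration of building one Series per column so that each can carry its own dtype; `rows_to_dataframe` and its `dtypes` parameter are hypothetical names, not part of pandas-gbq, and the rows are made-up sample data.

```python
import pandas as pd


def rows_to_dataframe(rows, columns, dtypes=None):
    """Hypothetical helper: build a DataFrame column by column.

    `dtypes` maps column names to dtypes; columns missing from the map
    are left to pandas' normal type inference.
    """
    dtypes = dtypes or {}
    series = {}
    for position, name in enumerate(columns):
        values = [row[position] for row in rows]
        # dtype=None means "infer", which matches the current behavior.
        series[name] = pd.Series(values, dtype=dtypes.get(name))
    return pd.DataFrame(series, columns=columns)


# Made-up rows standing in for a query result with a NULL integer.
rows = [(1, "a"), (None, "b"), (3, None)]
df = rows_to_dataframe(rows, ["id", "label"], dtypes={"id": pd.Int64Dtype()})
print(df.dtypes)
```

With an empty map the result matches what the DataFrame constructor would infer, so float stays the default and the nullable integer dtype is strictly opt-in.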