Discussion: how to handle the new Int64 (nullable integer) dtype with pandas 0.24.0 #242

pandas 0.24.0 is not yet released, but it will add an extension dtype for nullable integers: http://pandas-docs.github.io/pandas-docs-travis/integer_na.html#integer-na Unfortunately, pandas-gbq won't pick it up with our current logic of deferring to the DataFrame constructor for type inference. From the pandas docs:

> It [Int64, nullable integer] is not the default dtype for integers, and will not be inferred; you must explicitly pass the dtype into array() or Series.
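
To make the quoted behavior concrete, here is a minimal sketch, assuming pandas 0.24.0 or later is installed. The variable names are only for illustration.

```python
import pandas as pd

# Without an explicit dtype, a missing value forces the integers to float64.
inferred = pd.Series([1, 2, None])

# The nullable integer dtype has to be requested explicitly.
explicit = pd.Series([1, 2, None], dtype="Int64")

print(inferred.dtype)
print(explicit.dtype)
```

The first Series is inferred as `float64`, while the second keeps integer values alongside a missing value.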

The question is: how can we support this dtype in pandas-gbq? I see a few options.

- Use `pd.Int64Dtype()` by default for nullable integer columns, similar to how previously pandas-gbq defaulted to `string` for integer columns.
  - Con: ties new versions of pandas-gbq to pandas 0.24.0+
- Use `pd.Int64Dtype()` for nullable integer columns when pandas 0.24.0+ is installed.
  - Con: inconsistent with pandas.
  - Con: unable to turn this feature off when float is desired (perhaps for performance reasons).
- Add an argument to `read_gbq` which is a map of column names to dtypes, overriding the dtype of any column present.
  - Con: float isn't the safest default for nullable integer columns, but at least it's consistent with pandas.
  - Con: will require reading rows into separate Series before constructing a DataFrame, as the DataFrame constructor only accepts a single dtype.
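
As a rough sketch of the last option, the rows could be split into one Series per column so that each column can take its own dtype. `rows_to_dataframe` and its `dtypes` parameter are hypothetical names used only for illustration; they are not part of the pandas-gbq API, and the eventual argument to `read_gbq` might look different.

```python
import pandas as pd


def rows_to_dataframe(rows, columns, dtypes=None):
    """Hypothetical helper: build a DataFrame one column at a time.

    `dtypes` maps column names to dtypes. Columns that are not listed
    fall back to the usual pandas type inference.
    """
    dtypes = dtypes or {}
    series = {}
    for position, name in enumerate(columns):
        values = [row[position] for row in rows]
        # dtype=None lets pandas infer the dtype, as it does today.
        series[name] = pd.Series(values, dtype=dtypes.get(name))
    return pd.DataFrame(series, columns=columns)


rows = [(1, "a", 1.5), (None, "b", 2.5), (3, None, None)]
df = rows_to_dataframe(rows, ["id", "label", "score"], dtypes={"id": "Int64"})
print(df.dtypes)
```

Here only `id` is overridden, so it keeps integer values with a missing entry, while `score` is still inferred as float. This also shows the cost named in the last con: every column is materialized as its own Series before the DataFrame is assembled.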
