phofl (Member) commented Nov 26, 2021

_agg_index and _convert_to_ndarrays need a refactor in the future; they both contain more or less the same casting logic now.

phofl added the Bug and IO CSV (read_csv, to_csv) labels on Nov 26, 2021
jreback added this to the 1.4 milestone on Nov 28, 2021

jreback (Contributor) commented Nov 28, 2021

_agg_index and _convert_to_ndarrays need a refactor in the future; they both contain more or less the same casting logic now.

sgtm

jreback merged commit 8ffa2a9 into pandas-dev:master on Nov 28, 2021
phofl deleted the 9435 branch on November 28, 2021 01:46

gaow commented Feb 3, 2022

Unfortunately, this does not work when dtype is specified globally:

>>> import pandas as pd
>>> from io import StringIO
>>> data = "1,a\n2,b"
>>> df = pd.read_csv(StringIO(data), index_col=0, dtype=str, header=None)
>>> df.index
Int64Index([1, 2], dtype='int64', name=0)

whereas passing a per-column dtype works:

>>> df = pd.read_csv(StringIO(data), index_col=0, dtype={0:str}, header=None)
>>> df.index
Index(['1', '2'], dtype='object', name=0)

However, I need to set all columns to str globally, so I would expect the first snippet to work and turn the index into str as well.
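
A possible workaround, sketched here rather than taken from the PR: read every column as str without index_col, then promote the first column to the index afterwards. This assumes it is acceptable to set the index in a separate step; because the column is parsed as text before it becomes the index, the raw strings are preserved regardless of how index_col interacts with a global dtype.

```python
from io import StringIO

import pandas as pd

data = "1,a\n2,b"

# Parse all columns as strings first; no index_col, so the global dtype
# applies to every column, including the one we want as the index.
df = pd.read_csv(StringIO(data), dtype=str, header=None)

# Promote column 0 to the index after parsing.
df = df.set_index(0)

print(list(df.index))
```

The index labels here are Python strings ('1' and '2'), and the index keeps the name 0, matching the result of the dtype={0: str} call above.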

phofl (Member, Author) commented Feb 3, 2022

Hm I think I prefer the current behavior. But you could open an issue to gather feedback from others.

phofl (Member, Author) commented Feb 3, 2022

One additional note: if we decide to change that, we would need a deprecation cycle.
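
For readers unfamiliar with the term, a deprecation cycle generally means keeping the old behavior for a release while warning that it will change. The sketch below is purely illustrative: `resolve_index_dtype` and its parameters are hypothetical names, not pandas functions, and nothing in this thread says pandas would implement the change this way.

```python
import warnings


def resolve_index_dtype(dtype, index_label):
    """Hypothetical helper: pick the dtype to apply to the index column.

    During the deprecation period a global (non-dict) dtype is still NOT
    applied to the index, but the caller is warned that this will change.
    """
    if isinstance(dtype, dict):
        # Per-column dtype: apply the entry for the index column, if any.
        return dtype.get(index_label)
    if dtype is not None:
        warnings.warn(
            "A global dtype is currently not applied to index_col; "
            "a future version may apply it. Pass a dict to set the "
            "index dtype explicitly.",
            FutureWarning,
            stacklevel=2,
        )
    # Old behavior kept for now: let the index dtype be inferred.
    return None
```

Once the warning has shipped for a release, the final branch could be switched to return the global dtype and the warning removed.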

Development

Successfully merging this pull request may close these issues.

index_col in read_csv and read_table ignores dtype argument

3 participants