@litchfield

No description provided.

@chris-b1
Contributor

chris-b1 commented Nov 6, 2015

I'm not sure those keywords actually do anything with the excel parser?

In [12]: dti = pd.date_range('2014-1-1', periods=10)

In [13]: df = pd.DataFrame({'dates':dti, 'strings':dti.strftime('%m/%d/%Y')})

In [14]: df.dtypes
Out[14]: 
dates      datetime64[ns]
strings            object
dtype: object

In [15]: df.to_excel('test.xlsx')

In [16]: pd.read_excel('test.xlsx').dtypes
Out[16]: 
dates      datetime64[ns]
strings            object
dtype: object

In [17]: pd.read_excel('test.xlsx', parse_dates=True).dtypes
Out[17]: 
dates      datetime64[ns]
strings            object
dtype: object

In [18]: pd.read_excel('test.xlsx', parse_dates=False).dtypes
Out[18]: 
dates      datetime64[ns]
strings            object
dtype: object

@jreback
Contributor

jreback commented Nov 7, 2015

Yeah, I don't think these are valid keywords. What we really need here is a check for non-implemented keywords. Closing this; I will create another issue.
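For illustration, a check of that kind could look roughly like the sketch below. This is not pandas code: `SUPPORTED`, `check_keywords`, and the keyword names in the set are hypothetical, chosen only to show the idea of raising on keywords a reader does not implement instead of silently accepting them.

```python
# Hypothetical sketch, not the pandas implementation: reject keywords that a
# reader does not implement rather than accepting and ignoring them.

# Illustrative set only; the real list of supported keywords is an assumption.
SUPPORTED = {"header", "index_col"}


def check_keywords(supported, **kwds):
    """Return kwds unchanged, or raise if any keyword is not in `supported`."""
    unknown = sorted(set(kwds) - set(supported))
    if unknown:
        raise NotImplementedError(
            "keyword(s) not implemented for this reader: " + ", ".join(unknown)
        )
    return kwds


# A supported keyword passes through untouched.
check_keywords(SUPPORTED, header=0)

# An unsupported keyword fails loudly instead of being ignored.
try:
    check_keywords(SUPPORTED, date_parser=None)
except NotImplementedError as exc:
    print(exc)
```

Raising early makes an unsupported option visible to the caller, which is the behavior the examples above were missing.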

@jorisvandenbossche
Member

@chris-b1 It actually raises an error rather than just being ignored. With your example (parse_dates=True is for parsing the index, so to see whether it can parse the string column, you have to pass the column's name):

In [37]: pd.read_excel('test.xlsx', parse_dates=['strings'])

....

C:\Anaconda\lib\site-packages\pandas\io\parsers.pyc in _should_parse_dates(self, i)
    812             return self.parse_dates
    813         else:
--> 814             name = self.index_names[i]
    815             j = self.index_col[i]
    816

TypeError: 'NoneType' object has no attribute '__getitem__'

But if you set the strings column as the index, parse_dates=True is indeed ignored.

@jorisvandenbossche
Member

@chris-b1 This error only happens if the file has an implicit index due to the structure of the Excel file (here, the index written by to_excel). Without one, parse_dates works as expected:

In [46]: df.to_excel('test.xlsx', index=False)

In [47]: pd.read_excel('test.xlsx').dtypes
Out[47]:
dates      datetime64[ns]
strings            object
dtype: object

In [48]: pd.read_excel('test.xlsx', parse_dates=['strings']).dtypes
Out[48]:
dates      datetime64[ns]
strings    datetime64[ns]
dtype: object


Labels

IO Excel read_excel, to_excel
