@alorenzo175
Contributor

BUG: Fixes #7777, HDFStore.read_column did not preserve timezone information
when fetching a DatetimeIndex column with tz=UTC

@alorenzo175 changed the title from "read_column did not preserve UTC tzinfo" to "BUG: read_column did not preserve UTC tzinfo" on Jul 18, 2014
@jreback added this to the 0.15.0 milestone on Jul 18, 2014
Contributor

better to refactor the tz handling here: https://github.com/pydata/pandas/blob/master/pandas/io/pytables.py#L1467
then I think you can directly reuse it

@alorenzo175
Contributor Author

Hmm, I'm not quite sure that will work. self.values probably needs to remain an Index, and when a Series is constructed from an Index, it calls _to_embed, which removes the UTC tzinfo. That's why I work around _to_embed by constructing the Series from a list instead of an Index.

@jreback
Contributor

jreback commented Jul 18, 2014

that's not what I mean; refactor it out into a function which you can call from both places, rather than duplicating the code

@alorenzo175
Contributor Author

You mean something like the following that can be called in convert and when returning the series (with preserve_UTC = True)?

def _set_tz(values, tz, preserve_UTC=False):
    if tz is not None and isinstance(values, Index):
        tz = _ensure_decoded(tz)
        if values.tz is None:
            # stored values are naive UTC: localize first, then convert to tz
            values = values.tz_localize('UTC').tz_convert(tz)
        if preserve_UTC:
            if tz == pytz.utc:
                # a list avoids _to_embed, which would drop the UTC tzinfo
                values = list(values)

    return values
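
For readers without the pandas internals at hand, the localize-then-convert step in the proposal above can be sketched with standard-library datetimes. This is only an analogy, not the code from the PR: `set_tz_sketch` is a hypothetical name, a plain list stands in for the Index, and a fixed UTC-7 offset stands in for a named timezone.

```python
from datetime import datetime, timedelta, timezone


def set_tz_sketch(values, tz):
    """Hypothetical stdlib stand-in for the proposed _set_tz helper."""
    if tz is None:
        # No timezone was stored, so return the values untouched.
        return values
    out = []
    for v in values:
        if v.tzinfo is None:
            # Naive values are assumed to be UTC wall time: attach UTC
            # (like tz_localize('UTC')), then convert (like tz_convert(tz)).
            v = v.replace(tzinfo=timezone.utc).astimezone(tz)
        out.append(v)
    return out


# Naive timestamps, as they might come back from storage.
stored = [datetime(2014, 7, 18, 12, 0), datetime(2014, 7, 18, 13, 0)]
mst = timezone(timedelta(hours=-7))  # fixed offset, for illustration only
restored = set_tz_sketch(stored, mst)
```

Calling the same helper from both the convert path and the column-read path is the point of the refactor: the timezone rule lives in one place instead of being duplicated.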

@jreback
Contributor

jreback commented Jul 18, 2014

yep

though I think you can simply do tz == 'UTC' (which works for dateutil and pytz)

@jreback
Contributor

jreback commented Jul 18, 2014

cc @dbew

@alorenzo175
Contributor Author

You can't do tz == 'UTC', but you can do tslib.get_timezone(tz) == 'UTC'.
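
A quick way to see why the plain string comparison fails: a tzinfo object is not equal to its name, so the check has to go through something that normalizes the timezone first. The sketch below uses only the standard library rather than pandas or pytz. `is_utc` is a hypothetical helper written for illustration; it is not `tslib.get_timezone` and makes no claim about how that function is implemented.

```python
from datetime import datetime, timedelta, timezone


def is_utc(tz):
    """Hypothetical helper: True if tz is, or names, UTC."""
    if tz is None:
        return False
    if isinstance(tz, str):
        return tz.upper() == "UTC"
    # For tzinfo objects, compare behaviour instead of identity.
    probe = datetime(2014, 7, 18)
    return tz.utcoffset(probe) == timedelta(0) and tz.tzname(probe) == "UTC"


# Comparing the object directly to the string is simply False.
direct_comparison = timezone.utc == "UTC"

# Normalizing first gives the answer the check was after.
normalized_comparison = is_utc(timezone.utc)
```

The same idea applies to the thread's suggestion: reduce the timezone to a canonical form, then compare that form to 'UTC'.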

@jreback
Contributor

jreback commented Jul 18, 2014

@alorenzo175 ok, gr8!

hmm maybe should document that somewhere .....

@dbew
Contributor

dbew commented Jul 21, 2014

That looks good to me and the tests are sensible.

@jreback
Contributor

jreback commented Jul 21, 2014

@alorenzo175 this looks good. just need a release note in v0.15.0 (reference the original issue in the bug fix section), and then I think it's good to go

@alorenzo175
Contributor Author

@jreback should be good to go now

@jreback
Contributor

jreback commented Jul 22, 2014

perfect, now just squash to a single commit and then I can merge, see here: https://github.com/pydata/pandas/wiki/Using-Git

@alorenzo175
Contributor Author

squashed

jreback added a commit that referenced this pull request Jul 22, 2014
BUG: read_column did not preserve UTC tzinfo
@jreback merged commit a0a25c3 into pandas-dev:master on Jul 22, 2014
@jreback
Contributor

jreback commented Jul 22, 2014

@alorenzo175 thanks for the fix!

@alorenzo175 deleted the pytables_index_tzutc branch on July 22, 2014 15:26

Labels

Bug; IO HDF5 (read_hdf, HDFStore); Timezones (Timezone data dtype)

Development

Successfully merging this pull request may close these issues.

BUG: select_column not preserving a UTC timezone

3 participants