@jbrockmendel

This makes tz_convert_dst follow the same pattern we use elsewhere, and brings a perf improvement along with it.

In [2]: dti = pd.date_range("2016-01-01", periods=10000, freq="S", tz="US/Pacific")

In [3]: %timeit dti.tz_localize(None)
89.3 µs ± 627 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)  # <-- PR
100 µs ± 6.36 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)  # <-- master
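
For readers less familiar with the operation being timed, here is a minimal sketch of what `tz_localize(None)` returns on a tz-aware index. It is an illustration only, not code from this PR; `freq` is passed as a `pd.Timedelta` (my choice, not the original call) so the snippet does not depend on how a given pandas version spells the seconds alias.

```python
import pandas as pd

# Small tz-aware index; freq given as a Timedelta instead of the "S" alias
# used in the session above (an assumption made for version portability).
dti = pd.date_range(
    "2016-01-01", periods=3, freq=pd.Timedelta(seconds=1), tz="US/Pacific"
)

# tz_localize(None) drops the timezone and keeps the local wall-clock times.
naive = dti.tz_localize(None)

# For comparison: convert to UTC first, then drop the timezone.
utc_naive = dti.tz_convert("UTC").tz_localize(None)

print(naive[0])      # 2016-01-01 00:00:00
print(utc_naive[0])  # 2016-01-01 08:00:00
```

The eight-hour gap between the two results is the US/Pacific standard-time offset in January.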

The boost comes from avoiding a copy, so it makes a bigger relative difference in the scalar case than in the array case.

In [5]: ts = dti[0]

In [6]: %timeit ts.tz_localize(None)
12.4 µs ± 188 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)   # <-- PR
16.1 µs ± 215 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)   # <-- master

The improvement is much bigger for fixed offsets, roughly a 2x speedup here:

In [9]: tz = pytz.FixedOffset(-60)
In [10]: dti2 = dti.tz_convert(tz)

In [11]: %timeit dti2.tz_localize(None)
34.8 µs ± 1.09 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)  # <-- PR
71.8 µs ± 2.99 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)  # <-- master
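
To re-run these three comparisons outside IPython, something like the following sketch could be used. It is not part of the PR: the `best_us` helper is a hypothetical name, the stdlib `datetime.timezone` stands in for `pytz.FixedOffset(-60)`, and `freq` is passed as a `pd.Timedelta` rather than `"S"`. It reports the best per-call time, whereas `%timeit` reports mean ± std. dev., so the numbers will not match the ones above exactly.

```python
import timeit
from datetime import timedelta, timezone

import pandas as pd

# Same shape of data as in the session above.
dti = pd.date_range(
    "2016-01-01", periods=10000, freq=pd.Timedelta(seconds=1), tz="US/Pacific"
)
ts = dti[0]

# Assumption: stdlib fixed offset of -60 minutes in place of pytz.FixedOffset(-60).
fixed = timezone(timedelta(minutes=-60))
dti2 = dti.tz_convert(fixed)


def best_us(func, number=200, repeat=5):
    """Hypothetical helper: best per-call time in microseconds."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number * 1e6


results = {
    "array, US/Pacific": best_us(lambda: dti.tz_localize(None)),
    "scalar, US/Pacific": best_us(lambda: ts.tz_localize(None)),
    "array, fixed offset": best_us(lambda: dti2.tz_localize(None)),
}

for label, usec in results.items():
    print(f"{label}: {usec:.1f} µs per call")
```

Running the script once on each branch gives a PR-versus-master comparison for the array, scalar, and fixed-offset cases.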

@gfyoung added the Performance, Datetime, and Benchmark labels on Jul 3, 2020
@jreback added this to the 1.1 milestone on Jul 7, 2020
@jreback merged commit d90ca45 into pandas-dev:master on Jul 7, 2020
@jbrockmendel deleted the ref-tz_convert-4 branch on July 7, 2020 01:06