Problem:
When pyarrow.Table.from_pandas loads a pandas DataFrame that contains a timezone-aware timestamp, the resulting Table shows the datetime shifted by the UTC offset (22:46:17-07:00 becomes 05:46:17 on the next day), while still keeping the timezone information. See the script below.
Reproduction script:
import pandas as pd
import pyarrow
ts = pd.Timestamp("2022-10-21 22:46:17", tz="America/Los_Angeles")
df = pd.DataFrame({"TS": [ts]})
table = pyarrow.Table.from_pandas(df)
print(df)
"""
TS
0 2022-10-21 22:46:17-07:00
"""
print(table)
"""
pyarrow.Table
TS: timestamp[ns, tz=America/Los_Angeles]
----
TS: [[2022-10-22 05:46:17.000000000]]
"""
Expected results:
The table should not shift the datetime when timezone information is provided.
Environment: macOS (M1), Python 3.8.13
Reporter: Adam Ling
Related issues:
Note: This issue was originally created as ARROW-18298. Please see the migration documentation for further details.