Description
Hi,
I may be missing something, but I cannot get pyarrow to parse my datetime columns as timestamps; they are inferred as strings instead.
The values arrive in CSVs in this format: '2015-01-09 00:00:00.000'
I have tried:
from pyarrow import csv
convert_opts = csv.ConvertOptions(timestamp_parsers=['%Y-%m-%d %H:%M:%S.%f'])
df = csv.read_csv('path_to_csv', convert_options=convert_opts)
print(df.schema)
This changes nothing: the columns holding these timestamps still show up as strings in the schema.
I have also tried casting:
import pyarrow as pa
dfschema = pa.schema([
    ('date_column', pa.timestamp('ms'))
])
df = csv.read_csv('path_to_csv')
df.cast(target_schema=dfschema)
This approach raises the error: "pyarrow.lib.ArrowInvalid: Failed to parse string: 2015-01-09 00:00:00.000"
I am using pyarrow 1.0.1 in a Linux Docker container.
I tried to send an email to the user mailing list, but it keeps returning a Mailer Daemon error.
Reporter: Gary
Watchers: Rok Mihevc / @rok
Related issues:
- [C++] Strptime issues umbrella (is a child of)
- [C++] Support for fractional seconds in strptime() (is duplicated by)
Note: This issue was originally created as ARROW-9907. Please see the migration documentation for further details.