Description
Is your feature request related to a problem or challenge?
After #8886 (thanks to @Omega359) DataFusion supports converting strings to timestamps using a string format:
SELECT to_timestamp('2020-09-08T12:00:00+00:00', '2020-09-08 12/00/00+00:00', '%c', '%+', '%Y-%m-%d %H/%M/%s%#z')
Which will try to parse '2020-09-08T12:00:00+00:00' with each of the several formats supplied.
However, as @comphead points out, the format used is specific to chrono, the underlying Rust library. Its semantics differ slightly from those of any existing to_timestamp implementation: the format strings are neither Postgres format strings nor Spark format strings, but something DataFusion-specific based on the Rust chrono format strings.
Describe the solution you'd like
Ideally, users could choose which "dialect" of string format specifiers to support via a configuration option, for example either Postgres or Spark.
However, this is non-trivial given the scope of those two implementations.
Describe alternatives you've considered
Users can always use DataFusion's user-defined functions to define the semantics they want, for example with a ScalarUDF that rewrites the specified format string from a Postgres format into the chrono format
(though there are likely all sorts of corner cases, see #8886 (comment)).
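To make the rewriting idea concrete, here is a minimal sketch of the translation step such a UDF might perform. It is not part of DataFusion: the function name `postgres_to_chrono` and the mapping table are hypothetical, the table covers only a handful of Postgres template patterns, and matching is case-sensitive. Wrapping it in a ScalarUDF is left out.

```rust
/// Hypothetical helper (not a DataFusion API): translate a small subset of
/// Postgres template patterns into chrono `strftime`-style specifiers.
fn postgres_to_chrono(pg_format: &str) -> String {
    // Assumed mapping for illustration only; a real one needs many more
    // patterns. Longer patterns are listed first so "HH24" is tried before
    // any shorter pattern.
    const TABLE: &[(&str, &str)] = &[
        ("YYYY", "%Y"), // 4-digit year
        ("HH24", "%H"), // hour of day, 00-23
        ("HH12", "%I"), // hour of day, 01-12
        ("MM", "%m"),   // month number
        ("DD", "%d"),   // day of month
        ("MI", "%M"),   // minute
        ("SS", "%S"),   // second
    ];

    let mut out = String::with_capacity(pg_format.len());
    let mut rest = pg_format;

    'outer: while !rest.is_empty() {
        // Try each known pattern at the current position.
        for &(pg, chrono) in TABLE {
            if let Some(tail) = rest.strip_prefix(pg) {
                out.push_str(chrono);
                rest = tail;
                continue 'outer;
            }
        }
        // No pattern matched: copy one character through, escaping '%'
        // so chrono treats it as a literal percent sign.
        let ch = rest.chars().next().unwrap();
        if ch == '%' {
            out.push_str("%%");
        } else {
            out.push(ch);
        }
        rest = &rest[ch.len_utf8()..];
    }
    out
}
```

With this sketch, `postgres_to_chrono("YYYY-MM-DD HH24:MI:SS")` yields `%Y-%m-%d %H:%M:%S`. It already shows some of the corner cases: literal text that happens to contain `MM` or `DD` would be rewritten, and quoted literals, lowercase patterns, and padded names such as `Month` are not handled at all.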
Additional context
@jhorstmann has notes about Postgres: #5398 (comment)
@Omega359 notes that the spark format library is entirely different still: #5398 (comment)