Minor: Document timestamp with/without cast behavior #5826
tustvold merged 5 commits into apache:master
/// However, note that when casting from a timestamp with timezone BACK to a
/// timestamp without timezone the cast kernel does not adjust the values.
///
/// Thus round trip casting a timestamp without timezone to a timestamp with
I think the fact that round trip casting a timestamp CHANGES the underlying value caused a lot of confusion (at least for me) in the context of apache/datafusion#10602
I filed #5827 to discuss changing the behavior
CI integration is failing due to #5815
Abdullahsab3 left a comment
Thanks for documenting this. Made the behavior much clearer. Minor remark
//! use arrow_array::types::Float64Type;
//! use arrow_array::cast::AsArray;
//!
//! # use arrow_array::*;
this just makes the existing example less verbose
///
/// When casting from a timestamp without timezone to a timestamp with
/// timezone, the cast kernel treats the underlying timestamp values as being in
/// UTC and adjusts them to the provided timezone.
I don't think this is correct: it interprets the timestamp as being in the destination timezone and then adjusts the value to UTC as required.
From the docs on DataType
One possibility is to assume that the original timestamp values are relative to the epoch of the timezone being set; timestamp values should then adjusted to the Unix epoch (for example, changing the timezone from empty to “Europe/Paris” would require converting the timestamp values from “Europe/Paris” to “UTC”, which seems counter-intuitive but is nevertheless correct).
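To make the arithmetic concrete, here is a minimal sketch of the behavior described in this thread. It is a plain-Rust model, not the arrow cast kernel: the function names `cast_naive_to_tz`, `cast_tz_to_naive`, and `round_trip` are hypothetical, and the timezone is simplified to a fixed UTC offset in seconds (an assumption; real zones such as "Europe/Paris" have offsets that vary with daylight saving time).

```rust
/// Seconds in one hour, used as an illustrative fixed UTC offset (+01:00).
const HOUR: i64 = 3_600;

/// Hypothetical model of casting a timestamp without timezone to a
/// timestamp with timezone: the input is read as wall-clock time in the
/// destination zone, so the stored value is shifted to UTC by subtracting
/// that zone's offset.
fn cast_naive_to_tz(naive_secs: i64, utc_offset_secs: i64) -> i64 {
    naive_secs - utc_offset_secs
}

/// Hypothetical model of casting a timestamp with timezone back to a
/// timestamp without timezone, as documented in this PR: the stored
/// value is returned unchanged.
fn cast_tz_to_naive(utc_secs: i64) -> i64 {
    utc_secs
}

/// Round trip: without timezone -> with timezone -> without timezone.
/// For any non-zero offset the result differs from the input.
fn round_trip(naive_secs: i64, utc_offset_secs: i64) -> i64 {
    cast_tz_to_naive(cast_naive_to_tz(naive_secs, utc_offset_secs))
}
```

With an input of 1_609_459_200 (2021-01-01T00:00:00 read as wall-clock time) and an offset of `HOUR`, `cast_naive_to_tz` yields 1_609_455_600, which is 2020-12-31T23:00:00 UTC. Under this model `round_trip` returns that same shifted value rather than the original, which is the source of the confusion discussed above.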
Which issue does this PR close?
Closes #.
Rationale for this change
The behavior of casting timestamps to/from timezones is quite subtle, and I spent quite some time testing it out in the context of apache/datafusion#10602
Thus I thought it would be good to document this behavior in the arrow crate itself so I don't have to do that next time, and hopefully others can benefit from it as well.
What changes are included in this PR?
Document, with examples, what happens when one casts timestamp with timezone to/from timestamp without timezone
Are there any user-facing changes?
Documentation. No changes to code