feat(spark): implement add_months function#19711
Conversation
|
|
||
| query D | ||
| SELECT date_sub('2016-07-30'::date, 0::int); | ||
| SELECT date_add('2016-07-30'::date, 0::int); |
| @@ -15,13 +15,38 @@ | |||
| # specific language governing permissions and limitations | |||
| # under the License. | |||
|
|
|||
There was a problem hiding this comment.
Does spark have a behaviour for overflow/result date too large that we need to consider?
There was a problem hiding this comment.
saprk will error with ARITHMETIC_OVERFLOW. I tried adding the following test case SELECT add_months('2016-07-30'::date, 2147483647::int); and im getting External error: task 35 panicked with message "NaiveDate + Months out of range". So in a way we have the same behavior as spark but its not testable ?
There was a problem hiding this comment.
Oh this is interesting; panicking is definitely not good, and I can reproduce that on main too:
DataFusion CLI v51.0.0
> select '2016-07-30'::date + arrow_cast(2147483647::int, 'Interval(YearMonth)');
thread 'main' (15797910) panicked at /Users/jeffrey/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/chrono-0.4.42/src/naive/date/mod.rs:2005:41:
`NaiveDate + Months` out of range
note: run with `RUST_BACKTRACE=1` environment variable to display a backtraceWe should raise this as a separate issue as this should be erroring not panicking
There was a problem hiding this comment.
traced it back to https://github.com/chronotope/chrono/blob/1c0b8f011ab2f2e53c195df1866a1fb4c7fd193a/src/naive/date/mod.rs#L2034
0: __rustc::rust_begin_unwind
at /rustc/ded5c06cf21d2b93bffd5d884aa6e96934ee4234/library/std/src/panicking.rs:698:5
1: core::panicking::panic_fmt
at /rustc/ded5c06cf21d2b93bffd5d884aa6e96934ee4234/library/core/src/panicking.rs:80:14
2: core::panicking::panic_display
at /rustc/ded5c06cf21d2b93bffd5d884aa6e96934ee4234/library/core/src/panicking.rs:264:5
3: core::option::expect_failed
at /rustc/ded5c06cf21d2b93bffd5d884aa6e96934ee4234/library/core/src/option.rs:2183:5
4: core::option::Option<T>::expect
at /Users/chuet/.rustup/toolchains/1.92.0-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/option.rs:970:21
5: <chrono::naive::date::NaiveDate as core::ops::arith::Add<chrono::month::Months>>::add
at /Users/chuet/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/chrono-0.4.42/src/naive/date/mod.rs:2005:41
6: arrow_array::delta::shift_months
at /Users/chuet/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-array-57.1.0/src/delta.rs:36:30
7: arrow_array::types::Date32Type::add_year_months
at /Users/chuet/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-array-57.1.0/src/types.rs:934:25
8: <arrow_array::types::Date32Type as arrow_arith::numeric::DateOp>::add_year_month
at /Users/chuet/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-arith-57.1.0/src/numeric.rs:554:12
9: arrow_arith::numeric::date_op::{{closure}}
at /Users/chuet/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-arith-57.1.0/src/numeric.rs:775:47
10: arrow_arith::arity::try_binary_no_nulls
at /Users/chuet/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-arith-57.1.0/src/arity.rs:385:35
11: arrow_arith::arity::try_binary
at /Users/chuet/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-arith-57.1.0/src/arity.rs:274:9
12: arrow_arith::numeric::date_op
at /Users/chuet/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-arith-57.1.0/src/numeric.rs:295:46
13: arrow_arith::numeric::arithmetic_op
at /Users/chuet/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-arith-57.1.0/src/numeric.rs:249:24
14: arrow_arith::numeric::add_wrapping
at /Users/chuet/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-arith-57.1.0/src/numeric.rs:40:5
15: core::ops::function::Fn::call
at /Users/chuet/.rustup/toolchains/1.92.0-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/ops/function.rs:79:5
16: datafusion_physical_expr_common::datum::apply
at /Users/chuet/Projects/foundry/datafusion/datafusion/physical-expr-common/src/datum.rs:51:25
17: <datafusion_physical_expr::expressions::binary::BinaryExpr as datafusion_physical_expr_common::physical_expr::PhysicalExpr>::evaluate
There was a problem hiding this comment.
opened apache/arrow-rs#9144 to fix the panic in arrow
There was a problem hiding this comment.
@Jefffrey should we merge this and add the test back after next arrow upgrade ? or wait for the upgrade first ?
There was a problem hiding this comment.
Let's comment out the test so we can merge this PR; can re-enable the test later when the upstream fix comes in
a0282fe to
44892cd
Compare
|
Thanks @cht42 |
## Which issue does this PR close? <!-- We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes apache#123` indicates that this PR will close issue apache#123. --> - Part of apache#15914 - Closes apache#19710 ## Rationale for this change Implementation of spark `add_months` function. ## What changes are included in this PR? <!-- There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR. --> ## Are these changes tested? yes ## Are there any user-facing changes? yes
## Which issue does this PR close? <!-- We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123. --> - From this discussion: apache#19711 (comment) ## Rationale for this change <!-- Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed. Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes. --> In above PR we didn't enable a test because upstream arrow-rs had a panic bug. This is now fixed: - apache/arrow-rs#9144 So now we can enable this test again to assert expected error. ## What changes are included in this PR? <!-- There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR. --> ## Are these changes tested? <!-- We typically require tests for all PRs in order to: 1. Prevent the code from being accidentally broken by subsequent changes 2. Serve as another way to document the expected behavior of the code If tests are not included in your PR, please explain why (for example, are they covered by existing tests)? --> ## Are there any user-facing changes? <!-- If there are user-facing changes then we may require documentation to be updated before approving the PR. --> <!-- If there are any breaking changes to public APIs, please add the `api change` label. -->
Which issue does this PR close?
datafusion-sparkSpark Compatible Functions #15914Rationale for this change
Implementation of spark
add_monthsfunction.What changes are included in this PR?
Are these changes tested?
yes
Are there any user-facing changes?
yes