Conversation
| } | ||
|
|
||
| #[cfg(test)] | ||
| mod tests { |
There was a problem hiding this comment.
those tests were added when there was a custom function used for shift_months (https://github.com/apache/arrow-rs/pull/2031/changes#diff-39c31d552939b8478fb676b059d5920a9c71b6659600348681c9ed7c03ea8485R52). Now that we use chrono functions, those tests are not necessary anymore
There was a problem hiding this comment.
However, this PR also adds add_months_date that is quite similar to shift_months -- do you think it is valuable to port the test to use add_months_date instead?
That would make it easier to ensure we are not changing behavior (esp with respect to over flow handling / truncation)
There was a problem hiding this comment.
sounds good, added them back. Removed the datetime ones as we don't need them and added a couple test for the integer overflow handling
scovich
left a comment
There was a problem hiding this comment.
Approach generally looks good, tho it's a lot harder to review a PR that mixes refactoring with bug fixes.
NOTE: Github isn't handling the diff very gracefully. I'm not sure why -- git diff -b handles it just fine?? Discussion here 🍿
| let epoch = NaiveDate::from_ymd_opt(1970, 1, 1).unwrap(); | ||
| let ms_per_day = 24 * 60 * 60 * 1000i64; | ||
|
|
||
| // Define the boundary dates using NaiveDate::from_ymd_opt | ||
| let max_valid_date = NaiveDate::from_ymd_opt(262142, 12, 31).unwrap(); | ||
| let min_valid_date = NaiveDate::from_ymd_opt(-262143, 1, 1).unwrap(); | ||
|
|
||
| // Calculate their millisecond values from epoch | ||
| let max_valid_millis = (max_valid_date - epoch).num_milliseconds(); | ||
| let min_valid_millis = (min_valid_date - epoch).num_milliseconds(); | ||
| let max_valid_millis = (MAX_VALID_DATE - EPOCH).num_milliseconds(); | ||
| let min_valid_millis = (MIN_VALID_DATE - EPOCH).num_milliseconds(); |
There was a problem hiding this comment.
Changes like this seem unrelated, better to split out as a "prefactor" PR that can merge quickly? Then this PR would be easier to review by not mixing refactor with bug fixes and API changes.
| /// # Arguments | ||
| /// | ||
| /// * `i` - The Date32Type to convert | ||
| #[deprecated(since = "58.0.0", note = "Use to_naive_date_opt instead.")] |
There was a problem hiding this comment.
I wonder if we should split out the #[deprecated] annotations to a separate PR so this one can merge immediately while the other waits for next major release?
There was a problem hiding this comment.
(FWIW the next release will be major so it is less of a concern)
| let res = res.add(Duration::try_milliseconds(ms as i64).unwrap()); | ||
| Date32Type::from_naive_date(res) | ||
| let res = Date32Type::to_naive_date_opt(date)?; | ||
| let res = res.checked_add_signed(Duration::try_days(days as i64)?)?; |
There was a problem hiding this comment.
I checked that add just calls into checked_add_signed and panics. Thus this is equivalent
https://docs.rs/chrono/latest/src/chrono/naive/date/mod.rs.html#1952
impl Add for NaiveDate {
1953 type Output = NaiveDate;
1954
1955 #[inline]
1956 fn add(self, rhs: TimeDelta) -> NaiveDate {
1957 self.checked_add_signed(rhs).expect("NaiveDate + TimeDelta overflowed")
1958 }
| Date32Type, | ||
| as_primitive, | ||
| Date32Type::to_naive_date, | ||
| |v| Date32Type::to_naive_date_opt(v).unwrap(), |
There was a problem hiding this comment.
This used to panic implicitly and now the panic'ing is explicit. I think that is an improvement
| } | ||
|
|
||
| #[cfg(test)] | ||
| mod tests { |
There was a problem hiding this comment.
However, this PR also adds add_months_date that is quite similar to shift_months -- do you think it is valuable to port the test to use add_months_date instead?
That would make it easier to ensure we are not changing behavior (esp with respect to over flow handling / truncation)
| // Date64Type::to_naive_date_opt has boundaries determined by NaiveDate's supported range. | ||
| // The valid date range is from January 1, -262143 to December 31, 262142 (Gregorian calendar). | ||
|
|
||
| let epoch = NaiveDate::from_ymd_opt(1970, 1, 1).unwrap(); |
There was a problem hiding this comment.
thank you for cleaning up this repetition
|
Thank you for your contribution @cht42 🙏 |
# Which issue does this PR close? - Closes #N/A (internal refactoring - no issue) # Rationale for this change Noticed while writing tests in #9144, that the current tests could be re-written to be easier to read/re-use.⚠️ FIY, I used claude to refactor those tests, I read the changes and we are keeping the same test cases. The Date64 boundary tests in `arrow-arith/src/numeric.rs` had significant code duplication. Each test function for Date64 operations (`to_naive_date_opt`, `add_year_months_opt`, `subtract_year_months_opt`, `add_day_time_opt`, `subtract_day_time_opt`, `add_month_day_nano_opt`, `subtract_month_day_nano_opt`) repeated similar setup code and boundary checks, making the test suite harder to maintain and extend. # What changes are included in this PR? This PR refactors the Date64 boundary tests by: 1. **Introducing shared constants** for commonly used values: - `MAX_VALID_DATE`, `MIN_VALID_DATE`, `EPOCH` - NaiveDate constants for chrono's valid date range 2. **Adding utility functions** to reduce repetition: - `date_to_millis(year, month, day)` - converts a date to milliseconds from epoch - `max_valid_millis()`, `min_valid_millis()`, `year_2000_millis()` - common millisecond values 3. **Consolidating similar test patterns** into parameterized helper functions: - `test_year_month_op()` - tests `add_year_months_opt` and `subtract_year_months_opt` - `test_day_time_op()` - tests `add_day_time_opt` and `subtract_day_time_opt` - `test_month_day_nano_op()` - tests `add_month_day_nano_opt` and `subtract_month_day_nano_opt` 4. **Reducing 8 separate test functions to 4** while maintaining the same test coverage Net result: **-297 lines** (163 added, 460 removed) with equivalent functionality. # Are these changes tested? Yes - this is a refactoring of existing tests. The same boundary conditions and edge cases are still tested, just organized more efficiently. Running `cargo test` confirms all tests pass. # Are there any user-facing changes? No. This is an internal test refactoring with no changes to public APIs or functionality. --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
|
updated with the new tests, had to change them a bit again to account for the types difference |
# Which issue does this PR close? - Closes #N/A (internal refactoring - no issue) # Rationale for this change Noticed while writing tests in apache#9144, that the current tests could be re-written to be easier to read/re-use.⚠️ FIY, I used claude to refactor those tests, I read the changes and we are keeping the same test cases. The Date64 boundary tests in `arrow-arith/src/numeric.rs` had significant code duplication. Each test function for Date64 operations (`to_naive_date_opt`, `add_year_months_opt`, `subtract_year_months_opt`, `add_day_time_opt`, `subtract_day_time_opt`, `add_month_day_nano_opt`, `subtract_month_day_nano_opt`) repeated similar setup code and boundary checks, making the test suite harder to maintain and extend. # What changes are included in this PR? This PR refactors the Date64 boundary tests by: 1. **Introducing shared constants** for commonly used values: - `MAX_VALID_DATE`, `MIN_VALID_DATE`, `EPOCH` - NaiveDate constants for chrono's valid date range 2. **Adding utility functions** to reduce repetition: - `date_to_millis(year, month, day)` - converts a date to milliseconds from epoch - `max_valid_millis()`, `min_valid_millis()`, `year_2000_millis()` - common millisecond values 3. **Consolidating similar test patterns** into parameterized helper functions: - `test_year_month_op()` - tests `add_year_months_opt` and `subtract_year_months_opt` - `test_day_time_op()` - tests `add_day_time_opt` and `subtract_day_time_opt` - `test_month_day_nano_op()` - tests `add_month_day_nano_opt` and `subtract_month_day_nano_opt` 4. **Reducing 8 separate test functions to 4** while maintaining the same test coverage Net result: **-297 lines** (163 added, 460 removed) with equivalent functionality. # Are these changes tested? Yes - this is a refactoring of existing tests. The same boundary conditions and edge cases are still tested, just organized more efficiently. Running `cargo test` confirms all tests pass. # Are there any user-facing changes? No. This is an internal test refactoring with no changes to public APIs or functionality. --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
| } | ||
|
|
||
| #[test] | ||
| fn test_shift_months_datetime() { |
There was a problem hiding this comment.
we don't need those datetime tests anymore, add_months_date only accepts NaiveDate
## Which issue does this PR close? <!-- We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123. --> - From this discussion: apache#19711 (comment) ## Rationale for this change <!-- Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed. Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes. --> In above PR we didn't enable a test because upstream arrow-rs had a panic bug. This is now fixed: - apache/arrow-rs#9144 So now we can enable this test again to assert expected error. ## What changes are included in this PR? <!-- There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR. --> ## Are these changes tested? <!-- We typically require tests for all PRs in order to: 1. Prevent the code from being accidentally broken by subsequent changes 2. Serve as another way to document the expected behavior of the code If tests are not included in your PR, please explain why (for example, are they covered by existing tests)? --> ## Are there any user-facing changes? <!-- If there are user-facing changes then we may require documentation to be updated before approving the PR. --> <!-- If there are any breaking changes to public APIs, please add the `api change` label. -->
Which issue does this PR close?
This PR does the same thing as what was done in #7737 to handle overflow errors gracely instead of panicking.
Rationale for this change
What changes are included in this PR?
Are these changes tested?
yes
Are there any user-facing changes?