chore: upgrade datafusion to 50.0.0 and arrow to 56.1#4751
chore: upgrade datafusion to 50.0.0 and arrow to 56.1#4751westonpace merged 4 commits intolance-format:mainfrom
Conversation
bbfcabc to
a29bc29
Compare
|
@ddupg Could you help fix the build issues? |
@LuQQiu Sure, now blocked in datafusion-python is not published to pypi, but it may take a few weeks apache/datafusion#16799 (comment) |
Hmm...we use the FFI for our boundary between datafusion-python and lance rust which in theory has a stable ABI. Do we know why we have a hard pin on the datafusion-python version? Do the tests pass if we loosen the pin? |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4751 +/- ##
==========================================
+ Coverage 80.85% 81.38% +0.52%
==========================================
Files 333 333
Lines 131944 130020 -1924
Branches 131944 130020 -1924
==========================================
- Hits 106687 105811 -876
+ Misses 21508 20653 -855
+ Partials 3749 3556 -193
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| let mut added_byte_size = stats | ||
| .num_rows | ||
| .map(|n| (n * 8).max(64)) | ||
| .map(|n| n * 8) |
Related to #4577, maybe Json type has a requirement for datafusion version.
The currently failing tests are not related to datafusion version, I'm trying to fix it. |
|
The failing test python/tests/test_scalar_index.py::test_null_handling is seems to be caused by a bug of DataFusion rust 50.0.0, and I have raised an issue apache/datafusion#17681 to the DataFusion |
This is not a bug with DataFusion, it seems to be because Lance using indexes and DataFusion have different logic on how to handle NULL. After apache/datafusion#17088, datafusion simplifies the handling of NULL expressions, which affects the results of the test python/tests/test_scalar_index.py::test_null_handling. In rust/lance-index/src/scalar/expression.rs:create_filter_plan https://github.com/lancedb/lance/blob/5c60975b2c032314304ca1d38865d6eefde4d790/rust/lance-index/src/scalar/expression.rs#L1560, if the filter executed is but after and then Can @westonpace please give me some help on how to fix this test? |
|
This appears to be a duplicate of my PR: #4778 but since you're actively working on the issue I was just looking at with the null unit test failure, I can close my PR in favor of this one. |
|
Thank you @timsaucer for sharing this. I also chatted with @ddupg offline a bit, sorry I forgot to share the analysis here. I think we did not handle the case of a In the past because the right hand side is a |
I do want to point out that "fixing" the issue by handling the literal null boolean will still give different results than an equivalent SQL command. From a SQL perspective not returning the row with the null IS the correct behavior, even though I think it is counter intuitive. |
|
Thank @jackye1995 @timsaucer both for sharing and thinking, which helped me a lot
Agreed. My personal opinion is that root cause stems from the different semantics of handling NULL when indexing is enabled or not. This is weird behavior, so I think we should make semantics consistent first. I proposed a draft #4812 to solve this problem. This will fix the currently failed test. Secondly, the DataFusion upgrade exposed a situation where the index is not working as expected. DF 50 optimizes Expr of NULL handling, for example, the Expr |
|
I agree that we should migrate to the SQL behavior and I agree with @ddupg's summary. Shall I make a PR to fix the second issue? |
|
Also, there is a third issue, reported here, which will unfortunately be quite a bit trickier to fix. I don't think that one needs to block the DF upgrade however. |
c6493af to
a28bf28
Compare
|
looks like there are some additional errors, any clue? |
This unusual local path |
|
hmm this is quite strange, I don't see any change that could affect the windows path handling... any clue? @westonpace @wjones127 @timsaucer |
I think I've seen this happen in CI in this repo before, but having trouble finding an exact example from my PRs. In the past rerunning CI has resolved it. |
1597a63 to
ccac35f
Compare
|
These new failures I'm also seeing on PRs that are completely unrelated. example: https://github.com/lancedb/lance/actions/runs/18080818538/job/51443808660?pr=4834 |
Agreed, the test_zonemap_zone_size test was fixed on main |
|
I'll see if I can make sense of the Windows failures or at least see what changed |
9ed1a11 to
0cd5af9
Compare
|
I rebased the main branch code and reran the tests multiple times. This error consistently occurs in Windows. |
|
Tracked down the root cause at least. Still working on fix. The upgrade to DF caused a transitive dependency ( Anyways, I'm sure there is just some magic incantation of path manipulation that has to happen. My trial and error continues 🫠 |
| version = "2.5.7" | ||
| source = "registry+https://github.com/rust-lang/crates.io-index" | ||
| checksum = "32f8b686cadd1473f4bd0117a5d28d36b1ade384ea9b5069a1c40aefed7fda60" | ||
| checksum = "08bc136a29a3d1758e07a9cca267be308aeebf5cfd5a10f3f67ab2097683ef5b" | ||
| dependencies = [ | ||
| "form_urlencoded", | ||
| "idna", | ||
| "percent-encoding", | ||
| "serde", |
There was a problem hiding this comment.
Since there are issues filed upstream in both object-store and url crates, can we simply revert these lines of the lock file? Or we can pin url in the cargo.toml to 2.5.4
There was a problem hiding this comment.
Datafusion asks for 2.5.7 specifically so we can't revert sadly 😢
|
I've got a fix (more of a workaround) in progress that avoids these tricky paths by reworking how we do temporary directories: #4860 |
| enable_options_value_normalization: false, | ||
| collect_spans: false, | ||
| map_string_types_to_utf8view: false, | ||
| default_null_ordering: NullOrdering::NullsMax, |
There was a problem hiding this comment.
Do we know if this was the same as before?
There was a problem hiding this comment.
I saw the PR apache/datafusion#16963, nulls_max is the default behavior.
| version = "2.5.7" | ||
| source = "registry+https://github.com/rust-lang/crates.io-index" | ||
| checksum = "32f8b686cadd1473f4bd0117a5d28d36b1ade384ea9b5069a1c40aefed7fda60" | ||
| checksum = "08bc136a29a3d1758e07a9cca267be308aeebf5cfd5a10f3f67ab2097683ef5b" | ||
| dependencies = [ | ||
| "form_urlencoded", | ||
| "idna", | ||
| "percent-encoding", | ||
| "serde", |
There was a problem hiding this comment.
Datafusion asks for 2.5.7 specifically so we can't revert sadly 😢
|
The Windows issue is fixed in #4860 if anyone is up for a review (I'm shamelessly 🎣 for reviewers) |
|
Ok, merged. If you rebase then hopefully you are good to go. |
Change-Id: Iedc2a3133d2caf8f9630edde5723618a244cb958
Change-Id: Ic0a4e492eb128316df95d04e3780547100b93de1
Change-Id: Iddf384fafe1a59157b780eea0ffa8e037599d372
Change-Id: I833aa60aee64fd06907c1de10a973d3db7c93ec3
0cd5af9 to
77f4aa7
Compare
Thank @westonpace very much for the fix, all the tests have passed after I rebased |
|
Could @jackye1995 @westonpace @timsaucer please help review this? |
jackye1995
left a comment
There was a problem hiding this comment.
thank you for pushing this through!
) This PR is to update the following dependencies: - DataFusion to 50.0.0 - Arrow to 56.1 I am working on supporting GEO type lance-format#4678 , which is blocked by the DataFusion upgrade. Recently, DataFusion released 50 apache/datafusion#16799 , so follow up on the upgrade Closes lance-format#4753


This PR is to update the following dependencies:
I am working on supporting GEO type #4678 , which is blocked by the DataFusion upgrade. Recently, DataFusion released 50 apache/datafusion#16799 , so follow up on the upgrade
Closes #4753