feat: support create btree index distributedly #4667
westonpace merged 20 commits into lance-format:main
Conversation
Codecov Report
❌ Patch coverage is
Additional details and impacted files

@@ Coverage Diff @@
##              main    #4667      +/-   ##
==========================================
  Coverage    80.72%   80.73%
==========================================
  Files          321      321
  Lines       124043   124847    +804
==========================================
+ Hits        100131   100792    +661
- Misses       20340    20457    +117
- Partials      3572     3598     +26
@jackye1995 @westonpace @BubbleCal @chenghao-guo Please take a look when you have time, thanks!
```python
    ],
    prefetch_batch: Optional[int] = None,
):
    """
```
Thank you for the refactor. My previous design wasn't thoroughly considered; yours is a significant improvement.
jackye1995 left a comment:
This is an amazing feature! I added some initial comments and will look more into the details tomorrow.
@jackye1995 Thanks for your review; I've addressed all comments.
westonpace left a comment:
This is great. I love the overall approach of adding a train method that takes in some fragments and creates a partial index and then a merge method that will complete all the fragments.
I think we can simplify the prefetch and just use the file reader's prefetch (I have some comments here).
Also, does this merge the batches themselves? For example, if I sort fragment 1 and get a batch with values 10-50, and then I sort fragment 2 and get a batch with values 30-80, is there some code that merges these two batches (maybe the output is a batch with 10-50 and 50-80)? I might be missing it, but maybe it is just emitting 10-50 and then 30-80 in sequence?
Also, if we want to simplify some of the merge code, DataFusion has a SortPreservingMerge operator that can do this for us too.
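The merge-of-sorted-streams behavior being discussed can be sketched in a few lines with Python's `heapq.merge`, used here purely as a stand-in for DataFusion's `SortPreservingMerge` over sorted record batches (the fragment values come from the example above; nothing here is the PR's actual API):

```python
import heapq

# Two sorted streams, one per fragment, as in the example above:
# fragment 1 yields values in 10-50, fragment 2 yields values in 30-80.
frag1 = [10, 25, 40, 50]
frag2 = [30, 45, 60, 80]

# heapq.merge lazily interleaves already-sorted inputs into one globally
# sorted stream -- the same role SortPreservingMerge plays for batches.
merged = list(heapq.merge(frag1, frag2))
print(merged)  # [10, 25, 30, 40, 45, 50, 60, 80]
```

The key property is that each input is already sorted, so the merge only ever compares the current head of each stream.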
I might not fully understand what's going on here. Let me explain the logic using these two fragments:
Ah, I see where I was confused. I thought the partition iterator was yielding batches. Instead it is yielding rows. So the heap is built one row at a time and batches are merged that way. You can ignore that particular comment.
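The row-at-a-time heap described above can be sketched as follows (the iterator shapes and function name are illustrative, not the PR's actual types): seed a min-heap with the head row of each partition iterator, pop the smallest row, and refill the heap from whichever iterator that row came from.

```python
import heapq

def merge_rows(partition_iters):
    """K-way merge where each partition iterator yields rows in sorted order."""
    heap = []
    # Seed the heap with the first row of each partition.
    for idx, it in enumerate(partition_iters):
        row = next(it, None)
        if row is not None:
            heapq.heappush(heap, (row, idx))
    # Repeatedly emit the globally smallest row and refill from its source.
    while heap:
        row, idx = heapq.heappop(heap)
        yield row
        nxt = next(partition_iters[idx], None)
        if nxt is not None:
            heapq.heappush(heap, (nxt, idx))

parts = [iter([1, 4, 7]), iter([2, 5]), iter([3, 6, 8])]
result = list(merge_rows(parts))
print(result)  # [1, 2, 3, 4, 5, 6, 8, 7] is wrong -- prints [1, 2, 3, 4, 5, 6, 7, 8]
```

The heap never holds more than one row per partition, which is what keeps memory flat regardless of total row count.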
@jackye1995 @westonpace @BubbleCal Hi, I've refactored the code based on your feedback. Please review it again when you have time, thanks! I've also noticed that using DataFusion's SortPreservingMerge has slowed down index creation compared to my previous version (the speed can be improved by adjusting the prefetch counts and the number of sub-indexes), but it's still significantly faster than the current single-node creation. Furthermore, the code has been significantly simplified. I think we can further optimize performance in future PRs.
@jackye1995 @westonpace @BubbleCal Gentle ping on this.
westonpace left a comment:
Thanks for working through the reviews; this looks good to me now.
Add support for distributed BTREE index building in ray connector based on @xloya's great work in lance-format/lance#4667
Closed lance-format#4665.

Overall Steps:
1. Create ordered BTree sub-page files / sub-lookup files at the fragment level based on Ray / Daft.
2. Sort and merge the sub-page files using a k-way merge sort algorithm, supporting prefetch of sub-page file data.
3. Output the final lookup file.
4. Commit the final index to the dataset.

Production Test Results:
In a production scenario, using Ray and 50 workers on a string ID field in a dataset of 700 million records, we achieved the following:
1. BTree index build time was reduced from 190 minutes to 19 minutes, a 10x increase in build speed.
2. Peak memory usage on the Ray head node when creating the index was reduced from 90+ GB to 4+ GB, a 95%+ reduction.

---------
Co-authored-by: xloya <xiaojiebao@apache.org>
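The overall pipeline in the description can be sketched end to end in plain Python. Everything here is a hedged toy model, not the PR's actual code: `read_sub_page_file` stands in for reading a sorted Lance sub-page file in prefetchable batches, and the "lookup file" is just a list.

```python
import heapq

def read_sub_page_file(rows, batch_size=2):
    """Illustrative stand-in for streaming a sorted sub-page file in batches
    (the real implementation reads Lance files and can prefetch batches)."""
    for i in range(0, len(rows), batch_size):
        yield rows[i:i + batch_size]

def kway_merge(sub_page_files, batch_size=2):
    """Step 2: k-way merge sort over the sorted sub-page files."""
    def rows(f):
        for batch in read_sub_page_file(f, batch_size):
            yield from batch
    # Each input stream is sorted, so heapq.merge yields a global order.
    yield from heapq.merge(*(rows(f) for f in sub_page_files))

# Step 1 output: each worker wrote one sorted sub-page file per fragment.
sub_pages = [[1, 5, 9], [2, 6], [3, 4, 8]]
# Steps 2-3: merge the sub-page files into the final lookup file.
lookup = list(kway_merge(sub_pages))
print(lookup)  # [1, 2, 3, 4, 5, 6, 8, 9]
```

This shape explains the memory numbers in the description: the head node only ever holds one batch per sub-page file, not the whole sorted dataset.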
…dule (lance-format#4961) This PR brings the distributed index creation functionality (see lance-format#4667, lance-format#4578) to the Java module, aligned with the Python implementation. --------- Co-authored-by: 喆宇 <wxy407679@antgroup.com>