perf: offload IVF partition build to CPU pool#5551

Merged
BubbleCal merged 2 commits into main from offload-partition-build
Dec 19, 2025

Conversation

@BubbleCal
Contributor

@BubbleCal BubbleCal commented Dec 19, 2025

before:
image
after:
image

This can make HNSW indexing 30% faster. Other vector index types also benefit, though the gain is less pronounced.

@codecov

codecov Bot commented Dec 19, 2025

Codecov Report

❌ Patch coverage is 90.90909% with 3 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| rust/lance/src/index/vector/builder.rs | 90.90% | 1 Missing and 2 partials ⚠️ |

@github-actions
Contributor

Code Review

Summary: This PR offloads the IVF partition build work (filtering deleted rows, batch processing, and index building) to the dedicated CPU pool using spawn_cpu.

Assessment

The change looks correct. The spawn_cpu utility is already used elsewhere in this file (line 1349) for similar CPU-intensive operations, and the pattern is consistent.

The computation being moved (HashSet operations, filter_record_batch, and Self::build_index) is CPU-intensive and benefits from running on the dedicated CPU thread pool rather than blocking the tokio async runtime.

No blocking issues identified.
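
The offloading pattern described in this review can be illustrated with a minimal, std-only sketch. It is not the code from this PR and not Lance's `spawn_cpu` implementation: `offload_cpu`, `filter_deleted`, and `build_partition` are hypothetical names, plain `u64` row ids stand in for record batches, and a freshly spawned thread stands in for the dedicated CPU pool. In async code the caller would await the result instead of blocking on the channel as done here.

```rust
use std::collections::HashSet;
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a `spawn_cpu`-style helper: runs `work` on a
/// separate thread and returns a receiver that yields its result.
fn offload_cpu<F, T>(work: F) -> mpsc::Receiver<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // A send error only means the caller dropped the receiver.
        let _ = tx.send(work());
    });
    rx
}

/// Drops every row id found in `deleted`, preserving the input order.
/// (Assumption: a simplified analogue of filtering deleted rows.)
fn filter_deleted(row_ids: Vec<u64>, deleted: &HashSet<u64>) -> Vec<u64> {
    row_ids
        .into_iter()
        .filter(|id| !deleted.contains(id))
        .collect()
}

/// Moves the CPU-bound filtering off the calling thread and waits for it.
fn build_partition(row_ids: Vec<u64>, deleted: HashSet<u64>) -> Vec<u64> {
    let rx = offload_cpu(move || filter_deleted(row_ids, &deleted));
    rx.recv().expect("worker thread exited without sending a result")
}
```

Calling `build_partition(vec![1, 2, 3, 4, 5], deleted)` with `deleted` containing 2 and 4 returns the surviving ids in their original order. The closure takes ownership of its inputs, which is what lets the work run independently of the thread that requested it.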

@BubbleCal BubbleCal merged commit a9c0571 into main Dec 19, 2025
32 checks passed
@BubbleCal BubbleCal deleted the offload-partition-build branch December 19, 2025 08:30
wjones127 pushed a commit to wjones127/lance that referenced this pull request Dec 30, 2025
before:
<img width="683" height="44" alt="image"
src="https://github.com/user-attachments/assets/3a2d6375-ef33-40b4-af65-64b6b568c78e"
/>
after:
<img width="745" height="43" alt="image"
src="https://github.com/user-attachments/assets/4c22281d-d291-47eb-bd71-faa7c4b5a100"
/>

This can make HNSW indexing 30% faster. Other vector index types
also benefit, though the gain is less pronounced.

Co-authored-by: Xuanwo <github@xuanwo.io>
jackye1995 pushed a commit to jackye1995/lance that referenced this pull request Jan 21, 2026