feat: support non-shared centroid vector index builds #6296
PR Review: feat: support non-shared centroid vector index builds

Clean PR overall: the plumbing of

Issues

P1: Memory risk in fragment-scoped nullable sampling ( When

// current fragment_ids path for nullable:
let batch = scan_all_training_data(dataset, column, is_nullable, fragment_ids).await?;
// vs non-fragment path:
sample_nullable_fsl(column, sample_size_hint, byte_width, vector_field, scan).await

P1: Dead guard condition ( The match arm

🤖 Generated with Claude Code
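
For context on the memory concern in the first P1 item: the fragment path shown above scans all training data, while the non-fragment path passes a `sample_size_hint`. The review text is truncated and does not say which fix it proposes. One general way to keep memory bounded while scanning is reservoir sampling over the batch stream. The following is a minimal, dependency-free sketch of that technique only; `sample_non_null`, `XorShift64`, and the `Vec<Option<Vec<f32>>>` batch shape are hypothetical stand-ins and not Lance's actual types or API.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crates.
/// (Assumption: any uniform RNG would do; this is not what Lance uses.)
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift state must be non-zero.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Value in `0..bound` (slight modulo bias, acceptable for a sketch).
    fn below(&mut self, bound: usize) -> usize {
        (self.next_u64() % bound as u64) as usize
    }
}

/// Reservoir-sample up to `sample_size` non-null vectors from a stream of
/// batches. Memory stays O(sample_size) no matter how many rows are scanned.
/// Each batch is a hypothetical stand-in for a nullable vector column chunk.
fn sample_non_null<I>(batches: I, sample_size: usize, seed: u64) -> Vec<Vec<f32>>
where
    I: IntoIterator<Item = Vec<Option<Vec<f32>>>>,
{
    let mut reservoir: Vec<Vec<f32>> = Vec::with_capacity(sample_size);
    if sample_size == 0 {
        return reservoir;
    }
    let mut rng = XorShift64::new(seed);
    let mut seen = 0usize;
    for batch in batches {
        // `flatten` skips the null (None) rows.
        for row in batch.into_iter().flatten() {
            seen += 1;
            if reservoir.len() < sample_size {
                // Fill phase: keep the first `sample_size` non-null rows.
                reservoir.push(row);
            } else {
                // Replace phase: keep the new row with probability sample_size / seen.
                let j = rng.below(seen);
                if j < sample_size {
                    reservoir[j] = row;
                }
            }
        }
    }
    reservoir
}
```

When the stream holds fewer non-null rows than `sample_size`, the function simply returns all of them in scan order, so small fragments behave the same as a full scan.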
ACTION NEEDED

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification. For details on the error please inspect the "PR Title Check" action.
…troid-index-build

# Conflicts:
#	python/src/indices.rs
#	rust/lance/src/index/vector/utils.rs
Codecov Report

❌ Patch coverage is
This PR builds on #6294 and exposes the remaining pieces needed to construct non-shared centroid vector index builds.
It adds fragment-scoped IVF/PQ training in Rust and exports the same training flow to Python, so users can train per-segment artifacts and feed them into the existing distributed build path.
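
To illustrate what "non-shared centroid" means here, the toy sketch below trains a separate centroid set for each group of rows instead of one global set. It is only a conceptual model under stated assumptions: plain Lloyd's k-means stands in for IVF training, the `u32` key is a hypothetical segment identifier, and `train_per_segment`, `kmeans`, and `nearest` are illustrative names, not functions from this PR or from Lance.

```rust
use std::collections::BTreeMap;

/// Squared Euclidean distance between two vectors of equal length.
fn dist2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Index of the centroid closest to `v` (requires at least one centroid).
fn nearest(v: &[f32], centroids: &[Vec<f32>]) -> usize {
    let mut best = 0;
    for (i, c) in centroids.iter().enumerate() {
        if dist2(v, c) < dist2(v, &centroids[best]) {
            best = i;
        }
    }
    best
}

/// Plain Lloyd's k-means, seeded with the first `k` vectors.
/// (Assumption: a stand-in for IVF training, chosen for brevity.)
fn kmeans(data: &[Vec<f32>], k: usize, iters: usize) -> Vec<Vec<f32>> {
    let k = k.min(data.len());
    let mut centroids: Vec<Vec<f32>> = data[..k].to_vec();
    if k == 0 {
        return centroids;
    }
    let dim = data[0].len();
    for _ in 0..iters {
        let mut sums = vec![vec![0.0f32; dim]; k];
        let mut counts = vec![0usize; k];
        // Assignment step: accumulate each vector into its nearest centroid.
        for v in data {
            let c = nearest(v, &centroids);
            counts[c] += 1;
            for (s, x) in sums[c].iter_mut().zip(v) {
                *s += *x;
            }
        }
        // Update step: move each non-empty centroid to the mean of its members.
        for c in 0..k {
            if counts[c] > 0 {
                for s in sums[c].iter_mut() {
                    *s /= counts[c] as f32;
                }
                centroids[c] = sums[c].clone();
            }
        }
    }
    centroids
}

/// Train one independent centroid set per (hypothetical) segment id.
fn train_per_segment(
    rows: &[(u32, Vec<f32>)],
    k: usize,
    iters: usize,
) -> BTreeMap<u32, Vec<Vec<f32>>> {
    let mut by_segment: BTreeMap<u32, Vec<Vec<f32>>> = BTreeMap::new();
    for (segment, v) in rows {
        by_segment.entry(*segment).or_default().push(v.clone());
    }
    by_segment
        .into_iter()
        .map(|(segment, data)| (segment, kmeans(&data, k, iters)))
        .collect()
}
```

Because each segment's centroids come only from that segment's rows, two segments with different data distributions end up with different centroid sets, which is the property the per-segment artifacts rely on.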