refactor: distributed vector segment build by Xuanwo · Pull Request #6220 · lance-format/lance

Xuanwo · 2026-03-18T06:37:29Z

This refactors distributed vector indexing into a staged segment-build pipeline and exposes the public APIs needed to integrate an external distributed build workflow with Lance. It defines the storage-level model for partial shards, staged planning, built segments, and segment commit, and documents the current distributed indexing flow.

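This PR builds on the segment commit API from #6209. The main changes are organized into five commits:

- test: cover distributed vector segment build (`a86274da2`)
- refactor: internalize distributed vector segment build (`1e5f0e15b`)
- feat: add public vector segment builder API (`691cecb9a`)
- feat: add Python vector segment builder API (`a07ef6144`)
- docs: document distributed vector segment build (`0a3f230d7`)

Follow-up fixes:

- fix: expose python create_index_uncommitted (`c1d3b1666`)
- fix: format python bindings (`b86f91c4d`)
- fix: format python segment builder bindings (`bfc9e63a0`)

Please review accordingly.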

Guide: https://github.com/lance-format/lance/blob/xuanwo/vector-staging-merge-internal/docs/src/guide/distributed_indexing.md

github-actions · 2026-03-18T06:40:52Z

PR Review: refactor: internalize vector staging merge flow

Overall: Well-structured refactor that cleanly separates planning from execution for distributed vector index finalize. Good test coverage across IVF subtypes and error cases. A few items to consider:

P1: `load_fragment_bitmap_from_storage` scans all partitions and all row IDs

load_fragment_bitmap_from_storage (ivf.rs ~line 2598) iterates every partition and every row ID to reconstruct the fragment bitmap. For large indices with millions of vectors, this could be very expensive during plan_staging_segments (which calls load_partial_vector_segment per shard) and also during load_vector_segment in commit_existing_index. This happens at planning time, not just at merge time.

Consider whether the fragment bitmap could be stored as a sidecar in the partial shard directory during build, or at minimum document the cost tradeoff and ensure callers are aware. For the legacy compat path (commit_existing_index → load_vector_segment) this could be a regression if the index is large.
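To make the cost tradeoff concrete, here is a minimal Python sketch, not Lance's implementation. It assumes each row ID is a 64-bit row address whose upper 32 bits hold the fragment ID; the function names and the `fragments.json` sidecar file are hypothetical and exist only for illustration.

```python
import json
import tempfile
from pathlib import Path


def fragment_ids_from_row_ids(partitions):
    """Full scan: visits every row ID in every partition (O(total rows))."""
    fragments = set()
    for row_ids in partitions:
        for row_id in row_ids:
            # Assumption: the upper 32 bits of a row address are the fragment ID.
            fragments.add(row_id >> 32)
    return fragments


def write_sidecar(shard_dir, fragments):
    """Persist the fragment set once, at build time (hypothetical file name)."""
    path = Path(shard_dir) / "fragments.json"
    path.write_text(json.dumps(sorted(fragments)))
    return path


def read_sidecar(shard_dir):
    """Planning-time read: cost scales with the number of fragments, not rows."""
    return set(json.loads((Path(shard_dir) / "fragments.json").read_text()))


def make_row_id(fragment_id, offset):
    """Build a row address under the same assumed layout."""
    return (fragment_id << 32) | offset


partitions = [
    [make_row_id(0, 1), make_row_id(0, 7), make_row_id(2, 3)],
    [make_row_id(2, 9), make_row_id(5, 0)],
]
scanned = fragment_ids_from_row_ids(partitions)

with tempfile.TemporaryDirectory() as shard_dir:
    write_sidecar(shard_dir, scanned)
    loaded = read_sidecar(shard_dir)
```

The scan and the sidecar yield the same set; the difference is that the sidecar is paid for once at build time instead of at every planning or commit call.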

P1: `reset_final_segment_dir` deletes before writing — no atomicity

merge_staging_segment_to_dir calls reset_final_segment_dir (which does remove_dir_all) on the final_dir before writing new content. If the process crashes between delete and write completion, the segment is lost. The legacy same-dir path is especially fragile: it deletes the temp dir, copies files to final_dir, then deletes the partial shards — multiple non-atomic steps.

For object stores (S3/GCS) this is somewhat inherent, but for local filesystem consider whether an atomic rename could be used instead, or at least note the non-atomic window in comments.
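As an illustration of the local-filesystem suggestion, the following Python sketch writes the new content into a sibling temporary directory and only then swaps it into place. `publish_segment_dir` and the file names are hypothetical, and this is a sketch rather than the PR's code. Note that replacing an existing directory still takes two renames, so a short window remains in which the final path is absent, but the old content is never deleted before the new content is fully written.

```python
import os
import shutil
import tempfile
from pathlib import Path


def publish_segment_dir(final_dir, files):
    """Write `files` (name -> bytes) to a temp dir, then rename it into place."""
    final_dir = Path(final_dir)
    tmp_dir = Path(
        tempfile.mkdtemp(prefix=final_dir.name + ".tmp-", dir=final_dir.parent)
    )
    backup = final_dir.with_name(final_dir.name + ".old")
    try:
        for name, data in files.items():
            (tmp_dir / name).write_bytes(data)
        had_previous = final_dir.exists()
        if had_previous:
            # Move the old segment aside instead of deleting it up front.
            os.rename(final_dir, backup)
        try:
            os.rename(tmp_dir, final_dir)
        except OSError:
            if had_previous:
                # Restore the previous segment if the swap failed.
                os.rename(backup, final_dir)
            raise
    except BaseException:
        shutil.rmtree(tmp_dir, ignore_errors=True)
        raise
    if had_previous:
        shutil.rmtree(backup)


with tempfile.TemporaryDirectory() as root:
    final = Path(root) / "segment"
    publish_segment_dir(final, {"index.idx": b"v1"})
    publish_segment_dir(final, {"index.idx": b"v2", "aux.idx": b"aux"})
    published = {p.name: p.read_bytes() for p in final.iterdir()}
    leftovers = [p.name for p in Path(root).iterdir() if p.name != "segment"]
```

A crash during the write phase leaves only a stray temporary directory behind; the previously published segment stays readable.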

Minor observations

- `copy_partial_segment_contents` does serial `object_store.copy()` per file. For cloud stores with many files per shard, parallel copies (e.g., `futures::stream::iter(...).buffered(N)`) would be faster.
- `plan_staging_segments` processes shards sequentially in a `for` loop. Since each `load_partial_vector_segment` does I/O, these could be loaded concurrently with `try_join_all` or similar.
- The `chrono` dependency added to `lance-index` is only used for `IndexSegment`'s `created_at` field in the commit path (in `lance/src/index.rs`), not in `lance-index` itself. It doesn't appear to be used in `types.rs`. Is this dependency needed in `lance-index`?
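The first two observations both come down to running independent I/O tasks with bounded concurrency. The sketch below shows that pattern in Python with `asyncio`; it is similar in spirit to the Rust combinators named above but is not a translation of them. `run_bounded` and `fake_copy` are hypothetical names, and `fake_copy` merely stands in for one object-store copy or one shard load.

```python
import asyncio


async def run_bounded(items, worker, limit):
    """Run `worker(item)` for every item with at most `limit` in flight.

    Results are returned in input order.
    """
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item):
        async with semaphore:
            return await worker(item)

    return await asyncio.gather(*(guarded(item) for item in items))


async def demo():
    in_flight = 0
    peak = 0

    async def fake_copy(name):
        # Stand-in for one copy or one shard load (hypothetical).
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)
        in_flight -= 1
        return f"copied:{name}"

    names = [f"file-{i}" for i in range(10)]
    results = await run_bounded(names, fake_copy, limit=4)
    return results, peak


results, peak = asyncio.run(demo())
```

The limit matters for object stores: unbounded fan-out can hit request-rate throttling, so a small fixed window is usually a safer default than launching everything at once.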

Tests

Good coverage: multiple IVF subtypes (flat/pq/sq), HNSW variants, error cases for mismatched subtypes/centroids/overlapping fragments, duplicate segments, empty segments, and the legacy commit_existing_index wrapper. The parametrized rstest approach is clean.

Xuanwo · 2026-03-18T07:29:20Z

CI is not running because our base is not main. I will change that once #6209 is merged.

codecov · 2026-03-18T18:09:35Z

Codecov Report

❌ Patch coverage is 91.45129% with 86 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| rust/lance/src/index/vector/ivf.rs | 80.96% | 34 Missing and 21 partials ⚠️ |
| rust/lance/src/dataset.rs | 73.91% | 7 Missing and 5 partials ⚠️ |
| rust/lance/src/index/create.rs | 96.66% | 9 Missing and 1 partial ⚠️ |
| rust/lance/src/index/vector.rs | 82.14% | 5 Missing ⚠️ |
| ...lance-index/src/vector/distributed/index_merger.rs | 92.50% | 3 Missing ⚠️ |
| rust/lance/src/index/vector/ivf/v2.rs | 99.62% | 1 Missing ⚠️ |

westonpace

Still trying to get up to speed I think. What's the difference between merging staging files into a segment and combining multiple smaller segments into a larger segment?

Xuanwo · 2026-03-19T07:15:54Z

Thank you @westonpace for the review! I have revisited all the concepts, aligned their naming and comments, and compiled them into a document at https://github.com/lance-format/lance/blob/xuanwo/vector-staging-merge-internal/docs/src/guide/distributed_indexing.md. I hope this will be helpful for the review.

westonpace · 2026-03-19T12:37:37Z

This guide is great, and super helpful. I have a few more questions but we can potentially address these in a follow-up.

How does a worker decide how many shards to create?
Why does each worker create shards and not index segments? For example, let's say I have 5 workers and each worker created 10 shards so I have 50 shards.

Are those shards complete indexes? Why not just make them segments?

Then there is no need for a staging dir. The `create_index_segment_builder` call would take in a list of segment UUIDs (instead of a staging dir). I don't think you'd need the call `with_partial_shards` (since we provide these to `create_index_segment_builder`). Everything else would work as planned.

westonpace

Can you clearly document what the breaking changes are in the PR description?

Overall this seems to be a good extension to the previous single segment mechanism. It is flexible and the API makes sense.

It might be nice to see a full end-to-end example (perhaps in distributed_indexing.md) at some point.

westonpace · 2026-03-20T22:31:44Z

        *,
        target_partition_size: Optional[int] = None,
        skip_transpose: bool = False,
+        require_commit: bool = True,


Is this a breaking change?
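For context on the signature question only, here is a small Python sketch showing that appending a keyword-only parameter with a default leaves existing call sites valid. Both functions are hypothetical stand-ins with stub bodies, not Lance's `create_index`; whether the surrounding behavior changed is a separate question that this sketch does not answer.

```python
import inspect


def create_index_old(column, *, target_partition_size=None, skip_transpose=False):
    # Hypothetical stand-in for the signature before the diff.
    return {"column": column, "committed": True}


def create_index_new(
    column,
    *,
    target_partition_size=None,
    skip_transpose=False,
    require_commit=True,
):
    # Hypothetical stand-in for the signature after the diff.
    return {"column": column, "committed": require_commit}


# An existing call site that never mentions the new parameter.
legacy_call = dict(column="vector", target_partition_size=4096)
old_result = create_index_old(**legacy_call)
new_result = create_index_new(**legacy_call)

new_param = inspect.signature(create_index_new).parameters["require_commit"]
```

Because the parameter sits after `*`, it cannot shift any positional arguments, and its default reproduces the old result for callers that do not pass it.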

westonpace · 2026-03-20T22:33:11Z

            )
+        return index
+
+    def create_index(


It's hard to tell how this changed from the existing API. Were there any breaking changes?

Xuanwo · 2026-03-21T22:25:53Z

> Can you clearly document what the breaking changes are in the PR description?

I think we should mark #6209 as a breaking change. Therefore, this PR does not necessarily need to be a breaking change now. I will update.

> It might be nice to see a full end-to-end example (perhaps in distributed_indexing.md) at some point.

Yes! I'm thinking about this too, will work on a follow-up.

## Summary

This fixes a flaky regression test added in #6220 (`b80fbb3231cf58dd50e5670f9c56d309999bbd73`).

The affected test is:

- `test_distributed_vector_build_commits_multiple_segments_and_preserves_query_results`

Recent failures showed up in both of these runs:

- https://github.com/lance-format/lance/actions/runs/23450834811 `main` at `244c721504c6ef0b4c2f9700a342509976898d6e`
- https://github.com/lance-format/lance/actions/runs/23460892697 #6263

In those failures, different platforms / index variants failed:

- `linux-arm / case_1_ivf_flat` on `main`
- `linux-build / case_2_ivf_pq` on #6263

That points to an existing flaky test.

## Root Cause

The test compared the exact Top-K `_rowid` results between:

- a single-segment index build, and
- a distributed multi-segment index build

However, the query path used by the test is ANN (`ANNIVFPartition`) under the default probing behavior. With partial probing, the candidate set can differ slightly between single-segment and multi-segment layouts, especially near the tail of Top-K. That makes exact `_rowid` equality too strict for this test and causes intermittent failures.

## Fix

Make the test probe all IVF partitions before comparing Top-K row ids:

- add `.minimum_nprobes(TWO_FRAG_NUM_PARTITIONS)` to the test query

This keeps the existing strong assertion (`ids_single == ids_split`) but removes the probing-related source of nondeterminism.

## Testing

Local verification:

- `export PROTOC=/opt/homebrew/opt/protobuf@3/bin/protoc`
- `cargo test -p lance test_distributed_vector_build_commits_multiple_segments_and_preserves_query_results -- --nocapture`

Observed result:

- `case_1_ivf_flat ... ok`
- `case_2_ivf_pq ... ok`
- `case_3_ivf_sq ... ok`

I also verified during debugging that with full probing enabled, repeated runs of the previously flaky `ivf_flat` / `ivf_pq` cases became stable.
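The reasoning behind the fix can be illustrated with a toy IVF index in pure Python. This is a sketch of the general idea for the uncompressed (flat) case only, not Lance's query path; every name here is hypothetical. When every partition is probed, the candidate set is the whole dataset, so the Top-K no longer depends on how rows were partitioned.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_ivf(vectors, centroids):
    """Assign each row id to the partition of its nearest centroid."""
    partitions = [[] for _ in centroids]
    for row_id, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: sq_dist(vec, centroids[c]))
        partitions[nearest].append(row_id)
    return partitions


def ivf_search(vectors, centroids, partitions, query, k, nprobes):
    """Scan only the `nprobes` partitions whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))
    candidates = [rid for c in order[:nprobes] for rid in partitions[c]]
    # Break distance ties by row id so the ranking is deterministic.
    candidates.sort(key=lambda rid: (sq_dist(query, vectors[rid]), rid))
    return candidates[:k]


def exact_top_k(vectors, query, k):
    """Brute-force reference ranking over every row."""
    ids = sorted(range(len(vectors)), key=lambda rid: (sq_dist(query, vectors[rid]), rid))
    return ids[:k]


rng = random.Random(0)
vectors = [[rng.random() for _ in range(8)] for _ in range(200)]
centroids = [vectors[i] for i in rng.sample(range(200), 4)]
partitions = build_ivf(vectors, centroids)
query = [rng.random() for _ in range(8)]

full = ivf_search(vectors, centroids, partitions, query, k=10, nprobes=len(centroids))
exact = exact_top_k(vectors, query, k=10)
partial = ivf_search(vectors, centroids, partitions, query, k=10, nprobes=1)
```

With `nprobes=len(centroids)` the result matches the brute-force ranking, whereas `partial` is limited to rows from a single partition and may miss true neighbors that fell into another one.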

This tightens the new multi-segment vector index path added in #6220. It enforces disjoint fragment coverage when committing a segment set, adds regression coverage that grouped segment coverage matches the union of its source shard coverage, and verifies that remap only touches segments covering affected fragments. It also adds cleanup coverage for both replaced committed segments and stale uncommitted `_indices/<uuid>` artifacts, and documents these contracts in the distributed indexing guide.
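The two coverage contracts mentioned here (disjoint coverage across a committed segment set, and grouped coverage equal to the union of source shard coverage) can be sketched with plain sets. This is an illustrative Python model, not the Lance implementation; `check_disjoint_coverage` and `grouped_coverage` are hypothetical names, and fragment coverage is modeled as a set of integer fragment IDs.

```python
def check_disjoint_coverage(segments):
    """Return the union of fragment IDs, or raise if two segments overlap.

    `segments` maps a segment name to the set of fragment IDs it covers.
    """
    owner = {}
    for name, fragments in segments.items():
        for fragment_id in sorted(fragments):
            if fragment_id in owner:
                raise ValueError(
                    f"fragment {fragment_id} is covered by both "
                    f"{owner[fragment_id]} and {name}"
                )
            owner[fragment_id] = name
    return set(owner)


def grouped_coverage(shards):
    """Coverage of a segment built from several shards: the union of shard coverage."""
    covered = set()
    for fragments in shards:
        covered |= fragments
    return covered


segment_a = grouped_coverage([{0, 1}, {2}])
segment_b = grouped_coverage([{3}, {4, 5}])
union = check_disjoint_coverage({"segment-a": segment_a, "segment-b": segment_b})

try:
    check_disjoint_coverage({"segment-a": {0, 1}, "segment-c": {1, 2}})
    overlap_error = None
except ValueError as exc:
    overlap_error = str(exc)
```

Rejecting overlap at commit time means each fragment has exactly one owning segment, which is also what lets a remap skip segments that do not cover the affected fragments.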

github-actions Bot added python java labels Mar 18, 2026

Xuanwo changed the base branch from main to xuanwo/index-segment-commit-api March 18, 2026 06:43

Xuanwo force-pushed the xuanwo/vector-staging-merge-internal branch from 5c4279d to 44062e8 Compare March 18, 2026 07:23

Base automatically changed from xuanwo/index-segment-commit-api to main March 18, 2026 17:22

Xuanwo mentioned this pull request Mar 18, 2026

feat: add public vector segment builder API #6224

Closed

westonpace reviewed Mar 19, 2026

Xuanwo force-pushed the xuanwo/vector-staging-merge-internal branch from 3c6d5a7 to 4938e8a Compare March 19, 2026 07:13

Xuanwo changed the title ~~refactor: internalize vector staging merge flow~~ refactor: internalize distributed vector segment build Mar 19, 2026

Xuanwo changed the title ~~refactor: internalize distributed vector segment build~~ refactor!: distributed vector segment build Mar 20, 2026

github-actions Bot added the breaking-change label Mar 20, 2026

Xuanwo force-pushed the xuanwo/vector-staging-merge-internal branch from 0950fc8 to 1ea42de Compare March 20, 2026 14:32

Xuanwo added 5 commits March 20, 2026 23:34

test: cover distributed vector segment build

a86274d

refactor: internalize distributed vector segment build

1e5f0e1

feat: add public vector segment builder API

691cecb

feat: add Python vector segment builder API

a07ef61

docs: document distributed vector segment build

0a3f230

Xuanwo force-pushed the xuanwo/vector-staging-merge-internal branch from 1ea42de to 0a3f230 Compare March 20, 2026 15:37

Xuanwo added 2 commits March 21, 2026 00:22

fix: expose python create_index_uncommitted

c1d3b16

fix: format python bindings

b86f91c

Xuanwo force-pushed the xuanwo/vector-staging-merge-internal branch from d43183d to b86f91c Compare March 20, 2026 16:59

fix: format python segment builder bindings

bfc9e63

westonpace approved these changes Mar 20, 2026

Xuanwo removed the breaking-change label Mar 21, 2026

Xuanwo changed the title ~~refactor!: distributed vector segment build~~ refactor: distributed vector segment build Mar 21, 2026

Xuanwo merged commit b80fbb3 into main Mar 21, 2026
36 checks passed

Xuanwo deleted the xuanwo/vector-staging-merge-internal branch March 21, 2026 22:28

This was referenced Mar 21, 2026

docs: add distributed indexing workflow example #6250

Closed

test: tighten multi-segment vector index coverage and cleanup #6251

Merged

This was referenced Mar 24, 2026

test: fix flaky distributed vector build results test #6268

Merged

fix: preserve row last-updated metadata for rewrite-columns updates #6263

Open

Xuanwo mentioned this pull request Mar 27, 2026

Tracking: Distributed Indexes Search #6309

Open

26 tasks
