Skip to content

test: tighten multi-segment vector index coverage and cleanup#6251

Merged
Xuanwo merged 6 commits intomainfrom
xuanwo/multi-segment-e-series
Mar 27, 2026
Merged

test: tighten multi-segment vector index coverage and cleanup#6251
Xuanwo merged 6 commits intomainfrom
xuanwo/multi-segment-e-series

Conversation

@Xuanwo
Copy link
Copy Markdown
Collaborator

@Xuanwo Xuanwo commented Mar 21, 2026

This tightens the new multi-segment vector index path added in #6220.

It enforces disjoint fragment coverage when committing a segment set, adds regression coverage that grouped segment coverage matches the union of its source shard coverage, and verifies that remap only touches segments covering affected fragments.

It also adds cleanup coverage for both replaced committed segments and stale uncommitted _indices/<uuid> artifacts, and documents these contracts in the distributed indexing guide.

@github-actions github-actions Bot added the chore label Mar 21, 2026
@github-actions
Copy link
Copy Markdown
Contributor

Review

Clean PR — tests are well-structured and the validation logic is correct.

One minor suggestion

Fragment overlap check could use is_disjoint instead of intersection_len (rust/lance/src/index.rs):

// current
if covered_fragments.intersection_len(segment.fragment_bitmap()) > 0 {

// slightly more readable
if !covered_fragments.is_disjoint(segment.fragment_bitmap()) {

is_disjoint can short-circuit on the first shared element, whereas intersection_len must count all overlapping elements. In practice the difference is negligible for small bitmaps, but is_disjoint communicates intent more clearly and is never slower.

Everything else looks good

  • Disjoint coverage validation at commit time is the right place for it.
  • Cleanup tests correctly verify both replaced-segment removal and uncommitted-artifact removal.
  • Remapper test properly asserts segment-granular behavior (only affected segments get remapped).
  • Fragment coverage assertion in the grouped-segment build test is a good regression guard.
  • Doc updates are accurate and concise.

LGTM with the optional nit above.

@Xuanwo Xuanwo changed the title test: close multi-segment E-series gaps test: tighten multi-segment vector index coverage and cleanup Mar 21, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 27, 2026

Codecov Report

❌ Patch coverage is 98.72204% with 4 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
rust/lance/src/dataset/cleanup.rs 97.67% 0 Missing and 4 partials ⚠️

📢 Thoughts on this report? Let us know!

@Xuanwo Xuanwo merged commit 9b43852 into main Mar 27, 2026
29 checks passed
@Xuanwo Xuanwo deleted the xuanwo/multi-segment-e-series branch March 27, 2026 09:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants