
fix: preserve create index transaction semantics #6204

Merged
Xuanwo merged 7 commits into main from xuanwo/fix-create-index-transaction-semantics
Mar 16, 2026

@Xuanwo (Collaborator) commented Mar 16, 2026

This fixes the transaction semantics behind CreateIndex so we can precisely add and remove index UUIDs without relying on implicit name-based replacement.

It also closes the rebase correctness hole where concurrent same-name CreateIndex commits could survive with stale removed_indices and leave overlapping entries in the manifest. This keeps the existing uniqueness and replace=true contract intact while unblocking follow-up multi-segment index work.
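To make the same-name conflict concrete, here is a minimal sketch of the kind of check the conflict resolver could perform. The types `ConflictOutcome` and `CreateIndexOp` and the function `check_create_index` are hypothetical stand-ins for illustration, not the actual code in `conflict_resolver.rs`; the real operation carries full index metadata, not just names.

```rust
/// Hypothetical outcome of comparing two concurrent operations.
#[derive(Debug, PartialEq)]
enum ConflictOutcome {
    /// Both commits can be applied without rebasing concerns.
    Compatible,
    /// The later commit must be retried against the new manifest.
    RetryableConflict,
}

/// Hypothetical, reduced view of a CreateIndex operation: only the
/// names of the indices it adds.
struct CreateIndexOp {
    new_index_names: Vec<String>,
}

/// Two concurrent CreateIndex operations conflict (retryably) when they
/// add an index under the same name, because the loser's removed set
/// was computed against a manifest that is now stale.
fn check_create_index(ours: &CreateIndexOp, theirs: &CreateIndexOp) -> ConflictOutcome {
    let same_name = ours
        .new_index_names
        .iter()
        .any(|name| theirs.new_index_names.contains(name));
    if same_name {
        ConflictOutcome::RetryableConflict
    } else {
        ConflictOutcome::Compatible
    }
}
```

Under this sketch, retrying lets the losing commit recompute which UUIDs to remove, so no overlapping entries survive in the manifest.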

This PR is part of #6180

github-actions Bot added the bug label Mar 16, 2026
@github-actions (Contributor)

PR Review: fix: preserve create index transaction semantics

This PR fixes a real correctness issue: the old name-based filtering in build_manifest would incorrectly remove all same-name indices rather than only the ones explicitly listed in removed_indices. This caused problems for multi-segment index scenarios and concurrent same-name CreateIndex operations.

Summary of changes

  1. UUID-based filtering in build_manifest — correctly scopes removal to explicitly listed UUIDs
  2. execute() populates removed_indices when replace=true — previously always empty
  3. Same-name conflict detection in conflict resolver — concurrent same-name CreateIndex now correctly returns RetryableCommitConflict
  4. Removed compensating code in index.rs merge flow — no longer needed since UUID-based filtering preserves unreplaced deltas naturally
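The difference between the old name-based filtering and the new UUID-based filtering can be sketched as follows. This is an illustrative model only: `IndexMeta`, `merge_by_name`, and `merge_by_uuid` are hypothetical names, and a `u32` stands in for a real UUID so the example needs no external crates.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified index entry; `uuid` is a u32 stand-in.
#[derive(Debug, Clone, PartialEq)]
struct IndexMeta {
    uuid: u32,
    name: String,
}

/// Old behavior (sketch): drop every existing entry whose name matches
/// any newly added index, whether or not it was meant to be replaced.
fn merge_by_name(existing: &[IndexMeta], added: &[IndexMeta]) -> Vec<IndexMeta> {
    let mut out: Vec<IndexMeta> = existing
        .iter()
        .filter(|e| !added.iter().any(|n| n.name == e.name))
        .cloned()
        .collect();
    out.extend(added.iter().cloned());
    out
}

/// New behavior (sketch): drop only the entries whose UUID is listed
/// explicitly in `removed`, then append the added entries.
fn merge_by_uuid(
    existing: &[IndexMeta],
    added: &[IndexMeta],
    removed: &[IndexMeta],
) -> Vec<IndexMeta> {
    let removed_uuids: HashSet<u32> = removed.iter().map(|r| r.uuid).collect();
    let mut out: Vec<IndexMeta> = existing
        .iter()
        .filter(|e| !removed_uuids.contains(&e.uuid))
        .cloned()
        .collect();
    out.extend(added.iter().cloned());
    out
}
```

With two existing same-name entries (UUIDs 1 and 2), one added entry (UUID 3), and only UUID 1 listed as removed, the name-based merge keeps just UUID 3, while the UUID-based merge keeps UUIDs 2 and 3. That preserved entry is what the multi-segment case needs.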

Feedback

The design is sound and the fix is correct. Two minor notes:

  1. Stale comment (conflict_resolver.rs:2299): The comment "Will only conflict with operations that modify row ids." is now inaccurate — CreateIndex also conflicts with same-name CreateIndex. Worth updating.

  2. Test coverage is good — the concurrent create/replace tests and the build_manifest unit tests cover the key scenarios well. The matrix test update (CompatibleRetryable) is correct since both sides use index0 with name "test".

No P0/P1 issues found. LGTM.

async fn execute(mut self) -> Result<IndexMetadata> {
    let new_idx = self.execute_uncommitted().await?;
    let index_uuid = new_idx.uuid;
    let removed_indices = if self.replace {
Contributor

does this mean if we need to manually commit a CreateIndex transaction, we need to check replace everywhere?

this seems like a breaking change; are we relying on this somewhere (distributed indexing?)?

Collaborator Author

Yes. With this change, Operation::CreateIndex becomes an explicit add/remove UUID set operation instead of relying on implicit name-based replacement during manifest merge.

That means callers which want replace semantics need to populate removed_indices themselves. The additive/manual commit paths (including distributed indexing’s commit_existing_index(...)) are not relying on the old behavior. They are registering already-built indices and should continue to use removed_indices: vec![].
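As a rough illustration of the two caller styles described above, the sketch below builds the explicit add/remove sets for a replace-style create and for an additive registration. `IndexMeta`, `CreateIndexOp`, `plan_create_index`, and `plan_register_existing` are hypothetical names for this example (with a `u32` standing in for a UUID); they mirror the shape of `new_indices` / `removed_indices` but are not the actual Lance API.

```rust
/// Hypothetical, simplified index entry; `uuid` is a u32 stand-in.
#[derive(Debug, Clone, PartialEq)]
struct IndexMeta {
    uuid: u32,
    name: String,
}

/// Hypothetical model of the operation: explicit add and remove sets.
#[derive(Debug, PartialEq)]
struct CreateIndexOp {
    new_indices: Vec<IndexMeta>,
    removed_indices: Vec<IndexMeta>,
}

/// Replace-style path: the caller decides which same-name entries are
/// removed, and rejects a name clash unless `replace` is set.
fn plan_create_index(
    existing: &[IndexMeta],
    new_idx: IndexMeta,
    replace: bool,
) -> Result<CreateIndexOp, String> {
    let same_name: Vec<IndexMeta> = existing
        .iter()
        .filter(|e| e.name == new_idx.name)
        .cloned()
        .collect();
    if !same_name.is_empty() && !replace {
        return Err(format!(
            "Index name '{}' already exists, \
             please specify a different name or use replace=True",
            new_idx.name
        ));
    }
    // With replace set, every same-name entry is listed for removal;
    // if there were none, the removed set is simply empty.
    Ok(CreateIndexOp {
        new_indices: vec![new_idx],
        removed_indices: same_name,
    })
}

/// Additive path: register an already-built index and remove nothing.
fn plan_register_existing(new_idx: IndexMeta) -> CreateIndexOp {
    CreateIndexOp {
        new_indices: vec![new_idx],
        removed_indices: vec![],
    }
}
```

The point of the sketch is that the removal decision lives with whoever constructs the operation, so the manifest merge never has to infer intent from index names.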

please specify a different name"
)));
}
let existing_named_indices = indices
Contributor

Collaborator Author

Not exactly. These two checks happen at different stages and serve different purposes.

We could add a helper function to make this clearer; do you want one?

Contributor

yeah, the two implementations are already different. Maybe we don't need to check fields, since we don't allow indices with the same name but different fields. It would be good to have a helper function to get consistent behavior.


codecov Bot commented Mar 16, 2026

Codecov Report

❌ Patch coverage is 98.56631% with 4 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| rust/lance/src/index/create.rs | 97.05% | 0 Missing and 2 partials ⚠️ |
| rust/lance/src/io/commit/conflict_resolver.rs | 98.16% | 0 Missing and 2 partials ⚠️ |


Member

@westonpace westonpace left a comment

One question but the gist of it seems ok

Comment on lines +187 to +191
if !existing_named_indices.is_empty() && !self.replace {
    return Err(Error::index(format!(
        "Index name '{index_name}' already exists, \
        please specify a different name or use replace=True"
    )));
Member

Does this mean I will need replace=true to add a uuid to an existing index?

Collaborator Author

In fact, our current API doesn't support adding a UUID to an existing index; that's our next step. We'll need a new public API for that.

Comment thread rust/lance/src/io/commit/conflict_resolver.rs Outdated
Collaborator Author

Xuanwo commented Mar 16, 2026

The failure in distance::dot::tests::test_dot_f32 is unrelated to this PR.

@Xuanwo Xuanwo merged commit 4110d04 into main Mar 16, 2026
27 of 28 checks passed
@Xuanwo Xuanwo deleted the xuanwo/fix-create-index-transaction-semantics branch March 16, 2026 17:29