fix: invalidate index fragment bitmaps after data replacement and stale merge (#5929)
Conversation
fix: invalidate index fragment bitmaps after data replacement and stale merge

After a `merge_insert` that modifies indexed columns (RewriteColumns path), followed by compaction + `optimize_indices`, subsequent queries could fail with "non-existent fragment" errors. Two root causes:

1. `optimize_indices` merged stale old B-tree data: `merge_indices_with_unindexed_frags` used the raw `fragment_bitmap` instead of `effective_fragment_bitmap`, carrying entries for pruned fragments. When the effective bitmap was empty (fully pruned), `index.update()` still called `combine_old_new()`, which merged stale entries. Now uses the effective bitmap and rebuilds from scratch when old data is fully stale.

2. DataReplacement had no index invalidation: the DataReplacement handler in `build_manifest()` didn't call `prune_updated_fields_from_indices`, so replacing an indexed column's data left the index bitmap unchanged.

Fixes lance-format#5321

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
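The first fix described above can be sketched as a small decision function. This is a hypothetical simplification, not Lance's actual code: fragment bitmaps are modeled as `HashSet<u32>` (the real implementation uses roaring bitmaps), and the names `FragmentBitmap`, `MergePlan`, and `plan_merge` are illustrative.

```rust
use std::collections::HashSet;

/// Stand-in for a fragment bitmap (the real code uses a roaring bitmap).
type FragmentBitmap = HashSet<u32>;

/// How to merge an existing index with newly unindexed fragments.
#[derive(Debug, PartialEq)]
enum MergePlan {
    /// Old index data is entirely stale: rebuild the index from scratch.
    Rebuild,
    /// Some old data is still valid: incremental update over these fragments.
    Update(FragmentBitmap),
}

/// Compute the *effective* bitmap (raw bitmap intersected with fragments
/// that still exist in the dataset) and decide between an incremental
/// update and a full rebuild.
fn plan_merge(raw: &FragmentBitmap, live_fragments: &FragmentBitmap) -> MergePlan {
    // Using the effective bitmap rather than the raw one ensures entries
    // for pruned fragments never flow into the merged index.
    let effective: FragmentBitmap = raw.intersection(live_fragments).copied().collect();
    if effective.is_empty() {
        // Fully pruned: calling combine_old_new() here would merge stale
        // entries, so rebuild instead.
        MergePlan::Rebuild
    } else {
        MergePlan::Update(effective)
    }
}
```

The key design point is that the "fully pruned" case must short-circuit to a rebuild, since any merge with the old stored data would resurrect invalidated rows.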
**Code Review**

This PR fixes index fragment bitmap corruption after data replacement and stale merges.

**P0: Potential Issue in Scalar Index Rebuild Logic**

In the rebuild path:

```rust
super::scalar::build_scalar_index(
    dataset.as_ref(),
    column.name.as_str(),
    &new_uuid.to_string(),
    &params,
    true, // include_row_id
    None, // fragments (None means all)
    Some(new_data_stream), // additional_data_stream
)
.await?
```

However, the […]

**Recommendation:** Verify that […]

**P1: Replaced Fields Extraction**

In:

```rust
let replaced_fields: Vec<u32> = new_datafiles
    .first()
    .map(|f| {
        f.fields.iter().filter(...).collect()
    })
    .unwrap_or_default();
```

This is safe given the earlier validation (lines 2037-2058) that ensures all new data files have the same fields. However, adding a comment here would make this dependency explicit.

**Tests**

The test coverage is thorough: […]
Overall, the fix logic is sound and addresses the reported issue correctly.
When `optimize_indices` calls `index.update()`, BTreeIndex's `combine_old_new()` reads all stored data including entries for invalidated fragments, causing "non-existent fragment" errors. The bitmap pruning from the previous fix only affects the bitmap, not the stored index data. Pass the valid fragment bitmap into `ScalarIndex::update()` so BTreeIndex can filter old data rows whose fragment ID (upper 32 bits of the row address) is not in the bitmap. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
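The filtering described in this commit can be sketched as follows. This is a minimal illustration, not Lance's actual `ScalarIndex::update()` code: row addresses are plain `u64`s here (the real code filters Arrow record batches streamed out of the stored B-tree), and `filter_stale_rows` is an illustrative name.

```rust
use std::collections::HashSet;

/// A row address packs the fragment id into the upper 32 bits and the
/// row offset into the lower 32 bits.
fn fragment_of(row_addr: u64) -> u32 {
    (row_addr >> 32) as u32
}

/// Keep only old index rows whose fragment is still in the valid bitmap,
/// dropping entries for invalidated fragments before they are merged.
fn filter_stale_rows(row_addrs: &[u64], valid_fragments: &HashSet<u32>) -> Vec<u64> {
    row_addrs
        .iter()
        .copied()
        .filter(|addr| valid_fragments.contains(&fragment_of(*addr)))
        .collect()
}
```

Because the fragment id lives in the upper half of the address, a single shift is enough to test membership against the valid-fragment bitmap during the merge.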
westonpace
left a comment
Thanks for figuring this out. It makes me wonder if the update path is even worth the complexity it brings at this point. Btree is the only one that really makes use of it.
```rust
///
/// If `valid_old_fragments` is provided, old index data for fragments not in the bitmap
/// will be filtered out during the merge.
```
So this only matters for btree because it is the only index that scans the old portion of the index instead of rescanning the whole column?
Actually, bitmap does this too. I need to fix that.
Fixed bitmap. Other indices should be done as a follow-up, but I'm not aware of a code path where they cause a bug yet. Many of them, like ZoneMap and BloomFilter, give inexact results.
```rust
let filtered = stream.map(move |batch_result| {
    let batch = batch_result?;
    let row_ids = batch
        .column(1)
```
Minor nit: I've been trying to slowly switch over to using string constants like BTREE_IDS_COLUMN and avoiding integral indexing just because it seems less prone to error in the future.
That's a fair point. There's probably not a huge advantage to scanning the index vs scanning the fragments.
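The nit above about named column constants vs integral indexing can be illustrated with a toy batch type. This is a sketch only: the real code resolves columns on Arrow `RecordBatch` schemas, and this `Batch` struct is hypothetical; `BTREE_IDS_COLUMN` mirrors the constant mentioned in the comment, though its value here (`"_rowid"`) is an assumption.

```rust
/// Illustrative name constant; looking columns up by name survives schema
/// reordering, while positional access silently breaks.
const BTREE_IDS_COLUMN: &str = "_rowid";

/// Toy stand-in for an Arrow RecordBatch.
struct Batch {
    names: Vec<&'static str>,
    columns: Vec<Vec<u64>>,
}

impl Batch {
    /// Positional access: brittle if the schema is ever reordered.
    fn column(&self, i: usize) -> &Vec<u64> {
        &self.columns[i]
    }

    /// Lookup by name: robust against future schema changes.
    fn column_by_name(&self, name: &str) -> Option<&Vec<u64>> {
        self.names
            .iter()
            .position(|n| *n == name)
            .map(|i| &self.columns[i])
    }
}
```

With `column_by_name(BTREE_IDS_COLUMN)`, the call site documents which column it wants, and a schema reorder becomes a lookup miss rather than a silent wrong-column read.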
This is basically the same problem encountered in #5929. However, the solution there (and in other indices fixed since then) has been to prune indices during the update phase to remove old fragments. Unfortunately, FTS indices may be too large for this pruning to make sense (it would require a complete scan through the old index), though I haven't verified how bad the performance would be. This PR instead keeps track of which fragments have been invalidated, then uses this as a block-list at search time.
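The block-list approach for FTS can be sketched like this. All names here (`FtsIndex`, `postings`, `invalidated_fragments`, `search`) are illustrative and not Lance's actual API; a real FTS index stores far more than `(row_addr, score)` pairs.

```rust
use std::collections::HashSet;

/// Sketch: rather than rewriting a large FTS index to drop stale entries,
/// record invalidated fragments and filter hits at query time.
struct FtsIndex {
    /// Posting entries as (row address, score); heavily simplified.
    postings: Vec<(u64, f32)>,
    /// Block-list consulted at search time.
    invalidated_fragments: HashSet<u32>,
}

impl FtsIndex {
    /// O(1) invalidation: no scan of the stored index is needed.
    fn invalidate_fragment(&mut self, frag_id: u32) {
        self.invalidated_fragments.insert(frag_id);
    }

    /// Return hits, skipping rows whose fragment (upper 32 bits of the
    /// row address) is on the block-list.
    fn search(&self) -> Vec<(u64, f32)> {
        self.postings
            .iter()
            .copied()
            .filter(|(addr, _)| {
                let frag = (addr >> 32) as u32;
                !self.invalidated_fragments.contains(&frag)
            })
            .collect()
    }
}
```

The trade-off versus update-time pruning: invalidation becomes cheap, at the cost of a small per-query filter and stale bytes remaining on disk until the next full rebuild.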
fix: invalidate index fragment bitmaps after data replacement and stale merge (lance-format#5929)

This fixes a bug where queries or merge insert could result in the error `this fragment does not exist in the dataset`. This would happen because the index referenced row addresses that were no longer in the dataset.

Here's one way this would happen:

* Create a BTREE index covering 2 fragments.
* Do a merge insert that rewrites the index column in place. This removes the updated data from the index's fragment bitmap, so that data is now hidden.
* Do compaction. This will move the rows to new fragments.
* Optimize the BTREE index: this would build the btree index with the old data, not filtering out the invalidated rows.

After doing this, the invalidated rows, previously masked out, are part of the index. But since compaction happened, those old invalidated rows moved.

This PR makes sure:

1. When incrementally updating indexes, we only read row ids for data in the fragment bitmap.
2. We remove fragments from the index fragment bitmap if we do a DataReplacement on those fields.

These rules are described in the spec at https://lance.org/format/table/index/#loading-an-index

Fixes lance-format#5321

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
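Rule 2 above can be sketched as a pruning pass over index metadata. This is a simplified illustration of what `prune_updated_fields_from_indices` (named in the fix) would do; the `IndexMeta` struct and plain `u32` sets are hypothetical stand-ins for Lance's manifest types.

```rust
use std::collections::HashSet;

/// Simplified index metadata: which fields the index covers and which
/// fragments it currently claims to index.
struct IndexMeta {
    fields: HashSet<u32>,
    fragment_bitmap: HashSet<u32>,
}

/// When a DataReplacement rewrites `replaced_fields` in `replaced_fragments`,
/// any index covering one of those fields must drop those fragments from
/// its bitmap, so stale index entries are masked out.
fn prune_updated_fields_from_indices(
    indices: &mut [IndexMeta],
    replaced_fields: &HashSet<u32>,
    replaced_fragments: &HashSet<u32>,
) {
    for index in indices.iter_mut() {
        // Only indices that cover a replaced field are affected.
        if !index.fields.is_disjoint(replaced_fields) {
            for frag in replaced_fragments {
                index.fragment_bitmap.remove(frag);
            }
        }
    }
}
```

Indices on untouched fields keep their bitmaps intact, which is what makes the per-field check (rather than blanket invalidation) worthwhile.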