fix: fix remap so that it handles deletions correctly #5828
westonpace merged 2 commits into lance-format:main
Code Review

P0: Debug print statements in production code
The remap function contains
These will pollute stdout in production.

P0: Commented-out code in compression.rs
The file

P1: Missing lookup file handling for basic index
In the new remap implementation, the lookup file is only handled for range-based indices:
if let Some(ranges_to_files) = &self.ranges_to_files {
    // ... merge lookups
}
For basic indices (when
// Copy the lookup file as-is (OLD CODE)
self.store
    .copy_index_file(BTREE_LOOKUP_NAME, dest_store)
    .await?;
Since

The core fix (retraining rather than just copying when remapping) is sound. Good test coverage for the deletion scenario.
jackye1995 left a comment
I think we should backport this? Looks like a serious correctness bug.
Agreed.
The previous remap implementation assumed it was safe to copy the lookup unchanged. This is only safe if there are no deletions. If there are deletions, then the page boundaries have moved, and so the min/max values have moved as well. We need to recreate both the lookup and the data when we remap.
Closes #5826
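To make the failure mode concrete, here is a minimal Rust sketch of a paged index with a min/max lookup. It is not Lance's implementation: `PageStats`, `build_lookup`, and `remap_rows` are hypothetical names, pages are assumed to be plain fixed-size chunks of sorted `(value, row_id)` pairs, and a mapping entry of `None` is assumed to mean the row was deleted.

```rust
use std::collections::HashMap;

/// One lookup entry per page (hypothetical type, for illustration only).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PageStats {
    pub min: i64,
    pub max: i64,
    pub rows: usize,
}

/// Split sorted `(value, row_id)` pairs into pages of `page_size` rows and
/// record each page's min and max value. Assumes `page_size > 0`
/// (`chunks` panics on a zero chunk size).
pub fn build_lookup(sorted: &[(i64, u64)], page_size: usize) -> Vec<PageStats> {
    sorted
        .chunks(page_size)
        .map(|page| PageStats {
            // `chunks` never yields an empty slice, so first/last exist.
            min: page.first().unwrap().0,
            max: page.last().unwrap().0,
            rows: page.len(),
        })
        .collect()
}

/// Apply a row id mapping. Assumed semantics: `Some(new_id)` moves the row,
/// `None` deletes it, and a missing entry leaves the row id unchanged.
pub fn remap_rows(
    sorted: &[(i64, u64)],
    mapping: &HashMap<u64, Option<u64>>,
) -> Vec<(i64, u64)> {
    sorted
        .iter()
        .filter_map(|&(value, row_id)| match mapping.get(&row_id) {
            Some(Some(new_id)) => Some((value, *new_id)),
            Some(None) => None,
            None => Some((value, row_id)),
        })
        .collect()
}
```

With the values 10, 20, 30, 40, 50, 60 in pages of three, the lookup holds the ranges [10, 30] and [40, 60]. Deleting the rows that hold 20 and 30 leaves 10, 40, 50, 60, so a lookup rebuilt from the remapped rows holds [10, 50] and [60, 60]. A lookup copied unchanged would still route a search for 50 to the second page, which no longer contains it.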