fix: forward compatibility of FTS index #4906

Merged
jackye1995 merged 7 commits into lance-format:main from jackye1995:fts-fc
Oct 8, 2025

@jackye1995
Contributor

When an FTS index is created with Lance 0.38 and read with 0.36, `list_indices` fails with:

thread '<unnamed>' panicked at /home/runner/work/lance/lance/rust/lance-index/src/scalar/inverted/index.rs:846:30:
called `Option::unwrap()` on a `None` value

github-actions bot added the `bug` and `python` labels Oct 7, 2025

@codecov-commenter
codecov-commenter commented Oct 7, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.61%. Comparing base (ec2ac35) to head (83bb81f).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #4906   +/-   ##
=======================================
  Coverage   81.60%   81.61%           
=======================================
  Files         333      333           
  Lines      131384   131388    +4     
  Branches   131384   131388    +4     
=======================================
+ Hits       107221   107236   +15     
+ Misses      20565    20555   -10     
+ Partials     3598     3597    -1     

| Flag | Coverage Δ |
|---|---|
| unittests | 81.61% <100.00%> (+<0.01%) ⬆️ |

    reason="FTS token set format was introduced in 0.36.0",
)
def test_list_indices_ignores_new_fts_index_version():
    session = lance.Session(index_cache_size_bytes=0, metadata_cache_size_bytes=0)

wait, are we supposed to set index_cache_size_bytes=0 and metadata_cache_size_bytes=0?


pub const INVERTED_INDEX_VERSION: u32 = 0;
// Version 0: Arrow TokenSetFormat (legacy)
// Version 1: Fst TokenSetFormat (new default, incompatible with old clients)

mention lance version where version 1 was added?

@westonpace (Member) left a comment

The change itself (bumping version when token set format is new) seems fine.

I hadn't expected index versioning to evolve quite like this. I had assumed newer versions of lance would just write newer index versions and we wouldn't have the ability to write multiple index versions. However, this seems better for gradual migration.
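
For readers following the review, here is a minimal sketch of the version selection being discussed: the writer records index version 1 only when the token set format is Fst, and keeps version 0 for the legacy Arrow format. The `TokenSetFormat` enum and the `inverted_index_version` helper are hypothetical names used for illustration. They are not the Lance API, and the real change lives in the Rust code of this PR.

```python
from enum import Enum


class TokenSetFormat(Enum):
    """Token set encodings named in the version comments of this PR."""

    ARROW = "arrow"  # legacy format, index version 0
    FST = "fst"      # new default, incompatible with old clients


def inverted_index_version(token_set_format: TokenSetFormat) -> int:
    """Hypothetical helper: choose the index version to record for a format."""
    if token_set_format is TokenSetFormat.FST:
        return 1
    return 0
```

Because the version depends on the format rather than on the library release, a newer writer can keep producing version 0 indices that older readers understand, which may be what makes gradual migration possible.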

@jackye1995 jackye1995 merged commit 0765b66 into lance-format:main Oct 8, 2025
29 checks passed
wjones127 added a commit that referenced this pull request Nov 5, 2025
…c RowIdTreeMap.serialize… (#5105)

…_into stable

Closes: #5093

This PR made the following changes:
1. Document `RowIdTreeMap.serialize_into` as stable, since it is used by the
bitmap index.
2. Make sure the forward compatibility test datasets have multiple
fragments, to guarantee that older versions of Lance can correctly load the
bitmap index.
3. Make sure the backward compatibility test `test_old_btree_bitmap_indices`
uses a bitmap index.
4. Fix failures introduced by #4906.

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
jackye1995 added a commit to jackye1995/lance that referenced this pull request Jan 21, 2026
When an FTS index is created with Lance 0.38 and read with 0.36,
`list_indices` fails with:

```
thread '<unnamed>' panicked at /home/runner/work/lance/lance/rust/lance-index/src/scalar/inverted/index.rs:846:30:
called `Option::unwrap()` on a `None` value
```

This PR bumps the version of the FTS index when the token set format is Fst, so that old clients ignore the incompatible index. However, old clients have a bug: if the indices are cached when the manifest is opened, the index is not ignored and still causes an error in `list_indices` and other operations against the index.
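
The intended reader-side behavior can be sketched as follows, assuming an old client that only understands index version 0 and skips anything newer. `IndexMeta`, `MAX_SUPPORTED_FTS_VERSION`, and `list_readable_indices` are hypothetical names for illustration only. This does not model the caching bug mentioned above, and it is not how Lance implements the check.

```python
from dataclasses import dataclass

# Assumption: the old client understands only version 0 (Arrow token sets).
MAX_SUPPORTED_FTS_VERSION = 0


@dataclass(frozen=True)
class IndexMeta:
    """Hypothetical stand-in for the index metadata stored in a manifest."""

    name: str
    index_version: int


def list_readable_indices(indices, max_supported=MAX_SUPPORTED_FTS_VERSION):
    """Return only the indices whose version this client can read."""
    return [meta for meta in indices if meta.index_version <= max_supported]


indices = [
    IndexMeta("text_arrow_idx", index_version=0),
    IndexMeta("text_fst_idx", index_version=1),
]
# The version 1 index is skipped instead of being opened and failing later.
readable = list_readable_indices(indices)
```

A client that supports the Fst format would pass `max_supported=1` and see both indices.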
jackye1995 pushed a commit to jackye1995/lance that referenced this pull request Jan 21, 2026
…c RowIdTreeMap.serialize… (lance-format#5105)

…_into stable

Closes: lance-format#5093

This PR made the following changes:
1. Document `RowIdTreeMap.serialize_into` as stable, since it is used by the
bitmap index.
2. Make sure the forward compatibility test datasets have multiple
fragments, to guarantee that older versions of Lance can correctly load the
bitmap index.
3. Make sure the backward compatibility test `test_old_btree_bitmap_indices`
uses a bitmap index.
4. Fix failures introduced by lance-format#4906.

---------

Co-authored-by: Will Jones <willjones127@gmail.com>