fix: fts search should respect target fragment for scan #5081

Closed
yingjianwu98 wants to merge 1 commit into lance-format:main from yingjianwu98:fts_search_should_respect_fragment_scan

@yingjianwu98 yingjianwu98 commented Oct 28, 2025

FTS search currently does not respect the target fragment list specified via Scanner.with_fragments().

Given a dataset with:

  • indexed_fragments = [1, 2, 3]
  • unindexed_fragments = [4, 5]

Scenario 1: Query targets only unindexed fragments [4, 5]

  • Bug: Still scans and returns results from indexed fragments [1, 2, 3]
  • Expected: Should only scan fragments [4, 5] using flat search

Scenario 2: Query targets mixed fragments [3, 4]

  • Bug: Scans indexed fragment 3 correctly, but also scans unindexed fragment 5 even though it's not in the target list
  • Expected: Should only scan fragments [3, 4]
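
The expected behavior in both scenarios amounts to intersecting the target list with the indexed and unindexed fragment sets. A minimal sketch of that partitioning, using plain fragment IDs rather than Lance's actual types; `plan_fts_fragments` is a hypothetical name for illustration, not a function in the scanner:

```rust
use std::collections::BTreeSet;

/// Hypothetical helper (not part of Lance): split the dataset's fragments into
/// those served by the FTS index and those needing a flat search, keeping only
/// fragments in the optional target list.
fn plan_fts_fragments(
    all: &[u32],
    indexed: &[u32],
    target: Option<&[u32]>,
) -> (Vec<u32>, Vec<u32>) {
    let indexed: BTreeSet<u32> = indexed.iter().copied().collect();
    let target: Option<BTreeSet<u32>> = target.map(|ids| ids.iter().copied().collect());

    let mut via_index = Vec::new();
    let mut via_flat = Vec::new();
    for &id in all {
        // With no target list, every fragment is eligible.
        let requested = target.as_ref().map_or(true, |t| t.contains(&id));
        if !requested {
            continue;
        }
        if indexed.contains(&id) {
            via_index.push(id);
        } else {
            via_flat.push(id);
        }
    }
    (via_index, via_flat)
}

fn main() {
    let all = [1, 2, 3, 4, 5];
    let indexed = [1, 2, 3];

    // Scenario 1: only unindexed fragments are targeted.
    let (idx, flat) = plan_fts_fragments(&all, &indexed, Some(&[4, 5][..]));
    assert!(idx.is_empty());
    assert_eq!(flat, vec![4, 5]);

    // Scenario 2: mixed targets; fragment 5 must not be scanned.
    let (idx, flat) = plan_fts_fragments(&all, &indexed, Some(&[3, 4][..]));
    assert_eq!(idx, vec![3]);
    assert_eq!(flat, vec![4]);
}
```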

@github-actions github-actions Bot added the bug Something isn't working label Oct 28, 2025
codecov-commenter commented Oct 28, 2025

Codecov Report

❌ Patch coverage is 92.45283% with 8 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| rust/lance/src/dataset/scanner.rs | 92.45% | 3 Missing and 5 partials ⚠️ |

artemru commented Oct 28, 2025

fixes #5063

yingjianwu98 commented Oct 31, 2025

@westonpace @wjones127

Sorry for the direct ping, but I believe this bug could cause Lance to return results from unintended fragments. This might have broader implications, so I wanted to bring it to your attention. Thanks.

@jackye1995 jackye1995 left a comment

could we combine this with #5080 and have some sort of shared function like filter_to_scanner_fragments()?


// Filter by scanner's fragment list to respect fragment restrictions
if let Some(scanner_frags) = &self.fragments {
let scanner_ids: std::collections::HashSet<_> =
we can use self.get_fragments_as_bitmap() to perform the same operation instead of building a HashSet.
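
A minimal sketch of the kind of shared helper suggested in this review. The name `filter_to_scanner_fragments` comes from the review comment; its signature and the `Fragment` struct here are assumptions for illustration, and a `BTreeSet<u32>` stands in for the bitmap from `self.get_fragments_as_bitmap()`, whose exact return type is not shown in this thread:

```rust
use std::collections::BTreeSet;

/// Illustrative stand-in for a dataset fragment; only the ID matters here.
#[derive(Debug, Clone, PartialEq)]
struct Fragment {
    id: u32,
}

/// Hypothetical shared helper: keep only the candidates whose ID appears in
/// the scanner's allowed set. `None` means no fragment restriction was set,
/// so every candidate passes through unchanged.
fn filter_to_scanner_fragments(
    candidates: Vec<Fragment>,
    allowed: Option<&BTreeSet<u32>>,
) -> Vec<Fragment> {
    match allowed {
        None => candidates,
        Some(ids) => candidates
            .into_iter()
            .filter(|frag| ids.contains(&frag.id))
            .collect(),
    }
}

fn main() {
    // Unindexed fragments found by the planner (assumed input).
    let unindexed = vec![Fragment { id: 4 }, Fragment { id: 5 }];
    // Fragments requested through the scanner (assumed input).
    let allowed = BTreeSet::from([3, 4]);

    let kept = filter_to_scanner_fragments(unindexed.clone(), Some(&allowed));
    assert_eq!(kept, vec![Fragment { id: 4 }]);

    // Without a restriction, nothing is filtered out.
    assert_eq!(filter_to_scanner_fragments(unindexed.clone(), None), unindexed);
}
```

Both the FTS and vector search paths could call one such function, so the membership test lives in a single place.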

@yingjianwu98
closing this in favor of #5924

wjones127 pushed a commit that referenced this pull request Feb 17, 2026
…quested fragments (#5924)

Fixes bug where vector and FTS searches ignore the `with_fragments()` filter
when querying unindexed fragments, which returns results from
indexed fragments that were not requested.

Note: this does not fix the case where `with_fragments` contains both
indexed and unindexed fragments for FTS and vector search;
I have a separate follow-up PR to fix that behavior.


This PR combines my previous effort for fixing the issue:
#5081
#5080

---------

Co-authored-by: stevie9868 <yingjianwu2@email.com>