bug: FTS gives duplicate results if searching over multiple columns #3188

@wjones127

I'm noticing I get a duplicate result when a full-text search runs over multiple indexed columns: the dataset below has a single row, but the query returns it twice. Is this expected? @BubbleCal

This is using pylance v0.19.2.

import lance
import pyarrow as pa

assert lance.__version__ == '0.19.2'

data = [
    {"animal": "domestic rabbit", "description": "Eating carrots"},
]
data = pa.Table.from_pylist(data)

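ds = lance.write_dataset(data, "./demo_fts", mode="overwrite")
# Build an inverted (full-text) index on each of the two string columns.
ds.create_scalar_index("animal", index_type="INVERTED")
ds.create_scalar_index("description", index_type="INVERTED")

# One term matches "animal" and the other matches "description";
# the single row in the dataset is returned twice.
ds.to_table(full_text_query="domestic eating").to_pandas()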
            animal     description    _score
0  domestic rabbit  Eating carrots  0.287682
1  domestic rabbit  Eating carrots  0.287682
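Until the behavior is confirmed or fixed, one possible client-side workaround is to drop repeated hits after the query. The sketch below is only an illustration, not part of Lance: `dedupe_hits` is a hypothetical helper, and it assumes each result row carries a stable identifier column (here called `_rowid`) that is identical for both copies of a duplicated hit. That assumption has not been verified against this bug.

```python
def dedupe_hits(rows, key="_rowid"):
    """Keep the first occurrence of each row, preserving result order.

    `rows` is a list of dicts (for example from `pyarrow.Table.to_pylist()`).
    `key` names the column assumed to identify a row uniquely.
    """
    seen = set()
    unique = []
    for row in rows:
        row_key = row[key]
        if row_key in seen:
            continue  # same underlying row was already returned
        seen.add(row_key)
        unique.append(row)
    return unique


# Rows shaped like the output above, with an assumed `_rowid` column added.
hits = [
    {"_rowid": 0, "animal": "domestic rabbit",
     "description": "Eating carrots", "_score": 0.287682},
    {"_rowid": 0, "animal": "domestic rabbit",
     "description": "Eating carrots", "_score": 0.287682},
]

unique_hits = dedupe_hits(hits)
```

With a real dataset, the rows could come from something like `ds.to_table(full_text_query="domestic eating", with_row_id=True).to_pylist()`. Deduplicating on a row identifier rather than on the full row content avoids merging two distinct rows that happen to hold the same values.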

Labels

bug: Something isn't working