I am getting fewer results than expected from filtered queries when the dataset has both a zone map index and deleted rows.
Reproduction
import pyarrow as pa
import lance
data = pa.table({
"id": range(10),
"value": [True, False] * 5,
})
ds = lance.write_dataset(data, "memory://")
ds.delete("NOT value") # Works if I comment this out
ds.to_table(filter="value")
pyarrow.Table
id: int64
value: bool
----
id: [[0,2,4,6,8]]
value: [[true,true,true,true,true]]
But if I add a zone map index on the same column, I get only the first three of the five expected rows:
ds.create_scalar_index("value", "ZONEMAP")
ds.to_table(filter="value")
pyarrow.Table
id: int64
value: bool
----
id: [[0,2,4]]
value: [[true,true,true]]
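To make the discrepancy explicit, here is a small sketch that diffs the two results above. `dropped_ids` is a hypothetical helper I wrote for this report, not part of lance; the id lists are copied by hand from the outputs shown, and on a live dataset they could be taken from `ds.to_table(filter="value")["id"].to_pylist()` instead.

```python
def dropped_ids(expected, actual):
    """Return the ids present in `expected` but missing from `actual`, in order."""
    seen = set(actual)
    return [i for i in expected if i not in seen]


# Ids copied from the two outputs above (hypothetical variable names).
without_index = [0, 2, 4, 6, 8]  # result before creating the ZONEMAP index
with_zonemap = [0, 2, 4]         # result after creating the ZONEMAP index

missing = dropped_ids(without_index, with_zonemap)
print(missing)  # [6, 8]
```

So the rows with ids 6 and 8 match the filter and were not deleted, yet the indexed query does not return them.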