fix: handle NULL elements in LABEL_LIST index results and explain_plan by fenfeng9 · Pull Request #5867 · lance-format/lance

fenfeng9 · 2026-01-31T19:24:50Z

Changes:

- Treat element-level NULLs in LABEL_LIST as non-matches so array_has_any/array_has_all return TRUE/FALSE when the list itself is non-NULL.
- Allow nullable list literals in LabelListQuery::to_expr to prevent explain_plan() panics.
- Add Python tests covering element-level NULLs, list-level NULLs, NULL-literal filters and explain behavior.

codecov · 2026-01-31T19:58:47Z

Codecov Report

❌ Patch coverage is 75.00000% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| rust/lance-index/src/scalar.rs | 50.00% | 1 Missing ⚠️ |

fenfeng9 · 2026-02-02T07:15:00Z

PTAL @westonpace.

Add tests for List<str>, List<int>, Struct, and List<Struct<str>> covering scan, take, and filter (including NOT/OR variants) with and without indices (LabelList, BTree, Bitmap). Data includes null list elements, null lists, null struct fields, and null struct elements in lists to catch regressions like lance-format#5867.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fenfeng9 · 2026-02-06T08:15:51Z

Negation is currently broken for NULL lists in the index path.
LabelListIndex flattens List<T> into scalar rows for BitmapIndex. unnest_batch drops rows where the list itself is NULL (0 indices), so those rows never enter the bitmap.

As a result, the index cannot distinguish “list is NULL” from “list doesn’t contain the value”: a NULL list is treated as FALSE, and NOT then incorrectly returns that row.

For the test case [["foo", None], ["foo"], None]:

Row 0: TRUE (contains "foo") → NOT → FALSE ✓
Row 1: TRUE (contains "foo") → NOT → FALSE ✓
Row 2: NULL (list itself is NULL) → NOT → should be NULL (filtered out)

But since Row 2 is missing from the index, it's treated as FALSE → NOT becomes TRUE, causing the negation query to incorrectly return it.
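The failure mode described above can be reproduced with a small stand-alone model. This is a sketch only: `build_bitmap`, `index_contains`, `scan_contains`, `sql_not`, and `selected` are hypothetical helpers written for illustration, not Lance's `LabelListIndex` or `unnest_batch`, and plain Python sets stand in for bitmaps. Skipping NULL elements while flattening is also an assumption of this model.

```python
from typing import Optional

# The test case from the comment: two lists containing "foo" and one NULL list.
rows = [["foo", None], ["foo"], None]


def build_bitmap(rows):
    """Flatten each list into value -> row ids; a NULL list contributes nothing."""
    bitmap = {}
    for row_id, labels in enumerate(rows):
        if labels is None:
            continue  # models the dropped row: no entry is recorded anywhere
        for label in labels:
            if label is not None:  # assumption: NULL elements are not indexed
                bitmap.setdefault(label, set()).add(row_id)
    return bitmap


def index_contains(bitmap, value, num_rows):
    """Index path: a row absent from the bitmap can only be reported as FALSE."""
    hits = bitmap.get(value, set())
    return [row_id in hits for row_id in range(num_rows)]


def scan_contains(rows, value):
    """Scan path: a NULL list yields NULL (None), not FALSE."""
    return [None if labels is None else value in labels for labels in rows]


def sql_not(x: Optional[bool]) -> Optional[bool]:
    """Three-valued NOT: NULL stays NULL."""
    return None if x is None else not x


def selected(mask):
    """A filter keeps only rows whose predicate is exactly TRUE."""
    return [i for i, m in enumerate(mask) if m is True]


bitmap = build_bitmap(rows)
index_not = [sql_not(m) for m in index_contains(bitmap, "foo", len(rows))]
scan_not = [sql_not(m) for m in scan_contains(rows, "foo")]

print(selected(index_not))  # [2]: the NULL list leaks into the NOT result
print(selected(scan_not))   # []: no row satisfies NOT under scan semantics
```

In this model the only way to keep the two paths in agreement is for the index to remember which rows had a NULL list, so that NOT can leave them as NULL instead of flipping FALSE to TRUE.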

wjones127 · 2026-02-06T20:12:07Z


+def test_label_list_index_null_element_match(tmp_path: Path):
+    """Ensure LABEL_LIST index keeps scan semantics when lists contain NULLs."""
+    tbl = pa.table({"labels": [["foo", None], ["foo"], None]})


We should add a case where there are nulls but it shouldn't contain "foo".

Suggested change

-    tbl = pa.table({"labels": [["foo", None], ["foo"], None]})
+    tbl = pa.table({"labels": [["foo", None], ["foo"], ["bar", None], None]})
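To see why the extra `["bar", None]` row matters, here is a minimal sketch comparing two candidate semantics for `array_has_any` on the suggested table. Both functions are hypothetical reference models written for this example, not Lance or DataFusion code: `has_any_ignore_nulls` treats NULL elements as non-matches, while `has_any_propagate_nulls` is an assumed strict three-valued alternative shown only for contrast.

```python
# Suggested test data: a match with a NULL, a plain match,
# a non-match with a NULL, and a NULL list.
rows = [["foo", None], ["foo"], ["bar", None], None]


def has_any_ignore_nulls(labels, needles):
    """NULL elements are skipped; only a NULL list yields NULL (None)."""
    if labels is None:
        return None
    return any(x in needles for x in labels if x is not None)


def has_any_propagate_nulls(labels, needles):
    """Assumed alternative: an unmatched list with a NULL element is unknown."""
    if labels is None:
        return None
    if any(x in needles for x in labels if x is not None):
        return True
    return None if any(x is None for x in labels) else False


ignore = [has_any_ignore_nulls(r, ["foo"]) for r in rows]
propagate = [has_any_propagate_nulls(r, ["foo"]) for r in rows]

# Rows on which the two models disagree.
differing = [i for i, (a, b) in enumerate(zip(ignore, propagate)) if a is not b]

print(ignore)     # [True, True, False, None]
print(propagate)  # [True, True, None, None]
print(differing)  # [2]
```

With the original three-row table the two models agree on every row, so a test built on it could not tell them apart; the `["bar", None]` row is the one that separates them.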

wjones127 · 2026-02-06T20:12:44Z

+        "NOT array_has_any(labels, ['foo'])",
+        "NOT array_has_all(labels, ['foo'])",
+        "NOT array_contains(labels, 'foo')",


If you want to address these in a different issue / PR, feel free to comment out the failing one and add a comment with a link to a follow up issue.

Thanks for the suggestion! I'll create a follow-up issue and submit a PR to address these separately.

fenfeng9 · 2026-02-07T07:27:43Z

There are two distinct issues.

1. Element-level NULLs (NULL items inside a non-NULL list): array_has_any / array_has_all should ignore NULL elements, so results are strictly TRUE/FALSE (no NULL propagation).
2. List-level NULLs (the list itself is NULL): this still affects NOT semantics and is tracked separately in #5904, "LabelListIndex: NOT filters mis-handle NULL lists (list-level NULLs)".

This PR fixes (1) by clearing element-level NULLs in the LABEL_LIST index path, and splits the unit tests to cover element-level vs list-level NULLs.

Examples:

| Expression | Result | Reason |
|---|---|---|
| `array_has_any(["foo", NULL], ["foo"])` | TRUE | Found "foo"; NULL elements are ignored |
| `NOT array_has_any(["bar", NULL], ["foo"])` | TRUE | No match; NULL ignored → FALSE → NOT FALSE = TRUE |
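The rows of the table can be checked against a small reference model. This is a sketch under the semantics stated in this comment (NULL elements ignored, a NULL list yields NULL); `array_has_any`, `array_has_all`, and `sql_not` here are plain-Python stand-ins named after the SQL functions, not the engine's implementation.

```python
from typing import Optional, Sequence


def array_has_any(labels: Optional[Sequence], needles: Sequence) -> Optional[bool]:
    """TRUE if any needle is among the non-NULL elements; NULL only for a NULL list."""
    if labels is None:
        return None
    present = {x for x in labels if x is not None}
    return any(n in present for n in needles)


def array_has_all(labels: Optional[Sequence], needles: Sequence) -> Optional[bool]:
    """TRUE if every needle appears among the non-NULL elements."""
    if labels is None:
        return None
    present = {x for x in labels if x is not None}
    return all(n in present for n in needles)


def sql_not(value: Optional[bool]) -> Optional[bool]:
    """Three-valued NOT: NULL stays NULL."""
    return None if value is None else not value


print(array_has_any(["foo", None], ["foo"]))           # True
print(sql_not(array_has_any(["bar", None], ["foo"])))  # True
print(array_has_all(["foo", None], ["foo", "bar"]))    # False
print(sql_not(array_has_any(None, ["foo"])))           # None
```

The last call shows the list-level case, which stays NULL under NOT and is the behavior tracked separately from this PR.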

wjones127

Nice work!

LabelList index still has issues with null element handling despite PR lance-format#5867 and PR lance-format#5914. Tests pass without LabelList index. Re-enable when fully fixed.

Issue: lance-format#5682

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

github-actions Bot added the bug and python labels Jan 31, 2026

fenfeng9 mentioned this pull request Feb 5, 2026

fix: avoid panic when repdef serializes empty offsets #5890

Merged

wjones127 self-assigned this Feb 5, 2026

wjones127 requested changes Feb 5, 2026

Comment thread python/python/tests/test_scalar_index.py

github-actions Bot mentioned this pull request Feb 5, 2026

test: add nested data query tests #5901

Draft

wjones127 reviewed Feb 6, 2026

fenfeng9 mentioned this pull request Feb 7, 2026

LabelListIndex: NOT filters mis-handle NULL lists (list-level NULLs) #5904

Closed

fenfeng9 and others added 4 commits February 7, 2026 15:12

Fix label list explain for NULL literals

625924c

Fix label list NULL overlap in bitmap index

b3b5b46

Update python/python/tests/test_scalar_index.py

d04587c

Co-authored-by: Will Jones <willjones127@gmail.com>

test: update label index test case

bc7897b

fenfeng9 force-pushed the fix/label-list-null-handling branch from 1c10037 to bc7897b Compare February 7, 2026 07:23

fix(lance-index): ignore null elements in label_list matching

690fe23

- Clear element-level nulls in label_list searches
- Update null-handling tests for label_list

wjones127 approved these changes Feb 9, 2026

wjones127 merged commit 827a59a into lance-format:main Feb 10, 2026
30 checks passed

andrea-reale mentioned this pull request Mar 30, 2026

emilk/fix write starvation rerun-io/lance#12

Closed

wjones127 merged 5 commits into lance-format:main from fenfeng9:fix/label-list-null-handling
