fix: handle DataType::Null in adjust_child_validity to prevent panic #6160
Merged
wjones127 merged 2 commits into lance-format:main on Mar 11, 2026
`adjust_child_validity` would call `ArrayData::try_new` with a null
bitmap on a `DataType::Null` array. Arrow rejects this with
`InvalidArgumentError("Arrays of type Null cannot contain a null
bitmask")`, causing an `.unwrap()` panic at lance-arrow/src/lib.rs:1187.
The panic occurs when a struct column has null rows and one of its
sub-fields has `DataType::Null` — which Arrow infers when a column
contains only null values (e.g. a Python/pandas all-None column). When a
later fragment omits that nullable sub-field, Lance inserts a NullReader
to fill it in. MergeStream then merges the real batch (with null struct
rows) and the NullReader batch (all-null struct), recursing into the
struct where `adjust_child_validity` is called with the Null-typed child
and a non-empty parent validity — triggering the panic.
Fix: skip the bitmap operation when `child.data_type() == DataType::Null`.
A Null array is always entirely null by definition and needs no
validity adjustment.
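
To make the failure and the guard concrete, here is a minimal stdlib-only sketch. It is a toy model, not the actual Lance or Arrow code: `ToyArray`, its `try_new`, and the mask-merging logic are hypothetical stand-ins that only mimic the constraint described above (a Null-typed array may not carry a validity mask). It assumes the parent mask has the same length as the child.

```rust
// Toy model only: these types are hypothetical stand-ins, not Arrow's or Lance's.

#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Null,
    Int32,
}

/// Simplified array: a type, a length, and an optional validity mask
/// (true = valid, false = null).
#[derive(Debug, Clone, PartialEq)]
struct ToyArray {
    data_type: DataType,
    len: usize,
    validity: Option<Vec<bool>>,
}

impl ToyArray {
    /// Mimics the constraint from the description: a Null-typed array
    /// is rejected if it is given a validity mask.
    fn try_new(
        data_type: DataType,
        len: usize,
        validity: Option<Vec<bool>>,
    ) -> Result<Self, String> {
        if data_type == DataType::Null && validity.is_some() {
            return Err("Arrays of type Null cannot contain a null bitmask".to_string());
        }
        Ok(Self { data_type, len, validity })
    }
}

/// Pushes the parent's nulls down into the child: a slot stays valid only
/// if it is valid in both the parent and the child (assumed semantics).
fn adjust_child_validity(child: &ToyArray, parent_validity: Option<&[bool]>) -> ToyArray {
    // No parent nulls: nothing to adjust.
    let parent = match parent_validity {
        Some(p) => p,
        None => return child.clone(),
    };

    // The guard: a Null-typed child is already entirely null, so skip the
    // mask operation instead of building an array that try_new rejects.
    if child.data_type == DataType::Null {
        return child.clone();
    }

    let merged: Vec<bool> = match &child.validity {
        Some(c) => c.iter().zip(parent).map(|(c, p)| *c && *p).collect(),
        None => parent.to_vec(),
    };
    // Safe in this toy model: the Null case returned early above.
    ToyArray::try_new(child.data_type.clone(), child.len, Some(merged)).unwrap()
}

fn main() {
    let parent = [true, false, true];

    // What the unguarded path effectively attempted: a mask on a Null array.
    let rejected = ToyArray::try_new(DataType::Null, 3, Some(parent.to_vec()));
    println!("unguarded construction: {:?}", rejected);

    // With the guard, the Null child passes through unchanged.
    let null_child = ToyArray::try_new(DataType::Null, 3, None).unwrap();
    let adjusted = adjust_child_validity(&null_child, Some(&parent[..]));
    println!("guarded result: {:?}", adjusted);
}
```

In this model, removing the early return for `DataType::Null` would send the Null child into the final `try_new(...).unwrap()` with a mask attached, which is the same shape of failure the description reports.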
Fixes lance-format#6159
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Review: Clean, well-documented bugfix. The root cause analysis is thorough and the fix is minimal and correct. No blocking issues found. The early return for … Both the unit test (direct … LGTM.
westonpace approved these changes on Mar 10, 2026
Codecov Report: ✅ All modified and coverable lines are covered by tests.
westonpace pushed a commit that referenced this pull request on Mar 17, 2026
…6160)

Previously, `adjust_child_validity` would call `ArrayData::try_new` with a null bitmap on a `DataType::Null` array, causing an `.unwrap()` panic with `InvalidArgumentError("Arrays of type Null cannot contain a null bitmask")`.

The trigger: when a user inserts rows where a struct sub-field has only null values, Arrow infers `DataType::Null` for that column. If a subsequent fragment omits that nullable sub-field, Lance inserts a `NullReader` to fill it in. `MergeStream` then merges the real batch (with null struct rows) and the `NullReader` batch (all-null struct), recursing into the struct where `adjust_child_validity` is called with the `Null`-typed child and a non-empty parent validity, which triggers the panic.

Fix: skip the bitmask operation when `child.data_type() == DataType::Null`. A `Null` array is always entirely null by definition and needs no validity adjustment.

Closes #6159

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
westonpace added a commit that referenced this pull request on Mar 18, 2026
## Summary

Cherry-picks bug fixes onto `release/v3.0` for the v3.0.1 patch release:

- **#6160** - fix: handle `DataType::Null` in `adjust_child_validity` to prevent panic
- **#6187** - fix: handle nullable validity layers without def levels
- **#6143** - fix: prevent duplicate manifest entries from concurrent table creation
- **#6212** - chore: bump lz4_flex patch versions
- **#6146** - fix: replace fetch_arrow_table with to_arrow_table

## Test plan

- CI passes on cherry-picked commits (both PRs were already merged and tested on main)

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Xuanwo <github@xuanwo.io>
Co-authored-by: Jonathan Hsieh <jon@lancedb.com>
Co-authored-by: BubbleCal <bubble-cal@outlook.com>