Skip to content

fix: filter garbage entries from null maps during encoding#5591

Merged
wjones127 merged 3 commits intolance-format:mainfrom
wkalt:task/fix-struct-map
Dec 31, 2025
Merged

fix: filter garbage entries from null maps during encoding#5591
wjones127 merged 3 commits intolance-format:mainfrom
wkalt:task/fix-struct-map

Conversation

@wkalt
Copy link
Copy Markdown
Contributor

@wkalt wkalt commented Dec 29, 2025

Null map entries can have non-zero length with garbage values that should be ignored. MapStructuralEncoder was passing all entries to the child encoder, but repdef only counted valid entries, which caused errors to occur when encoding Structs with Map values.

Add MapArrayExt trait (mirroring ListArrayExt) with filter_garbage_nulls() and trimmed_entries() methods, and use them in MapStructuralEncoder.

@github-actions github-actions Bot added the bug Something isn't working label Dec 29, 2025
Copy link
Copy Markdown
Contributor

@wjones127 wjones127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work. I have minor suggestion, if you want to simplify some implementation. Otherwise, I'll merge this tomorrow.

Comment thread rust/lance-arrow/src/map.rs Outdated
}

impl MapArrayExt for MapArray {
fn filter_garbage_nulls(&self) -> Self {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Physically, MapArray is a ListArray. Part of my wonders if this would be simpler to just do let trimmed = ListArray::from(self.clone()).filter_garbage_nulls() and the convert back into a MapArray.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. Just updated.

Null map entries can have non-zero length with garbage values that
should be ignored. MapStructuralEncoder was passing all entries to
the child encoder, but repdef only counted valid entries, which caused
errors to occur when encoding Structs with Map values.

Add MapArrayExt trait (mirroring ListArrayExt) with filter_garbage_nulls()
and trimmed_entries() methods, and use them in MapStructuralEncoder.
@wkalt wkalt force-pushed the task/fix-struct-map branch from 6c34b98 to 2ab3e7c Compare December 31, 2025 01:10
@codecov
Copy link
Copy Markdown

codecov Bot commented Dec 31, 2025

Codecov Report

❌ Patch coverage is 87.50000% with 1 line in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
rust/lance-encoding/src/encodings/logical/map.rs 87.50% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

@wjones127 wjones127 merged commit 5e6a460 into lance-format:main Dec 31, 2025
31 checks passed
jackye1995 pushed a commit to jackye1995/lance that referenced this pull request Jan 21, 2026
…mat#5591)

Null map entries can have non-zero length with garbage values that
should be ignored. MapStructuralEncoder was passing all entries to the
child encoder, but repdef only counted valid entries, which caused
errors to occur when encoding Structs with Map values.

Add MapArrayExt trait (mirroring ListArrayExt) with
filter_garbage_nulls() and trimmed_entries() methods, and use them in
MapStructuralEncoder.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants