Skip to content

fix(encoding): handle empty rows in variable packed struct decode#5995

Merged
Xuanwo merged 1 commit intomainfrom
xuanwo/fix-packed-empty-row
Feb 24, 2026
Merged

fix(encoding): handle empty rows in variable packed struct decode#5995
Xuanwo merged 1 commit intomainfrom
xuanwo/fix-packed-empty-row

Conversation

@Xuanwo
Copy link
Copy Markdown
Collaborator

@Xuanwo Xuanwo commented Feb 24, 2026

In full-zip variable packed decoding, rep/def may produce visible rows with empty payloads (for null/invalid items). The decoder previously assumed every visible row had bytes for each child and failed with Packed struct fixed child exceeds row bounds.

This happened during write new tables with blob v2.

Summary

  • fix PackedStructVariablePerValueDecompressor to handle empty packed rows (row_start == row_end)
  • append one per-child placeholder value for empty rows so child builders remain row-aligned

Parts of this PR were drafted with assistance from Codex (with gpt-5.3-codex) and fully reviewed and edited by me. I take full responsibility for all changes.

@github-actions github-actions Bot added the bug Something isn't working label Feb 24, 2026
@github-actions
Copy link
Copy Markdown
Contributor

PR Review: fix(encoding): handle empty rows in variable packed struct decode

Summary: This is a well-targeted bug fix for handling empty rows in variable packed struct decoding.

Assessment: ✅ LGTM

No P0/P1 issues identified.

The fix correctly handles the edge case where rep/def produces visible rows with empty payloads (row_start == row_end). The approach of pre-creating empty placeholder values and appending them when empty rows are encountered is sound and maintains child row alignment as expected.

Minor observations (non-blocking):

  • The test coverage is adequate for the specific bug being fixed
  • Pre-allocating empty_value for each field introduces minimal overhead and is only used when needed

The implementation follows the existing code patterns well.

@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 24, 2026

Codecov Report

❌ Patch coverage is 92.63158% with 7 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...st/lance-encoding/src/encodings/physical/packed.rs 92.63% 6 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown
Member

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

@Xuanwo Xuanwo merged commit 348b4df into main Feb 24, 2026
32 checks passed
@Xuanwo Xuanwo deleted the xuanwo/fix-packed-empty-row branch February 24, 2026 17:06
wjones127 pushed a commit to wjones127/lance that referenced this pull request Feb 25, 2026
…nce-format#5995)

In full-zip variable packed decoding, rep/def may produce visible rows
with empty payloads (for null/invalid items). The decoder previously
assumed every visible row had bytes for each child and failed with
`Packed struct fixed child exceeds row bounds`.

This happened during write new tables with blob v2.

## Summary

- fix `PackedStructVariablePerValueDecompressor` to handle empty packed
rows (`row_start == row_end`)
- append one per-child placeholder value for empty rows so child
builders remain row-aligned

---

**Parts of this PR were drafted with assistance from Codex (with
`gpt-5.3-codex`) and fully reviewed and edited by me. I take full
responsibility for all changes.**
wjones127 pushed a commit to wjones127/lance that referenced this pull request Feb 25, 2026
…nce-format#5995)

In full-zip variable packed decoding, rep/def may produce visible rows
with empty payloads (for null/invalid items). The decoder previously
assumed every visible row had bytes for each child and failed with
`Packed struct fixed child exceeds row bounds`.

This happened during write new tables with blob v2.

## Summary

- fix `PackedStructVariablePerValueDecompressor` to handle empty packed
rows (`row_start == row_end`)
- append one per-child placeholder value for empty rows so child
builders remain row-aligned

---

**Parts of this PR were drafted with assistance from Codex (with
`gpt-5.3-codex`) and fully reviewed and edited by me. I take full
responsibility for all changes.**
wjones127 pushed a commit that referenced this pull request Feb 26, 2026
)

In full-zip variable packed decoding, rep/def may produce visible rows
with empty payloads (for null/invalid items). The decoder previously
assumed every visible row had bytes for each child and failed with
`Packed struct fixed child exceeds row bounds`.

This happened during write new tables with blob v2.

## Summary

- fix `PackedStructVariablePerValueDecompressor` to handle empty packed
rows (`row_start == row_end`)
- append one per-child placeholder value for empty rows so child
builders remain row-aligned

---

**Parts of this PR were drafted with assistance from Codex (with
`gpt-5.3-codex`) and fully reviewed and edited by me. I take full
responsibility for all changes.**
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants