perf: avoid oversized variable buffers in full-zip scan batches #6013
PR Review
Summary: Performance fix for the variable full-zip decoding regression. The fix correctly slices and rebases variable-width data buffers to bound memory for each batch.
Assessment
No P0/P1 issues found. The implementation is sound.
Minor Observations (not blocking)
LGTM 👍
westonpace left a comment
In theory the downstream decompressors should be able to realize they don't need to decompress the entire data buffer by looking at the offsets. This would avoid the offsets copy introduced here.
However, an offsets copy is relatively cheap (offsets will never be that large), and we are clearly doing the wrong thing now, so this PR is a step in the right direction.
Can we just file a follow-up to see if we can remove this copy sometime in the future?
Sure, will do!
Filed a follow-up issue to investigate removing the per-batch offsets copy while preserving correctness and perf guarantees: #6027
…e-format#6013)

`VariableFullZipDecoder::drain` used to pass the entire page-level variable `data` buffer to every batch while slicing only the batch offsets. This broke the normalized variable-buffer assumption (offsets should be 0-based for the provided data window) and caused downstream variable decompressors to repeatedly size work against oversized input buffers.

Benchmarks:
- scan-full: 34047us -> 15703us, 2.168x
- caption: 28431us -> 6979us, 4.074x

## Changes
- Add `LanceBuffer::slice_and_rebase_offsets` in `rust/lance-encoding/src/buffer.rs`.
  - Slice data to the byte range referenced by the batch offsets.
  - Rebase offsets to start at zero.
  - Validate invalid/overflowing offsets.
- Update variable full-zip batch drain path to use the new helper.
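
For readers who have not opened the diff, the slice-and-rebase step described above can be sketched roughly as follows. This is a standalone illustration, not the code from this PR: `slice_and_rebase` and `OffsetError` are hypothetical names, it works on plain slices rather than `LanceBuffer`, and it assumes `u32` offsets with one more offset than values (the usual variable-width layout).

```rust
/// Hypothetical error type for this sketch (not part of lance-encoding).
#[derive(Debug, PartialEq)]
enum OffsetError {
    /// No offsets were provided, so there is no window to slice.
    Empty,
    /// Offsets decrease somewhere, so they cannot describe contiguous values.
    NotMonotonic,
    /// The last offset points past the end of the data buffer.
    OutOfBounds,
}

/// Returns the byte window of `data` referenced by `offsets`, together with a
/// copy of the offsets shifted so that the first one is zero.
fn slice_and_rebase<'a>(
    data: &'a [u8],
    offsets: &[u32],
) -> Result<(&'a [u8], Vec<u32>), OffsetError> {
    let (first, last) = match (offsets.first(), offsets.last()) {
        (Some(&f), Some(&l)) => (f, l),
        _ => return Err(OffsetError::Empty),
    };
    // Monotonic offsets guarantee first <= every offset <= last, so the
    // subtraction below cannot underflow.
    if offsets.windows(2).any(|w| w[0] > w[1]) {
        return Err(OffsetError::NotMonotonic);
    }
    let (start, end) = (first as usize, last as usize);
    if end > data.len() {
        return Err(OffsetError::OutOfBounds);
    }
    // This is the per-batch offsets copy discussed in the review.
    let rebased: Vec<u32> = offsets.iter().map(|&o| o - first).collect();
    Ok((&data[start..end], rebased))
}
```

With a page buffer of `b"aabbbccccdd"` and batch offsets `[2, 5, 9]`, the sketch yields the 7-byte window `b"bbbcccc"` and rebased offsets `[0, 3, 7]`, so a downstream consumer sizes its work against 7 bytes instead of the whole page.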