feat: add RLE support for block#4937
Conversation
|
ACTION NEEDED The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification. For details on the error please inspect the "PR Title Check" action. |
Xuanwo
left a comment
There was a problem hiding this comment.
Thank you for working on this!
| out_of_line.uncompressed_bits_per_value, | ||
| ))) | ||
| } | ||
| _ => todo!(), |
There was a problem hiding this comment.
We need to support RLE here.
| assert!(debug_str.contains("GeneralMiniBlockCompressor")); | ||
| assert!(debug_str.contains("RleMiniBlockEncoder")); | ||
| assert!(debug_str.contains("RleEncoder")); | ||
| } |
There was a problem hiding this comment.
Maybe it's worth to add an end-to-end test for RLE on block.
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
Xuanwo
left a comment
There was a problem hiding this comment.
Thank you, I think this PR is almost ready to go. Here are some small nit changes.
| } | ||
| } | ||
| /// Validates RLE compression format and extracts bits_per_value | ||
| fn validate_rle_compression(rle: &crate::format::pb21::Rle) -> u64 { |
There was a problem hiding this comment.
Hi, could you explain more about this check?
There was a problem hiding this comment.
ah, this block is from the original mini_block decompression validation.
I move it to a single function so it can be reused across mini_block and block
| } | ||
|
|
||
| let values_size_bytes: [u8; 8] = data[..8].try_into().unwrap(); | ||
| let values_size: usize = usize::from_le_bytes(values_size_bytes); |
There was a problem hiding this comment.
We encoded values_size as u64 and it's better to use the same value.
| } | ||
| } | ||
|
|
||
| impl BlockCompressor for RleEncoder { |
There was a problem hiding this comment.
Maybe it's worth a comment here to declare that we are concating values & run_lengths buffers together.
d008bd0 to
dbc2fad
Compare
|
@Xuanwo |
8dddf16 to
310db90
Compare
resolve lance-format#4897 --------- Co-authored-by: stevie9868 <yingjianwu2@email.com> Co-authored-by: Xuanwo <github@xuanwo.io>
resolve #4897