feat: add RLE support for block #4937

Merged
Xuanwo merged 8 commits into lance-format:main from yingjianwu98:supportRLEForBlock
Jan 28, 2026

Conversation

@yingjianwu98
Contributor

@yingjianwu98 yingjianwu98 commented Oct 13, 2025

resolve #4897

@github-actions github-actions Bot added the enhancement New feature or request label Oct 13, 2025
@github-actions
Contributor

ACTION NEEDED
Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.

@yingjianwu98 yingjianwu98 changed the title feat:Support rle for block feat: Support rle for block Oct 13, 2025
@yingjianwu98 yingjianwu98 changed the title feat: Support rle for block feat: add RLE support for block Oct 13, 2025
Collaborator

@Xuanwo Xuanwo left a comment

Thank you for working on this!

out_of_line.uncompressed_bits_per_value,
)))
}
_ => todo!(),
Collaborator

We need to support RLE here.

assert!(debug_str.contains("GeneralMiniBlockCompressor"));
assert!(debug_str.contains("RleMiniBlockEncoder"));
assert!(debug_str.contains("RleEncoder"));
}
Collaborator

It may be worth adding an end-to-end test for RLE on block.

@codecov-commenter

codecov-commenter commented Oct 13, 2025

Codecov Report

❌ Patch coverage is 86.52695% with 45 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| rust/lance-encoding/src/compression.rs | 75.39% | 25 Missing and 6 partials ⚠️ |
| rust/lance-encoding/src/encodings/physical/rle.rs | 93.00% | 13 Missing and 1 partial ⚠️ |

@yingjianwu98 yingjianwu98 requested a review from Xuanwo October 13, 2025 16:54
Collaborator

@Xuanwo Xuanwo left a comment

Thank you, I think this PR is almost ready to go. Here are some small nit changes.

Comment thread rust/lance-encoding/src/compression.rs Outdated
}
}
/// Validates RLE compression format and extracts bits_per_value
fn validate_rle_compression(rle: &crate::format::pb21::Rle) -> u64 {
Collaborator

Hi, could you explain more about this check?

Contributor Author

Ah, this block comes from the original mini_block decompression validation.

I moved it into a single function so it can be reused across mini_block and block.

}

let values_size_bytes: [u8; 8] = data[..8].try_into().unwrap();
let values_size: usize = usize::from_le_bytes(values_size_bytes);
Collaborator

We encode values_size as a u64, so it's better to decode it with the same type.
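
To illustrate the point of this comment: `usize::from_le_bytes` only accepts an 8-byte array on 64-bit targets, so decoding the header as `u64` and converting afterwards keeps the read side aligned with the write side. The following is a minimal sketch, not the code from this PR; the helper name `read_values_size` and its `String` error type are hypothetical.

```rust
use std::convert::TryFrom;

/// Hypothetical helper (not part of Lance): reads the 8-byte little-endian
/// length prefix as a u64, matching the width it was written with, then
/// converts it to usize with a checked conversion.
fn read_values_size(data: &[u8]) -> Result<usize, String> {
    // Reject buffers that cannot hold the 8-byte header instead of panicking.
    let prefix = data
        .get(..8)
        .ok_or_else(|| format!("buffer too short for header: {} bytes", data.len()))?;
    let mut header = [0u8; 8];
    header.copy_from_slice(prefix);

    // Decode with the same integer type that was encoded.
    let values_size = u64::from_le_bytes(header);

    // The conversion can only fail on targets where usize is narrower than u64.
    usize::try_from(values_size)
        .map_err(|_| format!("values_size {} does not fit in usize", values_size))
}
```

With this shape, a truncated buffer yields an `Err` rather than a panic from slicing or `unwrap()`.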

}
}

impl BlockCompressor for RleEncoder {
Collaborator

It may be worth adding a comment here to state that we are concatenating the values and run_lengths buffers together.
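
For readers unfamiliar with the layout this comment refers to, here is a rough sketch of packing an RLE-encoded block into one buffer. It assumes the layout `[values_size: u64 LE][values][run_lengths]`, `u32` values, and one-byte run lengths; those details, and the names `rle_encode`, `pack_block`, and `unpack_block`, are illustrative assumptions rather than the implementation in `rle.rs`.

```rust
use std::convert::TryFrom;

/// Run-length encodes `input` into parallel `values` and `run_lengths`.
/// Assumption: run lengths are one byte, so runs longer than 255 are split.
fn rle_encode(input: &[u32]) -> (Vec<u32>, Vec<u8>) {
    let (mut values, mut run_lengths) = (Vec::new(), Vec::new());
    let mut i = 0;
    while i < input.len() {
        let v = input[i];
        let mut len = 1usize;
        while i + len < input.len() && input[i + len] == v && len < 255 {
            len += 1;
        }
        values.push(v);
        run_lengths.push(len as u8);
        i += len;
    }
    (values, run_lengths)
}

/// Concatenates both buffers into one block (assumed layout):
/// [values_size: u64 LE][values bytes][run_lengths bytes]
fn pack_block(values: &[u32], run_lengths: &[u8]) -> Vec<u8> {
    let values_size = (values.len() * 4) as u64;
    let mut out = Vec::with_capacity(8 + values.len() * 4 + run_lengths.len());
    out.extend_from_slice(&values_size.to_le_bytes());
    for v in values {
        out.extend_from_slice(&v.to_le_bytes());
    }
    out.extend_from_slice(run_lengths);
    out
}

/// Splits the block at `values_size` and expands the runs.
/// Returns None if the buffer is malformed.
fn unpack_block(data: &[u8]) -> Option<Vec<u32>> {
    let mut header = [0u8; 8];
    header.copy_from_slice(data.get(..8)?);
    let values_size = usize::try_from(u64::from_le_bytes(header)).ok()?;
    let values_end = 8usize.checked_add(values_size)?;
    let values_bytes = data.get(8..values_end)?;
    let run_lengths = data.get(values_end..)?;
    // Each 4-byte value must have exactly one run length.
    if values_size % 4 != 0 || values_bytes.len() / 4 != run_lengths.len() {
        return None;
    }
    let mut out = Vec::new();
    for (chunk, &len) in values_bytes.chunks_exact(4).zip(run_lengths) {
        let v = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
        out.extend(std::iter::repeat(v).take(len as usize));
    }
    Some(out)
}
```

The length prefix is what lets the decoder find the boundary between the two buffers, which is why it is worth documenting at the point where they are concatenated.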

@yingjianwu98
Contributor Author

@Xuanwo
I have updated the PR based on comments! Let me know what you think, thanks!

@Xuanwo Xuanwo merged commit 5b56431 into lance-format:main Jan 28, 2026
29 checks passed
vivek-bharathan pushed a commit to vivek-bharathan/lance that referenced this pull request Feb 2, 2026
resolve lance-format#4897

---------

Co-authored-by: stevie9868 <yingjianwu2@email.com>
Co-authored-by: Xuanwo <github@xuanwo.io>

Labels

enhancement New feature or request python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add RLE for block compression

3 participants