perf: upgrade roaring to 0.11 and improve bitmap/range conversions #5961

Merged
westonpace merged 2 commits into lance-format:main from LuQQiu:improveBitmapConversion
Feb 18, 2026

Conversation

@LuQQiu
Contributor

@LuQQiu LuQQiu commented Feb 17, 2026

Use roaring 0.11's next_range() in bitmap_to_ranges to extract contiguous ranges directly instead of iterating bit-by-bit. Add ranges_to_bitmap with an adaptive strategy that picks from_sorted_iter for short ranges and insert_range for long ranges.
This improves the 10M-row bitmap <-> Vec roundtrip from a range of 33 ms to 158 ms down to a range of 0.004 ms to 53 ms, depending on the fraction of rows selected.

Validated that this does not introduce a forward-compatibility issue: bitmaps serialized by roaring 0.11+ can be read by roaring >= 0.10.2, Lance never shipped with a roaring version < 0.10.2, and roaring is always backward compatible. This change therefore introduces no backward- or forward-compatibility issue.

=== 1% selected ===
Ranges: 100,000 | Bitmap: 197 KB | Vec: 1.5 MB
Vec -> Bitmap: 0.428 ms
Bitmap -> Vec: 3.068 ms
Roundtrip: 3.488 ms

=== 5% selected ===
Ranges: 500,000 | Bitmap: 978 KB | Vec: 7.6 MB
Vec -> Bitmap: 1.909 ms
Bitmap -> Vec: 19.384 ms
Roundtrip: 21.297 ms

=== 10% selected ===
Ranges: 1,000,000 | Bitmap: 1.2 MB | Vec: 15.3 MB
Vec -> Bitmap: 5.061 ms
Bitmap -> Vec: 5.802 ms
Roundtrip: 11.050 ms

=== 30% selected ===
Ranges: 3,000,000 | Bitmap: 1.1 MB | Vec: 45.8 MB
Vec -> Bitmap: 13.044 ms
Bitmap -> Vec: 19.142 ms
Roundtrip: 32.100 ms

=== 50% selected ===
Ranges: 5,000,000 | Bitmap: 1.2 MB | Vec: 76.3 MB
Vec -> Bitmap: 21.009 ms
Bitmap -> Vec: 31.781 ms
Roundtrip: 53.135 ms

=== 70% selected ===
Ranges: 3,000,001 | Bitmap: 1.2 MB | Vec: 45.8 MB
Vec -> Bitmap: 28.787 ms
Bitmap -> Vec: 19.244 ms
Roundtrip: 48.007 ms

=== 90% selected ===
Ranges: 1,000,000 | Bitmap: 3.3 MB | Vec: 15.3 MB
Vec -> Bitmap: 26.929 ms
Bitmap -> Vec: 4.847 ms
Roundtrip: 31.887 ms

=== 95% selected ===
Ranges: 500,000 | Bitmap: 1.9 MB | Vec: 7.6 MB
Vec -> Bitmap: 14.380 ms
Bitmap -> Vec: 2.332 ms
Roundtrip: 16.514 ms

=== 99% selected ===
Ranges: 100,000 | Bitmap: 393 KB | Vec: 1.5 MB
Vec -> Bitmap: 2.497 ms
Bitmap -> Vec: 0.455 ms
Roundtrip: 3.005 ms

=== 100% selected ===
Ranges: 1 | Bitmap: 2 KB | Vec: 16 B
Vec -> Bitmap: 0.003 ms
Bitmap -> Vec: 0.001 ms
Roundtrip: 0.004 ms
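
The Vec column in the output above appears consistent with 16 bytes per range, reported in binary megabytes. The sketch below reproduces that arithmetic, assuming the vector holds `Range<u64>` elements (as the `bitmap_to_ranges` loop quoted later in this PR suggests). `vec_payload_bytes` is a hypothetical helper for illustration, not part of Lance.

```rust
use std::mem::size_of;
use std::ops::Range;

/// Hypothetical helper: heap payload of a `Vec<Range<u64>>` with `n` elements,
/// ignoring spare capacity and the Vec header itself.
fn vec_payload_bytes(n: usize) -> usize {
    n * size_of::<Range<u64>>()
}

fn main() {
    // A Range<u64> is two u64 fields, so 16 bytes.
    assert_eq!(size_of::<Range<u64>>(), 16);

    for n in [100_000usize, 1_000_000, 5_000_000] {
        let mib = vec_payload_bytes(n) as f64 / (1024.0 * 1024.0);
        println!("{n} ranges -> {mib:.1} MiB");
    }
}
```

Under that assumption, 100,000, 1,000,000, and 5,000,000 ranges come out at roughly 1.5, 15.3, and 76.3 MiB, which lines up with the 1%, 10%, and 50% rows.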

@github-actions
Contributor

Code Review

P0: Breaking Change - Wire Format Compatibility

The PR description notes:

Roaring 0.11 introduces run containers via insert_range on new containers, which changes the wire format. Old Lance (roaring 0.10) cannot deserialize bitmaps containing run containers.

This is a breaking change that needs careful consideration:

  1. Is there a way to disable run containers to maintain backwards compatibility?
  2. Should the Lance format version be bumped?
  3. How will this affect users trying to read newer datasets with older Lance versions?

At minimum, this should be documented in release notes and potentially in a migration guide.

P1: Potential Panic on Division by Zero

In ranges_to_bitmap:

let sample_size = ranges.len().min(10);
let avg_len: u64 = ranges
    .iter()
    .take(sample_size)
    .map(|r| r.end - r.start)
    .sum::<u64>()
    / sample_size as u64;

The division itself is safe: sample_size >= 1 because the function returns early when ranges.is_empty(), so the divisor is non-zero even if every sampled range is empty (end == start). However, a malformed range with start > end would make r.end - r.start underflow. Consider adding a note or a defensive check.
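
One possible shape for such a defensive check, as a minimal sketch rather than the code that was merged: `sampled_avg_len` is a hypothetical stand-in for the sampling step quoted above, and it uses `saturating_sub` so that a malformed range counts as length 0. In plain Rust, `r.end - r.start` on `u64` panics in debug builds and wraps in release builds when `start > end`.

```rust
use std::ops::Range;

/// Hypothetical helper mirroring the sampling step quoted above, but using
/// `saturating_sub` so a malformed range (start > end) counts as length 0
/// instead of underflowing.
fn sampled_avg_len(ranges: &[Range<u64>]) -> Option<u64> {
    if ranges.is_empty() {
        return None; // nothing to sample; also keeps the divisor non-zero
    }
    let sample_size = ranges.len().min(10);
    let total: u64 = ranges
        .iter()
        .take(sample_size)
        .map(|r| r.end.saturating_sub(r.start))
        .sum();
    Some(total / sample_size as u64)
}

fn main() {
    let ok: Vec<Range<u64>> = vec![0..4, 10..12];
    let malformed: Vec<Range<u64>> = vec![Range { start: 5, end: 3 }, 0..4];
    println!("{:?}", sampled_avg_len(&ok));
    println!("{:?}", sampled_avg_len(&malformed));
}
```

Whether to clamp, skip, or reject malformed ranges is a design choice. Clamping keeps the heuristic cheap, while returning an error would surface the bad input to the caller.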

P1: Missing Documentation

The new ranges_to_bitmap function and the updated bitmap_to_ranges are public APIs but lack doc comments. The original bitmap_to_ranges had documentation that was removed. Please add:

  • Doc comments explaining the functions' purpose
  • Description of the sorted parameter for ranges_to_bitmap
  • Note about the adaptive strategy for performance

Minor Notes

  • The crossover threshold of 6 is interesting - the comment explains the tradeoff well
  • Good test coverage for the new functions
  • The performance improvements look substantial based on the benchmarks

Overall the implementation looks correct and the performance gains are impressive. The main concerns are around the breaking wire format change and the missing documentation.

@LuQQiu changed the title from "perf: upgrade roaring to 0.11 and improve bitmap/range conversions" to "[WIP] perf: upgrade roaring to 0.11 and improve bitmap/range conversions" on Feb 17, 2026
@github-actions
Contributor

ACTION NEEDED
Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.

@codecov

codecov Bot commented Feb 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@LuQQiu
Contributor Author

LuQQiu commented Feb 17, 2026

Before serializing the data, we can call remove_run_compression to serialize the old way, without run containers, so that a roaring bitmap created and serialized by roaring 0.11 can also be read by older Lance versions.
However, this makes the resulting roaring bitmap much bigger.
Test at 10M rows:

  Selectivity: 0.01%
  Size with runs: 146 B
  Size without runs: 2.0 KB
  Serialize (runs): 0 us
  Serialize (no runs): 0 us
  remove_run_compression: 0 us
  ────────────────────────────────────────
  Selectivity: 0.10%
  Size with runs: 1.4 KB
  Size without runs: 20.3 KB
  Serialize (runs): 0 us
  Serialize (no runs): 4 us
  remove_run_compression: 5 us
  ────────────────────────────────────────
  Selectivity: 1.00%
  Size with runs: 5.4 KB
  Size without runs: 196.5 KB
  Serialize (runs): 1 us
  Serialize (no runs): 42 us
  remove_run_compression: 48 us
  ────────────────────────────────────────
  Selectivity: 5.00%
  Size with runs: 21.1 KB
  Size without runs: 977.8 KB
  Serialize (runs): 5 us
  Serialize (no runs): 199 us
  remove_run_compression: 201 us
  ────────────────────────────────────────
  Selectivity: 10.00%
  Size with runs: 40.6 KB
  Size without runs: 1.20 MB
  Serialize (runs): 9 us
  Serialize (no runs): 91 us
  remove_run_compression: 54 us
  ────────────────────────────────────────
  Selectivity: 25.00%
  Size with runs: 99.3 KB
  Size without runs: 1.20 MB
  Serialize (runs): 24 us
  Serialize (no runs): 261 us
  remove_run_compression: 90 us
  ────────────────────────────────────────
  Selectivity: 50.00%
  Size with runs: 197.1 KB
  Size without runs: 1.20 MB
  Serialize (runs): 45 us
  Serialize (no runs): 67 us
  remove_run_compression: 161 us
  ────────────────────────────────────────
  Selectivity: 75.00%
  Size with runs: 307.1 KB
  Size without runs: 1.20 MB
  Serialize (runs): 67 us
  Serialize (no runs): 131 us
  remove_run_compression: 230 us
  ────────────────────────────────────────
  Selectivity: 90.00%
  Size with runs: 365.0 KB
  Size without runs: 1.20 MB
  Serialize (runs): 81 us
  Serialize (no runs): 104 us
  remove_run_compression: 263 us
  ────────────────────────────────────────
  Selectivity: 99.00%
  Size with runs: 410.7 KB
  Size without runs: 1.20 MB
  Serialize (runs): 118 us
  Serialize (no runs): 91 us
  remove_run_compression: 309 us
  ────────────────────────────────────────
  Selectivity: 100.00%
  Size with runs: 2.1 KB
  Size without runs: 1.20 MB
  Serialize (runs): 1 us
  Serialize (no runs): 87 us
  remove_run_compression: 50 us
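
The "Size without runs" plateau at 1.20 MB can be checked with a rough size model. The sketch below assumes the standard Roaring layout without run containers: 8 header bytes, 8 bytes of key, cardinality, and offset per container, then 2 bytes per value for array containers, capped at 8 KiB for bitmap containers. `size_without_runs` and `dense_cardinalities` are hypothetical helpers, not roaring or Lance APIs.

```rust
/// Values covered by one Roaring container (the low 16 bits of a u32).
const CONTAINER_SPAN: u64 = 1 << 16;
/// A bitmap container is a fixed 65,536-bit (8 KiB) block.
const BITMAP_CONTAINER_BYTES: u64 = 8192;

/// Hypothetical estimator for the run-free serialized size, given the number
/// of values in each non-empty container. Assumes 8 header bytes, 8 bytes of
/// key/cardinality/offset per container, and 2 bytes per value (array
/// container) capped at 8 KiB (bitmap container).
fn size_without_runs(cardinalities: &[u64]) -> u64 {
    let header = 8 + 8 * cardinalities.len() as u64;
    let payload: u64 = cardinalities
        .iter()
        .map(|&c| (2 * c).min(BITMAP_CONTAINER_BYTES))
        .sum();
    header + payload
}

/// Per-container cardinalities when every row in `0..rows` is selected.
fn dense_cardinalities(rows: u64) -> Vec<u64> {
    let full = rows / CONTAINER_SPAN;
    let rest = rows % CONTAINER_SPAN;
    let mut cards = vec![CONTAINER_SPAN; full as usize];
    if rest > 0 {
        cards.push(rest);
    }
    cards
}

fn main() {
    let bytes = size_without_runs(&dense_cardinalities(10_000_000));
    println!("{bytes} bytes = {:.2} MiB", bytes as f64 / (1024.0 * 1024.0));
}
```

Under these assumptions a fully selected 10M-row bitmap needs 153 bitmap containers, about 1.20 MiB, which matches the plateau from 10% upward. With runs, the same bitmap shrinks to the 2.1 KB shown in the 100% row. The low-selectivity rows depend on how the selected rows are spread across containers, which the output above does not state.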

@LuQQiu
Contributor Author

LuQQiu commented Feb 18, 2026

Upgrading to roaring 0.11 gives performance benefits but breaks serialization forward compatibility (older Lance code cannot read Lance files created by newer Lance code).

Roaring 0.11 introduces next_range() on the bitmap iterator, which extracts contiguous ranges directly from the container structure in O(1) per range, ~100x faster than the current bit-by-bit bitmap_to_ranges implementation at 10M rows. However, roaring 0.11 also changes insert_range behavior: on new containers it creates run-length encoded (RLE) containers, which produce a different wire format. Roaring versions before 0.10.2 cannot deserialize bitmaps containing run containers (error: "run containers are unsupported").

These serialized bitmaps are persisted across multiple locations in the Lance format:

  • Deletion files
  • Manifest metadata
  • Index files (_indices/{uuid}/*.lance)
    ....

For RoaringBitmap, we can call remove_run_compression() before serialization to strip run containers and maintain backward compatibility. The overhead is small (~300us at 10M rows). But RoaringTreemap (used in compaction, ngram indexing, and other places) has no remove_run_compression() method and no public access to its inner bitmaps, so there is no way to strip run containers from it. This means upgrading to roaring 0.11 risks breaking deserialization for any RoaringTreemap built with insert_range, and we cannot fully protect all serialization paths without either an upstream change to the roaring crate, replacing RoaringTreemap usage with our own RowAddrTreeMap (which we control), or staying on roaring 0.10 and giving up the next_range() performance gain.

@westonpace @wjones127 I would like to hear your opinions on how to deal with this scenario. Is there any chance this change can go in the next time we upgrade to a newer Lance version?

NEVER MIND. Validated that roaring versions starting from 0.10.2 can deserialize everything written by roaring 0.11+.
And Lance never shipped a binary with roaring < 0.10.6.
We are good here!

@LuQQiu
Contributor Author

LuQQiu commented Feb 18, 2026

Validated that roaring versions starting from 0.10.2 can deserialize everything written by roaring 0.11+.
And Lance never shipped a binary with roaring < 0.10.6.
We are good here!

@LuQQiu changed the title from "[WIP] perf: upgrade roaring to 0.11 and improve bitmap/range conversions" to "perf: upgrade roaring to 0.11 and improve bitmap/range conversions" on Feb 18, 2026
Member

@westonpace westonpace left a comment

Awesome cleanup, thanks!

Comment on lines +940 to +943
while let Some(r) = iter.next_range() {
    ranges.push(*r.start() as u64..(*r.end() as u64 + 1));
}
ranges
Member

This is so much nicer 😆
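
The `+ 1` in the quoted loop converts an inclusive range into a half-open one. The standalone sketch below shows only that conversion with std types. `to_half_open` is a hypothetical helper, and the `RangeInclusive<u32>` input is inferred from the `start()`/`end()` calls in the snippet rather than confirmed against the roaring docs.

```rust
use std::ops::{Range, RangeInclusive};

/// Hypothetical helper: convert an inclusive u32 range (the shape the quoted
/// loop reads via `start()`/`end()`) into a half-open u64 range.
fn to_half_open(r: RangeInclusive<u32>) -> Range<u64> {
    // Widen before adding 1: `*r.end() + 1` in u32 would overflow at u32::MAX.
    (*r.start() as u64)..(*r.end() as u64 + 1)
}

fn main() {
    println!("{:?}", to_half_open(3..=7));
    println!("{:?}", to_half_open(0..=u32::MAX));
}
```

Casting to `u64` before the addition is what keeps a range ending at `u32::MAX` representable.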

// from_sorted_iter appends each value in O(1) but must visit every u32.
// insert_range bulk-fills containers but does a binary search per call.
// Crossover is ~6: below that, iterating all values is cheaper.
if avg_len <= 6 {
Member

Nice
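
The branch quoted above can be sketched with std types only. Here `BuildPath`, `choose_path`, and `flatten` are hypothetical names; the threshold of 6 comes from the quoted snippet, and the actual roaring calls (`from_sorted_iter`, `insert_range`) are deliberately left out so the example stays self-contained.

```rust
use std::ops::Range;

/// Hypothetical label for the two build paths named in the quoted comment.
#[derive(Debug, PartialEq)]
enum BuildPath {
    /// Feed every value to a sorted-iterator constructor.
    PerValue,
    /// Insert each range in bulk.
    PerRange,
}

/// Crossover taken from the quoted snippet.
const CROSSOVER: u64 = 6;

fn choose_path(avg_len: u64) -> BuildPath {
    if avg_len <= CROSSOVER {
        BuildPath::PerValue
    } else {
        BuildPath::PerRange
    }
}

/// The flattened value stream the per-value path would consume. Assumes the
/// ranges are sorted, non-overlapping, and fit in u32.
fn flatten(ranges: &[Range<u64>]) -> impl Iterator<Item = u32> + '_ {
    ranges.iter().flat_map(|r| (r.start as u32)..(r.end as u32))
}

fn main() {
    let short: Vec<Range<u64>> = vec![0..2, 5..6, 9..12];
    println!("{:?}", choose_path(2));
    println!("{:?}", flatten(&short).collect::<Vec<_>>());
    println!("{:?}", choose_path(1_000));
}
```

The per-value path costs work proportional to the number of values, while the per-range path costs work proportional to the number of ranges, which is why the average range length is a reasonable switch.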

@westonpace westonpace merged commit d1d1e53 into lance-format:main Feb 18, 2026
30 of 31 checks passed

3 participants