perf: optimize stable row_id index build from O(rows) to O(fragments) #6310
Merged
BubbleCal merged 4 commits into lance-format:main, Mar 27, 2026
Conversation
Fix two performance bottlenecks in `RowIdIndex` construction that caused extreme cold-start latency when `enable_stable_row_id` is enabled:

1. `decompose_sequence` (lance-table/src/rowids/index.rs): previously expanded every `Range` segment element-by-element to check deletions, then re-compressed back. For `Range` segments with no deletions this is pure waste: the input is a `Range` and the output is the same `Range`. Add an O(1) fast path that constructs index chunks directly. Complexity: O(total_rows) → O(num_fragments).

2. `load_row_id_index` (lance/src/dataset/rowids.rs): used a linear search (`.find()`) over all fragments for each fragment_id, giving O(N²) total. Also spawned all N futures via `try_join_all` and called `get_deletion_vector()` even when no deletion file exists. Fix: a `HashMap` for O(1) lookup, `buffer_unordered` for controlled concurrency, and skipping the deletion vector load when unnecessary.

Benchmarks on real S3 datasets:

- 968M rows, 3540 fragments: 18.9s → 150ms (126x faster)
- 4.26B rows, 18243 fragments: 220s → 89ms (2471x faster)

All existing tests pass. Correctness verified against production data with boundary checks, random sampling, and data consistency validation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extract a shared `build_chunk_from_pairs` helper to eliminate duplication between `decompose_segment_no_deletions` and `decompose_segment_with_deletions`
- Use `.unzip()` instead of manual map/collect for splitting pairs
- Return `Option<IndexChunk>` instead of `Option<Vec<IndexChunk>>` to avoid unnecessary single-element `Vec` allocations
- Fix clippy warnings: redundant `.into_iter()`, `len() == 0` -> `is_empty()`
- Remove redundant comments that restate what the code does
- Remove OPTIMIZATION.md (content moved to PR description)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
BubbleCal reviewed, Mar 27, 2026
BubbleCal approved these changes, Mar 27, 2026
## Summary

When `enable_stable_row_id` is enabled, the first `take_rows()` call triggers a full `RowIdIndex` build. On large datasets this cold start was extremely slow (18.9s for 968M rows, 220s for 4.26B rows).

## Root Causes
### 1. O(total_rows) segment expansion in `decompose_sequence`

The original code expanded every `U64Segment` element-by-element, even for `Range` segments with no deletions. For a `Range(0..273711)` with no deletions, this meant 273K iterations, deletion-vector checks, temporary allocations, and re-compression, only to produce the same `Range` back. Across 18,243 fragments averaging 233K rows, this totaled 4.26 billion iterations with ~32 GB of temporary allocations.

### 2. O(N²) fragment lookup in `load_row_id_index`

The original code called `fragments.iter().find()` (an O(N) linear search) for each of N fragments, resulting in O(N²) comparisons. `try_join_all` spawned all N futures at once, overwhelming the async runtime, and `get_deletion_vector()` was called unconditionally, even for fragments without deletion files.

## Solution
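Root Cause 1 admits a direct construction: a `Range` segment with no deletions maps one-to-one onto a contiguous run of row addresses. A minimal sketch of that fast path, using simplified stand-in types rather than Lance's actual `U64Segment` and chunk representation:

```rust
use std::ops::Range;

// Simplified stand-ins for illustration; the real types live in lance-table.
enum Segment {
    Range(Range<u64>), // contiguous stable row ids
    Array(Vec<u64>),   // arbitrary stable row ids
}

struct IndexChunk {
    row_ids: Range<u64>,   // stable row ids covered by this chunk
    row_addrs: Range<u64>, // corresponding physical row addresses
}

// Fast path: a Range segment with no deletions maps 1:1 onto an address
// range, so the chunk can be built in O(1) without expanding the segment
// element-by-element.
fn decompose_segment(seg: &Segment, addr_start: u64, has_deletions: bool) -> Option<IndexChunk> {
    match seg {
        Segment::Range(r) if !has_deletions => Some(IndexChunk {
            row_ids: r.clone(),
            row_addrs: addr_start..addr_start + (r.end - r.start),
        }),
        // Slow path (deletions present, or a non-range segment) would
        // expand and filter element-by-element; elided here.
        _ => None,
    }
}

fn main() {
    let seg = Segment::Range(0..273_711);
    let chunk = decompose_segment(&seg, 0, false).unwrap();
    assert_eq!(chunk.row_ids, 0..273_711);
    assert_eq!(chunk.row_addrs, 0..273_711);
    println!("chunk covers {} rows", chunk.row_ids.end - chunk.row_ids.start);
}
```

The key design point is that the fast path never materializes individual row ids, so its cost is independent of fragment size.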
**Fix 1: O(1) fast path for `Range` segments without deletions.** When a fragment has no deletions and its row_id sequence is a `Range`, construct the index chunk directly without iterating.

**Fix 2: HashMap lookup + conditional deletion vector loading.** Use a `HashMap<u32, &FileFragment>` for O(1) lookup, `buffer_unordered` for controlled concurrency, and skip `get_deletion_vector()` when there is no deletion file.

## Results

- 968M rows, 3540 fragments: 18.9s → 150ms (126x faster)
- 4.26B rows, 18243 fragments: 220s → 89ms (2471x faster)
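Fix 2 above is a standard pre-indexing pattern. A minimal sketch with a simplified fragment record (field names hypothetical; the `buffer_unordered` concurrency part is omitted since it needs the async `futures` crate):

```rust
use std::collections::HashMap;

// Simplified fragment record for illustration.
struct Fragment {
    id: u32,
    has_deletion_file: bool,
}

// Before: an O(N) linear scan per lookup, O(N^2) over all fragments.
fn find_linear(fragments: &[Fragment], id: u32) -> Option<&Fragment> {
    fragments.iter().find(|f| f.id == id)
}

// After: build the map once in O(N); each subsequent lookup is O(1).
fn build_map(fragments: &[Fragment]) -> HashMap<u32, &Fragment> {
    fragments.iter().map(|f| (f.id, f)).collect()
}

fn main() {
    let fragments: Vec<Fragment> = (0..1000)
        .map(|id| Fragment { id, has_deletion_file: id % 7 == 0 })
        .collect();

    let by_id = build_map(&fragments);
    for f in &fragments {
        // O(1) lookup replaces the linear search.
        let found = by_id[&f.id];
        assert_eq!(found.id, f.id);
        // Skip the (expensive) deletion-vector load when there is no file.
        if !found.has_deletion_file {
            continue;
        }
        // ... load the deletion vector here ...
    }
    assert_eq!(find_linear(&fragments, 500).unwrap().id, 500);
    println!("indexed {} fragments", by_id.len());
}
```

Building the map costs one pass up front, which is negligible next to the N linear scans it replaces.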
## Test plan

`test_large_range_segments_no_deletions` validates fast-path correctness at boundaries and performance (100 fragments × 250K rows completes in < 1s).

🤖 Generated with Claude Code
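The boundary aspect of that test can be illustrated with a minimal sketch of the invariant it checks: chunks built for consecutive fragments must tile the row-address space with no gaps or overlaps (simplified, not the actual test code):

```rust
// Build contiguous (start, end) chunk ranges for `num` fragments of
// `rows` rows each, mirroring the 100 x 250K layout in the test plan.
fn build_chunks(num: u64, rows: u64) -> Vec<(u64, u64)> {
    (0..num).map(|i| (i * rows, (i + 1) * rows)).collect()
}

// Verify the chunks tile the space: each chunk ends exactly where the
// next one begins, so there are no gaps and no overlaps.
fn boundaries_ok(chunks: &[(u64, u64)]) -> bool {
    chunks.windows(2).all(|pair| pair[0].1 == pair[1].0)
}

fn main() {
    let chunks = build_chunks(100, 250_000);
    assert!(boundaries_ok(&chunks));
    // Total coverage equals fragments x rows-per-fragment.
    assert_eq!(chunks.last().unwrap().1, 25_000_000);
    println!("validated {} chunk boundaries", chunks.len() - 1);
}
```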