
perf: optimize stable row_id index build from O(rows) to O(fragments)#6310

Merged
BubbleCal merged 4 commits into lance-format:main from jiaoew1991:optimize/row-id-index-build
Mar 27, 2026

Conversation


@jiaoew1991 jiaoew1991 commented Mar 27, 2026

Summary

When enable_stable_row_id is enabled, the first take_rows() call triggers a full RowIdIndex build. On large datasets this cold start was extremely slow: 18.9s for 968M rows, and 220s for 4.26B rows.

Root Causes

1. O(total_rows) segment expansion in decompose_sequence

The original code expanded every U64Segment element-by-element, even for Range segments with no deletions. For a Range(0..273711) with no deletions, this meant 273K iterations, deletion vector checks, temporary allocations, and re-compression — only to produce the same Range back. Across 18,243 fragments averaging 233K rows, this totaled 4.26 billion iterations with ~32 GB of temporary allocations.

2. O(N²) fragment lookup in load_row_id_index

The original code called fragments.iter().find() (O(N) linear search) for each of N fragments, resulting in O(N²) comparisons. try_join_all spawned all N futures at once, overwhelming the async runtime. get_deletion_vector() was called unconditionally even for fragments without deletion files.

Solution

Fix 1: O(1) fast path for Range segments without deletions. When a fragment has no deletions and its row_id sequence is a Range, construct the index chunk directly without iterating.
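The fast-path idea can be sketched as follows. This is a simplified model with hypothetical stand-in types (`Segment`, `IndexChunk`); the real `U64Segment` and index-chunk types live in lance-table and differ in detail:

```rust
// Simplified stand-in for lance's U64Segment: a row-id sequence is either
// a contiguous half-open range or an explicit list of ids.
enum Segment {
    Range { start: u64, end: u64 },
    Ids(Vec<u64>),
}

// A chunk mapping a contiguous run of row ids to one fragment.
struct IndexChunk {
    fragment_id: u32,
    row_id_start: u64,
    row_id_end: u64,
}

fn decompose_segment(fragment_id: u32, segment: &Segment, has_deletions: bool) -> Vec<IndexChunk> {
    match segment {
        // Fast path: a Range with no deletions maps 1:1 to a single chunk.
        // No per-row iteration, no temporary allocation, no re-compression.
        Segment::Range { start, end } if !has_deletions => vec![IndexChunk {
            fragment_id,
            row_id_start: *start,
            row_id_end: *end,
        }],
        // Slow path: expand element-by-element (deletion filtering elided here).
        _ => expand_per_row(fragment_id, segment),
    }
}

fn expand_per_row(fragment_id: u32, segment: &Segment) -> Vec<IndexChunk> {
    let ids: Vec<u64> = match segment {
        Segment::Range { start, end } => (*start..*end).collect(),
        Segment::Ids(ids) => ids.clone(),
    };
    // Re-compress the ids into one chunk per contiguous run.
    let mut chunks: Vec<IndexChunk> = Vec::new();
    for id in ids {
        match chunks.last_mut() {
            Some(chunk) if chunk.row_id_end == id => chunk.row_id_end = id + 1,
            _ => chunks.push(IndexChunk { fragment_id, row_id_start: id, row_id_end: id + 1 }),
        }
    }
    chunks
}
```

With this guard, a deletion-free fragment of 273K rows costs one match arm and one struct construction instead of 273K iterations, which is what moves the build from O(total_rows) to O(num_fragments).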

Fix 2: HashMap lookup + conditional deletion vector loading. Use a HashMap<u32, &FileFragment> for O(1) lookup, buffer_unordered for controlled concurrency, and skip get_deletion_vector() when there's no deletion file.
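The lookup half of the fix can be sketched in isolation. `Fragment` below is a hypothetical, simplified stand-in for lance's `FileFragment` metadata:

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-in for lance's FileFragment metadata.
struct Fragment {
    id: u32,
    deletion_file: Option<String>,
}

// Before: each of N lookups did fragments.iter().find(|f| f.id == id),
// an O(N) scan, for O(N^2) total. After: build the map once in O(N),
// then every lookup is O(1).
fn index_fragments(fragments: &[Fragment]) -> HashMap<u32, &Fragment> {
    fragments.iter().map(|f| (f.id, f)).collect()
}

// Only fragments that actually have a deletion file need a deletion
// vector loaded; returning None here skips the unconditional load.
fn deletion_vector_path(frag: &Fragment) -> Option<&str> {
    frag.deletion_file.as_deref()
}
```

On the async side (not shown, since it needs the futures crate), the PR replaces try_join_all, which polls all N futures at once, with buffer_unordered(k), which caps the number of in-flight futures at k and keeps the runtime from being overwhelmed on 18K-fragment datasets.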

Results

Dataset                          Before   After   Speedup
968M rows, 3,540 fragments       18.9s    150ms   126x
4.26B rows, 18,243 fragments     220s     89ms    2,471x

Test plan

  • All existing rowids tests pass (45 tests)
  • Clippy clean
  • New test test_large_range_segments_no_deletions validates fast path correctness at boundaries and performance (100 fragments × 250K rows completes < 1s)
  • Verify on real datasets with deletions to ensure slow path correctness

🤖 Generated with Claude Code

jiaoew1991 and others added 2 commits March 27, 2026 19:29
Fix two performance bottlenecks in RowIdIndex construction that caused
extreme cold start latency when `enable_stable_row_id` is enabled:

1. `decompose_sequence` (lance-table/src/rowids/index.rs):
   Previously expanded every Range segment element-by-element to check
   deletions, then re-compressed back. For Range segments with no
   deletions, this is pure waste — input is Range, output is Range.
   Add O(1) fast path that constructs index chunks directly.
   Complexity: O(total_rows) → O(num_fragments).

2. `load_row_id_index` (lance/src/dataset/rowids.rs):
   Used linear search (.find()) over all fragments for each fragment_id,
   giving O(N²) total. Also spawned all N futures via try_join_all and
   called get_deletion_vector() even when no deletion file exists.
   Fix: HashMap for O(1) lookup, buffer_unordered for controlled
   concurrency, skip deletion vector load when unnecessary.

Benchmarks on real S3 datasets:
- 968M rows, 3540 fragments:  18.9s → 150ms (126x faster)
- 4.26B rows, 18243 fragments: 220s → 89ms (2471x faster)

All existing tests pass. Correctness verified against production data
with boundary checks, random sampling, and data consistency validation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extract shared build_chunk_from_pairs helper to eliminate duplication
  between decompose_segment_no_deletions and decompose_segment_with_deletions
- Use .unzip() instead of manual map/collect for splitting pairs
- Return Option<IndexChunk> instead of Option<Vec<IndexChunk>> to avoid
  unnecessary single-element Vec allocations
- Fix clippy warnings: redundant .into_iter(), len() == 0 -> is_empty()
- Remove redundant comments that restate what the code does
- Remove OPTIMIZATION.md (content moved to PR description)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jiaoew1991 jiaoew1991 force-pushed the optimize/row-id-index-build branch from 67846a2 to a42f16c on March 27, 2026 11:32
Comment thread on rust/lance-table/src/rowids/index.rs

codecov Bot commented Mar 27, 2026

Codecov Report

❌ Patch coverage is 97.26027% with 4 lines in your changes missing coverage. Please review.

Files with missing lines                Patch %   Lines
rust/lance-table/src/rowids/index.rs    97.69%    2 Missing and 1 partial ⚠️
rust/lance/src/dataset/rowids.rs        93.75%    1 Missing ⚠️


@BubbleCal BubbleCal merged commit 36e8b2d into lance-format:main Mar 27, 2026
29 checks passed


2 participants