@xiaoxmeng
Contributor

Summary:
This diff adds an index-based chunk skip optimization to ChunkedDecoder. With a stream index available, a skip can jump directly to the target chunk instead of sequentially scanning through every intermediate chunk.

  1. ChunkedDecoder enhancements:

    • Added StreamIndex support via constructor parameter for index-accelerated skip operations
    • Added endRow_ member to track stream end row from index, enabling bounds validation
    • Added currentRow_ member to track absolute row position in the stream
    • Implemented skipWithIndex() for index-based skipping that uses chunk lookup
    • Implemented skipWithoutIndex() for legacy sequential skipping
    • Added seekToChunk() to seek input stream to specific chunk offset
    • Added advancePosition() helper to update both remainingValues_ and currentRow_
    • Switched error checks from VELOX_CHECK to NIMBLE_CHECK macros
  2. Skip optimization logic:

    • Within-chunk skips use the encoding's skip directly (no index lookup needed)
    • Cross-chunk skips use StreamIndex::lookupChunk() to find the target chunk
    • Skip-to-end resets the decoder state without a chunk lookup
    • Skips are bounds-checked against endRow_ before they are attempted
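
The skip logic above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this diff: ToyStreamIndex, ToyChunkedDecoder, chunkStartRows, and chunksLoaded() are hypothetical names, the index is assumed to hold only each chunk's first row, and the real StreamIndex::lookupChunk() signature and the input-stream seek are not shown in this summary. Only the member names endRow, currentRow_, remainingValues_, seekToChunk(), and advancePosition() mirror the description.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a stream index (assumption, not Nimble's type):
// the first row of every chunk in ascending order, plus the stream's end row.
struct ToyStreamIndex {
  std::vector<uint64_t> chunkStartRows; // chunkStartRows[0] must be 0
  uint64_t endRow;

  // Binary search for the chunk containing `row`; requires row < endRow.
  size_t lookupChunk(uint64_t row) const {
    auto it =
        std::upper_bound(chunkStartRows.begin(), chunkStartRows.end(), row);
    return static_cast<size_t>(it - chunkStartRows.begin()) - 1;
  }
};

// Toy decoder that only tracks positions; it decodes no data.
class ToyChunkedDecoder {
 public:
  explicit ToyChunkedDecoder(const ToyStreamIndex& index) : index_(index) {}

  void skip(uint64_t numRows) {
    // Bounds validation against the end row taken from the index.
    if (numRows > index_.endRow - currentRow_) {
      throw std::out_of_range("skip past end of stream");
    }
    // Within-chunk skip: no index lookup needed.
    if (numRows <= remainingValues_) {
      advancePosition(numRows);
      return;
    }
    const uint64_t target = currentRow_ + numRows;
    // Skip-to-end: reset state without looking up any chunk.
    if (target == index_.endRow) {
      currentRow_ = target;
      remainingValues_ = 0;
      return;
    }
    // Cross-chunk skip: jump straight to the chunk that holds the target row.
    seekToChunk(index_.lookupChunk(target));
    advancePosition(target - currentRow_);
  }

  uint64_t currentRow() const { return currentRow_; }
  uint64_t remainingValues() const { return remainingValues_; }
  size_t chunksLoaded() const { return chunksLoaded_; }

 private:
  // Positions the decoder at the first row of `chunk`. A real decoder would
  // also seek its input stream to the chunk's byte offset here.
  void seekToChunk(size_t chunk) {
    const uint64_t start = index_.chunkStartRows[chunk];
    const uint64_t end = chunk + 1 < index_.chunkStartRows.size()
        ? index_.chunkStartRows[chunk + 1]
        : index_.endRow;
    currentRow_ = start;
    remainingValues_ = end - start;
    ++chunksLoaded_;
  }

  // Keeps the in-chunk remainder and the absolute row position in sync.
  void advancePosition(uint64_t numRows) {
    assert(numRows <= remainingValues_);
    remainingValues_ -= numRows;
    currentRow_ += numRows;
  }

  const ToyStreamIndex& index_;
  uint64_t currentRow_ = 0;
  uint64_t remainingValues_ = 0;
  size_t chunksLoaded_ = 0;
};
```

In this sketch, a decoder over chunks starting at rows 0, 100, 250, and 400 that sits at row 30 and skips 390 rows opens only the last chunk; the two chunks in between are never touched, which is the saving the index is meant to provide.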

Differential Revision: D89999516

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Jan 2, 2026
@meta-codesync

meta-codesync bot commented Jan 2, 2026

@xiaoxmeng has exported this pull request. If you are a Meta employee, you can view the originating Diff in D89999516.

xiaoxmeng added a commit to xiaoxmeng/nimble that referenced this pull request Jan 2, 2026
…or#400)

xiaoxmeng added a commit to xiaoxmeng/nimble that referenced this pull request Jan 2, 2026
…or#400)

@xiaoxmeng xiaoxmeng force-pushed the export-D89999516 branch 2 times, most recently from 02d13dd to 21f5454 Compare January 2, 2026 18:09
xiaoxmeng added a commit to xiaoxmeng/nimble that referenced this pull request Jan 2, 2026
…ader (facebookincubator#400)

…ader (facebookincubator#400)

Reviewed By: zzhao0

Differential Revision: D89999516
@meta-codesync

meta-codesync bot commented Jan 2, 2026

This pull request has been merged in f658828.

Labels

CLA Signed This label is managed by the Meta Open Source bot. fb-exported Merged meta-exported
