
perf: submit I/O requests eagerly in FullZipScheduler #6513

Merged
Xuanwo merged 1 commit into lance-format:main from hushengquan:optimize-fullzip on Apr 14, 2026
Conversation

@hushengquan
Contributor

Summary

Refactor FullZipScheduler::create_page_load_task to accept a pre-submitted I/O future instead of deferring I/O submission until the async task executes. This allows I/O requests to be submitted immediately during scheduling, enabling the object store layer to batch and parallelize them. Closes #6504.

I/O Model Change

Before: Lazy I/O submission (serialized)

Previously, create_page_load_task received a FullZipReadSource::Remote(io) along with byte ranges and priority. The actual io.submit_request() call happened inside the async block, meaning the I/O request was not submitted until the future was first polled.

When decoding multiple pages (e.g. across many fragments), this created a sequential I/O pattern:

Page 1: [schedule] -> [poll] -> [submit I/O] -> [wait response] -> [decode]
Page 2:                                          [schedule] -> [poll] -> [submit I/O] -> [wait response] -> [decode]
Page 3:                                                                                   [schedule] -> [poll] -> ...

Each page's I/O request could only be submitted after the previous task started executing. The I/O scheduler had no visibility into upcoming requests, preventing it from batching or parallelizing them effectively.

After: Eager I/O submission (pipelined)

Now, io.submit_request() is called before constructing the PageLoadTask, and the resulting future is passed into create_page_load_task. All I/O requests for all pages are submitted upfront during the scheduling phase:

[schedule all pages] --> submit I/O page 1 -+
                     --> submit I/O page 2 -+
                     --> submit I/O page 3 -+  (all in-flight concurrently)
                     --> submit I/O page N -+
                                            |
                     [poll] -> [await page 1 response] -> [decode]
                     [poll] -> [await page 2 response] -> [decode]
                     [poll] -> [await page 3 response] -> [decode]

The object store layer can now see all pending requests at once and optimize I/O through batching, connection multiplexing, and parallel fetches. The async tasks only await the already-in-flight I/O futures.

Changes

  • rust/lance-encoding/src/encodings/logical/primitive.rs:
    • Changed the create_page_load_task signature to accept BoxFuture<'static, Result<Vec<Bytes>>> instead of FullZipReadSource + byte ranges + priority
    • Moved the io.submit_request() calls to happen eagerly at both call sites (schedule_ranges_with_rep_index and the non-rep-index path), before constructing the page load task

Performance

Tested with a multi-fragment dataset containing fixed-width columns (768-dim float32 vectors, 40 fragments, 50 rows/fragment):

Benchmark                  Before (p50)   After (p50)   Speedup
Fixed-width column scan    3453 ms        523 ms        6.6x

The improvement comes entirely from I/O pipelining — the decoding logic itself is unchanged. The effect is most pronounced with many fragments or pages, where the serialized I/O submission was the dominant bottleneck.

@codecov

codecov Bot commented Apr 14, 2026

Codecov Report

❌ Patch coverage is 90.90909% with 1 line in your changes missing coverage. Please review.

Files with missing lines                                Patch %   Lines
.../lance-encoding/src/encodings/logical/primitive.rs   90.90%    0 Missing and 1 partial ⚠️


Collaborator

@Xuanwo left a comment


Nice change, thank you!

@Xuanwo Xuanwo merged commit 9df651b into lance-format:main Apr 14, 2026
30 checks passed
@westonpace
Member

Belated second approval. Thanks for catching this! This is the third time this issue (wrapping something in an async move block only to have the execution semantics change) has bitten me in two months 😆

@hushengquan hushengquan deleted the optimize-fullzip branch April 16, 2026 02:22

Development

Successfully merging this pull request may close these issues.

FullZip scan latency regression on cloud storage (S3) due to lazy I/O submission

3 participants