
FullZip scan latency regression on cloud storage (S3) due to lazy I/O submission #6504

@hushengquan

Description

Summary

PR #5981 (e25f16909) introduced FullZipReadSource and create_page_load_task as a unified abstraction for scheduling FullZip reads. While the PR itself brought significant performance improvements (especially the full-page scan shortcut and the always-cached rep index), it inadvertently changed the I/O submission timing in two code paths: schedule_ranges_simple and the cached branch of schedule_ranges_rep. In both cases, submit_request was moved inside an async move { ... } block, so the I/O is no longer submitted during the schedule phase; it is deferred until the decode phase polls the future.
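To make the timing difference concrete, here is a minimal sketch of the two patterns. Closures stand in for futures (a Rust future likewise does nothing until polled), and `IoQueue`/`submit_request` are simplified stand-ins for the real types, not the actual Lance API:

```rust
use std::collections::VecDeque;

/// Simplified stand-in for the real I/O queue: a submitted request
/// becomes visible to the I/O scheduler as soon as it is enqueued.
#[derive(Default)]
struct IoQueue {
    pending: VecDeque<String>,
}

impl IoQueue {
    fn submit_request(&mut self, req: &str) {
        self.pending.push_back(req.to_string());
    }
}

/// Eager pattern (pre-#5981 behavior for the simple path): the request
/// is submitted during the schedule phase; only decode work is deferred
/// into the returned closure.
fn schedule_eager(queue: &mut IoQueue, page: &str) -> impl FnOnce() -> String {
    queue.submit_request(page); // I/O enqueued at schedule time
    let page = page.to_string();
    move || format!("decoded {page}")
}

/// Deferred pattern (the regression): submission itself is captured in
/// the closure, so nothing reaches the queue until decode "polls" it.
fn schedule_deferred(page: &str) -> impl FnOnce(&mut IoQueue) -> String {
    let page = page.to_string();
    move |queue: &mut IoQueue| {
        queue.submit_request(&page); // I/O starts only at decode time
        format!("decoded {page}")
    }
}
```

With the eager variant the queue already holds the request when scheduling returns; with the deferred variant the queue stays empty until the decode side runs the closure.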

Affected Code Paths

| Path | Before #5981 | After #5981 | Expected |
| --- | --- | --- | --- |
| `schedule_ranges_simple` (fixed-width, no rep index) | `submit_request` called eagerly during scheduling | `submit_request` inside `create_page_load_task`, deferred to decode | Eager (byte ranges are known at schedule time via simple arithmetic) |
| `schedule_ranges_rep`, cached branch (rep index in memory) | All logic inside one async block (deferred) | `submit_request` inside `create_page_load_task`, still deferred | Eager (byte ranges computable from the in-memory cached rep index; opportunity for optimization) |
| `schedule_ranges_rep`, full-page scan | N/A (new path) | `submit_single` called eagerly | ✅ Correct |
| `schedule_ranges_rep`, uncached (no rep index) | Deferred (two-stage I/O dependency) | Deferred | ✅ Correct (data byte ranges depend on the first I/O result) |

Impact

The scheduling architecture is designed as a two-thread pipeline: the scheduler thread issues I/O as fast as possible, and the decode stream consumes loaded pages. As described in decoder.rs:

> Note that the scheduler thread does not need to wait for I/O to happen at any point. As soon as it starts it will start scheduling one page of I/O after another until it has scheduled the entire file's worth of I/O.

When submit_request is deferred into the future, the I/O request is not enqueued into the IoQueue until the decode stream actually polls it. This eliminates the overlap between I/O and scheduling/decoding of other pages, effectively serializing I/O with decode and adding one full network RTT of latency per page for cloud storage (S3/GCS/Azure).
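A back-of-envelope model shows why this matters on cloud storage. The function names and numbers below are illustrative assumptions (e.g. a 50 ms S3 RTT), not measurements from the codebase:

```rust
/// Deferred submission: each page pays submit -> one full RTT -> decode,
/// with nothing overlapping anything else.
fn serialized_latency_ms(pages: u64, rtt_ms: u64, decode_ms: u64) -> u64 {
    pages * (rtt_ms + decode_ms)
}

/// Eager submission: all requests are in flight early, so only the first
/// RTT sits on the critical path and decode proceeds back-to-back
/// (assumes the IOPS limit is not the bottleneck).
fn pipelined_latency_ms(pages: u64, rtt_ms: u64, decode_ms: u64) -> u64 {
    rtt_ms + pages * decode_ms
}
```

For 100 pages at a 50 ms RTT and 1 ms decode per page, this crude model gives roughly 5.1 s serialized versus 0.15 s pipelined, which is why the regression is dramatic on S3 but barely visible on local disk.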

Note: this does not cause unbounded I/O pressure because IoQueue already enforces IOPS limits (io_parallelism: 64 for cloud, 8 for local) and byte-level backpressure (io_buffer_size_bytes). Early submission simply enqueues requests into the priority queue sooner, giving the I/O scheduler better visibility for prioritization.
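The IOPS limit mentioned above is essentially a counting guard in front of the storage backend. A minimal sketch of that admission-control idea (hypothetical type, not the actual `IoQueue` implementation; the real queue also enforces byte-level backpressure):

```rust
/// Illustrative IOPS-style admission control: at most `limit` requests
/// may be in flight; excess requests wait in the priority queue instead
/// of hitting storage. The issue cites limits of 64 (cloud) and 8 (local).
struct IopsLimiter {
    in_flight: usize,
    limit: usize,
}

impl IopsLimiter {
    fn new(limit: usize) -> Self {
        Self { in_flight: 0, limit }
    }

    /// Returns true if the request may start now, false if it must queue.
    fn try_acquire(&mut self) -> bool {
        if self.in_flight < self.limit {
            self.in_flight += 1;
            true
        } else {
            false
        }
    }

    /// Called when a request completes, freeing a slot.
    fn release(&mut self) {
        self.in_flight -= 1;
    }
}
```

Because this guard caps concurrency regardless of when requests are enqueued, submitting eagerly cannot overload storage; it only gives the scheduler a deeper queue to prioritize from.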
