
feat: move rate limiting to the object store #6293

Merged
westonpace merged 5 commits into lance-format:main from westonpace:feat/throttle-in-obj-store
Mar 27, 2026

Conversation

@westonpace
Member

Closes #6239

@westonpace westonpace marked this pull request as draft March 25, 2026 13:56
@github-actions github-actions Bot added the enhancement New feature or request label Mar 25, 2026
@westonpace
Member Author

Keeping in draft until #6266 merges

@github-actions
Contributor

PR Review

Well-designed change that moves rate limiting to the right layer. The AIMD algorithm, per-category throttles, and anti-thundering-herd token bucket are solid. A few concerns:

P0: Breaking change without deprecation

Removing LANCE_PROCESS_IO_THREADS_LIMIT is a silent breaking change for users who have configured it. Consider:

  • Logging a warning at startup if LANCE_PROCESS_IO_THREADS_LIMIT is still set, telling users to migrate to the new LANCE_AIMD_* env vars
  • Or keeping it as a no-op with a deprecation warning for one release cycle
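
The first suggestion could be sketched roughly as follows. The env var names come from this PR, but the helper itself is a hypothetical illustration, not the PR's actual code:

```rust
use std::env;

// Hypothetical sketch of the startup warning suggested above. Only the
// environment variable names are taken from this PR.
fn deprecated_env_warning(legacy_value: Option<&str>) -> Option<String> {
    legacy_value.map(|value| {
        format!(
            "LANCE_PROCESS_IO_THREADS_LIMIT={value} is no longer honored; \
             configure the LANCE_AIMD_* variables instead"
        )
    })
}

fn main() {
    // Emit the migration warning once at startup if the removed
    // variable is still set in the environment.
    let legacy = env::var("LANCE_PROCESS_IO_THREADS_LIMIT").ok();
    if let Some(warning) = deprecated_env_warning(legacy.as_deref()) {
        eprintln!("warning: {warning}");
    }
}
```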

P1: is_throttle_error string matching is fragile

The heuristic of matching "retries, max_retries" in error messages is acknowledged as crude, but deserves more attention:

  • If object_store changes its error format, throttle detection silently breaks and AIMD never decreases rate — the worst failure mode
  • Consider also logging/counting unrecognized Generic errors that don't match the pattern, so operators can notice if the heuristic stops matching
  • Worth adding a code comment with the object_store version this was tested against so future maintainers know when to re-verify
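
The heuristic plus the suggested miss counter could look something like this sketch (the substring is the one discussed above; the struct and counter are illustrative, not the PR's code):

```rust
// Illustrative model of the fragile throttle-detection heuristic: plain
// substring matching on error messages, plus a counter so operators can
// notice if the pattern stops matching after an object_store upgrade.
struct ThrottleDetector {
    unrecognized_generic_errors: u64,
}

impl ThrottleDetector {
    fn is_throttle_error(&mut self, generic_error_message: &str) -> bool {
        // object_store's retry layer embeds this phrase when it gives up
        // after exhausting retries (as of the version reviewed here).
        if generic_error_message.contains("retries, max_retries") {
            true
        } else {
            // Count misses; a sudden spike suggests the format drifted.
            self.unrecognized_generic_errors += 1;
            false
        }
    }
}

fn main() {
    let mut detector = ThrottleDetector { unrecognized_generic_errors: 0 };
    assert!(detector.is_throttle_error("Generic error: after 3 retries, max_retries: 3"));
    assert!(!detector.is_throttle_error("Generic error: connection reset"));
    assert_eq!(detector.unrecognized_generic_errors, 1);
}
```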

P1: ThrottledMultipartUpload::put_part — token acquired after future is created

fn put_part(&mut self, data: PutPayload) -> UploadPart {
    let fut = self.target.put_part(data);  // called eagerly
    Box::pin(async move {
        write.acquire_token().await;       // throttle happens later
        let result = fut.await;
        ...
    })
}

self.target.put_part(data) is called before the token is acquired. If an underlying store implementation starts I/O eagerly inside put_part (rather than deferring to the returned future), the throttle is bypassed. This is likely fine for current object_store implementations but is subtle — a brief comment explaining the assumption would help.
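
The ordering concern can be shown with a synchronous stand-in, where "creating the future" is modeled as running the store call either eagerly (before the token) or deferred (after it). All names here are illustrative:

```rust
use std::cell::RefCell;

// Sync stand-in for the async ordering concern above.
fn put_part_order(eager: bool) -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    let start_io = || log.borrow_mut().push("io");         // target.put_part
    let acquire_token = || log.borrow_mut().push("token"); // acquire_token().await

    if eager {
        // Mirrors the snippet: the store call runs at construction time,
        // so any eager I/O inside it bypasses the throttle.
        start_io();
        acquire_token();
    } else {
        // Moving the store call inside the async block keeps all I/O
        // behind the throttle.
        acquire_token();
        start_io();
    }
    log.into_inner()
}

fn main() {
    assert_eq!(put_part_order(true), ["io", "token"]); // throttle bypassed
    assert_eq!(put_part_order(false), ["token", "io"]);
}
```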

Minor nits (non-blocking)

  • AimdController uses std::sync::Mutex with .unwrap() — a poisoned mutex will panic. Fine in practice but parking_lot::Mutex avoids this class of issue.
  • observe_outcome silently drops rate updates via try_lock when contended. The doc comment explains this, which is good — just noting that under sustained throttling with high contention, the rate decrease could be delayed.
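
Both nits can be seen in a minimal AIMD controller sketch, assuming illustrative constants and field names (this is not the PR's implementation):

```rust
use std::sync::Mutex;

// Minimal additive-increase / multiplicative-decrease controller.
struct AimdController {
    rate: Mutex<f64>, // allowed requests per second
}

impl AimdController {
    fn new(initial: f64) -> Self {
        Self { rate: Mutex::new(initial) }
    }

    fn observe_outcome(&self, throttled: bool) {
        // try_lock on the hot path: a contended update is silently
        // dropped, so a rate decrease can be delayed under load.
        if let Ok(mut rate) = self.rate.try_lock() {
            if throttled {
                *rate = (*rate * 0.5).max(1.0); // multiplicative decrease
            } else {
                *rate += 1.0; // additive increase
            }
        }
    }

    fn current_rate(&self) -> f64 {
        // unwrap(): panics if another thread panicked while holding
        // the lock (the poisoning concern noted above).
        *self.rate.lock().unwrap()
    }
}

fn main() {
    let controller = AimdController::new(100.0);
    controller.observe_outcome(false); // 101.0
    controller.observe_outcome(true);  // 50.5
    assert_eq!(controller.current_rate(), 50.5);
}
```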

Overall this is a solid improvement over the old semaphore-based approach. The per-store scoping, per-category isolation, and AIMD adaptiveness are all well-motivated by the linked issue.

@codecov

codecov Bot commented Mar 25, 2026

Codecov Report

❌ Patch coverage is 61.56584% with 108 lines in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
rust/lance-io/src/object_store/throttle.rs | 60.58% | 91 Missing and 17 partials ⚠️


@westonpace westonpace force-pushed the feat/throttle-in-obj-store branch from ec5d3f2 to ce0e646 Compare March 25, 2026 14:59
@westonpace westonpace marked this pull request as ready for review March 25, 2026 15:00
westonpace and others added 2 commits March 26, 2026 16:30
…onfig

The AIMD throttled store now retries throttle errors up to 3 times with
random 100-300ms backoff, so the underlying cloud object store's built-in
retry count is lowered from 10 to 3 to avoid redundant retries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
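
The 100-300 ms randomized backoff from this commit message could be sketched as follows; the clock-based jitter is a dependency-free stand-in for whatever RNG the real implementation uses:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Sketch of the AIMD-level retry backoff: a pseudo-random delay in
// [100, 300] ms, with sub-second clock nanos standing in for an RNG.
fn jittered_backoff_ms() -> u64 {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock before Unix epoch")
        .subsec_nanos() as u64;
    100 + nanos % 201 // uniform-ish over [100, 300]
}

fn main() {
    for attempt in 1..=3 {
        println!("AIMD retry {attempt}: back off {} ms", jittered_backoff_ms());
    }
}
```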
@westonpace westonpace force-pushed the feat/throttle-in-obj-store branch from ce0e646 to 0c98572 Compare March 27, 2026 01:50
storage_options: &StorageOptions,
is_s3_express: bool,
) -> Result<Arc<dyn OSObjectStore>> {
let max_retries = storage_options.client_max_retries();
Collaborator

client_max_retries is not useful anymore; do we need to remove it?

Member Author

It should still be used. We actually rely on object_store's retries to make the AIMD throttle work. The object_store errors have no "is_temporary" flag, so we have no other way of knowing whether an error is temporary except to see whether object_store applied retries to it.

So by default we should do 3 object store retries now. Each time those fail we apply an AIMD retry (longer backoff, cut throttle). So in total we get 9 retries instead of 10.
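
The retry budget being described can be sketched as a nested loop, assuming illustrative names and the worst case where every attempt fails:

```rust
// Each AIMD-level retry wraps a full round of object_store client
// retries, so in the worst case the store sees inner * outer requests
// (3 * 3 = 9 with the defaults discussed here).
fn total_store_requests(client_retries: u32, aimd_retries: u32) -> u32 {
    let mut requests = 0;
    for _aimd_attempt in 0..aimd_retries {
        for _client_attempt in 0..client_retries {
            requests += 1; // one request reaches the object store
            // (a success here would return from both loops)
        }
        // On failure: AIMD backoff and rate cut, then the next round.
    }
    requests
}

fn main() {
    assert_eq!(total_store_requests(3, 3), 9);
}
```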

Collaborator

It seems the multipart upload process is not covered. Is that expected?

Member Author

Good catch. I've added multipart to the retry handling. I still can't apply the outer retry loop to the delete stream or list stream methods. These return a stream of items, and there is no way to map results back to underlying object store requests, so there is nothing to retry there. Since these operations are hopefully rare, I think it will be ok.

@@ -354,7 +461,7 @@ impl AimdThrottledStore {
impl ObjectStore for AimdThrottledStore {
Collaborator

I think we should consider overriding rename_if_not_exists with our own handling.

Currently, the logic retries the entire copy and delete process. If the copy succeeds but the delete fails due to throttling, the entire operation is retried. After that, the copy will fail because the object already exists.
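
This failure mode can be reproduced with a toy in-memory store; the types and behavior here are an illustrative model, not object_store's API:

```rust
use std::collections::HashSet;

// Toy model: rename_if_not_exists = copy_if_not_exists + delete.
#[derive(Debug, PartialEq)]
enum StoreError { AlreadyExists, Throttled }

struct Store {
    objects: HashSet<String>,
    fail_next_delete: bool, // simulate one throttled delete
}

impl Store {
    fn copy_if_not_exists(&mut self, from: &str, to: &str) -> Result<(), StoreError> {
        if self.objects.contains(to) {
            return Err(StoreError::AlreadyExists);
        }
        if self.objects.contains(from) {
            self.objects.insert(to.to_string());
        }
        Ok(())
    }

    fn delete(&mut self, path: &str) -> Result<(), StoreError> {
        if self.fail_next_delete {
            self.fail_next_delete = false;
            return Err(StoreError::Throttled);
        }
        self.objects.remove(path);
        Ok(())
    }

    fn rename_if_not_exists(&mut self, from: &str, to: &str) -> Result<(), StoreError> {
        self.copy_if_not_exists(from, to)?;
        self.delete(from)
    }
}

fn main() {
    let mut store = Store {
        objects: std::iter::once("a".to_string()).collect(),
        fail_next_delete: true,
    };
    // First attempt: the copy succeeds, then the delete is throttled.
    assert_eq!(store.rename_if_not_exists("a", "b"), Err(StoreError::Throttled));
    // Naively retrying the whole pair now fails on the copy, because
    // the destination already exists from the first attempt.
    assert_eq!(store.rename_if_not_exists("a", "b"), Err(StoreError::AlreadyExists));
}
```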

Member Author

I'll create a follow-up, I don't think we use rename_if_not_exists in most cases anymore.

westonpace and others added 2 commits March 27, 2026 05:01
- Change client_max_retries default from 10 to 3 and restore
  configurability via OBJECT_STORE_CLIENT_MAX_RETRIES in cloud providers
- Add LANCE_AIMD_MAX_RETRIES, LANCE_AIMD_MIN_BACKOFF_MS,
  LANCE_AIMD_MAX_BACKOFF_MS env vars (and storage option equivalents)
  to configure AIMD throttle retry behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add `is_disabled()` to `AimdThrottleConfig` so that setting
`lance_aimd_max_retries=0` bypasses the throttle layer entirely.
Also warn and skip the layer when `client_max_retries=0`, since the
AIMD implementation relies on the object store client surfacing retry
errors to detect throttling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
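
The bypass logic this commit message describes could be sketched as follows; the struct body and the should_wrap_store helper are illustrative, not the PR's code:

```rust
// Sketch of the two disable conditions described above.
struct AimdThrottleConfig {
    max_retries: u32,
}

impl AimdThrottleConfig {
    fn is_disabled(&self) -> bool {
        // lance_aimd_max_retries=0 means "never retry at the AIMD layer",
        // so the throttle wrapper is skipped entirely.
        self.max_retries == 0
    }
}

fn should_wrap_store(aimd: &AimdThrottleConfig, client_max_retries: u32) -> bool {
    // The AIMD layer detects throttling by watching the client's retry
    // errors, so it is also pointless when client_max_retries == 0.
    !aimd.is_disabled() && client_max_retries > 0
}

fn main() {
    assert!(should_wrap_store(&AimdThrottleConfig { max_retries: 3 }, 3));
    assert!(!should_wrap_store(&AimdThrottleConfig { max_retries: 0 }, 3));
    assert!(!should_wrap_store(&AimdThrottleConfig { max_retries: 3 }, 0));
}
```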
@westonpace
Member Author

westonpace commented Mar 27, 2026

Test results:

Test Description

On Azure, 40 VMs (8 cores each) hammer the same blob with random takes, each take grabbing 1024 rows. Each VM has 8 processes so there are a total of 320 processes taking from the same blob. This blob is in a standard class storage account. This should aggressively trigger rate limiting errors. Everything runs for 5 minutes.

With AIMD disabled, previous settings (note: the red line indicates success in this chart, for some reason)

8,330 failed takes, 50,176 rows taken

(chart omitted)

With AIMD enabled, default settings

0 Errors, 9,951,232 rows taken

(chart omitted)

Aggressive AIMD (1 object_store retry, 5 AIMD retries)

0 Errors, 7,606,272 rows taken

(chart omitted)

(will be updated with additional results)

- Retry put_part, complete, and abort in ThrottledMultipartUpload using
  LANCE_AIMD_MAX_RETRIES. put_part uses a std::sync::Mutex (not tokio)
  to share the inner upload across retry futures without risking
  deadlock from aborted futures. complete/abort use Arc::get_mut since
  &mut self guarantees exclusive access.
- Update client_max_retries docs: default 10 → 3, s3 client → object
  store client. Same wording fix for client_retry_timeout.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Collaborator

@Xuanwo Xuanwo left a comment

Thank you!

@westonpace westonpace merged commit 209f99b into lance-format:main Mar 27, 2026
32 checks passed

Labels

enhancement New feature or request


Development

Successfully merging this pull request may close these issues.

Move rate limiting from ScanScheduler into the object store layer

2 participants