feat: move rate limiting to the object store #6293
Changes from all commits: fb2e775, 0c98572, 638240c, 0f1a4ff, 37a02a3
```diff
@@ -30,6 +30,7 @@ use crate::object_store::{
     DEFAULT_CLOUD_BLOCK_SIZE, DEFAULT_CLOUD_IO_PARALLELISM, DEFAULT_MAX_IOP_SIZE, ObjectStore,
     ObjectStoreParams, ObjectStoreProvider, StorageOptions, StorageOptionsAccessor,
     StorageOptionsProvider,
+    throttle::{AimdThrottleConfig, AimdThrottledStore},
 };
 use lance_core::error::{Error, Result};
```
```diff
@@ -44,12 +45,12 @@ impl AwsStoreProvider {
         storage_options: &StorageOptions,
         is_s3_express: bool,
     ) -> Result<Arc<dyn OSObjectStore>> {
-        let max_retries = storage_options.client_max_retries();
```
Collaborator:
client_max_retries is not useful anymore; do we need to remove it?

Author (Member):
It should still be used. We actually rely on object_store's retries to make the AIMD throttle work. The object_store errors carry no "is_temporary" flag, so the only way to tell whether an error was temporary is to see whether object_store applied retries to it. By default we now do 3 object_store retries; each time those are exhausted we apply one AIMD retry (longer backoff, cut throttle). In total we get 9 retries instead of 10.
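The additive-increase/multiplicative-decrease behavior the author describes can be sketched as follows. This is an illustrative limiter, not Lance's actual `AimdThrottledStore`; the type name, method names, and constants are all assumptions.

```rust
// Illustrative AIMD (additive-increase/multiplicative-decrease) limiter.
// Not Lance's actual AimdThrottledStore; names and constants are assumptions.
pub struct AimdLimit {
    pub current: f64,
    min: f64,
    max: f64,
    increase_step: f64,   // added after each successful request
    decrease_factor: f64, // multiplied in after a throttle error
}

impl AimdLimit {
    pub fn new(start: f64, min: f64, max: f64) -> Self {
        Self {
            current: start,
            min,
            max,
            increase_step: 1.0,
            decrease_factor: 0.5,
        }
    }

    /// Additive increase: slowly probe for more throughput on success.
    pub fn on_success(&mut self) {
        self.current = (self.current + self.increase_step).min(self.max);
    }

    /// Multiplicative decrease: back off sharply once the inner client has
    /// exhausted its own retries (the only throttle signal available here).
    pub fn on_throttle(&mut self) {
        self.current = (self.current * self.decrease_factor).max(self.min);
    }
}

fn main() {
    let mut limit = AimdLimit::new(8.0, 1.0, 64.0);
    limit.on_success();  // probes upward by one step
    limit.on_throttle(); // halves the limit
    println!("current limit: {}", limit.current);
}
```

The asymmetry (small linear growth, sharp multiplicative cuts) is what lets an AIMD layer converge near a store's real throughput ceiling without repeatedly hammering it.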
```diff
-        let retry_timeout = storage_options.client_retry_timeout();
+        // Use a low retry count since the AIMD throttle layer handles
+        // throttle recovery with its own retry loop.
         let retry_config = RetryConfig {
             backoff: Default::default(),
-            max_retries,
-            retry_timeout: Duration::from_secs(retry_timeout),
+            max_retries: storage_options.client_max_retries(),
+            retry_timeout: Duration::from_secs(storage_options.client_retry_timeout()),
         };

         let mut s3_storage_options = storage_options.as_s3_options();
```
```diff
@@ -159,6 +160,19 @@ impl ObjectStoreProvider for AwsStoreProvider {
             self.build_amazon_s3_store(&mut base_path, params, &storage_options, is_s3_express)
                 .await?
         };
+        let throttle_config = AimdThrottleConfig::from_storage_options(params.storage_options())?;
+        let inner = if throttle_config.is_disabled() {
+            inner
+        } else if storage_options.client_max_retries() == 0 {
+            log::warn!(
+                "AIMD throttle disabled: the current implementation relies on the object store \
+                 client surfacing retry errors, which requires client_max_retries > 0. \
+                 No throttle or retry layer will be applied."
+            );
+            inner
+        } else {
+            Arc::new(AimdThrottledStore::new(inner, throttle_config)?) as Arc<dyn OSObjectStore>
+        };

         Ok(ObjectStore {
             inner,
```
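The hunk above wraps the store conditionally. Extracted as a pure function for illustration (hypothetical code, not the PR's actual implementation), the three-way decision looks like:

```rust
// Hypothetical extraction of the guard logic into a pure function;
// not the PR's actual code.
#[derive(Debug, PartialEq)]
enum ThrottleDecision {
    Disabled,         // explicitly disabled via storage options
    SkippedNoRetries, // client_max_retries == 0: throttle errors never surface
    Enabled,          // wrap the store in the AIMD throttle layer
}

fn decide_throttle(throttle_disabled: bool, client_max_retries: usize) -> ThrottleDecision {
    if throttle_disabled {
        ThrottleDecision::Disabled
    } else if client_max_retries == 0 {
        // mirrors the log::warn! branch in the diff: without inner retries
        // there is no signal to distinguish throttling from hard failures
        ThrottleDecision::SkippedNoRetries
    } else {
        ThrottleDecision::Enabled
    }
}

fn main() {
    println!("{:?}", decide_throttle(false, 3));
}
```

The middle branch is the interesting one: because the throttle layer infers "temporary" errors from the inner client having retried them, setting client_max_retries to zero silently starves it of its only signal, hence the warning rather than a hard error.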
Reviewer:
Seems the multipart upload process is not covered. Is that expected?

Author:
Good catch. I've added multipart to the retry handling. I still can't apply the outer retry loop to the delete-stream or list-stream methods: they return a stream of items, and there is no way to map results back to the underlying object store requests, so there is nothing to retry there. Since these operations should be rare, I think that's acceptable.
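The distinction the author draws, that single-request operations can be wrapped in an outer retry loop while streaming operations cannot, can be sketched with a synchronous stand-in (a closure and comments in place of async I/O and a Stream; all names here are illustrative):

```rust
// Sketch of an outer retry loop for single-request operations. Illustrative
// only; the real code deals with async object store requests, not closures.
fn with_retry<T, E>(
    mut op: impl FnMut() -> Result<T, E>,
    max_retries: u32,
) -> Result<T, E> {
    let mut attempts = 0;
    loop {
        match op() {
            Ok(v) => return Ok(v),
            // A real implementation would also back off and cut the AIMD
            // throttle here before re-issuing the request.
            Err(_) if attempts < max_retries => attempts += 1,
            Err(e) => return Err(e),
        }
    }
}

fn main() {
    // A fallible "request" that succeeds on the third attempt.
    let mut calls = 0;
    let result = with_retry(
        || {
            calls += 1;
            if calls < 3 { Err("throttled") } else { Ok(calls) }
        },
        5,
    );
    println!("{:?}", result);
    // A stream of results offers no such hook: by the time an Err item
    // arrives mid-stream, the request that produced it cannot be identified
    // and re-issued, so the loop above does not apply to list/delete streams.
}
```

This is why the retry layer can cover get, put, multipart, and similar request-shaped operations, but has to pass list and delete streams through untouched.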