Add case-sensitivity handling to multi-object download operations #9926
Merged
+860
−13
5131b62 to 9de0618
9de0618 to ad541d0
kdaily requested changes on Dec 19, 2025
kdaily (Member) left a comment:
LGTM besides a nit in the changelog entry!
kdaily approved these changes on Dec 22, 2025
aws-sdk-python-automation added a commit that referenced this pull request on Dec 23, 2025:
* release-1.44.6:
  Bumping version to 1.44.6
  Update changelog based on model updates
  Add case-sensitivity handling to multi-object download operations (#9926)
SamRemis added a commit to SamRemis/aws-cli that referenced this pull request on Jan 2, 2026:
…ions (aws#9926)" Revert case-sensitivity feature
Addresses #9596
Problem
Case conflicts can happen if the following conditions are met:
- The command is a multi-object download: `sync`, `cp`, or `mv`. `sync` is always a directory operation, and `cp` and `mv` become directory operations with the `--recursive` flag.
There are 2 case conflict scenarios we need to address:
1. The S3 bucket contains both `a.txt` and `A.txt`. When these are downloaded onto a case-insensitive filesystem, the last to write wins.
2. When `a.txt` is downloaded from S3, it will overwrite the local `A.txt` file.
Requirements
Detecting case conflicts requires 2 approaches:
1. Checking whether a file whose name differs only by case already exists on the local filesystem.
2. Tracking objects already submitted for download. If `a.txt` has been submitted and then we see the next to download is `A.txt`, we need to be able to detect the case conflict even if `a.txt` hasn't finished downloading yet.
Handling case conflicts automatically is difficult because it's impossible to know what the user expects. We can't start throwing errors because that would break existing customers. So we'll warn by default when a case conflict is detected and let users configure the behavior:
- `error` for throwing errors
- `warn` for emitting warnings (default)
- `skip` for skipping an object
- `ignore` for doing nothing (current behavior)
Solution
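As a rough sketch of the four configurable behaviors listed under Requirements (`error`, `warn`, `skip`, `ignore`), the dispatch could look something like the following. The names `CaseConflictMode`, `CaseConflictError`, and `should_download` are hypothetical and are not taken from the PR. The sketch also assumes that `warn` still proceeds with the download after emitting the warning.

```python
import warnings
from enum import Enum


class CaseConflictMode(Enum):
    """The four behaviors named in the PR description."""
    ERROR = "error"
    WARN = "warn"      # default
    SKIP = "skip"
    IGNORE = "ignore"  # behavior before this change


class CaseConflictError(Exception):
    """Hypothetical exception raised in 'error' mode."""


def should_download(key, conflict_detected, mode=CaseConflictMode.WARN):
    """Return True if the object should be downloaded (illustrative only)."""
    if not conflict_detected or mode is CaseConflictMode.IGNORE:
        return True
    message = f"Case conflict detected for {key!r}"
    if mode is CaseConflictMode.ERROR:
        raise CaseConflictError(message)
    if mode is CaseConflictMode.SKIP:
        return False
    # WARN: assumed to notify the user and then continue with the download.
    warnings.warn(message)
    return True
```

Keeping the decision in one function means `sync`, `cp`, and `mv` could share it regardless of how the conflict was detected.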
`sync` always has knowledge of local files, so it's more straightforward: just create a new sync strategy that detects case conflicts and handles them. Instead of writing new filesystem logic for recursive `cp` and `mv`, we'll transform their instructions to mimic a `sync` command if we determine that case conflicts need to be handled. The performance cost is small since it only adds local filesystem API calls.
To keep track of objects submitted for download, we'll keep a simple in-memory hash set that tracks lower-cased forms of object keys. If `a.txt` has been submitted for download and we try `A.txt`, we'll see that the lower-cased form is in the set and handle the case conflict.
To avoid the set growing large and consuming extra memory, the solution also implements a subscriber that discards the key from the set once the download finishes. The key can be removed at this point because the file now exists in the filesystem, so when we attempt to download subsequent objects with case conflicts, we simply check whether the file exists locally.
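The tracking described above can be sketched as follows. This is a minimal illustration, not the PR's implementation: `InFlightKeys`, `on_done`, and `conflicts_locally` are hypothetical names, and `on_done` stands in for whatever subscriber callback fires when a download finishes. The sketch assumes `on_done` is only called for keys that `submit` accepted.

```python
import os
import threading


class InFlightKeys:
    """Lower-cased keys of downloads that were submitted but not finished."""

    def __init__(self):
        self._keys = set()
        self._lock = threading.Lock()

    def submit(self, key):
        """Record key; return True if a case-conflicting key is in flight."""
        lowered = key.lower()
        with self._lock:
            if lowered in self._keys:
                return True
            self._keys.add(lowered)
            return False

    def on_done(self, key):
        """Stand-in for the subscriber: forget the key once it is on disk."""
        with self._lock:
            self._keys.discard(key.lower())


def conflicts_locally(dest_path):
    """True if a sibling file has the same name in a different case."""
    directory, name = os.path.split(dest_path)
    try:
        entries = os.listdir(directory or ".")
    except FileNotFoundError:
        return False
    return any(e != name and e.lower() == name.lower() for e in entries)
```

Once `on_done` removes a key, later conflicts for that name are caught by the local check instead, so the set only ever holds downloads that are still running.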
Caveats
S3 Express can't be supported because it doesn't return objects in lexicographical order. We need ordered keys for `sync`.
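To illustrate why ordering matters, here is a sketch that assumes a sync-style comparison walks the source and destination listings together in a single pass. That assumption and the name `compare_sorted` are illustrative and not taken from the PR.

```python
def compare_sorted(source_keys, dest_keys):
    """Yield (status, key) pairs by walking two listings in one pass.

    Correct only when both inputs are in lexicographical order.
    """
    src, dst = iter(source_keys), iter(dest_keys)
    s, d = next(src, None), next(dst, None)
    while s is not None or d is not None:
        if d is None or (s is not None and s < d):
            yield ("only_in_source", s)
            s = next(src, None)
        elif s is None or d < s:
            yield ("only_in_dest", d)
            d = next(dst, None)
        else:
            yield ("in_both", s)
            s, d = next(src, None), next(dst, None)


# With ordered input, every key is classified correctly.
ordered = list(compare_sorted(["a", "b", "d"], ["b", "c"]))

# With unordered source input, "a" is reported as missing on both sides
# even though it exists in both listings.
unordered = list(compare_sorted(["b", "a"], ["a", "b"]))
```

A single-pass comparison like this avoids holding the full listing in memory, which is why it depends on the listing order.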