GH-41960: [Python] Expose new S3 option check_directory_existence_before_creation #41972
@pitrou Here is the PR exposing the option on the Python side.
jorisvandenbossche
left a comment
The code looks good! Just some comments on the docstring
python/pyarrow/_s3fs.pyx
Outdated
- check_directory_existence_before_creation: boolean, default false
+ check_directory_existence_before_creation : boolean, default false
python/pyarrow/_s3fs.pyx
Outdated
It's rather unclear what "pessimistic" means in this context (though I see the term is used on the C++ side as well). Also, we shouldn't mention "CreateDir", as that is a C++ function the Python user is not familiar with. We can use a generic phrase instead, something like "when creating a directory".
python/pyarrow/_s3fs.pyx
Outdated
- if false, then CreateDir will try to create the directory without checking its
+ By default (False), when creating a directory, it will try to create the directory without checking its
I would maybe also include the sentence "It's an optimization to try directory creation and catch the error, rather than issue two dependent I/O calls.", as explained on the C++ side.
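To make the trade-off behind that sentence concrete, here is a toy sketch. It does not reflect the actual C++ implementation: `FakeObjectStore`, `create_dir_optimistic`, and `create_dir_checked` are hypothetical names invented for illustration, and the store is just an in-memory set that counts I/O calls.

```python
class FakeObjectStore:
    """Hypothetical in-memory stand-in for an object store (not a pyarrow API)."""

    def __init__(self):
        self.keys = set()
        self.reads = 0   # number of existence checks issued
        self.writes = 0  # number of creation attempts issued

    def exists(self, key):
        self.reads += 1
        return key in self.keys

    def put_if_absent(self, key):
        self.writes += 1
        if key in self.keys:
            raise FileExistsError(key)
        self.keys.add(key)


def create_dir_optimistic(store, key):
    # Models the default (False): attempt the creation directly and
    # tolerate an "already exists" error, so there is one call per request.
    try:
        store.put_if_absent(key)
    except FileExistsError:
        pass


def create_dir_checked(store, key):
    # Models the option set to True: check first, and only issue a
    # creation request when the directory is missing.
    if not store.exists(key):
        store.put_if_absent(key)


def demo():
    optimistic, checked = FakeObjectStore(), FakeObjectStore()
    for _ in range(2):  # create the same directory twice in each store
        create_dir_optimistic(optimistic, "bucket/dir/")
        create_dir_checked(checked, "bucket/dir/")
    return (optimistic.reads, optimistic.writes), (checked.reads, checked.writes)
```

In this model the optimistic path never issues a read but attempts a write on every call, while the checked path pays an extra read and only writes when the directory is actually missing.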
87d9d37 to 56f6875
@jorisvandenbossche Thanks for the suggestions, please review again.
After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit b51e997. There were 8 benchmark results indicating a performance regression:
The full Conbench report has more details. It also includes information about 7 possible false positives for unstable benchmarks that are known to sometimes produce them.
Rationale for this change
Expose new S3 option check_directory_existence_before_creation from GH-41493

What changes are included in this PR?
Expose new S3 option check_directory_existence_before_creation from GH-41493

Are these changes tested?
Yes.

Are there any user-facing changes?
Yes. Python function documentation is updated.
Issue: #41960
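For readers wondering what the exposed option looks like from Python, a minimal usage sketch follows. It assumes a pyarrow build that includes this change; the helper name `make_s3_filesystem`, the region, and the bucket path in the comment are placeholders, not part of the PR. The import is kept inside the function so the snippet can be loaded without pyarrow installed.

```python
def make_s3_filesystem(region="us-east-1", check_first=True):
    """Build an S3FileSystem with the new option set (hypothetical helper)."""
    # Imported lazily; requires a pyarrow version that contains this PR.
    from pyarrow import fs

    return fs.S3FileSystem(
        region=region,  # placeholder region
        check_directory_existence_before_creation=check_first,
    )


# Example use (needs valid AWS credentials and an existing bucket):
#   s3 = make_s3_filesystem()
#   s3.create_dir("my-bucket/some/dir", recursive=True)
```

Leaving `check_first` at `False` corresponds to the default described in the docstring discussion above.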