-
Notifications
You must be signed in to change notification settings - Fork 4k
GH-34386: [C++] Add a PathFromUriOrPath method #34420
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-34386: [C++] Add a PathFromUriOrPath method #34420
Conversation
|
|
|
Oh neat, thank you. I'll try to take a look at this soon |
|
Closing because it has been untouched for a while, in case it's still relevant feel free to reopen and move it forward 👍 |
|
Not sure why this was unceremoniously closed. Weston, do you think you would still have time to look at this? It would be useful in #35034 |
|
Yes, I will try and look at this. I think it still needs some changes. |
79dc526 to
9100358
Compare
|
I unified the various paths using a helper method in util_internal.cc. This highlighted one small difference between the various implementations. The s3 and gcs filesystems remove any trailing slash in the path while the hdfs and local filesystems do not. I have opened #35399 for any discussion on that difference but, for the moment, I have changed all the filesystems to always remove the trailing slash for consistency. |
|
Hmm, I suppose the trailing slash could change listing behavior in S3. |
S3 was already removing the trailing slash so there is no change to the S3 behavior. This only affected the local filesystem. |
|
Ah, I got that mixed up. LGTM then |
…u already have a filesystem
… option to true everywhere it is used as part of PathFromUri
da56f29 to
c89e9ca
Compare
| #ifdef _WIN32 | ||
| if (preserve_root && key.size() == 3 && key[1] == ':' && key[0] != '/') { | ||
| // If the user gives us C:/ then don't return C: | ||
| return key; | ||
| } | ||
| #endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh fun.
|
There are a few errors but they seem to be unrelated. I'm going to merge this. |
|
Benchmark runs are scheduled for baseline = febd0ff and contender = 7ca7724. 7ca7724 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
|
['Python', 'R'] benchmarks have high level of regressions. |
### Rationale for this change We have some URI parsing indirectly exposed through FilesystemFromUri. However, this isn't very useful if the user has multiple URIs or if they already have a filesystem. This method allows that same URI handling to be used even if the user already has a filesystem. ### What changes are included in this PR? Adds a new arrow::fs::PathFromUriOrPath method ### Are these changes tested? Yes, via unit tests ### Are there any user-facing changes? There is a new API method but no changes to any existing APIs * Closes: apache#34386 Authored-by: Weston Pace <weston.pace@gmail.com> Signed-off-by: Weston Pace <weston.pace@gmail.com>
### Rationale for this change We have some URI parsing indirectly exposed through FilesystemFromUri. However, this isn't very useful if the user has multiple URIs or if they already have a filesystem. This method allows that same URI handling to be used even if the user already has a filesystem. ### What changes are included in this PR? Adds a new arrow::fs::PathFromUriOrPath method ### Are these changes tested? Yes, via unit tests ### Are there any user-facing changes? There is a new API method but no changes to any existing APIs * Closes: apache#34386 Authored-by: Weston Pace <weston.pace@gmail.com> Signed-off-by: Weston Pace <weston.pace@gmail.com>
Rationale for this change
We have some URI parsing indirectly exposed through FilesystemFromUri. However, this isn't very useful if the user has multiple URIs or if they already have a filesystem. This method allows that same URI handling to be used even if the user already has a filesystem.
What changes are included in this PR?
Adds a new arrow::fs::PathFromUriOrPath method
Are these changes tested?
Yes, via unit tests
Are there any user-facing changes?
There is a new API method but no changes to any existing APIs