GH-43728: [Python] ChunkedArray fails gracefully on non-cpu devices #43795
Conversation
cpp/src/arrow/chunked_array.cc (outdated)
I think we should consider not caching this piece of information as state in the ChunkedArray instance and instead derive it from the chunks when we need it.
Additionally, one advantage of chunking is the flexibility that it brings regarding allocation of buffers (they don't have to be contiguous), so now requiring that all chunks be allocated on the same device seems too rigid.
I proposed a solution to this: chunked arrays produce a DeviceAllocationTypeSet containing all the allocation types of the chunks. This set can be represented by a single 64-bit word in memory (I used C++ <bitset>), so it can be copied and matched very efficiently.
Here is the draft PR: https://github.com/apache/arrow/pull/43542/files#diff-b4ffb36b29cfaa2cf9be4fab774921b8344efdc595a358b02c3187ba04141f7eR89
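For illustration only, here is a minimal Python sketch of the same idea, not the C++ code from the draft PR: the set of device allocation types is derived from the chunks on demand and packed into a single integer bitmask, so copying and comparing it stays cheap. The device-type codes and the Chunk stand-in are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical device allocation type codes, standing in for Arrow's enum values.
CPU, CUDA, CUDA_HOST = 1, 2, 3


@dataclass
class Chunk:
    """Stand-in for an array chunk that knows where its buffers are allocated."""
    device_type: int


def device_allocation_type_set(chunks):
    """Derive the set of device types present in the chunks as a bitmask.

    Nothing is cached on the chunked array itself; the set is recomputed from
    the chunks whenever it is needed, and mixed-device chunked arrays remain
    representable (several bits can be set at once).
    """
    mask = 0
    for chunk in chunks:
        mask |= 1 << chunk.device_type
    return mask


mixed = [Chunk(CPU), Chunk(CUDA)]
print(bin(device_allocation_type_set(mixed)))          # 0b110: CPU and CUDA present
print(bin(device_allocation_type_set([Chunk(CPU)])))   # 0b10: CPU only
```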
I like your PR! I think this would work great for PyArrow.
We could move this "caching" to the Python side for now (if we don't want to do this in C++, which I think is certainly fine), or otherwise wait on your PR #43542 to land.
(and we should maybe still consider caching the DeviceAllocationTypeSet result? It might be cheap to calculate in C++, but we still call this before every call of many methods on the Python object)
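A minimal sketch of what caching on the Python side could look like (class and attribute names are hypothetical, not the actual pyarrow wrapper): the result is computed once from the chunks and reused, since it is consulted before many method calls.

```python
class ChunkedArrayWrapper:
    """Illustrative wrapper that memoizes the CPU-only check.

    Each chunk is assumed to expose a boolean is_cpu attribute; the names
    here only show the memoization pattern.
    """

    def __init__(self, chunks):
        self._chunks = chunks
        self._is_cpu = None  # computed lazily on first use, then reused

    @property
    def is_cpu(self):
        if self._is_cpu is None:
            self._is_cpu = all(chunk.is_cpu for chunk in self._chunks)
        return self._is_cpu
```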
Sorry, didn't see that the PR was already updated in the meantime :)
python/pyarrow/table.pxi (outdated)
An is_cpu predicate on chunked arrays can be defined without us forcing a single device type for chunked arrays. This would unblock the Python checks without ruling out the possibility of arrays with mixed device allocations.
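As a rough sketch of that predicate (assuming each chunk can report whether it lives in CPU-accessible memory; the names are illustrative, not pyarrow's actual internals): is_cpu simply means "every chunk is on the CPU", so chunks on different devices stay legal and merely make the predicate false.

```python
from types import SimpleNamespace


def chunked_array_is_cpu(chunks):
    """True iff every chunk is CPU-allocated.

    No single device type is forced on the chunked array: a mix of CPU and
    GPU chunks is still a valid chunked array, it just is not CPU-only.
    """
    return all(chunk.is_cpu for chunk in chunks)


cpu_only = [SimpleNamespace(is_cpu=True), SimpleNamespace(is_cpu=True)]
mixed = [SimpleNamespace(is_cpu=True), SimpleNamespace(is_cpu=False)]

print(chunked_array_is_cpu(cpu_only))  # True
print(chunked_array_is_cpu(mixed))     # False
```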
felipecrv left a comment
C++ part looks good to me.
@github-actions crossbow submit test-cuda-python
Revision: 06dfe493466961225babc34d90e48ce17eadf970 Submitted crossbow builds: ursacomputing/crossbow @ actions-291bb70866
@github-actions crossbow submit test-cuda-python
Revision: e5bf77396ee1d63a1c88ed143caed8550d75093f Submitted crossbow builds: ursacomputing/crossbow @ actions-92f2f949c3
@github-actions crossbow submit test-cuda-python
Revision: 739f2d70a40e1956ceac7fb496d6b313e612bab7 Submitted crossbow builds: ursacomputing/crossbow @ actions-b7a5f6c953
@github-actions crossbow submit test-cuda-python
Revision: 0ac2ca4548d3484e3fdee44a18475c938cc8aa50 Submitted crossbow builds: ursacomputing/crossbow @ actions-41d0c41240
jorisvandenbossche left a comment
Looking good!
python/pyarrow/table.pxi (outdated)
Suggested change:
- if self._init_is_cpu == False:
+ if not self._init_is_cpu:
Need to fix some conflicts now that I merged the other one
This reverts commit d91cfabbcc374b3fd30e263284a2168c7c7cbf71.
…cpu devices" This reverts commit 1fcdb1f790f9d34b4d63e33f8a162b0346bc2ab5.
Force-pushed from 8a0f28c to a1d857a
@github-actions crossbow submit test-cuda-python
Revision: a1d857a Submitted crossbow builds: ursacomputing/crossbow @ actions-c503d748a8
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 50219ef. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 60 possible false positives for unstable benchmarks that are known to sometimes produce them.
GH-43728: [Python] ChunkedArray fails gracefully on non-cpu devices (apache#43795)

Rationale for this change

ChunkedArrays that are backed by non-cpu memory should not segfault when the user invokes an incompatible API.

What changes are included in this PR?

* Add IsCpu() to ChunkedArray
* Throw a python exception for known incompatible APIs on non-cpu device

Are these changes tested?

Unit tests

Are there any user-facing changes?

The user should no longer see segfaults for certain APIs, just python exceptions.

* GitHub Issue: apache#43728

Authored-by: Dane Pitkin <dpitkin@apache.org>
Signed-off-by: Dane Pitkin <dpitkin@apache.org>
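To show the user-facing behavior described above, here is a minimal Python sketch of the guard pattern with a hypothetical _assert_cpu helper (the real checks live in python/pyarrow/table.pxi; this is not that code): CPU-only operations first verify that every chunk is CPU-allocated and raise a Python exception instead of dereferencing device memory.

```python
class ChunkedArraySketch:
    """Hypothetical stand-in showing the raise-instead-of-segfault guard."""

    def __init__(self, chunks):
        self._chunks = chunks

    def _assert_cpu(self):
        # Guard used at the top of CPU-only methods: fail with a clear
        # Python exception rather than touching non-CPU memory.
        if not all(chunk.is_cpu for chunk in self._chunks):
            raise NotImplementedError(
                "This operation is implemented only for data on CPU device(s)."
            )

    def to_pylist(self):
        self._assert_cpu()
        # ... the real conversion would read the chunk buffers here ...
        return [value for chunk in self._chunks for value in chunk.values]
```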