Skip to content

Conversation

@danepitkin
Copy link
Member

@danepitkin danepitkin commented Jun 11, 2024

Rationale for this change

Common Array APIs should not segfault or abort on non-cpu devices.

What changes are included in this PR?

  • device_type and is_cpu methods added to the Array class
  • Any function that segfaults, aborts, or gives incorrect results on non-cpu devices now raises an exception

Are these changes tested?

  • Unit tests added

Are there any user-facing changes?

@github-actions
Copy link

⚠️ GitHub issue #42112 has been automatically assigned in GitHub to PR creator.

@danepitkin
Copy link
Member Author

@github-actions crossbow submit test-cuda-python

@github-actions
Copy link

Only contributors can submit requests to this bot. Please ask someone from the community for help with getting the first commit in.
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/9473856632

@danepitkin
Copy link
Member Author

@github-actions crossbow submit test-cuda-python

@github-actions
Copy link

Revision: 534f33e9f1b09dabe482404173fb44d6e75f5d37

Submitted crossbow builds: ursacomputing/crossbow @ actions-ac535722c7

Task Status
test-cuda-python GitHub Actions

@github-actions
Copy link

Revision: 9f3efef9c738808ac4976b930a215cefa67d1839

Submitted crossbow builds: ursacomputing/crossbow @ actions-8413b104ec

Task Status
test-cuda-python GitHub Actions

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Dane, looking good!

Comment on lines +1221 to 1223
self._assert_cpu()
cdef int64_t total_buffer_size
total_buffer_size = TotalBufferSize(deref(self.ap))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My first thought was that getting the buffer size should be possible without looking at the actual data. But so it seems that to avoid counting identical buffers twice (if they are reused in a single array, which is possible), it uses the buffer's address to distinguish buffers. Currently that uses buffer->data() in DoTotalBufferSize. data() will return null for non-cpu data, but since it doesn't actually use that address, I think this could also use buffer->address() which will return the address always even for non-cpu data.

(but that could be fine for a follow-up as well)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That checks out! I tested it locally and witnessed a segfault, but hadn't dug into the reasoning.

for i in range(len(self)):
yield self.getitem(i)

def __repr__(self):
self._assert_cpu()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one should not be needed because this ends up calling to_string which already has it, I think

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!

@@ -1307,6 +1319,8 @@ cdef class Array(_PandasConvertible):
-------
bool
"""
self._assert_cpu()
self.other._assert_cpu()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.other._assert_cpu()
other._assert_cpu()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOL, thank you. Nice catch.

@@ -1404,8 +1424,9 @@ cdef class Array(_PandasConvertible):
-------
sliced : RecordBatch
"""
cdef:
shared_ptr[CArray] result
self._assert_cpu()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that slicing actually works?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I tried printing the resulting RecordBatch, which segfaults. But .slice() itself does not segfault.

return self.device_type == DeviceAllocationType.CPU

def _assert_cpu(self):
if not self.is_cpu:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we would want to speed this up a bit (it's called in many places, although typically should only give tiny overhead), you could also inline similar code as you used above for the python attributes, like if self.sp_array.get().device_type() != CDeviceAllocationType_kCPU: .., and make it a cdef instead of def (not entirely sure this is worth it)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its worth doing, good suggestion


# Supported
arr.validate()
arr.validate(full=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if this will actually be supported for more complex data types. For example for variable size binary, it will check the actual offsets if they are correct numbers (eg not negative, not out of bounds, increasing, etc)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch, let me experiment and find out.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right! There is an abort with validate(full=True) when using variable binary type.

Check failed: is_cpu() not a CPU buffer (device: CudaDevice(device_number=0, name="NVIDIA RTX A5000"))
Aborted

with pytest.raises(NotImplementedError):
arr.fill_null(0)
with pytest.raises(NotImplementedError):
arr.__getitem__(0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
arr.__getitem__(0)
arr[0]

Comment on lines 4063 to 4064
with pytest.raises(NotImplementedError):
arr.getitem(0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one can be removed (getitem is a cdef function which is not callable from python, so this will error for that reason), this is tested through testing __getitem__

Comment on lines 4084 to 4086
arr.__dlpack__()
with pytest.raises(NotImplementedError):
arr.__dlpack_device__()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a TODO comment here that this should be supported in the future

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Jun 12, 2024
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Jun 24, 2024
@danepitkin
Copy link
Member Author

@github-actions crossbow submit test-cuda-python

@github-actions
Copy link

Revision: ace94eccef774383124243b77f68e5f1912ce42b

Submitted crossbow builds: ursacomputing/crossbow @ actions-570bed1718

Task Status
test-cuda-python GitHub Actions

@danepitkin
Copy link
Member Author

@github-actions crossbow submit test-cuda-python

@github-actions
Copy link

Revision: abe942e0e68b176144d8b156b77bfc330bcd37a0

Submitted crossbow builds: ursacomputing/crossbow @ actions-26aaa9352b

Task Status
test-cuda-python GitHub Actions

@danepitkin danepitkin force-pushed the danepitkin/python-array-is-cpu branch from ff03596 to d5a547f Compare June 24, 2024 23:25
@danepitkin
Copy link
Member Author

@github-actions crossbow submit test-cuda-python

@github-actions
Copy link

Revision: d5a547f

Submitted crossbow builds: ursacomputing/crossbow @ actions-b8e234a0f3

Task Status
test-cuda-python GitHub Actions

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updates look good! There is still some test failure

@github-actions github-actions bot removed the awaiting change review Awaiting change review label Jun 25, 2024
@github-actions github-actions bot added the awaiting changes Awaiting changes label Jun 25, 2024
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
@danepitkin
Copy link
Member Author

@github-actions crossbow submit test-cuda-python

@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Jun 25, 2024
@github-actions
Copy link

Revision: 1c903e2

Submitted crossbow builds: ursacomputing/crossbow @ actions-2a8fac0c0e

Task Status
test-cuda-python GitHub Actions

@danepitkin
Copy link
Member Author

@github-actions crossbow submit test-cuda-python

@github-actions
Copy link

Revision: c12765a

Submitted crossbow builds: ursacomputing/crossbow @ actions-599aea54a2

Task Status
test-cuda-python GitHub Actions

@danepitkin
Copy link
Member Author

@github-actions crossbow submit test-cuda-python

@github-actions
Copy link

Revision: 9740a90

Submitted crossbow builds: ursacomputing/crossbow @ actions-6cbf11bdde

Task Status
test-cuda-python GitHub Actions

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks!

@conbench-apache-arrow
Copy link

After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit a42ec1d.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 6 possible false positives for unstable benchmarks that are known to sometimes produce them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants