[Python][Docs] Improve the Python user guide on the CUDA integration (pyarrow.cuda) #41666

@jorisvandenbossche

Description

We have a page in the user guide about the CUDA integration with the pyarrow.cuda module: https://arrow.apache.org/docs/15.0/python/integration/cuda.html. This page is quite brief and outdated, even relative to the current state of the CUDA functionality. For example, it only briefly shows buffers, and it does not mention having an Array or RecordBatch on the CUDA device, or reading and writing IPC data directly.
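For context, the buffer-level usage that the current page covers looks roughly like the sketch below. It assumes a CUDA-enabled pyarrow build and a GPU at device index 0 (both assumptions); the helper name `gpu_buffer_roundtrip` is hypothetical, and it returns `None` when CUDA is not usable so that it can be run anywhere.

```python
def gpu_buffer_roundtrip(payload: bytes):
    """Copy bytes host -> device -> host; return None if CUDA is unusable."""
    try:
        # pyarrow.cuda is only importable in CUDA-enabled builds of pyarrow.
        from pyarrow import cuda
        ctx = cuda.Context(0)  # assumption: use the first GPU
    except Exception:
        # pyarrow missing, built without CUDA, or no usable device.
        return None

    # Allocate device memory and copy the host bytes into it.
    device_buf = ctx.buffer_from_data(payload)
    # Copy the device memory back into a regular host Buffer.
    host_buf = device_buf.copy_to_host()
    return host_buf.to_pybytes()


result = gpu_buffer_roundtrip(b"arrow on the GPU")
print("CUDA not usable" if result is None else result)
```

An improved page could start from something like this and then move on to Arrays, RecordBatches and IPC.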

A list of ideas:

  • Show reading/writing IPC
  • Show how to copy a full Array or RecordBatch to/from the host (depending on improvements in [Python] Add bindings for Device and MemoryManager classes and related methods #41126)
  • Expand the section on interoperability with other tools (right now it only explains interop with numba, but it could also cover interop with e.g. pytorch, cupy, cudf, nanoarrow, etc.)
  • Add a guide on installing pyarrow with CUDA enabled from binaries, so that users do not have to build it themselves (e.g. this is possible through conda-forge)
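As a starting point for the IPC and installation items above, here is a minimal sketch of what such examples might look like. It assumes a CUDA-enabled pyarrow build and a GPU at device index 0; the helper names `cuda_device_count` and `ipc_roundtrip_on_device` are hypothetical and only serve the illustration.

```python
def cuda_device_count() -> int:
    """Return the number of CUDA devices pyarrow can see (0 if none)."""
    try:
        # Import fails if pyarrow is missing or was built without CUDA.
        from pyarrow import cuda
        return cuda.Context.get_num_devices()
    except Exception:
        return 0


def ipc_roundtrip_on_device():
    """Write a RecordBatch as IPC into device memory and read it back."""
    if cuda_device_count() == 0:
        return None  # nothing to demonstrate without a usable GPU

    import pyarrow as pa
    from pyarrow import cuda

    ctx = cuda.Context(0)  # assumption: use the first GPU
    batch = pa.record_batch(
        [pa.array([1, 2, 3]), pa.array(["a", "b", None])],
        names=["x", "y"],
    )

    # Serialize the batch as an IPC message directly into a CudaBuffer.
    device_buf = cuda.serialize_record_batch(batch, ctx)
    # Reconstruct a RecordBatch whose data buffers stay on the device.
    device_batch = cuda.read_record_batch(device_buf, batch.schema)
    # Copy the IPC message back to the host and decode it there.
    host_batch = pa.ipc.read_record_batch(
        device_buf.copy_to_host(), batch.schema
    )
    return device_batch.num_rows, host_batch.equals(batch)


print("CUDA devices visible to pyarrow:", cuda_device_count())
print("IPC round trip:", ipc_roundtrip_on_device())
```

The device count check could double as a quick verification step in an installation guide, since it reports 0 when pyarrow was installed without CUDA support.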
