The Arrow IPC primitives support reading and writing Tables and Columns in GPU memory. We should add support for reading the Arrow IPC format when the input data is a CUDA buffer, as well as for writing DataFrames to CUDA buffers in the Arrow IPC format.
This would allow us to easily serialize a DataFrame to GPU memory, share that memory with multiple processes (via CUDA IPC), and allow those processes to zero-copy read the Arrow Table from the shared memory pointer and use its buffers as the backing storage for a DataFrame.
cuDF Python already supports zero-copy reading of the Arrow IPC format stored in a CUDA buffer [1][2], with a bit of help from libcudf [3]. It doesn't support writing the Arrow IPC format to a CUDA buffer, but we should be able to use the reading logic as a guide.
[1] GpuArrowReader in python/cudf/cudf/comm/gpuarrow.py
[2] CudaRecordBatchStreamReader in python/cudf/cudf/_lib/gpuarrow.pyx
[3] CudaMessageReader in cpp/src/comms/ipc/ipc.cpp
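For anyone picking this up, here is a rough sketch of the message framing that a writer would have to produce and a reader would have to walk. In the Arrow IPC streaming format, each encapsulated message starts with a 0xFFFFFFFF continuation marker and a little-endian int32 metadata length, followed by the padded metadata and the message body; a marker followed by a zero length ends the stream. The code is standard-library Python on host memory only, so several things are assumptions made for illustration: a bytearray stands in for the CUDA buffer, the metadata is just an 8-byte body length instead of a real Flatbuffers Message, and write_message, write_end_of_stream, and read_messages are hypothetical names, not cuDF or Arrow APIs.

```python
import struct

CONTINUATION = 0xFFFFFFFF  # marker that starts every encapsulated message
ALIGNMENT = 8              # metadata and body are padded to 8-byte boundaries


def pad8(n):
    """Round n up to the next multiple of ALIGNMENT."""
    return (n + ALIGNMENT - 1) // ALIGNMENT * ALIGNMENT


def write_message(out, body):
    """Append one framed message to the bytearray `out`."""
    # Stand-in metadata (assumption): only the body length, as an int64.
    # A real writer would emit a Flatbuffers-encoded Message here.
    metadata = struct.pack("<q", len(body))
    out.extend(struct.pack("<Ii", CONTINUATION, len(metadata)))
    out.extend(metadata)
    out.extend(bytes(body).ljust(pad8(len(body)), b"\x00"))


def write_end_of_stream(out):
    """Append the end-of-stream marker: continuation plus a zero length."""
    out.extend(struct.pack("<Ii", CONTINUATION, 0))


def read_messages(buf):
    """Yield each message body as a memoryview slice of `buf` (no copy)."""
    offset = 0
    while True:
        marker, meta_len = struct.unpack_from("<Ii", buf, offset)
        if marker != CONTINUATION:
            raise ValueError("missing continuation marker")
        offset += 8
        if meta_len == 0:  # end of stream
            return
        (body_len,) = struct.unpack_from("<q", buf, offset)
        offset += meta_len
        yield buf[offset:offset + body_len]
        offset += pad8(body_len)


# Writer side: frame two bodies into one contiguous buffer.
shared = bytearray()
write_message(shared, b"abc")
write_message(shared, bytes(range(16)))
write_end_of_stream(shared)

# Reader side: the bodies are views into `shared`, not copies.
bodies = list(read_messages(memoryview(shared)))
```

Because the reader only slices the underlying buffer, the bodies it yields alias the original memory. That is the property we want on the GPU: a process that opens the buffer through CUDA IPC should be able to hand those slices to a DataFrame as backing storage without copying.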