Skip to content

Removing packed_columns from TableChunk#840

Merged
rapids-bot[bot] merged 12 commits intorapidsai:mainfrom
nirandaperera:remote_tablechunk_packed_cols
Feb 5, 2026
Merged

Removing packed_columns from TableChunk#840
rapids-bot[bot] merged 12 commits intorapidsai:mainfrom
nirandaperera:remote_tablechunk_packed_cols

Conversation

@nirandaperera
Copy link
Copy Markdown
Contributor

This PR removes the packed_columns class member from TableChunk.
Now, if a PackedData is provided with DEVICE Buffer, it will be trivially unpacked and made available.

Signed-off-by: niranda perera <niranda.perera@gmail.com>
@nirandaperera nirandaperera requested a review from a team as a code owner February 2, 2026 23:02
@nirandaperera nirandaperera requested a review from madsbk February 2, 2026 23:02
@nirandaperera nirandaperera added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels Feb 2, 2026
@madsbk
Copy link
Copy Markdown
Member

madsbk commented Feb 3, 2026

This PR removes the packed_columns class member from TableChunk. Now, if a PackedData is provided with DEVICE Buffer, it will be trivially unpacked and made available.

But this also means that spilling now uses cudf::pack() instead of a simple cudaMemcpyAsync(). Are we sure this extra overhead is neglectable?

Signed-off-by: niranda perera <niranda.perera@gmail.com>
@nirandaperera nirandaperera requested a review from a team as a code owner February 3, 2026 07:54
@wence-
Copy link
Copy Markdown
Contributor

wence- commented Feb 3, 2026

This PR removes the packed_columns class member from TableChunk. Now, if a PackedData is provided with DEVICE Buffer, it will be trivially unpacked and made available.

But this also means that spilling now uses cudf::pack() instead of a simple cudaMemcpyAsync(). Are we sure this extra overhead is neglectable?

I don't understand why that would be required. If the backing store is PackedData surely we can just memcpy the buffer. After all PackedData is just cudf::packed_columns in a trenchcoat.

@madsbk
Copy link
Copy Markdown
Member

madsbk commented Feb 3, 2026

I don't understand why that would be required. If the backing store is PackedData surely we can just memcpy the buffer. After all PackedData is just cudf::packed_columns in a trenchcoat.

True, but what is the point of this PR then?

Comment thread cpp/include/rapidsmpf/memory/buffer.hpp Outdated
Comment thread cpp/include/rapidsmpf/streaming/cudf/table_chunk.hpp
Comment thread cpp/src/streaming/cudf/table_chunk.cpp
Signed-off-by: niranda perera <niranda.perera@gmail.com>
@nirandaperera
Copy link
Copy Markdown
Contributor Author

I don't understand why that would be required. If the backing store is PackedData surely we can just memcpy the buffer. After all PackedData is just cudf::packed_columns in a trenchcoat.

True, but what is the point of this PR then?

@madsbk Now there are only 2 storages, cudf::table or PackedData, rather than 3. If the device data is in PackedData and copy reservation is host/pinned, it will use buffer_copy. If device data is in cudf::table, it'll use pack.

@nirandaperera nirandaperera requested a review from wence- February 3, 2026 15:58
Signed-off-by: niranda perera <niranda.perera@gmail.com>
Copy link
Copy Markdown
Member

@madsbk madsbk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, thanks @nirandaperera

Comment thread cpp/include/rapidsmpf/streaming/cudf/table_chunk.hpp Outdated
Comment thread cpp/src/streaming/cudf/table_chunk.cpp Outdated
Comment thread cpp/src/streaming/cudf/table_chunk.cpp Outdated
nirandaperera and others added 3 commits February 4, 2026 07:46
Co-authored-by: Mads R. B. Kristensen <madsbk@gmail.com>
Signed-off-by: niranda perera <niranda.perera@gmail.com>
@nirandaperera nirandaperera requested a review from madsbk February 4, 2026 15:50
Comment thread cpp/include/rapidsmpf/streaming/cudf/table_chunk.hpp
Comment thread cpp/src/streaming/cudf/table_chunk.cpp Outdated
@nirandaperera nirandaperera requested a review from madsbk February 4, 2026 19:07
nirandaperera and others added 4 commits February 4, 2026 12:21
Signed-off-by: niranda perera <niranda.perera@gmail.com>
…era/rapidsmpf into remote_tablechunk_packed_cols
Copy link
Copy Markdown
Member

@madsbk madsbk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good @nirandaperera. I took the liberty of adding a host-to-device test and some docs.

@wence-
Copy link
Copy Markdown
Contributor

wence- commented Feb 5, 2026

/merge

@rapids-bot rapids-bot bot merged commit e320279 into rapidsai:main Feb 5, 2026
87 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improves an existing functionality non-breaking Introduces a non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants