Skip to content

Add sort_actor to cudf-polars + rapidsmpf#21690

Merged
rapids-bot[bot] merged 70 commits intorapidsai:mainfrom
rjzamora:sort-actor
Apr 1, 2026
Merged

Add sort_actor to cudf-polars + rapidsmpf#21690
rapids-bot[bot] merged 70 commits intorapidsai:mainfrom
rjzamora:sort-actor

Conversation

@rjzamora
Copy link
Copy Markdown
Member

@rjzamora rjzamora commented Mar 6, 2026

Description

Closes #20486
Depends on rapidsai/rapidsmpf#891 (or similar)

Adds sort_actor for the "rapidsmpf" runtime.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@rjzamora rjzamora self-assigned this Mar 6, 2026
@rjzamora rjzamora requested a review from a team as a code owner March 6, 2026 19:30
@rjzamora rjzamora added feature request New feature or request 2 - In Progress Currently a work in progress non-breaking Non-breaking change labels Mar 6, 2026
@rjzamora rjzamora requested review from bdice and mroeschke March 6, 2026 19:30
@github-actions github-actions bot added Python Affects Python cuDF API. cudf-polars Issues specific to cudf-polars labels Mar 6, 2026
@GPUtester GPUtester moved this to In Progress in cuDF Python Mar 6, 2026
Comment thread python/cudf_polars/cudf_polars/experimental/rapidsmpf/collectives/common.py Outdated
Comment thread python/cudf_polars/cudf_polars/experimental/rapidsmpf/collectives/shuffle.py Outdated
)
if sort_ir.stable:
nrows = df.table.num_rows()
base = seq_num * (1 << 32)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect in a multi-rank setting. I think what you're trying to do is give each row in the (concatenated) stream of local table chunks a unique id.

seq_num * (1 << 32) is a good offset for this because by construction no sequence numbers from chunks on the same rank can now overlap.

But, it seems unnecessary (you have the number of rows when adding the column, so just keep track of that locally).

However, what this doesn't do is ensure that two different ranks have different sequence numbers. For that, you need to incorporate the rank in the high bits of the id, such that when you sort globally, rows from rank-1 (say) come after rows from rank-0 if their keys are equal.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thanks for explaining - I updated the logic here (hopefully in line with your suggestion).

Comment thread python/cudf_polars/cudf_polars/experimental/rapidsmpf/collectives/sort.py Outdated
Comment on lines +292 to +313
df = DataFrame.from_table(
out_table,
cast(Sequence[str], column_names_list),
dtypes_list,
stream,
)
sort_order = [
list(column_order)[by.index(n)]
if n in by
else plc.types.Order.ASCENDING
for n in column_names_list
]
nulls = [
list(null_order)[by.index(n)]
if n in by
else plc.types.NullOrder.AFTER
for n in column_names_list
]
sorted_tbl = plc.sorting.sort(
df.table, sort_order, nulls, stream=stream
)
out_table = plc.Table(sorted_tbl.columns()[:-1])
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sort is wrong, because it sorts by all the columns in the table, whereas you only want to sort by the key columns (and the disambiguating seq_id column).

You want to be using sort_by_key

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense - I decided to construct a Sort so I can use do_evaluate, but I can move to an explicit sort_by_key if you'd prefer.

Comment thread python/cudf_polars/cudf_polars/experimental/rapidsmpf/collectives/sort.py Outdated
Comment thread python/cudf_polars/cudf_polars/experimental/rapidsmpf/collectives/sort.py Outdated
Comment on lines +75 to +76
by: list[str],
by_dtypes: list[DataType],
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These arguments are only used to construct the schema of the "empty" table for the case that the local candidates are empty, which is then used to provide the schema for chunk_to_frame.

Two things:

  1. we only need the schema for all of those operations, so we should redo the utilities to take the schema.
  2. Let's construct the empty table outside if local_candidates is empty.

So this function becomes:

async def _compute_sort_boundaries(
    context: Context,
    comm: Communicator,
    ir_context: IRExecutionContext,
    local_candidates: list[TableChunk],
    schema: Schema,
    num_partitions: int,
    column_order: list[plc.types.Order],
    null_order: list[plc.types.NullOrder],
    allgather_id: int
) -> plc.Table:
    stream = ...
    boundaries = _get_final_sort_boundaries(
        chunk_to_frame(
            await concat_batch(local_candidates, context, schema, ir_context)
        )
    )
    if comm.nranks > 1:
         ...
    boundaries = _get_final_sort_boundaries(
         ...
    )
    return boundaries.table

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, yeah - I revised this a bit.

Comment thread python/cudf_polars/cudf_polars/experimental/rapidsmpf/collectives/sort.py Outdated
Comment thread python/cudf_polars/cudf_polars/experimental/rapidsmpf/collectives/sort.py Outdated
Comment thread python/cudf_polars/cudf_polars/experimental/rapidsmpf/collectives/sort.py Outdated
Copy link
Copy Markdown
Contributor

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some non-blocking ideas/questions around sortedness:

  1. Are there any simplifications to be made to computing sort boundaries if the incoming data is already sorted?
  2. When processing the resulting message of the sorted TableChunk, is there a practical way to construct the cudf_polars DataFrame with sorted metadata i.e. set the is_sorted flag on the Columns of the cudf_polars DataFrame? Maybe the message sends metadata with that sorted information?

@rjzamora
Copy link
Copy Markdown
Member Author

Are there any simplifications to be made to computing sort boundaries if the incoming data is already sorted?

The sort-boundary calculation requires the incoming chunks to be sorted. We guarantee this at lowering time by ensuring the child of a ShuffleSorted node is always a Sort (which is executed chunk-wise). In the future, we should have metadata like rapidsai/rapidsmpf#853 to skip those local sort operations when the data is already sorted.

When processing the resulting message of the sorted TableChunk, is there a practical way to construct the cudf_polars DataFrame with sorted metadata i.e. set the is_sorted flag on the Columns of the cudf_polars DataFrame? Maybe the message sends metadata with that sorted information?

I'm hoping to use rapidsai/rapidsmpf#853 to track this kind of information in the ChannelMetadata.

Copy link
Copy Markdown
Contributor

@wence- wence- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny changes, but I think the core logic looks good.

Comment thread python/cudf_polars/cudf_polars/experimental/rapidsmpf/collectives/sort.py Outdated
Comment thread python/cudf_polars/cudf_polars/experimental/rapidsmpf/collectives/sort.py Outdated
Comment thread python/cudf_polars/cudf_polars/experimental/rapidsmpf/collectives/sort.py Outdated
@rjzamora rjzamora added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team labels Apr 1, 2026
@rjzamora
Copy link
Copy Markdown
Member Author

rjzamora commented Apr 1, 2026

/merge

@rapids-bot rapids-bot bot merged commit c28d2ee into rapidsai:main Apr 1, 2026
90 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in cuDF Python Apr 1, 2026
@rjzamora rjzamora deleted the sort-actor branch April 1, 2026 15:21
rapids-bot bot pushed a commit that referenced this pull request Apr 6, 2026
Should close #21824

See test case for repro, but basically if you had

```python
    df = pl.LazyFrame({"a": [1, 2, 3]})
    # Create two filters, both of which will give empty results
    q = pl.concat([df.filter(pl.col("a") == 0), df.filter(pl.col("a") == 4)]).sort("a")
```

then `q.collect(engine="gpu")` you'd get a cuda exception because we'd try to do an out of bounds access on an empty table.

edit: turns out #21690 fixed this issue. This PR now will only contribute a test case.

Authors:
  - J Berg (https://github.com/jberg5)

Approvers:
  - Matthew Roeschke (https://github.com/mroeschke)

URL: #21825
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

5 - Ready to Merge Testing and reviews complete, ready to merge cudf-polars Issues specific to cudf-polars feature request New feature or request non-breaking Non-breaking change Python Affects Python cuDF API.

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[FEA]: Support multi-partition Sort in cudf-polars rapidsmpf runtime

5 participants