Skip to content

Conversation

@benibus
Copy link
Contributor

@benibus benibus commented Jun 20, 2023

Rationale for this change

Fixes a regression introduced in #35727.

What changes are included in this PR?

Re-implements a branch in the Table sorter that defers to the ChunkedArray sorter for single sort keys.

Are these changes tested?

Covered by existing tests.

Are there any user-facing changes?

No.

@benibus
Copy link
Contributor Author

benibus commented Jun 20, 2023

Local benchmark results (approx. best-case)

Before

-------------------------------------------------------------------------------------------------------
Benchmark                                             Time             CPU   Iterations UserCounters...
-------------------------------------------------------------------------------------------------------
TableSortIndicesInt64Narrow/1048576/100/1/32  367381664 ns    367357156 ns            2 chunks=32 columns=1 items_per_second=2.85438M/s null_percent=1
TableSortIndicesInt64Narrow/1048576/4/1/32    281608431 ns    281605789 ns            2 chunks=32 columns=1 items_per_second=3.72356M/s null_percent=25
TableSortIndicesInt64Narrow/1048576/0/1/32    354414823 ns    354413272 ns            2 chunks=32 columns=1 items_per_second=2.95863M/s null_percent=0
TableSortIndicesInt64Narrow/1048576/100/1/4   315257284 ns    315253535 ns            2 chunks=4 columns=1 items_per_second=3.32614M/s null_percent=1
TableSortIndicesInt64Narrow/1048576/4/1/4     240298213 ns    240291188 ns            3 chunks=4 columns=1 items_per_second=4.36377M/s null_percent=25
TableSortIndicesInt64Narrow/1048576/0/1/4     311795614 ns    311790481 ns            2 chunks=4 columns=1 items_per_second=3.36308M/s null_percent=0
TableSortIndicesInt64Narrow/1048576/100/1/1   282068938 ns    282067950 ns            2 chunks=1 columns=1 items_per_second=3.71746M/s null_percent=1
TableSortIndicesInt64Narrow/1048576/4/1/1     210222675 ns    210216608 ns            3 chunks=1 columns=1 items_per_second=4.98807M/s null_percent=25
TableSortIndicesInt64Narrow/1048576/0/1/1     281105329 ns    281101193 ns            2 chunks=1 columns=1 items_per_second=3.73024M/s null_percent=0

After

-------------------------------------------------------------------------------------------------------
Benchmark                                             Time             CPU   Iterations UserCounters...
-------------------------------------------------------------------------------------------------------
TableSortIndicesInt64Narrow/1048576/100/1/32  148802109 ns    148797635 ns            4 chunks=32 columns=1 items_per_second=7.04699M/s null_percent=1
TableSortIndicesInt64Narrow/1048576/4/1/32    125102993 ns    125100621 ns            6 chunks=32 columns=1 items_per_second=8.38186M/s null_percent=25
TableSortIndicesInt64Narrow/1048576/0/1/32    147291350 ns    147287710 ns            5 chunks=32 columns=1 items_per_second=7.11924M/s null_percent=0
TableSortIndicesInt64Narrow/1048576/100/1/4    72243393 ns     72241902 ns           10 chunks=4 columns=1 items_per_second=14.5148M/s null_percent=1
TableSortIndicesInt64Narrow/1048576/4/1/4      66780395 ns     66779462 ns           11 chunks=4 columns=1 items_per_second=15.7021M/s null_percent=25
TableSortIndicesInt64Narrow/1048576/0/1/4      70065128 ns     70063740 ns           10 chunks=4 columns=1 items_per_second=14.966M/s null_percent=0
TableSortIndicesInt64Narrow/1048576/100/1/1    19336900 ns     19335762 ns           36 chunks=1 columns=1 items_per_second=54.2299M/s null_percent=1
TableSortIndicesInt64Narrow/1048576/4/1/1      25458573 ns     25457697 ns           27 chunks=1 columns=1 items_per_second=41.189M/s null_percent=25
TableSortIndicesInt64Narrow/1048576/0/1/1      17580368 ns     17577897 ns           40 chunks=1 columns=1 items_per_second=59.6531M/s null_percent=0

@benibus
Copy link
Contributor Author

benibus commented Jun 20, 2023

cc @pitrou

@pitrou
Copy link
Member

pitrou commented Jun 20, 2023

This fixes the regression here.

@github-actions github-actions bot added awaiting committer review Awaiting committer review and removed awaiting review Awaiting review labels Jun 20, 2023
@pitrou pitrou merged commit 8a4c19b into apache:main Jun 20, 2023
@pitrou
Copy link
Member

pitrou commented Jun 20, 2023

Thanks for the quick fix @benibus !

@conbench-apache-arrow
Copy link

Conbench analyzed the 6 benchmark runs on commit 8a4c19bd.

There were 7 benchmark results indicating a performance regression:

The full Conbench report has more details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[C++] Large regression in single-key table sort performance

2 participants