ARROW-12016 [C++] Implement array_sort_indices and sort_indices for BOOL type #10585

nirandaperera · 2021-06-23T19:35:43Z

Adding array_sort_indices and partition_nth_indices for BooleanType using existing sort and Nth-partition utils.
This may be rather inefficient, since the values are traversed bit-by-bit rather than working on a byte/word.

Maybe we could work on that as a separate improvement?
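To illustrate the bit-by-bit versus word-at-a-time distinction mentioned above, here is a minimal sketch of counting set bits in a packed, LSB-first bitmap 64 bits at a time. It is not taken from the PR: `CountSetBitsSketch` is a hypothetical name, the bitmap is a plain `std::vector<uint8_t>` rather than an Arrow buffer, and only the counting half of the problem is shown.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not an Arrow API): counts the set bits among the first
// `length` bits of an LSB-first packed bitmap.
std::size_t CountSetBitsSketch(const std::vector<uint8_t>& bitmap, std::size_t length) {
  assert(length <= bitmap.size() * 8);
  std::size_t count = 0;
  std::size_t i = 0;  // current bit position

  // Fast path: consume whole 64-bit words and popcount each one.
  for (; i + 64 <= length; i += 64) {
    uint64_t word;
    std::memcpy(&word, bitmap.data() + i / 8, sizeof(word));
    count += std::bitset<64>(word).count();
  }

  // Slow path: the remaining (fewer than 64) bits, one at a time.
  for (; i < length; ++i) {
    count += (bitmap[i / 8] >> (i % 8)) & 1u;
  }
  return count;
}
```

The popcount of a word does not depend on byte order, so the `memcpy` load is safe on both little- and big-endian machines.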

github-actions · 2021-06-23T19:36:02Z

https://issues.apache.org/jira/browse/ARROW-12016

nirandaperera · 2021-06-23T19:36:21Z

@pitrou Can you take a look at this?

ianmcook · 2021-06-23T19:54:27Z

@nirandaperera as part of this PR, could you please make two small changes to the R package tests to exercise this new capability?

Remove the comment at the end of this line:

arrow/r/tests/testthat/helper-data.R

Line 155 in 515b05c

lgl = c(rep(FALSE, 4L), rep(TRUE, 5L), NA), # bool is not supported (ARROW-12016)

Remove this line:

arrow/r/tests/testthat/test-dplyr-arrange.R

Line 162 in 450e0eb

skip("Sorting by bool columns is not supported (ARROW-12016)")

Thanks!

nirandaperera · 2021-06-23T20:08:32Z

@ianmcook Done! :-)

nirandaperera · 2021-06-23T20:24:21Z

@ianmcook it looks like something else is missing.
https://github.com/apache/arrow/pull/10585/checks?check_run_id=2898806771

pitrou · 2021-06-24T07:29:20Z

@nirandaperera You may want to add a benchmark in vector_sort_benchmark.cc.

pitrou · 2021-06-24T07:31:31Z

Also, if you want to work on performance, note that a dedicated counting sort for boolean should be really simple.
You can first call null_count, true_count and false_count, then you just have to walk individual bits and emit indices.
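The counting-sort idea described above can be sketched as follows. This is an illustration under stated assumptions, not the code merged in this PR: `NullableBools` and `SortIndicesBoolSketch` are hypothetical names, the array is modeled with two `std::vector<bool>` instead of an Arrow `BooleanArray`, the counts are computed in a plain loop where the suggestion is to call `null_count()` and `true_count()`, and nulls are assumed to sort last.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a nullable boolean array (not Arrow's BooleanArray).
struct NullableBools {
  std::vector<bool> values;  // slot value; ignored where the slot is null
  std::vector<bool> valid;   // true where the slot is non-null
};

// Returns indices that order the array as false < true < null (assumed order).
// The sort is stable: equal values keep their original relative order.
std::vector<int64_t> SortIndicesBoolSketch(const NullableBools& arr) {
  assert(arr.values.size() == arr.valid.size());
  const std::size_t n = arr.values.size();

  // Step 1: count each bucket. With Arrow these counts would come from the
  // array itself instead of this loop.
  std::size_t nulls = 0, trues = 0;
  for (std::size_t i = 0; i < n; ++i) {
    if (!arr.valid[i]) {
      ++nulls;
    } else if (arr.values[i]) {
      ++trues;
    }
  }
  const std::size_t falses = n - nulls - trues;

  // Step 2: bucket start offsets in the output, laid out [falses | trues | nulls].
  std::size_t next_false = 0;
  std::size_t next_true = falses;
  std::size_t next_null = falses + trues;

  // Step 3: walk the slots once and emit each index into its bucket.
  std::vector<int64_t> out(n);
  for (std::size_t i = 0; i < n; ++i) {
    const int64_t index = static_cast<int64_t>(i);
    if (!arr.valid[i]) {
      out[next_null++] = index;
    } else if (arr.values[i]) {
      out[next_true++] = index;
    } else {
      out[next_false++] = index;
    }
  }
  return out;
}
```

Because there are only three buckets, the whole sort is two linear passes with no comparisons, which is why a dedicated implementation can beat a general-purpose sort here.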

nirandaperera · 2021-06-24T18:22:50Z

> @nirandaperera You may want to add a benchmark in vector_sort_benchmark.cc.

@pitrou I've added a simple benchmark. I'll add the improved version next and run it against that benchmark. I didn't have to do much for RecordBatches and Tables because they already use the ::GetView methods to access values, so they worked out of the box for bools.

cpp/src/arrow/compute/kernels/vector_sort_benchmark.cc

nirandaperera · 2021-06-28T19:57:55Z

@pitrou I think I'll open a new JIRA for the optimized bool sort implementation. I feel like I can reuse some of the stuff from the #10487 PR here.

nirandaperera · 2021-07-02T23:17:10Z

@pitrou I added a separate ArrayCountSorter impl for bool types.

pitrou · 2021-07-05T15:04:25Z

cpp/src/arrow/compute/kernels/vector_sort.cc

Really, you needn't write this yourself. Just call null_count() and true_count().

Oh okay. I did it this way because we can count both nulls and trues in a single pass. But sure, I'll use that.

pitrou · 2021-07-05T15:09:14Z

cpp/src/arrow/compute/kernels/vector_sort.cc

This sounds rather weird. Are you sure about this?

Well, I didn't test this. I was just following the ArrayCountSorter impl.
https://github.com/apache/arrow/blob/36d7a562f08d77dbf5ed699d486badcc9403f619/cpp/src/arrow/compute/kernels/vector_sort.cc#L438

I added this comment very early on, when implementing the counting sort approach.
I think it's possibly because a 32-bit counter array is smaller and has a better chance of staying in L1 cache.

Unless it's benchmarked here as well, let's remove the comment to avoid confusion.

This comment is in the primitive type impl as well. Should that one be removed too?

That one was benchmarked!

More seriously…maybe at least edit them to reflect that it was done due to a benchmark. Though I worry about the comment effectively bitrotting.

pitrou · 2021-07-05T15:33:06Z

cpp/src/arrow/compute/kernels/vector_sort_test.cc

I don't think it's worth testing the non-null case separately. Dropping it leaves less test code to maintain.

pitrou · 2021-07-05T15:33:11Z

cpp/src/arrow/compute/kernels/vector_sort_test.cc

lidavidm

LGTM. One minor comment about that confusing comment.

lidavidm · 2021-07-16T15:08:44Z

cpp/src/arrow/compute/kernels/vector_sort.cc

Unless it's benchmarked here as well, let's remove the comment to avoid confusion.

nirandaperera · 2021-07-19T15:56:38Z

I made the suggested changes and I think this is ready now.

lidavidm · 2021-07-19T17:03:05Z

If we're removing the 32 vs 64 bit counter branch, can we benchmark it to make sure there's no impact?

nirandaperera · 2021-07-19T18:45:01Z

> If we're removing the 32 vs 64 bit counter branch, can we benchmark it to make sure there's no impact?

I'm still thinking about how to do this benchmark. 😄 We can't call separate ArrayCountSorter<BooleanType> impls from the bench suite, can we?

lidavidm · 2021-07-19T19:00:51Z

It would be a before vs after benchmark not side by side

kszucs · 2021-07-19T22:34:36Z

@nirandaperera MSVC doesn't look happy.

edponce · 2021-07-19T22:53:42Z

@nirandaperera @kszucs The MSVC error seems unrelated to this PR and is caused by a timeout in a Flight test.

kszucs · 2021-07-19T22:56:15Z

I'm referring to these errors (appveyor not mingw).

nirandaperera · 2021-07-20T06:12:43Z

> I'm referring to these errors (appveyor not mingw).

I believe this is due to a method being declared static. Let's see! Thanks @kszucs

nirandaperera · 2021-07-20T13:53:28Z

> It would be a before vs after benchmark not side by side

I tested this with the ArraySortIndicesBool benchmark and didn't see any significant difference in performance, so I think it's okay to leave it as int64_t.

int64_t

-------------------------------------------------------------------------------------------
Benchmark                                 Time             CPU   Iterations UserCounters...
-------------------------------------------------------------------------------------------
ArraySortIndicesBool/32768/10000     856525 ns       790696 ns         1067 bytes_per_second=39.5221M/s items_per_second=331.536M/s null_percent=0.01 size=32.768k
ArraySortIndicesBool/32768/100      1068125 ns      1067932 ns          644 bytes_per_second=29.2622M/s items_per_second=245.469M/s null_percent=1 size=32.768k
ArraySortIndicesBool/32768/10       1322777 ns      1320088 ns          523 bytes_per_second=23.6727M/s items_per_second=198.581M/s null_percent=10 size=32.768k
ArraySortIndicesBool/32768/2        1898999 ns      1871787 ns          372 bytes_per_second=16.6953M/s items_per_second=140.05M/s null_percent=50 size=32.768k
ArraySortIndicesBool/32768/1         228503 ns       218945 ns         3266 bytes_per_second=142.73M/s items_per_second=1.19731G/s null_percent=100 size=32.768k
ArraySortIndicesBool/32768/0         755168 ns       735851 ns          960 bytes_per_second=42.4678M/s items_per_second=356.246M/s null_percent=0 size=32.768k
ArraySortIndicesBool/1048576/100   35812425 ns     35626885 ns           19 bytes_per_second=28.0687M/s items_per_second=235.457M/s null_percent=1 size=1048.58k
ArraySortIndicesBool/8388608/100  320588616 ns    320491041 ns            2 bytes_per_second=24.9617M/s items_per_second=209.394M/s null_percent=1 size=8.38861M


int32_t

-------------------------------------------------------------------------------------------
Benchmark                                 Time             CPU   Iterations UserCounters...
-------------------------------------------------------------------------------------------
ArraySortIndicesBool/32768/10000     768462 ns       763410 ns         1005 bytes_per_second=40.9347M/s items_per_second=343.385M/s null_percent=0.01 size=32.768k
ArraySortIndicesBool/32768/100      1060476 ns      1026142 ns          765 bytes_per_second=30.4539M/s items_per_second=255.466M/s null_percent=1 size=32.768k
ArraySortIndicesBool/32768/10       1447835 ns      1375450 ns          499 bytes_per_second=22.7198M/s items_per_second=190.588M/s null_percent=10 size=32.768k
ArraySortIndicesBool/32768/2        2002536 ns      2001982 ns          343 bytes_per_second=15.6095M/s items_per_second=130.942M/s null_percent=50 size=32.768k
ArraySortIndicesBool/32768/1         274177 ns       268186 ns         2750 bytes_per_second=116.523M/s items_per_second=977.469M/s null_percent=100 size=32.768k
ArraySortIndicesBool/32768/0         735371 ns       734928 ns         1025 bytes_per_second=42.5211M/s items_per_second=356.693M/s null_percent=0 size=32.768k
ArraySortIndicesBool/1048576/100   42704206 ns     42695362 ns           12 bytes_per_second=23.4217M/s items_per_second=196.476M/s null_percent=1 size=1048.58k
ArraySortIndicesBool/8388608/100  310066885 ns    310040314 ns            2 bytes_per_second=25.8031M/s items_per_second=216.452M/s null_percent=1 size=8.38861M

lidavidm · 2021-07-20T16:28:36Z

@ursabot please benchmark lang=C++

lidavidm · 2021-07-20T16:29:00Z

Thanks for checking. Let's also check with Conbench and if that's alright, then let's merge.

ursabot · 2021-07-20T16:29:09Z

Benchmark runs are scheduled for baseline = d7a8b46 and contender = 85445cf37da25a953bb15478938853613b64cc18. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Provided benchmark filters do not have any benchmark groups to be executed on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2 (mimalloc)
[Skipped ⚠️ Only ['Python', 'R'] langs are supported on ursa-i9-9960x] ursa-i9-9960x (mimalloc)
[Failed] ursa-thinkcentre-m75q (mimalloc)
Supported benchmarks:
ursa-i9-9960x: langs = Python, R
ursa-thinkcentre-m75q: langs = C++, Java
ec2-t3-xlarge-us-east-2: cloud = True

lidavidm · 2021-07-20T17:03:32Z

@nirandaperera this apparently needs rebasing against master before we can run Conbench on it

kszucs · 2021-07-21T10:37:58Z

@ursabot please benchmark lang=C++

ursabot · 2021-07-21T10:38:09Z

Benchmark runs are scheduled for baseline = 737492e and contender = 992f8dcf38caf30405f100636760a77e5e98d056. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Provided benchmark filters do not have any benchmark groups to be executed on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2 (mimalloc)
[Skipped ⚠️ Only ['Python', 'R'] langs are supported on ursa-i9-9960x] ursa-i9-9960x (mimalloc)
[Finished ⬇️0.43% ⬆️0.05%] ursa-thinkcentre-m75q (mimalloc)
Supported benchmarks:
ursa-i9-9960x: langs = Python, R
ursa-thinkcentre-m75q: langs = C++, Java
ec2-t3-xlarge-us-east-2: cloud = True

pitrou · 2021-07-21T11:06:58Z

@ursabot please benchmark lang=C++

ursabot · 2021-07-21T11:07:09Z

Benchmark runs are scheduled for baseline = 737492e and contender = 257527c. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Provided benchmark filters do not have any benchmark groups to be executed on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2 (mimalloc)
[Skipped ⚠️ Only ['Python', 'R'] langs are supported on ursa-i9-9960x] ursa-i9-9960x (mimalloc)
[Finished ⬇️0.19% ⬆️0.0%] ursa-thinkcentre-m75q (mimalloc)
Supported benchmarks:
ursa-i9-9960x: langs = Python, R
ursa-thinkcentre-m75q: langs = C++, Java
ec2-t3-xlarge-us-east-2: cloud = True

github-actions bot added the Component: C++ label Jun 23, 2021

github-actions bot added the Component: R label Jun 23, 2021

pitrou reviewed Jun 28, 2021

nirandaperera force-pushed the ARROW-12016 branch 2 times, most recently from c1ef90b to 9cffeae July 2, 2021 23:04

nirandaperera requested a review from pitrou July 2, 2021 23:05

pitrou requested changes Jul 5, 2021

nirandaperera requested a review from pitrou July 6, 2021 14:19

lidavidm approved these changes Jul 16, 2021

nirandaperera added 14 commits July 21, 2021 12:24

adding bool sort and partition_nth_indices

62b9d01

adding chunked array capability

bec60ae

editing R files

be3f0eb

adding bool type for record batches and tables

5af2751

adding a simple bench

ad1fe2b

minor change

cc0f790

adding specialized impl for bool

efb4800

changing vector to array

cb0cead

adding PR comments

c81df3f

Update vector_sort.cc

5d58b53

minor changes

ce74d34

removing non-null case in TestTableSortIndices

151e466

removing unnecessary uint32/ uint64 branch

fbb2129

trying to fix appveyor error

b6ccdb2

kszucs force-pushed the ARROW-12016 branch from 85445cf to 992f8dc July 21, 2021 10:37

Update docs, simplify code

3de75a7

pitrou force-pushed the ARROW-12016 branch from 992f8dc to 3de75a7 July 21, 2021 10:42

pitrou approved these changes Jul 21, 2021

Improve performance in the no-nulls case

257527c

pitrou closed this in 1ce1f10 Jul 21, 2021

asfimport mentioned this pull request Jul 21, 2021

[C++] Implement array_sort_indices and sort_indices for BOOL type #27847

Closed
