ARROW-8979: [C++] Refine bitmap unaligned word access #7340
@ursabot benchmark --suite-filter=arrow-bit-util-benchmark

Not sure about the "C GLib & Ruby" CI failure. Is it related to this patch?

AMD64 Ubuntu 18.04 C++ Benchmark (#109272) builder has been succeeded. Revision: ef578eef3463717ce18e95c1375a327288f645dc

====================================== =============== =============== ========
benchmark baseline contender change
====================================== =============== =============== ========
BenchmarkBitmapVisitUInt64And/32768/1 2.350 GiB/sec 2.338 GiB/sec -0.499%
BenchmarkBitmapVisitBitsetAnd/131072/0 17.121 MiB/sec 17.584 MiB/sec 2.703%
CopyBitmapWithoutOffset/8192 65.302 GiB/sec 65.929 GiB/sec 0.961%
BenchmarkBitmapAnd/131072/2 3.266 GiB/sec 3.285 GiB/sec 0.573%
SetBitsTo/16 1.546 GiB/sec 1.545 GiB/sec -0.007%
BenchmarkBitmapAnd/32768/1 3.195 GiB/sec 3.223 GiB/sec 0.879%
BenchmarkBitmapVisitUInt8And/32768/1 309.541 MiB/sec 309.631 MiB/sec 0.029%
BenchmarkBitmapAnd/32768/0 7.178 GiB/sec 7.172 GiB/sec -0.078%
VisitBits/8192 102.771 MiB/sec 102.787 MiB/sec 0.015%
BenchmarkBitmapAnd/131072/0 5.712 GiB/sec 5.677 GiB/sec -0.623%
BenchmarkBitmapVisitUInt8And/131072/0 316.207 MiB/sec 315.990 MiB/sec -0.069%
BenchmarkBitmapVisitBitsetAnd/32768/1 17.123 MiB/sec 17.517 MiB/sec 2.299%
BenchmarkBitmapVisitUInt64And/131072/0 2.823 GiB/sec 2.817 GiB/sec -0.211%
BenchmarkBitmapVisitBitsetAnd/131072/1 17.123 MiB/sec 17.583 MiB/sec 2.690%
BenchmarkBitmapVisitBitsetAnd/32768/2 17.125 MiB/sec 17.521 MiB/sec 2.312%
SetBitsTo/131072 26.165 GiB/sec 26.099 GiB/sec -0.252%
GenerateBits/8192 80.970 MiB/sec 81.408 MiB/sec 0.541%
BenchmarkBitmapVisitUInt64And/131072/1 2.657 GiB/sec 2.653 GiB/sec -0.177%
CopyBitmapWithOffset/8192 4.177 GiB/sec 5.041 GiB/sec 20.692%
BenchmarkBitmapVisitUInt8And/131072/1 311.659 MiB/sec 311.634 MiB/sec -0.008%
BitmapEqualsWithOffset/8192 3.521 GiB/sec 3.842 GiB/sec 9.108%
BenchmarkBitmapVisitUInt8And/131072/2 311.577 MiB/sec 311.609 MiB/sec 0.010%
BenchmarkBitmapVisitUInt8And/32768/0 315.946 MiB/sec 315.471 MiB/sec -0.150%
SetBitsTo/2 190.502 MiB/sec 190.490 MiB/sec -0.006%
BenchmarkBitmapVisitBitsetAnd/131072/2 17.111 MiB/sec 17.557 MiB/sec 2.608%
BenchmarkBitmapVisitUInt64And/32768/2 2.353 GiB/sec 2.345 GiB/sec -0.328%
BitmapEqualsWithoutOffset/8192 59.317 GiB/sec 57.058 GiB/sec -3.808%
BenchmarkBitmapAnd/32768/2 3.196 GiB/sec 3.225 GiB/sec 0.898%
BenchmarkBitmapVisitUInt64And/32768/0 2.876 GiB/sec 2.881 GiB/sec 0.173%
GenerateBitsUnrolled/8192 136.468 MiB/sec 137.508 MiB/sec 0.761%
BenchmarkBitmapVisitUInt64And/131072/2 2.671 GiB/sec 2.636 GiB/sec -1.297%
FirstTimeBitmapWriter/8192 98.413 MiB/sec 98.393 MiB/sec -0.020%
CopyBitmapWithOffsetBoth/8192 2.309 GiB/sec 2.441 GiB/sec 5.729%
BenchmarkBitmapAnd/131072/1 3.270 GiB/sec 3.285 GiB/sec 0.452%
BitmapReader/8192 100.907 MiB/sec 100.937 MiB/sec 0.030%
BenchmarkBitmapVisitBitsetAnd/32768/0 17.102 MiB/sec 17.542 MiB/sec 2.574%
BitmapWriter/8192 80.342 MiB/sec 80.334 MiB/sec -0.010%
SetBitsTo/1024 49.457 GiB/sec 51.567 GiB/sec 4.266%
VisitBitsUnrolled/8192 285.766 MiB/sec 285.735 MiB/sec -0.011%
BenchmarkBitmapVisitUInt8And/32768/2 309.632 MiB/sec 309.660 MiB/sec 0.009%
====================================== =============== =============== ========

The benchmark results are better than on my machine. I picked some items from the table above (one minor glitch of the benchmark output is that the items are not sorted by name, as they are in a local benchmark run).

It's not related to this patch. Please ignore it in this pull request.

Perhaps #7352 can be merged first and then this patch can be rebased. Sorry about the extra work. I can also help with rebasing if that helps.

This patch adds word-based bitmap reader/writer classes to hold the code common to unaligned bitmap operations (copy, compare, logical). They process data in words first, then in bytes, and finally in bits. Bitmap copying improves by about 5% to 10%. Bitmap comparing drops slightly (2% to 5%). There is no obvious difference for logical operations.
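
The words, then bytes, then bits strategy can be sketched roughly as follows. This is a minimal illustration under stated assumptions and not the code from this patch: `CopyBitsFromOffset`, `GetBit`, `SetBitTo`, and `LoadWord` are hypothetical names, the destination is assumed to start at bit 0, bits are assumed to be LSB-first within each byte, and the word loads assume a little-endian host.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only (not Arrow's API).
// Bit i of a bitmap lives in byte i / 8 at position i % 8 (LSB-first).
inline bool GetBit(const uint8_t* bits, int64_t i) {
  return ((bits[i / 8] >> (i % 8)) & 1) != 0;
}

inline void SetBitTo(uint8_t* bits, int64_t i, bool value) {
  const uint8_t mask = static_cast<uint8_t>(1u << (i % 8));
  const uint8_t old = bits[i / 8];
  bits[i / 8] = value ? static_cast<uint8_t>(old | mask)
                      : static_cast<uint8_t>(old & ~mask);
}

// memcpy keeps the load well-defined for unaligned addresses.
// Assumption: little-endian host, so word bit k equals bitmap bit k.
inline uint64_t LoadWord(const uint8_t* p) {
  uint64_t word;
  std::memcpy(&word, p, sizeof(word));
  return word;
}

// Copy `length` bits starting at bit `src_offset` of `src` to bit 0 of `dst`.
// Bits of `dst` at positions >= length are left untouched.
void CopyBitsFromOffset(const uint8_t* src, int64_t src_offset, int64_t length,
                        uint8_t* dst) {
  assert(src_offset >= 0 && length >= 0);
  const uint8_t* in = src + src_offset / 8;
  const int shift = static_cast<int>(src_offset % 8);
  int64_t done = 0;  // bits copied so far; a multiple of 8 until the bit tail

  // 1. Words: 64 output bits come from one input word plus, when the source
  //    is not byte-aligned, the low `shift` bits of the following byte.
  while (length - done >= 64) {
    const uint8_t* p = in + done / 8;
    uint64_t word = LoadWord(p);
    if (shift != 0) {
      word = (word >> shift) | (static_cast<uint64_t>(p[8]) << (64 - shift));
    }
    std::memcpy(dst + done / 8, &word, sizeof(word));
    done += 64;
  }

  // 2. Bytes: the same idea, one output byte at a time.
  while (length - done >= 8) {
    const uint8_t* p = in + done / 8;
    uint8_t byte = p[0];
    if (shift != 0) {
      byte = static_cast<uint8_t>((p[0] >> shift) | (p[1] << (8 - shift)));
    }
    dst[done / 8] = byte;
    done += 8;
  }

  // 3. Bits: fewer than 8 trailing bits, copied one by one.
  for (; done < length; ++done) {
    SetBitTo(dst, done, GetBit(src, src_offset + done));
  }
}
```

For example, `CopyBitsFromOffset(src, 3, 150, dst)` would copy two full words (128 bits), then two bytes (16 bits), then the remaining 6 bits individually. Reading a single extra byte in the word loop, rather than a whole second word, avoids reading past the last byte that holds valid source bits.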

rebased

wesm left a comment
+1, this definitely makes things cleaner / more readable. Thanks @cyb70289!