ARROW-13010: [C++][Compute] Support outputting to slices from kleene kernels #10487
@github-actions autotune
bkietz
left a comment
This is looking good, thanks for working on this.
@ursabot please benchmark
Benchmark runs are scheduled for baseline = 39dcb43 and contender = d0cead51aa9eb0f5e4a9b094236632ffdb8c4436. Results will be available as each benchmark for each run completes.
3a4f269 to 95c3688
@ursabot please benchmark
Benchmark runs are scheduled for baseline = c913aa3 and contender = bcce18e5d4d83f0831de71b363ad91470376084c. Results will be available as each benchmark for each run completes.
@ursabot please benchmark command=cpp-micro --suite-filter=arrow-compute-scalar-boolean-benchmark
@bkietz I added the changes we discussed. The following are the latest benchmark results on my machine.
@ursabot please benchmark command=cpp-micro --suite-filter=arrow-compute-scalar-boolean-benchmark
Benchmark runs are scheduled for baseline = c913aa3 and contender = 788bd495a8ca8c180355e9066387824fc972d734. Results will be available as each benchmark for each run completes.
bkietz
left a comment
nits
@ursabot please benchmark
Benchmark runs are scheduled for baseline = c913aa3 and contender = 0631e7bbb5042adb5299440572c5b49633dc58fb. Results will be available as each benchmark for each run completes.
@ursabot please benchmark
Benchmark runs are scheduled for baseline = c913aa3 and contender = 2663d92be3f95598b00391e254eefa11cfb11279. Results will be available as each benchmark for each run completes.
This reverts commit 97091f85
Co-authored-by: Benjamin Kietzman <bengilgit@gmail.com>
2663d92 to 1b3144b
Locally, I seem to get varying results from run to run (and also depending on the compiler), but (on Ubuntu 20.04, AMD Ryzen 3900)
There are more regressions with gcc 9, though:
Regression for
What's weird as well is that, sometimes, L2-sized benchmarks are faster than L1-sized, but sometimes they are slower.
@nirandaperera I see, thanks for the insight.
In any case, I don't think the regressions are really terrible in themselves.
I got archery running on my machine and I can confirm that gcc-9 is the problem there. With clang-10, it shows better performance, but gcc-9 shows a lot of regressions.
I did some further analysis on this. It turns out that gcc-10 works much better than gcc-9.
bkietz
left a comment
LGTM. I think since the performance regression is compiler dependent we don't need to worry about it here. Thanks for doing this!
I'll merge shortly
This change adds a Bitmap::VisitWordsAndWrite method that outputs the values of the visitor lambda function to a provided bitmap.
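To illustrate the idea, here is a minimal, self-contained sketch of a word-wise visitor that writes its results into output bitmaps, applied to a Kleene AND. It is not the Arrow implementation: the names Words, VisitWordsAndWriteSketch and KleeneAndSketch are hypothetical, bitmaps are modelled as plain vectors of 64-bit words, and bit offsets (the slice case this PR is about) are deliberately ignored.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a bitmap: a run of 64-bit words with no bit offset.
using Words = std::vector<uint64_t>;

// Visit N input bitmaps one word at a time and store the M words produced by
// the visitor into M output bitmaps. All bitmaps must have the same length.
template <std::size_t N, std::size_t M, typename Visitor>
void VisitWordsAndWriteSketch(const std::array<const Words*, N>& in,
                              const std::array<Words*, M>& out,
                              Visitor&& visitor) {
  static_assert(N > 0, "need at least one input bitmap");
  const std::size_t n_words = in[0]->size();
  for (const Words* w : in) assert(w->size() == n_words);
  for (Words* w : out) assert(w->size() == n_words);

  std::array<uint64_t, N> in_words{};
  std::array<uint64_t, M> out_words{};
  for (std::size_t i = 0; i < n_words; ++i) {
    for (std::size_t j = 0; j < N; ++j) in_words[j] = (*in[j])[i];
    visitor(in_words, &out_words);  // the visitor fills out_words
    for (std::size_t j = 0; j < M; ++j) (*out[j])[i] = out_words[j];
  }
}

// Kleene AND over (validity, data) bitmap pairs. The result is valid where
// both inputs are valid, or where either input is a valid false.
void KleeneAndSketch(const Words& l_valid, const Words& l_data,
                     const Words& r_valid, const Words& r_data,
                     Words* out_valid, Words* out_data) {
  std::array<const Words*, 4> in{{&l_valid, &l_data, &r_valid, &r_data}};
  std::array<Words*, 2> out{{out_valid, out_data}};
  VisitWordsAndWriteSketch(
      in, out,
      [](const std::array<uint64_t, 4>& w, std::array<uint64_t, 2>* res) {
        const uint64_t lv = w[0], ld = w[1], rv = w[2], rd = w[3];
        (*res)[0] = (lv & rv) | (lv & ~ld) | (rv & ~rd);  // validity
        (*res)[1] = ld & rd;                              // data
      });
}
```

The point of the design is that the visitor only sees whole words and never touches the output buffers directly, so the traversal code is the one place that would need to handle offsets and trailing bits when the outputs are slices.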