ARROW-5269: [C++][Archery] Mark relevant benchmarks as regression #4285
|
I'm not sure why the "Regression" designation is necessary. It seems like we would want to monitor the performance of all of our benchmarks for significant changes. |
|
Multiple reasons:
I think it's preferable to whitelist than to include everything and possibly increase false-positive regression alerts. It also lowers the time to run benchmarks, since archery is explicitly passing (by default) the |
|
By the way, I have no attachment to |
|
If it's an issue with reporting on flaky benchmarks, we could create a "blacklist file" to instruct the reporter not to report regressions in known flakes. |
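A minimal sketch of the blocklist idea above, assuming a plain-text file with one benchmark name per line. The file format, the helper names `load_blocklist` and `reportable_regressions`, and the benchmark names are all hypothetical; none of them come from archery.

```python
import io


def load_blocklist(stream):
    """Read benchmark names to ignore, one per line; '#' starts a comment."""
    names = set()
    for line in stream:
        name = line.split("#", 1)[0].strip()
        if name:
            names.add(name)
    return names


def reportable_regressions(regressions, blocklist):
    """Keep only the regressions whose benchmark is not a known flake."""
    return [name for name in regressions if name not in blocklist]


# Hypothetical file contents and benchmark names, for illustration only.
blocklist = load_blocklist(io.StringIO("# known flaky benchmarks\nFlakyBenchmark\n"))
reported = reportable_regressions(["FlakyBenchmark", "StableBenchmark"], blocklist)
```

With this shape, benchmarks still run and are still compared; only the reporting step skips the listed names.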
|
Some thoughts, for what they're worth:
|
|
Ok, I'll just remove the filtering by default. Should we strip the |
|
It's been several days and I'm still having a hard time swallowing the "Regression*" naming convention |
fe050bb to 0592ecb
|
@ursabot benchmark |
|
I've successfully started builds for this PR |
|
AMD64 Ubuntu 18.04 C++ Benchmark Build failed. |
Codecov Report
@@             Coverage Diff             @@
##           master    #4285       +/-   ##
===========================================
+ Coverage    88.3%   89.28%    +0.98%
===========================================
  Files         780      636      -144
  Lines       98400    87138    -11262
  Branches     1251        0     -1251
===========================================
- Hits        86891    77801     -9090
+ Misses      11273     9337     -1936
+ Partials      236        0      -236
Continue to review full report at Codecov.
|
|
@ursabot benchmark |
|
@pitrou note that I added the cmake configuration flag Thus if you add a new benchmark and you know that it's only used for a point of reference of the true benchmark, you should probably wrap it with said |
|
I've successfully started builds for this PR |
|
AMD64 Ubuntu 18.04 C++ Benchmark Build failed with an exception. |
Really just a nit, but ARROW_WITH_REFERENCE_BENCHMARKS might be a better choice.
Well, there's ARROW_BUILD_BENCHMARKS already. The convention seems to be ARROW_BUILD_xxx to enable the xxx target, and ARROW_WITH_yyy to enable support for the yyy external library.
Are you suggesting the s/BUILD/WITH/ change, or swapping BENCHMARKS and REFERENCE?
kszucs
left a comment
LGTM, just a couple nits
|
@ursabot benchmark |
|
I've successfully started builds for this PR |
|
Why did we get empty output in https://ci.ursalabs.org/#builders/73/builds/20 ? |
|
@kszucs because benchmarks were renamed, and thus the intersection between the 2 runs is empty. See https://github.com/apache/arrow/blob/master/dev/archery/archery/benchmark/compare.py#L115-L116 |
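The effect described above can be sketched as follows, assuming that two runs are matched by benchmark name. This illustrates the behavior only and is not the code in `compare.py`; the helper `common_benchmarks`, the benchmark names, and the timings are hypothetical.

```python
def common_benchmarks(contender, baseline):
    """Return the names present in both runs, sorted for stable output."""
    return sorted(set(contender) & set(baseline))


# Hypothetical results keyed by benchmark name (values are made-up timings).
baseline = {"BM_ExampleA": 1.0, "BM_ExampleB": 2.0}
contender = {"RegressionExampleA": 1.1, "RegressionExampleB": 1.9}

renamed = common_benchmarks(contender, baseline)   # no shared names
unchanged = common_benchmarks(baseline, baseline)  # every name is shared
```

When every benchmark is renamed between the baseline and the contender, there is nothing to compare, so the diff output is empty.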
|
AMD64 Ubuntu 18.04 C++ Benchmark Build failed with an exception. |
pitrou
left a comment
Thank you very much for doing this. Here are some comments and questions.
- Ensure that Builder benchmarks are working on inputs of approximately the same size in bytes. This allows relative comparison between builders.
- Renamed benchmarks by prefixing `Regression`.
- Fixed extra string copy in BuildStringDictionary.

- Use the same buffer size in all bitmap benchmarks.
- Fix some reporting numbers

- Add '\n' to diff JSON output, adhering to jsonlines
- Add support for items_per_second metrics
- Add `--pdb` option to drop a pdb shell on uncaught exception

- Use a single data generator
- Fix multithread input size
- Exclude multithread from regressions

- Favor external repetitions over manual repetitions and mintime when possible.
- Add cmake ARROW_BUILD_BENCHMARKS_REFERENCE to toggle reference benchmarks.
- Remove default benchmark filter of `^Regression`.
- Remove Regression prefix from benchmark
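As a rough illustration of the jsonlines point in the commit messages above: each record is serialized as one JSON document terminated by '\n', so a consumer can parse the output line by line. The record fields and the helper `write_jsonlines` are hypothetical and do not reflect archery's actual diff schema.

```python
import io
import json


def write_jsonlines(records, stream):
    """Write one JSON document per line, each terminated by a newline."""
    for record in records:
        stream.write(json.dumps(record))
        stream.write("\n")


# Hypothetical diff records, for illustration only.
records = [
    {"benchmark": "ExampleA", "regression": False},
    {"benchmark": "ExampleB", "regression": True},
]

buf = io.StringIO()
write_jsonlines(records, buf)

# A consumer can recover the records by parsing each line independently.
parsed = [json.loads(line) for line in buf.getvalue().splitlines()]
```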
b8c4c41 to a9ecffd
pitrou
left a comment
Thanks @fsaintjacques. Just a couple more things and it'll be good IMO :-)
pitrou
left a comment
+1 :-)
The goal of this change is to mark benchmarks that are candidates for automated regression checks. Some benchmarks were refactored for various reasons:
- [...] `^Regression`.
- [...] `BM_` prefix from benchmark names
- [...] `archery` to support the `--pdb` option for debugging and add support for benchmarks reported as `items_per_second`.
- [...] `archery benchmark list` sub-command to list suites and benchmarks
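For readers unfamiliar with name-based filtering, here is a small sketch of how a pattern such as `^Regression` selects benchmarks by name. It uses Python's `re` module, whose matching rules may differ in detail from the benchmark runner's own regex engine; the helper `filter_benchmarks` and the benchmark names are hypothetical.

```python
import re


def filter_benchmarks(names, pattern):
    """Return the benchmark names that match the given regular expression."""
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]


# Hypothetical benchmark names, for illustration only.
names = ["RegressionExampleA", "BM_ExampleB", "ExampleRegressionC"]

# The '^' anchor restricts the match to names that start with "Regression".
selected = filter_benchmarks(names, r"^Regression")
```

Because the pattern is anchored, a name that merely contains "Regression" somewhere in the middle is not selected.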