ARROW-5071: [C++] CMake benchmark wrapper #4077
Conversation
It's better to define the named arguments explicitly: `__init__(self, date=None, executable=None)`
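For illustration, a minimal sketch of what the explicit signature could look like in context (the class name and attribute handling here are assumptions, not the PR's actual code):

```python
class Run:
    # Hypothetical class; only the signature comes from the review comment.
    def __init__(self, date=None, executable=None):
        self.date = date
        self.executable = executable
```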
Force-pushed from 03ef28e to 3a2b53a
bkietz left a comment:
Could you comment on your organization? For example, your classes don't map exactly onto https://github.com/apache/arrow/tree/master/dev/benchmarking/ddl/ (which some might expect).
When benchmarks are run under ctest (via `ninja benchmark` or `make benchmark`), a `benchmark` directory is created under the build directory. For each benchmark, two files are generated: `$name.json.original`, which is the raw JSON output of google benchmark, and `$name.json`, which is the output expected by the import script of the benchmarks DDL found in `dev/benchmarking`.
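As a minimal sketch of the described layout (assuming a build directory named `build/`, which the PR doesn't specify), the generated files could be inspected like this:

```python
import json
import pathlib

# "build" is an assumed build directory name.
bench_dir = pathlib.Path("build") / "benchmark"
for original in bench_dir.glob("*.json.original"):
    raw = json.loads(original.read_text())  # raw google benchmark output
    # Stripping the ".original" suffix yields the converted "$name.json".
    converted = json.loads(original.with_suffix("").read_text())
```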
Force-pushed from 3a2b53a to 04b6bdc
pitrou left a comment:
LGTM in general. Just a couple nits.
```python
head = run_cmd("git rev-parse HEAD")
# %ai: author date, ISO 8601-like format
fmt = "%ai"
timestamp = run_cmd(f"git log -1 --pretty='{fmt}' {head}")
```
Why don't you pass "HEAD" directly instead of the explicit changeset id?
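That is, roughly (a sketch of the suggested simplification, reusing the PR's `run_cmd` helper):

```python
# Let git resolve HEAD itself instead of passing an explicit changeset id.
timestamp = run_cmd("git log -1 --pretty='%ai' HEAD")
```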
```python
if maybe_median:
    return maybe_median[0].value
# fallback
return self.runs[int(self.n_runs / 2)].value
```
That's correct only if n_runs is odd (otherwise it should probably be the average of the two "middle" values)... Is there a circumstance where gbenchmark does several runs but doesn't output the median / mean / stddev?
I added the statistics just for the sake of completeness with min/q1/q3/max (which are not provided by google benchmark). According to Wikipedia, the median of n sorted values is the ((n+1)/2)-th value when n is odd; I'll fix it.
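For what it's worth, Python's standard library already handles both parities; a quick illustration (not the PR's code):

```python
import statistics

# Middle value for an odd count, mean of the two middle values for an even count.
statistics.median([1, 3, 2])     # -> 2
statistics.median([1, 4, 2, 3])  # -> 2.5
```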
```python
return [cls(k, version, list(bs)) for k, bs in groups]


def as_arrow(version, payload):
```
Would be nice to add a comment or docstring somewhere explaining what this script does. It will make maintenance easier in 6 months ;-)
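For example, a hypothetical module docstring along these lines (wording assumed from the PR description above):

```python
"""Convert google benchmark JSON output into the format expected by the
import script of the benchmarks DDL in dev/benchmarking."""
```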
```python
benchmark_json = json.load(in_fd)
converted = as_arrow(version, benchmark_json)
json.dump(converted, out_fd)
```
Might want to add `indent=2` to get a pretty-printed JSON output.
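That is (a sketch of the suggested tweak):

```python
json.dump(converted, out_fd, indent=2)  # pretty-printed JSON
```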
I'll re-open this once #4141 is merged; I'll re-implement this functionality as part of