[SPARK-35266][TESTS] Fix error in BenchmarkBase.scala that occurs when creating benchmark files in non-existent directory #32394
    if (!dir.exists()) {
      dir.mkdirs()
    }
    val file = new File(s"${dir}$resultFileName")
Ah, okay. The new benchmarks were added in SPARK-33882 and SPARK-35150, which came after #32015 and #32044.
    val file = new File(s"${prefix}benchmarks/$resultFileName")
    val dir = new File(s"${prefix}benchmarks/")
    if (!dir.exists()) {
      dir.mkdirs()
Can you add a println saying that the directory is going to be created? For example:
    // scalastyle:off println
    println(s"Creating ${dir.getAbsolutePath} for benchmark results.")
    // scalastyle:on println
My concern is that the benchmark directory is derived from jar paths, which are flaky. It might be better to show the directory explicitly.
Thanks for the comment :) I added println as suggested.
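For readers following the thread, the check-then-create pattern with the suggested message can be sketched roughly as below. This is a minimal, self-contained illustration and not the code merged in this PR: the object name `BenchmarkDirSketch` and the helper `openResultFile` are hypothetical, and the sketch builds the file path with `new File(dir, name)` rather than string interpolation.

```scala
import java.io.{File, FileOutputStream}
import java.nio.file.Files

// Hypothetical helper for illustration only; not the actual BenchmarkBase code.
object BenchmarkDirSketch {
  def openResultFile(dir: File, resultFileName: String): FileOutputStream = {
    if (!dir.exists()) {
      // Show the resolved location explicitly, since it may depend on jar paths.
      // scalastyle:off println
      println(s"Creating ${dir.getAbsolutePath} for benchmark results.")
      // scalastyle:on println
      dir.mkdirs()
    }
    // Resolve the file name against the directory (assumption: the two-argument
    // File constructor is used here instead of string concatenation).
    val file = new File(dir, resultFileName)
    if (!file.exists()) {
      file.createNewFile()
    }
    new FileOutputStream(file)
  }
}
```

Printing the absolute path makes it easy to see where the results end up when the prefix is resolved differently than expected.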
ok to test
@srowen and @zhengruifeng FYI, from 9244066 and 5b77ebb. I think it was perfectly fine to include only the code without benchmark results, because it was a bit weird to upload results produced on machines with different specs. With the recent changes in #32015 and #32044, PR authors can now run the benchmarks on similar specifications very easily (https://spark.apache.org/developer-tools.html#github-workflow-benchmarks), so it makes more sense to include benchmark results in a PR :).
ok to test
@HyukjinKwon
Merged to master.
Thanks for your first contribution and congrats on becoming a contributor!
What changes were proposed in this pull request?
This PR fixes an error in BenchmarkBase.scala that occurs when creating a benchmark file in a non-existent directory.
Why are the changes needed?
When submitting a benchmark job using the org.apache.spark.benchmark.Benchmarks class with the SPARK_GENERATE_BENCHMARK_FILES=1 option, an exception is raised if the directory where the benchmark file will be generated does not exist.
For more information, please refer to SPARK-35266.
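The general JVM behavior behind this kind of failure can be illustrated with a small sketch: `java.io.File.createNewFile()` throws an `IOException` when the parent directory is missing. The object `MissingDirSketch` and the method `creationFails` are hypothetical names used only for this example, and the sketch assumes the benchmark code hits a comparable file-creation call.

```scala
import java.io.{File, IOException}
import java.nio.file.Files
import scala.util.Try

// Hypothetical names for illustration only; not part of Spark.
object MissingDirSketch {
  // Returns true if creating the file fails with an IOException,
  // which is what happens when the parent directory does not exist.
  def creationFails(file: File): Boolean =
    Try(file.createNewFile()).failed.toOption.exists(_.isInstanceOf[IOException])
}
```

Calling `creationFails` on a path under a missing directory returns true; after `mkdirs()` on the parent, the same call succeeds.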
Does this PR introduce any user-facing change?
No
How was this patch tested?
After building Spark, manually tested with the following command:
It successfully generated the benchmark result files.
Why it is sufficient:
As illustrated in the comments in Benchmarks.scala, the command below runs all benchmarks and generates the results:
Of all the benchmarks (55 in total), only BLASBenchmark fails because of this issue with the current code in the master branch. Thus, it is currently sufficient to test BLASBenchmark to validate this change.