Run TPC-H SF10 during PR benchmarks #9822

gruuya · 2024-03-27T13:24:10Z

Which issue does this PR close?

Progresses #5504.

Rationale for this change

Use a larger scale factor (SF) so that run-to-run variance is a smaller
fraction of each measurement, which reduces noise and false positives in PR benchmarks.

What changes are included in this PR?

Run TPC-H at SF 10 in PR benchmarks, in addition to SF 1.

Are these changes tested?

Are there any user-facing changes?

Dandandan · 2024-03-27T14:48:17Z

.github/workflows/pr_benchmarks.yml

          
-          # Setup the TPC-H data set with a scale factor of 10
+          # Setup the TPC-H data sets for scale factors 1 and 10
          ./bench.sh data tpch


Does it make sense to do SF=1?

gruuya · 2024-03-27

Fair point; I guess it can be beneficial for detecting minor systemic/non-linear regressions that are larger than the noise level, but smaller than the sensitivity of SF 10?

Dandandan · 2024-03-27T14:49:18Z

.github/workflows/pr_benchmarks.yml

          cd benchmarks

          ./bench.sh run tpch
+          ./bench.sh run tpch10


Perhaps we could run tpch10_mem as well if it doesn't run out of memory (OOM), or tpch_mem otherwise, which should have less variance.

gruuya · 2024-03-27

Yeah, I can add tpch10_mem as well.

gruuya · 2024-03-27

OK, I've added both tpch_mem10 and tpch_mem, so that we can observe and compare the noise level for each one.

Also distinguish the output file by the SF used.
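As a rough illustration of what "distinguish the output file by the SF used" could look like, here is a minimal sketch. The helper names `scale_factor_for` and `results_file_for`, and the file-name pattern, are hypothetical and are not taken from `bench.sh`; only the four benchmark names come from this thread.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: map a benchmark name from this PR to its scale factor.
scale_factor_for() {
  case "$1" in
    tpch|tpch_mem)      echo 1 ;;
    tpch10|tpch_mem10)  echo 10 ;;
    *) echo "unknown benchmark: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: build a results file name that embeds the SF,
# so SF 1 and SF 10 runs do not overwrite each other.
# The naming pattern is an assumption, not the one used by bench.sh.
results_file_for() {
  local bench="$1" sf
  sf="$(scale_factor_for "$bench")"
  case "$bench" in
    *_mem*) echo "tpch_mem_sf${sf}.json" ;;
    *)      echo "tpch_sf${sf}.json" ;;
  esac
}

# Print the mapping for the four benchmarks discussed above.
for bench in tpch tpch10 tpch_mem tpch_mem10; do
  echo "${bench} -> $(results_file_for "${bench}")"
done
```

With distinct file names per scale factor, the SF 1 and SF 10 results can be compared side by side to judge the noise level of each.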

* Run TPC-H SF10 during PR benchmarks
* Add memory benchmarks to the workflow

  Also distinguish the output file by the SF used.

Run TPC-H SF10 during PR benchmarks

99f1ea6

Dandandan reviewed Mar 27, 2024

Add memory benchmarks to the workflow

bc5da25

Also distinguish the output file by the SF used.

Dandandan approved these changes Mar 27, 2024

Dandandan merged commit 7f4b338 into apache:main Mar 27, 2024

gruuya deleted the pr-bench-tpch-sf10 branch March 28, 2024 07:49

alamb mentioned this pull request Mar 29, 2024

Minor: Remove TPCH scale factor 10 from PR benchmark checks #9856

Closed

Lordworms pushed a commit to Lordworms/arrow-datafusion that referenced this pull request Apr 1, 2024

Run TPC-H SF10 during PR benchmarks (apache#9822)

21c7254

