
@zhuqi-lucas zhuqi-lucas commented Jul 3, 2025

Which issue does this PR close?

Currently the benchmark script only has sort_tpch. While optimizing sort recently, I found that sort_tpch10 sometimes shows more stable results, so I added it to the benchmark script as well.

cc @alamb @Dandandan

Rationale for this change

I found that sort_tpch10 sometimes gives more accurate results when we optimize the merge phase. Our in-memory sort buffer is 1MB, so sort_tpch performs fewer merge comparisons and the effect of a merge change shows up less clearly there. I added sort_tpch10 to bench.sh and hope it will be helpful.
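
To make the reasoning above concrete, here is a rough back-of-the-envelope sketch. It is a toy model, not a description of DataFusion's actual sort implementation: it assumes every buffer-sized chunk becomes one sorted run and that all runs are combined in a single k-way merge costing about log2(k) comparisons per row. The row width (BYTES_PER_ROW) and the function name are hypothetical; the 1MB buffer comes from the description above, the SF10 row count comes from the benchmark output in this PR, and the SF1 row count is approximated as one tenth of it.

```python
import math


def estimated_merge_comparisons(num_rows: int, bytes_per_row: int, buffer_bytes: int) -> int:
    """Toy estimate of comparisons for one k-way merge of buffer-sized sorted runs."""
    total_bytes = num_rows * bytes_per_row
    runs = max(1, math.ceil(total_bytes / buffer_bytes))
    if runs == 1:
        # Everything fits in one buffer, so no merge is needed.
        return 0
    # A tournament-tree merge needs about ceil(log2(runs)) comparisons per output row.
    return num_rows * math.ceil(math.log2(runs))


BUFFER_BYTES = 1024 * 1024   # 1MB in-memory sort buffer, as stated in the PR description
BYTES_PER_ROW = 100          # hypothetical row width, for illustration only
SF10_ROWS = 59_986_052       # row count reported in the sort_tpch10 output
SF1_ROWS = SF10_ROWS // 10   # approximation: SF1 holds about one tenth of the rows

sf1_total = estimated_merge_comparisons(SF1_ROWS, BYTES_PER_ROW, BUFFER_BYTES)
sf10_total = estimated_merge_comparisons(SF10_ROWS, BYTES_PER_ROW, BUFFER_BYTES)

# Comparisons per row under the toy model.
sf1_per_row = sf1_total // SF1_ROWS
sf10_per_row = sf10_total // SF10_ROWS

print(f"SF1:  about {sf1_per_row} comparisons per row, {sf1_total} in total")
print(f"SF10: about {sf10_per_row} comparisons per row, {sf10_total} in total")
```

Under these assumed inputs the larger dataset has both more rows and more comparisons per row, which is one way to read why a change to the merge path may be easier to observe at scale factor 10.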


What changes are included in this PR?

This PR adds sort_tpch10 to bench.sh, so the sort-tpch benchmark can also be run against the TPC-H scale factor 10 data (tpch_sf10).

Are these changes tested?

Yes

./bench.sh run sort_tpch10
***************************
DataFusion Benchmark Script
COMMAND: run
BENCHMARK: sort_tpch10
QUERY: All
DATAFUSION_DIR: /Users/zhuqi/arrow-datafusion/benchmarks/..
BRANCH_NAME: sort_tpch_10_benchmark_support
DATA_DIR: /Users/zhuqi/arrow-datafusion/benchmarks/data
RESULTS_DIR: /Users/zhuqi/arrow-datafusion/benchmarks/results/sort_tpch_10_benchmark_support
CARGO_COMMAND: cargo run --release
PREFER_HASH_JOIN: true
***************************
RESULTS_FILE: /Users/zhuqi/arrow-datafusion/benchmarks/results/sort_tpch_10_benchmark_support/sort_tpch10.json
Running sort tpch benchmark...
+ cargo run --release --bin dfbench -- sort-tpch --iterations 5 --path /Users/zhuqi/arrow-datafusion/benchmarks/data/tpch_sf10 -o /Users/zhuqi/arrow-datafusion/benchmarks/results/sort_tpch_10_benchmark_support/sort_tpch10.json
    Finished `release` profile [optimized] target(s) in 0.35s
     Running `/Users/zhuqi/arrow-datafusion/target/release/dfbench sort-tpch --iterations 5 --path /Users/zhuqi/arrow-datafusion/benchmarks/data/tpch_sf10 -o /Users/zhuqi/arrow-datafusion/benchmarks/results/sort_tpch_10_benchmark_support/sort_tpch10.json`
Q1 iteration 0 took 1554.1 ms and returned 59986052 rows
Q1 iteration 1 took 1532.3 ms and returned 59986052 rows
Q1 iteration 2 took 1518.5 ms and returned 59986052 rows
Q1 iteration 3 took 1528.3 ms and returned 59986052 rows
Q1 iteration 4 took 1533.0 ms and returned 59986052 rows
Q1 avg time: 1533.22 ms
Q2 iteration 0 took 1406.7 ms and returned 59986052 rows
Q2 iteration 1 took 1431.7 ms and returned 59986052 rows
Q2 iteration 2 took 1398.6 ms and returned 59986052 rows
Q2 iteration 3 took 1394.7 ms and returned 59986052 rows
Q2 iteration 4 took 1414.2 ms and returned 59986052 rows
Q2 avg time: 1409.18 ms
Q3 iteration 0 took 6564.8 ms and returned 59986052 rows
Q3 iteration 1 took 6535.8 ms and returned 59986052 rows
Q3 iteration 2 took 6638.9 ms and returned 59986052 rows
Q3 iteration 3 took 6677.1 ms and returned 59986052 rows
Q3 iteration 4 took 6712.8 ms and returned 59986052 rows
Q3 avg time: 6625.86 ms
Q4 iteration 0 took 1963.8 ms and returned 59986052 rows
Q4 iteration 1 took 1967.9 ms and returned 59986052 rows
Q4 iteration 2 took 1931.0 ms and returned 59986052 rows
Q4 iteration 3 took 1925.3 ms and returned 59986052 rows
Q4 iteration 4 took 1946.5 ms and returned 59986052 rows
Q4 avg time: 1946.88 ms
Q5 iteration 0 took 2431.6 ms and returned 59986052 rows
Q5 iteration 1 took 2472.9 ms and returned 59986052 rows
Q5 iteration 2 took 2504.5 ms and returned 59986052 rows
Q5 iteration 3 took 2485.8 ms and returned 59986052 rows
Q5 iteration 4 took 2350.9 ms and returned 59986052 rows
Q5 avg time: 2449.12 ms
Q6 iteration 0 took 2623.9 ms and returned 59986052 rows
Q6 iteration 1 took 2580.0 ms and returned 59986052 rows
Q6 iteration 2 took 2622.9 ms and returned 59986052 rows
Q6 iteration 3 took 2622.1 ms and returned 59986052 rows
Q6 iteration 4 took 2579.9 ms and returned 59986052 rows
Q6 avg time: 2605.76 ms
Q7 iteration 0 took 4385.8 ms and returned 59986052 rows
Q7 iteration 1 took 4385.7 ms and returned 59986052 rows
Q7 iteration 2 took 4233.3 ms and returned 59986052 rows
Q7 iteration 3 took 4209.4 ms and returned 59986052 rows
Q7 iteration 4 took 4233.9 ms and returned 59986052 rows
Q7 avg time: 4289.63 ms
Q8 iteration 0 took 2797.1 ms and returned 59986052 rows
Q8 iteration 1 took 2781.4 ms and returned 59986052 rows
Q8 iteration 2 took 2882.1 ms and returned 59986052 rows
Q8 iteration 3 took 2784.8 ms and returned 59986052 rows
Q8 iteration 4 took 2883.5 ms and returned 59986052 rows
Q8 avg time: 2825.80 ms
Q9 iteration 0 took 2897.6 ms and returned 59986052 rows
Q9 iteration 1 took 3006.3 ms and returned 59986052 rows
Q9 iteration 2 took 2968.0 ms and returned 59986052 rows
Q9 iteration 3 took 2965.9 ms and returned 59986052 rows
Q9 iteration 4 took 2964.7 ms and returned 59986052 rows
Q9 avg time: 2960.51 ms
Q10 iteration 0 took 7662.0 ms and returned 59986052 rows
Q10 iteration 1 took 7396.4 ms and returned 59986052 rows
Q10 iteration 2 took 8013.9 ms and returned 59986052 rows
Q10 iteration 3 took 7553.1 ms and returned 59986052 rows
Q10 iteration 4 took 6845.6 ms and returned 59986052 rows
Q10 avg time: 7494.20 ms
Q11 iteration 0 took 3567.1 ms and returned 59986052 rows
Q11 iteration 1 took 3424.9 ms and returned 59986052 rows
Q11 iteration 2 took 3425.2 ms and returned 59986052 rows
Q11 iteration 3 took 3375.3 ms and returned 59986052 rows
Q11 iteration 4 took 3357.2 ms and returned 59986052 rows
Q11 avg time: 3429.94 ms
+ set +x
Done
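
Since the motivation for this benchmark is result stability, it can help to look at the per-query spread as well as the averages. The following is a minimal sketch that parses lines in the format printed above and reports the mean and the relative spread per query. The function name summarize and the spread metric, (max - min) / mean, are illustrative choices and are not part of bench.sh or dfbench; the sample lines are copied from the output above.

```python
import re
import statistics
from collections import defaultdict

# Matches lines such as "Q1 iteration 0 took 1554.1 ms and returned 59986052 rows".
LINE_RE = re.compile(r"^(Q\d+) iteration \d+ took ([\d.]+) ms")


def summarize(log_text: str) -> dict:
    """Return {query: (mean_ms, spread_percent)} for every query found in the log."""
    timings = defaultdict(list)
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            timings[match.group(1)].append(float(match.group(2)))
    summary = {}
    for query, ms in timings.items():
        mean = statistics.mean(ms)
        # Relative spread: distance between slowest and fastest run, as a percentage of the mean.
        spread = (max(ms) - min(ms)) / mean * 100
        summary[query] = (mean, spread)
    return summary


SAMPLE_LOG = """\
Q1 iteration 0 took 1554.1 ms and returned 59986052 rows
Q1 iteration 1 took 1532.3 ms and returned 59986052 rows
Q1 iteration 2 took 1518.5 ms and returned 59986052 rows
Q1 iteration 3 took 1528.3 ms and returned 59986052 rows
Q1 iteration 4 took 1533.0 ms and returned 59986052 rows
Q1 avg time: 1533.22 ms
Q10 iteration 0 took 7662.0 ms and returned 59986052 rows
Q10 iteration 1 took 7396.4 ms and returned 59986052 rows
Q10 iteration 2 took 8013.9 ms and returned 59986052 rows
Q10 iteration 3 took 7553.1 ms and returned 59986052 rows
Q10 iteration 4 took 6845.6 ms and returned 59986052 rows
Q10 avg time: 7494.20 ms
"""

summary = summarize(SAMPLE_LOG)
for query, (mean, spread) in summary.items():
    print(f"{query}: mean {mean:.2f} ms, spread {spread:.1f}%")
```

The "avg time" lines are skipped by the pattern, so only individual iterations contribute. Because the log prints each iteration rounded to 0.1 ms, a recomputed mean can differ slightly from the reported average.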

Are there any user-facing changes?

No

@Dandandan Dandandan merged commit 3ca09a6 into apache:main Jul 3, 2025
29 checks passed
alamb commented Jul 3, 2025

🚀

alamb commented Jul 3, 2025

Thanks @zhuqi-lucas and @Dandandan
