Skip to content

Conversation

@mbutrovich
Copy link
Contributor

@mbutrovich mbutrovich commented Jan 10, 2026

Which issue does this PR close?

Closes #.

Rationale for this change

  • We're busy-polling in executePlan, but this only seems necessary when CometScan is the data source. Operators that implement proper async I/O, like IcebergScanExec and DataSourceExec, should not be busy-polled and in the case of the former it creates a ton of unnecessary replace_waker calls in traces I'm looking at.
flamegraph

What changes are included in this PR?

  • Conditionally yield and only busy-poll if there are CometScan operators in the plan.

How are these changes tested?

  • Existing tests

@mbutrovich mbutrovich marked this pull request as draft January 10, 2026 15:36
@mbutrovich
Copy link
Contributor Author

@andygrove do you mind benchmarking this whenever you get a moment?

@mbutrovich mbutrovich marked this pull request as ready for review January 10, 2026 15:44
@codecov-commenter
Copy link

codecov-commenter commented Jan 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 59.50%. Comparing base (f09f8af) to head (b8b8c0d).
⚠️ Report is 834 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3063      +/-   ##
============================================
+ Coverage     56.12%   59.50%   +3.38%     
- Complexity      976     1382     +406     
============================================
  Files           119      167      +48     
  Lines         11743    15524    +3781     
  Branches       2251     2575     +324     
============================================
+ Hits           6591     9238    +2647     
- Misses         4012     4989     +977     
- Partials       1140     1297     +157     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@sqlbenchmark
Copy link

Comet TPC-H Benchmark Results

Commit: b8b8c0d - Don't busy-poll tokio stream for plans without CometScan.
Scale Factor: SF100
Iterations: 1

Query Times

Query Time (s) Query Time (s)
Q1 10.60 Q12 6.75
Q2 5.86 Q13 6.82
Q3 9.49 Q14 3.58
Q4 11.30 Q15 7.22
Q5 18.43 Q16 4.47
Q6 2.51 Q17 31.82
Q7 11.71 Q18 33.07
Q8 24.02 Q19 6.92
Q9 37.24 Q20 6.58
Q10 10.15 Q21 45.38
Q11 4.24 Q22 4.95

Total Time: 303.10 seconds

Spark Configuration
Setting Value
Spark Master local[*]
Driver Memory 32G
Driver Cores 8
Executor Memory 32G
Executor Cores 8
Off-Heap Enabled true
Off-Heap Size 24g
Shuffle Manager CometShuffleManager
Comet Replace SMJ true

Automated benchmark run by dfbench

@apache apache deleted a comment from sqlbenchmark Jan 10, 2026
@andygrove
Copy link
Member

@sqlbenchmark run tpch --conf spark.comet.scan.impl=native_datafusion

@sqlbenchmark
Copy link

Comet TPC-H Benchmark Results

Commit: b8b8c0d - Don't busy-poll tokio stream for plans without CometScan.
Scale Factor: SF100
Iterations: 1

Query Times

Query Time (s) Query Time (s)
Q1 9.37 Q12 6.09
Q2 5.69 Q13 6.51
Q3 8.61 Q14 3.02
Q4 10.71 Q15 6.05
Q5 17.67 Q16 4.71
Q6 1.85 Q17 30.44
Q7 11.24 Q18 32.15
Q8 23.12 Q19 5.75
Q9 35.21 Q20 6.00
Q10 9.43 Q21 44.41
Q11 4.12 Q22 4.82

Total Time: 286.96 seconds

Spark Configuration
Setting Value
Spark Master local[*]
Driver Memory 32G
Driver Cores 8
Executor Memory 32G
Executor Cores 8
Off-Heap Enabled true
Off-Heap Size 24g
Shuffle Manager CometShuffleManager
Comet Replace SMJ true

Automated benchmark run by dfbench

@andygrove
Copy link
Member

@sqlbenchmark invalid command

@mbutrovich mbutrovich merged commit 3a6452b into apache:main Jan 10, 2026
130 checks passed
@mbutrovich mbutrovich deleted the busy_poll_fix branch January 10, 2026 20:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants