Allow Spark partial / Comet final for compatible aggregates by Shekharrajak · Pull Request #2994 · apache/datafusion-comet

Shekharrajak · 2025-12-27T09:35:54Z

Which issue does this PR close?

Rationale for this change

Comet currently falls back to Spark for ALL final hash aggregates when there's no Comet partial aggregate in the child plan. This is overly conservative because some aggregates have compatible intermediate buffer formats between Spark and Comet.
For example, MIN, MAX, COUNT, and bitwise aggregates (BIT_AND, BIT_OR, BIT_XOR) have simple intermediate buffers (single value) that are compatible between Spark and Comet. These can safely run with "Spark partial / Comet final" execution.
Other aggregates like SUM, AVG, VARIANCE, etc. have known incompatibilities (e.g., decimal overflow handling differences, complex intermediate buffers) and should continue to fall back when there's no Comet partial aggregate.
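
To make the buffer-compatibility argument concrete, here is a toy sketch of two-stage aggregation. It is not Spark's or Comet's actual buffer layout; every name in it (`MixedAggregateSketch`, `AvgBuffer`, the `partial*`/`final*` helpers) is hypothetical. It only illustrates that MIN and COUNT exchange a single value between stages, while an aggregate such as AVG must exchange a multi-field buffer whose exact representation both engines would have to agree on.

```scala
object MixedAggregateSketch {
  // Partial stage: each partition reduces its rows to a single buffer value.
  def partialMin(rows: Seq[Int]): Option[Int] = rows.reduceOption(_ min _)
  def partialCount(rows: Seq[Int]): Long = rows.size.toLong

  // Final stage: merging single-value buffers is itself a simple reduction,
  // so it does not matter which engine produced them.
  def finalMin(buffers: Seq[Option[Int]]): Option[Int] =
    buffers.flatten.reduceOption(_ min _)
  def finalCount(buffers: Seq[Long]): Long = buffers.sum

  // Illustrative multi-field buffer: the final stage must understand every
  // field (types, null/overflow conventions) exactly as the partial stage wrote it.
  final case class AvgBuffer(sum: Double, count: Long)

  def partialAvg(rows: Seq[Int]): AvgBuffer =
    AvgBuffer(rows.map(_.toDouble).sum, rows.size.toLong)

  def finalAvg(buffers: Seq[AvgBuffer]): Option[Double] = {
    val total = buffers.map(_.sum).sum
    val n = buffers.map(_.count).sum
    if (n == 0) None else Some(total / n)
  }
}
```

In this model, the "Spark partial / Comet final" case corresponds to the `partial*` functions running in one engine and the `final*` functions in the other.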

What changes are included in this PR?

Added a supportsSparkPartialCometFinal method to the CometAggregateExpressionSerde trait. The default is false.

Added a helper function, aggSupportsMixedExecution(), in QueryPlanSerde.
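
The opt-in pattern described above could look roughly like the sketch below. Only the names `supportsSparkPartialCometFinal` and `aggSupportsMixedExecution` come from this PR; the trait `AggregateSerde`, the serde objects, the `serdes` registry, and the string-based signature are simplified, hypothetical stand-ins, not the real Comet types or signatures.

```scala
object MixedExecutionSketch {
  // Hypothetical, simplified stand-in for CometAggregateExpressionSerde.
  trait AggregateSerde {
    def name: String
    // Conservative default: an aggregate must explicitly opt in.
    def supportsSparkPartialCometFinal: Boolean = false
  }

  object MinSerde extends AggregateSerde {
    val name = "min"
    override def supportsSparkPartialCometFinal: Boolean = true
  }
  object MaxSerde extends AggregateSerde {
    val name = "max"
    override def supportsSparkPartialCometFinal: Boolean = true
  }
  object CountSerde extends AggregateSerde {
    val name = "count"
    override def supportsSparkPartialCometFinal: Boolean = true
  }
  // SUM keeps the default (false) and therefore still falls back.
  object SumSerde extends AggregateSerde { val name = "sum" }

  // Hypothetical registry keyed by function name.
  val serdes: Map[String, AggregateSerde] =
    Seq(MinSerde, MaxSerde, CountSerde, SumSerde).map(s => s.name -> s).toMap

  // Mixed execution is allowed only if every aggregate in the operator opts in;
  // unknown aggregates are treated as unsupported.
  def aggSupportsMixedExecution(aggNames: Seq[String]): Boolean =
    aggNames.forall(n => serdes.get(n).exists(_.supportsSparkPartialCometFinal))
}
```

Defaulting to false means a newly added aggregate is never run in mixed mode by accident; each one has to be reviewed and enabled individually.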

How are these changes tested?

"CometExecRule should not allow Spark partial and Comet final for unsafe aggregates" - Verifies SUM still falls back to Spark

"CometExecRule should allow Spark partial and Comet final for safe aggregates" - Verifies MIN/MAX/COUNT can use Comet final with Spark partial

codecov-commenter · 2025-12-27T15:42:03Z

Codecov Report

❌ Patch coverage is 53.84615% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 54.58%. Comparing base (f09f8af) to head (51869b1).
⚠️ Report is 803 commits behind head on main.

Files with missing lines	Patch %	Lines
...main/scala/org/apache/comet/serde/aggregates.scala	50.00%	3 Missing ⚠️
.../scala/org/apache/comet/serde/QueryPlanSerde.scala	33.33%	1 Missing and 1 partial ⚠️
...n/scala/org/apache/spark/sql/comet/operators.scala	66.66%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #2994      +/-   ##
============================================
- Coverage     56.12%   54.58%   -1.54%     
- Complexity      976     1256     +280     
============================================
  Files           119      167      +48     
  Lines         11743    15505    +3762     
  Branches       2251     2571     +320     
============================================
+ Hits           6591     8464    +1873     
- Misses         4012     5822    +1810     
- Partials       1140     1219      +79

andygrove · 2026-03-16T13:55:43Z

Sorry for the late review @Shekharrajak. This LGTM except for the missing end-to-end tests for bitwise aggregates that @parthchandra already stated.

I will go ahead and add those tests and push to this branch if permissions allow, or create a new branch from this one.

andygrove · 2026-03-16T14:00:15Z

Something is wrong with the git history on this branch so I cannot rebase or upmerge.

@Shekharrajak let me know if you are still interested in working on this. If not, I will create a new PR based on your changes.

Shekharrajak · 2026-03-16T18:17:36Z

Thanks for checking. Let me work on this.

andygrove

LGTM pending CI. Thanks @Shekharrajak

andygrove · 2026-03-27T20:20:02Z

@Shekharrajak golden files now need updating because more aggregates can run natively 🎉

Shekharrajak · 2026-03-28T07:31:52Z

Regenerated the golden files using:

 ./dev/regenerate-golden-files.sh --spark-version 4.0

 ./dev/regenerate-golden-files.sh --spark-version 3.5

andygrove · 2026-03-28T15:03:05Z

This seems to have exposed a bug:

[info] - group-by.sql *** FAILED *** (6 seconds, 481 milliseconds)
[info]   group-by.sql
[info]   Expected Some("struct<count(1):bigint>"), but got Some("struct<>") Schema did not match for query #82
[info]   SELECT count(*)
[info]   FROM VALUES (ARRAY(MAP(1, 2, 2, 3), MAP(1, 3))), (ARRAY(MAP(2, 3), MAP(1, 3))), (ARRAY(MAP(2, 3, 1, 2), MAP(1, 3))) as t(a)
[info]   GROUP BY a: -- !query
[info]   SELECT count(*)
[info]   FROM VALUES (ARRAY(MAP(1, 2, 2, 3), MAP(1, 3))), (ARRAY(MAP(2, 3), MAP(1, 3))), (ARRAY(MAP(2, 3, 1, 2), MAP(1, 3))) as t(a)
[info]   GROUP BY a
[info]   -- !query schema
[info]   struct<>
[info]   -- !query output
[info]   org.apache.comet.CometNativeException
[info]   Not yet implemented: Row format support not yet implemented for: [SortField { options: SortOptions { descending: false, nulls_first: true }, data_type: List(Field { data_type: Map(Field { name: "entries", data_type: Struct([Field { name: "key", data_type: Int32 }, Field { name: "value", data_type: Int32 }]) }, false) }) }] (SQLQueryTestSuite.scala:679)

Shekharrajak · 2026-03-29T07:48:20Z

This failure looks connected to the open ticket #2837: the test case groups by an array of maps.

Let me try a fix that handles array-of-map groupingExpressions.

Shekharrajak · 2026-03-29T10:08:09Z

The previous check only rejected top-level MapType grouping expressions.
This caused a CometNativeException crash when grouping by ARRAY(MAP(...))
because Comet native cannot handle 'Row format support not yet implemented
for List(Map(...))'.

Fix by recursively checking for MapType within ArrayType and StructType.
This ensures we correctly fall back to Spark for any grouping expression
that contains a MapType at any nesting level.

Shekharrajak · 2026-03-30T07:39:21Z

Locally, the CometSqlFileTestSuite and AdaptiveQueryExecSuite tests all pass now.

Shekharrajak · 2026-03-30T07:40:47Z

-        })) {
+    def containsMapType(dt: DataType): Boolean = dt match {
+      case _: MapType => true
+      case a: ArrayType => containsMapType(a.elementType)

Some queries group by an array of maps, so this recursively checks for MapType within ArrayType and StructType.
This ensures we correctly fall back to Spark for any grouping expression
that contains a MapType at any nesting level.
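
The diff excerpt above is truncated, so here is a self-contained sketch of the recursive check as described. The `DataType` hierarchy below is a hypothetical miniature of Spark's `org.apache.spark.sql.types`, defined only so the example runs without Spark on the classpath; the `StructType` branch is an assumption based on the description, not the exact code from the PR.

```scala
object MapTypeCheck {
  // Hypothetical miniature stand-ins for Spark's type classes.
  sealed trait DataType
  case object IntegerType extends DataType
  final case class ArrayType(elementType: DataType) extends DataType
  final case class MapType(keyType: DataType, valueType: DataType) extends DataType
  final case class StructField(name: String, dataType: DataType)
  final case class StructType(fields: Seq[StructField]) extends DataType

  // True if a MapType appears at any nesting level of the given type.
  def containsMapType(dt: DataType): Boolean = dt match {
    case _: MapType    => true
    case a: ArrayType  => containsMapType(a.elementType)
    case s: StructType => s.fields.exists(f => containsMapType(f.dataType))
    case _             => false
  }
}
```

With this check, a grouping key of type `ArrayType(MapType(IntegerType, IntegerType))`, as in the failing `group-by.sql` query, is rejected up front and the operator falls back to Spark.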

andygrove · 2026-03-30T13:39:13Z

Looks like some Spark diffs now need updating:

2026-03-30T10:34:11.5550470Z [info] - SPARK-29894 test Codegen Stage Id in SparkPlanInfo *** FAILED *** (12 milliseconds)
2026-03-30T10:34:11.5552926Z [info]   "[CometNativeColumnarToRow]" did not equal "[WholeStageCodegen (2)]" (SQLAppStatusListenerSuite.scala:706)

Shekharrajak · 2026-03-30T16:37:17Z

Regenerate Spark 3.4 golden files for plan stability tests

Shekharrajak · 2026-03-31T06:35:50Z

I'm struggling a bit to get all checks passing in GitHub CI. When I run the tests locally to verify everything, I keep missing one test or another.

Shekharrajak · 2026-03-31T07:39:38Z

The tests below assert on Spark-internal AQE optimization behavior (empty relation propagation, partition coalescing) that legitimately doesn't apply when Comet's native operators are in the plan. The IgnoreCometNativeDataFusion annotations skip these tests only in native_datafusion mode.

SPARK-35442: Support propagate empty relation through aggregate
SPARK-35442: Support propagate empty relation through union
SPARK-34980: Support coalesce partition through union

Shekharrajak · 2026-04-01T19:11:11Z

Debugging a CI check failure:

df1: count(DISTINCT 2), count(DISTINCT 2, 3) | [1, 1] -- PASS
df2: count(DISTINCT 2), count(DISTINCT 3, 2) | [2, 2] -- FAIL

Seems related to #3835

Shekharrajak · 2026-04-02T07:41:27Z

Found an issue in the CI checks: #3881

Shekharrajak · 2026-04-03T17:53:06Z

Looks fine now.

andygrove · 2026-04-25T14:47:10Z

There is a newer PR, #4015, which was partly inspired by this one, so I will close this one. Thanks again @Shekharrajak for working on this.

Shekharrajak force-pushed the fix/issue-2894-aggregate-fallback branch from f2e6748 to 51869b1 Compare December 27, 2025 09:42

parthchandra reviewed Jan 9, 2026

Comment thread spark/src/test/scala/org/apache/comet/rules/CometExecRuleSuite.scala

Shekharrajak force-pushed the fix/issue-2894-aggregate-fallback branch from 91e0de7 to 274f38b Compare January 22, 2026 16:19

Shekharrajak force-pushed the fix/issue-2894-aggregate-fallback branch 2 times, most recently from 863ba03 to 141ba57 Compare March 16, 2026 19:20

andygrove approved these changes Mar 17, 2026

Shekharrajak force-pushed the fix/issue-2894-aggregate-fallback branch from 4eaef6b to a8d6c0f Compare March 28, 2026 06:22

Shekharrajak force-pushed the fix/issue-2894-aggregate-fallback branch from ff7bcf9 to b5e0c71 Compare March 30, 2026 07:31

Shekharrajak commented Mar 30, 2026

Shekharrajak force-pushed the fix/issue-2894-aggregate-fallback branch from 4eec927 to 34bee9b Compare March 30, 2026 16:17

Shekharrajak force-pushed the fix/issue-2894-aggregate-fallback branch from 34bee9b to ba19751 Compare March 31, 2026 03:28

Shekharrajak force-pushed the fix/issue-2894-aggregate-fallback branch 5 times, most recently from 8d4b8ae to 3089f35 Compare April 1, 2026 16:51

Shekharrajak force-pushed the fix/issue-2894-aggregate-fallback branch from 3089f35 to 91684b2 Compare April 1, 2026 17:11

Shekharrajak added 10 commits April 1, 2026 22:46

Allow Spark partial / Comet final for compatible aggregates

0a1871e

Add unit tests for aggSupportsMixedExecution including bitwise aggregates

d33a1a4

Fix build errors: Remove unused imports and fix feature flag conflicts

d88a71b

Add bitwise aggregate mixed execution test

5db1fed

minor change

5acbdde

chore: update Spark 3.5 golden files for mixed aggregate execution

676ec35

chore: update Spark 4.0 goldens

53aea3d

fix: reject grouping on nested types containing maps (ARRAY(MAP(...)), STRUCT containing MAP)

6f8e4e1

Regenerate Spark 3.4 golden files for plan stability tests

11500f1

Fix CI: remove stale aggregate filter check, add AQE test exclusions for all Spark versions

74ddc04

Shekharrajak force-pushed the fix/issue-2894-aggregate-fallback branch from 91684b2 to 74ddc04 Compare April 1, 2026 17:20

Shekharrajak added 2 commits April 2, 2026 10:01

Ignore codegen-dependent Spark tests that fail when Comet is enabled

0a86919

Disable Comet for in-count-bug and aggregates_part3 tests on Spark 4.0, add regression tests

8748a1e

Merge remote-tracking branch 'upstream/main' into fix/issue-2894-aggregate-fallback

a660456

Conflicts: dev/diffs/3.4.3.diff, dev/diffs/4.0.1.diff, spark/src/test/scala/org/apache/comet/exec/CometExecSuite.scala

Shekharrajak force-pushed the fix/issue-2894-aggregate-fallback branch from 85872c9 to a660456 Compare April 3, 2026 03:58

Merge branch 'main' into fix/issue-2894-aggregate-fallback

54aacac

andygrove mentioned this pull request Apr 21, 2026

fix: allow safe mixed Spark/Comet partial/final aggregate execution #4015

Draft

andygrove closed this Apr 25, 2026
