Describe the bug
On Spark 4.1.1 with Comet enabled, two SQLQueryTestSuite queries return incorrect results. The same .sql and golden .out files pass on Spark 4.0.2.
except-all.sql query #22
SELECT v FROM tab3 GROUP BY v
EXCEPT ALL
SELECT k FROM tab4 GROUP BY k
Expected output: 3. Actual output: 2\n3 (one extra row).
intersect-all.sql query #15
SELECT v FROM tab1 GROUP BY v
INTERSECT ALL
SELECT k FROM tab2 GROUP BY k
Expected output: 2\n3\nNULL. Actual output: empty result.
Steps to reproduce
Run Spark 4.1.1's SQL test suite with Comet enabled (the Spark SQL Tests matrix entry for 4.1.1). Both files fail in SQLQueryTestSuite.
Expected behavior
Comet should produce the same EXCEPT ALL / INTERSECT ALL results as Spark.
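For reference, the expected semantics can be modeled as multiset operations. The sketch below is illustrative only and is not Spark or Comet code. The helper names (`except_all`, `intersect_all`, `_nulls_last`) and the input lists are hypothetical: the lists stand in for the distinct values each GROUP BY side would produce, and they are chosen so the results match the golden output quoted above. They are not taken from the test tables. It assumes NULLs compare equal to each other in set operations, modeled here with `None`.

```python
from collections import Counter


def _nulls_last(value):
    # Sort key for display only: real values first, None (NULL) last.
    return (value is None, 0 if value is None else value)


def except_all(left, right):
    # Multiset difference: each right row cancels at most one equal left row.
    remaining = Counter(left) - Counter(right)
    return sorted(remaining.elements(), key=_nulls_last)


def intersect_all(left, right):
    # Multiset intersection: keep each row min(left count, right count) times.
    common = Counter(left) & Counter(right)
    return sorted(common.elements(), key=_nulls_last)


# Hypothetical stand-ins for the grouped columns (one entry per group).
tab3_v = [2, 3]
tab4_k = [1, 2]
tab1_v = [2, 3, None]
tab2_k = [1, 2, 3, None]

except_result = except_all(tab3_v, tab4_k)        # [3]
intersect_result = intersect_all(tab1_v, tab2_k)  # [2, 3, None]
```

Under this model, the extra row 2 in the EXCEPT ALL result means a matching right-side row failed to cancel its left-side counterpart, and the empty INTERSECT ALL result means no rows were matched across the two sides at all.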
Workaround
Both files currently run with Comet disabled: dev/diffs/4.1.1.diff adds --SET spark.comet.enabled = false at the top of each file.
Additional context
The input .sql files and golden .out files are byte-identical between Spark 4.0.2 and 4.1.1, so the regression is in either Spark planner/optimizer behavior or in Comet's interaction with it on 4.1. PR #4093 enables Spark 4.1.1 in the Spark SQL Tests workflow.