Backend
VL (Velox)
Gluten version: main branch
Description
Spark 4.1 introduced memory-based shuffle spill thresholds (SPARK-49386, JIRA type: Improvement). The new `spillSizeThreshold` parameter enables spilling by data size rather than only by row count. Gluten's shuffle implementation does not support this threshold.
Spark 4.1 only.
Parent issue: #11910 ([VL] Spark 4.x: Tracking new feature support)
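To make the gap concrete, here is a minimal sketch of a spill decision that considers buffered bytes as well as row count. It is illustrative only: `SpillPolicy`, `num_rows_threshold`, and `should_spill` are hypothetical names, and the `>=` comparison and the "either limit triggers a spill" rule are assumptions, not Spark's or Gluten's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class SpillPolicy:
    """Hypothetical spill policy; not Spark's or Gluten's real API."""

    num_rows_threshold: int    # row-count limit (the pre-4.1 style check)
    spill_size_threshold: int  # byte limit, modeled on spillSizeThreshold

    def should_spill(self, buffered_rows: int, buffered_bytes: int) -> bool:
        # Assumption: reaching either limit is enough to trigger a spill.
        return (
            buffered_rows >= self.num_rows_threshold
            or buffered_bytes >= self.spill_size_threshold
        )


policy = SpillPolicy(num_rows_threshold=1000, spill_size_threshold=1024)

# A few very wide rows: under the row limit, but over the byte limit.
few_wide_rows = policy.should_spill(buffered_rows=10, buffered_bytes=2048)
# A small buffer that is under both limits.
small_buffer = policy.should_spill(buffered_rows=10, buffered_bytes=100)
```

Under these assumptions, a buffer of a few very wide rows spills on the size check even though the row count is low. A backend that ignores the size threshold would never spill in that case.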
Impact
| Suite | Exclude | spark40 | spark41 |
|---|---|---|---|
| GlutenDataFrameWindowFunctionsSuite | SPARK-49386 spill | 🟢 | 🔴 |
| GlutenJoinSuite | SPARK-49386 SortMergeJoin spill | 🟢 | 🔴 |
Note: `GlutenSQLWindowFunctionSuite` has a pre-existing spill issue ("low buffer spill threshold") that is unrelated to SPARK-49386 and out of scope for this issue.
References