[GLUTEN-10544] Remove unnecessary method separateScanRDD#10545
Merged
zml1206 merged 2 commits intoapache:mainfrom Sep 9, 2025
Run Gluten Clickhouse CI on x86
 * 1. SCAN with clickhouse backend (check
 * 2. test case where query plan is constructed from simple dataframes (e.g.
 * GlutenDataFrameAggregateSuite) in these cases, separate RDDs takes care of SCAN as a
 * result, genFinalStageIterator rather than genFirstStageIterator will be invoked
Contributor
What's the change?
Contributor
Author
I just want to replace ColumnarCollapseTransformStages#separateScanRDD() with BackendsApiManager.getSettings.excludeScanExecFromCollapsedStage(), but spotless made this change.
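A minimal sketch of the kind of replacement described above, assuming the removed method was only a thin wrapper around the backend setting. Every type here (BackendSettings, ClickHouseLikeSettings, DefaultLikeSettings, CollapseStagesBefore, CollapseStagesAfter) is a hypothetical stand-in, not an actual Gluten class; only the names separateScanRDD and excludeScanExecFromCollapsedStage come from this PR, and the returned values are assumptions for illustration.

```scala
// Hypothetical, simplified stand-ins; these are NOT the real Gluten types.
trait BackendSettings {
  // Name taken from the PR; the default value is an assumption.
  def excludeScanExecFromCollapsedStage(): Boolean = false
}

// Assumed: a ClickHouse-like backend keeps SCAN out of the collapsed stage.
object ClickHouseLikeSettings extends BackendSettings {
  override def excludeScanExecFromCollapsedStage(): Boolean = true
}

// Assumed: any other backend uses the default.
object DefaultLikeSettings extends BackendSettings

// Before: a private wrapper that only forwards to the backend setting.
class CollapseStagesBefore(settings: BackendSettings) {
  private def separateScanRDD(): Boolean =
    settings.excludeScanExecFromCollapsedStage()

  def scanIsSeparated: Boolean = separateScanRDD()
}

// After: the wrapper is gone and the setting is queried directly.
class CollapseStagesAfter(settings: BackendSettings) {
  def scanIsSeparated: Boolean =
    settings.excludeScanExecFromCollapsedStage()
}
```

Under these assumptions both versions return the same answer for any backend, which is why removing the wrapper would not change behavior.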
Member
@beliefer, I suggest refining the comment for better readability, as follows. Thanks.
diff --git a/gluten-substrait/src/main/scala/org/apache/gluten/execution/WholeStageTransformer.scala b/gluten-substrait/src/main/scala/org/apache/gluten/execution/WholeStageTransformer.scala
index 0c5e1b58b..588ba4567 100644
--- a/gluten-substrait/src/main/scala/org/apache/gluten/execution/WholeStageTransformer.scala
+++ b/gluten-substrait/src/main/scala/org/apache/gluten/execution/WholeStageTransformer.scala
@@ -438,11 +438,14 @@ case class WholeStageTransformer(child: SparkPlan, materializeInput: Boolean = f
} else {
/**
- * the whole stage contains NO [[LeafTransformSupport]]. this the default case for:
- * 1. SCAN with clickhouse backend (check ColumnarCollapseTransformStages#separateScanRDD())
- * 2. test case where query plan is constructed from simple dataframes (e.g.
- * GlutenDataFrameAggregateSuite) in these cases, separate RDDs takes care of SCAN as a
- * result, genFinalStageIterator rather than genFirstStageIterator will be invoked
+ * The whole stage contains NO [[LeafTransformSupport]]. This is the default case for:
+ * - SCAN of clickhouse backend. See
+ * BackendsApiManager.getSettings.excludeScanExecFromCollapsedStage.
+ * - Test case where query plan is constructed from simple DataFrames, e.g.
+ * GlutenDataFrameAggregateSuite.
+ *
+ * In these cases, separate RDDs take care of SCAN. As a result, genFinalStageIterator rather
+ * than genFirstStageIterator will be invoked.
*/
new WholeStageZippedPartitionsRDD(
sparkContext,
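The rule stated in the refined comment can be sketched as follows. This is a toy model under stated assumptions, not the WholeStageTransformer implementation: Node, LeafTransform, InputAdapter, Transform and StageIteratorChoice are hypothetical names, and only genFirstStageIterator and genFinalStageIterator are taken from the comment. It assumes a stage that contains a leaf transform reads its own input, while a stage without one is fed by separate RDDs.

```scala
// Hypothetical plan model; none of these types exist in Gluten.
sealed trait Node { def children: Seq[Node] }

// Stand-in for an operator that supports leaf transformation (e.g. a scan).
final case class LeafTransform(name: String) extends Node {
  val children: Seq[Node] = Nil
}

// Stand-in for input that is produced by a separate RDD.
final case class InputAdapter(name: String) extends Node {
  val children: Seq[Node] = Nil
}

// Stand-in for any non-leaf operator inside the whole stage.
final case class Transform(name: String, children: Seq[Node]) extends Node

object StageIteratorChoice {
  // True if any node in the stage is a leaf transform.
  def containsLeafTransform(node: Node): Boolean = node match {
    case _: LeafTransform => true
    case other => other.children.exists(containsLeafTransform)
  }

  // Returns the name of the iterator generator the comment says is invoked.
  def iteratorFor(root: Node): String =
    if (containsLeafTransform(root)) "genFirstStageIterator"
    else "genFinalStageIterator"
}
```

For example, a filter over a LeafTransform would map to genFirstStageIterator, while the same filter over an InputAdapter (SCAN handled by a separate RDD) would map to genFinalStageIterator.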
Contributor
Author
cc @philo-he
philo-he
approved these changes
Sep 9, 2025
Member
philo-he
left a comment
Looks good. One minor comment. Thanks.
Run Gluten Clickhouse CI on x86
zml1206
approved these changes
Sep 9, 2025
Contributor
Author
@zml1206 @philo-he @jinchengchenghh Thank you!
What changes are proposed in this pull request?
This PR removes the unnecessary method separateScanRDD.
Fixes #10544
How was this patch tested?
GA tests.