Skip to content

[GLUTEN-3154][CORE] Override doCanonicalize in scan operators#3155

Merged
ulysses-you merged 3 commits intoapache:mainfrom
liujiayi771:canonicalize-batchscan
Sep 18, 2023
Merged

[GLUTEN-3154][CORE] Override doCanonicalize in scan operators#3155
ulysses-you merged 3 commits intoapache:mainfrom
liujiayi771:canonicalize-batchscan

Conversation

@liujiayi771
Copy link
Copy Markdown
Contributor

@liujiayi771 liujiayi771 commented Sep 14, 2023

What changes were proposed in this pull request?

Fix #3154

Spark will try to get reused queryStage based on plan.canonicalized.
https://github.com/apache/spark/blob/822f58f0d26b7d760469151a65eaf9ee863a07a1/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala#L523-L523

How was this patch tested?

image image

@github-actions
Copy link
Copy Markdown

#3154

@github-actions
Copy link
Copy Markdown

Run Gluten Clickhouse CI

@philo-he
Copy link
Copy Markdown
Member

Thanks for your fix! Does FileSourceScanExecTransformer have the same issue?

@kerwin-zk
Copy link
Copy Markdown
Contributor

Thanks for your fix! Does FileSourceScanExecTransformer have the same issue?

@philo-he It seems that both HiveTableScanExecTransformer and FileSourceScanExecTransformer may have this issue.

@github-actions
Copy link
Copy Markdown

Run Gluten Clickhouse CI

@liujiayi771
Copy link
Copy Markdown
Contributor Author

Thanks for your fix! Does FileSourceScanExecTransformer have the same issue?

@philo-he It seems that both HiveTableScanExecTransformer and FileSourceScanExecTransformer may have this issue.

Yes, I have added doCanonicalize in FileSourceScanExecTransformer and HiveTableScanExecTransformer.

@philo-he
Copy link
Copy Markdown
Member

@ulysses-you, @zzcclp, could you take a review?

canonicalized.scan,
canonicalized.runtimeFilters,
QueryPlan.normalizePredicates(
pushdownFilters.filterNot(_ == DynamicPruningExpression(Literal.TrueLiteral)),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not related to this pr, but it seems pushdownFilters is never been set and used... not sure the original idea, maybe we can just remove pushdownFilters

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I also noticed this when support iceberg read. @philo-he @rui-mo Can we remove it?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can remove it in this PR. Thanks!

Copy link
Copy Markdown
Contributor Author

@liujiayi771 liujiayi771 Sep 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ulysses-you @philo-he I encountered an error during the test. This parameter here cannot be deleted. Although it is not used, it is possible that the parameter was added unintentionally that avoid the error. We may need to add keyGroupedPartitioning to BatchScanExecTransformer, pushedFilters parameter had ambiguity in name.

Exception in thread "main" java.lang.IllegalStateException:
Failed to copy node.
Is otherCopyArgs specified correctly for BatchScanExecTransformer.
Exception message: wrong number of arguments
ctor: public io.glutenproject.execution.BatchScanExecTransformer(scala.collection.Seq,org.apache.spark.sql.connector.read.Scan,scala.collection.Seq)?
types: class scala.collection.immutable.$colon$colon, class org.apache.iceberg.spark.source.SparkBatchQueryScan, class scala.collection.immutable.$colon$colon, class scala.None$
args: List(xxxxx#10037, xxxxx#10046, xxxxx#10047L, xxxxx#10048L, xxxxx#10079, xxxxx#10080, xxxxx#10081), IcebergScan(table=default_iceberg.xxxxx.xxxxx, type=struct<1: xxxxx: optional string, 10: xxxxx: optional string, 11: xxxxx: optional long, 12: xxxxx: optional long, 43: xxxxx: optional string, 44: xxxxx: optional string, 45: xxxxx: optional string>, filters=[not_null(ref(name="xxxxx")), not_null(ref(name="xxxxx")), ref(name="xxxxx") == "2023-08-01", not(ref(name="xxxxx") == "xxxxx"), not_null(ref(name="xxxxx")), not_null(ref(name="xxxxx"))], runtimeFilters=[], caseSensitive=false), List(dynamicpruningexpression(true), dynamicpruningexpression(xxxxx#10080 IN dynamicpruning#10622)), None

	at org.apache.spark.sql.catalyst.trees.TreeNode.makeCopy(TreeNode.scala:854)
	at org.apache.spark.sql.catalyst.trees.TreeNode.makeCopy(TreeNode.scala:797)
	at org.apache.spark.sql.execution.SparkPlan.super$makeCopy(SparkPlan.scala:100)
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$makeCopy$1(SparkPlan.scala:100)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
	at org.apache.spark.sql.execution.SparkPlan.makeCopy(SparkPlan.scala:100)
	at org.apache.spark.sql.execution.SparkPlan.makeCopy(SparkPlan.scala:60)
	at org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:223)
	at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUpWithPruning(QueryPlan.scala:188)
	at org.apache.spark.sql.execution.reuse.ReuseExchangeAndSubquery$$anonfun$org$apache$spark$sql$execution$reuse$ReuseExchangeAndSubquery$$reuse$1$1.applyOrElse(ReuseExchangeAndSubquery.scala:54)
	at org.apache.spark.sql.execution.reuse.ReuseExchangeAndSubquery$$anonfun$org$apache$spark$sql$execution$reuse$ReuseExchangeAndSubquery$$reuse$1$1.applyOrElse(ReuseExchangeAndSubquery.scala:44)

It may be a little difficult to explain here. I try to make it clear. When Spark is in TreeNode.makeCopy, it will call the productArity method of Product to obtain the number of parameters of the case class. The current number of parameters of BatchScanExecTransformer is 3, but productArity gets 4. After my test, productArity gets the number of parameters of the first specific case class, which is the number of parameters of BatchScanExec, which does not match the number of BatchScanExecTransformer constructors.

I think the standard approach here is that if we need to inherit Spark's case class plan, we need to ensure that the number of parameters is consistent, otherwise it will cause problems when plan.transform calls makeCopy. Or implement the otherCopyArgs method to ensure that not transformed args can be copied.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a subtle issue. Why productArity returns 4 for BatchScanExec? It looks BatchScanExec has only 3 constructor args.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a subtle issue. Why productArity returns 4 for BatchScanExec? It looks BatchScanExec has only 3 constructor args.

Spark 3.3 has 4 args.
https://github.com/apache/spark/blob/b170a670b68616af49b26cbcc84894c939713837/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/BatchScanExec.scala#L40

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

case class A(a1: Int, a2: Int) extends Product
class B(a1: Int, a2: Int, a3: Int) extends A(a1, a2)
val b = new B(1, 2, 3)
b.productArity

b.productArity will return 2.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Please help fix it in a separate PR and leave some comments to explain in source code. Thanks!

@philo-he philo-he changed the title [GLUTEN-3154][CORE] Add doCanonicalize in BatchScanExecTransformer [GLUTEN-3154][CORE] Override doCanonicalize in scan operators Sep 15, 2023
@github-actions
Copy link
Copy Markdown

Run Gluten Clickhouse CI

@liujiayi771 liujiayi771 force-pushed the canonicalize-batchscan branch from ad69c5f to 07d1c74 Compare September 15, 2023 12:57
@github-actions
Copy link
Copy Markdown

Run Gluten Clickhouse CI

Copy link
Copy Markdown
Contributor

@zzcclp zzcclp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ulysses-you ulysses-you merged commit 38e748f into apache:main Sep 18, 2023
}

override def doCanonicalize(): BatchScanExecTransformer = {
val canonicalized = super.doCanonicalize()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering whether we can directly use the members of BatchScanExecTransformer to doCanonicalize. There may have some differences. Thanks!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BatchScanExecTransformer inherits BatchScanExec, and their parameters should be exactly the same.

@GlutenPerfBot
Copy link
Copy Markdown
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_3155_time.csv log/native_master_09_17_2023_8cd51b20d_time.csv difference percentage
q1 43.37 43.28 -0.087 99.80%
q2 24.63 24.40 -0.228 99.07%
q3 35.13 37.21 2.081 105.92%
q4 41.45 41.84 0.384 100.93%
q5 70.63 70.08 -0.542 99.23%
q6 4.89 6.51 1.618 133.05%
q7 84.38 85.79 1.405 101.66%
q8 80.31 78.95 -1.356 98.31%
q9 116.39 115.22 -1.170 98.99%
q10 47.74 47.02 -0.726 98.48%
q11 18.85 18.81 -0.039 99.79%
q12 26.74 24.49 -2.253 91.58%
q13 51.45 50.28 -1.170 97.73%
q14 18.70 13.73 -4.964 73.45%
q15 30.28 26.40 -3.877 87.20%
q16 15.73 15.90 0.175 101.11%
q17 119.58 119.56 -0.028 99.98%
q18 164.12 160.35 -3.771 97.70%
q19 12.23 12.01 -0.218 98.22%
q20 30.55 26.92 -3.626 88.13%
q21 235.96 232.60 -3.351 98.58%
q22 15.47 15.58 0.114 100.74%
total 1288.59 1266.96 -21.631 98.32%

@liujiayi771 liujiayi771 deleted the canonicalize-batchscan branch November 6, 2023 04:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CORE] BatchScanExecTransformer not override doCanonicalize

6 participants