[GLUTEN-9034][VL] Fix the VeloxResizeBatch not add for ReusedExchange #11069
jinchengchenghh merged 2 commits into apache:main
shuffle: ColumnarShuffleExchangeExec |
ReusedExchangeExec(_, _: ColumnarShuffleExchangeExec),
This likely doesn't compile.
Let's use this shortcut instead: https://github.com/apache/spark/blob/9ff0fba06e758d2509dd8eb7a38b01cc8720d43d/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/QueryStageExec.scala#L203-L208
I found this comment:
// Since it's transformed in a bottom to up order, so we may first encounter
// ShuffeQueryStageExec, which is transformed to VeloxResizeBatchesExec(ShuffeQueryStageExec),
// then we see AQEShuffleReadExec
Maybe transformDown is more feasible in this case. cc @WangGuangxin
This may have been caused by resolving a conflict. The old version is below, and it works well. I will fix it.
_: ColumnarShuffleExchangeExec |
ReusedExchangeExec(_, _: ColumnarShuffleExchangeExec)
The logic here wraps the matched node in a new VeloxResizeBatches node, so using transformDown would make it much more complicated.
For example, with transformDown, when we match node a we transform it to VeloxResizeBatches(a). The traversal then continues down the tree and encounters node a again, so we would have to add something to avoid wrapping it in VeloxResizeBatches again and again.
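The repeated-wrapping concern can be sketched with a toy tree. This is a minimal illustration, not Gluten or Spark code: `Node`, `Shuffle`, `Resize`, `Project`, and the `fuel` depth bound are all hypothetical names introduced here, and the two traversal methods only imitate the apply-then-recurse and recurse-then-apply orders described above.

```scala
// Toy plan tree. These classes are hypothetical stand-ins, not Spark's SparkPlan API.
sealed trait Node {
  def children: Seq[Node]
  def withChildren(cs: Seq[Node]): Node

  // Bottom-up: rewrite the children first, then apply the rule to the rebuilt node.
  def transformUp(rule: PartialFunction[Node, Node]): Node = {
    val rebuilt = withChildren(children.map(_.transformUp(rule)))
    rule.applyOrElse(rebuilt, (n: Node) => n)
  }

  // Top-down: apply the rule to this node, then recurse into the result's children.
  // `fuel` is an artificial depth bound (an assumption of this sketch) so the demo terminates.
  def transformDown(rule: PartialFunction[Node, Node], fuel: Int): Node =
    if (fuel == 0) this
    else {
      val after = rule.applyOrElse(this, (n: Node) => n)
      after.withChildren(after.children.map(_.transformDown(rule, fuel - 1)))
    }
}

final case class Shuffle(name: String) extends Node {
  def children: Seq[Node] = Nil
  def withChildren(cs: Seq[Node]): Node = this
}

final case class Resize(child: Node) extends Node {
  def children: Seq[Node] = Seq(child)
  def withChildren(cs: Seq[Node]): Node = Resize(cs.head)
}

final case class Project(child: Node) extends Node {
  def children: Seq[Node] = Seq(child)
  def withChildren(cs: Seq[Node]): Node = Project(cs.head)
}

object TransformOrderDemo {
  // Naive rule: wrap every Shuffle in a Resize, with no guard against re-wrapping.
  val wrapShuffle: PartialFunction[Node, Node] = { case s: Shuffle => Resize(s) }
  val plan: Node = Project(Shuffle("a"))

  val bottomUp: Node = plan.transformUp(wrapShuffle)
  val topDown: Node = plan.transformDown(wrapShuffle, fuel = 3)

  def main(args: Array[String]): Unit = {
    println(bottomUp) // Project(Resize(Shuffle(a)))
    println(topDown)  // Project(Resize(Resize(Shuffle(a))))
  }
}
```

In the bottom-up order the freshly created wrapper is never revisited, so the naive rule wraps once. In the top-down order the traversal descends into the wrapper it just created and matches the same child again; without the `fuel` bound in this sketch it would not terminate.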
Sounds reasonable. Thanks for the explanation.
Before this change, in TPC-DS Q95 there were 4 AQEShuffleRead nodes, but only one of them had the VeloxResizeBatch node added.

After this change,

The physical plan is
Related issue: #9034