[GLUTEN-9034][VL] Add VeloxResizeBatchesExec for Shuffle by WangGuangxin · Pull Request #9035 · apache/gluten

WangGuangxin · 2025-03-18T00:15:50Z

What changes were proposed in this pull request?

Shuffle read may generate small batches with only a few rows each, which can hurt performance significantly.

An example from one of our production jobs:

This PR therefore proposes adding VeloxResizeBatchesExec right after shuffle read, which changes the plan to:

As we can see, the average batch size increased from 9 to 1000, and the total hours dropped from 2066h to 766h.

(Fixes: #9034)
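
To make the idea concrete, here is a minimal sketch of batch coalescing, assuming a toy model in which a batch is just a Vector of row values. The object `ResizeBatchesSketch`, the function `resize`, and the parameters `minRows`/`maxRows` are hypothetical names for illustration; this is not the Velox or Gluten implementation.

```scala
// Illustration only: a toy stand-in for what a resize operator does.
object ResizeBatchesSketch {
  // Assumption: a "batch" is modelled as a plain Vector of rows.
  type Batch = Vector[Int]

  /** Coalesces consecutive batches until each emitted batch holds at least
    * `minRows` rows (except possibly the last), and splits the buffered rows
    * so that no emitted batch exceeds `maxRows`. */
  def resize(input: Iterator[Batch], minRows: Int, maxRows: Int): Iterator[Batch] = {
    require(minRows > 0 && maxRows >= minRows, "expected 0 < minRows <= maxRows")
    new Iterator[Batch] {
      private var buffer: Batch = Vector.empty

      // Pull from upstream until enough rows are buffered or upstream is exhausted.
      private def fill(): Unit =
        while (buffer.size < minRows && input.hasNext) buffer = buffer ++ input.next()

      def hasNext: Boolean = { fill(); buffer.nonEmpty }

      def next(): Batch = {
        if (!hasNext) throw new NoSuchElementException("no more batches")
        // Emit at most maxRows rows; keep the remainder for the next call.
        val (out, rest) = buffer.splitAt(maxRows)
        buffer = rest
        out
      }
    }
  }
}
```

With 100 input batches of 9 rows each and a range of 100 to 200 rows, this sketch emits 9 batches instead of 100, since 12 small batches are merged before each emit.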

How was this patch tested?

Manually and with unit tests (UT).

github-actions · 2025-03-18T00:16:09Z

#9034

WangGuangxin · 2025-03-18T00:16:32Z

cc @jinchengchenghh @jackylee-ch

jackylee-ch · 2025-03-18T02:45:57Z

Can you provide the complete dag diagram? Maybe it can be solved by adjusting the number of input partitions, such as maxPartitionSize?

WangGuangxin · 2025-03-18T03:02:03Z

without VeloxResizeBatchesExec

with VeloxResizeBatchesExec

WangGuangxin · 2025-03-18T03:08:21Z

Can you provide the complete dag diagram? Maybe it can be solved by adjusting the number of input partitions, such as maxPartitionSize?

@jackylee-ch In some cases it can be controlled by maxPartitionSize. But when it is under a Join, Filter, etc., it is not easy to keep the batch size within a reasonable range.

jackylee-ch · 2025-03-18T03:19:01Z

It seems that the abnormal number of vectors occurs after the shuffle read. Maybe consider adding a resize operation right after the shuffle read so that all subsequent operators can benefit?

WangGuangxin · 2025-03-18T04:42:54Z

It seems that the abnormal number of vectors occurs after the shuffle read. Maybe consider adding a resize operation right after the shuffle read so that all subsequent operators can benefit?

@jackylee-ch yeah, agree with that. I'll update it later

github-actions · 2025-03-22T02:28:07Z

Run Gluten Clickhouse CI on x86

WangGuangxin · 2025-03-24T05:48:47Z

It seems that the abnormal number of vectors occurs after the shuffle read. Maybe consider adding a resize operation right after the shuffle read so that all subsequent operators can benefit?

@jackylee-ch yeah, agree with that. I'll update it later

@jackylee-ch @jinchengchenghh @zhztheplayer Updated, and I also updated the title and description. Please take a look when convenient.

marin-ma · 2025-04-04T10:11:11Z

@WangGuangxin Any update on this patch? Could you do a rebase? Thanks!

WangGuangxin · 2025-04-09T01:58:31Z

@WangGuangxin Any update on this patch? Could you do a rebase? Thanks!

Thanks, will rebase today.

zhztheplayer · 2025-04-11T16:02:54Z

      .intConf
      .createOptional

+  val COLUMNAR_VELOX_RESIZE_BATCHES_SHUFFLE_INPUT_OUTPUT_MIN_SIZE =


s/SHUFFLE_INPUT_OUTPUT/SHUFFLE_OUTPUT ?

This is used for batch resizing both before shuffle input and after shuffle output, so that we can reduce the number of configs. Usually there is no need for separately customized min size configs for these two scenarios. What do you think?

zhztheplayer · 2025-04-11T16:03:05Z

  }

-  def veloxResizeBatchesShuffleInputRange: ResizeRange = {
+  def veloxResizeBatchesShuffleInputOutputRange: ResizeRange = {


zhztheplayer · 2025-04-11T16:04:42Z

+import org.apache.spark.sql.execution.{ColumnarShuffleExchangeExec, SparkPlan}
+import org.apache.spark.sql.execution.adaptive.{AQEShuffleReadExec, ShuffleQueryStageExec}
+
+case class AppendBatchResizeAfterShuffleRead() extends Rule[SparkPlan] {


We don't have a rule for the resizing on the input side of shuffle. Can we make the ways of optimizations more consistent? Either both via rules, or both not?

At first, resizing on the output side of shuffle followed the existing approach, that is, doing it when converting to a transformer. But after DummyLeafExec was introduced, that no longer works.
So I'll refactor how resizing is added on the input side of shuffle, so that it is enabled by a rule as well.

Fair enough. Let's keep it as an individual rule.
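
For readers following the rule-based approach discussed in this thread, here is a minimal sketch of a bottom-up rewrite that wraps every shuffle read in a resize node. It assumes a toy plan tree: the node types `Scan`, `ShuffleRead`, `Resize`, `Filter`, and `Join`, and the helper `transformUp`, are hypothetical stand-ins written for this example and are not Spark's or Gluten's classes.

```scala
// Illustration only: a toy plan tree and a rule in the spirit of
// AppendBatchResizeAfterShuffleRead. All names here are hypothetical.
object ShuffleReadRuleSketch {
  sealed trait Plan {
    def children: Seq[Plan]
    def withChildren(newChildren: Seq[Plan]): Plan
  }
  case class Scan(table: String) extends Plan {
    def children: Seq[Plan] = Nil
    def withChildren(c: Seq[Plan]): Plan = this
  }
  case class ShuffleRead(child: Plan) extends Plan {
    def children: Seq[Plan] = Seq(child)
    def withChildren(c: Seq[Plan]): Plan = copy(child = c.head)
  }
  case class Resize(child: Plan, min: Int, max: Int) extends Plan {
    def children: Seq[Plan] = Seq(child)
    def withChildren(c: Seq[Plan]): Plan = copy(child = c.head)
  }
  case class Filter(child: Plan) extends Plan {
    def children: Seq[Plan] = Seq(child)
    def withChildren(c: Seq[Plan]): Plan = copy(child = c.head)
  }
  case class Join(left: Plan, right: Plan) extends Plan {
    def children: Seq[Plan] = Seq(left, right)
    def withChildren(c: Seq[Plan]): Plan = copy(left = c(0), right = c(1))
  }

  // Rewrites children first, then applies the rule to the rebuilt node.
  def transformUp(plan: Plan)(rule: PartialFunction[Plan, Plan]): Plan = {
    val rebuilt = plan.withChildren(plan.children.map(transformUp(_)(rule)))
    rule.applyOrElse(rebuilt, (p: Plan) => p)
  }

  // Wraps each shuffle read in a Resize node so downstream operators
  // see batches within the [min, max] range.
  def appendResizeAfterShuffleRead(plan: Plan, min: Int, max: Int): Plan =
    transformUp(plan) { case read: ShuffleRead => Resize(read, min, max) }
}
```

Because the rewrite visits every node, each shuffle read in the toy tree gets its own Resize parent, regardless of whether it sits under a Filter or a Join.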

WangGuangxin · 2025-04-14T09:41:51Z

@jackylee-ch @zhztheplayer Please take another look.
TPC-DS 1T in our environment shows about a 1% to 3% performance improvement.

marin-ma · 2025-04-17T08:19:18Z

@jackylee-ch @zhztheplayer Do you have any further comments?

zhztheplayer · 2025-04-17T08:46:04Z

-    def maybeAddAppendBatchesExec(plan: SparkPlan): SparkPlan = {
-      plan match {
-        case shuffle: ColumnarShuffleExchangeExec
-            if !shuffle.useSortBasedShuffle &&
-              VeloxConfig.get.veloxResizeBatchesShuffleInput =>
-          val range = VeloxConfig.get.veloxResizeBatchesShuffleInputRange
-          val appendBatches =
-            VeloxResizeBatchesExec(shuffle.child, range.min, range.max)
-          shuffle.withNewChildren(Seq(appendBatches))
-        case _ => plan
-      }
-    }
-


Thanks for factoring this out!

WangGuangxin · 2025-04-22T07:08:27Z

@jackylee-ch @zhztheplayer Can we merge this?

jackylee-ch

Sorry for the late response. Great work!

zhouyuan · 2025-04-22T12:46:07Z

+    val range = VeloxConfig.get.veloxResizeBatchesShuffleInputOutputRange
+    plan.transformUp {
+      case shuffle: ColumnarShuffleExchangeExec
+          if !shuffle.useSortBasedShuffle &&


It looks like this will only be enabled for hash-based shuffle?
Cc @marin-ma

Yes. We don't need to resize input batches for sort-based shuffle.
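
The guard discussed here can be sketched as follows, again on a toy plan model. The node types `Scan`, `Resize`, and `ShuffleExchange`, the `sortBased` flag, and the `enabled` parameter are hypothetical stand-ins for `ColumnarShuffleExchangeExec`, `useSortBasedShuffle`, and the config switch; the real rule may differ in detail.

```scala
// Illustration only: resize the shuffle *input* unless the shuffle is sort-based.
object ShuffleInputResizeSketch {
  sealed trait Plan
  case class Scan(table: String) extends Plan
  case class Resize(child: Plan, min: Int, max: Int) extends Plan
  // `sortBased` is an assumed flag mirroring `useSortBasedShuffle` in the diff above.
  case class ShuffleExchange(child: Plan, sortBased: Boolean) extends Plan

  def addInputResize(plan: Plan, enabled: Boolean, min: Int, max: Int): Plan = plan match {
    case ShuffleExchange(child, sortBased) =>
      val newChild = addInputResize(child, enabled, min, max)
      // Only hash-based (non-sort-based) shuffle gets a Resize below it.
      if (enabled && !sortBased) ShuffleExchange(Resize(newChild, min, max), sortBased)
      else ShuffleExchange(newChild, sortBased)
    case Resize(child, lo, hi) =>
      Resize(addInputResize(child, enabled, min, max), lo, hi)
    case leaf: Scan => leaf
  }
}
```

In this sketch a sort-based exchange is returned unchanged, matching the reply that input batches need no resizing in that case.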

jinchengchenghh · 2025-11-11T14:55:58Z

Did you notice that in some queries the plan is not fully replaced? For example, in TPC-DS Q95 there are 4 AQEShuffleRead nodes, but only one of them gets the VeloxResizeBatch node added. @WangGuangxin

jinchengchenghh · 2025-11-11T16:14:54Z

Fixed in #11069 @WangGuangxin @jackylee-ch

github-actions bot added the VELOX label Mar 18, 2025

github-actions bot added the CORE works for Gluten Core label Mar 22, 2025

WangGuangxin force-pushed the feat_partial_project_opt branch from 70d695b to 25afa8e Compare March 22, 2025 02:29

WangGuangxin changed the title ~~[GLUTEN-9034][VL] Add VeloxResizeBatchesExec before ColumnarPartialProject~~ [GLUTEN-9034][VL] Add VeloxResizeBatchesExec right after ShuffleRead Mar 22, 2025

WangGuangxin force-pushed the feat_partial_project_opt branch from 25afa8e to 2991053 Compare March 22, 2025 04:54

WangGuangxin force-pushed the feat_partial_project_opt branch from 497d69b to 438a101 Compare March 24, 2025 04:18

jackylee-ch reviewed Mar 24, 2025

Comment thread backends-velox/src/main/scala/org/apache/gluten/backendsapi/velox/VeloxSparkPlanExecApi.scala Outdated

WangGuangxin added 2 commits April 11, 2025 20:14

Add VeloxResizeBatches before ShuffleRead

c6ce9ae

make config defalut to false

6b21ba9

WangGuangxin force-pushed the feat_partial_project_opt branch from cf4ff8b to 6b21ba9 Compare April 11, 2025 12:21

Merge branch 'main' into feat_partial_project_opt

cb0e2bd

fix DummyLeaf

1d3175e

github-actions bot removed the CORE works for Gluten Core label Apr 11, 2025

zhztheplayer reviewed Apr 11, 2025

WangGuangxin added 2 commits April 14, 2025 14:27

update

150f05e

fix ras

5930c9a

zhztheplayer reviewed Apr 17, 2025

zhztheplayer approved these changes Apr 17, 2025

zhztheplayer added the ready to merge label Apr 17, 2025

WangGuangxin added 3 commits April 18, 2025 10:47

Merge branch 'main' into feat_partial_project_opt

af3cd9c

resolve conflicts

513b3e8

resolve conflicts

2852177

jackylee-ch approved these changes Apr 22, 2025

jackylee-ch changed the title ~~[GLUTEN-9034][VL] Add VeloxResizeBatchesExec right after ShuffleRead~~ [GLUTEN-9034][VL] Add VeloxResizeBatchesExec for Shuffle Apr 22, 2025

jackylee-ch merged commit 9a7d5fc into apache:main Apr 22, 2025
46 checks passed

zhouyuan reviewed Apr 22, 2025
