[VL] Disable FlushableHashAggreagte when aggregates contains sum/avg for floating type by kecookier · Pull Request #8986 · apache/gluten

kecookier · 2025-03-13T12:28:12Z

What changes were proposed in this pull request?

(Fixes: #8985)

Disable FlushableHashAggregate when aggregates contain sum/avg for floating types.
To control PartialAgg flush easily in unit tests, add a Velox configuration s.g.s.c.b.v.maxPartialAggregationMemory to set PartialAgg memory, which has higher priority than s.g.s.c.b.v.maxPartialAggregationMemoryRatio.
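For background on why flushing matters for floating types: floating-point addition is not associative, so summing partial results that were flushed at different points can change the final value. The sketch below is not Gluten or Velox code. It only imitates a flush by summing fixed-size chunks and then combining the chunk sums; the object name, method names, and input values are made up for illustration.

```scala
object FloatSumOrder {
  // One pass over all rows, as if the partial aggregation never flushed.
  def sumSequential(xs: Seq[Double]): Double =
    xs.foldLeft(0.0)(_ + _)

  // Sum each chunk separately, then add the chunk sums together.
  // This imitates a partial aggregation that flushes every `chunk` rows
  // and a final aggregation that merges the flushed partial sums.
  def sumChunked(xs: Seq[Double], chunk: Int): Double =
    xs.grouped(chunk).map(_.foldLeft(0.0)(_ + _)).foldLeft(0.0)(_ + _)

  def main(args: Array[String]): Unit = {
    // 1e16 is large enough that adding 1.0 to it is lost to rounding,
    // while adding 2.0 is exact.
    val values = Seq(1e16, 1.0, 1.0, 1.0)
    println(sumSequential(values))
    println(sumChunked(values, 2))
  }
}
```

With these inputs the two methods return different doubles, even though they add the same four numbers. An integer sum would not show this effect, which is consistent with the PR restricting the change to sum/avg on floating types.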

How was this patch tested?

unit tests

github-actions · 2025-03-13T12:28:34Z

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

Other pull requests

kecookier · 2025-03-21T01:32:48Z

@zhztheplayer Can you help review this PR?

zhztheplayer

Some users who don't need 100% alignment with Spark may still tend to turn on flushing in this case to speed up queries.

I would suggest having an individual config option like s.g.floatingPointMode=strict/loose to control the tolerance of this kind of diff in Gluten. When the mode is set to strict, we disable flushing for sum(float), etc.
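A rough sketch of how such a mode could gate flushing is shown below. Everything in it is hypothetical: the types, the `canFlush` helper, and the set of order-sensitive functions are illustrative stand-ins, not Gluten's planner classes or the code that was eventually merged.

```scala
object FlushPolicySketch {
  // Hypothetical stand-in for the proposed strict/loose option.
  sealed trait FloatingPointMode
  case object Strict extends FloatingPointMode
  case object Loose extends FloatingPointMode

  // Minimal stand-ins for column types; not Spark's DataType hierarchy.
  sealed trait ColType
  case object FloatCol extends ColType
  case object DoubleCol extends ColType
  case object LongCol extends ColType

  final case class Agg(function: String, inputType: ColType)

  // Aggregates whose floating-point result depends on accumulation order.
  private val orderSensitive = Set("sum", "avg")

  private def isFloating(t: ColType): Boolean =
    t == FloatCol || t == DoubleCol

  // Loose: always allow flushing. Strict: allow it only when no
  // order-sensitive aggregate reads a floating-point input.
  def canFlush(mode: FloatingPointMode, aggs: Seq[Agg]): Boolean = mode match {
    case Loose  => true
    case Strict => !aggs.exists(a => orderSensitive(a.function) && isFloating(a.inputType))
  }
}
```

Under this sketch, `canFlush(Strict, Seq(Agg("sum", DoubleCol)))` is false, while the same aggregate in `Loose` mode, or a `sum` over `LongCol` in `Strict` mode, still allows flushing.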

zhztheplayer · 2025-03-21T10:34:02Z

+          "partial aggregation is enabled. Ignored when spark.gluten.sql.columnar.backend." +
+          "velox.flushablePartialAggregation=false."
+      )
+      .bytesConf(ByteUnit.BYTE)


Should it be

.bytesConf(...).createOptional

kecookier · 2025-03-25T02:09:31Z

@zhztheplayer I've updated the code as suggested. Please take a look.

zhztheplayer · 2025-04-07T15:37:48Z

Hi @kecookier sorry for the late response. Missed the notification.

Given the purpose is to disable flushing in some cases, do we have to add new option maxPartialAggregationMemory? Any background of that?

kecookier · 2025-04-10T04:04:07Z

Given the purpose is to disable flushing in some cases, do we have to add new option maxPartialAggregationMemory? Any background of that?

@zhztheplayer For easier control of flushable memory during unit tests.
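The precedence described in the PR summary (an explicit maxPartialAggregationMemory wins over maxPartialAggregationMemoryRatio) can be pictured with a small sketch. The helper name `resolveLimit` and the `budgetBytes` parameter are assumptions for illustration; the thread does not say which memory budget the ratio is applied to or how the backend resolves the two settings.

```scala
object PartialAggMemorySketch {
  // absoluteBytes: the optional absolute setting (None when unset).
  // ratio:         the ratio-based setting, used only as a fallback.
  // budgetBytes:   hypothetical budget that the ratio is applied to.
  def resolveLimit(absoluteBytes: Option[Long], ratio: Double, budgetBytes: Long): Long =
    absoluteBytes.getOrElse((budgetBytes * ratio).toLong)
}
```

A unit test can then pin the limit to a small fixed value, for example `resolveLimit(Some(1024L), 0.5, 1000000L)`, to force or avoid a flush regardless of how much memory the test environment has. Leaving the absolute option unset falls back to the ratio.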

…or floating type (apache#8986) (cherry picked from commit f667e81) Change-Id: I74a595766972f8b561c98ae45632788a2bdd705f Reviewed-on: https://bigdataoss-internal-review.googlesource.com/c/third_party/apache/incubator-gluten/+/115777 Reviewed-by: Preetesh Verma <preeteshverma@google.com> Reviewed-by: Revanth Venkat Mikkilineni <revanthvenkat@google.com> Tested-by: Srinivas S T <srst@google.com>

github-actions bot added the VELOX label Mar 13, 2025

kecookier requested a review from zhztheplayer March 13, 2025 12:33

zhztheplayer reviewed Mar 21, 2025

zhaokuo03 and others added 3 commits March 24, 2025 14:40

[VL] Disable FlushableHashAggreagte when aggregates contains sum/avg …

0e27d9a

…for floating type

fix

dc8e779

fix

cc15f83

kecookier force-pushed the fix-double-sum branch from 4176119 to cc15f83 Compare March 24, 2025 06:43

kecookier added 2 commits March 24, 2025 14:52

fix ut

7587ed8

fix

891a840

zhztheplayer approved these changes Apr 10, 2025

zhztheplayer added the ready to merge label Apr 10, 2025

zhztheplayer merged commit f667e81 into apache:main Apr 10, 2025
50 checks passed

zhztheplayer merged 5 commits into apache:main from kecookier:fix-double-sum
