[VL][TEST] Add test for aggregate with duplicate project #11440

Merged
rui-mo merged 5 commits into apache:main from zml1206:agg_test
Mar 1, 2026
Conversation

@zml1206
Contributor

@zml1206 zml1206 commented Jan 19, 2026

What changes are proposed in this pull request?

How was this patch tested?

@github-actions github-actions bot added the VELOX label Jan 19, 2026
@zml1206
Contributor Author

zml1206 commented Jan 19, 2026

    runQueryAndCompare("""
                         |select l_orderkey, sum(l_partkey), sum(l_partkey1) from
                         | (select l_orderkey, l_partkey, l_partkey as l_partkey1 from lineitem)
                         | group by l_orderkey
                         |""".stripMargin)
  == Results ==
  !== Correct Answer - 15000 ==   == Gluten Answer - 15000 ==
   struct<>                       struct<>
  ![1,3283,3283]                  [1,3283,-57773257175858473]
  ![100,3159,3159]                [100,3159,0]
  ![10016,321,321]                [10016,321,1738]
  ![10017,1645,1645]              [10017,1645,1172]
  ![10018,3238,3238]              [10018,3238,3072]
  ![10019,3435,3435]              [10019,3435,3241]
  ![10020,2132,2132]              [10020,2132,1321]
......

@zml1206
Contributor Author

zml1206 commented Jan 19, 2026

This seems to be a new case of result mismatch, likely a problem with Velox. Do you know the reason? @rui-mo @jinchengchenghh @zhli1142015
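
The mismatch is easier to see when the query's invariant is spelled out: `l_partkey1` is only an alias of `l_partkey`, so the two sums must be equal for every `l_orderkey`. Below is a minimal, self-contained Scala sketch of that invariant. It does not use Gluten or Velox; `LineItem`, `DuplicateProjectSum`, and the sample rows are hypothetical names and values made up for illustration, not TPC-H data.

```scala
// Hypothetical stand-in for a lineitem row; only the two columns the query uses.
final case class LineItem(orderKey: Long, partKey: Long)

object DuplicateProjectSum {
  // Mirrors the inner SELECT: l_partkey is projected twice (l_partkey, l_partkey1).
  def project(rows: Seq[LineItem]): Seq[(Long, Long, Long)] =
    rows.map(r => (r.orderKey, r.partKey, r.partKey))

  // Mirrors the outer SELECT: group by l_orderkey and sum both projected columns.
  def aggregate(projected: Seq[(Long, Long, Long)]): Map[Long, (Long, Long)] =
    projected.groupBy(_._1).map { case (key, group) =>
      (key, (group.map(_._2).sum, group.map(_._3).sum))
    }

  def main(args: Array[String]): Unit = {
    // Made-up sample rows, not taken from the real lineitem table.
    val rows = Seq(LineItem(1L, 10L), LineItem(1L, 20L), LineItem(100L, 5L))
    val result = aggregate(project(rows))
    // Both sums read the same source column, so they must match in every group.
    assert(result.values.forall { case (s1, s2) => s1 == s2 })
    println(result.toSeq.sortBy(_._1).mkString("\n"))
  }
}
```

In the Gluten output above, the third column breaks this invariant in every row shown, while the second column agrees with the expected answer.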

@rui-mo
Contributor

rui-mo commented Jan 19, 2026

@zml1206 Thanks for reporting! I need to debug this first to identify the cause. Please let me know if you’re already working on it.

@zml1206
Contributor Author

zml1206 commented Jan 20, 2026

> @zml1206 Thanks for reporting! I need to debug this first to identify the cause. Please let me know if you’re already working on it.

I'm not familiar with the Velox aggregation code and I'm currently stuck. I hope you can help take a look. Thanks @rui-mo

@rui-mo
Contributor

rui-mo commented Jan 20, 2026

@zml1206 Got it. I'll take a look.

@rui-mo
Contributor

rui-mo commented Jan 21, 2026

@zml1206 I can reproduce this, and am taking a look, thanks.

@rui-mo
Contributor

rui-mo commented Jan 23, 2026

I implemented a fix in Velox: facebookincubator/velox#16108.

@zml1206
Contributor Author

zml1206 commented Jan 24, 2026

> I implemented a fix in Velox: facebookincubator/velox#16108.

Thank you very much @rui-mo

liuneng1994 pushed a commit to liuneng1994/gluten that referenced this pull request Feb 5, 2026
Upstream Velox's New Commits:
37d40f65c by Mario Ruiz, test: Add EnsureWritableVectorTest for lazy vector (#13298)
6bdc4e569 by duanmeng, feat: Zip the sorting column indices and compare flags (#13269)
c96d12c1d by Kevin Wilfong, fix: Add check for overflow in Presto's from_unixtime (#13262)
cde4cdb2a by Kevin Wilfong, fix: Casting complex types is only supposed to cast recursively if a child's type will change (#13245)
a3ac7d43e by arnavb, feat: Add support for leftSemiProject join in nested loop join (#12172)
8b0505d35 by Xiaoxuan Meng, feat: Add barrier support for unnest operator (#13293)
28579fa78 by Ke Wang, feat: Add total used bytes to stats reporter (#13267)
56e1ea3ae by Mario Ruiz, fix(type): Fix use after free in BaseVector (#13277)
e4e81e761 by aditi-pandit, refactor: TopNRowNumber::getOutputFromMemory loop (apache#11440)
06b99c77d by Xiaoxuan Meng, feat: Add task barrier support for streaming aggregation (#13273)
7c0f17307 by Jimmy Lu, fix: Optimize selective ARRAY and MAP reader (#13240)
fccb60d2a by Jialiang Tan, misc: Add comment to reclaim by abort (#13284)
f1e712ea4 by Kien Nguyen, feat(type): Add NOISY_COUNT_IF_GAUSSIAN (#13230)
39c08b939 by Xiaoxuan Meng, misc: Remove the legacy task cursor code (#13282)

Signed-off-by: GlutenPerfBot <glutenperfbot@intel-internal.com>
Co-authored-by: GlutenPerfBot <glutenperfbot@intel-internal.com>
@zml1206
Contributor Author

zml1206 commented Feb 27, 2026

@rui-mo Thanks very much. UT validation passed, and the rule PullOutDuplicateProject no longer seems to be needed; I've submitted a new PR to remove it.

@zml1206 zml1206 closed this Feb 27, 2026
@rui-mo
Contributor

rui-mo commented Feb 27, 2026

@zml1206 Would you like to add the case in this PR as a unit test in Gluten?

@zml1206 zml1206 reopened this Feb 28, 2026
@zml1206 zml1206 changed the title from "[DNM] Test agg" to "[VL][TEST] Add test for aggregate with duplicate project" Feb 28, 2026
@zml1206
Contributor Author

zml1206 commented Feb 28, 2026

> @zml1206 Would you like to add the case in this PR as a unit test in Gluten?

Of course, added.

@rui-mo rui-mo merged commit 3c81799 into apache:main Mar 1, 2026
61 checks passed

2 participants