[fix](having) revert 15143 and fix having clause with multi-conditions #15745
List<Expr> havingSlots = Lists.newArrayList();
havingClause.collect(SlotRef.class, havingSlots);
for (Expr expr : havingSlots) {
    if (excludeAliasSMap.get(expr) == null) {
It looks like the SQL "SELECT id, v1-2 as v, sum(v2) v2 FROM test_having_alias_tb GROUP BY id,v having(v2>1);" still reports the same error. Please double-check whether this is the expected behavior of this PR.
This behavior is the same as before PR 15143: here we resolve the column name first, so the query reports an error.
- Before 15143, we resolved the column name first. (So this SQL failed.)
- After 15143, we resolved the column name first for columns inside GROUP BY; for other columns, we resolved the alias name first. (So the SQL succeeded.)
- Now we revert 15143 to keep the logic simple: always resolve the column name first. (So the SQL fails.)
If we change having(v2>1) to having(sum(v2)>1), the SQL will succeed.
Later we will add a config to choose between alias-name-first and column-name-first resolution: #15748
LGTM
xy720
left a comment
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
(#15745)
Describe your changes.
First, the HAVING clause rules in MySQL are very complex and hard to follow completely, so we revert PR 15143 to keep the logic the same as before.
Second, the original implementation has a problem when the HAVING clause contains multiple conditions.
For example:
case1: here v2 inside the HAVING clause resolves to the table column test_having_alias_tb.v2
SELECT id, v1-2 as v, sum(v2) v2 FROM test_having_alias_tb GROUP BY id,v having(v2>1);
ERROR 1105 (HY000): errCode = 2, detailMessage = HAVING clause not produced by aggregation output (missing from GROUP BY clause?): (`v2` > 1)
case2: here v2 inside the HAVING clause resolves to the alias v2 = sum(test_having_alias_tb.v2); the additional condition changes how v2 is resolved.
SELECT id, v1-2 as v, sum(v2) v2 FROM test_having_alias_tb GROUP BY id,v having(v>0 AND v2>1) ORDER BY id,v;
+------+------+------+
| id   | v    | v2   |
+------+------+------+
|    2 |    1 |    3 |
+------+------+------+
So here we try to make the HAVING clause rules simple:
Rule1: if an alias name inside the HAVING clause is the same as a column name, we use the column name, not the alias;
Rule2: if an alias name inside the HAVING clause does not match any column name, we use the alias;
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
Proposed changes
Issue Number: close #xxx
Problem summary
Describe your changes.
First, the HAVING clause rules in MySQL are very complex and hard to follow completely, so we revert PR 15143 to keep the logic the same as before.
Second, the original implementation has a problem when the HAVING clause contains multiple conditions.
For example:
case1: here v2 inside the HAVING clause resolves to the table column test_having_alias_tb.v2
SELECT id, v1-2 as v, sum(v2) v2 FROM test_having_alias_tb GROUP BY id,v having(v2>1);
ERROR 1105 (HY000): errCode = 2, detailMessage = HAVING clause not produced by aggregation output (missing from GROUP BY clause?): (`v2` > 1)
case2: here v2 inside the HAVING clause resolves to the alias v2 = sum(test_having_alias_tb.v2); the additional condition changes how v2 is resolved.
SELECT id, v1-2 as v, sum(v2) v2 FROM test_having_alias_tb GROUP BY id,v having(v>0 AND v2>1) ORDER BY id,v;
+------+------+------+
| id   | v    | v2   |
+------+------+------+
|    2 |    1 |    3 |
+------+------+------+
So here we try to make the HAVING clause rules simple:
Rule1: if an alias name inside the HAVING clause is the same as a column name, we use the column name, not the alias;
Rule2: if an alias name inside the HAVING clause does not match any column name, we use the alias;
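The two rules can be read as a small lookup order. The sketch below is only an illustration of that order and is not the code from this PR: the class HavingNameResolver, its resolve method, and the Target enum are hypothetical names, and the column set {id, v1, v2} for test_having_alias_tb is an assumption inferred from the example queries. It also ignores case sensitivity and qualified names.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration of Rule1/Rule2; not part of the Doris code base. */
public class HavingNameResolver {

    /** What a bare name inside the HAVING clause is bound to. */
    public enum Target { TABLE_COLUMN, SELECT_ALIAS, UNKNOWN }

    /**
     * Resolves a name used inside HAVING.
     *
     * @param name          identifier as written in the HAVING clause
     * @param tableColumns  column names of the underlying table (assumed known)
     * @param selectAliases alias name to aliased expression, taken from the SELECT list
     */
    public static Target resolve(String name,
                                 Set<String> tableColumns,
                                 Map<String, String> selectAliases) {
        // Rule1: a table column wins, even when a SELECT alias has the same name.
        if (tableColumns.contains(name)) {
            return Target.TABLE_COLUMN;
        }
        // Rule2: otherwise fall back to the alias, if there is one.
        if (selectAliases.containsKey(name)) {
            return Target.SELECT_ALIAS;
        }
        return Target.UNKNOWN;
    }

    public static void main(String[] args) {
        // Assumed schema of test_having_alias_tb, inferred from the queries above.
        Set<String> columns = Set.of("id", "v1", "v2");
        // SELECT id, v1-2 AS v, sum(v2) v2 ...
        Map<String, String> aliases = Map.of("v", "v1 - 2", "v2", "sum(v2)");

        System.out.println("v2 -> " + resolve("v2", columns, aliases)); // TABLE_COLUMN
        System.out.println("v  -> " + resolve("v", columns, aliases));  // SELECT_ALIAS
    }
}
```

Under this reading, v2 in having(v2>1) binds to the raw column rather than to sum(v2), which is consistent with the error reported in case1, while v binds to the alias for v1-2.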