SQL planning: Consider subqueries in fewer scenarios. #14123
gianm merged 3 commits into apache:master
Further adjusts logic in DruidRules that was previously adjusted in apache#13902. The reason for the original change was that the comment "Subquery must be a groupBy, so stage must be >= AGGREGATE" was no longer accurate. Subqueries do not need to be groupBy anymore; they can really be any type of query. If I recall correctly, the change was needed for certain window queries to be able to plan on top of Scan queries.

However, this impacts performance negatively, because it causes many additional outer-query scenarios to be considered, which is expensive.

So, this patch updates the matching logic to consider fewer scenarios. The skipped scenarios are ones where we expect that, for one reason or another, it isn't necessary to consider a subquery.
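To make the cost argument concrete, here is a rough, self-contained sketch of how a matching predicate controls the number of (inner stage, outer stage) pairs a planner has to consider. Everything in it is an assumption for illustration: the `Stage` enum is a simplified stand-in rather than Druid's actual stage list, and the "narrowed" condition (allow a non-aggregating inner query only under an outer window step) is a guess based on the window-over-Scan case mentioned above, not the logic this patch actually implements in DruidRules.

```java
import java.util.function.BiPredicate;

public class SubqueryMatchSketch {
    // Hypothetical, simplified stage ordering; NOT Druid's real enum.
    public enum Stage { SCAN, WHERE_FILTER, SELECT_PROJECT, AGGREGATE, HAVING_FILTER, SORT, WINDOW }

    // Old rule from the quoted comment: the subquery must be a groupBy,
    // so the inner stage must be >= AGGREGATE.
    public static final BiPredicate<Stage, Stage> GROUP_BY_ONLY =
        (inner, outer) -> inner.compareTo(Stage.AGGREGATE) >= 0;

    // Relaxed rule: any inner query may act as a subquery.
    public static final BiPredicate<Stage, Stage> ANY_INNER =
        (inner, outer) -> true;

    // Narrowed rule (assumed condition, for illustration only): keep the
    // groupBy case, and allow a non-aggregating inner query only when the
    // outer step is a window.
    public static final BiPredicate<Stage, Stage> NARROWED =
        (inner, outer) -> inner.compareTo(Stage.AGGREGATE) >= 0 || outer == Stage.WINDOW;

    // Counts how many (inner, outer) stage pairs a rule would let through.
    public static int countConsidered(BiPredicate<Stage, Stage> rule) {
        int count = 0;
        for (Stage inner : Stage.values()) {
            for (Stage outer : Stage.values()) {
                if (rule.test(inner, outer)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("groupBy-only: " + countConsidered(GROUP_BY_ONLY));
        System.out.println("any inner:    " + countConsidered(ANY_INNER));
        System.out.println("narrowed:     " + countConsidered(NARROWED));
    }
}
```

In this toy model the relaxed rule admits every pair, while the narrowed rule stays close to the groupBy-only count and still permits a window on top of a Scan. A real planner multiplies each admitted pair by further rule applications, which is why trimming the match set matters.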
imply-cheddar left a comment:
What's the reason for the various changes in the tests? They all seem to actually be better plans anyway (well, there's the addition of a filter on an effectively constant virtual column, which I hope primarily ends up as an always-true bitmap and effectively becomes a noop), so I don't think they are problematic. Just wondering if we know why they changed or if it's more of a "well, they changed, but not necessarily in a bad way, so consider it good"?
Commits:
* SQL planning: Consider subqueries in fewer scenarios.
* Remove unnecessary escaping.
* Fix test.