Window planning: use collation traits, improve subquery logic. #13902

Merged
gianm merged 9 commits into apache:master from gianm:window-updates
Mar 9, 2023

Conversation

Contributor
@gianm commented Mar 8, 2023

SQL changes:

  1. Attach a RelCollation (sorting) trait to any PartialDruidQuery
    that ends in AGGREGATE or AGGREGATE_PROJECT. This allows planning to
    take advantage of the fact that Druid sorts by dimensions when
    doing aggregations. (A rough illustrative sketch follows after this list.)

  2. Windowing: inspect RelCollation trait from input, and insert naiveSort
    if, and only if, necessary.

  3. Windowing: add support for Project after Window, when the Project
    is a simple mapping. Helps eliminate subqueries.

  4. DruidRules: update the logic for considering subqueries to reflect that
    subqueries are not required to be GroupBys, and that we now have a bunch
    of new Stages. Given all of this evolution, the old logic no longer
    made sense.

Native changes:

  1. Use merge sort (stable) rather than quicksort when sorting
    RowsAndColumns. Makes it easier to write test cases for plans that
    involve re-sorting the data.
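
To illustrate SQL change 1, here is a rough sketch of how a collation could be derived from a Calcite Aggregate. This is an illustrative assumption, not the actual PartialDruidQuery code, and it assumes Druid's default ascending ordering of grouping dimensions:

import java.util.ArrayList;
import java.util.List;
import org.apache.calcite.rel.RelCollation;
import org.apache.calcite.rel.RelCollations;
import org.apache.calcite.rel.RelFieldCollation;
import org.apache.calcite.rel.core.Aggregate;

// Hypothetical helper, for illustration only (not part of this patch).
class CollationSketch
{
  static RelCollation collationForAggregate(final Aggregate aggregate)
  {
    final List<RelFieldCollation> fields = new ArrayList<>();

    // In an Aggregate's output row type, the group keys occupy the first
    // getGroupCount() positions; Druid returns groupBy results sorted
    // ascending by each of those dimensions under the default ordering.
    for (int i = 0; i < aggregate.getGroupCount(); i++) {
      fields.add(new RelFieldCollation(i, RelFieldCollation.Direction.ASCENDING));
    }

    return RelCollations.of(fields);
  }
}

Attaching a collation like this to the trait set of a PartialDruidQuery ending in AGGREGATE or AGGREGATE_PROJECT is what lets later rules, such as the windowing logic in change 2, see that the input is already sorted.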

Comment thread on sql/src/main/java/org/apache/druid/sql/calcite/rel/Windowing.java (Fixed)
Comment on lines 1491 to 1497
plannerContext.setPlanningError(
"SQL query is a scan and requires order by on a datasource[%s], which is not supported.",
"SQL query requires order by on non-concrete datasource, which is not supported.",
dataSource
Contributor

You dropped the interpolation of the datasource value. In a complex query that does lots and lots of stuff, not having anything interpolated that tells you what you did wrong makes it completely impossible to actually fix your query.

Contributor Author

Oops. That was a mistake; you can tell because dataSource is still in the call. Added back the %s.

sortColumns.addAll(group.getOrdering());

// Add sorting and partitioning if needed.
if (!sortMatches(ImmutableList.copyOf(priorSortColumns), ImmutableList.copyOf(sortColumns))) {
Contributor

Why copy these into lists? Given that sortMatches is doing a prefix match anyway, you might as well just pass in two iterators.

Contributor Author

I changed it to use iterators.

Comment on lines +485 to +491
private static boolean sortMatches(
final List<ColumnWithDirection> priorSort,
final List<ColumnWithDirection> currentSort
)
{
return currentSort.size() <= priorSort.size() && currentSort.equals(priorSort.subList(0, currentSort.size()));
}
Contributor

This is a pretty garbage-heavy way of doing things, and it doesn't seem much simpler than just having two iterators and walking them together.

Contributor Author

Sure it's simpler, it's only one line vs, like, 5 😛

Anyway, I changed it.
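
For reference, an iterator-based prefix check along the lines suggested might look roughly like this (a sketch only, using java.util.Iterator; the merged code may differ in naming and details, and it assumes ColumnWithDirection has a value-based equals()):

// Sketch: returns true if currentSort is a prefix of priorSort, without
// copying either collection into a new list.
private static boolean sortMatches(
    final Iterator<ColumnWithDirection> priorSort,
    final Iterator<ColumnWithDirection> currentSort
)
{
  while (currentSort.hasNext()) {
    if (!priorSort.hasNext() || !currentSort.next().equals(priorSort.next())) {
      return false;
    }
  }
  return true;
}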

Comment on lines 224 to 226
// The ordering required for partitioning is actually not important for the semantics. However, it *is*
// important that it be consistent across the query. Because if the incoming data is sorted descending
// and we try to partition on an ascending sort, we will think the data is not sorted correctly
Contributor

I have a feeling this comment has gone stale with your changes?

Contributor Author

OK, it doesn't seem to be saying anything useful, so I deleted it.

Comment on lines +454 to +464
switch (fieldCollation.getDirection()) {
case ASCENDING:
case STRICTLY_ASCENDING:
direction = ColumnWithDirection.Direction.ASC;
break;

case DESCENDING:
case STRICTLY_DESCENDING:
direction = ColumnWithDirection.Direction.DESC;
break;
}
Contributor

If one of the fieldCollation values is using CLUSTERED direction, I think that this code will assume that the data is not sorted from that point forward and return? Is that intended behavior? If so, it would be a lot more explicit to have a case in the switch that covers it and returns retVal immediately instead of relying on a null value and then falling through.

Alternatively, we could generate an error message? I'm unsure why this would receive a CLUSTERED direction, so not sure which is correct.

Contributor Author

Adjusted it to return retVal immediately. I'm not sure if we'll ever see CLUSTERED order, but if we do, we should treat it as unsorted.
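
A sketch of the adjusted shape described above (assumed, not necessarily the exact merged diff; the rowSignature parameter is a hypothetical stand-in for however field indexes get mapped to column names):

// Build the sort columns implied by a RelCollation, stopping at the first
// field whose direction is neither ascending nor descending (e.g. CLUSTERED)
// and returning whatever prefix was accumulated up to that point.
static LinkedHashSet<ColumnWithDirection> sortColumnsFromCollation(
    final RelCollation collation,
    final RowSignature rowSignature
)
{
  final LinkedHashSet<ColumnWithDirection> retVal = new LinkedHashSet<>();

  for (RelFieldCollation fieldCollation : collation.getFieldCollations()) {
    final ColumnWithDirection.Direction direction;

    switch (fieldCollation.getDirection()) {
      case ASCENDING:
      case STRICTLY_ASCENDING:
        direction = ColumnWithDirection.Direction.ASC;
        break;

      case DESCENDING:
      case STRICTLY_DESCENDING:
        direction = ColumnWithDirection.Direction.DESC;
        break;

      default:
        // CLUSTERED or any other direction: treat the data as unsorted from here on.
        return retVal;
    }

    retVal.add(
        new ColumnWithDirection(rowSignature.getColumnName(fieldCollation.getFieldIndex()), direction)
    );
  }

  return retVal;
}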

Comment on lines +21 to +23
# Not correct: there should actually be results here. Therefore, currently, this test only verifies that the
# query is planned as expected, not that the results are correct.
expectedResults: []
Contributor

Is the test not failing? I would expect this test to be failing, saying that the results are different. If we want plan-only tests, I think I'd want the type to become operatorPlanValidation and then the test harness to only look at the expected operators and ignore the results.

Contributor Author

It is not failing. I am not sure why the results are empty. The native query looks good to me, so I figured something was wrong with the execution part. I looked into it a little, but wasn't able to figure out where the results were getting dropped. One difference here is that the base query type is scan, while all the other tests have a base query type of groupBy. Is that something that is supposed to work?

Contributor

Oooooh, yeah, scan doesn't work because it doesn't actually produce a good RowSignature for the ArrayListSegment (this is addressed some in #13773). So that's likely the culprit. Though, I would've expected that to generate an exception...

final LinkedHashSet<ColumnWithDirection> retVal = new LinkedHashSet<>();

for (RelFieldCollation fieldCollation : collation.getFieldCollations()) {
ColumnWithDirection.Direction direction;
Contributor

Nit: I think you can make this final.

@gianm gianm merged commit bf39b4d into apache:master Mar 9, 2023
@gianm gianm deleted the window-updates branch March 9, 2023 23:48
@clintropolis clintropolis added this to the 26.0 milestone Apr 10, 2023
// Scan cannot ORDER BY non-concrete datasources on _any_ column.
plannerContext.setPlanningError(
"SQL query is a scan and requires order by on a datasource[%s], which is not supported.",
"SQL query requires order by on non-concrete datasource [%s], which is not supported.",
Contributor

I was looking at this PR for some other reason. I don't think we should be using "concrete", since that concept is not well understood by users. As a user, it would not be clear to me what a non-concrete datasource means. What do you think?

Contributor Author
@gianm commented Apr 20, 2023

The message was removed anyway in #13965: https://github.com/apache/druid/pull/13965/files#r1145293604.

Although, TBH, I don't understand why it was removed. The comment "Since we are using a table data source and not a query data source now the isConcrete() check is not needed" doesn't really make sense to me. It's still possible for the dataSource to be non-concrete at this point in the code. I tried a test query that generates a non-concrete datasource at this point in the code, and the error you get is like this:

Time-ordering on scan queries is only supported for queries with segment specs of type MultipleSpecificSegmentSpec

It's from native execution, not from SQL planning, since the query does now pass the SQL planner. IMO in terms of clarity, it's even worse 😛

Test query is:

select __time as t, m1
from (select __time, m1 from druid.foo where __time >= timestamp '1970-01-01 00:00:00')
where (m1 in (select distinct m1 from druid.foo))
order by 1
limit 1

I think we could address this by restoring a check here, but instead of checking isConcrete, check isConcreteBased. That'd allow joins on concrete stuff, but not subqueries. Then for the message, we could do something like:

ORDER BY is only supported for __time, and only on regular tables (not subqueries)

Or, we could spend the time supporting order-by for all scans ✨
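
A rough sketch of the restored-check idea, purely as an assumption (the dataSourceAnalysis name and the bail-out mechanism are placeholders rather than the real code at that call site):

// Hypothetical check: only allow ORDER BY on scans whose base datasource is
// concrete-based (regular tables, possibly joined), i.e. not a subquery.
if (!dataSourceAnalysis.isConcreteBased()) {
  plannerContext.setPlanningError(
      "ORDER BY is only supported for __time, and only on regular tables (not subqueries)."
  );
  // ... then bail out of this planning path, however the surrounding code signals that.
}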

gianm added a commit to gianm/druid that referenced this pull request Apr 20, 2023
Further adjusts logic in DruidRules that was previously adjusted in apache#13902.
The reason for the original change was that the comment "Subquery must be
a groupBy, so stage must be >= AGGREGATE" was no longer accurate. Subqueries
do not need to be groupBy anymore; they can really be any type of query.
If I recall correctly, the change was needed for certain window queries
to be able to plan on top of Scan queries.

However, this impacts performance negatively, because it causes many
additional outer-query scenarios to be considered, which is expensive.

So, this patch updates the matching logic to consider fewer scenarios. The
skipped scenarios are ones where we expect that, for one reason or another,
it isn't necessary to consider a subquery.
gianm added a commit that referenced this pull request Apr 21, 2023
* SQL planning: Consider subqueries in fewer scenarios.

Further adjusts logic in DruidRules that was previously adjusted in #13902.
The reason for the original change was that the comment "Subquery must be
a groupBy, so stage must be >= AGGREGATE" was no longer accurate. Subqueries
do not need to be groupBy anymore; they can really be any type of query.
If I recall correctly, the change was needed for certain window queries
to be able to plan on top of Scan queries.

However, this impacts performance negatively, because it causes many
additional outer-query scenarios to be considered, which is expensive.

So, this patch updates the matching logic to consider fewer scenarios. The
skipped scenarios are ones where we expect that, for one reason or another,
it isn't necessary to consider a subquery.

* Remove unnecessary escaping.

* Fix test.
gianm added a commit to gianm/druid that referenced this pull request Apr 21, 2023
vogievetsky pushed a commit that referenced this pull request Apr 21, 2023