Skip to content

SQL: More straightforward handling of join planning.#9648

Merged
gianm merged 3 commits intoapache:masterfrom
gianm:joins-planner-fixes
Apr 9, 2020
Merged

SQL: More straightforward handling of join planning.#9648
gianm merged 3 commits intoapache:masterfrom
gianm:joins-planner-fixes

Conversation

@gianm
Copy link
Copy Markdown
Contributor

@gianm gianm commented Apr 8, 2020

Two changes that simplify how joins are planned:

  1. Stop using JoinProjectTransposeRule as a way of guiding subquery
    removal. Instead, add logic to DruidJoinRule that identifies removable
    subqueries and removes them at the point of creating a DruidJoinQueryRel.
    This approach reduces the size of the planning space and allows the
    planner to complete quickly.

  2. Remove rules that reorder joins. Not because of an impact on the
    planning time (it seems minimal), but because the decisions that the
    planner was making in the new tests were sometimes worse than the
    user-provided order. I think we'll need to go with the user-provided
    order for now, and revisit reordering when we can add more smarts to
    the cost estimator.

A third change updates numeric ExprEval classes to store their
value as a boxed type that corresponds to what it is supposed to be.
This is useful because it affects the behavior of "asString", and
is included in this patch because it is needed for the new test
"testInnerJoinTwoLookupsToTableUsingNumericColumnInReverse". This
test relies on CAST('6', 'DOUBLE') stringifying to "6.0" like an
actual double would.

Fixes #9646.

@gianm gianm added this to the 0.18.0 milestone Apr 8, 2020
Two changes that simplify how joins are planned:

1) Stop using JoinProjectTransposeRule as a way of guiding subquery
removal. Instead, add logic to DruidJoinRule that identifies removable
subqueries and removes them at the point of creating a DruidJoinQueryRel.
This approach reduces the size of the planning space and allows the
planner to complete quickly.

2) Remove rules that reorder joins. Not because of an impact on the
planning time (it seems minimal), but because the decisions that the
planner was making in the new tests were sometimes worse than the
user-provided order. I think we'll need to go with the user-provided
order for now, and revisit reordering when we can add more smarts to
the cost estimator.

A third change updates numeric ExprEval classes to store their
value as a boxed type that corresponds to what it is supposed to be.
This is useful because it affects the behavior of "asString", and
is included in this patch because it is needed for the new test
"testInnerJoinTwoLookupsToTableUsingNumericColumnInReverse". This
test relies on CAST('6', 'DOUBLE') stringifying to "6.0" like an
actual double would.

Fixes apache#9646.
@gianm gianm force-pushed the joins-planner-fixes branch from 5cf1e12 to b35fc89 Compare April 8, 2020 11:43
@jihoonson jihoonson added the Bug label Apr 8, 2020
Copy link
Copy Markdown
Member

@clintropolis clintropolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤘

this.literalSubConditions = literalSubConditions;
}

public ConditionAnalysis pushThroughLeftProject(final Project leftProject)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: javadocs might make these a bit friendlier (comments and javadoc are very nice up to this point)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good call. If you don't mind I'd like to add them in a follow up patch, because this patch is release-blocking.

return new ConditionAnalysis(
leftProject.getInput().getRowType().getFieldCount(),
equalitySubConditions
.stream()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of any specific examples off the top of my head, but I know we've seen a handful of performance issues using .stream(); I don't think anything needs to change right now, just commenting to increase the chances I'll remember this area as much as anything else as a potential area to look at to make small gains in planning time in the future.

I don't imagine these collections will be so large, so maybe isn't a very big deal, but then again could maybe add up depending how planning goes and how many times this stuff gets evaluated.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some benchmark results for stream (StreamBenchmark and HighlyNestedStreamBenchmark).

StreamBenchmark.flatMapIterator            avgt   10   9.584 ± 0.055  ms/op
StreamBenchmark.flatMapSequenceAccumulate  avgt   10   5.908 ± 0.041  ms/op
StreamBenchmark.flatMapSequenceYielder     avgt   10  22.976 ± 0.064  ms/op
StreamBenchmark.flatMapStream              avgt   10   6.249 ± 0.050  ms/op
StreamBenchmark.fluentIteratorToList       avgt   10  18.455 ± 0.254  ms/op
StreamBenchmark.sequenceToList             avgt   10  14.326 ± 0.662  ms/op
StreamBenchmark.streamToList               avgt   10  17.720 ± 0.173  ms/op
StreamBenchmark.sumIterator                avgt   10   2.648 ± 0.023  ms/op  <-- native for loop
StreamBenchmark.sumIteratorFlatMap         avgt   10   7.270 ± 0.095  ms/op
StreamBenchmark.sumSequence                avgt   10   6.654 ± 0.137  ms/op
StreamBenchmark.sumStream                  avgt   10   2.162 ± 0.048  ms/op
Benchmark                                              Mode  Cnt     Score    Error  Units
HighlyNestedStreamBenchmark.flatMapIterator            avgt   10  2413.571 ± 15.151  ms/op
HighlyNestedStreamBenchmark.flatMapSequenceAccumulate  avgt   10  1157.817 ± 13.575  ms/op
HighlyNestedStreamBenchmark.flatMapSequenceYielder     avgt   10  6308.657 ± 28.802  ms/op
HighlyNestedStreamBenchmark.flatMapStream              avgt   10   953.539 ±  6.755  ms/op
HighlyNestedStreamBenchmark.sumIteratorFlatMap         avgt   10  2129.499 ± 20.541  ms/op
HighlyNestedStreamBenchmark.sumNestedFor               avgt   10   297.307 ±  0.626  ms/op <-- native for loop
HighlyNestedStreamBenchmark.sumSequence                avgt   10  1503.816 ±  5.720  ms/op
HighlyNestedStreamBenchmark.sumStream                  avgt   10  1136.636 ± 12.353  ms/op

I did notice that stream is sometimes bad and these benchmarks were to see what makes it bad. Unfortunately, these benchmarks show that stream is pretty good for computing a sum, it can compete with even native for loop when the stream is not highly nested. For highly nested stream, it's worse than native for loop, but still best among others. For toList, stream seems worse than others which I'm not sure how much it matters.

I'm suspecting that stream could be very efficient in these benchmarks because the benchmark code could be easily inlined with stream. I was planning to do another benchmark with a map function which is complex enough to hinder inlining, but haven't done yet.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally my strategy has been to use whatever seems most readable for code that isn't performance critical (small collections, not called tons of times). And for situations where performance is critical then to benchmark.

* inlined and then the join must run on the Broker.
*/
static final double MULTIPLIER_JOIN_SUBQUERY = 1000000000;
static final double COST_JOIN_SUBQUERY = 1e5;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity where did these numbers come from? Experiments I guess?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly I made them up, but then verified through experiments that they achieve the plans that we want to achieve.


if (joinRel.getCondition().isA(SqlKind.LITERAL) && !joinRel.getCondition().isAlwaysFalse()) {
cost += CostEstimates.COST_JOIN_CROSS;
}
Copy link
Copy Markdown
Contributor

@suneet-s suneet-s Apr 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add a branch to set the cost to 0 if the joinCondition is a literal and is always false?

since a false join condition means nothing will match therefore you don't need to do work for either the left or the right hand side?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, the native query join handling isn't smart enough to totally eliminate a join that has a false condition. It will still evaluate the left and right hand sides if they are subqueries. And it will still walk through every row on the left hand side. So I think it is fair to keep considering these costs in the cost estimator.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah makes sense

if (child instanceof DruidRel) {
return (DruidRel<?>) child;
} else {
final RelSubset subset = (RelSubset) child;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do we know this will be a RelSubset? I couldn't trace that path down here

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of Calcite's planner tells me that the children will either be single rels or will be a subset of equivalent rels. So DruidRel and RelSubset are the two cases that can happen.

I hope I understand the planner correctly — I can't point to any specific code that guarantees what I am saying is true.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when I wrote this, I was wondering whether bindable parameters would change the node here somehow. The calcite query tests are pretty comprehensive around different types of JOINs and nested queries, that I feel pretty confident to agree with your understanding. Any chance you can test this with parameterized sql.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will look at adding some tests for this in the same followup as #9648 (comment).

Copy link
Copy Markdown
Contributor

@maytasm maytasm Apr 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"subset of equivalent rels" -> Does this imply that the cost of each RelNode will be equal? I saw that computeLeftRequiresSubquery can returns different depending on which child we pick from the list of RelList. This then result in a very very different cost since a child that results in requiring subquery will have very high cost and a child that doesnt will have a much lower cost.

More specifically, I saw two RelNode in the list. One RelNode has a filter = null and the other has "filter":{"type":"selector","dimension":"v","value":"xa","extractionFn":null}

Copy link
Copy Markdown
Contributor

@suneet-s suneet-s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanations. LGTM.

Don't consider any of my comments blockers - they're more just ramblings of a mad man :)


if (joinRel.getCondition().isA(SqlKind.LITERAL) && !joinRel.getCondition().isAlwaysFalse()) {
cost += CostEstimates.COST_JOIN_CROSS;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah makes sense

if (child instanceof DruidRel) {
return (DruidRel<?>) child;
} else {
final RelSubset subset = (RelSubset) child;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when I wrote this, I was wondering whether bindable parameters would change the node here somehow. The calcite query tests are pretty comprehensive around different types of JOINs and nested queries, that I feel pretty confident to agree with your understanding. Any chance you can test this with parameterized sql.

Copy link
Copy Markdown
Contributor

@jihoonson jihoonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@gianm gianm merged commit 75c543b into apache:master Apr 9, 2020
@gianm gianm deleted the joins-planner-fixes branch April 9, 2020 23:21
gianm added a commit to gianm/druid that referenced this pull request Apr 9, 2020
* SQL: More straightforward handling of join planning.

Two changes that simplify how joins are planned:

1) Stop using JoinProjectTransposeRule as a way of guiding subquery
removal. Instead, add logic to DruidJoinRule that identifies removable
subqueries and removes them at the point of creating a DruidJoinQueryRel.
This approach reduces the size of the planning space and allows the
planner to complete quickly.

2) Remove rules that reorder joins. Not because of an impact on the
planning time (it seems minimal), but because the decisions that the
planner was making in the new tests were sometimes worse than the
user-provided order. I think we'll need to go with the user-provided
order for now, and revisit reordering when we can add more smarts to
the cost estimator.

A third change updates numeric ExprEval classes to store their
value as a boxed type that corresponds to what it is supposed to be.
This is useful because it affects the behavior of "asString", and
is included in this patch because it is needed for the new test
"testInnerJoinTwoLookupsToTableUsingNumericColumnInReverse". This
test relies on CAST('6', 'DOUBLE') stringifying to "6.0" like an
actual double would.

Fixes apache#9646.

* Fix comments.

* Fix tests.
gianm added a commit to gianm/druid that referenced this pull request Apr 9, 2020
* SQL: More straightforward handling of join planning.

Two changes that simplify how joins are planned:

1) Stop using JoinProjectTransposeRule as a way of guiding subquery
removal. Instead, add logic to DruidJoinRule that identifies removable
subqueries and removes them at the point of creating a DruidJoinQueryRel.
This approach reduces the size of the planning space and allows the
planner to complete quickly.

2) Remove rules that reorder joins. Not because of an impact on the
planning time (it seems minimal), but because the decisions that the
planner was making in the new tests were sometimes worse than the
user-provided order. I think we'll need to go with the user-provided
order for now, and revisit reordering when we can add more smarts to
the cost estimator.

A third change updates numeric ExprEval classes to store their
value as a boxed type that corresponds to what it is supposed to be.
This is useful because it affects the behavior of "asString", and
is included in this patch because it is needed for the new test
"testInnerJoinTwoLookupsToTableUsingNumericColumnInReverse". This
test relies on CAST('6', 'DOUBLE') stringifying to "6.0" like an
actual double would.

Fixes apache#9646.

* Fix comments.

* Fix tests.
gianm added a commit to implydata/druid-public that referenced this pull request Apr 9, 2020
* SQL: More straightforward handling of join planning.

Two changes that simplify how joins are planned:

1) Stop using JoinProjectTransposeRule as a way of guiding subquery
removal. Instead, add logic to DruidJoinRule that identifies removable
subqueries and removes them at the point of creating a DruidJoinQueryRel.
This approach reduces the size of the planning space and allows the
planner to complete quickly.

2) Remove rules that reorder joins. Not because of an impact on the
planning time (it seems minimal), but because the decisions that the
planner was making in the new tests were sometimes worse than the
user-provided order. I think we'll need to go with the user-provided
order for now, and revisit reordering when we can add more smarts to
the cost estimator.

A third change updates numeric ExprEval classes to store their
value as a boxed type that corresponds to what it is supposed to be.
This is useful because it affects the behavior of "asString", and
is included in this patch because it is needed for the new test
"testInnerJoinTwoLookupsToTableUsingNumericColumnInReverse". This
test relies on CAST('6', 'DOUBLE') stringifying to "6.0" like an
actual double would.

Fixes apache#9646.

* Fix comments.

* Fix tests.
fjy pushed a commit that referenced this pull request Apr 10, 2020
* SQL: More straightforward handling of join planning.

Two changes that simplify how joins are planned:

1) Stop using JoinProjectTransposeRule as a way of guiding subquery
removal. Instead, add logic to DruidJoinRule that identifies removable
subqueries and removes them at the point of creating a DruidJoinQueryRel.
This approach reduces the size of the planning space and allows the
planner to complete quickly.

2) Remove rules that reorder joins. Not because of an impact on the
planning time (it seems minimal), but because the decisions that the
planner was making in the new tests were sometimes worse than the
user-provided order. I think we'll need to go with the user-provided
order for now, and revisit reordering when we can add more smarts to
the cost estimator.

A third change updates numeric ExprEval classes to store their
value as a boxed type that corresponds to what it is supposed to be.
This is useful because it affects the behavior of "asString", and
is included in this patch because it is needed for the new test
"testInnerJoinTwoLookupsToTableUsingNumericColumnInReverse". This
test relies on CAST('6', 'DOUBLE') stringifying to "6.0" like an
actual double would.

Fixes #9646.

* Fix comments.

* Fix tests.
JulianJaffePinterest pushed a commit to JulianJaffePinterest/druid that referenced this pull request Jun 12, 2020
* SQL: More straightforward handling of join planning.

Two changes that simplify how joins are planned:

1) Stop using JoinProjectTransposeRule as a way of guiding subquery
removal. Instead, add logic to DruidJoinRule that identifies removable
subqueries and removes them at the point of creating a DruidJoinQueryRel.
This approach reduces the size of the planning space and allows the
planner to complete quickly.

2) Remove rules that reorder joins. Not because of an impact on the
planning time (it seems minimal), but because the decisions that the
planner was making in the new tests were sometimes worse than the
user-provided order. I think we'll need to go with the user-provided
order for now, and revisit reordering when we can add more smarts to
the cost estimator.

A third change updates numeric ExprEval classes to store their
value as a boxed type that corresponds to what it is supposed to be.
This is useful because it affects the behavior of "asString", and
is included in this patch because it is needed for the new test
"testInnerJoinTwoLookupsToTableUsingNumericColumnInReverse". This
test relies on CAST('6', 'DOUBLE') stringifying to "6.0" like an
actual double would.

Fixes apache#9646.

* Fix comments.

* Fix tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SQL: Excessive time and memory use while planning joins on subqueries

5 participants