
Decoupled planning: improve join support#17039

Merged
abhishekagarwal87 merged 23 commits into apache:master from kgyrtkirk:decouple-joins-finalize-and-right
Sep 18, 2024

Conversation

@kgyrtkirk
Member

There were some problematic cases:

  • join branches were run with finalize=false instead of finalize=true, unlike normal subqueries
    • this inconsistency is not good, but fixing it is a bigger undertaking
  • the right-hand side of a join must always be a subquery, or a globally accessible datasource

To achieve the above:

  • operand indexes were needed for the upstream reltree nodes in the generator
  • source unwrapping now takes the join situation into account as well

this.vertexFactory = new PDQVertexFactory(plannerContext, rexBuilder);
}

static class DruidNodeStack extends Stack<DruidLogicalNode>
Contributor


Another option instead of maintaining another parallel Stack would be to add a mutable operandIndex field to DruidLogicalNode and walk the logical nodes assigning ids once before doing the translation.

I tend to think that mutating the object to have an id and then referencing that id is a bit less error-prone than the parallel Stack approach; is there a reason it cannot work?

Member Author


I've tried different approaches; DruidLogicalNodes are immutable and are supposed to describe a subtree. The index of a node within its parent is an external detail, so it doesn't feel right to add it to the node itself.

I've covered the 2 stacks with DruidNodeStack, which was better as it made it possible to introduce methods that made the logic more readable.

As an alternate approach, to attach the data more naturally, I've added an Entry class which contains the index and the DruidLogicalNode. I've retained the custom methods... but it did disturb some existing places a bit.
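A minimal sketch of the Entry-based stack described here, assuming the shape discussed above; all class and method names are illustrative stand-ins, not the actual Druid code:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch only: each stack slot pairs the (immutable) node with
// the operand index it occupies in its parent, so the node itself never has
// to be mutated to carry positional information.
class EntryStackSketch {
  static class Entry<T> {
    final T node;
    final int operandIndex; // position of `node` among its parent's inputs

    Entry(T node, int operandIndex) {
      this.node = node;
      this.operandIndex = operandIndex;
    }
  }

  static class NodeStack<T> {
    private final Deque<Entry<T>> stack = new ArrayDeque<>();

    void push(T node, int operandIndex) {
      stack.push(new Entry<>(node, operandIndex));
    }

    void pop() {
      stack.pop();
    }

    T peekNode() {
      return stack.peek().node;
    }

    int peekOperandIndex() {
      return stack.peek().operandIndex;
    }
  }
}
```

Because the index travels with the stack entry rather than the node, the same immutable subtree can legitimately appear under several parents (or under both sides of the same join) without ambiguity.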

Comment on lines +177 to +191
static JoinSupportTweaks analyze(DruidNodeStack stack)
{
  if (stack.size() < 2) {
    return NONE;
  }
  DruidLogicalNode possibleJoin = stack.get(stack.size() - 2);
  if (!(possibleJoin instanceof DruidJoin)) {
    return NONE;
  }
  if (stack.operandIndexStack.get(stack.size() - 1) == 1) {
    return RIGHT;
  } else {
    return LEFT;
  }
}
Contributor


I find this method to be a bit obtuse. I think it's trying to figure out what to put on the right hand side (prioritizing queries on the right?). But, == 1 in the final if doesn't really carry semantic meaning for me. It would be nice if you could either adjust the code to be a bit more descriptive in what it's doing or add comments that explain the magic numbers.

Member Author


Part of the problem was that a Join may have the same subtree on both its left and right sides, which mandated the need for the operandIndex.

By adding some useful methods to DruidNodeStack, these magic numbers and subtractions have more or less disappeared :)

I don't think JoinSupportTweaks will be a long-lived enum; I just haven't figured out yet how many more details I'll need later. I have a feeling that this will eventually become something like VertexOptions with a set of booleans/etc.
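A self-contained sketch of what that refactor could look like; every name here is a stand-in for the real Druid classes, and the helper names (parentNode, peekOperandIndex) are assumptions. The point is that the helpers hide the size()-2 arithmetic and the join's right input is referred to by a named constant instead of the magic number 1:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the actual Druid implementation.
class JoinSideSketch {
  enum NodeKind { JOIN, SCAN, AGGREGATE }
  enum JoinSide { NONE, LEFT, RIGHT }

  // by convention, a binary join's operand 0 is its left input, operand 1 its right
  static final int RIGHT_OPERAND_INDEX = 1;

  static class Entry {
    final NodeKind node;
    final int operandIndex;

    Entry(NodeKind node, int operandIndex) {
      this.node = node;
      this.operandIndex = operandIndex;
    }
  }

  static class NodeStack {
    private final List<Entry> entries = new ArrayList<>();

    void push(NodeKind node, int operandIndex) {
      entries.add(new Entry(node, operandIndex));
    }

    int size() {
      return entries.size();
    }

    // the node one below the top of the stack, i.e. the current node's parent
    NodeKind parentNode() {
      return entries.get(entries.size() - 2).node;
    }

    // index of the top node within its parent's operand list
    int peekOperandIndex() {
      return entries.get(entries.size() - 1).operandIndex;
    }
  }

  static JoinSide analyze(NodeStack stack) {
    if (stack.size() < 2) {
      return JoinSide.NONE;
    }
    if (stack.parentNode() != NodeKind.JOIN) {
      return JoinSide.NONE;
    }
    return stack.peekOperandIndex() == RIGHT_OPERAND_INDEX
        ? JoinSide.RIGHT
        : JoinSide.LEFT;
  }
}
```

With the helpers in place, analyze() reads as a direct statement of its intent: "if my parent is a join, am I its right input or its left one?"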

Comment on lines +87 to +88
EQUIV_PLAN_EXTRA_COLUMNS,
EQUIV_PLAN_CAST_MATERIALIZED_EARLIER;
Contributor


I think I know what EXTRA_COLUMNS means, but CAST_MATERIALIZED_EARLIER I'm unclear on. Can you add the comments for these like you have on the others?

+-------+----+
| dummy | |
| dummy | b |
| dummy | |
Contributor


I'm surprised by this. null tends to sort first rather than last. This makes me think that SQL compat mode has inverted null comparisons? More than that, I really don't understand why your changes here would've caused this?

Member Author


I wasn't taking a closer look, as it's a grouping on t1,t2, so it looked okay-ish :)
It's very strange, as even the native query is the same for the 2 variants; no idea why it's different.

diff
dev@decouple-joins:~/druid$ diff -Naur -U30 `f testGroupByWithLiteralInSubqueryGrouping*iq`
--- ./sql/src/test/quidem/org.apache.druid.sql.calcite.DecoupledPlanningCalciteQueryTest/testGroupByWithLiteralInSubqueryGrouping@NullHandling=sql.iq	2024-09-13 19:29:16.364090523 +0000
+++ ./sql/src/test/quidem/org.apache.druid.sql.calcite.DecoupledPlanningCalciteQueryTest/testGroupByWithLiteralInSubqueryGrouping@NullHandling=default.iq	2024-09-13 09:08:49.372218528 +0000
@@ -1,64 +1,64 @@
-# testGroupByWithLiteralInSubqueryGrouping@NullHandling=sql case-crc:a63c5a2f
+# testGroupByWithLiteralInSubqueryGrouping@NullHandling=default case-crc:a63c5a2f
 # quidem testcase reason: IMPROVED_PLAN
 !set debug true
 !set defaultTimeout 300000
 !set maxScatterGatherBytes 9223372036854775807
 !set plannerStrategy DECOUPLED
 !set sqlCurrentTimestamp 2000-01-01T00:00:00Z
 !set sqlQueryId dummy
 !set outputformat mysql
 !use druidtest:///
 SELECT 
    t1, t2
   FROM
    ( SELECT
      'dummy' as t1,
      CASE
        WHEN 
          dim4 = 'b'
        THEN dim4
        ELSE NULL
      END AS t2
      FROM
        numfoo
      GROUP BY
        dim4
    )
  GROUP BY
    t1,t2
 ;
 +-------+----+
 | t1    | t2 |
 +-------+----+
-| dummy | b  |
 | dummy |    |
+| dummy | b  |
 +-------+----+
 (2 rows)
 
 !ok
 LogicalProject(t1=['dummy'], t2=[$0])
   LogicalAggregate(group=[{0}])
     LogicalProject(t2=[CASE(=($0, 'b'), $0, null:VARCHAR)])
       LogicalAggregate(group=[{4}])
         LogicalTableScan(table=[[druid, numfoo]])
 
 !logicalPlan
 DruidProject(t1=['dummy'], t2=[$0], druid=[logical])
   DruidAggregate(group=[{0}], druid=[logical])
     DruidProject(t2=[CASE(=($0, 'b'), $0, null:VARCHAR)], druid=[logical])
       DruidAggregate(group=[{4}], druid=[logical])
         DruidTableScan(table=[[druid, numfoo]], druid=[logical])
 
 !druidPlan
 {
   "queryType" : "groupBy",
   "dataSource" : {
     "type" : "query",
     "query" : {
       "queryType" : "groupBy",
       "dataSource" : {
         "type" : "table",
         "name" : "numfoo"
       },
       "intervals" : {
         "type" : "intervals",

@abhishekagarwal87 abhishekagarwal87 merged commit d84d53c into apache:master Sep 18, 2024
@kfaraz kfaraz added this to the 31.0.0 milestone Oct 4, 2024
@kfaraz
Contributor

kfaraz commented Oct 4, 2024

Added to milestone 31 as this is required to fix compilation failures in the backport #17251

[ERROR] /Users/kfaraz/Workspace/github/druid/sql/src/test/java/org/apache/druid/sql/calcite/CalciteQueryTest.java:[7730,59] cannot find symbol
[ERROR]   symbol:   variable EQUIV_PLAN_EXTRA_COLUMNS
[ERROR]   location: class org.apache.druid.sql.calcite.DecoupledTestConfig.QuidemTestCaseReason
[ERROR] 

kfaraz pushed a commit to kfaraz/druid that referenced this pull request Oct 4, 2024
There were some problematic cases

  • join branches are run with finalize=false instead of finalize=true like normal subqueries
    • this inconsistency is not good - but fixing it is a bigger thing
  • ensure that right hand sides of joins are always subqueries - or accessible globally

To achieve the above:

  • operand indexes were needed for the upstream reltree nodes in the generator
  • source unwrapping now takes the join situation into account as well
kfaraz added a commit that referenced this pull request Oct 5, 2024
…) (#17251)

* SQL: Use regular filters for time filtering in subqueries. (#17173)
* RunWorkOrder: Account for two simultaneous statistics collectors. (#17216)
* DartTableInputSpecSlicer: Fix for TLS workers. (#17224)
* Upgrade avro - minor version (#17230)
* SuperSorter: Don't set allDone if it's already set. (#17238)
* Decoupled planning: improve join support (#17039)
---------
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>
Co-authored-by: Zoltan Haindrich <kirk@rxd.hu>
