Convert the Druid planner to use statement handlers by paul-rogers · Pull Request #12905 · apache/druid

paul-rogers · 2022-08-16T06:33:40Z

Druid has traditionally supported just one kind of SQL statement: SELECT. The planner was thus designed to process "a query", and an ever-increasing amount of conditional code was added to support other statements such as INSERT and REPLACE. As we look toward adding DDL statements, the current approach will become unworkable. Other SQL products introduce an additional layer to handle statement types: the statement handler. This PR adds statement handlers to Druid.

This PR builds on the single-pass planner PR to heavily refactor the Druid planner to split statement-specific code into a set of statement-specific handler classes. All handlers implement a simple interface:

interface SqlStatementHandler
{
  SqlNode sqlNode();
  void validate() throws ValidationException;
  Set<ResourceAction> resourceActions();
  void prepare();
  PrepareResult prepareResult();
  PlannerResult plan() throws ValidationException;
}

The details of what is needed for each statement is a (complex) implementation detail of the handler classes.

At present, all the SQL statements which Druid supports include a SELECT: EXPLAIN, INSERT, REPLACE and, of course, SELECT itself. To reflect this fact, a base QueryHandler class handles the common aspects. As we add other statements (such as DDL), completely new handlers will handle those cases.

For the most part, the code is identical between master and this PR, but the code is heavily refactored and shifted around.

This PR is a step toward modifying the SQL validator to handle the newer INSERT and REPLACE nodes. The validation logic for these two statements that migrated to handers in this PR will migrate again into a Druid version of the Calcite validator.

This PR is a redo of an earlier one that did this same work. This version incorporates the many planner changes done recently.

This PR has:

been self-reviewed.
added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
been tested in a test Druid cluster.

abhishekagarwal87 · 2022-08-22T08:37:07Z

+    this.ingestionGranularity = insertNode.getPartitionedBy();
+  }
+
+  protected static SqlNode convertQuery(DruidSqlIngest sqlNode) throws ValidationException


I feel that the DruidSqlIngest abstraction will become limiting as query conversion of an INSERT and REPLACE diverges in future. Do we really need the Ingest abstraction?

The class seems justified by the 180 lines of code in this class that would otherwise be copy/paste duplicated across the INSERT and REPLACE handlers. This is a DIY argument.

Another argument is that REPLACE and INSERT are both forms of "ingest" in Druid: they funnel down to the same MSQ engine. There difference is mainly in how they treat existing data (INSERT adds to it, REPLACE replaces it.)

If the two statements diverge, then we've got some user experience issues to deal with. (Why do I need to learn two different ways to do basically the same thing?) But, if it does occur, we simply copy the once-common code into the two handlers, and change it in one or both places.

Does this answer the question about whether this abstraction is needed?

Thanks for the explanation. btw I meant that the interval implementation of INSERT vs REPLACE can diverge (e.g. validation logic is a bit different, error messages thrown could be different, etc.) But I agree that code reuse is significant.

paul-rogers · 2022-09-07T20:45:26Z

@abhishekagarwal87, thank you for your review. This commit addresses the issues you raised.

Also rebased this PR on the latest master to resolve a merge conflict. A line in DruidPlanner changed, and that line now lives in a handler file, so copied the change to that handler file.

abhishekagarwal87 · 2022-09-12T15:43:23Z

+    this.ingestionGranularity = insertNode.getPartitionedBy();
+  }
+
+  protected static SqlNode convertQuery(DruidSqlIngest sqlNode) throws ValidationException


Thanks for the explanation. btw I meant that the interval implementation of INSERT vs REPLACE can diverge (e.g. validation logic is a bit different, error messages thrown could be different, etc.) But I agree that code reuse is significant.

Converts the large collection of if-statements for statement types into a set of classes: one per supported statement type. Cleans up a few error messages.

paul-rogers · 2022-09-15T23:25:27Z

@abhishekagarwal87, thanks for the clarification. I agree that the analysis of INSERT and REPLACE will differ. That difference will appear in the next step: when we move the analysis out of these planner classes and into the Calcite planner. This particular PR sets us up to make the shift of the analysis code out of the various ad-hoc places where it now resides into a customized validator. (And, we want to do that so that the validation can use catalog information available via the Calcite metadata mechanism.)

Thanks for the approval. There was one build failure due to a race condition fixed in a PR a few days back. Rebased on the latest master to pick up that change and @imply-cheddar's change. Resolved the conflict with @imply-cheddar's change. Let's verify that the build works.

Forces a rebuild due to a flaky test

paul-rogers · 2022-09-17T22:25:35Z

Rebased on the recent QueryResponse change. Added type parameters to avoid IDE warnings.

paul-rogers · 2022-09-18T03:59:20Z

@abhishekagarwal87, the build is now clean except for one flaky IT, described in issue #13112. Let me know if you believe this IT to be actually flaky, or if we should investigate the failure (and rerun the build) before we commit this change.

abhishekagarwal87 · 2022-09-19T06:28:59Z

I have merged the PR. Thank you @paul-rogers.

clintropolis added Area - SQL Refactoring labels Aug 16, 2022

paul-rogers force-pushed the 220815-handler branch 2 times, most recently from efddda1 to a0a76bd Compare August 20, 2022 01:26

abhishekagarwal87 reviewed Aug 22, 2022

View reviewed changes

paul-rogers force-pushed the 220815-handler branch from a0a76bd to 4221071 Compare September 7, 2022 20:43

paul-rogers force-pushed the 220815-handler branch from 7807d9b to 90811a6 Compare September 11, 2022 21:12

abhishekagarwal87 approved these changes Sep 12, 2022

View reviewed changes

paul-rogers added 4 commits September 15, 2022 16:04

Converted Druid planner to use statement handlers

c1ef064

Converts the large collection of if-statements for statement types into a set of classes: one per supported statement type. Cleans up a few error messages.

Revisions from review comments

1a36224

Build fix

f27b9f6

Build fix

e085f50

Resolve merge confict.

3fb24c0

paul-rogers force-pushed the 220815-handler branch from 90811a6 to 3fb24c0 Compare September 15, 2022 23:27

paul-rogers added 2 commits September 16, 2022 10:58

More merges with QueryResponse PR

f4e2f14

More parameterized type cleanup

33ba174

Forces a rebuild due to a flaky test

abhishekagarwal87 merged commit 8ce03eb into apache:master Sep 19, 2022

kfaraz added this to the 25.0 milestone Nov 22, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Convert the Druid planner to use statement handlers#12905

Convert the Druid planner to use statement handlers#12905
abhishekagarwal87 merged 7 commits intoapache:masterfrom
paul-rogers:220815-handler

paul-rogers commented Aug 16, 2022 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

abhishekagarwal87 Aug 22, 2022

Uh oh!

paul-rogers Sep 7, 2022 •

edited

Loading

Uh oh!

abhishekagarwal87 Sep 12, 2022

Uh oh!

paul-rogers commented Sep 7, 2022

Uh oh!

abhishekagarwal87 Sep 12, 2022

Uh oh!

paul-rogers commented Sep 15, 2022

Uh oh!

paul-rogers commented Sep 17, 2022

Uh oh!

paul-rogers commented Sep 18, 2022

Uh oh!

abhishekagarwal87 commented Sep 19, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

paul-rogers commented Aug 16, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

abhishekagarwal87 Aug 22, 2022

Choose a reason for hiding this comment

Uh oh!

paul-rogers Sep 7, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

abhishekagarwal87 Sep 12, 2022

Choose a reason for hiding this comment

Uh oh!

paul-rogers commented Sep 7, 2022

Uh oh!

abhishekagarwal87 Sep 12, 2022

Choose a reason for hiding this comment

Uh oh!

paul-rogers commented Sep 15, 2022

Uh oh!

paul-rogers commented Sep 17, 2022

Uh oh!

paul-rogers commented Sep 18, 2022

Uh oh!

abhishekagarwal87 commented Sep 19, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

paul-rogers commented Aug 16, 2022 •

edited

Loading

paul-rogers Sep 7, 2022 •

edited

Loading