Convert Druid planner to use statement handlers #12637

Closed
paul-rogers wants to merge 10 commits into apache:master from paul-rogers:220611-handler

@paul-rogers
Contributor

Druid has traditionally supported just one kind of SQL statement: SELECT. The planner was thus designed to process "a query", and an ever-increasing amount of conditional code was added to support other statements such as INSERT and REPLACE. As we look toward adding DDL statements, the current approach will become unworkable. Other SQL products introduce an additional layer to handle statement types: the statement handler. This PR adds statement handlers to Druid.

This PR builds on the single-pass planner PR to heavily refactor the Druid planner to split statement-specific code into a set of statement-specific handler classes. All handlers implement a simple interface:

interface SqlStatementHandler
{
  void analyze() throws ValidationException;
  Set<ResourceAction> resourceActions();
  PrepareResult prepare() throws RelConversionException, ValidationException;
  PlannerResult plan() throws SqlParseException, ValidationException, RelConversionException;
}

What each statement needs is a (complex) implementation detail of its handler class.

At present, all the SQL statements which Druid supports include a SELECT: EXPLAIN, INSERT, REPLACE and, of course, SELECT itself. To reflect this fact, a base QueryHandler class handles the common aspects. As we add other statements (such as DDL), completely new handlers will handle those cases.
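The split between a shared base class and per-statement handlers can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: only the QueryHandler name comes from the description above, while StatementHandler, SelectHandler, InsertHandler, Kind, handlerFor, and the string-based "plans" and resource actions are hypothetical stand-ins for the real Druid types.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch; every name except QueryHandler is illustrative only.
final class HandlerSketch
{
  enum Kind { SELECT, INSERT }

  // Cut-down stand-in for SqlStatementHandler: strings replace Druid's
  // ResourceAction and PlannerResult, and checked exceptions are omitted.
  interface StatementHandler
  {
    Set<String> resourceActions();
    String plan();
  }

  // Common behavior for every statement that contains a SELECT.
  abstract static class QueryHandler implements StatementHandler
  {
    protected final String sourceTable;

    QueryHandler(String sourceTable)
    {
      this.sourceTable = sourceTable;
    }

    @Override
    public Set<String> resourceActions()
    {
      Set<String> actions = new LinkedHashSet<>();
      actions.add("READ " + sourceTable);
      return actions;
    }

    // Shared planning of the embedded SELECT.
    protected String planSelect()
    {
      return "scan(" + sourceTable + ")";
    }
  }

  static final class SelectHandler extends QueryHandler
  {
    SelectHandler(String sourceTable)
    {
      super(sourceTable);
    }

    @Override
    public String plan()
    {
      return planSelect();
    }
  }

  static final class InsertHandler extends QueryHandler
  {
    private final String targetTable;

    InsertHandler(String targetTable, String sourceTable)
    {
      super(sourceTable);
      this.targetTable = targetTable;
    }

    @Override
    public Set<String> resourceActions()
    {
      // INSERT needs the SELECT's read access plus write access to the target.
      Set<String> actions = super.resourceActions();
      actions.add("WRITE " + targetTable);
      return actions;
    }

    @Override
    public String plan()
    {
      return "insert(" + targetTable + ", " + planSelect() + ")";
    }
  }

  // The single remaining place that branches on statement type.
  static StatementHandler handlerFor(Kind kind, String sourceTable, String targetTable)
  {
    switch (kind) {
      case SELECT:
        return new SelectHandler(sourceTable);
      case INSERT:
        return new InsertHandler(targetTable, sourceTable);
      default:
        throw new IllegalArgumentException("Unsupported statement: " + kind);
    }
  }

  public static void main(String[] args)
  {
    StatementHandler handler = handlerFor(Kind.INSERT, "events", "rollup");
    System.out.println(handler.resourceActions());
    System.out.println(handler.plan());
  }
}
```

In this sketch the planner branches on statement type once, when it picks a handler. A new statement kind such as a DDL statement would implement the interface directly rather than extend QueryHandler.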

For the most part, the logic is identical between master and this PR; the code has been heavily refactored and moved around.

The key risk with this kind of change is that we break something. To catch any regression, this work was done in a private branch that also had the planner test framework. The planner artifacts (schema, logical plan, native query) were identical before and after the change. The various Calcite?QueryTest cases provide a lighter validation in this PR itself, since the planner framework is not yet in master, nor is it included in this PR.

Since this PR incorporates #12636, we should review and merge that PR first. I'll then rebase this one on the updated master, which will remove the common code and leave only the handler-related changes.


This PR has:

  • been self-reviewed.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious to an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.

Converts the large collection of if-statements for statement
types into a set of classes: one per supported statement type.
@paul-rogers paul-rogers marked this pull request as draft June 18, 2022 18:41
@paul-rogers
Contributor Author

Closing for now; too much parallel change. Will redo later.
