
Planner test framework#12545

Closed
paul-rogers wants to merge 8 commits into apache:master from paul-rogers:220518-tester

Conversation

@paul-rogers
Contributor

@paul-rogers paul-rogers commented May 19, 2022

Description

Converted to draft temporarily as some of the generic changes were split out into separate PRs.

SQL planners are complex and difficult to test. They are a black box: SQL goes in, and results come out. From a testing perspective, a planner tends to be a box of magic that either works or doesn't. This PR provides a framework that turns classic black-box planner testing into a white-box experience by capturing and verifying multiple planner artifacts. The framework is inspired by the one used in Apache Impala.

The core idea is the “case” file which contains SQL statements, any required query context or options, and expected results, serialized as text. Expected results include the Calcite plan, the native query, the schema, the “physical plan” (such as it is in Druid), and so on.

Creating tests is easy. This PR includes a “starter set” captured from the various Calcite*QueryTest classes. To add a new one, add a test case with a query, add dummy values for the desired artifacts, and run the test. It will fail, of course. Now use your favorite diff tool to copy the “actual” results to the “expected” results file (assuming the results are correct). After that, rerun the tests after each planner change to ensure nothing changes (except whatever you specifically wanted to change).

Like the Calcite*QueryTest test classes, you can specify expected results when the query is run in a restricted "test" execution engine. Such test runs execute in a single process against in-memory segments, so they check query semantics, but they do not exercise the full server stack, nor do they give accurate performance characterizations. Druid has many different run modes (with or without vectorization, SQL-compatible nulls or not, various context variables). The test framework provides tools to work with these variations.

Tests of this sort are quite strict: change anything and the test will fail. This can be painful if you just want to change something and are not concerned about side-effects. It can, however, make changing the planner a breeze because you get extreme visibility into the exact impact your change has: no surprises.

See the README.md file for full details of the case file syntax, how to create a test, and how to convert existing test files.

Example

The following shows a test converted from CalciteQueryTest, including the various test case sections:

==============================================================
Converted from testTopNLimitWrapping()
=== case
Top n limit wrapping
=== SQL
SELECT dim1, COUNT(*)
FROM druid.foo
GROUP BY dim1
ORDER BY dim1 DESC
=== context
sqlOuterLimit=2
=== schema
dim1 VARCHAR
EXPR$1 BIGINT
=== plan
LogicalSort(sort0=[$0], dir0=[DESC], fetch=[2:BIGINT])
  LogicalAggregate(group=[{0}], EXPR$1=[COUNT()])
    LogicalProject(dim1=[$2])
      LogicalTableScan(table=[[druid, foo]])
=== native
{
  "queryType" : "topN",
  ...
=== results
["def",1]
["abc",1]

Multiple other sections are available: the above is one of the simpler cases.

Running Queries

The Calcite*QueryTest cases verify the native query and results. Verifying results is harder than one would expect: most tests run with two variations (SQL-compatible and classic null handling), which produce different results. Others use generators to run up to eight option variations for a total of 16 runs, with differing results. Adding results to the test framework is thus non-trivial and left for a second PR.

Since the new framework can’t check results, we are left with using both the old and new versions of converted tests. This is unfortunate: a goal for a second PR is to retire the “legacy” test files. A big bonus: someone had to laboriously write the Java code to build up native queries for those tests. In the future, the computer will do it for us.

PR Details

This section summarizes the many changes required to create the test framework.

Druid Planner State Capture

Planners want to be black boxes: query in, plan out. Tests want visibility into the internal states as described above. The planner is modified to allow a PlannerStateCapture instance which can record (capture) this internal state. By default, the planner uses a "no op" capture. Tests provide a capturing instance.
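The PlannerStateCapture name comes from this PR, but its actual methods aren't shown here; as a rough sketch of the idea (method names below are illustrative assumptions, not the real API):

```java
// Hedged sketch of the capture hook idea. The interface name comes from the
// PR; the method names here are illustrative assumptions.
interface PlannerStateCapture
{
  // Invoked with the Calcite logical plan once planning completes.
  default void captureLogicalPlan(String plan) {}

  // Invoked with the serialized native query handed to the execution layer.
  default void captureNativeQuery(String nativeQuery) {}
}

// The planner installs a no-op capture by default; tests substitute a
// recording instance like this one and inspect the fields afterward.
class RecordingCapture implements PlannerStateCapture
{
  String logicalPlan;
  String nativeQuery;

  @Override
  public void captureLogicalPlan(String plan)
  {
    this.logicalPlan = plan;
  }

  @Override
  public void captureNativeQuery(String nativeQuery)
  {
    this.nativeQuery = nativeQuery;
  }
}
```

The default-method trick keeps the no-op case free: the production planner passes a bare PlannerStateCapture and pays nothing.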

The code to parse SQL was repeated three times. Factored that into a function.

Added the ability to capture the internal "execution plan" or "physical plan" in the QueryMaker via a new explain() method. This method does everything except send the query to another node for execution. The various query makers are refactored to support the two paths (explain and execute). For a SELECT, the physical plan is just the native plan with a few extra fields sent to the Broker. For INSERT, it is whatever the particular INSERT extension chooses to provide.
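The explain/execute split described above can be sketched as follows (names and return types are illustrative assumptions, not the actual QueryMaker API):

```java
// Illustrative sketch of the two-path refactor: explain() does all the work
// of building the physical plan; execute() reuses it and adds only the
// submission step. Not the actual Druid QueryMaker interface.
abstract class QueryMakerSketch
{
  // Everything except sending the query for execution. For a SELECT this
  // would be the native plan plus the extra fields sent to the Broker.
  abstract String explain();

  // The execute path reuses explain()'s work, then submits the result.
  final String execute()
  {
    return submit(explain());
  }

  String submit(String physicalPlan)
  {
    return "submitted: " + physicalPlan;
  }
}
```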

Union Queries

As it turns out, SQL UNION ALL queries have no direct native query representation: they turn into a list of native queries. Since we want to serialize "the" native query, the framework creates a "fake" "union" native query type on the fly to represent unions. Ugly, but it works.

INSERT/REPLACE Statements

The INSERT and REPLACE support in the planner is a bit of a bolt-on. The planner knows how to work with SELECT statements only, including when they are wrapped in a DESCRIBE or INSERT/REPLACE. When exiting the Calcite logical plan, the INSERT and REPLACE nodes don't appear. The test framework "invents" nodes for these cases (in the test framework only) so that the logical plan makes sense.

Planner Config

The planner config class has grown to be a monster: it holds config not just for the planner, but also for the segment metadata caching layer. This PR resisted the temptation to fix that issue.

Instead, it fixes another issue: tests want to tinker with the planner config via the "options" setting. But the planner config is opaque: the only way to create one is via JSON deserialization. Modified the class to provide a builder for use in tests.

The builder is then used in place of the detailed code in the withOverrides method.
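The builder pattern in play here looks roughly like this (the field names below are illustrative stand-ins, not necessarily the actual PlannerConfig settings):

```java
// Hedged sketch of the builder added to the planner config. Field names are
// illustrative assumptions; the real PlannerConfig has many more settings.
class PlannerConfigSketch
{
  final int maxNumericInFilters;
  final boolean useApproximateTopN;

  private PlannerConfigSketch(Builder builder)
  {
    this.maxNumericInFilters = builder.maxNumericInFilters;
    this.useApproximateTopN = builder.useApproximateTopN;
  }

  static Builder builder()
  {
    return new Builder();
  }

  static class Builder
  {
    private int maxNumericInFilters = -1;
    private boolean useApproximateTopN = true;

    Builder maxNumericInFilters(int n)
    {
      this.maxNumericInFilters = n;
      return this;
    }

    Builder useApproximateTopN(boolean flag)
    {
      this.useApproximateTopN = flag;
      return this;
    }

    PlannerConfigSketch build()
    {
      return new PlannerConfigSketch(this);
    }
  }
}
```

A test can then override just the one setting it cares about and take defaults for the rest, which is exactly what a withOverrides-style method needs.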

Calcite Test Refactoring

ExpressionProcessing now allows a test to change the expression settings on the fly. This lets the test case declare, via the "options" section, how it wants to process expressions, allowing the variations to be tested in a single run.

Calcites has an escapeStringLiteral() method which escapes string literals, but it is a bit too aggressive and produces a hard-to-read string with all punctuation converted to Unicode escapes. A new method, escapeStringLiteralLenient(), does the lightest-possible escaping and results in much better output in the test cases.

The BaseCalciteTest class was refactored a bit to:

  • Use the new PlannerConfig.Builder to create the various planner configs rather than anonymous classes.
  • Add support for planner state capture while running tests. Does nothing by default; can be turned on to capture planner state when converting tests from Java to the new framework. (This capture is distinct from the existing "hook" mechanism to capture the native query while running the query.)

RootSchemaBuilder holds the code to build up Druid's Calcite root schema. It is refactored to a builder so that tests can add to the root schema.
CalciteTests is refactored to use the above builder.

Druid Planner Test

The new DruidPlannerTest class and its associated ".case" files hold tests converted from:

  • CalciteInsertDmlTest
  • CalciteArrayQueriesTest
  • CalciteCorrelatedQueryTest
  • CalciteMultiValueStringQueryTest
  • CalciteParameterQueryTest

This isn't all such tests, just enough to allow a particular project to proceed with confidence. More conversions can be done later. And, as noted, the test file does not currently run the queries: it only plans them and verifies the artifacts.

The other classes in sql/src/test/java/org/apache/druid/sql/calcite/tester/ support the test framework:

  • README.md contains instructions for using the test framework.
  • internals.md provides details, such as a reference for the test case file format.
  • PlannerFixture is a kitchen-sink class that encapsulates the code to set up a planner (and all its associated knick-knacks) to run a test. Other tests could use this class rather than doing the setup themselves, but that conversion is not done in this PR.
  • QueryTestCase represents one planner test case and its sections.
  • ActualResults records the actual results of a run, compares them with the expected results, and writes a report if there are errors.
  • TestCaseLoader is a simple parser for the test case file format.
  • Other classes are lower-level helpers.
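The `=== name` section markers shown in the example above lend themselves to a simple line-oriented parse; a minimal sketch of the idea (not the actual TestCaseLoader code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hedged sketch of a section parser for the ".case" format: a line starting
// with "=== " begins a new named section; following lines are its body.
// Illustrative only, not the real TestCaseLoader.
class SectionParser
{
  static Map<String, String> parse(String text)
  {
    Map<String, String> sections = new LinkedHashMap<>();
    String current = null;
    StringBuilder body = new StringBuilder();
    for (String line : text.split("\n", -1)) {
      if (line.startsWith("=== ")) {
        if (current != null) {
          sections.put(current, body.toString());
        }
        current = line.substring(4).trim();
        body.setLength(0);
      } else if (current != null) {
        body.append(line).append('\n');
      }
    }
    if (current != null) {
      sections.put(current, body.toString());
    }
    return sections;
  }
}
```

Note that the `====…` divider lines between cases contain no space after the equals signs, so a parser along these lines can distinguish them from section headers.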

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.

@imply-cheddar
Contributor

So great!

On the .case files, how do you feel about, instead of having it all be in one big, long, running file, each query be chunked into independent files? For example, the arrayQueries.case, would become a directory and under that directory you would have select_constant.sql for the query. select_constant.planner for the planner artifacts and select_constant.results for expected results?

This does create a proliferation of files, and reading files can be slow, so it runs the risk of slowing down tests (though, presently, the tests are reindexing all of the test data on every single test, which is probably more work than loading files that specify queries and results and stuff), but it also separates things a bit. Especially when we want to bundle in result verification, putting the expected results into the big flowing file can make that file rather unwieldy (especially if the expected result set is hundreds or thousands of lines long).

@paul-rogers
Contributor Author

@imply-cheddar, thanks for the kind words. I think you can turn it into a one-query-per-file setup without any changes. You need a way to get the contents of a resource directory, but then you should be home free. Look at what was done for the CalciteQueryTest cases: there were about 400, so I broke them down into files of about 25 each.

@paul-rogers
Contributor Author

paul-rogers commented Jun 4, 2022

@imply-cheddar, per your suggestion, a new commit extends the framework to capture results. Doing so wasn't quite as easy as it looked...

Took your suggestion to represent results as JSON arrays per row. Result comparison is textual unless an option is set to force Java-object comparison, which parses the JSON array into Java objects, and uses approximate equality for floating-point values. (The parsed Java objects are often not the same type as those produced in Druid results, so did some magic to make it work.)
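The approximate-equality part of that comparison is the interesting bit: after JSON parsing, a float that round-trips through text may differ in the last bits. A hedged sketch of the comparison idea (not the actual framework code; the epsilon is an assumption):

```java
// Illustrative sketch of tolerant result-value comparison: numbers compare
// approximately (relative epsilon), everything else compares exactly.
// Not the actual test-framework code.
class ResultCompare
{
  private static final double EPSILON = 1e-9;

  static boolean valuesMatch(Object expected, Object actual)
  {
    if (expected instanceof Number && actual instanceof Number) {
      double e = ((Number) expected).doubleValue();
      double a = ((Number) actual).doubleValue();
      // Relative tolerance, falling back to absolute near zero.
      return Math.abs(e - a) <= EPSILON * Math.max(1.0, Math.abs(e));
    }
    return expected == null ? actual == null : expected.equals(actual);
  }
}
```

Widening both sides through Number also papers over the "parsed Java objects are often not the same type" problem mentioned above, at least for numerics (an Integer from Jackson matching a Long from Druid, say).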

The results handle SQL-compatible nulls or not in the same run; no need to manually change properties to run one way, then the other. This double duty occurs only if -Ddruid.debug=true is provided; otherwise the tests run like any other tests, using whatever mode Travis provides.

The runs handle the "vectorize or not" cases also, as well as the join provider class that offers up to eight different ways to set options. This gives up to a total of 32 runs per test case.

When all is happy, and everything works the same way for every variation, the test case is pretty simple. But, when the native query (or results) differ depending on SQL-compatible setting, or other variations, there are ways to capture this variation.

You mentioned having a way to load data from an external file. That would be a good addition: have the JUnit test case use your framework to load the data, then run the tests using this mechanism.

@paul-rogers
Contributor Author

The various Calcite tests have changed since this work started. It is a bit tedious to update the test cases with those changes. I'll do so in a single go just before we're ready to commit this PR. For now, be aware that I'm tracking changes and am aware that folks are busy adding and changing tests.

@lgtm-com

lgtm-com Bot commented Jun 6, 2022

This pull request introduces 3 alerts when merging cc7f919 into 81c37c6 - view on LGTM.com

new alerts:

  • 3 for Inconsistent equals and hashCode

@paul-rogers
Contributor Author

Note: the LGTM warnings are due to an LGTM timing issue. The required suppressions are in the code, but won't actually be honored until the PR is committed to master.

@lgtm-com

lgtm-com Bot commented Jun 9, 2022

This pull request introduces 3 alerts when merging 9e57a53 into a3603ad - view on LGTM.com

new alerts:

  • 3 for Inconsistent equals and hashCode

@lgtm-com

lgtm-com Bot commented Jun 9, 2022

This pull request introduces 3 alerts when merging c02dd91 into a3603ad - view on LGTM.com

new alerts:

  • 3 for Inconsistent equals and hashCode

@paul-rogers paul-rogers mentioned this pull request Jun 10, 2022
@lgtm-com

lgtm-com Bot commented Jun 10, 2022

This pull request introduces 3 alerts when merging 8101ca0 into 1ace733 - view on LGTM.com

new alerts:

  • 3 for Inconsistent equals and hashCode

@paul-rogers paul-rogers marked this pull request as draft June 10, 2022 21:42
@paul-rogers
Contributor Author

Split this PR into smaller ones: Clean up query contexts and Remove null and empty fields from native queries. Once these two are approved and merged, I'll rebase this PR onto master, which will reduce the file count by about a third, leaving only the framework itself and its supporting changes.

@lgtm-com

lgtm-com Bot commented Jun 11, 2022

This pull request introduces 3 alerts when merging d5b8743 into 1ace733 - view on LGTM.com

new alerts:

  • 3 for Inconsistent equals and hashCode

@paul-rogers
Contributor Author

Hitting flaky tests in PrefetchableTextFilesFirehoseFactoryTest. See this issue.

@lgtm-com

lgtm-com Bot commented Jun 13, 2022

This pull request introduces 3 alerts when merging 7140b01 into 1ace733 - view on LGTM.com

new alerts:

  • 3 for Inconsistent equals and hashCode

@paul-rogers paul-rogers force-pushed the 220518-tester branch 2 times, most recently from f19ebe9 to f9d5f1d Compare June 16, 2022 22:52
@paul-rogers
Contributor Author

Split out the generic cleanup into two PRs which are now merged. Rebased this PR onto those changes so that reviewers need only consider the test-related revisions. There are a large number of files! Most are new: either test code to support the planner test framework, or actual test cases. The original JUnit tests are left in place, though they are now redundant: we can remove them in a later PR. The framework is self-testing: if it works, the tests pass. Please focus on the parts where the PR tinkers with existing classes.

@paul-rogers paul-rogers marked this pull request as ready for review June 16, 2022 22:57
@paul-rogers paul-rogers marked this pull request as draft June 23, 2022 16:50
Also cleans up various "Calcite test" helper classes.

Includes a large set of tests converted from the various
Calcite*QueryTest JUnit tests.
@paul-rogers
Contributor Author

Pulled out most of the clean-up changes and most of the converted tests to reduce cognitive load on reviewers. We'll add those bits back in as we convert and add tests.

@paul-rogers
Contributor Author

Moving to a private branch for now.
