
Planner test framework#12545

Closed
paul-rogers wants to merge 8 commits into apache:master from paul-rogers:220518-tester

Conversation

@paul-rogers
Contributor

@paul-rogers paul-rogers commented May 19, 2022

Description

Converted to draft temporarily as some of the generic changes were split out into separate PRs.

SQL planners are complex and difficult to test. They are a black box: SQL goes in, and results come out. From a testing perspective, a planner tends to be a box of magic that either works or doesn't. This PR provides a framework that turns classic black-box planner testing into a white-box experience by capturing and verifying multiple planner artifacts. The framework is inspired by the one used in Apache Impala.

The core idea is the “case” file which contains SQL statements, any required query context or options, and expected results, serialized as text. Expected results include the Calcite plan, the native query, the schema, the “physical plan” (such as it is in Druid), and so on.

Creating tests is easy. This PR includes a “starter set” captured from the various Calcite*QueryTest classes. To add a new one, add a test case with a query, add dummy values for the desired artifacts, and run the test. It will fail, of course. Now use your favorite diff tool to copy the “actual” results to the “expected” results file (assuming the results are correct). After that, rerun the tests after each planner change to ensure nothing changes (except whatever you specifically wanted to change).

Like the Calcite*QueryTest test classes, you can specify expected results when the query is run in a restricted "test" execution engine. Such test runs execute in a single process against in-memory segments, so they check query semantics, but they do not exercise the full server stack, nor do they give accurate performance characterizations. Druid has many different run modes (with or without vectorization, SQL-compatible nulls or not, various context variables). The test framework provides tools to work with these variations.

Tests of this sort are quite strict: change anything and the test will fail. This can be painful if you just want to change something and are not concerned about side-effects. It can, however, make changing the planner a breeze because you get extreme visibility into the exact impact your change has: no surprises.

See the README.md file for full details of the case file syntax, how to create a test, and how to convert existing test files.

Example

The following shows a test converted from CalciteQueryTest, including the various test case sections:

==============================================================
Converted from testTopNLimitWrapping()
=== case
Top n limit wrapping
=== SQL
SELECT dim1, COUNT(*)
FROM druid.foo
GROUP BY dim1
ORDER BY dim1 DESC
=== context
sqlOuterLimit=2
=== schema
dim1 VARCHAR
EXPR$1 BIGINT
=== plan
LogicalSort(sort0=[$0], dir0=[DESC], fetch=[2:BIGINT])
  LogicalAggregate(group=[{0}], EXPR$1=[COUNT()])
    LogicalProject(dim1=[$2])
      LogicalTableScan(table=[[druid, foo]])
=== native
{
  "queryType" : "topN",
  ...
=== results
["def",1]
["abc",1]

Multiple other sections are available: the above is one of the simpler cases.

Running Queries

The Calcite*QueryTest cases verify the native query and results. Verifying results is harder than one would expect: most tests run with two variations (SQL-compatible and classic null handling), which produce different results. Others use generators to run up to eight option variations for a total of 16 runs, with differing results. Adding results to the test framework is thus non-trivial and left for a second PR.

Since the new framework can’t check results, we are left with using both the old and new versions of converted tests. This is unfortunate: a goal for a second PR is to retire the “legacy” test files. A big bonus: someone had to laboriously write the Java code to build up native queries for those tests. In the future, the computer will do it for us.

PR Details

This section summarizes the many changes required to create the test framework.

Druid Planner State Capture

Planners want to be black boxes: query in, plan out. Tests want visibility into the internal states as described above. The planner is modified to allow a PlannerStateCapture instance which can record (capture) this internal state. By default, the planner uses a "no op" capture. Tests provide a capturing instance.
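The PlannerStateCapture name comes from this PR, but its actual methods aren't shown here; as a rough sketch of the idea (method names below are illustrative assumptions, not the real API):

```java
// Hedged sketch of the capture hook idea. The interface name comes from the
// PR; the method names here are illustrative assumptions.
interface PlannerStateCapture
{
  // Invoked with the Calcite logical plan once planning completes.
  default void captureLogicalPlan(String plan) {}

  // Invoked with the serialized native query handed to the execution layer.
  default void captureNativeQuery(String nativeQuery) {}
}

// The planner installs a no-op capture by default; tests substitute a
// recording instance like this one and inspect the fields afterward.
class RecordingCapture implements PlannerStateCapture
{
  String logicalPlan;
  String nativeQuery;

  @Override
  public void captureLogicalPlan(String plan)
  {
    this.logicalPlan = plan;
  }

  @Override
  public void captureNativeQuery(String nativeQuery)
  {
    this.nativeQuery = nativeQuery;
  }
}
```

The default-method trick keeps the no-op case free: the production planner passes a bare PlannerStateCapture and pays nothing.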

The code to parse SQL was repeated three times. Factored that into a function.

Added the ability to capture the internal "execution plan" or "physical plan" in the QueryMaker via a new explain() method. This method does everything except send the query to another node for execution. The various query makers are refactored to support the two paths (explain and execute). For a SELECT, the physical plan is just the native plan with a few extra fields sent to the Broker. For INSERT, it is whatever the particular INSERT extension chooses to provide.
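The explain/execute split described above can be sketched as follows (names and return types are illustrative assumptions, not the actual QueryMaker API):

```java
// Illustrative sketch of the two-path refactor: explain() does all the work
// of building the physical plan; execute() reuses it and adds only the
// submission step. Not the actual Druid QueryMaker interface.
abstract class QueryMakerSketch
{
  // Everything except sending the query for execution. For a SELECT this
  // would be the native plan plus the extra fields sent to the Broker.
  abstract String explain();

  // The execute path reuses explain()'s work, then submits the result.
  final String execute()
  {
    return submit(explain());
  }

  String submit(String physicalPlan)
  {
    return "submitted: " + physicalPlan;
  }
}
```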

Union Queries

As it turns out, SQL UNION ALL queries have no direct native query representation: they turn into a list of native queries. Since we want to serialize "the" native query, the framework creates a "fake" "union" native query type on the fly to represent unions. Ugly, but it works.

INSERT/REPLACE Statements

The INSERT and REPLACE support in the planner is a bit of a bolt-on. The planner knows how to work with SELECT statements only, including when they are wrapped in a DESCRIBE or INSERT/REPLACE. When exiting the Calcite logical plan, the INSERT and REPLACE nodes don't appear. The test framework "invents" nodes for these cases (in the test framework only) so that the logical plan makes sense.

Planner Config

The planner config class has grown to be a monster: it holds config not just for the planner, but also for the segment metadata caching layer. This PR resisted the temptation to fix that issue.

Instead, it fixes another issue: tests want to tinker with the planner config via the "options" setting. But the planner config is opaque: the only way to create one is via JSON deserialization. Modified the class to provide a builder for use in tests.

The builder is then used in place of the detailed code in the withOverrides method.
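The builder pattern in play here looks roughly like this (the field names below are illustrative stand-ins, not necessarily the actual PlannerConfig settings):

```java
// Hedged sketch of the builder added to the planner config. Field names are
// illustrative assumptions; the real PlannerConfig has many more settings.
class PlannerConfigSketch
{
  final int maxNumericInFilters;
  final boolean useApproximateTopN;

  private PlannerConfigSketch(Builder builder)
  {
    this.maxNumericInFilters = builder.maxNumericInFilters;
    this.useApproximateTopN = builder.useApproximateTopN;
  }

  static Builder builder()
  {
    return new Builder();
  }

  static class Builder
  {
    private int maxNumericInFilters = -1;
    private boolean useApproximateTopN = true;

    Builder maxNumericInFilters(int n)
    {
      this.maxNumericInFilters = n;
      return this;
    }

    Builder useApproximateTopN(boolean flag)
    {
      this.useApproximateTopN = flag;
      return this;
    }

    PlannerConfigSketch build()
    {
      return new PlannerConfigSketch(this);
    }
  }
}
```

A test can then override just the one setting it cares about and take defaults for the rest, which is exactly what a withOverrides-style method needs.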

Calcite Test Refactoring

ExpressionProcessing now allows a test to change the expression settings on the fly. This lets the test case declare, via the "options" section, how it wants to process expressions, allowing the variations to be tested in a single run.

Calcites has an escapeStringLiteral() method which escapes string literals, but it is a bit too aggressive and produces a hard-to-read string with all punctuation converted to Unicode escapes. A new method, escapeStringLiteralLenient(), does the lightest-possible escaping and results in much better output in the test cases.

The BaseCalciteTest class was refactored a bit to:

  • Use the new PlannerConfig.Builder to create the various planner configs rather than anonymous classes.
  • Add support for planner state capture while running tests. Does nothing by default; can be turned on to capture planner state when converting tests from Java to the new framework. (This capture is distinct from the existing "hook" mechanism to capture the native query while running the query.)

RootSchemaBuilder holds the code to build up Druid's Calcite root schema. It is refactored to a builder so that tests can add to the root schema.
CalciteTests is refactored to use the above builder.

Druid Planner Test

The new DruidPlannerTest class and its associated ".case" files hold tests converted from:

  • CalciteInsertDmlTest
  • CalciteArrayQueriesTest
  • CalciteCorrelatedQueryTest
  • CalciteMultiValueStringQueryTest
  • CalciteParameterQueryTest

This isn't all such tests, just enough to allow a particular project to proceed with confidence. More conversions can be done later. And, as noted, the test file does not currently run the queries: it only plans them and verifies the artifacts.

The other classes in sql/src/test/java/org/apache/druid/sql/calcite/tester/ support the test framework:

  • README.md contains instructions for using the test framework.
  • internals.md provides details, such as a reference for the test case file format.
  • PlannerFixture is a kitchen-sink class that encapsulates the code to set up a planner (and all its associated knick-knacks) to run a test. Other tests could use this class rather than doing the setup themselves, but that conversion is not done in this PR.
  • QueryTestCase represents one planner test case and its sections.
  • ActualResults records the actual results of a run, compares them with the expected results, and writes a report if there are errors.
  • TestCaseLoader is a simple parser for the test case file format.
  • Other classes are lower-level helpers.
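The `=== name` section markers shown in the example above lend themselves to a simple line-oriented parse; a minimal sketch of the idea (not the actual TestCaseLoader code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hedged sketch of a section parser for the ".case" format: a line starting
// with "=== " begins a new named section; following lines are its body.
// Illustrative only, not the real TestCaseLoader.
class SectionParser
{
  static Map<String, String> parse(String text)
  {
    Map<String, String> sections = new LinkedHashMap<>();
    String current = null;
    StringBuilder body = new StringBuilder();
    for (String line : text.split("\n", -1)) {
      if (line.startsWith("=== ")) {
        if (current != null) {
          sections.put(current, body.toString());
        }
        current = line.substring(4).trim();
        body.setLength(0);
      } else if (current != null) {
        body.append(line).append('\n');
      }
    }
    if (current != null) {
      sections.put(current, body.toString());
    }
    return sections;
  }
}
```

Note that the `====…` divider lines between cases contain no space after the equals signs, so a parser along these lines can distinguish them from section headers.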

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.

@imply-cheddar
Contributor

So great!

On the .case files, how do you feel about, instead of having it all be in one big, long, running file, each query be chunked into independent files? For example, the arrayQueries.case, would become a directory and under that directory you would have select_constant.sql for the query. select_constant.planner for the planner artifacts and select_constant.results for expected results?

This does create a proliferation of files, and reading files can be slow, so it runs the risk of slowing down tests (though, presently, the tests are reindexing all of the test data on every single test, which is probably more work than loading files that specify queries and results and stuff), but it also separates things a bit. Especially when we want to bundle in result verification, putting the expected results into the big flowing file can make that file rather unwieldy (especially if the expected result set is hundreds or thousands of lines long).

@paul-rogers
Contributor Author

@imply-cheddar, thanks for the kind words. I think you can turn it into a one-query-per-file setup without any changes. You need a way to get the contents of a resource directory, but then you should be home free. Look at what was done for the CalciteQueryTest cases: there were about 400, so I broke them down into files of about 25 each.

@paul-rogers
Contributor Author

paul-rogers commented Jun 4, 2022

@imply-cheddar, per your suggestion, a new commit extends the framework to capture results. Doing so wasn't quite as easy as it looked...

Took your suggestion to represent results as JSON arrays per row. Result comparison is textual unless an option is set to force Java-object comparison, which parses the JSON array into Java objects, and uses approximate equality for floating-point values. (The parsed Java objects are often not the same type as those produced in Druid results, so did some magic to make it work.)
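The approximate-equality part of that comparison is the interesting bit: after JSON parsing, a float that round-trips through text may differ in the last bits. A hedged sketch of the comparison idea (not the actual framework code; the epsilon is an assumption):

```java
// Illustrative sketch of tolerant result-value comparison: numbers compare
// approximately (relative epsilon), everything else compares exactly.
// Not the actual test-framework code.
class ResultCompare
{
  private static final double EPSILON = 1e-9;

  static boolean valuesMatch(Object expected, Object actual)
  {
    if (expected instanceof Number && actual instanceof Number) {
      double e = ((Number) expected).doubleValue();
      double a = ((Number) actual).doubleValue();
      // Relative tolerance, falling back to absolute near zero.
      return Math.abs(e - a) <= EPSILON * Math.max(1.0, Math.abs(e));
    }
    return expected == null ? actual == null : expected.equals(actual);
  }
}
```

Widening both sides through Number also papers over the "parsed Java objects are often not the same type" problem mentioned above, at least for numerics (an Integer from Jackson matching a Long from Druid, say).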

The results handle SQL-compatible nulls or not in the same run; no need to manually change properties to run one way, then the other. This double duty occurs only if -Ddruid.debug=true is provided; otherwise the tests run like any other tests, using whatever mode Travis provides.

The runs handle the "vectorize or not" cases also, as well as the join provider class that offers up to eight different ways to set options. This gives up to a total of 32 runs per test case.

When all is happy, and everything works the same way for every variation, the test case is pretty simple. But, when the native query (or results) differ depending on SQL-compatible setting, or other variations, there are ways to capture this variation.

You mentioned having a way to load data from an external file. That would be a good addition: have the JUnit test case use your framework to load the data, then run the tests using this mechanism.

@paul-rogers
Contributor Author

The various Calcite tests have changed since this work started. It is a bit tedious to update the test cases with those changes. I'll do so in a single go just before we're ready to commit this PR. For now, be aware that I'm tracking changes and am aware that folks are busy adding and changing tests.

@lgtm-com

lgtm-com Bot commented Jun 6, 2022

This pull request introduces 3 alerts when merging cc7f919 into 81c37c6 - view on LGTM.com

new alerts:

  • 3 for Inconsistent equals and hashCode

@paul-rogers
Contributor Author

Note: the LGTM warnings are due to an LGTM timing issue. The required suppressions are in the code, but won't actually be honored until the PR is committed to master.

@lgtm-com

lgtm-com Bot commented Jun 9, 2022

This pull request introduces 3 alerts when merging 9e57a53 into a3603ad - view on LGTM.com

new alerts:

  • 3 for Inconsistent equals and hashCode

@lgtm-com

lgtm-com Bot commented Jun 9, 2022

This pull request introduces 3 alerts when merging c02dd91 into a3603ad - view on LGTM.com

new alerts:

  • 3 for Inconsistent equals and hashCode

@paul-rogers paul-rogers mentioned this pull request Jun 10, 2022
@lgtm-com

lgtm-com Bot commented Jun 10, 2022

This pull request introduces 3 alerts when merging 8101ca0 into 1ace733 - view on LGTM.com

new alerts:

  • 3 for Inconsistent equals and hashCode

@paul-rogers paul-rogers marked this pull request as draft June 10, 2022 21:42
@paul-rogers
Contributor Author

Split this PR into smaller ones: Clean up query contexts and Remove null and empty fields from native queries. Once these two are approved and merged, I'll rebase this PR onto master, which will reduce the file count by about a third, leaving only the framework itself and its supporting changes.

@lgtm-com

lgtm-com Bot commented Jun 11, 2022

This pull request introduces 3 alerts when merging d5b8743 into 1ace733 - view on LGTM.com

new alerts:

  • 3 for Inconsistent equals and hashCode

@paul-rogers
Contributor Author

Hitting flaky tests in PrefetchableTextFilesFirehoseFactoryTest. See this issue.

@lgtm-com

lgtm-com Bot commented Jun 13, 2022

This pull request introduces 3 alerts when merging 7140b01 into 1ace733 - view on LGTM.com

new alerts:

  • 3 for Inconsistent equals and hashCode

@paul-rogers paul-rogers force-pushed the 220518-tester branch 2 times, most recently from f19ebe9 to f9d5f1d Compare June 16, 2022 22:52
@paul-rogers
Contributor Author

Split out the generic cleanup into two PRs which are now merged. Rebased this PR onto those changes so that reviewers need only consider the test-related revisions. There are a large number of files! Most are new: either test code to support the planner test framework, or actual test cases. The original JUnit tests are left in place, though they are now redundant: we can remove them in a later PR. The framework is self-testing: if it works, the tests pass. Please focus on the parts where the PR tinkers with existing classes.

@paul-rogers paul-rogers marked this pull request as ready for review June 16, 2022 22:57
@paul-rogers paul-rogers marked this pull request as draft June 23, 2022 16:50
Also cleans up various "Calcite test" helper classes.

Includes a large set of tests converted from the various
Calcite*QueryTest JUnit tests.
@paul-rogers
Contributor Author

Pulled out most of the clean-up changes and most of the converted tests to reduce cognitive load on reviewers. We'll add those bits back in as we convert and add tests.

@paul-rogers
Contributor Author

Moving to a private branch for now.
