Conversation
So great! This does create a proliferation of files, and reading files can be slow, so it runs the risk of slowing down tests (though, presently, the tests reindex all of the test data on every single test, which is probably more work than loading files that specify queries and results), but it also separates things a bit. Especially when we want to bundle in result verification, putting the expected results into one big flowing file can make that file rather unwieldy (especially if the expected result set is hundreds or thousands of lines long).
@imply-cheddar, thanks for the kind words. I think you can turn it into a one-query-per-file setup without any changes. You need a way to get the contents of a resource directory, but then you should be home free. Look at what was done for the
Force-pushed 8bab30d to f010853.
@imply-cheddar, per your suggestion, a new commit extends the framework to capture results. Doing so wasn't quite as easy as it looked... Took your suggestion to represent results as JSON arrays, one per row. Result comparison is textual unless an option is set to force Java-object comparison, which parses each JSON array into Java objects and uses approximate equality for floating-point values. (The parsed Java objects are often not the same type as those produced in Druid results, so some magic was needed to make it work.) The results handle SQL-compatible nulls or not in the same run: no need to manually change properties to run one way, then the other. The runs also handle the "vectorize or not" cases, as well as the join provider class that offers up to eight different ways to set options. This gives up to a total of 32 runs per test case. When all is happy, and everything works the same way for every variation, the test case is pretty simple. But when the native query (or results) differ depending on the SQL-compatible setting, or other variations, there are ways to capture this variation. You mentioned having a way to load data from an external file. That would be a good addition: have the JUnit test case use your framework to load the data, then run the tests using this mechanism.
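The approximate floating-point comparison described above could look something like this minimal sketch. The class, method names, and tolerance are illustrative, not the framework's actual code; the point is comparing values numerically so that JSON-parsed types need not match Druid's result types exactly:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of approximate row comparison for parsed JSON rows.
public class RowComparison {
    static final double EPSILON = 1e-10;  // illustrative tolerance

    static boolean valuesMatch(Object expected, Object actual) {
        if (expected == null || actual == null) {
            return expected == actual;
        }
        // Parsed JSON numbers arrive as Integer, Long or Double, while Druid
        // may produce other numeric types: compare numerically instead.
        if (expected instanceof Number && actual instanceof Number) {
            double e = ((Number) expected).doubleValue();
            double a = ((Number) actual).doubleValue();
            if (e == a) {
                return true;
            }
            return Math.abs(e - a) <= EPSILON * Math.max(Math.abs(e), Math.abs(a));
        }
        // Fall back to textual comparison for non-numeric values.
        return expected.toString().equals(actual.toString());
    }

    static boolean rowsMatch(List<Object> expected, List<Object> actual) {
        if (expected.size() != actual.size()) {
            return false;
        }
        for (int i = 0; i < expected.size(); i++) {
            if (!valuesMatch(expected.get(i), actual.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Object> expected = Arrays.asList("abc", 10, 1.0);
        List<Object> actual = Arrays.asList("abc", 10L, 1.0 + 1e-13);
        System.out.println(rowsMatch(expected, actual));
    }
}
```

Comparing via `Number.doubleValue()` sidesteps the type mismatch between JSON-parsed values and Druid's own result types, at the cost of treating all numbers approximately.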
The various Calcite tests have changed since this work started. It is a bit tedious to update the test cases with those changes. I'll do so in a single go just before we're ready to commit this PR. For now, be aware that I'm tracking changes and am aware that folks are busy adding and changing tests.
This pull request introduces 3 alerts when merging cc7f919 into 81c37c6 (view on LGTM.com).
Note: the LGTM warnings are due to an LGTM timing issue. The required suppressions are in the code, but won't actually be honored until the PR is committed to master.
This pull request introduces 3 alerts when merging 9e57a53 into a3603ad (view on LGTM.com).
Force-pushed 9e57a53 to c02dd91.
This pull request introduces 3 alerts when merging c02dd91 into a3603ad (view on LGTM.com).
This pull request introduces 3 alerts when merging 8101ca0 into 1ace733 (view on LGTM.com).
Split this PR into smaller ones: Clean up query contexts and Remove null and empty fields from native queries. Once these two are approved and merged, I'll rebase this PR onto master, which will reduce the file count by about a third, leaving only the framework itself and its supporting changes.
This pull request introduces 3 alerts when merging d5b8743 into 1ace733 (view on LGTM.com).
Hitting flaky tests in
Force-pushed 80f1f3b to 7140b01.
This pull request introduces 3 alerts when merging 7140b01 into 1ace733 (view on LGTM.com).
Force-pushed f19ebe9 to f9d5f1d.
Split out the generic cleanup into two PRs, which are now merged. Rebased this PR onto those changes so that reviewers need only consider the test-related revisions. There are a large number of files! Most are new: either test code to support the planner test framework, or actual test cases. The original JUnit tests are left in place, though they are now redundant: we can remove them in a later PR. The framework is self-testing: if it works, the tests pass. Please focus on the parts where the PR tinkers with existing classes.
Force-pushed 02508a8 to 3181105.
Also cleans up various "Calcite test" helper classes. Includes a large set of tests converted from the various `Calcite?QueryTest` JUnit tests.
Force-pushed 3181105 to 5c3e026.
Pulled out most of the clean-up changes and most of the converted tests to reduce cognitive load on reviewers. We'll add those bits back in as we convert and add tests.
Moving to a private branch for now. |
Description
Converted to draft temporarily as some of the generic changes were split out into separate PRs.
SQL planners are complex and difficult to test. They are a black box: SQL goes in, and results come out. From a testing perspective, a planner tends to be a box of magic that either works or doesn’t. This PR provides a framework to turn classic black-box planner testing into a white-box experience by capturing and verifying the multiple planner artifacts. The framework is inspired by the one used in Apache Impala.
The core idea is the “case” file which contains SQL statements, any required query context or options, and expected results, serialized as text. Expected results include the Calcite plan, the native query, the schema, the “physical plan” (such as it is in Druid), and so on.
Creating tests is easy. This PR includes a “starter set” captured from the various `Calcite?QueryTest` classes. To add a new one, add a test case with a query, add dummy values for the desired artifacts, and run the test. It will fail, of course. Now, use your favorite diff tool to copy the “actual” results to the “expected” results file (assuming the results are correct). After that, rerun the tests after each planner change to ensure nothing changes (except whatever you specifically wanted to change).

Like the `Calcite?QueryTest` test classes, you can specify expected results when the query is run in a restricted "test" execution engine. Such test runs run in a single process, against in-memory segments, so they check query semantics, but they do not test the full server stack, nor do they give accurate performance characterizations. Druid has many different run modes (with or without vectorization, SQL-compatible nulls or not, various context variables). The test framework provides tools to work with these variations.

Tests of this sort are quite strict: change anything and the test will fail. This can be painful if you just want to change something and are not concerned about side effects. It can, however, make changing the planner a breeze because you get extreme visibility into the exact impact your change has: no surprises.
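To see why the run count multiplies quickly, here is a hypothetical sketch of expanding the variation axes into individual runs. The option names and the helper itself are illustrative, not the framework's actual API; the counts (two vectorization modes, two null-handling modes, eight join-provider variants) come from the discussion above:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative expansion of per-test run variations into a run list.
public class RunVariations {
    static List<Map<String, Object>> expand(List<Boolean> vectorize,
                                            List<Boolean> sqlNulls,
                                            int joinVariants) {
        List<Map<String, Object>> runs = new ArrayList<>();
        // Cartesian product: every combination becomes one run of the test.
        for (boolean v : vectorize) {
            for (boolean n : sqlNulls) {
                for (int j = 0; j < joinVariants; j++) {
                    runs.add(Map.of(
                        "vectorize", v,
                        "sqlCompatibleNulls", n,
                        "joinVariant", j));
                }
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        // 2 vectorization modes x 2 null-handling modes x 8 join variants = 32 runs.
        List<Map<String, Object>> runs =
            expand(List.of(true, false), List.of(true, false), 8);
        System.out.println(runs.size());
    }
}
```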
See the `README.md` file for full details of the case file syntax, how to create a test, and how to convert existing test files.
Example
The following shows a test converted from `CalciteQueryTest`, including the various test case sections:
Multiple other sections are available: the above is one of the simpler cases.
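As a rough illustration, a sectioned case file of the kind described might look something like this. The section markers, names, and content below are hypothetical stand-ins, not the framework's actual syntax; see the `README.md` in the tester directory for the real format:

```text
=== case
Group by a string dimension
=== SQL
SELECT dim1, COUNT(*) AS cnt
FROM foo
GROUP BY dim1
=== context
vectorize=force
=== native
{ "queryType": "groupBy", "dataSource": "foo", ... }
=== schema
(dim1 VARCHAR, cnt BIGINT)
```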
Running Queries
The `Calcite?QueryTest` cases verify the native query and results. Verifying results is harder than one would expect: most tests run with two variations (SQL and classic null handling) which produce different results. Others use generators to run up to eight option variations for a total of 16 runs, with differing results. Adding results to the test framework is thus non-trivial and left for a second PR.
Since the new framework can’t check results, we are left with using both the old and new versions of converted tests. This is unfortunate: a goal for a second PR is to retire the “legacy” test files. A big bonus: someone had to laboriously write the Java code to build up native queries for those tests. In the future, the computer will do it for us.
PR Details
This section summarizes the many changes required to create the test framework.
Druid Planner State Capture
Planners want to be black boxes: query in, plan out. Tests want visibility into the internal states as described above. The planner is modified to accept a `PlannerStateCapture` instance which can record (capture) this internal state. By default, the planner uses a "no-op" capture. Tests provide a capturing instance.
The code to parse SQL was repeated three times. Factored that into a function.
Added the ability to capture the internal "execution plan" or "physical plan" in the `QueryMaker` via a new `explain()` method. This method does everything except send the query to another node for execution. The various query makers are refactored to support the two paths (explain and execute). For a SELECT, the physical plan is just the native plan with a few extra fields sent to the Broker. For INSERT, it is whatever the particular INSERT extension chooses to provide.
Union Queries
As it turns out, SQL UNION ALL queries have no direct native query representation: they turn into a list of native queries. Since we want to serialize "the" native query, the framework creates a fake "union" native query type on the fly to represent unions. Ugly, but it works.
INSERT/REPLACE Statements
The INSERT and REPLACE support in the planner is a bit of a bolt-on. The planner knows how to work with SELECT statements only, even when they are wrapped in a DESCRIBE or INSERT/REPLACE. When exiting the Calcite logical plan, the INSERT and REPLACE nodes don't appear. The test framework "invents" nodes for these cases (in the test framework only) so that the logical plan makes sense.
Planner Config
The planner config class has grown to be a monster: it holds config not just for the planner, but also for the segment metadata caching layer. This PR resisted the temptation to fix that issue.
Instead, it fixes another issue: tests want to tinker with the planner config via the "options" setting. But the planner config is opaque: the only way to create one is via JSON serialization. Modified the class to provide a builder for use in tests.
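A test-oriented builder of the kind described might follow the standard builder pattern, roughly as in this sketch. The class is an illustrative stand-in, not Druid's actual `PlannerConfig`; the field names are merely plausible config knobs:

```java
// Hypothetical sketch: a builder for an otherwise JSON-deserialized
// config class, so tests can tweak individual settings directly.
public class PlannerConfigSketch {
    private final int maxTopNLimit;
    private final boolean useApproximateCountDistinct;

    private PlannerConfigSketch(Builder b) {
        this.maxTopNLimit = b.maxTopNLimit;
        this.useApproximateCountDistinct = b.useApproximateCountDistinct;
    }

    public int maxTopNLimit() { return maxTopNLimit; }
    public boolean useApproximateCountDistinct() { return useApproximateCountDistinct; }

    public static class Builder {
        // Defaults mirror what the JSON-deserialized form would use.
        private int maxTopNLimit = 100_000;
        private boolean useApproximateCountDistinct = true;

        public Builder maxTopNLimit(int n) {
            this.maxTopNLimit = n;
            return this;
        }

        public Builder useApproximateCountDistinct(boolean b) {
            this.useApproximateCountDistinct = b;
            return this;
        }

        public PlannerConfigSketch build() {
            return new PlannerConfigSketch(this);
        }
    }

    public static void main(String[] args) {
        PlannerConfigSketch config = new Builder()
            .maxTopNLimit(1000)
            .useApproximateCountDistinct(false)
            .build();
        System.out.println(config.maxTopNLimit());
    }
}
```

The builder keeps the config class immutable while letting a test override just the settings it cares about.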
The builder is then used in place of the detailed code in the `withOverrides` method.
Calcite Test Refactoring
`ExpressionProcessing` now allows a test to change the expression settings on the fly. This lets the test case declare, via the "options" section, how it wants to process expressions, allowing the variations to be tested in a single run.
`Calcites` has an `escapeStringLiteral()` method which escapes string literals, but it is a bit too aggressive and results in a hard-to-read string with all punctuation converted to Unicode escapes. A new method, `escapeStringLiteralLenient()`, is added to do the lightest-possible escaping and results in much better output in the test cases.
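A "lenient" escaper of the kind described might escape only what is strictly necessary, along these lines. This is a hypothetical sketch, not the actual `Calcites` implementation:

```java
// Hypothetical sketch of lenient SQL string-literal escaping: keep
// printable ASCII readable, escape only quotes and non-printable chars.
public class LenientEscape {
    static String escapeLenient(String s) {
        StringBuilder sb = new StringBuilder("'");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\'') {
                sb.append("''");  // SQL doubles embedded single quotes
            } else if (c >= 0x20 && c < 0x7F) {
                sb.append(c);     // printable ASCII stays as-is
            } else {
                // Only non-printable characters get Unicode escapes.
                sb.append(String.format("\\u%04X", (int) c));
            }
        }
        return sb.append("'").toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeLenient("it's a test"));
    }
}
```

Compared with escaping every punctuation character, output like `'it''s a test'` is far easier to read in an expected-results file.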
The `BaseCalciteTest` class was refactored a bit to:
- Use `PlannerConfig.Builder` to create the various planner configs rather than anonymous classes.
- A new `RootSchemaBuilder` holds the code to build up Druid's Calcite root schema. It is refactored to a builder so that tests can add to the root schema.
- `CalciteTests` is refactored to use the above builder.
Druid Planner Test
The new `DruidPlannerTest` class, and its associated ".case" files, holds tests converted from:
- `CalciteInsertDmlTest`
- `CalciteArrayQueriesTest`
- `CalciteCorrelatedQueryTest`
- `CalciteMultiValueStringQueryTest`
- `CalciteParameterQueryTest`
This isn't all such tests, just enough to allow a particular project to proceed with confidence. More conversions can be done later. And, as noted, the test file does not currently run the queries: it only plans them and verifies the artifacts.
The other classes in `sql/src/test/java/org/apache/druid/sql/calcite/tester/` support the test framework:
- `README.md` contains instructions for using the test framework.
- `internals.md` provides details, such as a reference for the test case file format.
- `PlannerFixture` is a kitchen-sink class that encapsulates the code to set up a planner (and all its associated knick-knacks) to run a test. Other tests could use this class rather than doing the setup themselves, but that conversion is not done in this PR.
- `QueryTestCase` represents one planner test case and its sections.
- `ActualResults` records the actual results of a run, compares them with the expected results, and writes a report if there are errors.
- `TestCaseLoader` is a simple parser for the test case file format.
This PR has: