Improve the output of SQL explain message by LakshSingla · Pull Request #11908 · apache/druid

LakshSingla · 2021-11-11T07:29:25Z

Description

Currently, EXPLAIN PLAN FOR returns the structure of the parsed SQL (via Calcite's internal planner util). That output is verbose, since it describes the nodes of the SQL plan rather than the Druid query, and it is not representative of the native Druid query that will be executed on the broker side.

This PR changes the output format of EXPLAIN PLAN FOR for queries that are executed by converting them into Druid's native queries (i.e. not queries on sys schemas).

The result will still be a list of columns: information about resources (no change) and the explanation.
The explanation is a string representing an array of queries and their signatures (in JSON format). The final shape of the explanation will look like:

[
  { 
    "query" : <native_druid_query>,
    "signature": <signature of the query>
  },
  {
    "query" : <native_druid_query>,
    "signature": <signature of the query>
  }
]

Examples:

Simple query
Query Shape:
EXPLAIN PLAN FOR ( SELECT dim1, dim2 FROM druid.foo )

BEFORE

Output in console

Plan (expanded)

AFTER

[
  {
    "query": {
      "queryType": "scan",
      "dataSource": {
        "type": "table",
        "name": "foo"
      },
      "intervals": {
        "type": "intervals",
        "intervals": [
          "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
        ]
      },
      "virtualColumns": [],
      "resultFormat": "compactedList",
      "batchSize": 20480,
      "order": "none",
      "filter": null,
      "columns": [
        "dim1",
        "dim2"
      ],
      "legacy": false,
      "context": {
        "defaultTimeout": 300000,
        "maxScatterGatherBytes": 9223372036854776000,
        "sqlCurrentTimestamp": "2000-01-01T00:00:00Z",
        "sqlQueryId": "dummy",
        "vectorize": "false",
        "vectorizeVirtualColumns": "false"
      },
      "descending": false,
      "granularity": {
        "type": "all"
      }
    },
    "signature": "{dim1:STRING, dim2:STRING}"
  }
]

UNION ALL which generates multiple native queries
Query Shape:
EXPLAIN PLAN FOR ( SELECT dim1 FROM druid.foo UNION ALL SELECT dim2 FROM druid.foo2)

BEFORE

Output in console

Plan (expanded)

AFTER

Some parts truncated

[
  {
    "query": {
      "queryType": "scan",
      "dataSource": {
        "type": "table",
        "name": "foo"
      },
      "intervals": {
        "type": "intervals",
        "intervals": [
          "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
        ]
      },
...
      }
    },
    "signature": "{dim1:STRING}"
  },
  {
    "query": {
      "queryType": "scan",
      "dataSource": {
        "type": "table",
        "name": "foo2"
      },
      "intervals": {
        "type": "intervals",
        "intervals": [
          "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
        ]
      },
    ...
      }
    },
    "signature": "{dim2:STRING}"
  }
]

JOIN on a Table datasource and a Union query
Query Shape:
EXPLAIN PLAN FOR ( SELECT a.dim1, COUNT(*) FROM druid.foo a INNER JOIN ( SELECT dim1, dim2 FROM druid.foo UNION ALL SELECT dim1, dim2 FROM druid.foo2 ) b ON a.dim1 = b.dim1 WHERE a.dim1 = 1.0 OR b.dim1 = 2.0 GROUP BY a.dim1 )

BEFORE

Output in console

Plan(expanded)

AFTER

Some parts truncated

[
  {
    "query": {
      "queryType": "groupBy",
      "dataSource": {
        "type": "join",
        "left": {
          "type": "table",
          "name": "foo"
        },
        "right": {
          "type": "query",
          "query": {
            "queryType": "scan",
            "dataSource": {
              "type": "union",
              "dataSources": [
                {
                  "type": "table",
                  "name": "foo4"
                },
                {
                  "type": "table",
                  "name": "foo"
                }
              ]
            },
            "intervals": {
              "type": "intervals",
              "intervals": [
                "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
              ]
            },
...
            },
            "descending": false,
            "granularity": {
              "type": "all"
            }
          }
        },
        "rightPrefix": "j0.",
        "condition": "(\"dim1\" == \"j0.dim1\")",
        "joinType": "INNER",
        "leftFilter": null
      },
      "intervals": {
        "type": "intervals",
        "intervals": [
          "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
        ]
      },
      "virtualColumns": [],
      "filter": {
        "type": "or",
     ...
          }
        ]
      },
      "granularity": {
        "type": "all"
      },
      "dimensions": [
        {
          "type": "default",
          "dimension": "dim1",
          "outputName": "d0",
          "outputType": "STRING"
        }
      ],
      "aggregations": [
        {
          "type": "count",
          "name": "a0"
        }
      ],
...
      "descending": false
    },
    "signature": "{d0:STRING, a0:LONG}"
  }
]

The older format vs the newer format

Older Format:

Gave the structure of the RelNodes formed from the SQL statement. This could help in understanding how the given query was arranged
The older variation was verbose, and repetitive. See the examples above.

Newer Format

Gives the native queries as is, therefore gives a clearer understanding of what is going to be run under the hood.
JSON output has a fixed structure and could be parsed easily.
Doesn't match the EXPLAIN PLAN FOR semantics of other databases like SQL/Oracle etc. (Can be said for the older version as well)
Since a default public facing change is being made, this would cause the users to update their applications, if they are relying on the structure of the format.

We are using an external context flag to switch between legacy and native mode, rather than relying on EXPLAIN PLAN FOR ... AS JSON, since the latter should ideally vary only the format of the EXPLAIN PLAN and not the output itself (which is what is being changed here).

Key changed/added classes in this PR

Modified implementation of DruidPlanner#planExplanation.
RowSignature implemented to be JSON serializable
RowSignatureTest for serializability

This PR has:

been self-reviewed.
added documentation for new or modified features or behaviors.
added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
added or updated version, license, or notice information in licenses.yaml
added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
added integration tests.
been tested in a test Druid cluster.

kfaraz

Thanks for the changes, @LakshSingla !
On the whole, the PR looks good.

There are a few things that need changing/explanation.

LakshSingla · 2021-11-11T14:02:37Z

We can support both formats simultaneously in Calcite's syntax using SqlExplainLevel and SqlExplain.Depth. From Calcite's docs:

explain:
      EXPLAIN PLAN
      [ WITH TYPE | WITH IMPLEMENTATION | WITHOUT IMPLEMENTATION ]
      [ EXCLUDING ATTRIBUTES | INCLUDING [ ALL ] ATTRIBUTES ]
      [ AS JSON | AS XML | AS DOT ]
      FOR { query | insert | update | merge | delete }

So potentially, EXPLAIN PLAN FOR {query} can give the older output and, say, EXPLAIN PLAN EXCLUDING ATTRIBUTES FOR {query} can give the newer one. (This can be altered to whatever makes more sense semantically.)

LakshSingla · 2021-11-12T07:44:42Z

Commented out the explanation of the original tests in case we decide to follow up with the above suggestion. Will remove it if not required.

dbardbar · 2021-11-16T06:34:41Z

@LakshSingla - just a quick question - will this be backward-compatible?
I have a use-case that we use the EXPLAIN to convert SQL to native queries automatically, so I'm wondering if I'll need to adapt my code to the new proposed format.

abhishekagarwal87 · 2021-11-16T08:10:11Z

@dbardbar - will you still need that custom code once this change is merged? This change is trying to solve the same problem of not being able to see the final native query.

abhishekagarwal87 · 2021-11-16T08:35:40Z

Hmm. I guess you would still want to get rid of the extra fields such as the row signature, etc.

dbardbar · 2021-11-16T08:59:22Z

@LakshSingla - the native query can be extracted from the response today, but it does require some logic to extract it. Not the prettiest piece of code, but not that complicated.
If I understand your proposal correctly, it seems like it will make our lives better and simplify our parser, but making a breaking change does have its drawbacks.
If your new code will return the new response only based on some new flag (global, or on the request), then that would be great.

dbardbar · 2021-11-17T13:19:32Z

@abhishekagarwal87 - for our use-case we want to extract from the native query only the part related to the 'WHERE' clause, so we'll need some basic handling anyway, even with the new code. With the new code, the extraction will be a bit easier.

clintropolis · 2021-11-17T17:51:24Z

This looks nice 👍

Since this changes the output of a query, I'm +1 for adding a feature flag to allow the previous results to be returned. PlannerConfig would probably be the most appropriate place, maybe something like druid.sql.useLegacyDruidExplain?

Since this output seems totally better I think it is ok to default it to the new stuff. It might be nice to also add a context parameter so that it could be overridden per query to make it easier for developers to migrate their apps to the new output while debugging without having to set it for the entire cluster.

LakshSingla · 2021-11-18T12:54:08Z

In accordance with the above suggestions, I have added a config option in the PlannerConfig which will allow the user to switch between the explain plan outputs. It can also be overridden on a per query basis.
By default the newer output would be visible. This default behavior can be changed by setting druid.sql.planner.useLegacyDruidExplain = true (default is false)
Irrespective of the default behavior set in the properties, the explain plan output can also be modified on a per query basis by setting the useLegacyDruidExplain to true or false in the query's context.
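
The override order described above (cluster property first, per-query context key on top) can be sketched roughly as follows. This is only an illustration, not the code in this PR: the class and method names are hypothetical, and the key name is the one used at this point in the thread (the commit list shows the flag was later changed to useNativeQueryExplain).

```java
import java.util.Map;

// Hypothetical helper for illustration only; not Druid's actual implementation.
final class ExplainFormatResolver
{
  // Context key as named in this comment; treated here as a plain string constant.
  static final String CONTEXT_KEY = "useLegacyDruidExplain";

  // Returns true when the legacy plan output should be produced.
  // clusterDefault stands in for the value of druid.sql.planner.useLegacyDruidExplain.
  static boolean useLegacyExplain(boolean clusterDefault, Map<String, Object> queryContext)
  {
    Object override = queryContext == null ? null : queryContext.get(CONTEXT_KEY);
    if (override == null) {
      // No per-query override present: fall back to the cluster-wide setting.
      return clusterDefault;
    }
    // Assumption: context values may arrive as booleans or as strings such as "true".
    return Boolean.parseBoolean(String.valueOf(override));
  }

  public static void main(String[] args)
  {
    System.out.println(useLegacyExplain(false, Map.of()));
    System.out.println(useLegacyExplain(false, Map.of(CONTEXT_KEY, true)));
    System.out.println(useLegacyExplain(true, Map.of(CONTEXT_KEY, "false")));
  }
}
```

The point of the sketch is that a missing context key leaves the cluster default untouched, so an application can be migrated one query at a time.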

abhishekagarwal87

looks good to me overall. just one minor comment.

LakshSingla · 2021-11-18T15:12:17Z

Should the property druid.sql.planner.useLegacyDruidExplain and the overriding context key be documented somewhere?

abhishekagarwal87 · 2021-11-18T15:23:42Z

Should the property druid.sql.planner.useLegacyDruidExplain and the overriding context key be documented somewhere?

Yes. You can put it in docs/querying/query-context.md

LakshSingla · 2021-11-22T12:20:32Z

Updated the description with pros and cons of the newer approach.

vogievetsky · 2021-11-24T05:57:00Z

Very excited for this change! I got 2 questions:

(1) is it possible to provide the signature as JSON and not as a string?

instead of "signature": "{d0:STRING, a0:LONG}"

provide it as "signature": [{"name": "d0", "type":"STRING"}, {"name":"a0", "type":"LONG"}]

the signature strings are really annoying to parse and these signatures are really important (sometimes more so than the query)
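
To illustrate the parsing burden mentioned above, here is a rough sketch of what a client would need in order to turn the string form into the structured form. The class and method names are hypothetical, and the parser is deliberately naive: it assumes column names contain no commas, and the renderer assumes no characters that need JSON escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical client-side helper; not part of Druid.
final class SignatureStrings
{
  // Parses a string such as "{d0:STRING, a0:LONG}" into an ordered name -> type map.
  static Map<String, String> parse(String signature)
  {
    String trimmed = signature.trim();
    if (!trimmed.startsWith("{") || !trimmed.endsWith("}")) {
      throw new IllegalArgumentException("Not a signature string: " + signature);
    }
    Map<String, String> columns = new LinkedHashMap<>();
    String body = trimmed.substring(1, trimmed.length() - 1).trim();
    if (body.isEmpty()) {
      return columns;
    }
    for (String part : body.split(",")) {
      // Split on the last colon so the type is whatever follows it.
      int colon = part.lastIndexOf(':');
      if (colon < 0) {
        throw new IllegalArgumentException("Missing type in: " + part);
      }
      columns.put(part.substring(0, colon).trim(), part.substring(colon + 1).trim());
    }
    return columns;
  }

  // Renders the columns as a JSON array of {"name", "type"} objects (no escaping).
  static String toJsonArray(Map<String, String> columns)
  {
    StringJoiner json = new StringJoiner(",", "[", "]");
    for (Map.Entry<String, String> column : columns.entrySet()) {
      json.add("{\"name\":\"" + column.getKey() + "\",\"type\":\"" + column.getValue() + "\"}");
    }
    return json.toString();
  }

  public static void main(String[] args)
  {
    System.out.println(toJsonArray(parse("{d0:STRING, a0:LONG}")));
  }
}
```

With the structured form in the response, none of this hand-written splitting would be needed on the client side.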

(2) you included screenshots of the old format in the console query view, but how does it work with the explain dialog? Does it need to be updated as part of this PR or right afterwards?

Just to be clear I am talking about this dialog:

As you can see the current dialog parses the old format thus relying on it. I would LOVE nothing more than to switch to this new format (providing pt.1) but does this PR break this dialog? Ideally there would be a context flag to trigger the new format but the old format would be the default (at least for a few releases).

clintropolis · 2021-11-24T06:08:48Z

it looks like RowSignature is made JSON friendly in https://github.com/apache/druid/pull/11959/files#diff-efdda11ff1dc815218691ec3a5c8d487bd01cd5c297de95690b7b83cea1eb5b8R82 so might want to do that in a follow-up after #11959 goes in.


LakshSingla · 2021-11-24T13:27:29Z

Thanks for the comment @vogievetsky

As mentioned by @clintropolis, Gian's PR should make the RowSignature serializable. However, until those changes get merged I have added a temporary JsonValue method that helps Jackson serialize it. I have kept the output of my change similar to that of Gian's change, so ideally no testcases need to be updated, and the DruidPlanner's code also works without any tweaks. (This would introduce some merge conflicts in RowSignature and RowSignatureTest with SQL INSERT planner support #11959, but resolving them should be simple: discard the older changes for the new ones.)
I haven't tested it out, but it should break the console output. I have therefore changed the default behavior to show the legacy output (i.e. the original behaviour), so no breaking changes are introduced in the PR. There is no requirement to update the UI code immediately.

clintropolis

👍

#11959 got merged faster than I was expecting 😅, so maybe consider the rename/inversion when fixing up conflicts

clintropolis · 2021-11-24T20:29:40Z

 |`druid.sql.planner.metadataSegmentPollPeriod`|How often to poll coordinator for published segments list if `druid.sql.planner.metadataSegmentCacheEnable` is set to true. Poll period is in milliseconds. |60000|
 |`druid.sql.planner.authorizeSystemTablesDirectly`|If true, Druid authorizes queries against any of the system schema tables (`sys` in SQL) as `SYSTEM_TABLE` resources which require `READ` access, in addition to permissions based content filtering.|false|
-|`druid.sql.planner.useLegacyDruidExplain`|If true, `EXPLAIN PLAN FOR` will return the explain plan in legacy format, else it will return a JSON representation of the native queries the given SQL statement translates to. It can be overridden per query with `useLegacyDruidExplain` context key.|false|
+|`druid.sql.planner.useLegacyDruidExplain`|If true, `EXPLAIN PLAN FOR` will return the explain plan in legacy format, else it will return a JSON representation of the native queries the given SQL statement translates to. It can be overridden per query with `useLegacyDruidExplain` context key.|true|


Since the default has been flipped, I wonder if we should invert this setting and call it something like `useJsonStringExplain` or something similar that defaults to false. (I just did a similar thing when swapping the default to keep old behavior in #11184, swapping a "use legacy" to "use new thing", which I think is maybe better, since it is a bit odd for "use legacy" to be the default of something.)

If in the future, the newer explain plan is to be used, then I think useLegacyDruidExplain would work better, since we won't have to change the feature flag then. But it does look weird right now, since it hasn't been deprecated yet. I am fine with keeping it either way.

I agree. The choice is between legacy vs new plan so from that the flag name does sound good to me.

LakshSingla · 2021-11-24T20:35:43Z

Just merged in the changes!

clintropolis

nice 👍

LakshSingla · 2021-11-25T10:29:24Z

Minor comment:
Currently, deserializing the list of queries in JSON is done by converting it into an instance of ArrayNode, since a simple List<Map<String, Object>> was losing the information about queryType (referenced code, Stack Overflow). I have recently found a way to avoid these by creating an inner class:

private static class ExplainOutputNode
{
  @JsonProperty
  Query query;

  @JsonProperty
  RowSignature signature;

  public ExplainOutputNode(RowSignature signature, Query query)
  {
    this.signature = signature;
    this.query = query;
  }
}

String outputString = jsonMapper
    .writerFor(new TypeReference<List<ExplainOutputNode>>() {})
    .writeValueAsString(queryList);

which frees the referenced code from usage of ArrayNode and ObjectNode.
Should that be a preferred approach to the current one?

abhishekagarwal87 · 2021-11-25T10:41:09Z

I think that the current approach is fine since the new approach means creating a new type. If in future, we do similar deserialization in other places, we can use a custom type specifically for explain output.

abhishekagarwal87 · 2021-11-25T15:39:18Z

Thank you @LakshSingla. I have merged your change.

This is a follow-up to PR #11908. It fixes the bug in top-level UNION ALL queries when more than 2 SQL subqueries are present.

Initial commit, first implementation

c18be9e

abhishekagarwal87 added Area - SQL Design Review Release Notes labels Nov 11, 2021

kfaraz requested changes Nov 11, 2021

Clean up the code, reuse existing implementation of explain planner

336e0ba

LakshSingla marked this pull request as ready for review November 11, 2021 20:15

Update the existing EXPLAIN test cases, add one for UNION ALL

ed057ba

Fix checkstyle, remove unused import

9c8d85b

LakshSingla added 6 commits November 18, 2021 16:03

Add flag to control the output being generated in SQL explain

48af23f

Merge branch 'master' into sql-explain

4a0e382

Add tests for the overridding of the context

890dd4a

Remove commented out explanations

5023eb1

Checkstyle fix

dcb58d9

Fix testcases, cleanup the remaining comments

0fe78ac

abhishekagarwal87 approved these changes Nov 18, 2021

Comment thread sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidPlanner.java Outdated

LakshSingla added 2 commits November 19, 2021 01:31

Add docs, throw ISE in a conditional

b728bd8

Merge branch 'master' into sql-explain

82136e3

abhishekagarwal87 reviewed Nov 23, 2021

Comment thread sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidPlanner.java Outdated

Add exception in the logging

c2b4690

LakshSingla added 2 commits November 24, 2021 14:28

Merge branch 'master' into sql-explain

4c17d49

Change default format to legacy, update testcases to check for both the outputs

6d19f48

abhishekagarwal87 reviewed Nov 24, 2021

Comment thread docs/configuration/index.md Outdated

LakshSingla added 4 commits November 24, 2021 16:02

Add tests for overrides

a7b880e

Checkstyle, testcase fix

11274af

Nit, doc default value fix

0dd494a

Add temporary serialization for rowsignature

1a52f16

abhishekagarwal87 reviewed Nov 24, 2021

Comment thread sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidPlanner.java Outdated

Update testcases for the new signature implementation

3e41ba3

clintropolis reviewed Nov 24, 2021

Merge with master, remove temp implementation

97607e5

Change the flag to useNativeQueryExplain

28aac50

clintropolis approved these changes Nov 25, 2021

abhishekagarwal87 merged commit c381cae into apache:master Nov 25, 2021

vogievetsky mentioned this pull request Nov 30, 2021

Web console: updated the explain dialog to use new explain output #12009

Merged

2 tasks

abhishekagarwal87 pushed a commit that referenced this pull request Dec 1, 2021

Web console: updated the explain dialog to use new explain output (#12009)

1f95a42

This is the UI followup to the work done in #11908. Updated the Explain dialog to use the new output format.

LakshSingla mentioned this pull request Dec 2, 2021

Fix the error case when there are multi top level unions #12017

Merged

9 tasks

abhishekagarwal87 added this to the 0.23.0 milestone May 11, 2022

abhishekagarwal87 mentioned this pull request Jun 22, 2022

[Draft] 0.23.0 Release notes #12510

Closed

LakshSingla deleted the sql-explain branch March 22, 2024 16:33
