Feature: configurable calcite bloat #16248
Conversation
);
private static final String BLOAT_PROPERTY = "druid.sql.planner.bloat";
private static final int BLOAT = Integer.parseInt(
    System.getProperty(BLOAT_PROPERTY, "100")
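As written in the snippet above, a malformed property value would make `Integer.parseInt` throw during class initialization. A minimal defensive sketch (the `readBloat` helper and its fallback behavior are assumptions for illustration, not code from this PR):

```java
// Hedged sketch: read the bloat limit from a JVM system property,
// falling back to the default when the value is missing or malformed.
// The property name and default of 100 mirror the snippet above;
// the readBloat helper itself is hypothetical.
public class BloatProperty {
    static final String BLOAT_PROPERTY = "druid.sql.planner.bloat";
    static final int DEFAULT_BLOAT = 100;

    static int readBloat() {
        String raw = System.getProperty(BLOAT_PROPERTY, String.valueOf(DEFAULT_BLOAT));
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            // Invalid value: fall back instead of failing class initialization.
            return DEFAULT_BLOAT;
        }
    }
}
```

This avoids an `ExceptionInInitializerError` if an operator typos the property value.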
I think the druid planner will not like it at all if multiple projects are kept on top of each other...
the default seems to be the same (100) - what do you think about raising it to a bigger value, like a few million, so that it's less likely to hit this issue out of the box?
Hi, considering that this is primarily a workaround for a potential Calcite problem, I'm hesitant to increase the default value that much. However, I agree that raising the default to at least 1000 would be a good compromise.
yeah - I agree... an off switch would be better than tweaking an integer;
it depends on the execution engine how it evaluates a set of expressions which may share common subexpressions... it might not cause much of a problem if it can avoid recomputations.
    System.getProperty(HEP_DEFAULT_MATCH_LIMIT_CONFIG_STRING, "1200")
);
private static final String BLOAT_PROPERTY = "druid.sql.planner.bloat";
private static final int BLOAT = Integer.parseInt(
It should be passed via a context parameter or through a runtime property. A context param seems a better fit, as it would allow you to run those complex queries without having any impact on the rest of the queries.
Good point, thanks! I will have a look
Updated to use context parameter
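A per-query lookup along these lines could resolve the limit from the query context before falling back to the server-wide default. This is a rough sketch under assumptions: the context key name `"bloat"` and the `resolveBloat` helper are illustrative, not the PR's actual implementation.

```java
import java.util.Map;

// Rough sketch: resolve the bloat limit from a per-query context map,
// falling back to a server-wide default when the query does not set it.
// The "bloat" key and this helper are assumptions for illustration.
public class ContextBloat {
    static int resolveBloat(Map<String, Object> queryContext, int serverDefault) {
        Object value = queryContext.get("bloat");
        if (value == null) {
            // Query did not override the limit: use the configured default.
            return serverDefault;
        }
        return Integer.parseInt(value.toString());
    }
}
```

The benefit over a JVM property is exactly what the reviewer notes: a single expensive query can opt in without changing planner behavior for every other query on the broker.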
CoreRules.AGGREGATE_STAR_TABLE,
CoreRules.AGGREGATE_PROJECT_STAR_TABLE,
CoreRules.PROJECT_MERGE,
ProjectMergeRule.Config.DEFAULT.withBloat(BLOAT).toRule(),
How do users know when to set this context parameter?
I think we should change the error message so that it can be a bit more helpful for the end users.
Honestly, these are Calcite internals, and all we get on the Druid side is RelOptPlanner.CannotPlanException, which I believe can be thrown for tons of reasons. We could add a suggestion to try setting this context parameter in QueryHandler, but IMO it's a real shot in the dark there.
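One low-cost option, sketched here purely as an illustration, would be to append a hint to the planner's error message before surfacing it to the user. The hint wording and the `withBloatHint` helper are invented for this sketch, not taken from the PR:

```java
// Illustrative sketch only: append a hint about the bloat context parameter
// to a CannotPlanException-style message. Wording and helper are invented.
public class PlanErrorHint {
    static String withBloatHint(String plannerMessage) {
        return plannerMessage
            + " Possible cause: deeply nested projections;"
            + " consider raising the 'bloat' query context parameter.";
    }
}
```

As the comment above notes, this would fire for every unplannable query, so the hint can only be phrased as one possibility among many.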
Description
This pull request introduces a configurable `druid.sql.planner.bloat` parameter in `CalciteRulesManager`.

Fixed the problem where complicated nested SQL queries, which were working before Druid 29.0.0 (Calcite version 1.21.0), are rejected by the Calcite planner with the exception:

`There are not enough rules to produce a node with desired properties: convention=DRUID, sort=[]`

It appears that the exception is thrown when the root `RelNode`, which is initially an instance of `AbstractConverter`, is handled by `ProjectMergeRule`: the rule counts the result of the merge as not optimal and rejects the merge, which means the transformation of the `AbstractConverter` never happens. As a result, this `AbstractConverter` instance keeps skipping bestCost initialization, because it returns the default infinite cost, which `VolcanoPlanner` skips. That's why bestCost on the root `RelNode` is never set, which causes the exception.

This issue is probably caused by lots of optimizations in `VolcanoPlanner` and `ProjectMergeRule`, and it looks like a Calcite bug.

Release note

Created workaround: a configurable `druid.sql.planner.bloat` parameter, which should be set on the broker's `jvm.conf`. It adjusts the limit on the `ProjectMergeRule` result cost and can allow running complex queries (default value 100, suggested 1000, but it depends on the data to be processed).

Key changed/added classes in this PR

`CalciteRulesManager`

This PR has: