Skip to content

Fail query planning if a CLUSTERED BY column contains descending order#14436

Merged
zachjsh merged 7 commits intoapache:masterfrom
abhishekrb19:validate_clustered_by
Jun 16, 2023
Merged

Fail query planning if a CLUSTERED BY column contains descending order#14436
zachjsh merged 7 commits intoapache:masterfrom
abhishekrb19:validate_clustered_by

Conversation

@abhishekrb19
Copy link
Copy Markdown
Contributor

@abhishekrb19 abhishekrb19 commented Jun 15, 2023

Before this change, queries will successfully plan if any CLUSTERED BY column contains a DESC order. MSQE will validate that after the plan is generated and throw an InsertCannotOrderByDescending fault. This deferred validation makes it hard for clients that rely on whether a query can be successfully planned or not.

Now with this change, the validation is pushed down to the planning stage as a sql validation step. So the query planning will fail with a ValidationException if an INSERT or REPLACE query contains one or more clustering columns with descending order.

Release note

The MSQ fault, InsertCannotOrderByDescending, is marked deprecated. An INSERT or REPLACE query containing a CLUSTERED BY expression cannot be in descending order. Druid's segment generation code only supports ascending order. A query ValidationException is thrown instead of the fault in this scenario.


Key changed/added classes in this PR
  • DruidSqlParserUtils.java
  • Removed InsertCannotOrderByDescendingFault.java
  • Adjust existing tests to use ASC instead of DESC; add new tests with DESC where the thrown exception is validated.

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@abhishekrb19 abhishekrb19 changed the title WIP: Fail query planning if any CLUSTERED BY column contains descending order Fail query planning if a CLUSTERED BY column contains descending order Jun 16, 2023
@abhishekrb19 abhishekrb19 force-pushed the validate_clustered_by branch from 32ca25d to 32a9bbf Compare June 16, 2023 05:44
Copy link
Copy Markdown
Contributor

@cryptoe cryptoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @abhishekrb19

Comment thread docs/multi-stage-query/reference.md Outdated
@cryptoe cryptoe added the Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 label Jun 16, 2023
@zachjsh zachjsh merged commit 04fb757 into apache:master Jun 16, 2023
@abhishekrb19 abhishekrb19 deleted the validate_clustered_by branch July 18, 2023 16:34
@abhishekagarwal87 abhishekagarwal87 added this to the 27.0 milestone Jul 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area - Documentation Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants