Rewrite SQL node early in the planning phase to support parameterization of MSQ DML queries #17126

Merged
abhishekrb19 merged 1 commit into apache:master from abhishekrb19:dynamic_param_ingest
Sep 23, 2024

@abhishekrb19 abhishekrb19 commented Sep 20, 2024

Dynamic parameters are useful for automating periodic jobs, but until now only MSQ SELECT queries could be parameterized. For example, consider the following MSQ REPLACE query, followed by the valid dynamic parameters supplied with it:

REPLACE INTO foo
OVERWRITE WHERE __time >= ? AND __time < ?
SELECT TIME_PARSE(ts) AS __time, c1
FROM (VALUES('2023-01-01', 'day1_1'), ('2023-01-01', 'day2')) AS t(ts, c1)
WHERE c1 = 'day2'
PARTITIONED BY DAY
[
  {
    "type": "TIMESTAMP",
    "value": "2023-01-01"
  },
  {
    "type": "TIMESTAMP",
    "value": "2023-01-05"
  }
]

The expectation is that the parameters bind to the ? placeholders in the query, taking the place of timestamp literals, but the query fails with the following exception:

Invalid OVERWRITE WHERE clause [`__time` >= ?]: Cannot get a timestamp from sql expression [?]

This happens because the IngestHandler code performs several validations during the planning stage, and those validations run on the SQL node before its parameters have been rewritten. This change moves the call to rewriteParameters() earlier in the planning stage, so all validations run on the rewritten root node.
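
The ordering issue can be illustrated with a minimal sketch. It is a toy model that treats the OVERWRITE WHERE clause as a list of tokens; it is not Druid's planner code. The class name and the methods validateOverwriteWhere, planValidateFirst, and planRewriteFirst are hypothetical names chosen for illustration, and the rewriteParameters method here only borrows its name from the real call mentioned above.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

// Toy model only: every name here is hypothetical and is not Druid's planner API.
class ParameterRewriteOrder {

    // Substitutes each "?" token with the next bound value, in order.
    static List<String> rewriteParameters(List<String> tokens, List<String> values) {
        List<String> rewritten = new ArrayList<>();
        int next = 0;
        for (String token : tokens) {
            if (token.equals("?")) {
                if (next >= values.size()) {
                    throw new IllegalArgumentException("No value bound for placeholder #" + (next + 1));
                }
                rewritten.add(values.get(next++));
            } else {
                rewritten.add(token);
            }
        }
        return rewritten;
    }

    // Stand-in for ingest-time validation: the operand after >= or < must parse as a date.
    static void validateOverwriteWhere(List<String> tokens) {
        for (int i = 0; i + 1 < tokens.size(); i++) {
            String op = tokens.get(i);
            if (op.equals(">=") || op.equals("<")) {
                String operand = tokens.get(i + 1);
                try {
                    LocalDate.parse(operand);
                } catch (DateTimeParseException e) {
                    throw new IllegalStateException(
                        "Cannot get a timestamp from sql expression [" + operand + "]");
                }
            }
        }
    }

    // Old order: validation sees the raw "?" placeholders and rejects them.
    static List<String> planValidateFirst(List<String> tokens, List<String> values) {
        validateOverwriteWhere(tokens);
        return rewriteParameters(tokens, values);
    }

    // New order: placeholders are bound first, so validation sees literals.
    static List<String> planRewriteFirst(List<String> tokens, List<String> values) {
        List<String> rewritten = rewriteParameters(tokens, values);
        validateOverwriteWhere(rewritten);
        return rewritten;
    }

    public static void main(String[] args) {
        List<String> where = List.of("__time", ">=", "?", "AND", "__time", "<", "?");
        List<String> values = List.of("2023-01-01", "2023-01-05");
        try {
            planValidateFirst(where, values);
        } catch (IllegalStateException e) {
            System.out.println("validate-first failed: " + e.getMessage());
        }
        System.out.println("rewrite-first: " + String.join(" ", planRewriteFirst(where, values)));
    }
}
```

In this sketch the only difference between the two plan methods is the order of the two calls, which mirrors the shape of the change described above: the validation logic itself stays the same and simply receives an already-rewritten input.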

This PR has:

  • been self-reviewed.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.

@github-actions github-actions Bot added the Area - Batch Ingestion, Area - Querying, and Area - MSQ (For multi stage queries - https://github.com/apache/druid/issues/12262) labels Sep 20, 2024

@MethodSource("data")
@ParameterizedTest(name = "{index}:with context {0}")
public void testReplaceWithDynamicParameters(String contextName, Map<String, Object> context)

Check notice (Code scanning / CodeQL): Useless parameter

The parameter 'contextName' is never used.

@clintropolis clintropolis left a comment

👍

@abhishekrb19 abhishekrb19 merged commit 37a2a12 into apache:master Sep 23, 2024
@abhishekrb19 abhishekrb19 deleted the dynamic_param_ingest branch September 23, 2024 19:49
@adarshsanjeev adarshsanjeev added this to the 32.0.0 milestone Jan 16, 2025

Labels

Area - Batch Ingestion, Area - MSQ (For multi stage queries - https://github.com/apache/druid/issues/12262), Area - Querying

4 participants