[CALCITE-7465] Make MATCH_RECOGNIZE tolerant to FINAL and RUNNING non function MEASURES#4899

Merged
snuyanzin merged 1 commit into apache:main from snuyanzin:calcite7465
Apr 24, 2026

@snuyanzin
Contributor

Jira Link

CALCITE-7465

The current issue is that some MEASURES

  1. might have no FINAL or RUNNING keyword, or
  2. might have a keyword that differs from the computed one.

In both cases the unparsed SQL is wrong.

Because the FINAL/RUNNING calculation happens during validation, it affects the unparse logic.
The proposal is to move that calculation to SqlToRelConverter, so that unparse works as expected.

Query for the first case:

SELECT *
FROM emp
MATCH_RECOGNIZE (
  MEASURES
     FINAL COUNT(A.deptno) AS deptno,
     A.ename AS ename
  PATTERN (A B)
  DEFINE
    A AS A.empno = 123
) AS T

If the query is not ALL ROWS PER MATCH, the default operation is FINAL, and the query is unparsed to the following, which can no longer be parsed:

SELECT *
FROM `EMP` MATCH_RECOGNIZE(
MEASURES FINAL COUNT(`A`.`DEPTNO`) AS `DEPTNO`, FINAL `A`.`ENAME` AS `ENAME`
PATTERN (`A` `B`)
DEFINE `A` AS (PREV(`A`.`EMPNO`, 0) = 123)) AS `T`
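
The first case can be modeled with a toy sketch. This is not Calcite code: `Measure`, `stamp_default`, `unparse`, and `old_grammar_accepts` are hypothetical names, and the regular expression only approximates a grammar that allows FINAL or RUNNING in front of a function call and nowhere else. The sketch also assumes that RUNNING is the default under ALL ROWS PER MATCH; the description above only states the FINAL default.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Measure:
    expr: str                 # e.g. "COUNT(`A`.`DEPTNO`)" or "`A`.`ENAME`"
    alias: str
    keyword: Optional[str]    # "FINAL", "RUNNING", or None if the user wrote none


def stamp_default(measures, all_rows):
    """Toy stand-in for a validation-time rewrite: every measure without
    a keyword gets the computed default written back into the tree."""
    default = "RUNNING" if all_rows else "FINAL"  # RUNNING branch is an assumption
    return [Measure(m.expr, m.alias, m.keyword or default) for m in measures]


def unparse(m):
    prefix = f"{m.keyword} " if m.keyword else ""
    return f"{prefix}{m.expr} AS {m.alias}"


# Approximation of the old rule: a keyword may only precede a function call.
_KEYWORD_THEN_CALL = re.compile(r"^(FINAL|RUNNING) \w+\(")


def old_grammar_accepts(sql):
    if not sql.startswith(("FINAL ", "RUNNING ")):
        return True
    return bool(_KEYWORD_THEN_CALL.match(sql))


measures = [
    Measure("COUNT(`A`.`DEPTNO`)", "`DEPTNO`", "FINAL"),
    Measure("`A`.`ENAME`", "`ENAME`", None),
]
for m in stamp_default(measures, all_rows=False):
    sql = unparse(m)
    print(sql, "->", "parses" if old_grammar_accepts(sql) else "rejected")
```

In this model the second measure comes out as FINAL followed by a plain column reference, which the approximated old grammar rejects, while the same measure without the stamped keyword would still be accepted.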

Query for the second case:

SELECT *
  from "product" match_recognize
  (
   measures STRT."net_weight" as start_nw,
   FINAL COUNT("net_weight") as down_cnt,
   RUNNING COUNT("net_weight") as running_cnt
    pattern (strt down+ up+)
    define
      down as down."net_weight" < PREV(down."net_weight"),
      up as up."net_weight" > prev(up."net_weight")
  ) mr

Here one of the measures is declared RUNNING, but because the query is not ALL ROWS PER MATCH, RUNNING is replaced with FINAL.
The problem is that the query is unparsed as

SELECT *
FROM (SELECT *
FROM "foodmart"."product") 
MATCH_RECOGNIZE(
MEASURES 
FINAL "STRT"."net_weight" AS "START_NW",
FINAL COUNT("product"."net_weight") AS "DOWN_CNT", 
FINAL (RUNNING COUNT("product"."net_weight")) AS "RUNNING_CNT"
ONE ROW PER MATCH
AFTER MATCH SKIP TO NEXT ROW
PATTERN ("STRT" "DOWN" + "UP" +)
DEFINE 
"DOWN" AS PREV("DOWN"."net_weight", 0) < 
PREV("DOWN"."net_weight", 1), 
"UP" AS PREV("UP"."net_weight", 0) > 
PREV("UP"."net_weight", 1))

That is, FINAL and RUNNING appear together on the same measure, which makes the query unparsable.
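
A toy sketch of the second case and of the proposed direction follows. None of it is Calcite API: `wrap_at_validation` and `effective_semantics` are hypothetical helpers. The first mimics a rewrite that wraps the user's measure in the computed keyword; the second leaves the parsed measure alone and resolves the keyword only when asked, which is the role the description assigns to SqlToRelConverter. Respecting the user's keyword under ALL ROWS PER MATCH is an assumption of the sketch.

```python
from typing import Optional, Tuple

# A measure is modeled as (user_keyword, expression); user_keyword may be None.
Measure = Tuple[Optional[str], str]


def render(measure: Measure) -> str:
    keyword, expr = measure
    return f"{keyword} {expr}" if keyword else expr


def effective_semantics(measure: Measure, all_rows: bool) -> str:
    """Compute the keyword on demand without writing it back,
    so unparsing still sees the original measure."""
    keyword, _ = measure
    if not all_rows:
        return "FINAL"            # per the description, RUNNING becomes FINAL
    return keyword or "RUNNING"   # assumption of this sketch


def wrap_at_validation(measure: Measure, all_rows: bool) -> Measure:
    """Problematic variant: the computed keyword is wrapped around whatever
    the user wrote, and the rewritten measure is what gets unparsed."""
    computed = effective_semantics(measure, all_rows)
    keyword, expr = measure
    if keyword is None or keyword == computed:
        return (computed, expr)
    return (computed, f"({render(measure)})")


running_cnt: Measure = ("RUNNING", 'COUNT("product"."net_weight")')

# Rewritten measure: FINAL wrapped around the user's RUNNING call.
print(render(wrap_at_validation(running_cnt, all_rows=False)))
# Untouched measure, with the semantics resolved separately.
print(render(running_cnt), "| evaluated as",
      effective_semantics(running_cnt, all_rows=False))
```

The point of the second function is only that the keyword used for evaluation and the text used for unparsing no longer have to be the same object.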

Comment on lines 7313 to 7315
+ "FINAL LAST(\"DOWN\".\"net_weight\", 0) AS \"BOTTOM_NW\", "
+ "FINAL LAST(\"UP\".\"net_weight\", 0) AS \"END_NW\"\n"
+ "ONE ROW PER MATCH\n"
Contributor Author

The old test was producing an unparsable query.

@snuyanzin snuyanzin force-pushed the calcite7465 branch 2 times, most recently from 91f44e4 to dd71d5b Compare April 21, 2026 15:19

@snuyanzin
Contributor Author

The PR title is subject to change once we align on the approach in Jira.

{ s = span(); }
func = NamedFunctionCall() {
return runningOp.createCall(s.end(func), func);
e = Expression3(ExprContext.ACCEPT_NON_QUERY) {
Contributor

Is there a spec of the grammar for this construct somewhere?
Ideally we would follow that.
I don't know enough about this construct to make a choice myself.

Contributor Author

@snuyanzin snuyanzin Apr 22, 2026

I'm not sure I understood the question. If it is about Expression3, then see https://github.com/snuyanzin/calcite/blob/dd71d5b3aa0bd004f4d5369986e50301b628b0ad/core/src/main/codegen/templates/Parser.jj#L4045-L4049

If it is about MATCH_RECOGNIZE, then I only have links to other vendors' documentation,
such as Snowflake: https://docs.snowflake.com/en/sql-reference/constructs/match_recognize#syntax

    [ MEASURES <expr> [AS] <alias> [, ... ] ]

Below that they explain FINAL and RUNNING (Snowflake doesn't allow FINAL or RUNNING for non-function expressions, whereas Calcite always sets either FINAL or RUNNING and unparses it that way):

expr ::= ... [ { RUNNING | FINAL } ] windowFunction ...

BigQuery doesn't seem to have FINAL or RUNNING at all: https://docs.cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#match_recognize_clause

Oracle doesn't use these keywords and determines FINAL or RUNNING implicitly: https://docs.oracle.com/cd/E29542_01/apirefs.1111/e12048/pattern_recog.htm#BEJBFEGJ

Contributor

OK, so there isn't a universally agreed standard for MATCH_RECOGNIZE.
It's fine for the parser to be more permissive and the validator to reject some things later (e.g., if you want to emulate restrictions of a specific dialect).
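
The split suggested here, a permissive parser followed by an optional dialect check, might look like the following toy sketch. It is not Calcite code: `parse_measure`, `check_measure`, and the `keyword_requires_function` switch are hypothetical, and the function-call test is a crude regular expression rather than a real grammar.

```python
import re
from typing import NamedTuple, Optional


class ParsedMeasure(NamedTuple):
    keyword: Optional[str]   # "FINAL", "RUNNING", or None
    expr: str


_CALL = re.compile(r"^\w+\s*\(.*\)$")   # crude "looks like f(...)" test


def parse_measure(text: str) -> ParsedMeasure:
    """Permissive parse: an optional FINAL/RUNNING followed by any expression."""
    head, _, rest = text.strip().partition(" ")
    if head.upper() in ("FINAL", "RUNNING") and rest:
        return ParsedMeasure(head.upper(), rest.strip())
    return ParsedMeasure(None, text.strip())


def check_measure(m: ParsedMeasure, keyword_requires_function: bool) -> None:
    """Later, stricter check that a dialect could opt into (hypothetical flag)."""
    if keyword_requires_function and m.keyword and not _CALL.match(m.expr):
        raise ValueError(
            f"{m.keyword} is only allowed before a function call: {m.expr}")


m = parse_measure("FINAL A.ename")
check_measure(m, keyword_requires_function=False)      # permissive: accepted
try:
    check_measure(m, keyword_requires_function=True)   # strict: rejected
except ValueError as e:
    print("rejected:", e)
```

Keeping the restriction out of the parser means the same parsed measure can be accepted or rejected depending on the rules being emulated.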

Contributor

But is Expression3 too permissive?
Can you make up some expressions that would be accepted which are not legal?

Contributor

Otherwise this approach is nice, because it is minimally invasive.

Contributor Author

@snuyanzin snuyanzin Apr 23, 2026

I spotted another case where this PR will help: #4907.
I tried another case, in which the expression under FIRST, LAST, NEXT, or PREV might be expanded, after which the parser is not able to parse it.
After switching to an expression, the parser started to parse such unparsed queries; there is an example in the linked PR and the related Jira.

@snuyanzin snuyanzin changed the title [CALCITE-7465] Unparse of MATCH_RECOGNIZE MEASURES might produce unparsable sql [CALCITE-7465] Make MATCH_RECOGNIZE tolerant to FINAL and RUNNING non function MEASURES Apr 23, 2026
@snuyanzin snuyanzin merged commit 915c769 into apache:main Apr 24, 2026
19 checks passed