feat(duckDB): Add transpilation support for ANY_VALUE function with HAVING MAX and MIN clauses #6325
I have a general comment here regarding the transpilation and the tie-breaking.
WITH data AS (
SELECT 'A' AS fruit, 20 AS sold UNION ALL
SELECT 'D' AS fruit, 20 AS sold UNION ALL
SELECT 'C' AS fruit, 0 AS sold UNION ALL
SELECT 'B' AS fruit, 0 AS sold
)
SELECT
ANY_VALUE(fruit = 'D' HAVING MAX sold) AS res
FROM data;
^ In BigQuery, this query returned false on every run.
WITH data AS (
SELECT 'A' AS fruit, 20 AS sold UNION ALL
SELECT 'D' AS fruit, 20 AS sold UNION ALL
SELECT 'C' AS fruit, 0 AS sold UNION ALL
SELECT 'B' AS fruit, 0 AS sold
)
SELECT arg_max(fruit = 'D', sold) AS res
FROM data;
^ In DuckDB, this query returns either true or false.
The point here is that aggregate functions without an explicit ordering are non-deterministic. This is expected behaviour: even though BigQuery returned false across multiple runs, ANY_VALUE may return any of the tied values, so its result is still non-deterministic. The same holds for DuckDB's aggregate functions.
@georgesittas any thoughts on that?
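To make the tie-breaking point concrete, here is a minimal Python sketch that lists every value the aggregate is allowed to return for the data above. The helper name `candidates_having_max` is hypothetical, `None` stands in for SQL NULL, and ignoring rows with a NULL key is an assumption; this models the semantics only and is not how either engine is implemented.

```python
from typing import Any, Iterable, List, Tuple

Row = Tuple[Any, Any]  # (value, key); None stands in for SQL NULL


def candidates_having_max(rows: Iterable[Row]) -> List[Any]:
    """Return every value that ANY_VALUE(value HAVING MAX key) could pick."""
    # Assumption: rows whose key is NULL never take part in the comparison.
    pool = [r for r in rows if r[1] is not None]
    if not pool:
        return []
    top = max(key for _, key in pool)
    # All rows tied at the maximum key are equally valid picks.
    return [value for value, key in pool if key == top]


data = [("A", 20), ("D", 20), ("C", 0), ("B", 0)]
# Same expression as in the queries above: fruit = 'D'
tied = candidates_having_max((fruit == "D", sold) for fruit, sold in data)
print(tied)  # [False, True]
```

Both `False` (from row A) and `True` (from row D) sit at the maximum `sold`, so an engine that consistently returns one of them and an engine that varies between them are both within the documented behaviour.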
bq:
WITH data AS (
SELECT 'A' AS fruit, 20 AS sold UNION ALL
SELECT NULL AS fruit, 25 AS sold UNION ALL
SELECT 'D' AS fruit, 20 AS sold
)
SELECT ANY_VALUE(fruit HAVING MAX sold) AS res
FROM data;
> null
duckdb:
WITH data AS (
SELECT 'A' AS fruit, 20 AS sold UNION ALL
SELECT NULL AS fruit, 25 AS sold UNION ALL
SELECT 'D' AS fruit, 20 AS sold
)
SELECT arg_max(fruit, sold) AS res
FROM data;
┌─────────┐
│   res   │
│ varchar │
├─────────┤
│ A       │
└─────────┘
It seems the current approach doesn't handle NULLs correctly (maybe use arg_max_null?).
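The difference between the two results above can be modelled with a short Python sketch. The function `pick_at_max` and its `skip_null_arg` flag are hypothetical names for illustration; the sketch assumes that the DuckDB result comes from rows with a NULL argument being dropped before the maximum is taken, and that the BigQuery result comes from the NULL argument being returned as-is.

```python
from typing import List, Optional, Tuple

Row = Tuple[Optional[str], Optional[int]]  # (arg, val); None models SQL NULL


def pick_at_max(rows: List[Row], skip_null_arg: bool) -> Optional[str]:
    """Return arg from the first row holding the largest val.

    skip_null_arg=True models the arg_max result shown above;
    skip_null_arg=False models the BigQuery result shown above.
    """
    pool = [r for r in rows if r[1] is not None]
    if skip_null_arg:
        # Rows whose arg is NULL are removed before looking for the maximum.
        pool = [r for r in pool if r[0] is not None]
    if not pool:
        return None
    top = max(val for _, val in pool)
    # Ties are broken by input order here; that is an assumption of this
    # sketch, since real engines make no such promise.
    return next(arg for arg, val in pool if val == top)


data = [("A", 20), (None, 25), ("D", 20)]
print(pick_at_max(data, skip_null_arg=True))   # A
print(pick_at_max(data, skip_null_arg=False))  # None
```

With NULL arguments skipped, the maximum drops from 25 to 20 and a different row wins, which is why the two engines disagree on this input.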
Regarding the tie-breaking, I'd say it seems fine, since BigQuery's docs specify that the result is non-deterministic.
The NULL handling seems more important, and we should fix it.
@fivetran-amrutabhimsenayachit so, let's just solve the |
Force-pushed from a89eef2 to 4a1350b
After using MIN: |
Add transpilation support for ANY_VALUE function with HAVING MAX and MIN clauses.
Issue:
Fix:
Transform ANY_VALUE(expr HAVING MAX/MIN having_expr) to ARG_MAX/ARG_MIN
MAX:
MIN:
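Taken as a whole, the rule in the Fix section can be sketched as a plain string mapping. The helper `rewrite_any_value` below is hypothetical and works on raw strings purely for illustration; it is not the PR's implementation and does not use the transpiler's API.

```python
def rewrite_any_value(expr: str, kind: str, having_expr: str) -> str:
    """Render the target call for ANY_VALUE(expr HAVING <kind> having_expr)."""
    # Mapping taken from the Fix description: MAX -> ARG_MAX, MIN -> ARG_MIN.
    targets = {"MAX": "ARG_MAX", "MIN": "ARG_MIN"}
    func = targets.get(kind.upper())
    if func is None:
        raise ValueError(f"unsupported HAVING kind: {kind!r}")
    return f"{func}({expr}, {having_expr})"


print(rewrite_any_value("fruit", "MAX", "sold"))  # ARG_MAX(fruit, sold)
print(rewrite_any_value("fruit", "MIN", "sold"))  # ARG_MIN(fruit, sold)
```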