Skip to content

Add support for non-correlated subqueries #3266

@andygrove

Description

@andygrove

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

I would like to be able to run this query:

CREATE TABLE paintings AS SELECT 'Mona Lisa' as name, 1000000 as listed_price;

SELECT name, listed_price
FROM paintings
WHERE listed_price > (
    SELECT AVG(listed_price)
    FROM paintings
);

It currently fails with:

Skipping optimizer rule decorrelate_scalar_subquery due to unexpected error: scalar subqueries must have a filter to be correlated at /home/andy/git/apache/arrow-datafusion/datafusion/optimizer/src/decorrelate_scalar_subquery.rs:177
caused by
Error during planning: Could not coerce into Filter! at /home/andy/git/apache/arrow-datafusion/datafusion/expr/src/logical_plan/plan.rs:1127
NotImplemented("Physical plan does not support logical expression (<subquery>)")

This is because the existing subquery optimizer rules fail to rewrite the query.

It should be possible to rewrite the query as a join:

SELECT name, listed_price
FROM paintings
CROSS JOIN (SELECT AVG(listed_price) AS avg_price FROM paintings) temp
WHERE listed_price > temp.avg_price;

Describe the solution you'd like
☝️

Describe alternatives you've considered
None

Additional context
None

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or requestsqlSQL Planner

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions