
[substrait] Scalar subquery in select not supported #18066

@bvolpato-dd

Describe the bug

For certain queries containing scalar subqueries, DataFusion's ScalarSubqueryToJoin optimizer rule rewrites the subquery into a LEFT join that has no join condition.

A recent change that requires joins to have a condition (#15334) added a validation that now rejects this type of query:

---- cases::roundtrip_logical_plan::scalar_subquery_in_select stdout ----
Error: Plan("join condition should not be empty")

To Reproduce

SELECT a, (SELECT MAX(b) FROM data2) as max_b FROM data

Returns error:

Error: Plan("join condition should not be empty")

Expected behavior

The query should succeed, with Boolean(true) in the filter:

Projection: data.a, max(data2.b) AS max_b
  Left Join: 
    TableScan: data projection=[a]
    Aggregate: groupBy=[[]], aggr=[[max(data2.b)]]
      TableScan: data2 projection=[b], partial_filters=[Boolean(true)]
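
One way to picture the expected behavior is a fallback to a literal `true` when a join carries neither equi-join keys nor a filter. The sketch below is a toy model only: `Expr` and `join_condition` are hypothetical names invented for illustration and are not DataFusion or Substrait APIs, and it assumes the fix is to emit a constant-true condition rather than return an error.

```rust
// Toy expression type for illustration; NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(bool),
    Column(String),
    Eq(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Hypothetical helper: combine equi-join keys and an optional filter into
/// the single condition a plan serializer would emit for a join.
/// When both are absent, fall back to `Literal(true)` instead of failing.
fn join_condition(on: &[(Expr, Expr)], filter: Option<Expr>) -> Expr {
    // Turn each (left, right) key pair into `left = right`.
    let keys = on
        .iter()
        .map(|(l, r)| Expr::Eq(Box::new(l.clone()), Box::new(r.clone())));

    // AND the key equalities and the optional filter together, left to right.
    keys.chain(filter)
        .reduce(|acc, e| Expr::And(Box::new(acc), Box::new(e)))
        // Nothing to combine: the join is unconditional, so use `true`.
        .unwrap_or(Expr::Literal(true))
}
```

With this shape, the unconditioned LEFT join produced for the query above would serialize with a `true` condition, while joins that do have keys or a filter are unaffected.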

Additional context

No response

Labels

bug (Something isn't working)
