fix(substrait): Allow scalar subquery in projection #18068

Closed
bvolpato wants to merge 1 commit into apache:main from bvolpato:fix/substrait-scalar-subquery
@bvolpato
Contributor

Which issue does this PR close?

Rationale for this change

Validations recently added in #15334 started to cause problems for some LEFT JOIN queries that have no join condition, for example:

SELECT a, (SELECT MAX(b) FROM data2) as max_b FROM data

The optimizer rewrites this query as a left join with no join condition.
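
To make the semantics concrete, here is a minimal sketch in plain Rust (standard library only, not DataFusion code) of what a left join with an always-true condition computes for the query above. The helper names `left_join_on_true` and `select_a_with_max_b` are hypothetical and exist only for illustration; rows are modeled as `i64` values and SQL NULL as `None`.

```rust
/// Hypothetical helper, not a DataFusion API: a left join whose condition is
/// always true. Each left row is paired with every right row; if the right
/// side is empty, the left row is still kept, padded with `None` (SQL NULL).
fn left_join_on_true(left: &[i64], right: &[Option<i64>]) -> Vec<(i64, Option<i64>)> {
    let mut out = Vec::new();
    for &a in left {
        if right.is_empty() {
            // Left-join semantics: unmatched left rows survive with NULLs.
            out.push((a, None));
        } else {
            for &b in right {
                out.push((a, b));
            }
        }
    }
    out
}

/// Models `SELECT a, (SELECT MAX(b) FROM data2) AS max_b FROM data`.
fn select_a_with_max_b(data: &[i64], data2: &[i64]) -> Vec<(i64, Option<i64>)> {
    // An aggregate without GROUP BY produces exactly one row;
    // MAX over zero rows is NULL, modeled here as `None`.
    let max_b: Option<i64> = data2.iter().copied().max();
    left_join_on_true(data, &[max_b])
}
```

Calling `select_a_with_max_b(&[1, 2, 3], &[10, 40, 20])` pairs every `a` with `Some(40)`. The empty-right branch is what distinguishes a left join from a cross join: a cross join against an empty right side would return no rows at all.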

The fix follows the same approach as #15334:

vec![Expr::Literal(ScalarValue::Boolean(Some(true)), None)],

What changes are included in this PR?

For joins without conditions, use cross_join if the join type is inner; otherwise use a filter with Boolean(true).
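
The rule above can be sketched as a small decision function. This is a simplified, self-contained model in plain Rust, not the actual consumer code from this PR: the `JoinType`, `Condition`, and `JoinPlan` types and the `plan_join` function are hypothetical stand-ins for the real DataFusion types.

```rust
// Hypothetical, simplified stand-ins for the real plan types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum JoinType {
    Inner,
    Left,
    Right,
    Full,
}

#[derive(Debug, Clone, PartialEq)]
enum Condition {
    /// Stands in for a Boolean(true) literal used as the join filter.
    LiteralTrue,
    /// Any user-supplied join expression, kept as text for simplicity.
    Expr(String),
}

#[derive(Debug, PartialEq)]
enum JoinPlan {
    CrossJoin,
    Join { join_type: JoinType, on: Condition },
}

/// Chooses how to build a join given an optional join condition.
fn plan_join(join_type: JoinType, condition: Option<Condition>) -> JoinPlan {
    match (condition, join_type) {
        // A real condition is passed through unchanged.
        (Some(on), jt) => JoinPlan::Join { join_type: jt, on },
        // Inner join without a condition is a cross join.
        (None, JoinType::Inner) => JoinPlan::CrossJoin,
        // Outer joins keep their type and get an always-true filter.
        (None, jt) => JoinPlan::Join { join_type: jt, on: Condition::LiteralTrue },
    }
}
```

Under this model, `plan_join(JoinType::Left, None)` keeps the left join and attaches `Condition::LiteralTrue`, while `plan_join(JoinType::Inner, None)` becomes `JoinPlan::CrossJoin`. Keeping the join type for the outer cases matters because an outer join preserves unmatched rows, which a cross join would not.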

Are these changes tested?

Added a roundtrip test case.

Are there any user-facing changes?

n/a

@github-actions github-actions Bot added the substrait (Changes to the substrait crate) label on Oct 15, 2025
@bvolpato-dd bvolpato-dd force-pushed the fix/substrait-scalar-subquery branch from 88a4027 to 5cbfa7b on October 15, 2025 04:19
Comment on lines +79 to +82
// For joins without conditions, use cross_join if inner, otherwise use a filter with Boolean(true)
if join_type == JoinType::Inner {
    left.cross_join(right.build()?)?.build()
} else {
Contributor


One thing to take into account is that Postgres does not allow this. In Postgres, it's invalid to provide an inner join without a join condition (although other DBs like MySQL allow it).

My impression is that the change in #15334 was trying to adhere to this standard, which is a fair decision, so the Substrait path should probably adhere to it as well unless there's a strong reason to diverge.

I don't have a strong opinion about whether we should automatically convert inner joins without conditions to cross joins, but I'd advocate for consistency between the normal SQL path and the Substrait path.

@github-actions

Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.

@github-actions github-actions Bot added the Stale (PR has not had any activity for some time) label on Dec 16, 2025
@github-actions github-actions Bot closed this Dec 23, 2025

Labels

Stale: PR has not had any activity for some time
substrait: Changes to the substrait crate

Development

Successfully merging this pull request may close these issues.

[substrait] Scalar subquery in select not supported

2 participants