feat: Integrate CastColumnExpr into PhysicalExprAdapter #20269

Merged
kosiew merged 3 commits into apache:main from kumarUjjawal:feat/cast_to_physical_adapter
Mar 9, 2026

@kumarUjjawal
Contributor

Which issue does this PR close?

Rationale for this change

Ensure physical expression schema adaptation uses struct-aware casting so schema evolution for nested structs preserves field ordering, fills missing fields with NULLs, and propagates logical field metadata/nullability correctly.

What changes are included in this PR?

Wire schema adaptation to emit struct-aware cast expressions for column/field mismatches (including metadata and nullability), extend expression property propagation so optimizer logic continues to work, and add unit + integration coverage for struct projection and filtering under schema evolution.
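As a rough illustration of the behavior described above (target field order preserved, missing fields filled with NULLs), here is a minimal std-only sketch. The `Value`, `StructRow`, and `adapt_struct` names are hypothetical stand-ins invented for this example; they are not DataFusion or Arrow APIs, and the real adapter works on Arrow arrays and also handles metadata and nullability.

```rust
// Toy model only: these types are hypothetical, not DataFusion or Arrow APIs.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Text(String),
}

/// A struct value as read from a file: (field name, value) pairs in file order.
type StructRow = Vec<(String, Option<Value>)>;

/// Rebuilds `source` in the order given by `target_fields`, matching by name.
/// Fields missing from the source become NULL (`None`); source fields that
/// the target does not list are dropped.
fn adapt_struct(source: &StructRow, target_fields: &[&str]) -> StructRow {
    target_fields
        .iter()
        .map(|name| {
            let value = source
                .iter()
                .find(|(n, _)| n.as_str() == *name)
                .and_then(|(_, v)| v.clone());
            (name.to_string(), value)
        })
        .collect()
}

fn main() {
    // An "old" file wrote the struct as {b, a}; the table schema is {a, b, c}.
    let old_file_row: StructRow = vec![
        ("b".to_string(), Some(Value::Text("x".to_string()))),
        ("a".to_string(), Some(Value::Int(1))),
    ];
    let adapted = adapt_struct(&old_file_row, &["a", "b", "c"]);
    println!("{adapted:?}");
}
```

The sketch matches fields by name rather than by position, which is what lets reordered or partially missing struct fields survive schema evolution.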

Are these changes tested?

Yes, added new unit tests as well.

Are there any user-facing changes?

No

@github-actions github-actions bot added the physical-expr (Changes to the physical-expr crates) and core (Core DataFusion crate) labels Feb 10, 2026
@kumarUjjawal
Contributor Author

cc @kosiew

Contributor

@adriangb adriangb left a comment

This looks like a step in the right direction to me, but I still really think we need to urgently merge CastColumnExpr into CastExpr to avoid more code paths and drift accumulating.

Comment on lines +182 to +183
/// A [`CastColumnExpr`] preserves the ordering of its child if the cast is done
/// under the same datatype family.

Do we have a high-level SLT test for this? I'm afraid that as we refactor to e.g. fold CastColumnExpr into CastExpr we'll lose this property (which seems very valuable).
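For readers unfamiliar with the property under discussion, here is a small std-only sketch of why a cast within one datatype family can preserve ordering while a cast across families may not. It reflects one plausible reading of "same datatype family" (integer widening versus integer-to-text); the helper `is_non_decreasing` is a made-up name, and nothing here is DataFusion code.

```rust
/// Returns true if `values` is in non-decreasing order.
fn is_non_decreasing<T: PartialOrd>(values: &[T]) -> bool {
    values.windows(2).all(|w| w[0] <= w[1])
}

fn main() {
    let sorted_input: Vec<i32> = vec![2, 9, 10, 100];

    // Widening within the integer family keeps the relative order of values.
    let widened: Vec<i64> = sorted_input.iter().map(|&v| i64::from(v)).collect();

    // Casting to text switches to lexicographic comparison, so "9" > "10".
    let as_text: Vec<String> = sorted_input.iter().map(|v| v.to_string()).collect();

    println!("widened still sorted: {}", is_non_decreasing(&widened)); // true
    println!("text still sorted: {}", is_non_decreasing(&as_text)); // false
}
```

An optimizer that relies on a known sort order can skip a re-sort only in the first case, which is why losing the property during a refactor would be costly.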

Comment on lines +856 to +857
} else if let Some(cast_expr) =
r_expr.as_any().downcast_ref::<CastColumnExpr>()

I think this is the unfortunate part of having two cast expressions: we have to handle both in multiple places.

@paleolimbot this makes me think that even customizing casts with a cast registry that chooses a different cast expression will be problematic. It almost feels like we either need to (1) have a single Cast expression that internally dispatches (so code snippets like this can match on a single expression) or (2) have some way to determine "is this a cast expression?". To complicate matters further: do all cast expressions share the same properties e.g. preserving order (see)
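A minimal sketch of option (2), assuming a simplified `PhysicalExpr` trait that only exposes `as_any`. The trait, the unit structs, and the `is_cast` helper below are hypothetical stand-ins written for this illustration; the real DataFusion types carry far more state, and no such helper is claimed to exist in the codebase.

```rust
use std::any::Any;

// Hypothetical stand-ins for illustration; the real trait and structs are richer.
trait PhysicalExpr {
    fn as_any(&self) -> &dyn Any;
}

struct CastExpr;
struct CastColumnExpr;
struct Column;

impl PhysicalExpr for CastExpr {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl PhysicalExpr for CastColumnExpr {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl PhysicalExpr for Column {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// One place that answers "is this a cast expression?", so call sites do not
/// each need their own chain of `downcast_ref` branches per cast type.
fn is_cast(expr: &dyn PhysicalExpr) -> bool {
    let any = expr.as_any();
    any.is::<CastExpr>() || any.is::<CastColumnExpr>()
}

fn main() {
    println!("CastExpr: {}", is_cast(&CastExpr));
    println!("CastColumnExpr: {}", is_cast(&CastColumnExpr));
    println!("Column: {}", is_cast(&Column));
}
```

A helper like this still has to be updated whenever a new cast expression type appears, which is the drift concern raised above; option (1), a single cast expression that dispatches internally, avoids that at the cost of a larger refactor.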

@github-actions github-actions bot added the sqllogictest (SQL Logic Tests (.slt)) label Mar 6, 2026
@kosiew kosiew added this pull request to the merge queue Mar 9, 2026
Merged via the queue into apache:main with commit 1eb5206 Mar 9, 2026
34 checks passed
de-bgunter pushed a commit to de-bgunter/datafusion that referenced this pull request Mar 24, 2026
## Which issue does this PR close?

- Closes apache#20163

## Rationale for this change

Ensure physical expression schema adaptation uses struct-aware casting
so schema evolution for nested structs preserves field ordering, fills
missing fields with NULLs, and propagates logical field
metadata/nullability correctly.

## What changes are included in this PR?

Wire schema adaptation to emit struct-aware cast expressions for
column/field mismatches (including metadata and nullability), extend
expression property propagation so optimizer logic continues to work,
and add unit + integration coverage for struct projection and filtering
under schema evolution.

## Are these changes tested?

Yes, added new unit tests as well.

## Are there any user-facing changes?

No

Labels

core (Core DataFusion crate), physical-expr (Changes to the physical-expr crates), sqllogictest (SQL Logic Tests (.slt))

Development

Successfully merging this pull request may close these issues.

Integrate CastColumnExpr into PhysicalExprAdapter