@adriangb adriangb commented Sep 4, 2025

#17395 (comment)

This makes code more legible and easier to follow.

@adriangb adriangb requested a review from blaginin September 4, 2025 00:55
@github-actions github-actions bot added optimizer Optimizer rules core Core DataFusion crate proto Related to proto crate datasource Changes to the datasource crate physical-plan Changes to the physical-plan crate labels Sep 4, 2025
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Sep 4, 2025

@alamb alamb left a comment

Thanks @adriangb

I agree this is a nicer API. The only thing I worry about is the disruption to existing users.

I left a suggestion that I think would reduce the API churn needed substantially. Let me know what you think

(col("c1", schema.as_ref()).unwrap(), "c1".to_string()),
(col("c2", schema.as_ref()).unwrap(), "c2".to_string()),
(col("c3", schema.as_ref()).unwrap(), "c3".to_string()),
ProjectionExpr {

We might be able to make the API change easier to handle by using a From impl:

so like instead of

vec![
 (col("c1", schema.as_ref()).unwrap(), "c1".to_string()),
]

something like

vec![
 ProjectionExpr::from((col("c1", schema.as_ref()).unwrap(), "c1".to_string())),
]

This would mean that people upgrading could simply add ProjectionExpr::from
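
To make the suggestion concrete, here is a minimal, self-contained sketch of such a `From` impl. The `PhysicalExpr` trait, `Column` type, `col` helper (which takes no schema here), and `upgraded_call_site` function are simplified stand-ins written for this illustration, not DataFusion's real definitions. The shape of `ProjectionExpr` (an expression plus an alias) is also an assumption based on the tuples in the diff above.

```rust
use std::sync::Arc;

// Stand-in for DataFusion's `PhysicalExpr` trait (hypothetical, illustration only).
trait PhysicalExpr {
    fn name(&self) -> String;
}

// Stand-in column expression (hypothetical).
struct Column(String);

impl PhysicalExpr for Column {
    fn name(&self) -> String {
        self.0.clone()
    }
}

// Assumed shape of the new struct: an expression plus its output alias.
struct ProjectionExpr {
    expr: Arc<dyn PhysicalExpr>,
    alias: String,
}

// `From::from` takes a single argument, so the old pair is passed as one tuple.
impl From<(Arc<dyn PhysicalExpr>, String)> for ProjectionExpr {
    fn from((expr, alias): (Arc<dyn PhysicalExpr>, String)) -> Self {
        Self { expr, alias }
    }
}

// Hypothetical helper standing in for `col(name, schema).unwrap()`.
fn col(name: &str) -> Arc<dyn PhysicalExpr> {
    Arc::new(Column(name.to_string()))
}

// What an upgraded call site could look like.
fn upgraded_call_site() -> Vec<ProjectionExpr> {
    vec![
        // Wrap the existing tuple in `ProjectionExpr::from`.
        ProjectionExpr::from((col("c1"), "c1".to_string())),
        // `.into()` also works once the target type is known.
        (col("c2"), "c2".to_string()).into(),
    ]
}
```

Note the double parentheses: the tuple is the single argument to `from`, so existing `(expr, alias)` pairs can be wrapped without being rewritten.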


Going one step further, I think we could make ProjectionExec generic, like this

impl ProjectionExec {
    /// Create a projection on an input
    pub fn try_new(
        expr: impl IntoIterator<Item=ProjectionExpr>,
        input: Arc<dyn ExecutionPlan>,
    ) -> Result<Self> {

That would let people use the same code mostly without change
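
A rough sketch of how a generic constructor could accept both the old tuples and the new struct. Everything here is a simplified stand-in: `String` replaces the real expression type, the `input` plan argument is omitted, and `Result<_, String>` replaces DataFusion's `Result`. The `Into<ProjectionExpr>` bound on the iterator item is an assumption for this illustration; the snippet above shows `Item=ProjectionExpr`.

```rust
// Simplified stand-in for the new struct; `String` replaces the real expression type.
#[derive(Debug, Clone, PartialEq)]
struct ProjectionExpr {
    expr: String,
    alias: String,
}

// Conversion from the old tuple form.
impl From<(String, String)> for ProjectionExpr {
    fn from((expr, alias): (String, String)) -> Self {
        Self { expr, alias }
    }
}

// Simplified stand-in for the execution plan node.
#[derive(Debug)]
struct ProjectionExec {
    exprs: Vec<ProjectionExpr>,
}

impl ProjectionExec {
    // Assumed bound: any iterator whose items convert into `ProjectionExpr`.
    // The real constructor also takes an input plan, omitted here.
    fn try_new<I, E>(expr: I) -> Result<Self, String>
    where
        I: IntoIterator<Item = E>,
        E: Into<ProjectionExpr>,
    {
        let exprs: Vec<ProjectionExpr> = expr.into_iter().map(Into::into).collect();
        Ok(Self { exprs })
    }
}

// Old call style: a vector of tuples still compiles.
fn old_style() -> Result<ProjectionExec, String> {
    ProjectionExec::try_new(vec![("c1".to_string(), "c1".to_string())])
}

// New call style: a vector of structs. This works because every type
// implements `Into` for itself.
fn new_style() -> Result<ProjectionExec, String> {
    ProjectionExec::try_new(vec![ProjectionExpr {
        expr: "c1".to_string(),
        alias: "c1".to_string(),
    }])
}
```

Under these assumptions, existing callers that pass `vec![(expr, alias), ...]` would not need to change, while new code can build `ProjectionExpr` values directly.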

Update: I tried it out and it seems to work pretty well. Check out

Contributor Author


Nice, yeah, I was thinking of doing that but thought it best to update our internal tests anyway.

The unfortunate part is that there's no way to mark an implementation as deprecated: ideally we'd allow this for a couple releases then remove it to simplify things, but that's not possible. I guess we keep it around forever.

@alamb alamb added the api change Changes the API exposed to users of the crate label Sep 4, 2025
* Keep ProjectionExec mostly backwards compatible

* Add doc examples and explicit test for old API
@adriangb adriangb merged commit a951fc9 into apache:main Sep 4, 2025
29 checks passed
destrex271 pushed a commit to destrex271/datafusion that referenced this pull request Sep 5, 2025
