ARROW-11690: [Rust][DataFusion] Avoid expr copies while using builder methods #9527
- pub fn eq(&self, other: Expr) -> Expr {
-     binary_expr(self.clone(), Operator::Eq, other)
+ /// Return `self == other`
+ pub fn eq(self, other: Expr) -> Expr {
The change is to have all these functions take `self` rather than `&self`.
Given that they are almost always used like `let expr = lit("foo").eq(lit("bar"))`, `self` is not reused.
Furthermore, the `clone()` is actually a deep clone, so in the above example `lit("foo")` gets copied twice.
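To make the cost concrete, here is a minimal sketch of the two signatures side by side. It does not use DataFusion itself: `Expr`, `Operator`, `lit`, and `binary_expr` are toy stand-ins with the same shape, and `eq_by_ref`, `CLONED_NODES`, and `cloned_nodes` are hypothetical names introduced only to count how many nodes a call clones.

```rust
use std::cell::Cell;

thread_local! {
    // Counts how many `Expr` nodes have been cloned on this thread.
    static CLONED_NODES: Cell<usize> = Cell::new(0);
}

/// Number of `Expr` nodes cloned so far on the current thread.
pub fn cloned_nodes() -> usize {
    CLONED_NODES.with(|count| count.get())
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Operator {
    Eq,
}

/// Toy stand-in for DataFusion's `Expr` (the real enum has many more variants).
#[derive(Debug, PartialEq)]
pub enum Expr {
    Literal(String),
    BinaryExpr {
        left: Box<Expr>,
        op: Operator,
        right: Box<Expr>,
    },
}

// Hand-written `Clone` so that every cloned node is counted.
impl Clone for Expr {
    fn clone(&self) -> Self {
        CLONED_NODES.with(|count| count.set(count.get() + 1));
        match self {
            Expr::Literal(value) => Expr::Literal(value.clone()),
            Expr::BinaryExpr { left, op, right } => Expr::BinaryExpr {
                left: left.clone(), // recurses into the child: a deep clone
                op: *op,
                right: right.clone(),
            },
        }
    }
}

pub fn lit(value: &str) -> Expr {
    Expr::Literal(value.to_string())
}

fn binary_expr(left: Expr, op: Operator, right: Expr) -> Expr {
    Expr::BinaryExpr {
        left: Box::new(left),
        op,
        right: Box::new(right),
    }
}

impl Expr {
    /// Old style: borrows `self`, so the whole subtree must be cloned.
    pub fn eq_by_ref(&self, other: Expr) -> Expr {
        binary_expr(self.clone(), Operator::Eq, other)
    }

    /// New style: takes ownership of `self` and moves it into the new node.
    pub fn eq(self, other: Expr) -> Expr {
        binary_expr(self, Operator::Eq, other)
    }
}
```

In this toy model, `lit("a").eq(lit("b")).eq(lit("c"))` clones no nodes at all, whereas calling `eq_by_ref` on the three-node tree `lit("a").eq(lit("b"))` clones all three nodes before wrapping them.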
  min_column_expr
      .lt_eq(expr_builder.scalar_expr().clone())
-     .and(expr_builder.scalar_expr().lt_eq(max_column_expr))
+     .and(expr_builder.scalar_expr().clone().lt_eq(max_column_expr))
This is the only change where the caller had to do an extra clone (and in this case it was actually needed, as `scalar_expr()` is used twice). In other locations the expr was not used again and thus can be consumed without issue.
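The call-site pattern described here can be sketched roughly as follows. This is not the DataFusion code: `Expr`, `Operator`, and `lit` are simplified stand-ins, and `between` is a hypothetical helper that plays the role of the range check in the diff, with `scalar` borrowed in the same way `scalar_expr()` hands out a reference.

```rust
use std::fmt;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Operator {
    LtEq,
    And,
}

/// Simplified stand-in for DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Literal(i64),
    BinaryExpr {
        left: Box<Expr>,
        op: Operator,
        right: Box<Expr>,
    },
}

impl fmt::Display for Operator {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Operator::LtEq => write!(f, "<="),
            Operator::And => write!(f, "AND"),
        }
    }
}

impl fmt::Display for Expr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Expr::Literal(value) => write!(f, "{}", value),
            Expr::BinaryExpr { left, op, right } => write!(f, "{} {} {}", left, op, right),
        }
    }
}

pub fn lit(value: i64) -> Expr {
    Expr::Literal(value)
}

fn binary_expr(left: Expr, op: Operator, right: Expr) -> Expr {
    Expr::BinaryExpr {
        left: Box::new(left),
        op,
        right: Box::new(right),
    }
}

impl Expr {
    /// Consumes `self`; no hidden clone inside the builder.
    pub fn lt_eq(self, other: Expr) -> Expr {
        binary_expr(self, Operator::LtEq, other)
    }

    /// Consumes `self`; no hidden clone inside the builder.
    pub fn and(self, other: Expr) -> Expr {
        binary_expr(self, Operator::And, other)
    }
}

/// Hypothetical helper: builds `min <= scalar AND scalar <= max`.
/// `scalar` is only borrowed and appears twice in the result, so the
/// caller clones it explicitly at each use; `min` and `max` are moved.
pub fn between(min: Expr, scalar: &Expr, max: Expr) -> Expr {
    min.lt_eq(scalar.clone())
        .and(scalar.clone().lt_eq(max))
}
```

With by-value builders the clones are visible at the call site, and they happen only for the operand that genuinely needs to appear in two places.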
andygrove
left a comment
LGTM!
Codecov Report
@@ Coverage Diff @@
## master #9527 +/- ##
==========================================
+ Coverage 82.24% 82.27% +0.03%
==========================================
Files 244 244
Lines 55289 55371 +82
==========================================
+ Hits 45470 45556 +86
+ Misses 9819 9815 -4
FYI @seddonm1 and @Dandandan
Dandandan
left a comment
LGTM!
LGTM
This is part of a larger body of work I would like to do on DataFusion to make it more efficient and idiomatic Rust. See https://issues.apache.org/jira/browse/ARROW-11689 for more context.
The theme is to make the plan and expression rewriting phases of DataFusion more efficient by avoiding copies.
This particular PR avoids deep cloning `Expr`s when building up new exprs. While this is technically a backwards-incompatible change, given there was only a single place in the DataFusion codebase that needed to be updated, I think the impact will be minimal.
The basic principle is that if a function needs to `clone` one of its arguments, the caller should be given the choice of when to do that. Often, the caller has no more need of the object and thus can give up ownership without issue.