Skip to content

Conversation

@alamb
Copy link
Contributor

@alamb alamb commented Feb 18, 2021

This is part of a larger body of work I would like to do to DataFusion to make it more efficient and idomatic Rust. See https://issues.apache.org/jira/browse/ARROW-11689 for more context.

The theme is to make the plan and expression rewriting phases of DataFusion more efficient by avoiding copies

This particular PR avoids deep cloning Exprs when building up new exprs. While this is technically a backwards incompatible change, given there was only a single place in the datafusion codebase that needs to be updated, I think the impact will be minimal.

The basic principle is if the function needs to clone one of its arguments, the caller should be given the choice of when to do that. Often, the caller has no more need of the object and thus can give up ownership without issue

@github-actions
Copy link

pub fn eq(&self, other: Expr) -> Expr {
binary_expr(self.clone(), Operator::Eq, other)
/// Return `self == other`
pub fn eq(self, other: Expr) -> Expr {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change is to have all these functions take self rather than &self.

Given they almost always are used like let expr = lit("foo").eq(lit("bar")) self is not reused.

Furthermore, the clone() is actually doing a deep clone, so in the above example lit("foo") gets copied twice.

min_column_expr
.lt_eq(expr_builder.scalar_expr().clone())
.and(expr_builder.scalar_expr().lt_eq(max_column_expr))
.and(expr_builder.scalar_expr().clone().lt_eq(max_column_expr))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the only change where the caller had to do an extra clone (and in this case it was actually needed as scalar_expr() is used twice. In other locations the expr was ignored and this can be consumed without issue

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@codecov-io
Copy link

Codecov Report

Merging #9527 (5fa7da8) into master (bca7d2f) will increase coverage by 0.03%.
The diff coverage is 91.01%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #9527      +/-   ##
==========================================
+ Coverage   82.24%   82.27%   +0.03%     
==========================================
  Files         244      244              
  Lines       55289    55371      +82     
==========================================
+ Hits        45470    45556      +86     
+ Misses       9819     9815       -4     
Impacted Files Coverage Δ
rust/arrow/src/compute/kernels/comparison.rs 95.84% <ø> (ø)
rust/datafusion/src/logical_plan/expr.rs 80.75% <84.84%> (ø)
rust/arrow/src/json/reader.rs 83.63% <89.89%> (+0.81%) ⬆️
rust/arrow/src/compute/kernels/cast.rs 97.28% <100.00%> (+0.04%) ⬆️
rust/datafusion/src/physical_plan/limit.rs 57.47% <100.00%> (-0.49%) ⬇️
rust/datafusion/src/physical_plan/parquet.rs 88.05% <100.00%> (ø)
rust/datafusion/src/physical_plan/planner.rs 79.58% <100.00%> (-0.05%) ⬇️
rust/datafusion/src/physical_plan/sort.rs 90.17% <100.00%> (+0.90%) ⬆️
rust/parquet/src/encodings/encoding.rs 95.05% <0.00%> (+0.19%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a951c6c...5fa7da8. Read the comment docs.

@alamb alamb requested a review from jorgecarleitao February 20, 2021 12:20
@alamb
Copy link
Contributor Author

alamb commented Feb 20, 2021

FYI @seddonm1 and @Dandandan

Copy link
Contributor

@Dandandan Dandandan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@seddonm1
Copy link
Contributor

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants