Conversation

@findepi
Member

@findepi findepi commented Aug 8, 2025

Previously, the WindowUDFImpl trait contained equals and
hash_value methods with contracts following the Eq and Hash
traits. However, the default implementations of these methods made
the trait error-prone: many functions (scalar, aggregate, window)
did not override equals even though they should have.
No fix for this avoids a breaking API change, so a breaking
change is warranted.
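
To illustrate the pitfall, here is a minimal sketch, not the actual DataFusion code. `LegacyWindowFn` and `Shift` are hypothetical stand-ins, and the name-only default `equals` is an assumption made for illustration. The point is only that a defaulted `equals` compiles silently even when the implementation carries state the default ignores.

```rust
/// Hypothetical stand-in for a function trait with a defaulted `equals`.
trait LegacyWindowFn {
    fn name(&self) -> &str;

    /// Assumed default for illustration: compares by name only.
    fn equals(&self, other: &dyn LegacyWindowFn) -> bool {
        self.name() == other.name()
    }
}

/// A function parameterized by state that affects its results.
struct Shift {
    offset: i64,
}

impl LegacyWindowFn for Shift {
    fn name(&self) -> &str {
        "shift"
    }
    // `equals` is not overridden, and nothing forces the author to do so.
}

fn main() {
    let a = Shift { offset: 1 };
    let b = Shift { offset: -1 };
    // The two instances are configured differently...
    assert_ne!(a.offset, b.offset);
    // ...yet the default considers them equal.
    assert!(a.equals(&b));
}
```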

Removing the default implementations would be enough of a solution, but
at the cost of a lot of boilerplate needed in implementations.

Instead, this removes the methods from the trait and reuses the DynEq
and DynHash traits, previously used only for physical expressions. This
allows functions to provide their implementations using no more than
#[derive(PartialEq, Eq, Hash)] in a typical case.
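
A self-contained sketch of how this pattern can work, assuming blanket implementations over `Eq + Any` and `Hash + Any`. The `DynEq` and `DynHash` definitions below are simplified re-creations for illustration, and `WindowFn`, `Shift`, `RowNumber`, `same_function` and `hash_of` are hypothetical names, not the DataFusion API.

```rust
use std::any::Any;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Object-safe equality, provided for every `Eq + Any` type.
trait DynEq {
    fn dyn_eq(&self, other: &dyn Any) -> bool;
}

impl<T: Eq + Any> DynEq for T {
    fn dyn_eq(&self, other: &dyn Any) -> bool {
        // Equal only if `other` has the same concrete type and compares equal.
        other.downcast_ref::<Self>() == Some(self)
    }
}

/// Object-safe hashing, provided for every `Hash + Any` type.
trait DynHash {
    fn dyn_hash(&self, state: &mut dyn Hasher);
}

impl<T: Hash + Any> DynHash for T {
    fn dyn_hash(&self, mut state: &mut dyn Hasher) {
        // Mix in the type id so different types with equal fields hash apart.
        self.type_id().hash(&mut state);
        self.hash(&mut state);
    }
}

/// Hypothetical stand-in for `WindowUDFImpl`: no `equals`/`hash_value`
/// methods, only supertrait bounds that implementors satisfy via derive.
trait WindowFn: DynEq + DynHash {
    fn as_any(&self) -> &dyn Any;
}

#[derive(PartialEq, Eq, Hash)]
struct Shift {
    offset: i64,
}

impl WindowFn for Shift {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

#[derive(PartialEq, Eq, Hash)]
struct RowNumber;

impl WindowFn for RowNumber {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Compares two trait objects through the derived `Eq`.
fn same_function(a: &dyn WindowFn, b: &dyn WindowFn) -> bool {
    a.dyn_eq(b.as_any())
}

/// Hashes a trait object through the derived `Hash`.
fn hash_of(f: &dyn WindowFn) -> u64 {
    let mut hasher = DefaultHasher::new();
    f.dyn_hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let a = Shift { offset: 1 };
    let b = Shift { offset: 1 };
    let c = Shift { offset: -1 };
    assert!(same_function(&a, &b));
    assert_eq!(hash_of(&a), hash_of(&b));
    assert!(!same_function(&a, &c));
    assert!(!same_function(&a, &RowNumber));
}
```

In this sketch, forgetting the derive is a compile error rather than a silent bug, because `impl WindowFn for Shift` fails unless `Shift` is `Eq` and `Hash`.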

@findepi findepi added the api change Changes the API exposed to users of the crate label Aug 8, 2025
@github-actions github-actions bot added logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates optimizer Optimizer rules core Core DataFusion crate proto Related to proto crate functions Changes to functions implementation ffi Changes to the ffi crate labels Aug 8, 2025
@findepi findepi force-pushed the findepi/udwf-perfect-eq branch 3 times, most recently from 4e26aaa to b6fcc88 Compare August 8, 2025 09:51
@findepi findepi changed the title Derive WindowUDFImpl eq,hash from Eq,Hash traits Derive WindowUDFImpl equality, hash from Eq, Hash traits Aug 8, 2025
@findepi findepi requested review from alamb, kosiew and timsaucer August 8, 2025 09:51
@findepi findepi force-pushed the findepi/udwf-perfect-eq branch 2 times, most recently from c50b82d to 58c1e4f Compare August 8, 2025 10:17
Contributor

@alamb alamb left a comment

Thanks @findepi -- I think the code is looking good here

The only thing I think is missing is some additional documentation in the Upgrading.md guide to explain to people upgrading what they will need to do (add PartialEq and Eq derivation to their functions, right?)

@findepi
Member Author

findepi commented Aug 8, 2025

findepi added 2 commits August 8, 2025 22:23
Move the `DynEq` and `DynHash` traits from the physical expressions crate
to a common crate for physical and logical expressions. This allows them
to be used by logical expressions.
Previously, the `WindowUDFImpl` trait contained `equals` and
`hash_value` methods with contracts following the `Eq` and `Hash`
traits. However, the default implementations of these methods made
the trait error-prone: many functions (scalar, aggregate, window)
did not override `equals` even though they should have.
No fix for this avoids a breaking API change, so a breaking
change is warranted.

Removing the default implementations would be enough of a solution, but
at the cost of a lot of boilerplate needed in implementations.

Instead, this removes the methods from the trait and reuses the `DynEq`
and `DynHash` traits, previously used only for physical expressions. This
allows functions to provide their implementations using no more than
`#[derive(PartialEq, Eq, Hash)]` in a typical case.
@findepi findepi force-pushed the findepi/udwf-perfect-eq branch from 58c1e4f to ab03a96 Compare August 8, 2025 20:23
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Aug 8, 2025
@findepi findepi requested a review from alamb August 9, 2025 20:43
Member

@timsaucer timsaucer left a comment

LGTM. Thanks!

Contributor

@kosiew kosiew left a comment

Left some comments for your consideration.

Contributor

@alamb alamb left a comment

Thank you @findepi @xudong963 and @kosiew

findepi and others added 2 commits August 11, 2025 19:24
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
@findepi
Member Author

findepi commented Aug 11, 2025

All comments addressed. Thank you @alamb @xudong963 @timsaucer @kosiew for your diligent reviews!

@findepi findepi merged commit 8494a39 into apache:main Aug 11, 2025
28 checks passed
@findepi findepi deleted the findepi/udwf-perfect-eq branch August 11, 2025 19:31
@findepi
Member Author

findepi commented Aug 11, 2025

Labels

api change Changes the API exposed to users of the crate core Core DataFusion crate documentation Improvements or additions to documentation ffi Changes to the ffi crate functions Changes to functions implementation logical-expr Logical plan and expressions optimizer Optimizer rules physical-expr Changes to the physical-expr crates proto Related to proto crate

Development

Successfully merging this pull request may close these issues.

Replace WindowUDFImpl::{equals,hash_value} with UdfHash, UdfEq traits
Implement PartialEq, Hash for all UDWFs (WindowUDFImpl)

5 participants