-
Notifications
You must be signed in to change notification settings - Fork 4.5k
[BEAM-4008] Futurize utils subpackage #5336
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| # Returns False (0) if equal, and True (1) if not. | ||
| return not WindowedValue._typed_eq(left, right) | ||
| def __eq__(self, other): | ||
| return type(self) == type(other) and self._key() == other._key() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@robertwb should __eq__ be defined in the .pxd file or is this optimized automatically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eq is a builtin (slots) method that is optimized automatically.
| return self.micros // 1000000 | ||
|
|
||
| def __cmp__(self, other): | ||
| # Allow comparisons between Duration and Timestamp values. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep this comment to explain the reason for the casting.
|
|
||
| def __cmp__(self, other): | ||
| # Allow comparisons between Duration and Timestamp values. | ||
| def __eq__(self, other): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we keep __eq__ and __lt__, remove the rest and decorate the class with @functools.total_ordering?
| and left.windows == right.windows | ||
| and left.pane_info == right.pane_info) | ||
| def __hash__(self): | ||
| return hash((type(self),) + self._key()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think hash(self._key()) would be sufficient.
| import cProfile | ||
| from __future__ import absolute_import | ||
|
|
||
| import cProfile # pylint: disable=bad-python3-import |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my education, why does py3 lint complain here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bug in pylint.
See pylint-dev/pylint#1612
| and left.value == right.value | ||
| and left.windows == right.windows | ||
| and left.pane_info == right.pane_info) | ||
| def __hash__(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your patience for awaiting the feedback, @RobbeSneyders.
I doublechecked performance of hash using a microbenchmark. Previous implementation of hash appears to be at least 7.5% faster, possibly due to overheads to create a tuple, and/or the way Cython compiles it.
I suggest we revert to previous implementation of hash. For consistency we can also implement eq by comparing fields directly.
My microbenchmark can be found in master...tvalentyn:utils_futurization_benchmark, see tvalentyn@67d3df0. It depends on the mini-framework that's being finalized in #5565.
CC: @robertwb
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding the benchmarking, this is very useful!
I just ran the benchmark myself and got a performance regression of 6% for reading and 15% for adding to a dict.
I will revert the changes.
| self.windows, | ||
| self.pane_info) | ||
|
|
||
| def _key(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need this anymore.
| def __eq__(self, other): | ||
| return (type(self) == type(other) | ||
| and self.timestamp_micros == other.timestamp_micros | ||
| and self.value == self.value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/self/other in right hand side of the equality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, took this straight from the benchmark and didn't notice it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries, thank you.
|
LGTM, thank you, @RobbeSneyders! |
aaltay
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
This pull request prepares the utils subpackage for Python 3 support. This PR is part of a series in which all subpackages will be updated using the same approach.
This approach has been documented here and the first pull request in the series (Futurize coders subpackage) demonstrating this approach can be found at #5053.
R: @aaltay @tvalentyn