Skip to content

Conversation

@RobbeSneyders
Copy link
Contributor

This pull request prepares the utils subpackage for Python 3 support. This PR is part of a series in which all subpackages will be updated using the same approach.
This approach has been documented here and the first pull request in the series (Futurize coders subpackage) demonstrating this approach can be found at #5053.

R: @aaltay @tvalentyn

# Returns False (0) if equal, and True (1) if not.
return not WindowedValue._typed_eq(left, right)
def __eq__(self, other):
return type(self) == type(other) and self._key() == other._key()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robertwb should __eq__ be defined in the .pxd file or is this optimized automatically?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eq is a builtin (slots) method that is optimized automatically.

return self.micros // 1000000

def __cmp__(self, other):
# Allow comparisons between Duration and Timestamp values.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep this comment to explain the reason for the casting.


def __cmp__(self, other):
# Allow comparisons between Duration and Timestamp values.
def __eq__(self, other):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we keep __eq__ and __lt__, remove the rest and decorate the class with @functools.total_ordering?

and left.windows == right.windows
and left.pane_info == right.pane_info)
def __hash__(self):
return hash((type(self),) + self._key())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think hash(self._key()) would be sufficient.

import cProfile
from __future__ import absolute_import

import cProfile # pylint: disable=bad-python3-import
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my education, why does py3 lint complain here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bug in pylint.
See pylint-dev/pylint#1612

and left.value == right.value
and left.windows == right.windows
and left.pane_info == right.pane_info)
def __hash__(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your patience for awaiting the feedback, @RobbeSneyders.

I doublechecked performance of hash using a microbenchmark. Previous implementation of hash appears to be at least 7.5% faster, possibly due to overheads to create a tuple, and/or the way Cython compiles it.

I suggest we revert to previous implementation of hash. For consistency we can also implement eq by comparing fields directly.

My microbenchmark can be found in master...tvalentyn:utils_futurization_benchmark, see tvalentyn@67d3df0. It depends on the mini-framework that's being finalized in #5565.

CC: @robertwb

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the benchmarking, this is very useful!
I just ran the benchmark myself and got a performance regression of 6% for reading and 15% for adding to a dict.

I will revert the changes.

self.windows,
self.pane_info)

def _key(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this anymore.

def __eq__(self, other):
return (type(self) == type(other)
and self.timestamp_micros == other.timestamp_micros
and self.value == self.value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/self/other in right hand side of the equality.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, took this straight from the benchmark and didn't notice it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, thank you.

@tvalentyn
Copy link
Contributor

LGTM, thank you, @RobbeSneyders!

Copy link
Member

@aaltay aaltay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@aaltay aaltay merged commit 1009c93 into apache:master Jun 13, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants