-
Notifications
You must be signed in to change notification settings - Fork 4.5k
[BEAM-3981] [WIP] Futurize and fix python 2 compatibility for coders subpackage #4990
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Changes from all commits
Commits
Show all changes
19 commits
Select commit
Hold shift + click to select a range
3b53cfe
Add pylint2 --py3k check per updated subpackage
RobbeSneyders 5aca33a
Futurize and fix coder_impl.py
RobbeSneyders 267aa26
Futurize and fix coders.py
RobbeSneyders 74e2e53
Add future imports to __init__ so pylint doesn't complain
RobbeSneyders 5825249
futurize and fix all non-test modules in package
RobbeSneyders 18f9645
Futurize and fix coders_test.py
RobbeSneyders 89d854b
Add future imports to multiple modules
RobbeSneyders 64cb8e3
Futurize and fix standard_coders_test.py
RobbeSneyders 4d96741
Futurize and fix stream_test.py
RobbeSneyders 4c59207
Futurize and fix proto2_coder_test_messages_pb2.py
RobbeSneyders 6ffaf75
Futurize and fix typecoders_test.py
RobbeSneyders 32ffa52
Futurize and fix coders_test_common.py
RobbeSneyders f48f3f9
Fix custom coders
RobbeSneyders 835fded
Add py2/3 cpickle -> pickle comment
RobbeSneyders b23ee25
Replace bytes with memoryview in stream cython files
RobbeSneyders 4cd5a76
Revert isinstance checks to type checks
RobbeSneyders f6dd643
Split pylint for Python 3 compatibility into separate script
RobbeSneyders 8e2d9b7
Revert "Futurize and fix proto2_coder_test_messages_pb2.py"
RobbeSneyders 348605f
Fix some builtin typechecks
RobbeSneyders File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -25,20 +25,33 @@ | |
| coder_impl.pxd file for type hints. | ||
|
|
||
| For internal use only; no backwards-compatibility guarantees. | ||
|
|
||
| isort:skip_file | ||
| """ | ||
| from __future__ import absolute_import | ||
| from __future__ import division | ||
| from __future__ import print_function | ||
|
|
||
| native_int = int | ||
|
|
||
| from types import NoneType | ||
| # pylint: disable=wrong-import-order, wrong-import-position, ungrouped-imports | ||
| from builtins import bytes | ||
| from builtins import chr | ||
| from builtins import int | ||
| from builtins import object | ||
| from builtins import range | ||
| from builtins import str | ||
|
|
||
| import six | ||
| from past.builtins import str as old_str | ||
| from past.builtins import long | ||
| from past.builtins import unicode | ||
|
|
||
| from apache_beam.coders import observable | ||
| from apache_beam.utils import windowed_value | ||
| from apache_beam.utils.timestamp import MAX_TIMESTAMP | ||
| from apache_beam.utils.timestamp import MIN_TIMESTAMP | ||
| from apache_beam.utils.timestamp import Timestamp | ||
|
|
||
| # pylint: disable=wrong-import-order, wrong-import-position, ungrouped-imports | ||
| try: | ||
| from .stream import InputStream as create_InputStream | ||
| from .stream import OutputStream as create_OutputStream | ||
|
|
@@ -54,11 +67,6 @@ | |
| from .slow_stream import get_varint_size | ||
| # pylint: enable=wrong-import-order, wrong-import-position, ungrouped-imports | ||
|
|
||
| try: | ||
| long # Python 2 | ||
| except NameError: | ||
| long = int # Python 3 | ||
|
|
||
|
|
||
| class CoderImpl(object): | ||
| """For internal use only; no backwards-compatibility guarantees.""" | ||
|
|
@@ -199,7 +207,7 @@ def __init__(self, coder, step_label): | |
| self._step_label = step_label | ||
|
|
||
| def _check_safe(self, value): | ||
| if isinstance(value, (str, six.text_type, long, int, float)): | ||
| if isinstance(value, (str, bytes, int, float)): | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we need to keep |
||
| pass | ||
| elif value is None: | ||
| pass | ||
|
|
@@ -279,18 +287,21 @@ def get_estimated_size_and_observables(self, value, nested=False): | |
|
|
||
| def encode_to_stream(self, value, stream, nested): | ||
| t = type(value) | ||
| if t is NoneType: | ||
| if value is None: | ||
| stream.write_byte(NONE_TYPE) | ||
| elif t is int: | ||
| elif t is bool: | ||
| stream.write_byte(BOOL_TYPE) | ||
| stream.write_byte(value) | ||
| elif t is int or t is native_int or t is long: | ||
| stream.write_byte(INT_TYPE) | ||
| stream.write_var_int64(value) | ||
| elif t is float: | ||
| stream.write_byte(FLOAT_TYPE) | ||
| stream.write_bigendian_double(value) | ||
| elif t is str: | ||
| elif t is bytes or t is old_str: | ||
| stream.write_byte(STR_TYPE) | ||
| stream.write(value, nested) | ||
| elif t is six.text_type: | ||
| elif t is str or t is unicode: | ||
| unicode_value = value # for typing | ||
| stream.write_byte(UNICODE_TYPE) | ||
| stream.write(unicode_value.encode('utf-8'), nested) | ||
|
|
@@ -304,12 +315,9 @@ def encode_to_stream(self, value, stream, nested): | |
| dict_value = value # for typing | ||
| stream.write_byte(DICT_TYPE) | ||
| stream.write_var_int64(len(dict_value)) | ||
| for k, v in dict_value.iteritems(): | ||
| for k, v in dict_value.items(): | ||
| self.encode_to_stream(k, stream, True) | ||
| self.encode_to_stream(v, stream, True) | ||
| elif t is bool: | ||
| stream.write_byte(BOOL_TYPE) | ||
| stream.write_byte(value) | ||
| else: | ||
| stream.write_byte(UNKNOWN_TYPE) | ||
| self.fallback_coder_impl.encode_to_stream(value, stream, nested) | ||
|
|
@@ -318,6 +326,8 @@ def decode_from_stream(self, stream, nested): | |
| t = stream.read_byte() | ||
| if t == NONE_TYPE: | ||
| return None | ||
| elif t == BOOL_TYPE: | ||
| return not not stream.read_byte() | ||
| elif t == INT_TYPE: | ||
| return stream.read_var_int64() | ||
| elif t == FLOAT_TYPE: | ||
|
|
@@ -341,8 +351,6 @@ def decode_from_stream(self, stream, nested): | |
| k = self.decode_from_stream(stream, True) | ||
| v[k] = self.decode_from_stream(stream, True) | ||
| return v | ||
| elif t == BOOL_TYPE: | ||
| return not not stream.read_byte() | ||
|
|
||
| return self.fallback_coder_impl.decode_from_stream(stream, nested) | ||
|
|
||
|
|
@@ -394,8 +402,9 @@ def _from_normal_time(self, value): | |
|
|
||
| def encode_to_stream(self, value, out, nested): | ||
| span_micros = value.end.micros - value.start.micros | ||
| out.write_bigendian_uint64(self._from_normal_time(value.end.micros / 1000)) | ||
| out.write_var_int64(span_micros / 1000) | ||
| out.write_bigendian_uint64( | ||
| self._from_normal_time(value.end.micros // 1000)) | ||
| out.write_var_int64(span_micros // 1000) | ||
|
|
||
| def decode_from_stream(self, in_, nested): | ||
| end_millis = self._to_normal_time(in_.read_bigendian_uint64()) | ||
|
|
@@ -409,7 +418,7 @@ def estimate_size(self, value, nested=False): | |
| # An IntervalWindow is context-insensitive, with a timestamp (8 bytes) | ||
| # and a varint timespam. | ||
| span = value.end.micros - value.start.micros | ||
| return 8 + get_varint_size(span / 1000) | ||
| return 8 + get_varint_size(span // 1000) | ||
|
|
||
|
|
||
| class TimestampCoderImpl(StreamCoderImpl): | ||
|
|
@@ -427,7 +436,7 @@ def estimate_size(self, unused_value, nested=False): | |
| return 8 | ||
|
|
||
|
|
||
| small_ints = [chr(_) for _ in range(128)] | ||
| small_ints = [chr(_).encode('latin-1') for _ in range(128)] | ||
|
|
||
|
|
||
| class VarIntCoderImpl(StreamCoderImpl): | ||
|
|
@@ -783,7 +792,7 @@ def encode_to_stream(self, value, out, nested): | |
| # TODO(BEAM-1524): Clean this up once we have a BEAM wide consensus on | ||
| # precision of timestamps. | ||
| self._from_normal_time( | ||
| restore_sign * (abs(wv.timestamp_micros) / 1000))) | ||
| restore_sign * (abs(wv.timestamp_micros) // 1000))) | ||
| self._windows_coder.encode_to_stream(wv.windows, out, True) | ||
| # Default PaneInfo encoded byte representing NO_FIRING. | ||
| self._pane_info_coder.encode_to_stream(wv.pane_info, out, True) | ||
|
|
@@ -797,9 +806,9 @@ def decode_from_stream(self, in_stream, nested): | |
| # were indeed MIN/MAX timestamps. | ||
| # TODO(BEAM-1524): Clean this up once we have a BEAM wide consensus on | ||
| # precision of timestamps. | ||
| if timestamp == -(abs(MIN_TIMESTAMP.micros) / 1000): | ||
| if timestamp == -(abs(MIN_TIMESTAMP.micros) // 1000): | ||
| timestamp = MIN_TIMESTAMP.micros | ||
| elif timestamp == (MAX_TIMESTAMP.micros / 1000): | ||
| elif timestamp == (MAX_TIMESTAMP.micros // 1000): | ||
| timestamp = MAX_TIMESTAMP.micros | ||
| else: | ||
| timestamp *= 1000 | ||
|
|
||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -14,11 +14,14 @@ | |
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
| # | ||
|
|
||
| from __future__ import absolute_import | ||
| from __future__ import division | ||
| from __future__ import print_function | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Which of these is really necessary? Do we use division and print in this file? |
||
|
|
||
| import base64 | ||
| import logging | ||
| import unittest | ||
| from builtins import object | ||
|
|
||
| from apache_beam.coders import proto2_coder_test_messages_pb2 as test_message | ||
| from apache_beam.coders import coders | ||
|
|
@@ -99,6 +102,9 @@ def __eq__(self, other): | |
| return True | ||
| return False | ||
|
|
||
| def __hash__(self): | ||
| return hash(type(self)) | ||
|
|
||
|
|
||
| class FallbackCoderTest(unittest.TestCase): | ||
|
|
||
|
|
||
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to keep
longfor Python 2 compatibility.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The int from the future.builtins is a subclass of python 2's long.