Handle (new) buffer protocol conforming types in Pickle.decode
#143
On Python 3, `pickle.loads` is able to work with anything that supports the (new) buffer protocol. Unfortunately, Python 2's `cPickle.loads` is not so flexible and requires a `bytes` object specifically. So this ensures that other objects are coerced to `bytes` objects on Python 2.
On Python 3, `pickle.loads` expects an object that implements the buffer protocol and is contiguous. While it would already raise an error for a non-contiguous buffer, it is helpful for us to check this first. Plus we should be able to handle Fortran order buffers should that be needed. On Python 2, the stricter constraint of a `bytes` object is required, which is still enforced.
This ensures that Pickle can decode anything that supports the (new) buffer protocol.
    if PY2:
        buf = ensure_bytes(buf)
    else:
        buf = ensure_contiguous_ndarray(buf)
Just to provide context from the Python codebase, Python 2 requires bytes here and Python 3 uses the (new) buffer protocol to get a C-contiguous array.
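For illustration, here is a minimal standard-library sketch of the Python 3 behavior described above. The helper name `as_c_contiguous_view` is hypothetical and is not part of this PR; it only mimics the idea of checking contiguity up front before handing the buffer to `pickle.loads`.

```python
import pickle


def as_c_contiguous_view(buf):
    """Hypothetical helper (not the function used in this PR): return a
    memoryview of buf, raising if the buffer is not C-contiguous."""
    view = memoryview(buf)  # raises TypeError if buf lacks the buffer protocol
    if not view.c_contiguous:
        raise ValueError("buffer must be C-contiguous")
    return view


expected = {"a": 1, "b": [1, 2, 3]}
payload = pickle.dumps(expected)

# On Python 3, any C-contiguous buffer-protocol object can be unpickled.
for wrapper in (bytes, bytearray, memoryview):
    decoded = pickle.loads(as_c_contiguous_view(wrapper(payload)))
    assert decoded == expected

# A strided view is not C-contiguous, so the check rejects it up front.
try:
    as_c_contiguous_view(memoryview(payload)[::2])
    rejected = None
except ValueError as exc:
    rejected = str(exc)
```

Checking contiguity before calling `pickle.loads` keeps the failure mode and its message under our control rather than relying on whatever the interpreter raises.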
    enc = codec.encode(arr)
    dec = codec.decode(ensure_ndarray(enc))
    assert_array_items_equal(arr, dec)
Seems this worked for everything except Pickle on Python 2, as demonstrated by this CI failure. This is likely a consequence of the improved buffer handling under the hood, which MsgPack, JSON, etc. all use already. Pickle was the only codec not using these functions, which this PR fixes. So testing this generally seems to be fine. All the other codecs already test decoding with ndarray.
alimanfoo
left a comment
LGTM, thanks.
Thanks for taking a look.
Python 3's `pickle.loads` is able to handle anything that conforms to the (new) buffer protocol as long as it is C-contiguous. We go ahead and enforce this on all inputs, raising if that is not possible. On Python 2, `cPickle.loads` is not as flexible, as it requires a `bytes` object explicitly. This enforces that case as well.

TODO:
- `tox -e py37` passes locally
- `tox -e py27` passes locally
- `tox -e docs` passes locally