VectorNumpy pickle suppport to enable multiprocessing#163
Conversation
… VectorNumpy object
for more information, see https://pre-commit.ci
jpivarski
left a comment
There was a problem hiding this comment.
You're right: VectorNumpy*D and MomentumNumpy*D instances do have state beyond the array state: they have
self._azimuthal_type(all classes)self._longitudinal_type(3D and 4D classes)self._temporal_type(4D classes)
which specifies which coordinate system the array belongs to. That information could have been generated every time it's needed by looking at the set of field names in its dtype, but instead it's assigned in __array_finalize__ (which you've likely seen when you fixed the obj is None case).
Another solution would have been to make this data lazy ("generate every time"), but the current implementation could be thought of as an optimization: the values of self._azimuthal_type, etc., are class objects, so checking their value is a quick is comparison. Doing it lazily would add several string comparisons. That would slow down the function dispatch—deciding which compute function to run—but it wouldn't slow down the execution of the compute function, which is where most of the time should be spent. So it's not obvious that we need this optimization, but why change things to make them slower? Functional purity (i.e. no mutable state other than what NumPy needs)?
Your solution of adding this state to the pickle data works well. I think the self.__dict__ includes instance state (self._azimuthal_type, etc.) and not class object state (VectorNumpy*D.ObjectClass and VectorNumpy*D._IS_MOMENTUM), which we wouldn't want to double-count because then the unpickled instances might have these as class attributes and instance attributes. I just checked: it does the right thing (__dict__ only includes instance state, not class object state). So that's good.
I like the fact that pickling could be solved in one place without code duplication. There's the right number of obj is None checks in __array_finalize__ methods (6, for the 6 concrete classes), so that's good. When pickling, there's a choice between __reduce__/__setstate__ and __getstate__/__setstate__. I don't know which is more modern and preferred. I just looked it up (here):
Although powerful, implementing
__reduce__()directly in your classes is error prone. For this reason, class designers should use the high-level interface (i.e.,__getnewargs_ex__(),__getstate__()and__setstate__()) whenever possible. We will show, however, cases where using__reduce__()is the only option or leads to more efficient pickling or both.
On the one hand, it's not saying that __reduce__ is deprecated; it's not less future-proof than __getstate__. If "error prone" is the only argument against it, then demonstrating that it works with tests is a good answer to it, unless it's also difficult to maintain. It doesn't look difficult to maintain. I'm fine with it being __reduce__ if you are.
If you'd prefer to replace __reduce__ with a __getstate__ (or some other change), let me know and I'll hold off merging until you're done. Otherwise, let me know that you're done with this PR and I'll merge it.
And thank you for contributing!
|
I'd usually use setstate/getstate (reduce should only be used if getstate cannot be used). Was there a reason getstate could not be used or would be significantly more complex? Given you are just slightly modifying a super call to reduce, I think reduce is pretty likely to be safe in this case. Sort of related, but nothing to do directly with this PR: can we add |
henryiii
left a comment
There was a problem hiding this comment.
I'm happy with this, ideally you can commend on __reduce__, but as long as it's clearly the best solution, I'm happy.
|
|
||
| def __setstate__(self, state: typing.Any) -> None: | ||
| self.__dict__.update(state[-1]) | ||
| super().__setstate__(state[0:-1]) # type: ignore |
There was a problem hiding this comment.
Reminder to myself: we should enable --show-error-codes on mypy and require the error codes here, so we know why these are being ignored. I can do that after the PR.
We could use From what I know, |
|
@cansik Give us an indication of whether you're done and Henry or I will squash-and-merge your PR. Thanks! |
|
My favorite reason to add |
|
@jpivarski & @henryiii Thanks for the review. I mainly followed the explanation in the stackoverflow answer and it seems that numpy does not support getstate, but reduce by default: https://stackoverflow.com/a/54487727/1138326 For this reason, I would leave the implementation as it is and I have nothing to add to it. |
That's definitely a good enough reason. NumPy's restrictions on subclasses are pretty tight. Thanks, I'll merge it now! |
When sharing
VectorNumpybased arrays (likeVectorNumpy3D) between processes in python, pickle is used for serialisation and deserialisation of the objects. But it seems that structured arrays which inherhit form a Numpy array lose part of their state during that process.Here is an example that shows the problem:
Exception A
First, there is a problem that
__array_finalize__is called twice frompickle.loads, once with the actual data and once with a None type. This leads to the following error message:It is possible to prevent this behaviour by checking for
Nonearguments and just ignore them (L900-L901)Exception B
After that it is possible to
printthe unpickled array, but accessing the elements itself (test_new[0]) leads to another error message. I guess it has to do with the construction of the VectorObjects itself.Solution
After a bit of research I found this stackoverflow answer, why subclasses of numpy can lose their state when being pickled. Implementing this in the
VectorNumpyparent class (L433-L440) fixes it also for all subclasses likeVectorNumpy3D. To check if the implementation works, there are six new tests pickling/unpickling every Vector/Momentum-Numpy array variant.In my own projects the code is already used for multi-processed calculations and result-sharing in the
VectorNumpy4Dformat. Would be great if you could review it and give me feedback if this implementation is ok.