Refactor Transformations from closure into class#2859
Refactor Transformations from closure into class#2859orbeckst merged 33 commits intoMDAnalysis:developfrom
Conversation
|
Hello @yuxuanzhuang! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-09-06 17:15:25 UTC |
e2bd2ba to
05f2ddd
Compare
richardjgowers
left a comment
There was a problem hiding this comment.
this is a good idea but it's important not to be doing validity checks every frame
|
So for most parts I just moved code around, the only change is trans = PositionAverager(3, check_reset=True)
u = mda.Universe(TPR, XTC)
u.trajectory.add_transformations(trans)
u.trajectory[3]
print(u.atoms.positions[0])
u.trajectory[3]
print(u.atoms.positions[0])
u.trajectory[3]
print(u.atoms.positions[0])
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## develop #2859 +/- ##
========================================
Coverage 93.01% 93.02%
========================================
Files 187 187
Lines 24940 24962 +22
Branches 3261 3261
========================================
+ Hits 23198 23220 +22
Misses 1694 1694
Partials 48 48 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
There's a segfault in a pickle transformation test https://travis-ci.com/github/MDAnalysis/mdanalysis/jobs/365340906 – do you have an idea what's happening there? |
|
There are also test failures related to pickle transformations on WIndows, eg https://dev.azure.com/mdanalysis/mdanalysis/_build/results?buildId=658&view=logs&j=6337f8de-d220-572c-a53d-20c279e44ba9&t=38e5c318-25f5-5156-9fa7-6cd72ffb271c&l=99 |
|
I am not sure why it's happening, but the reason is that I define a universe (PDB, XTC) at the top level of the file (line 44). I guess some specific version of python/numpy/? is not quite happy with that. =================================== ERRORS ====================================
_ ERROR collecting MDAnalysisTests/parallelism/test_pickle_transformation.py __
MDAnalysisTests\parallelism\test_pickle_transformation.py:44: in <module>
u = mda.Universe(TPR, XTC)
C:\hostedtoolcache\windows\Python\3.7.8\x64\lib\site-packages\mdanalysis-2.0.0.dev0-py3.7-win-amd64.egg\MDAnalysis\core\universe.py:386: in __init__
in_memory_step=in_memory_step, **kwargs)
C:\hostedtoolcache\windows\Python\3.7.8\x64\lib\site-packages\mdanalysis-2.0.0.dev0-py3.7-win-amd64.egg\MDAnalysis\core\universe.py:578: in load_new
self.trajectory = reader(filename, format=format, **kwargs)
C:\hostedtoolcache\windows\Python\3.7.8\x64\lib\site-packages\mdanalysis-2.0.0.dev0-py3.7-win-amd64.egg\MDAnalysis\coordinates\XDR.py:149: in __init__
self._load_offsets()
C:\hostedtoolcache\windows\Python\3.7.8\x64\lib\site-packages\mdanalysis-2.0.0.dev0-py3.7-win-amd64.egg\MDAnalysis\coordinates\XDR.py:194: in _load_offsets
data = read_numpy_offsets(fname)
C:\hostedtoolcache\windows\Python\3.7.8\x64\lib\site-packages\mdanalysis-2.0.0.dev0-py3.7-win-amd64.egg\MDAnalysis\coordinates\XDR.py:84: in read_numpy_offsets
return {k: v for k, v in np.load(filename).items()}
C:\hostedtoolcache\windows\Python\3.7.8\x64\lib\site-packages\numpy\lib\npyio.py:444: in load
raise ValueError("Cannot load file containing pickled data "
E ValueError: Cannot load file containing pickled data when allow_pickle=FalseAfter I changed it to DCD, it's fine now. |
|
Uh, oh, ... I think @IAlibay and @lilyminium have been chasing an elusive segfault that seems to be connected to numpy and xtc. Just pinging so that they can comment. |
orbeckst
left a comment
There was a problem hiding this comment.
I only had time to look at the docs in more detail, see comments.
|
ping @richardjgowers for re-review |
|
Met a problem when I was wrapping things up. After we change the behaviour of fit_trans = trans.fit_rot_trans(u.atoms, u.atoms)
u.trajectory.add_transformations(fit_trans)
u_dumped = pickle.dumps(u)
u_p = pickle.loads(u_dumped)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-8-e66d2dd7c314> in <module>
----> 1 u_p = pickle.loads(pickle.dumps(u))
~/mdanalysis/package/MDAnalysis/core/groups.py in _unpickle(u, ix)
113 def _unpickle(u, ix):
114 print(u.__dict__)
--> 115 return u.atoms[ix]
116
117
AttributeError: 'Universe' object has no attribute 'atoms'Not totally sure the exact reason. Seems that in the new process, transformations (in which have EDIT: My hunch is, for pickling
while |
| u_p = pickle.loads(pickle.dumps(u)) | ||
| u.trajectory[0] | ||
| for u_ts, u_p_ts in zip(u.trajectory[:5], u_p.trajectory[:5]): | ||
| assert_equal(u_ts.positions, u_p_ts.positions) |
There was a problem hiding this comment.
Don't use equality comparison for floats
| assert_equal(u_ts.positions, u_p_ts.positions) | |
| assert_almost_equal(u_ts.positions, u_p_ts.positions) |
| u_p = pickle.loads(pickle.dumps(u)) | ||
| u.trajectory[0] | ||
| for u_ts, u_p_ts in zip(u.trajectory[:5], u_p.trajectory[:5]): | ||
| assert_equal(u_ts.positions, u_p_ts.positions) |
There was a problem hiding this comment.
Don't use equality comparison for floats
| assert_equal(u_ts.positions, u_p_ts.positions) | |
| assert_almost_equal(u_ts.positions, u_p_ts.positions) |
| u_p = pickle.loads(pickle.dumps(u)) | ||
| u.trajectory[0] | ||
| for u_ts, u_p_ts in zip(u.trajectory[:5], u_p.trajectory[:5]): | ||
| assert_equal(u_ts.positions, u_p_ts.positions) |
There was a problem hiding this comment.
Don't use equality comparison for floats
| assert_equal(u_ts.positions, u_p_ts.positions) | |
| assert_almost_equal(u_ts.positions, u_p_ts.positions) |
| u_p = pickle.loads(pickle.dumps(u)) | ||
| u.trajectory[0] | ||
| for u_ts, u_p_ts in zip(u.trajectory[:5], u_p.trajectory[:5]): | ||
| assert_equal(u_ts.positions, u_p_ts.positions) |
There was a problem hiding this comment.
Don't use equality comparison for floats
| assert_equal(u_ts.positions, u_p_ts.positions) | |
| assert_almost_equal(u_ts.positions, u_p_ts.positions) |
| u_p = pickle.loads(pickle.dumps(u)) | ||
| u.trajectory[0] | ||
| for u_ts, u_p_ts in zip(u.trajectory[:5], u_p.trajectory[:5]): | ||
| assert_equal(u_ts.positions, u_p_ts.positions) |
There was a problem hiding this comment.
Don't use equality comparison for floats
| assert_equal(u_ts.positions, u_p_ts.positions) | |
| assert_almost_equal(u_ts.positions, u_p_ts.positions) |
| u_p = pickle.loads(pickle.dumps(u)) | ||
| u.trajectory[0] | ||
| for u_ts, u_p_ts in zip(u.trajectory[:5], u_p.trajectory[:5]): | ||
| assert_equal(u_ts.positions, u_p_ts.positions) |
There was a problem hiding this comment.
Don't use equality comparison for floats
| assert_equal(u_ts.positions, u_p_ts.positions) | |
| assert_almost_equal(u_ts.positions, u_p_ts.positions) |
| # | ||
| import pytest | ||
| import pickle | ||
| from numpy.testing import assert_equal |
There was a problem hiding this comment.
Use this for float comparisons:
| from numpy.testing import assert_equal | |
| from numpy.testing import assert_almost_equal |
| # apply changes to the Timestep object | ||
| return ts | ||
|
|
||
| See `MDAnalysis.transformations.translate` for a simple example. |
There was a problem hiding this comment.
The above is not a valid sphinx link.
| See `MDAnalysis.transformations.translate` for a simple example. | |
| See :mod:`MDAnalysis.transformations.translate` for a simple example. |
| are often required for some analyses and visualization, and the functions in | ||
| this module allow transformations to be applied on-the-fly. | ||
|
|
||
| A typical transformation class looks like this: |
There was a problem hiding this comment.
| A typical transformation class looks like this: | |
| A typical transformation class looks like this (note that we keep its name | |
| lowercase because we will treat it as a function, thanks to the ``__call__`` | |
| method): |
| # or modify an AtomGroup and return Timestep | ||
|
|
||
| return ts | ||
|
|
There was a problem hiding this comment.
You showed the abstract class. Now show a concrete example to address @richardjgowers comment. For instance
| As a concrete example we will write a transformation that rotates a group of atoms around | |
| the z-axis through the center of geometry by a fixed increment for every time step. We will use | |
| :meth:`MDAnalysis.core.groups.AtomGroup.rotateby` and simply increase the rotation angle | |
| every time the transformation is called:: | |
| class spin_atoms(object): | |
| def __init__(self, atoms, dphi): | |
| """Rotate atoms by dphi degrees for every time step (around the z axis)""" | |
| self.atoms = atoms | |
| self.dphi = dphi | |
| self.axis = np.array([0, 0, 1]) | |
| def __call__(self, ts): | |
| phi = self.dphi * ts.frame | |
| self.atoms.rotateby(phi, self.axis) | |
| return ts | |
| This transformation can be used as :: | |
| u = mda.Universe(PSF, DCD) | |
| u.trajectory.add_transformations(spin_atoms(u.select_atoms("protein"), 1.0)) | |
I currently can't get nglview to work in my notebook so you'll need to check that this actually works... or come up with another example.
EDIT: forgot to increment phi...
EDIT 2: yes, it's pretty dumb that the phi angle just keeps incrementing, no matter what you do with the trajectory. Perhaps better to do something like phi = ts.frame * self.dphi
EDIT 3: changed it to phi = ts.frame * self.dphi
There was a problem hiding this comment.
Thanks! it works. And I think it makes sense to just rotate by a fixed angle. (for visualization perhaps)
|
And yeah, I don't know why we need to use |
|
Does it still work with dask? |
No, I think it is just low-level func for On the other hand, for performance, it seems that the >>> u = mda.Universe(PSF, DCD)
>>> fit_trans = trans.fit_rot_trans(u.atoms, u.atoms)
>>> u.trajectory.add_transformations(fit_trans)
>>> dmp = pickle.dumps(u)
run `__reduce__` (Universe)
run `__reduce__` (AtomGroup)
run `__reduce__` (Universe)
>>> u_p = pickle.loads(dmp)
run `_unpickle_U`
run `unpickle AtomGroup`
run `_unpickle_U`I have to admit I am a bit confused here...when pickling intertwined stuff, e.g when we pickle them apart, it's not an issue: >>> u = mda.Universe(PSF, DCD)
>>> fit_trans = trans.fit_rot_trans(u.atoms, u.atoms)
>>> dmp = pickle.dumps([u, fit_trans]) # Note they are not linked together here.
run `__reduce__` (AtomGroup)
run `__reduce__` (Universe)
>>> u_p = pickle.loads(dmp)
run `_unpickle_U`
run `unpickle AtomGroup` |
|
Would be good to understand the path the code takes when it is pickling Universe twice. Even if it were not a performance issue, it is a potential bug somewhere down the line. Btw, is the Universe stored twice in the pickle? |
|
I think it's a normal behavior of python handling circular reference pickling. class Universe(object):
def __init__(self, top):
self.top = top
def add_trajectory(self,traj):
self.traj = traj
def __reduce__(self):
print('reduce')
return self._unpickle_u, (self.top, self.traj)
@classmethod
def _unpickle_u(cls, top, traj):
print('unpickle')
u = cls(top)
u.traj = traj
return u
class Topology(object):
def __init__(self):
pass
class Trajectory(object):
def __init__(self):
pass
def add_transformation(self, transformation):
self.transformation = transformation
class Transformation(object):
def __init__(self, atomgroup):
self.atomgroup = atomgroup
class Atomgroup(object):
def __init__(self, universe):
self.universe = universe
top = Topology()
traj = Trajectory()
u = Universe(top)
u.add_trajectory(traj)
ag = Atomgroup(u)
trans = Transformation(ag)
traj.add_transformation(trans)
>>> import pickle
>>> u_dmp = pickle.dumps(u)
>>> u_p = pickle.loads(u_dmp)
reduce
reduce
unpickle
unpickleEDIT: f = pickle.dumps(obj)
print(len(f))
81439 (with transformation)
69086 (without transformation)
63713 (u._topology) |
orbeckst
left a comment
There was a problem hiding this comment.
Thanks for looking at the pickling order. If we only store the data once then I think that's not a problem if it gets pickled twice.
I found a minor typo in the docs (see suggestion). Just correct it. I am already approving the PR.
|
@richardjgowers , were your comments addressed? |
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
From what I can see, the reviewer's comments were addressed. To move the PR towards a merge, I am dismissing the stale review.
* Fixes MDAnalysis#2860 * refactor transformations from closure into class (necessary so that a Universe with transformations can be pickled). * change universe pickling to `__reduce__` (instead of `__setstate__`/`__getstate__`, which did not work with transformations) * add test for pickling a Universe with transformations * update docs for transformations - examples (written as class) - deprecate transformations as closures (still works but cannot be pickled due to limitations in Python's pickle module) * update changelog
Fixes #2860
Changes made in this Pull Request:
PR Checklist