Conversation
|
I'm a little confused what the expectations are for the MemoryReader wrt modifying attributes of Timesteps. From what I can tell, it tries to give views not copies of the underlying arrays. This means that if you edit the positions, change frame, your changes are saved (ie go back to that frame and you've still got the modified positions rather than the original). Is this correct? Therefore, does it make sense to make |
|
Yes, I'd say the expectation is that everything is permanent. In this regard, MemoryReader is a very different beast from all the other Readers. But I think it's required so that MemoryReader can become a swiss-army knife for trajectory manipulation. I think that the new transformations are also permanent for MemoryReader (@jbarnoud @davidercruz ?) |
|
The transformations are indeed permanent on the memory reader. In practice, the transformation is applied on all frames in one go, and then purged from the list of transformation to not apply the same transformation again when changing frame. The expectation with the memory reader is that changes done to the coordinates are kept, on the contrary to the other readers. Having the box following the same pattern is what I would expect. |
| from .base import SingleFrameReaderBase | ||
|
|
||
|
|
||
| class DummyReader(SingleFrameReaderBase): |
There was a problem hiding this comment.
Shouldn't this be deprecated first?
There was a problem hiding this comment.
It was only used internally, and I think it's so buggy (it doesn't iterate?) I doubt anyone found any good use for it.
29f904f to
91e59d9
Compare
| filename=self.filename, | ||
| ) | ||
| new[self.ts.frame] | ||
| new.ts = self.ts.copy() |
There was a problem hiding this comment.
this line was problematic as base.Timestep.copy() isn't respectful of MemoryReader behaviour, see #2081. I don't think removing it causes any problems, as new[self.ts.frame] should populate the Timestep with everything needed
|
@MDAnalysis/coredevs ok this could do with some eyes, it's got some nice MR fixes and improvements |
orbeckst
left a comment
There was a problem hiding this comment.
I have some minor comments/questions; someone else should probably also have a look.
| self.ts.dt = dt | ||
| if dimensions is not None: | ||
| self.ts.dimensions = dimensions | ||
| if dimensions is None: |
There was a problem hiding this comment.
you could do
if dimensions is None:
pass
elif dimensions.shape == (6,):
dimensions = np.tile(dimensions, (self.n_frames, 1))
elif dimensions.shape != (self.n_frames, 6):
raise Hell
self.dimensions_array = dimensions
package/MDAnalysis/core/universe.py
Outdated
| n_atoms=n_atoms, | ||
| velocities=velocities, forces=forces) | ||
| coords = np.zeros((n_atoms, 3), dtype=np.float32) | ||
| if velocities: |
There was a problem hiding this comment.
Maybe ternary?
vels = np.zeros((n_atoms, 3), dtype=np.float32) if velocities else None
forces = np.zeros((n_atoms, 3), dtype=np.float32) if forces else Nonefor compactness?
| forces = np.zeros((n_atoms, 3), dtype=np.float32) | ||
| else: | ||
| forces = None | ||
| u.trajectory = get_reader_for(coords)( |
There was a problem hiding this comment.
Does an numpy array trigger MemoryReader selection? (I think so... can't remember the discussion a while back.)
There was a problem hiding this comment.
Maybe add a comment that this creates a MemoryReader.
| # This is significantly faster, but only implemented for certain | ||
| # trajectory file formats | ||
| try: | ||
| coordinates = self.trajectory.timeseries( |
There was a problem hiding this comment.
Is it true that we do not have dedicated C-level timeseries anymore and thus the self.trajectory.timeseries was essentially just doing what happened in the except block? Or are there still cases where the timeseries approach would be beneficial?
There was a problem hiding this comment.
Only DCDReader ever had a fast timeseries (and still does). It didn't return dimensions information though, which is needed. We could change DCD.timeseries to have DCD.timeseries(...., dimensions=True) to then return both coordinates and dimensions.
Then maybe the base.Reader could implement the entire else: branch below (I started this in #1400 once...)
So yeah maybe I should try and use the fast DCD routine here still since it still exists, then manually grab the dimensions..
There was a problem hiding this comment.
I doubt that you improve performance by cycling through the trajectory twice. Maybe leave as you have, add a note, and raise an issue so that it is not forgotten. A before/after benchmark (or Benchmark) would be good, just so that we have an idea how much slower this gets. analysis.encore uses DCD timeseries reading quite a bit, IIRC.
package/MDAnalysis/core/universe.py
Outdated
| pm = ProgressMeter(n_frames, interval=1, | ||
| verbose=verbose, format=pm_format) | ||
| coordinates = np.zeros((n_frames, n_atoms, 3), dtype=np.float32) | ||
| # TODO: This assumes constant... |
There was a problem hiding this comment.
I don't understand this comment.
package/MDAnalysis/core/universe.py
Outdated
| has_fors = ts.has_forces | ||
| has_dims = ts.dimensions is not None | ||
|
|
||
| if has_vels: |
There was a problem hiding this comment.
ternary operator for readability?
velocities = np.zeros_like(coordinates) if has_vels or NoneDoes Python short-circuit, i.e., is the creation of np.zeros_like(coordinates) skipped if not needed?
There was a problem hiding this comment.
Python (at least 3.5.3) properly short-circuits and does not execute the function in the skipped part:
In [1]: def dome():
...: print("did it")
...: return 10
...:
In [2]: x = dome() if True else 0
did it
In [3]: x = dome() if False else 0| mr2 = mr_reader.copy() | ||
|
|
||
| ts = mr2.ts | ||
| setattr(ts, attr, 7) |
There was a problem hiding this comment.
Remind me how this works: Are you setting e.g. ts.positions to 7? All positions in the array or are you replacing the array with the integer "7"?
If it is the latter, shouldn't we doing the former?
| ts = mr2[1] | ||
| ts = mr2[0] | ||
|
|
||
| assert_equal(getattr(ts, attr), 7) |
There was a problem hiding this comment.
array comparison? But the arrays should be float32 so you really ought to use assert_almost_equal.
Codecov Report
@@ Coverage Diff @@
## develop #2080 +/- ##
===========================================
- Coverage 89.41% 89.41% -0.01%
===========================================
Files 159 157 -2
Lines 18731 18759 +28
Branches 2696 2709 +13
===========================================
+ Hits 16749 16774 +25
- Misses 1381 1382 +1
- Partials 601 603 +2
Continue to review full report at Codecov.
|
| # single box, tile this to trajectory length | ||
| # allows modifying the box of some frames | ||
| dimensions = np.tile(dimensions, (self.n_frames, 1)) | ||
| elif dimensions.shape != (self.n_frames, 6): |
There was a problem hiding this comment.
Does this need to be protected against AttributeError if anything is passed through that is not a numpy array?
package/MDAnalysis/core/universe.py
Outdated
| mapping of residues to segments | ||
| trajectory : bool, optional | ||
| if True, attaches a dummy reader to the Universe, therefore | ||
| if True, attaches a MemoryReader to the Universe, therefore |
There was a problem hiding this comment.
Make this (and others) a link
:class:`~MDAnalysis.coordinates.memory.MemoryReader`| "This must be a array of shape (6,) or " | ||
| "(n_frames, 6)".format(dimensions.shape)) | ||
| self.dimensions_array = dimensions | ||
| dimensions = np.zeros((self.n_frames, 6), dtype=np.float64) |
There was a problem hiding this comment.
Did you decide to always have dimensions as an array, even if it contains zeros? (I assume this is more consistent with how ts.dimensions works, and it allows setting the box if required – more flexibility!) 👍
| try: | ||
| if velocities.ndim == 2: | ||
| velocities = velocities[np.newaxis, :, :] | ||
| except AttributeError: |
There was a problem hiding this comment.
This just checks whether the velocities object has an attribute ndim. For type checking, better use isinstance(). But since that's not really "pythonic" anyway, I'd prefer calling numpy.asarray(velocities) before doing attribute checks.
| try: | ||
| if forces.ndim == 2: | ||
| forces = forces[np.newaxis, :, :] | ||
| except AttributeError: |
There was a problem hiding this comment.
As above,I'd prefer using numpy.asarray(forces).
| raise ValueError("Provided dimensions array has shape {}. " | ||
| "This must be a array of shape (6,) or " | ||
| "(n_frames, 6)".format(dimensions.shape)) | ||
| except AttributeError: |
There was a problem hiding this comment.
Again, better use numpy.asarray(dimensions)
68900cc to
d86dc42
Compare
|
Ok changed to use |
|
@MDAnalysis/coredevs if this gets reviewed I can finish up 0.19.0 |
|
@zemanj could you please have a look when you can spare a moment? |
orbeckst
left a comment
There was a problem hiding this comment.
This looks ok to me – but I am not an expert on the MemoryReader. If anyone else (@zemanj @jbarnoud @wouterboomsma ) might want to have a quick look then that would be great.
(We would like to merge this soon in order to get the 0.19.0 release out...)
@richardjgowers perhaps open an issue noting that we might want to revisit the MemoryReader once we have timeseries support with dimensions for all readers (so that fast timeseries-slurpers such as the one for DCD and possibly XTC) can be used.
|
Thanks for notifying me, @orbeckst. I haven't followed recent development in MDAnalysis very closely, and therefore don't have anything intelligent to add to what has already been discussed. Good to see that MemoryReader is still being used. Thanks for putting in the effort in bringing this up to date @richardjgowers. |
Fixes #2076 #2077 #1041 #2081
Replaces #1537
Changes made in this Pull Request:
MemoryReadernow has velocities and forces (which hopefully work identically to positions)MemoryReadernow has variable dimensions (ie NPT conditions)Universe.transfer_to_memorytakes advantage of all of the aboveUniverse.emptynow uses a MemoryReader (now it does the above)PR Checklist