Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 5 additions & 1 deletion package/CHANGELOG
Original file line number Diff line number Diff line change
Expand Up @@ -51,7 +51,11 @@ Fixes
* analysis.align: fixed output and catch more cases with
SelectionError when selections are incompatible (Issue #1809)
* fix cython 0.28 compiler error

* DCD istart was wrongly interpreted as frames (only in 0.17.0) instead of
integrator timesteps, which lead to large time-offsets when reading and
writing DCD files; now istart is correctly interpreted when reading and
when writing, istart=None will set it to the CHARMM default nsavc but the
default is istart=0, which will make the DCD start at frame 0 (Issue #1819)

Changes
* scipy >= 1.0.0 is now required (issue #1775 because of PR #1758)
Expand Down
11 changes: 8 additions & 3 deletions package/MDAnalysis/coordinates/DCD.py
Original file line number Diff line number Diff line change
Expand Up @@ -216,7 +216,7 @@ def Writer(self, filename, n_atoms=None, **kwargs):
def _frame_to_ts(self, frame, ts):
"""convert a dcd-frame to a :class:`TimeStep`"""
ts.frame = self._frame
ts.time = (ts.frame + self._file.header['istart']) * self.ts.dt
ts.time = (ts.frame + self._file.header['istart']/self._file.header['nsavc']) * self.ts.dt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought there where more places where we need changes. I'm pretty sure the writer needs to be updated too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am pretty sure that you're right... I just committed what I had before falling asleep.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I fixed all occurences of istart/nsavc in the reader and writer. I also added a test for the writer to check the defaults that we are setting. Namely, istart=0 is not what CHARMM would do: it would set it to nsavc which is what MDAnalysis now does for istart=None. However, I kept the MDAnalysis value (which we had for a while) as the default.

ts.data['step'] = self._file.tell()

# The original unitcell is read as ``[A, gamma, B, beta, alpha, C]``
Expand Down Expand Up @@ -380,9 +380,13 @@ def __init__(self,
correspond to the interval between two frames as nsavc (i.e., every
how many MD steps is a frame saved to the DCD). By default, this
number is just set to one and this should be sufficient for almost
all cases but if required, nsavc can be changed.
all cases but if required, `nsavc` can be changed.
istart : int (optional)
starting frame number. CHARMM defaults to 1.
starting frame number in integrator timesteps. CHARMM defaults to
`nsavc`, i.e., start at frame number 1 = `istart` / `nsavc`. The value
``None`` will set `istart` to `nsavc` (the CHARMM default).
The MDAnalysis default is 0 so that the frame number and time of the first
frame is 0.
**kwargs : dict
General writer arguments

Expand All @@ -397,6 +401,7 @@ def __init__(self,
self.dt = dt
dt = mdaunits.convert(dt, 'ps', self.units['time'])
delta = float(dt) / nsavc
istart = istart if istart is not None else nsavc
self._file.write_header(
remarks=remarks,
natoms=self.n_atoms,
Expand Down
22 changes: 11 additions & 11 deletions testsuite/MDAnalysisTests/analysis/test_rms.py
Original file line number Diff line number Diff line change
Expand Up @@ -165,21 +165,21 @@ def outfile(self, tmpdir):

@pytest.fixture()
def correct_values(self):
return [[0, 1000, 0], [49, 1049, 4.68953]]
return [[0, 1, 0], [49, 50, 4.68953]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@orbeckst this makes the tests pass. I'm unsure why we have to adjust the step by 1 position now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean, adjusting the step by 1 position instead of 1000? The reason for this is that istart is measured in integrator time steps. 1 frame contains nsavc integrator time steps. If a DCD wants to state that the trajectory starts "at frame 1" then it has to set istart = nsavc. In 0.17.0 we wrongly interpreted istart as frames instead of integrator time steps.

If your question is "why do our test DCDs now have a starting time "at frame 1", e.g.

>>> import MDAnalysis as mda
... from MDAnalysisTests.datafiles import PSF, DCD, DCD_NAMD_GBIS, PSF_NAMD_GBIS, TPR, XTC
...
>>> # DCD time starts at 1*dt
>>> u = mda.Universe(PSF, DCD)
>>> print(u.trajectory.frame, u.trajectory.time, u.trajectory.n_frames, u.trajectory.totaltime)
0 0.9999999119200186 98 96.9999914562418
>>> u = mda.Universe(PSF_NAMD_GBIS, DCD_NAMD_GBIS)
>>> print(u.trajectory.frame, u.trajectory.time, u.trajectory.n_frames, u.trajectory.totaltime)
0 0.010000000029814058 100 0.9900000029515917
>>> # XTC starts at 0*dt
>>> u = mda.Universe(TPR, XTC)
>>> print(u.trajectory.frame, u.trajectory.time, u.trajectory.n_frames, u.trajectory.totaltime)
0 0.0 10 900.0000686645508

then the answer is because this is how we interpret the meaning of istart at the moment: as an offset of istart/nsavc frames from 0 (based on the docs that I cited in #1819). I am admittedly not 100% sure if this is the best way to do this; it seems to imply that the starting frame is not saved in a trajectory and the first frame that was saved would be the one istart integrator time steps after the begin of the simulation. The CHARMM docs (Example under TRAJECTORY command) imply that the DYNAmics command does indeed start saving after the coordinates with the initial conditions:

Example. Assume that the three files traj1.trj, traj2.trj and traj3.trj were created using the following dyanmics commands:
dyna start nstep 1000 nsavc 100 ! saves 10 frames (100, 200, ...1000)
dyna restart nstep 10000 nsavc 100 ! saves 100 frames (1100, 1200, ...10100)
dyna restart nstep 10000 nsavc 100 ! saves 100 frames (10200, 10300, ...20100)

I might have to play around with the CHARMM TRAJectory command to get a better idea of what CHARMM really thinks of a trajectory.

However, I'd like to fix istart first (which is clearly wrong) and then discuss what is more of a discussion about how to interpret quirks in the DCD format and how to make it work seamlessly in MDAnalysis.

Copy link
Member

@kain88-de kain88-de Mar 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change of 1000 to 1 is OK. We made this change in the current development cycle. I'm curious why we needed to change from step 49 to 50. My first try to fix the istart behavior was in #1767. The change in the iteration introduced by your changes is new to this pr. Because such a change would break user code I'm hesitant to merge this.

There are some NAMD experts in my group. I will ask them what behavior is best for them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain with a simple example what the change "from 49 to 50" means?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I quickly browsed through the discussion in #1767. The choice istart=1 is now istart=None or manually setting istart=nsavc. Setting istart=0 as the default for the writer is retained.

If we wanted to make clearer that we want the user to be able to set frame offsets then we should introduce something like frame_offset. We already have offset for the Timestep.

I haven't understood yet where changes in frames come from. It might well be that something needs to be fixed but I'd like to have a test case for this in the dcd tests (and not primarily in the rms code).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will have some time during the week to have a closer look at this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why we needed to change from step 49 to 50.

I think the same reason: before we had 1049, which was calculated as istart + frames = 1000 + 49. Now we have istart/nsavc + frames = 1000/1000 + 49 = 1 + 49 = 50.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I didn't have time yet to look at this. But your guess is likely right. This means from now on all reported time will start from 1? except when istart is 0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm cool with that. This will affect users by showing later times now. I would change the CHANGELOG message to say that we used to report to early timestamps and now read times correctly and users should expect that times are shifted to later values. I like to be a bit more exact then just we did it wrong

Otherwise looks good to me. It can be merged.


@pytest.fixture()
def correct_values_mass(self):
return [[0, 1000, 0], [49, 1049, 4.74920]]
return [[0, 1, 0], [49, 50, 4.74920]]

@pytest.fixture()
def correct_values_group(self):
return [[0, 1000, 0, 0, 0],
[49, 1049, 4.7857, 4.7048, 4.6924]]
return [[0, 1, 0, 0, 0],
[49, 50, 4.7857, 4.7048, 4.6924]]

@pytest.fixture()
def correct_values_backbone_group(self):
return [[0, 1000, 0, 0, 0],
[49, 1049, 4.6997, 1.9154, 2.7139]]
return [[0, 1, 0, 0, 0],
[49, 50, 4.6997, 1.9154, 2.7139]]


def test_progress_meter(self, capsys, universe):
Expand All @@ -195,8 +195,8 @@ def test_rmsd(self, universe, correct_values):
step=49)
RMSD.run()
assert_almost_equal(RMSD.rmsd, correct_values, 4,
err_msg="error: rmsd profile should match" +
"test values")
err_msg="error: rmsd profile should match" +
"test values")

def test_rmsd_unicode_selection(self, universe, correct_values):
RMSD = MDAnalysis.analysis.rms.RMSD(universe, select=u'name CA',
Expand All @@ -217,10 +217,10 @@ def test_rmsd_atomgroup_selections(self, universe):
def test_rmsd_single_frame(self, universe):
RMSD = MDAnalysis.analysis.rms.RMSD(universe, select='name CA',
start=5, stop=6).run()
single_frame = [[5, 1005, 0.91544906]]
single_frame = [[5, 6, 0.91544906]]
assert_almost_equal(RMSD.rmsd, single_frame, 4,
err_msg="error: rmsd profile should match" +
"test values")
err_msg="error: rmsd profile should match" +
"test values")

def test_mass_weighted_and_save(self, universe, outfile, correct_values):
# mass weighting the CA should give the same answer as weighing
Expand Down
71 changes: 58 additions & 13 deletions testsuite/MDAnalysisTests/coordinates/test_dcd.py
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,8 @@
from MDAnalysisTests.datafiles import (DCD, PSF, DCD_empty, PRMncdf, NCDF,
COORDINATES_TOPOLOGY, COORDINATES_DCD,
PSF_TRICLINIC, DCD_TRICLINIC,
PSF_NAMD_TRICLINIC, DCD_NAMD_TRICLINIC)
PSF_NAMD_TRICLINIC, DCD_NAMD_TRICLINIC,
PSF_NAMD_GBIS, DCD_NAMD_GBIS)
from MDAnalysisTests.coordinates.base import (MultiframeReaderTest,
BaseReference,
BaseWriterTest)
Expand Down Expand Up @@ -82,17 +83,29 @@ def test_with_statement(self):

def test_set_time(self):
u = mda.Universe(PSF, DCD)
assert_almost_equal(u.trajectory.time, 1000, decimal=3)
assert_almost_equal(u.trajectory.time, 1.0,
decimal=5)


@pytest.mark.parametrize('istart', (0, 1, 2, 3))
def test_write_istart(universe_dcd, tmpdir, istart):
u = universe_dcd
@pytest.mark.parametrize('fstart', (0, 1, 2, 37, None))
def test_write_istart(universe_dcd, tmpdir, fstart):
outfile = str(tmpdir.join('test.dcd'))
with mda.Writer(outfile, u.atoms.n_atoms, istart=istart) as w:
w.write(u.atoms)
nsavc = universe_dcd.trajectory._file.header['nsavc']
istart = fstart * nsavc if fstart is not None else None
with mda.Writer(outfile, universe_dcd.atoms.n_atoms,
nsavc=nsavc, istart=istart) as w:
for ts in universe_dcd.trajectory:
w.write(universe_dcd.atoms)
u = mda.Universe(PSF, outfile)
assert u.trajectory._file.header['istart'] == istart
assert_almost_equal(u.trajectory._file.header['istart'],
istart if istart is not None else u.trajectory._file.header['nsavc'])
# issue #1819
times = [ts.time for ts in u.trajectory]

fstart = fstart if fstart is not None else 1
ref_times = (np.arange(universe_dcd.trajectory.n_frames) + fstart) * universe_dcd.trajectory.dt
assert_almost_equal(times, ref_times, decimal=5,
err_msg="Times not identical after setting istart={}".format(istart))


class TestDCDWriter(BaseWriterTest):
Expand Down Expand Up @@ -228,7 +241,9 @@ def test_reader_set_dt():
dt = 100
frame = 3
u = mda.Universe(PSF, DCD, dt=dt)
assert_almost_equal(u.trajectory[frame].time, (frame + 1000)*dt,
dcdheader = u.trajectory._file.header
fstart = dcdheader['istart'] / dcdheader['nsavc']
assert_almost_equal(u.trajectory[frame].time, (frame + fstart)*dt,
err_msg="setting time step dt={0} failed: "
"actually used dt={1}".format(
dt, u.trajectory._ts_kwargs['dt']))
Expand All @@ -243,17 +258,30 @@ def test_writer_dt(tmpdir, ext, decimal):
universe_dcd = mda.Universe(PSF, DCD, dt=dt)
t = universe_dcd.trajectory
outfile = str(tmpdir.join("test.{}".format(ext)))
with mda.Writer(outfile, n_atoms=t.n_atoms, dt=dt, istart=t._file.header['istart']) as W:
# use istart=None explicitly so that both dcd and xtc start with time 1*dt
# (XTC ignores istart, DCD will set istart = nsavc)
with mda.Writer(outfile, n_atoms=t.n_atoms, dt=dt, istart=None) as W:
for ts in universe_dcd.trajectory:
W.write(universe_dcd.atoms)

uw = mda.Universe(PSF, outfile)
assert_almost_equal(uw.trajectory.totaltime,
(uw.trajectory.n_frames - 1) * dt, decimal)
(uw.trajectory.n_frames - 1) * dt, decimal=decimal,
err_msg="Total time mismatch for ext={}".format(ext))
times = np.array([uw.trajectory.time for ts in uw.trajectory])
frames = np.array([ts.frame for ts in uw.trajectory])
assert_array_almost_equal(times, (frames + 1000) * dt, decimal)
frames = np.arange(1, uw.trajectory.n_frames + 1) # traj starts at 1*dt
assert_array_almost_equal(times, frames * dt, decimal=decimal,
err_msg="Times mismatch for ext={}".format(ext))

@pytest.mark.parametrize("variable, default", (("istart", 0), ("nsavc", 1)))
def test_DCDWriter_default(tmpdir, variable, default):
outfile = str(tmpdir.join("test.dcd"))
with mda.Writer(outfile, n_atoms=10) as W:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle, writing the empty DCD and returning the header could be made a module-level fixture. I am out of time to do this right now.

pass # just write empty trajectory with header
# need to read because header might only be complete after close
with mda.lib.formats.libdcd.DCDFile(outfile) as DCD:
header = DCD.header
assert header[variable] == default

@pytest.mark.parametrize("ext, decimal", (("dcd", 5),
("xtc", 2)))
Expand Down Expand Up @@ -381,3 +409,20 @@ def test_ncdf2dcd_coords(ncdf2dcd):
assert_almost_equal(ts_ncdf.positions,
ts_dcd.positions,
3)

@pytest.fixture(params=[(PSF, DCD),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be declared with scope="module"?

(PSF_TRICLINIC, DCD_TRICLINIC),
(PSF_NAMD_TRICLINIC, DCD_NAMD_TRICLINIC),
(PSF_NAMD_GBIS, DCD_NAMD_GBIS)])
def universe(request):
psf, dcd = request.param
yield mda.Universe(psf, dcd)

def test_ts_time(universe):
# issue #1819
u = universe
header = u.trajectory._file.header
ref_times = [(ts.frame + header['istart']/header['nsavc'])*ts.dt
for ts in u.trajectory]
times = [ts.time for ts in u.trajectory]
assert_almost_equal(times, ref_times, decimal=5)
Loading