Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 5 additions & 2 deletions package/CHANGELOG
Original file line number Diff line number Diff line change
Expand Up @@ -175,8 +175,8 @@ Enhancements
checking if it can be used in parallel analysis. (Issue #2996, PR #2950)

Changes
* `helix_analysis.HELANAL` now uses the `analysis.base.Results` class to
store results attributes (Issue #3261, #3267)
* `analysis.diffusionmap.DistanceMatrix` class now stores `dist_matrix`
using the `analysis.base.Results` class (Issues #3288, #3290)
* `analysis.align.AlignTraj` and `analysis.align.AverageStructure` now store
their result attributes using the `analysis.base.Results` class
(Issues #3278 and #3261)
Expand Down Expand Up @@ -243,6 +243,9 @@ Changes
* Added OpenMM coordinate and topology converters (Issue #2863, PR #2917)

Deprecations
* The `analysis.diffusionmap.DistanceMatrix.dist_matrix` is now deprecated in
favour of `analysis.diffusionmap.DistanceMatrix.results.dist_matrix`.
It will be removed in 3.0.0 (Issues #3288, #3290)
* The `analysis.align.AlignTraj.rmsd` attribute is now deprecated in
favour of `analysis.align.AlignTraj.results.rmsd` (Issue #3278, #3261)
* The `universe`, `positions`, and `rmsd` attributes of
Expand Down
206 changes: 115 additions & 91 deletions package/MDAnalysis/analysis/diffusionmap.py
Original file line number Diff line number Diff line change
Expand Up @@ -54,26 +54,32 @@
:data:`~MDAnalysis.tests.datafiles.DCD`). This tutorial shows how to use the
Diffusion Map class.

First load all modules and test data ::
First load all modules and test data

>>> import MDAnalysis as mda
>>> import MDAnalysis.analysis.diffusionmap as diffusionmap
>>> from MDAnalysis.tests.datafiles import PSF, DCD
.. code-block:: python

import MDAnalysis as mda
import MDAnalysis.analysis.diffusionmap as diffusionmap
from MDAnalysis.tests.datafiles import PSF, DCD

Given a universe or atom group, we can create and eigenvalue decompose
the Diffusion Matrix from that trajectory using :class:`DiffusionMap`:: and get
the corresponding eigenvalues and eigenvectors.

>>> u = mda.Universe(PSF,DCD)
.. code-block:: python

u = mda.Universe(PSF,DCD)

We leave determination of the appropriate scale parameter epsilon to the user,
[Clementi1]_ uses a complex method involving the k-nearest-neighbors of a
trajectory frame, whereas others simple use a trial-and-error approach with
a constant epsilon. Currently, the constant epsilon method is implemented
by MDAnalysis.

>>> dmap = diffusionmap.DiffusionMap(u, select='backbone', epsilon=2)
>>> dmap.run()
.. code-block:: python

dmap = diffusionmap.DiffusionMap(u, select='backbone', epsilon=2)
dmap.run()

From here we can perform an embedding onto the k dominant eigenvectors. The
non-linearity of the map means there is no explicit relationship between the
Expand All @@ -86,50 +92,28 @@
spectral gap and should be somewhat apparent for a system at equilibrium with a
high number of frames.

>>> # first cell of a jupyter notebook should contain: %matplotlib inline
>>> import matplotlib.pyplot as plt
>>> f, ax = plt.subplots()
>>> upper_limit = # some reasonably high number less than the n_eigenvectors
>>> ax.plot(dmap.eigenvalues[:upper_limit])
>>> ax.set(xlabel ='eigenvalue index', ylabel='eigenvalue')
>>> plt.tight_layout()
.. code-block:: python

import matplotlib.pyplot as plt
f, ax = plt.subplots()
upper_limit = # some reasonably high number less than the n_eigenvectors
ax.plot(dmap.eigenvalues[:upper_limit])
ax.set(xlabel ='eigenvalue index', ylabel='eigenvalue')
plt.tight_layout()

From here we can transform into the diffusion space

>>> num_eigenvectors = # some number less than the number of frames after
>>> # inspecting for the spectral gap
>>> fit = dmap.transform(num_eigenvectors, time=1)
.. code-block:: python

num_eigenvectors = # some number less than the number of frames after
# inspecting for the spectral gap
fit = dmap.transform(num_eigenvectors, time=1)

It can be difficult to interpret the data, and is left as a task
for the user. The `diffusion distance` between frames i and j is best
approximated by the euclidean distance between rows i and j of
self.diffusion_space.


.. _Distance-Matrix-tutorial:

Distance Matrix tutorial
------------------------

Often a, a custom distance matrix could be useful for local
epsilon determination or other manipulations on the diffusion
map method. The :class:`DistanceMatrix` exists in
:mod:`~MDAnalysis.analysis.diffusionmap` and can be passed
as an initialization argument for :class:`DiffusionMap`.

>>> import MDAnalysis as mda
>>> import MDAnalysis.analysis.diffusionmap as diffusionmap
>>> from MDAnalysis.tests.datafiles import PSF, DCD

Now create the distance matrix and pass it as an argument to
:class:`DiffusionMap`.

>>> u = mda.Universe(PSF,DCD)
>>> dist_matrix = diffusionmap.DistanceMatrix(u, select='all')
>>> dist_matrix.run()
>>> dmap = diffusionmap.DiffusionMap(dist_matrix)
>>> dmap.run()

Classes
-------

Expand Down Expand Up @@ -183,84 +167,124 @@ class DistanceMatrix(AnalysisBase):
using a given metric

A distance matrix can be initialized on its own and used as an
initialization argument in :class:`DiffusionMap`. Refer to the
:ref:`Distance-Matrix-tutorial` for a demonstration.
initialization argument in :class:`DiffusionMap`.

Parameters
----------
universe : `~MDAnalysis.core.universe.Universe`
The MD Trajectory for dimension reduction, remember that
computational cost of eigenvalue decomposition
scales at O(N^3) where N is the number of frames.
Cost can be reduced by increasing step interval or specifying a
start and stop value when calling :meth:`DistanceMatrix.run`.
select: str, optional
Any valid selection string for
:meth:`~MDAnalysis.core.groups.AtomGroup.select_atoms`
This selection of atoms is used to calculate the RMSD between
different frames. Water should be excluded.
metric : function, optional
Maps two numpy arrays to a float, is positive definite and
symmetric. The API for a metric requires that the arrays must have
equal length, and that the function should have weights as an
optional argument. Weights give each index value its own weight for
the metric calculation over the entire arrays. Default: metric is
set to rms.rmsd().
cutoff : float, optional
Specify a given cutoff for metric values to be considered equal,
Default: 1EO-5
weights : array, optional
Weights to be given to coordinates for metric calculation
verbose : bool, optional
Show detailed progress of the calculation if set to ``True``; the
default is ``False``.

Attributes
----------
atoms : AtomGroup
atoms : `~MDAnalysis.core.groups.AtomGroup`
Selected atoms in trajectory subject to dimension reduction
dist_matrix : array, (n_frames, n_frames)
results.dist_matrix : numpy.ndarray, (n_frames, n_frames)
Array of all possible ij metric distances between frames in trajectory.
This matrix is symmetric with zeros on the diagonal.

.. versionadded:: 2.0.0

dist_matrix : numpy.ndarray, (n_frames, n_frames)

.. deprecated:: 2.0.0
Will be removed in MDAnalysis 3.0.0. Please use
:attr:`results.dist_matrix` instead.

Example
-------
Often, a custom distance matrix could be useful for local
epsilon determination or other manipulations on the diffusion
map method. The :class:`DistanceMatrix` exists in
:mod:`~MDAnalysis.analysis.diffusionmap` and can be passed
as an initialization argument for :class:`DiffusionMap`.

.. code-block:: python

import MDAnalysis as mda
import MDAnalysis.analysis.diffusionmap as diffusionmap
from MDAnalysis.tests.datafiles import PSF, DCD

Now create the distance matrix and pass it as an argument to
:class:`DiffusionMap`.

u = mda.Universe(PSF,DCD)
dist_matrix = diffusionmap.DistanceMatrix(u, select='all')
dist_matrix.run()
dmap = diffusionmap.DiffusionMap(dist_matrix)
dmap.run()

.. versionchanged:: 1.0.0
``save()`` method has been removed. You can use ``np.save()`` on
:attr:`DistanceMatrix.dist_matrix` instead.
:attr:`DistanceMatrix.results.dist_matrix` instead.
.. versionchanged:: 2.0.0
:attr:`dist_matrix` is now stored in a
:class:`MDAnalysis.analysis.base.Results` instance.

"""
def __init__(self, u, select='all', metric=rmsd, cutoff=1E0-5,
def __init__(self, universe, select='all', metric=rmsd, cutoff=1E0-5,
weights=None, **kwargs):
"""
Parameters
----------
u : universe `~MDAnalysis.core.universe.Universe`
The MD Trajectory for dimension reduction, remember that
computational cost of eigenvalue decomposition
scales at O(N^3) where N is the number of frames.
Cost can be reduced by increasing step interval or specifying a
start and stop value when calling :meth:`DistanceMatrix.run`.
select: str, optional
Any valid selection string for
:meth:`~MDAnalysis.core.groups.AtomGroup.select_atoms`
This selection of atoms is used to calculate the RMSD between
different frames. Water should be excluded.
metric : function, optional
Maps two numpy arrays to a float, is positive definite and
symmetric. The API for a metric requires that the arrays must have
equal length, and that the function should have weights as an
optional argument. Weights give each index value its own weight for
the metric calculation over the entire arrays. Default: metric is
set to rms.rmsd().
cutoff : float, optional
Specify a given cutoff for metric values to be considered equal,
Default: 1EO-5
weights : array, optional
Weights to be given to coordinates for metric calculation
verbose : bool (optional)
Show detailed progress of the calculation if set to ``True``; the
default is ``False``.
"""
self._u = u
traj = self._u.trajectory

# remember that this must be called before referencing self.n_frames
super(DistanceMatrix, self).__init__(self._u.trajectory, **kwargs)
super(DistanceMatrix, self).__init__(universe.trajectory, **kwargs)

self.atoms = self._u.select_atoms(select)
self.atoms = universe.select_atoms(select)
self._metric = metric
self._cutoff = cutoff
self._weights = weights
self._calculated = False

def _prepare(self):
self.dist_matrix = np.zeros((self.n_frames, self.n_frames))
self.results.dist_matrix = np.zeros((self.n_frames, self.n_frames))

def _single_frame(self):
iframe = self._ts.frame
i_ref = self.atoms.positions
# diagonal entries need not be calculated due to metric(x,x) == 0 in
# theory, _ts not updated properly. Possible savings by setting a
# cutoff for significant decimal places to sparsify matrix
for j, ts in enumerate(self._u.trajectory[iframe:self.stop:self.step]):
for j, ts in enumerate(self._trajectory[iframe:self.stop:self.step]):
self._ts = ts
j_ref = self.atoms.positions
dist = self._metric(i_ref, j_ref, weights=self._weights)
self.dist_matrix[self._frame_index, j+self._frame_index] = (
dist if dist > self._cutoff else 0)
self.dist_matrix[j+self._frame_index, self._frame_index] = (
self.dist_matrix[self._frame_index, j+self._frame_index])
self._ts = self._u.trajectory[iframe]
self.results.dist_matrix[self._frame_index,
j+self._frame_index] = (
dist if dist > self._cutoff else 0)
self.results.dist_matrix[j+self._frame_index,
self._frame_index] = (
self.results.dist_matrix[self._frame_index,
j+self._frame_index])
self._ts = self._trajectory[iframe]

@property
def dist_matrix(self):
wmsg = ("The `dist_matrix` attribute was deprecated in "
"MDAnalysis 2.0.0 and will be removed in MDAnalysis 3.0.0. "
"Please use `results.dist_matrix` instead.")
warnings.warn(wmsg, DeprecationWarning)
return self.results.dist_matrix

def _conclude(self):
self._calculated = True
Expand Down Expand Up @@ -338,7 +362,7 @@ def run(self, start=None, stop=None, step=None):
warnings.warn("The distance matrix is very large, and can "
"be very slow to compute. Consider picking a larger "
"step size in distance matrix initialization.")
self._scaled_matrix = (self._dist_matrix.dist_matrix ** 2 /
self._scaled_matrix = (self._dist_matrix.results.dist_matrix ** 2 /
self._epsilon)
# take negative exponent of scaled matrix to create Isotropic kernel
self._kernel = np.exp(-self._scaled_matrix)
Expand Down
11 changes: 10 additions & 1 deletion testsuite/MDAnalysisTests/analysis/test_diffusionmap.py
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,9 @@ def test_eg(dist, dmap):
def test_dist_weights(u):
backbone = u.select_atoms('backbone')
weights_atoms = np.ones(len(backbone.atoms))
dist = diffusionmap.DistanceMatrix(u, select='backbone', weights=weights_atoms)
dist = diffusionmap.DistanceMatrix(u,
select='backbone',
weights=weights_atoms)
dist.run(step=3)
dmap = diffusionmap.DiffusionMap(dist)
dmap.run()
Expand Down Expand Up @@ -94,3 +96,10 @@ def test_not_universe_error(u):
trj_only = u.trajectory
with pytest.raises(ValueError, match='U is not a Universe'):
diffusionmap.DiffusionMap(trj_only)


def test_DistanceMatrix_attr_warning(u):
dist = diffusionmap.DistanceMatrix(u, select='backbone').run(step=3)
wmsg = f"The `dist_matrix` attribute was deprecated in MDAnalysis 2.0.0"
with pytest.warns(DeprecationWarning, match=wmsg):
assert getattr(dist, "dist_matrix") is dist.results.dist_matrix