Skip to content

added single distance calculations to lib.distances#1939

Merged
orbeckst merged 13 commits intodevelopfrom
issue-1262-distances
Jun 29, 2018
Merged

added single distance calculations to lib.distances#1939
orbeckst merged 13 commits intodevelopfrom
issue-1262-distances

Conversation

@richardjgowers
Copy link
Member

@richardjgowers richardjgowers commented Jun 14, 2018

Fixes #1262 #1938

Adds some single distance/angle/dihedral calculations to lib.distances (alongside the bulk functions) which should act as the standard way to do this. It's a little silly that we've got how to calculate an angle implemented so many times (with and without pbc, maybe with different limits).

Things to discuss that I wasn't sure about....

  • should pbc=True be the default, it would probably be less "surprising"
  • lib.distances seems right, so what's lib.mdamath then?
  • the namespace for lib.distances also has the C functions in there, we should probably clean that up

@orbeckst
Copy link
Member

Thanks for taking this on.

  • I would go with pbc=True as default. Given the prevalence of PBC in trajectories, this seems more likely to give the expected answer.
  • I think the idea of lib.mdamath was to collect "simple" mathematical functions for re-use in MDAnalysis. If we don't need a specific function in MDAnalysis itself or if we have a more sophisticated version (such as PBC-aware) then I suggest we
    • deprecate the simple version
    • already replace with a call to our fancy version but with defaults that reproduce the old behavior (eg pbc=False)
    • remove in 1.0
  • namespace clean-up: yes. Also, lib.distances.calc_angles sounds weird from the user's perspective: "I don't want distances, I want angles". (Of course it makes sense from the implementation/developer perspective.) Maybe lib.geometry ??

@richardjgowers
Copy link
Member Author

@orbeckst how would you feel about flattening out the namespace slightly, so MDAnalysis.lib.distance_array, MDAnalysis.lib.calc_angles. (eg I don't have to do numpy.something.dot, it's just there). Or even further, MDAnalysis.distance_array....

@orbeckst
Copy link
Member

I really don't like top-level functions such as MDAnalysis.distance_array, I like my name spaces only containing what is directly relevant, especially at the top.

I generally like nested name spaces better than flat ones, especially for anything that's not primarily user-facing such as lib. We already have the file formats under lib so it seems a bit incongruent to have these geometric functions at the same level. Having sub-libraries also makes interactive exploration with lib.<TAB> easier. And even if you make distance_array and friends available under lib, they still have to live somewhere in a module. Why not call this module something meaningful (as opposed to "lib") and then make use of the fact that the full import path means something?

@richardjgowers
Copy link
Member Author

We could do something like mdanalysis.geometry.distance_array and mda.geometry.calc_angles. Currently lib is just a bucket of code where we throw stuff which has no mda internal depedencies :)

@jbarnoud
Copy link
Contributor

Currently lib is just a bucket of code where we throw stuff which has no mda internal depedencies :)

Well. Ideally it is the case. There still is make_whole that is lost there.

@orbeckst
Copy link
Member

Currently lib is just a bucket of code where we throw stuff which has no mda internal depedencies :)

I think having a part of the library that has no external dependencies is good, and having it under lib makes it clear to developers that it can be imported without fear of circular imports and such.

And although lib is a bucket, the buckets inside the bucket have meaningful labels (ranging from the expressive qcprot, transformation, and formats.xxx to the generic log and util). Where would you want to put those? Or is it just the distance code that we are talking about?

Well. Ideally it is the case. There still is make_whole that is lost there.

That... should be fixed eventually :-).

If we are primarily talking about distances and if you want to have these functions less hidden from the user (admittedly, looking for lib.distances.distance_array() is cryptic... see #1863 ) then I agree we could have something like MDAnalysis.geometry for functions that work with MDAnalysis objects. Into this module I would import the generic functions (which I would still have live under lib.coordinate_calculations or rather a better name) and in MDAnalysis.geometry put wrappers that allow people to use them with AtomGroups or Atoms. Something like

from .lib import coordinate_calculations as raw

def distance_array(x, y, **kwargs):
    return raw.distance_array(x.positions, y.positions, **kwargs)

Or come up with a naming scheme that allows the different functions to coexist

from .lib.coordinate_calculations import distance_array

def mda_distance_array(x, y, **kwargs):
    return distance_array(x.positions, y.positions, **kwargs)

(I am not saying that prefixing with mda_ is pretty, in fact, I don't particularly like this and rather prefer using module paths for semantic clarity (geometry.raw.distance_array() vs geometry.distance_array()) but I am happy to concede that other people will have much better ideas ;-) ). See also the discussion in #1863 .

@richardjgowers
Copy link
Member Author

I guess I don't like that you have to know to look in lib for distance_array. From a user perspective why isn't distance_array equally core or analysis (or even coordinates, it manipulates coordinates after all). So I guess it only passes the tab completion test once you start in MDAnalysis.lib.

I like the idea of MDAnalysis.geometry being the new home which also fixes the AtomGroup argument problem too....

@richardjgowers richardjgowers force-pushed the issue-1262-distances branch 2 times, most recently from 45a7168 to 04cf3eb Compare June 16, 2018 21:13
@coveralls
Copy link

coveralls commented Jun 16, 2018

Coverage Status

Coverage decreased (-0.009%) to 89.887% when pulling 68446bd on issue-1262-distances into 339c510 on develop.

@richardjgowers
Copy link
Member Author

Ok this is ready for review @MDAnalysis/coredevs

coords = np.array([[ 83.37363434, 65.01402283, 35.03564835],
[ 82.28535461, 59.99816513, 34.94399261],
[ 81.19387817, 54.97631073, 34.85218048]],
coords = np.array([[1, 1, 1],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason you changed this test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it wasn't giving 180.0 any more, more like 179.999 so I changed it to a real 180 angle

# 90 degree angle
(np.array([[1, 1, 1], [1, 2, 1], [2, 2, 1]], dtype=np.float32), 90.),
# straight line / 180.
(np.array([[1, 1, 1], [1, 2, 1], [1, 3, 1]], dtype=np.float32), 180.),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

90 and 180° are special cases. Maybe something an angle less peculiar would be worth testing.

# 0 degree angle (cis)
(np.array([[1, 2, 1], [1, 1, 1], [2, 1, 1], [2, 2, 1]], dtype=np.float32), 0.),
# 180 degree (trans)
(np.array([[1, 2, 1], [1, 1, 1], [2, 1, 1], [2, 0, 1]], dtype=np.float32), 180.),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, maybe a less peculiar angle in addition to the edge cases.

.. versionchanged:: 0.17.0
Fixed angles close to 180 giving NaN
.. versionchanged:: 0.18.1
Added pbc keyword
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be worth a note in the changelog, I think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Thanks for adding the CHANGELOG entry.)

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looking good, the main thing is more general tests, as @jbarnoud said.

Thanks a lot for taking this one on, @richardjgowers !

@@ -1037,17 +1038,12 @@ def _get_timestep():
@staticmethod
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you get rid of the method completely and just call distances.calc_angle() in the code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could do, it'd technically be a API break but yeah

return distances.calc_angle(d.position, h.position, a.position, box)

@staticmethod
def calc_eucl_distance(a1, a2, box=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you get rid of the method completely and just call distances.calc_distance() in the code?

.. versionchanged:: 0.17.0
Fixed angles close to 180 giving NaN
.. versionchanged:: 0.18.1
Added pbc keyword
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Thanks for adding the CHANGELOG entry.)


.. versionadded:: 0.9.0
.. versionchanged:: 0.18.1
Added pbc keyword
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with default True

.. versionchanged:: 0.17.0
Fixed angles close to 180 giving NaN
.. versionchanged:: 0.18.1
Added pbc keyword
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with default True

a, b : numpy.ndarray
single coordinate vectors
box : numpy.ndarray, optional
simulation box, if given periodic boundary conditions will be applied
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think somewhere in the docs we will have to describe in more detail what exactly we mean and do when we say "periodic boundary conditions will be applied".

@richardjgowers
Copy link
Member Author

Ok maybe good to go now

@orbeckst
Copy link
Member

orbeckst commented Jun 19, 2018

@jbarnoud makes an important point that the test cases are edge cases.

Can you add something more generic? Perhaps we can generate a whole bunch of tests using simple geometry and some random angles?

For example, using the affine transformations from lib.transformations we can first build the fake geometry in internal coordinates (centered at the origin and oriented along coordinate axes) and then transform the points with a random translation and rotation:

# build in internal coordinates starting at origin

theta_deg = np.random.uniform(0, 180)    # angle    A-B-C
theta = np.deg2rad(theta_deg)
a, c = np.random.uniform(0, 1.5, 2)  

# construct three "atoms" A, B, C with bonds A-B and B-C and the desired angle
A = np.array([a, 0, 0])
B = np.array([0, 0, 0])
C = c * np.array([np.cos(theta), np.sin(theta), 0])

# now translate and rotate randomly
import MDAnalysis.lib.transformations as trans
T = trans.translation_matrix(np.random.uniform(-50, 50, 3))
R = trans.random_rotation_matrix(np.random.rand(3))
M = trans.concatenate_matrices(T, R)

# use affine transformations just for the heck of it (instead of R.x + t)
positions = np.array([A, B, C])
# add '1' in the fourth coordinate (treat as points, not vectors for affine transformation)
points  = np.hstack((positions, np.ones(positions.shape[0])[:, np.newaxis]))
# transform (to use broadcasting and take account of array orders, switch M*x to x*M.T)
points_transformed = np.dot(points, M.T)
# transformed coordinates (x, y, z)
positions_transformed = points_transformed[:, :3]


# check that the angle is the same
A1, B1, C1 = positions_transformed
AB1 = A1 - B1
CB1 = C1 - B1

a1 = np.linalg.norm(AB1)
c1 = np.linalg.norm(CB1)

print(np.allclose(a, a1))
print(np.allclose(c, c1))

theta_deg1 = np.rad2deg(np.arccos(np.dot(AB1, CB1) / (a1*c1)))
assert np.allclose(theta_deg, theta_deg1)

You can also check with

from MDAnalysis.analysis.rms import rmsd
assert np.allclose(rmsd(positions, positions_transformed, superposition=True), 0, atol=1e-7)

that we just do a rigid body translation + rotation.

@orbeckst
Copy link
Member

You missed a self.calc_* in HBond analysis and the tests fail.

@richardjgowers richardjgowers force-pushed the issue-1262-distances branch 2 times, most recently from a3e608c to ce46c79 Compare June 23, 2018 23:42

reference = ref if periodic else manual_angle(a, b, c)

assert result == pytest.approx(reference, abs=1e-3)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fails with numpy 1.10.4 – is pytest.approx() a good way to do this comparison?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was trying to be cute and use pytest, looks like np is better


reference = ref if periodic else manual_dihedral(a, b, c, d)

assert abs(result) == pytest.approx(abs(reference), abs=1e-3)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fails for numpy 1.10.4

assert 2.5444436693403972e-12 == 0.0 ± 1.0e-03

This is not behaving as well as np.testing.assert_almost_equal().

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok for the slightly more general angle tests.

However, the assertions are not very robust and fail under numpy 1.10.4, even though the absolute difference is near machine precision.

@orbeckst orbeckst mentioned this pull request Jun 26, 2018
4 tasks
@richardjgowers
Copy link
Member Author

@orbeckst ok good to go now!

@orbeckst orbeckst merged commit 9bd5b09 into develop Jun 29, 2018
@orbeckst orbeckst deleted the issue-1262-distances branch June 29, 2018 18:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants