Conversation
package/setup.py
Outdated
define_macros=define_macros,
- extra_compile_args=extra_compile_args)
+ extra_compile_args=[a for a in extra_compile_args
+                     if not a.startswith('-std')])
So C++ didn't play well with the -std=c99 flag. What does this flag do? And does removing it break anything?
This flag sets the C standard we are using. Yeah, C++ doesn't have this standard. Having an explicit standard for the C code is nice. I did a PR a while ago with some small clean-up in the C code, and I would like to keep the explicit standard requirement.
We should also set the C++ standard we use. I'll have a look at the function you are using and give you my advice after reading up on the docs. The behavior has some subtle changes depending on the standard version used.
None of the functions I used make use of modern C++ features. If we want compatibility with very old compilers (gcc 4.8 and earlier), and therefore with almost all Linux installations out there, we can go for the 2003 standard. Generally I prefer the C++11/14 standards, aka modern C++, but we don't make use of any of their features just yet.
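To make the setup.py change under discussion concrete, here is a minimal Python sketch (the helper name is hypothetical, not the actual MDAnalysis build code) of keeping an explicit C standard while stripping any `-std` flag before the same arguments are reused for a C++ extension:

```python
# Hypothetical helper, not MDAnalysis' actual setup.py: keep an explicit
# C standard for the C extensions, but drop any '-std' flag when the same
# base arguments are reused to compile a C++ extension.
def cxx_compile_args(extra_compile_args):
    return [a for a in extra_compile_args if not a.startswith('-std')]

base_args = ['-O3', '-funroll-loops', '-std=c99']
print(cxx_compile_args(base_args))  # ['-O3', '-funroll-loops']
```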
package/MDAnalysis/lib/_cutil.pyx
Outdated
@cython.boundscheck(False)
@cython.wraparound(False)
def _is_contiguous(int[:] atoms, int[:, :] bonds, int start):
this needs a docstring! I don't see why this function should be hidden.
package/setup.py
Outdated
define_macros=define_macros,
- extra_compile_args=extra_compile_args)
+ extra_compile_args=[a for a in extra_compile_args
+                     if not a.startswith('-std')])
We should also set the c++ standard we use. I'll have a look at the function you are using and give you my advise after reading up on the docs. The behavior has some subtle changes depending on the used standard version
output = intset()

for val in a:
    if b.count(val) != 1:
There is no need to be defensive here. The C++ set container will return the existing element if it is already there: https://en.cppreference.com/w/cpp/container/set/insert
Doing both the count and the insert doubles the computational cost. Because sets don't have duplicates, the count will always be either 0 or 1.
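A Python analogue of this point about std::set (illustration only, not the PR's code): a set never holds duplicates, so a membership count is always 0 or 1, and inserting an element that is already present is a no-op.

```python
# Python sets behave like std::set here: no duplicates, so a "count" is
# always 0 or 1, and re-adding a present element changes nothing.
b = {1, 2, 3}
count = sum(1 for x in b if x == 2)  # "count" of a present element
assert count == 1
b.add(2)                             # inserting again is a no-op
assert b == {1, 2, 3}
```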
Actually, go directly for the std version of this algorithm. It scales better: linearly instead of O(a.size() * log(b.size())).
https://en.cppreference.com/w/cpp/algorithm/set_difference
from libcpp.algorithm cimport set_difference
from libcpp.iterator cimport inserter

cdef intset difference(intset a, intset b):
    output = intset()
    set_difference(a.begin(), a.end(), b.begin(), b.end(),
                   inserter(output, output.begin()))
    return output
This will be a bit more complicated because the stl functions aren't exposed by cython -.-
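For reference, the merge-style scan that std::set_difference performs over two sorted ranges can be sketched in plain Python (a hypothetical illustration, not code from this PR). It runs in O(len(a) + len(b)) rather than O(len(a) * log(len(b))):

```python
def sorted_difference(a, b):
    """Merge-style difference of two sorted sequences, mirroring the
    linear scan std::set_difference does over sorted std::set ranges."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:          # a[i] not in b: keep it
            out.append(a[i])
            i += 1
        elif b[j] < a[i]:        # b[j] irrelevant: skip it
            j += 1
        else:                    # present in both: skip in both
            i += 1
            j += 1
    out.extend(a[i:])            # remainder of a cannot be in b
    return out

print(sorted_difference([1, 2, 4, 7], [2, 3, 7]))  # [1, 4]
```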
package/MDAnalysis/lib/_cutil.pyx
Outdated
x = bonds[i, 0]
y = bonds[i, 1]
# only add bonds if both atoms are in atoms set
if total.count(x):
Combine into one if statement: if total.count(x) and total.count(y):
package/MDAnalysis/lib/_cutil.pyx
Outdated
N = total.size()

nloops = 0
while seen.size() < N:
while seen.size() < N and nloops < N:
package/MDAnalysis/lib/_cutil.pyx
Outdated
cdef int j = 0
cdef int n_values = values.shape[0]
- cdef np.ndarray[np.int64_t, ndim=1] result = np.empty(n_values, dtype=np.int64)
+ cdef long[:] result = np.empty(n_values, dtype=np.int64)
You should continue to use the np.int64_t type to ensure the size of the type is independent of the platform. The size of a long can vary between platforms.
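A quick Python check of this point (illustrative, not from the PR): np.int64 is always 8 bytes, whereas a C long is platform-dependent.

```python
import ctypes
import numpy as np

# np.int64 is guaranteed to be 8 bytes everywhere, whereas a C long is
# platform-dependent: 8 bytes on LP64 Linux/macOS, but 4 bytes on
# 64-bit Windows (LLP64 model).
assert np.dtype(np.int64).itemsize == 8
print(ctypes.sizeof(ctypes.c_long))  # 8 or 4, depending on the platform
```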
I addressed my own comments already. If this works correctly it could give another little boost in performance, but this should be checked!
Force-pushed d7be442 to ec69f49.
Ping #1961
They might be faster. It's always worth checking the C code Cython generates. In this case I assume it did not make a huge difference, especially since we have to cast the memoryview back into a numpy array.
If C++14 isn't available on macOS, give C++11 a try. That will also work here.
Force-pushed 8c6ac1e to 0d5150e.
@kain88-de thanks for the help, I've hardly ever used C++!
Well, in the end I didn't add much. I just fixed the unique_id code you broke by accident and cleaned up the code, that's it. OSX means we can't use nice things -.-
btw @richardjgowers how much faster is this version?
@kain88-de the Cython version of check_contiguous was about 2x faster than pure Python; we were mostly just doing set operations to check connectivity, and Python's sets are good. I've reworked
oldpos = atomgroup.positions
newpos = np.zeros((oldpos.shape[0], 3), dtype=np.float32)
So I checked the C code generated for this: between here and the return at the bottom it is pure C :)
package/MDAnalysis/lib/_cutil.pyx
Outdated
vec[i] = oldpos[other, i] - oldpos[atom, i]
# Apply periodic boundary conditions to this vector
if ortho:
    apply_pbc_ortho(&vec[0], &box[0])
So here ideally we'd use the minimum_image things from calc_distances.h, but there are type mismatches I need to figure out.
package/MDAnalysis/lib/_cutil.pyx
Outdated
ortho = True
box = atomgroup.dimensions
for i in range(3):
    if box[i + 3] != 90.0:
should I use something like fabs(box[i + 3] - 90) > SOMETHING_SMALL here?
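A minimal Python sketch of that suggestion (the helper name and tolerance are hypothetical, not code from this PR): compare each box angle to 90 degrees within a small epsilon instead of testing floats for exact equality.

```python
import math

def is_ortho(angles, eps=1e-5):
    # Treat the box as orthorhombic only if every angle is 90 degrees
    # to within eps, so float noise cannot flip the branch.
    return all(math.fabs(angle - 90.0) < eps for angle in angles)

assert is_ortho([90.0, 90.0, 90.0])
assert is_ortho([90.0, 90.0 + 1e-7, 90.0])   # numerical noise still passes
assert not is_ortho([90.0, 90.0, 120.0])     # genuinely triclinic
```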
Force-pushed b6ef988 to 0d62ca5.
package/MDAnalysis/lib/_cutil.pyx
Outdated
else:
    from .mdamath import triclinic_vectors
    # forgive me code Jesus for I have sinned
    B = triclinic_vectors(box)
Does anyone know how we'd turn a (3, 3) numpy array into the coordinate[3] type we need for the pbc_triclinic below?
Oh, by typecasting the B, of course.
The line takes a pointer to the first element of box (i.e. box[0, 0]) and casts it to a pointer of type coordinate.
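A Python/numpy analogue of that cast (illustration only, no pointers involved): because a C-contiguous (3, 3) array is nine values laid out row by row, the same buffer can be re-read as three consecutive coordinate triplets without any copying.

```python
import numpy as np

# A C-contiguous (3, 3) float32 array is 9 floats laid out row by row,
# so a pointer to its first element can equally be interpreted as three
# consecutive coordinate triplets; no data is moved.
B = np.arange(9, dtype=np.float32).reshape(3, 3)
flat = B.ravel()                      # view of the same buffer, no copy
assert np.shares_memory(flat, B)
triplets = flat.reshape(3, 3)         # "coordinate[3]"-style re-read
assert np.array_equal(triplets, B)
```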
with pytest.raises(ValueError):
    mdamath.make_whole(ag)

def test_too_small_box_size(self, universe, ag):
I don't think it's likely to happen; you either have box dimensions or not. The chance of an incorrect box being supplied is low.
We could keep the check & test, but it would make the algorithm slower because you would need to calculate all bond lengths. I thought it was too unlikely a case to slow things down for.
Force-pushed 27c4a9e to 20f8a52.
Ok, thanks to @davidercruz we've got a fullerene test too. Should be good to go now.
@MDAnalysis/coredevs reviews please :)
I plan on looking at this either tonight or tomorrow.
|
jbarnoud left a comment
Mostly questions. The code looks sane. I am a bit annoyed that the precision of the tests needed to be reduced, though.
return np.array(result)

cdef intset difference(intset a, intset b):
Does not look like the most efficient way to do this, see https://en.cppreference.com/w/cpp/algorithm/set_difference. But it should do for now.
I think @kain88-de tried to use this but it's not available on osx or something
We can rewrite the algorithm ourselves. The std algorithms assume that the containers are sorted (which they are for C++11 and later). Using the std algorithm is difficult on Linux: I did not get the compiler to resolve the template. This could get better when we update to the new conda compilers.
package/MDAnalysis/lib/_cutil.pyx
Outdated
| "You can set dimensions using 'atomgroup.dimensions='") | ||
|
|
||
| ortho = True | ||
| for i in range(3): |
Why not range(3, 6) instead of doing index gymnastic?
if ortho:
    minimum_image(&vec[0], &box[0], &inverse_box[0])
else:
    minimum_image_triclinic(&vec[0], <coordinate*>&tri_box[0])
Couldn't tri_box be cast only once, before the loops?
does casting take any time? I thought it was more an instruction to the compiler rather than a copy of data?
I do not expect any cost for the cast, though I do not see why tri_box could not be the right type directly. Nevertheless, it is really not a big deal.
raise ValueError("Reference atom not in atomgroup")

# Check all of atomgroup is accessible from ref
if not _is_contiguous(atomgroup, ref):
This test is not there in the new version. From what I understand, a non contiguous atom group will trigger the test at the very end of the new version, but the error message is not super clear about the reason, and it fails late instead of early.
So when benchmarking this I found it takes a non-negligible amount of time. Because we also implicitly do this check when applying the algorithm, it made more sense not to check beforehand and just try. We only apply the position change if the algorithm worked, so it's defensive in that respect. But yes, this means that failing because of a non-contiguous group is slower, although succeeding is faster.
assert_array_almost_equal(universe.atoms[:4].positions, refpos)
assert_array_almost_equal(universe.atoms[4].position,
-                          np.array([110.0, 50.0, 0.0]))
+                          np.array([110.0, 50.0, 0.0]), decimal=3)
This looks like a drastic decrease in precision, doesn't it? Could it be that make_whole needs to work on doubles?
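For context on the precision question: float32 carries roughly 7 significant digits, so the absolute spacing between representable values grows with the coordinate's magnitude. A quick numpy check (illustrative, not from the PR):

```python
import numpy as np

# Spacing (ulp) around a coordinate of ~110: float32 leaves only about
# 5 decimal places of absolute precision at this magnitude, while
# float64 has plenty of headroom.
print(np.spacing(np.float32(110.0)))  # ~7.6e-06
print(np.spacing(np.float64(110.0)))  # ~1.4e-14
```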
Force-pushed 6e5fb65 to cac9201.
lib.mdamath.make_whole now supports triclinic boxes
renamed from new to avoid cpp name conflict in calc_distances.h
added fullerene test case for make_whole
Force-pushed cac9201 to 9ea6328.
@jbarnoud I bumped precision up to decimal=5 and it still passes (hopefully)
The performance of make_whole was raised in #1961; benchmarking showed that the check that the molecule is traversable via bonds was taking a significant amount of the time. This rewrites make_whole in Cython and also adds support for triclinic unit cells.
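As a rough illustration of the approach described in this PR, here is a pure-Python sketch (orthorhombic boxes only; the function name and structure are hypothetical, not the actual Cython code): walk the bond graph breadth-first from a reference atom, shift each newly reached atom by the minimum-image vector from the atom it was reached through, and let full reachability double as the contiguity check.

```python
from collections import deque

import numpy as np

def make_whole_sketch(positions, bonds, box, ref=0):
    """Hypothetical sketch of make_whole for an orthorhombic box.

    Unwraps molecules split across periodic boundaries by BFS over
    bonds; raises if the bond graph is not contiguous from `ref`.
    """
    newpos = positions.astype(float).copy()
    adjacency = {i: [] for i in range(len(positions))}
    for x, y in bonds:
        adjacency[x].append(y)
        adjacency[y].append(x)
    seen = {ref}
    todo = deque([ref])
    while todo:
        atom = todo.popleft()
        for other in adjacency[atom]:
            if other in seen:
                continue
            vec = positions[other] - newpos[atom]
            vec -= box * np.rint(vec / box)   # minimum image, ortho box
            newpos[other] = newpos[atom] + vec
            seen.add(other)
            todo.append(other)
    if len(seen) != len(positions):
        raise ValueError("atomgroup was not contiguous from bonds")
    return newpos

# Two bonded atoms split across a 10x10x10 box get pulled back together:
pos = np.array([[1.0, 1.0, 1.0], [9.5, 1.0, 1.0]])
out = make_whole_sketch(pos, [(0, 1)], np.array([10.0, 10.0, 10.0]))
print(out[1])  # atom 1 ends up near [-0.5, 1.0, 1.0], next to atom 0
```

The real implementation works on float32 coordinates, supports triclinic boxes via minimum_image_triclinic, and only commits the new positions back to the atomgroup once the traversal succeeds.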