Augment Coordinates for periodic Boundary conditions#1977
Augment Coordinates for periodic Boundary conditions#1977richardjgowers merged 8 commits intoMDAnalysis:developfrom
Conversation
|
Since replacement of Bio.cKDTree with scipy.spatial.cKDTree is beneficial, is this the correct way forward? The class Periodic_cKDTree deals with both periodic and non-periodic boundary conditons. I have also defined a |
| Input coordinate array to generate duplicate images | ||
| box : array | ||
| Dimensions of the box of shape (3, 3) | ||
| reciprocal : array |
There was a problem hiding this comment.
is this the reciprocal of box? if so, we can just calculate it in the function
| cdef float other[3] | ||
|
|
||
| cdef int dim | ||
| dim = coordinates.shape[1] |
There was a problem hiding this comment.
we can hardcode dim to 3, it might sometimes help the compiler
| indices[p] = i | ||
| p += 1 | ||
|
|
||
|
|
There was a problem hiding this comment.
if you can fix up the formatting it would be easier to read, there's 3 blank lines here
| cdef float sum1 | ||
| cdef ssize_t dim | ||
|
|
||
| dim=3 |
| def undo_augment(int[:] results, int[:] translation, int nreal): | ||
| """Translate augmented indices back to originals | ||
|
|
||
| Note: modifies results in place! |
There was a problem hiding this comment.
This doesn't do the modification in place any more, is there a reason you do the copy now instead?
There was a problem hiding this comment.
Actually, I thought for complex selections it might be useful to keep all the particles (original + images). But it looks like its better to modify the results, otherwise it will lead to multiple computations of similar points with increase in number of selections(operations).
Meanwhile, I will also try to think of a case, where it might fail.
|
I was working on the tests, but got stuck in few problems and would like to share them here:
|
I do not understand the problem. Could you elaborate? |
|
For instance, if the augment function is called as |
|
But, now, if you provide anything less than 6 atoms you have the same problem. Haven't you? |
|
Yes, Exactly. For larger number of atoms, it can be assumed that few particles will be close to the boundary and therefore, it works, but the current form will fail for few number of particles. Either we can put a condition that |
|
At which point in the code do you know for sure what the size of |
|
Only at the end. That is why it is allocated a size of |
|
Can you really assume p < N? If you can have p = 6 for N = 1, then cannot you have p = 6 * N? What happens if all your points are at the border of the box? |
|
Yeah this is something I forgot to do, I think I have a solution....
…On Tue, Jul 10, 2018 at 4:57 AM, Jonathan Barnoud ***@***.***> wrote:
Can you really assume p < N? If you can have p = 6 for N = 1, then cannot
you have p = 6 * N? What happens if all your points are at the border of
the box?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#1977 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AI0jB-h7wCMGJuSEd7scr0WrDZwQUncJks5uFHqKgaJpZM4VG4AZ>
.
|
|
@ayushsuhane ok I think we can just use a cpp vector of floats and ints from libcpp.vector cimport vector
import numpy as np
def thing():
cdef vector[float] output
cdef vector[int] indices
# each time we add an augmented coordinate
for j in range(3):
output.push_back(coord[j] + shiftX[j] + shiftY[j])
indices.push_back(i)
# at the end to return results
n = indices.size()
return np.asarray(output).reshape(n, 3), np.asarray(indices) |
| given by pointers a and b | ||
| """ | ||
| cdef ssize_t n | ||
| cdef float[:] result = numpy.zeros((3,), dtype=numpy.float32) |
There was a problem hiding this comment.
in general you don't want any numpy calls inside a cdef function, this should all just be pure C. We only want numpy calls at the start of a function (to prepare things) and at the end (to correctly format things for the user)
Codecov Report
@@ Coverage Diff @@
## develop #1977 +/- ##
===========================================
+ Coverage 88.48% 88.48% +<.01%
===========================================
Files 142 142
Lines 17202 17203 +1
Branches 2635 2635
===========================================
+ Hits 15221 15222 +1
Misses 1385 1385
Partials 596 596
Continue to review full report at Codecov.
|
|
|
package/MDAnalysis/lib/pkdtree.py
Outdated
| self._indices = np.array(list( | ||
| itertools.chain.from_iterable(indices)), | ||
| dtype=np.int32) | ||
| self._indices = np.unique(self._indices) |
There was a problem hiding this comment.
there's a faster unique in lib._cutil that we can use
package/MDAnalysis/lib/pkdtree.py
Outdated
| pairs = np.array(list(self.ckdt.query_pairs(radius)), | ||
| dtype=np.int32) | ||
| if pairs.size > 0: | ||
| pairs = np.unique(np.sort(pairs), axis=0) |
There was a problem hiding this comment.
use the cutil.unique one here too
| @@ -20,7 +20,6 @@ | |||
| # J. Comput. Chem. 32 (2011), 2319--2327, doi:10.1002/jcc.21787 | |||
| # | |||
| # | |||
There was a problem hiding this comment.
can you revert the changes to this file to keep history tidy?
There was a problem hiding this comment.
Just to make sure, are you asking to commit a revert or to make a separate branch and push force to the current branch, but I guess you have to rebase it afterwards to not include any of the current history. Is this right?
richardjgowers
left a comment
There was a problem hiding this comment.
The augment changes look like they're finished, so if you can split them into a new PR we can merge those. The KDTree changes will require more work as we need to replace the existing class (seamlessly as it's well used)
package/MDAnalysis/lib/pkdtree.py
Outdated
| return self._indices | ||
|
|
||
|
|
||
| class Periodic_cKDTree(object): |
There was a problem hiding this comment.
We need to replace the existing pkdtree rather than just add another option here
|
@ayushsuhane yeah just a PR with only the augment functions and tests, but don't lose the progress you've made on KDTree, it's not too far from being finished either |
|
@richardjgowers Do we need anything else here? I will open another PR for replacing Bio.KDTree . |
| cdef int N | ||
| N = results.shape[0] | ||
|
|
||
| for i in range(N): |
package/MDAnalysis/lib/_augment.pyx
Outdated
| original indices of the augmented coordinates | ||
| """ | ||
| cdef bint lo_x, hi_x, lo_y, hi_y, lo_z, hi_z | ||
| cdef int i, j, p, N |
package/MDAnalysis/lib/_augment.pyx
Outdated
|
|
||
| import cython | ||
| import numpy | ||
| cimport numpy |
There was a problem hiding this comment.
We're never using the cimport of numpy here. We'd be using it if we did something like cdef numpy.ndarray[float] arr1, the numpy calls on the right hand side use the import numpy import
package/MDAnalysis/lib/_augment.pyx
Outdated
| # | ||
|
|
||
| import cython | ||
| import numpy |
There was a problem hiding this comment.
as np, then change the calls to np.thing
package/MDAnalysis/lib/_augment.pyx
Outdated
| ---------- | ||
| coordinates : array | ||
| Input coordinate array to generate duplicate images | ||
| dm : array |
There was a problem hiding this comment.
most other functions take box in the [lx, ly, lz,, 90 90 90] form, so use that form here. Then do the conversion to triclinic vectors inside the function. You can keep the dm variable, just have it created inside the function
package/MDAnalysis/lib/_augment.pyx
Outdated
|
|
||
| Returns | ||
| ------- | ||
| results : ndarray of ints |
There was a problem hiding this comment.
indices which have been translated to refer to the original indices
package/MDAnalysis/lib/_augment.pyx
Outdated
| output : array | ||
| coordinates of duplicates generated due to periodic boundary conditions | ||
| indices : array | ||
| original indices of the augmented coordinates |
There was a problem hiding this comment.
.. seealso:: undo_augment
package/MDAnalysis/lib/_augment.pyx
Outdated
|
|
||
| @cython.boundscheck(False) | ||
| @cython.wraparound(False) | ||
| def augment(float[:, ::1] coordinates, float[:, ::1] dm, float r): |
There was a problem hiding this comment.
rename this to augment_coordinates so it's clear it works on coordinates
| @cython.wraparound(False) | ||
| def augment(float[:, ::1] coordinates, float[:, ::1] dm, float r): | ||
| """Calculate augmented coordinate set | ||
|
|
There was a problem hiding this comment.
Add a description of what this function does. Something like "calculates which particles are within r of a box boundary and creates duplicate images of these on the opposite side of the box"
| Parameters | ||
| ---------- | ||
| coordinates : array | ||
| Input coordinate array to generate duplicate images |
There was a problem hiding this comment.
These must all be within the primary unit cell for the algorithm to work
| [1.1, -0.1, 1.1]), | ||
| ([1.1, 0.5, 0.5], [0.5, -0.1, 0.5])) | ||
|
|
||
| radius = 1.5 |
There was a problem hiding this comment.
this should be inside the test function
|
|
||
| @pytest.mark.parametrize('b, qres', product(boxes, zip(queries, images))) | ||
| def test_augment(b, qres): | ||
| b = np.array(b, dtype=np.float32) |
There was a problem hiding this comment.
just store b as a numpy array?
|
|
||
| # Find images for several query points, | ||
| # here in fractional coordinates using augment | ||
| queries = ([0.1, 0.5, 0.5], # box face |
There was a problem hiding this comment.
can you reformat this so the queries and images are side by side, it's hard to read this way..
qres = [
([[0.1, 0.5, 0.5]], [[1.1, 0.5 0.5]]), # box face
etc
]| radius = 1.5 | ||
|
|
||
|
|
||
| @pytest.mark.parametrize('b, qres', product(boxes, zip(queries, images))) |
There was a problem hiding this comment.
don't use product, just have two mark.parametrize, one for b one for qres
package/setup.py
Outdated
| aug = MDAExtension('MDAnalysis.lib._augment', | ||
| sources=['MDAnalysis/lib/_augment' + source_suffix], | ||
| language='c++', | ||
| libraries=mathlib, |
There was a problem hiding this comment.
can you check if we need mathlib? I thought we needed it for sqrt
package/MDAnalysis/lib/_augment.pyx
Outdated
| cdef float coord[3] | ||
| cdef float end[3] | ||
| cdef float other[3] | ||
| cdef float[:, ::1] dm = np.zeros((3, 3), dtype=np.float32) |
There was a problem hiding this comment.
dm and reciprocal don't need to be numpy arrays, can we just use C floats?
|
Should the return datatype for indices be changed to int64 specifically? Or the current way of typecasting before calling unique_int_1d is workable. |
|
@ayushsuhane are you talking about how |
|
@richardjgowers Should it handle both or everything should be converted to int64? |
|
Meanwhile, I just changed it to np.int64 , if required to handle both of the int types I came across @richardjgowers @jbarnoud can you review it. Once it is merged then we can focus on the KDTree, since it also require augment function. |
richardjgowers
left a comment
There was a problem hiding this comment.
Code is looking good.
WRT docs, can you import augment + undo into lib.distances, then also add in the .. autofunction:: lib._augment.augment_coordinates to the distances docs. Ie they should look like they're from that module even though they are written somewhere else
| @@ -0,0 +1,2 @@ | |||
| .. automodule:: MDAnalysis.lib.augment | |||
There was a problem hiding this comment.
the doc build failed because you refer to lib.augment here but you named the module lib._augment elsewhere
|
@ayushsuhane WRT |
|
@richardjgowers were you referring to similar structure for the docs or did I get you wrong? |
|
@ayushsuhane yep that was right, thanks! |
Changes made in this Pull Request:
PR Checklist