Skip to content

Grid based search for fast distance calculations#1957

Closed
ayushsuhane wants to merge 3 commits intoMDAnalysis:developfrom
ayushsuhane:feature-grid
Closed

Grid based search for fast distance calculations#1957
ayushsuhane wants to merge 3 commits intoMDAnalysis:developfrom
ayushsuhane:feature-grid

Conversation

@ayushsuhane
Copy link
Contributor

Fixes #974

Include grid based search for fast selections.

The primary aim is to include the grid based search in MDAnalysis. Using FATSLiM neighbour search as the base library and adding other functionalities over top of it to handle PBC, non-PBC and triclinic cells.

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

@ayushsuhane
Copy link
Contributor Author

The search function currently returns pairs, squares of distances and indices within a specified distance. We can add more function as desired. I have modified the setup.py to include the gridsearch file as well and the method can be imported as import MDAnalysis.lib.grid .
First and foremost, I was wondering whether the structure is consistent with the MDAnalysis structure? Should the file be separated in terms of python functions and cython functions similar to distances.py?

@ayushsuhane ayushsuhane changed the title First commit for gridded method Grid based search for fast distance calculations Jun 26, 2018
Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are obviously needed. Documentation is needed once it is clear that you are going to move forward with this approach.

Please properly acknowledge the source of the code (see comment at top).

I had some other minor comments/questions (see inline).

@@ -0,0 +1,647 @@
#cython: cdivision=True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this comes from @seb-buch 's FATslim directly. In this case please add a full comment header that states the origin and the licence under which it has been incorporated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be good to use a due.cite for the entry point to these methods too

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to remember "add due.cite" for PRs in the future. For this PR I'd say that it can be done once @ayushsuhane works on the docs, so I wouldn't worry about it immediately.

The comment header is important to me, though, because we want to make sure that we don't appear to take code that we cannot legally use or to not give proper attribution. Adding comments from the first commit shows that we are taking attribution seriously.

neighborhood.size = 0
neighborhood.allocated_size = NEIGHBORHOOD_ALLOCATION_INCREMENT
neighborhood.beadids = <ns_int *> malloc(NEIGHBORHOOD_ALLOCATION_INCREMENT * sizeof(ns_int))
###Modified here
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this your comment?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I am asking because I have no idea what I should learn from the comment.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was trying to modify the code to output some additional values. I should remove them now though.

neighborhood.beaddist = <real *> malloc(NEIGHBORHOOD_ALLOCATION_INCREMENT * sizeof(real))
###
if neighborhood.beadids == NULL:
abort()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does abort() raise a proper Python Exception? Otherwise, shouldn't this raise RuntimeError or OSError or something else that calling code can decide to handle? (It should say that it failed to allocate memory.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, when dying here, do you leak the memory from the previous allocations?


cdef ns_neighborhood *neighborhood = <ns_neighborhood *> malloc(sizeof(ns_neighborhood))
if neighborhood == NULL:
abort()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does abort() raise a proper Python Exception? Otherwise, shouldn't this raise RuntimeError or OSError or something else that calling code can decide to handle? (It should say that it failed to allocate memory.)

DEF BOX_MARGIN=1.0010
DEF MAX_NTRICVEC=12

from libc.stdlib cimport malloc, realloc, free, abort
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are calls to abort() good style for cython? Will calling code be able to properly handle them? What kinds of exceptions do they raise?

(I have a few comments on abort below but I didn't flag all of them.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading more about abort it seems to close the program In the context of Fatslim, it is what you want, but in that context it means closing the python interpretor. Returning an error code would be better here.


# Update neighbor lists
neighborhood.beadids[neighborhood.size] = bid
### Modified here
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this comment mean?

@orbeckst
Copy link
Member

First and foremost, I was wondering whether the structure is consistent with the MDAnalysis structure? Should the file be separated in terms of python functions and cython functions similar to distances.py?

I don't think it has to be separated. One reason why distances is separated is because of the serial and OpenMP versions (and the idea that we might have other backends). In this case it seemed easier to have the compiled code separately.

If all the code fits into the one c_gridsearch.pyx file then I don't see an immediate need to create a pure Python wrapper around it. As long as the API remains the same, we can always refactor later if needed.

One advantage of the separation is that by looking at the sources you immediately see that there is a lib.distances module whereas in the current setup, one has to look into setup.py to see that c_gridsearch turns into lib.grid. I think coverage checking only works on Python code at the moment, so if you have pure Python code in a cython file then it won't show up in coverage.

Bottom line: Despite possible advantages for having a grid.py wrapper, I have no problem leaving the organization of files as you have at the moment. I'd be more concerned with

  • getting the functionality implemented
  • getting tests and benchmarks going so that there is firm ground for including it into the code base (does it improve performance while at the same time not breaking the existing code?)
  • adding docs

@ayushsuhane
Copy link
Contributor Author

ayushsuhane commented Jun 30, 2018

From what I can gather from the cython documentation

If you don’t do anything special, a function declared with cdef that does not return a Python object has no way of reporting Python exceptions to its caller. If an exception is detected in such a function, a warning message is printed and the exception is ignored

pure C cython functions cannot propagate the errors to the caller functions. Raising an exception like MemoryError inside the fuction would catch the error but would result in the exception being ignored. The output for deliberately pointing towards NULL during malloc

grid.nbeads = NULL
    if grid.nbeads == NULL:
        with gil:
            raise MemoryError('FATAL: Could not allocate memory for NS grid.nbeads (requested:' + str(sizeof(ns_int) * grid_size) +' bytes)\n')

results in

MemoryError: FATAL: Could not allocate memory for NS grid.nbeads (requested:1000
bytes)

Exception ignored in: 'gridlocal.populate_grid_array'
MemoryError: FATAL: Could not allocate memory for NS grid.nbeads (requested:1000
bytes)
Traceback (most recent call last):
File "trial.py", line 15, in
searcher.prepare()
File "c_gridsearch.pyx", line 608, in gridlocal.FastNS.prepare
raise RuntimeError("Could not initialize NS grid")
RuntimeError: Could not initialize NS grid

Typically, best practice is to handle any error by a corresponding check in a python function. But, it is not possible in this case. While there exists a library for exception handling in c++ which can be defined as
cdef int function() except +error_handler:
unfortunately, I couldnt find any library which can be used for C. It seems the errors in pure C needs to be handled in a way traditionally handled by C-exceptions handlers by returning an error code, which can be linked to the error message.
Another possible idea is to use cdef [datatype] function() except *, but it checks the return value every time which might slow the evaluations (see here). I checked similar function with a decorator cython.exceptval(True) in the previous commit. Apparently, the time difference is not significant. I tested with a particle set of 10000 points for single query and self distance

------------- | ---------------- | --------------
------------- | With except * | Without except *
Single query | 0.000351s | 0.000348s
Self distance | 3.3s | 3.5s

Fortunately enough, I think we can use the except * in the definition of the function. Althouhg Memory error and other error checks can be imposed here, but the function still ignores the exception and results in segmentation fault in case of failure in allocation. While I don't expect this behavior after using except, I suspect that we need to define the function definition with except * in .pxd file. @richardjgowers @orbeckst , @jbarnoud @kain88-de, since you are more experienced with cython, a suggestion from your side would helpful. Although typically usage of (.pxd) file is ignored in general discussions, I am not sure why this could occur.

Copy link
Contributor

@jbarnoud jbarnoud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The memory leak makes the code unusable in practice and should be addressed. There should be specific tests to make sure that the memory is freed as it is something we can miss when the code will change in the future.



def __dealloc__(self):
#destroy_nsgrid(self.grid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my play with @seb-buch module in the benchmark, having this line commented leads to a memory leak. When an instance of FastNS is destroyed by the garbage collector, self.grid—that is allocated directly with malloc and therefore not handled by python—is not freed. As a result, running a neighbor search on multiple frames result in all the memory being eating up. I had to hard reboot a coule of time because of that specific line being commented out.

It is likely commented for a reason, though. Maybe @seb-buch has some insight to give us on why he commented the line.

It also means we have to be careful with memory management with that PR.

Copy link
Contributor Author

@ayushsuhane ayushsuhane Jun 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I agree. Even raising the exceptions/ abort will lead to memory leak. While I have never coded in Cython, I have couple of possible ideas based on what people have done earlier and suggested elsewhere.
First is we can use atexit() functionality, where one can make all the malloc pointers global during the allocation and whenever any instance is destroyed, all the global pointers can be dereferenced (at the atexit() function) before the destruction. Additionally, we can put the same function before raising/catching any exception, so all the pointers which are populated globally are de-referenced first, followed by a gracefull exit.
Another approach to catch the exceptions is using "cysignals/signals.pxi", which catches the exceptions inside the pure C code and can be used along with cdef [datatype] function() except *. This can probably be coupled with atexit() as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the line just has to be uncommented. Also, the initialization should probably go in __cinit__ rather than __init__.

cdef ns_int i
cdef bint initialization_ok

if self.prepared and not force:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is nothing to do, just do nothing. Here, you are preparing again.

@jbarnoud
Copy link
Contributor

About the use of abort and exceptions that was worrying @orbeckst: from what I see, abort is used in a function that returns an exit code. abort could be replaced by return ERROR_CODE or similar. The exception is raised latter down the code: the error code is checked by a non-cdef function that raises a RuntimeError if the returned value is not a success code (0). It is a very common way to deal with errors, and it looks very sane to me here.

@jbarnoud
Copy link
Contributor

As an afterthought after reading my two last comments back to back, it may be worth doing some cleaning there: https://github.com/MDAnalysis/mdanalysis/pull/1957/files#diff-bff4d08482de32360b2c33d16e6091c5R587. I think about freeing the memory before raising the exception.



def __dealloc__(self):
#destroy_nsgrid(self.grid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the line just has to be uncommented. Also, the initialization should probably go in __cinit__ rather than __init__.


if initialization_ok:
self.prepared = True
else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we reach that point, we are in a stale state. There should be some cleaning here. Though, just destroying the grid is not enough. The grid should be restored in a pre-prepared state.

DEF BOX_MARGIN=1.0010
DEF MAX_NTRICVEC=12

from libc.stdlib cimport malloc, realloc, free, abort
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading more about abort it seems to close the program In the context of Fatslim, it is what you want, but in that context it means closing the python interpretor. Returning an error code would be better here.

@seb-buch
Copy link
Contributor

;TL DR
This is my mess, I will do my best to clean the code and make it better.

Sorry my response time is sluggishly slow but indeed, the code is a quick and dirty port from fatslim, so there are much probably a few things that are not perfect (#euphemism).
A few comments on some of the comments:

  • About the use of abort(), @jbarnoud is right: in fatslim it makes sense to use it as it is a stand alone program but in the case of a library, I think this is bad and should be replaced by a proper exception as it is much more "python library-like". The main reason being that abort() does not raise anything and is not Python friendly as it is completely out of the scope of the Python GIL.

  • @jbarnoud is also right about the memory leak (due to the wrongly commented free() ). This one is on me: while hacking on the parallel version, I had some issues (ie segfaulting) so I ended up commenting this line to find the culprit which was elsewhere but I forgot to uncomment the free().

  • Not sure about what @jbarnoud was aiming when he said that "initialization should goes to cinit" but the "<ns_grid *> malloc(sizeof(ns_grid))" should indeed go to the cinit as it is pure C memory allocation.

My experience with Cython makes me think that playing with malloc()/free() is playing on thin ice (well no more no less than when your write C code) but you barely have any other option if you want really fast code (ie without the GIL) while needing many memory allocations and/or allocation of pure C structures. The way the NS code is implemented in FATSLiM make this pretty much mandatory but it may not be required here and it might well be possible to implement the grid search while staying in the memory comfort zone (aka no malloc()/no free()).
I will try to investigate this in order to clean the code and make it easier to maintain.

@orbeckst
Copy link
Member

Is this PR superceded by PR #1996 and hence can be closed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants