remove lib.parallel.distances #530

@orbeckst

I had a look at our parallel implementations of the distance and geometry calculations. With PR #529 I propose a backward-compatible change that allows selecting between the serial (lib._distances) and the OpenMP-enabled (lib._distances_openmp) C/Cython code that @richardjgowers wrote. As far as I can tell, the OpenMP-enabled code has not been conveniently accessible so far; PR #529 changes that (see there).

  • I did some benchmarking (on an Intel Core 2 Duo 2.66 GHz, i.e. only 2 cores) and found that the OpenMP C/Cython code is roughly 1.4 to 2 times faster than the Cython parallel code in lib.parallel.distances (which also uses OpenMP, but through Cython rather than C); see the benchmarks below.
  • Additionally, lib.parallel.distances can only take orthorhombic boxes into account and breaks or produces garbage with triclinic boxes. The C/Cython code handles periodic boundary conditions (PBC) for all box types.
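
To illustrate why orthorhombic-only PBC handling goes wrong for triclinic cells, here is a small NumPy sketch. It is not the code from lib.parallel.distances or lib._distances: the function names and the example cell are made up for illustration, and the brute-force search over the 27 neighbouring images is only adequate for mildly skewed cells and displacements that start close to the central cell.

```python
import itertools
import numpy as np

def min_image_orthorhombic(dx, lengths):
    """Minimum-image displacement, assuming a rectangular box (hypothetical helper)."""
    return dx - lengths * np.round(dx / lengths)

def min_image_bruteforce(dx, box_vectors):
    """Shortest displacement among the 27 neighbouring images (hypothetical helper).

    box_vectors is a 3x3 array whose rows are the cell vectors a, b, c.
    """
    best = dx
    for shift in itertools.product((-1, 0, 1), repeat=3):
        candidate = dx + np.dot(shift, box_vectors)
        if np.linalg.norm(candidate) < np.linalg.norm(best):
            best = candidate
    return best

# Example triclinic cell (assumed for illustration): b is tilted along x.
box_vectors = np.array([[10.0, 0.0, 0.0],
                        [5.0, 10.0, 0.0],
                        [0.0, 0.0, 10.0]])
dx = np.array([6.0, 9.0, 0.0])

# Treating the cell as orthorhombic uses only the diagonal and ignores the tilt.
wrong = min_image_orthorhombic(dx, np.diag(box_vectors))
right = min_image_bruteforce(dx, box_vectors)

print(np.linalg.norm(wrong), np.linalg.norm(right))
```

In this example the orthorhombic shortcut returns a displacement that is not a true periodic image of the cell, so its length differs from the shortest image found by the brute-force search.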

Therefore, I propose to remove lib.parallel.distances.

Benchmarks

import MDAnalysis as mda
from MDAnalysis.tests.datafiles import TPR, XTC
import MDAnalysis.lib.parallel.distances as pdist
import MDAnalysis.lib.distances as dist

u = mda.Universe(TPR, XTC)
heavy = u.select_atoms("protein and not name H*")
ow = u.select_atoms("name OW")

# with result array
D = dist.distance_array(heavy.positions, ow.positions, box=u.dimensions)
D32 = D.astype(heavy.positions.dtype)

%timeit dist.distance_array(heavy.positions, ow.positions, box=u.dimensions, result=D, mode="serial")
1 loops, best of 3: 372 ms per loop

%timeit dist.distance_array(heavy.positions, ow.positions, box=u.dimensions, result=D, mode="OpenMP")
1 loops, best of 3: 203 ms per loop

%timeit pdist.distance_array(heavy.positions, ow.positions, box=u.dimensions, result=D32)
1 loops, best of 3: 412 ms per loop

# without result array
%timeit dist.distance_array(heavy.positions, ow.positions, box=u.dimensions, mode="serial")
1 loops, best of 3: 527 ms per loop

%timeit dist.distance_array(heavy.positions, ow.positions, box=u.dimensions, mode="OpenMP")
1 loops, best of 3: 310 ms per loop

%timeit pdist.distance_array(heavy.positions, ow.positions, box=u.dimensions)
1 loops, best of 3: 435 ms per loop
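
As a quick summary, the timings above can be turned into ratios relative to the OpenMP C/Cython code. This is only a sketch: the dictionary keys are labels chosen here, and the numbers are the best-of-3 values copied from the %timeit output.

```python
# Best-of-3 timings in ms, copied from the %timeit output above.
timings_ms = {
    "with result array": {"serial": 372, "OpenMP": 203, "lib.parallel": 412},
    "without result array": {"serial": 527, "OpenMP": 310, "lib.parallel": 435},
}

def speedups(timings, reference="OpenMP"):
    """Return each backend's run time as a multiple of the reference backend's."""
    ref = timings[reference]
    return {name: t / ref for name, t in timings.items()}

for case, timings in timings_ms.items():
    for name, factor in speedups(timings).items():
        print(f"{case:22s} {name:13s} {factor:.2f}x the OpenMP time")
```

On this 2-core machine, lib.parallel.distances takes about 2.0 times as long as the OpenMP C/Cython code with a preallocated result array and about 1.4 times as long without one.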
