Diffusion Map Implementation in MDAnalysis #857

@jdetle

Hi All,
As far as I'm concerned, I'd like to go ahead and refactor @euhruska's work instead of starting from scratch. So far it looks like he did a good job implementing all the principal components (see what I did there?) of a diffusion map algorithm, but there are some changes and improvements that can definitely be made:

Refactor to satisfy Bauhaus Style
The current implementation is left as a function and needs to be rewritten as a class that inherits from AnalysisBase.
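To make the function-to-class refactor concrete, here is a minimal sketch of the pattern. `FrameAnalysis` is a hypothetical stand-in for the base class, not MDAnalysis code, and `DistanceToFirstFrame` is a toy analysis invented for illustration; frames are assumed to be flat lists of coordinates already in memory.

```python
import math


class FrameAnalysis:
    """Hypothetical stand-in for the analysis base class (not MDAnalysis code)."""

    def __init__(self, frames):
        self.frames = frames

    def _prepare(self):
        """Allocate result containers before the loop over frames."""

    def _single_frame(self, index, frame):
        """Process one frame; subclasses must override this."""
        raise NotImplementedError

    def _conclude(self):
        """Post-process the collected results after the loop."""

    def run(self):
        self._prepare()
        for index, frame in enumerate(self.frames):
            self._single_frame(index, frame)
        self._conclude()
        return self


class DistanceToFirstFrame(FrameAnalysis):
    """Toy analysis: RMSD (no superposition) of every frame to frame 0."""

    def _prepare(self):
        self.reference = self.frames[0]
        self.results = []

    def _single_frame(self, index, frame):
        squared = sum((x - y) ** 2 for x, y in zip(frame, self.reference))
        self.results.append(math.sqrt(squared / len(frame)))

    def _conclude(self):
        self.max_distance = max(self.results)
```

The subclass only fills in the three hooks; the loop over frames lives in one place, which is what would make it a natural spot to add parallelism later.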

Parallelize Trajectory Analysis
There are a few ways to go about this. I think it would be interesting to try deco, as it could lay the groundwork for doing this simply in other analysis modules. (Can _single_frame() easily be made concurrent? Or is this limited by how our Reader works?) If that doesn't work quickly, we could move on to something using Cython and the like, as demonstrated by this.
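As a rough illustration of why this might parallelize well, each row of the pairwise distance matrix depends only on read-only input. The sketch below uses the standard library's `concurrent.futures` rather than deco, and threads rather than processes, so it shows the structure only and will not speed up pure-Python code. It also assumes all frames are already in memory, which sidesteps the Reader question entirely; `distance_row` and `distance_matrix_concurrent` are hypothetical names.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def rmsd(a, b):
    """Plain RMSD between two flat coordinate lists (no superposition)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def distance_row(i, frames):
    """Distances from frame i to every frame; reads shared input, writes nothing."""
    return [rmsd(frames[i], other) for other in frames]


def distance_matrix_concurrent(frames, max_workers=4):
    """Compute the full distance matrix one row per task."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so row i ends up at index i.
        return list(pool.map(lambda i: distance_row(i, frames), range(len(frames))))
```

This computes every pair twice for simplicity; a real version would exploit symmetry and would need a per-worker way of reading frames from the trajectory.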

Optimize memory allocation
Find every place where copying of arrays can be avoided. Unfortunately, the eigenvalue
problem is one area where we need to load an entire matrix into memory.
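A sketch of what avoiding copies could look like, assuming NumPy and a plain Gaussian kernel with symmetric normalization (the existing PR may use a different kernel or normalization). The distance matrix is overwritten step by step to become the kernel, so only one n x n array is held before the eigensolver runs; `diffusion_eigen` and `epsilon` are hypothetical names.

```python
import numpy as np


def diffusion_eigen(dist, epsilon=1.0):
    """Eigen-decompose a diffusion kernel built in place from `dist`.

    `dist` must be a symmetric (n, n) float array; it is overwritten.
    """
    k = dist                      # alias, not a copy
    np.square(k, out=k)           # d_ij^2
    k /= -epsilon                 # -d_ij^2 / epsilon
    np.exp(k, out=k)              # K_ij = exp(-d_ij^2 / epsilon)

    # Symmetric normalization S = D^-1/2 K D^-1/2 has the same eigenvalues
    # as the row-stochastic matrix P = D^-1 K, and lets us use eigh.
    inv_sqrt = 1.0 / np.sqrt(k.sum(axis=1))
    k *= inv_sqrt[:, None]
    k *= inv_sqrt[None, :]

    # The full matrix must be in memory here; eigh also allocates its own workspace.
    eigvals, eigvecs = np.linalg.eigh(k)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    # Right eigenvectors of P are D^-1/2 times the eigenvectors of S.
    return eigvals[order], eigvecs[:, order] * inv_sqrt[:, None]
```

The leading eigenvalue should come out as 1 (the stationary mode). Callers who still need the distances afterwards would have to pass in a copy themselves.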

Provide API for introduction of new metric
A metric function should be accepted as an optional argument at initialization, with RMSD used if none is given. From the literature I have read, RMSD is not an especially useful metric.
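One possible shape for that API, as a sketch only: the class name `DistanceMatrix`, the `metric` keyword, and the example metric `max_abs_diff` are all hypothetical, and the metric is assumed to be symmetric with zero self-distance. The RMSD here does no superposition.

```python
import math


def rmsd(a, b):
    """Default metric: plain RMSD between two flat coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def max_abs_diff(a, b):
    """Example user-supplied metric: largest per-coordinate deviation."""
    return max(abs(x - y) for x, y in zip(a, b))


class DistanceMatrix:
    def __init__(self, frames, metric=None):
        self.frames = frames
        # Fall back to RMSD when the caller does not supply a metric.
        self.metric = rmsd if metric is None else metric

    def run(self):
        n = len(self.frames)
        self.dist = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                # Assumes metric(a, b) == metric(b, a), so fill both halves at once.
                d = self.metric(self.frames[i], self.frames[j])
                self.dist[i][j] = self.dist[j][i] = d
        return self
```

Usage would then be `DistanceMatrix(frames).run()` for the default, or `DistanceMatrix(frames, metric=max_abs_diff).run()` to swap in another metric without touching the rest of the pipeline.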

Perform thorough testing on real-world data
The most important part of this work is to make sure it actually works. Using @euhruska's PR rather than starting from scratch again gives us more time to validate it against data and make sure everything is performing as it should. Of course, I would also work on test coverage for the code.
