Analysis - Linear Density module by mattiafelice-palermo · Pull Request #670 · MDAnalysis/mdanalysis

mattiafelice-palermo · 2016-01-27T13:18:30Z

As discussed in here.

This is a quite simple analysis module that, given a NVT trajectory in an orthorombic simulation cell, outputs a column based text file with mass and charge densities along the three axes of the cell.

A minimal working example would be as follows:

import MDAnalysis as mda
from MDAnalysisTests.datafiles import DCD, PSF
from lineardensity import LinearDensity 

universe  = mda.Universe(PSF,DCD)
sel_string = 'all'
selection = universe.select_atoms(sel_string)

job = LinearDensity(universe, selection, description=sel_string, grouping='atoms', binsize=0.25)
job.run()

where grouping specifies the group of atoms on which the density distribution will be computed (it takes the usual arguments atoms, residues, fragments etc.) and binsize is the resolution of the computed histogram.
The analysis module outputs a file named dcdfilename.description_grouping.ldens

#1 coord [Ang] 2-7 mass density (x,sx,y,sz,z,sz) [g/cm^3] 8-13 charge density (x,sx,y,sz,z,sz) [e/A^3]
#  Average density: 0.193193630269 g/cm3
   0.00000    0.32147    0.02698    0.31465    0.02528    0.00000    0.00000    0.00008    0.00071   -0.00011    0.00065    0.00000    0.00000
   0.25011    0.32246    0.02996    0.31789    0.02887    0.00000    0.00000    0.00001    0.00070    0.00012    0.00067    0.00000    0.00000
   0.50022    0.32115    0.02701    0.31733    0.02785    0.00000    0.00000    0.00001    0.00071    0.00007    0.00068    0.00000    0.00000
   0.75033    0.32143    0.03006    0.31959    0.02885    0.00000    0.00000    0.00005    0.00066    0.00013    0.00067    0.00000    0.00000
   1.00045    0.31694    0.02879    0.31757    0.02736    0.00000    0.00000   -0.00007    0.00069    0.00014    0.00064    0.00000    0.00000
   1.25056    0.32110    0.02822    0.31744    0.02814    0.00000    0.00000   -0.00008    0.00068   -0.00001    0.00070    0.00000    0.00000
   1.50067    0.31358    0.02598    0.31731    0.02730    0.00000    0.00000    0.00000    0.00067   -0.00000    0.00073    0.00000    0.00000
   1.75078    0.31366    0.02861    0.32024    0.02787    0.00000    0.00000    0.00004    0.00065   -0.00004    0.00069    0.00000    0.00000
   2.00089    0.31297    0.02913    0.31877    0.02777    0.00000    0.00000    0.00002    0.00060   -0.00008    0.00073    0.00000    0.00000
   2.25100    0.31165    0.02863    0.31673    0.02692    0.00000    0.00000    0.00004    0.00064   -0.00008    0.00069    0.00000    0.00000
   2.50112    0.31333    0.02710    0.31816    0.02520    0.00000    0.00000    0.00005    0.00068   -0.00000    0.00067    0.00000    0.00000
   2.75123    0.31668    0.02364    0.32286    0.02717    0.00000    0.00000    0.00002    0.00072   -0.00001    0.00069    0.00000    0.00000
   3.00134    0.31658    0.02897    0.31936    0.02593    0.00000    0.00000   -0.00006    0.00061   -0.00000    0.00071    0.00000    0.00000
   3.25145    0.31632    0.03075    0.32089    0.02807    0.00000    0.00000   -0.00001    0.00068    0.00004    0.00066    0.00000    0.00000
   3.50156    0.31390    0.02752    0.31770    0.02602    0.00000    0.00000    0.00000    0.00075   -0.00003    0.00072    0.00000    0.00000
[...]

from which it is easy to create plots of density profiles. I don't know if this module replicates the functionality of the already present density module (I could not understand from here) so I apologize in advance if this is a replica of already existing work!

richardjgowers · 2016-01-27T13:52:32Z

package/MDAnalysis/analysis/lineardensity.py

Generally, you don't want to use lists [a, b, c, d] for numerical stuff, use a numpy array np.array([elem.total_mass() for elem in group]). When you put this into weights= in numpy histogram, it's likely being converted into an array there (if not already an array), so with each loop iteration you're having to make this array many times.

mattiafelice-palermo · 2016-01-27T14:17:27Z

Thanks @richardjgowers for the very useful inputs! I will look into them and commit once it's done

mattiafelice-palermo · 2016-01-27T20:26:10Z

I modified the code according to Richard's suggestions, I hope it is fine! Let me know if anything else can be improved. For example, when computing the density profiles on groups of atoms, the analysis is much slower. I wonder if it is only because of the additional computational load due to the calculation of the COMs or if there is any part of the code that has been written poorly/in a non efficient way.

richardjgowers · 2016-01-27T20:45:10Z

I think the groups are slow because they're done individually rather than all together. What would be really cool, (and is a separate issue), would be a subroutine which takes a series of groupings, and returns the center of mass/geometry for each.

So for groups [1, 2, 4, 5], [6, 8, 10] and [11, 15] we could do

# Array of all particles
[1, 2, 4, 5, 6, 8, 10, 11, 15]
# Pointers to starts and stops within the above array
starts = [0, 4, 8]
stops = [4, 8, -1]

do i = 1, nCOGs:
    do j = starts(i), stops(i):
        COG(i) = COG(i) + positions(j)

Then we'd probably write it in C like these are:
https://github.com/MDAnalysis/mdanalysis/blob/develop/package/MDAnalysis/lib/include/calc_distances.h

mattiafelice-palermo · 2016-01-27T21:45:36Z

That was actually my main guess. I think it would be cool to be able to obtain, at once, an array of properties for all residues/segments in an atom group as follows:

universe  = mda.Universe(PSF,DCD)
selection = universe.select_atoms('all')

# arrays containing properties for each residue or each segment
residues_COMs = selection.residues.centers_of_mass
segments_principal_axes = selection.segments.principal_axes

# instead of the slow
residues_COMs = np.array([elem.center_of_mass for elem in selection.residues])
segments_principal_axes = np.array([elem.principalAxes() for elem in selection.segments])

At least in our field of work this would come very handy as it often happens that we need to access these properties at once. Do you think this would be feasible and that the syntax above would be intuitive?

richardjgowers · 2016-01-27T21:50:54Z

Yeah, I think @dotsdl and I have talked about this before. In the upcoming topology revolution, we've done this for masses, so ResidueGroup.masses returns the sum over all atoms in each Residue. Then maybe ResidueGroup.positions should return the COM per Residue like we're talking about.

richardjgowers · 2016-01-28T10:40:11Z

package/MDAnalysis/analysis/lineardensity.py

You don't need to call np.array on these because they're always returned as arrays anyway

richardjgowers · 2016-01-28T10:51:11Z

Docstrings everywhere please, following these rules:
http://sphinxcontrib-napoleon.readthedocs.org/en/latest/example_numpy.html

To make it look something like:
https://github.com/MDAnalysis/mdanalysis/blob/develop/package/MDAnalysis/analysis/rdf.py#L34

mattiafelice-palermo · 2016-01-31T01:56:23Z

Hi! I have fixed hopefully most of the smaller issues and added docstrings wherever necessary. Being my first time writing docstrings in numpy style, please let me know if they can be improved in any way.

ResidueGroup.positions sounds nice. When you decide to implement this, let me know if I can be of help (I've never wrote anything in C/cython but could be a good place to start).

richardjgowers · 2016-01-31T17:13:24Z

package/MDAnalysis/analysis/lineardensity.py

You need to copy the same thing from the top of all the other files here

richardjgowers · 2016-01-31T17:18:25Z

Ok, so for the auto documentation to work, you need to go to package/doc/sphinx/source/documentation_pages, add a stub rst file in the analysis directory (it should just be a one line file, copy another one in there). Then you also need to add a link to analysis_modules.rst, maybe under Volumetric Analysis, again just follow the pattern with all the other analysis modules. This is what tells the doc builder to read the .py source to make this page:

http://pythonhosted.org/MDAnalysis/documentation_pages/analysis_modules.html

richardjgowers · 2016-01-31T17:32:21Z

Ok almost done,

add yourself to package/AUTHORS
add yourself alphabetically to package/doc/sphinx/source/conf.py in the authors variable
add a line or two in CHANGELOG mentioning the new module.

If you could add a quick and easy regression test using the PSF/DCD system that would be good. Just run it once on that trajectory, and add the results as a test so if it stops working we know.

Finally, if you want, you can add a few lines to the next release blog post, you can clone that repo and checkout the next-release branch and add a PR onto that branch.
MDAnalysis/MDAnalysis.github.io#8

Then maybe have another @MDAnalysis/coredevs have a look at this too

mattiafelice-palermo · 2016-02-03T01:05:12Z

So, I am writing a test case but I found an inconsistency with the new parallel version of AnalysisBase. In the new version, _setup_frames takes a universe object (previously it was a trajectory) - I don't remember if I modified that for a particular reason? Anyway, the test won't work because _setup_frames expects a trajectory object.

If there are no objections I guess it would be better to restore the previous behaviour (_setup_frames taking a trajectory object) and modify LinearDensity accordingly, unless there was a particular reason for that.

richardjgowers · 2016-02-03T10:11:23Z

package/MDAnalysis/analysis/lineardensity.py

You don't need to ask for Universe here, as it's always available as selection.universe

richardjgowers · 2016-02-03T10:14:23Z

Yeah, I think we needed the Universe in setup frames for parallel, can't remember why either. And yeah, for now just make this work in serial, then we can go back and fix it when parallel happens.

…ort order, added save method, use of in-place sum of np arrays

…ric Analysis

…tible with AnalysisBase of MDAnalysis 0.13.0

…oved print statements from test

mattiafelice-palermo · 2016-02-06T21:27:43Z

Ok so I think the option is now working as expected. If any other coredev wants to have a look at it please feel free :) Thanks!

Analysis - Linear Density module

richardjgowers · 2016-02-11T15:29:36Z

Hey Mattia. Looking good, congrats on your first PR!

mattiafelice-palermo · 2016-02-12T00:37:38Z

That's great, thank you Richard! :)

richardjgowers reviewed Jan 27, 2016
View reviewed changes

richardjgowers reviewed Jan 28, 2016
View reviewed changes

richardjgowers added this to the 0.14 milestone Jan 28, 2016

richardjgowers reviewed Jan 31, 2016
View reviewed changes

package/MDAnalysis/analysis/lineardensity.py Outdated

Copy link

Member

richardjgowers Jan 31, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to copy the same thing from the top of all the other files here

richardjgowers mentioned this pull request Jan 31, 2016

Mass COM calculation #686

Closed

richardjgowers added needs review Component-Analysis labels Feb 1, 2016

richardjgowers reviewed Feb 3, 2016
View reviewed changes

richardjgowers mentioned this pull request Feb 3, 2016

Parallel trajectory analysis (revisited) #618

Closed

6 tasks

mattiafelice-palermo added 5 commits February 6, 2016 22:05

Added lineardensity module

b6dfa88

Fixed: np arrays instead of lists for charges and masses, correct imp…

eadb51a

…ort order, added save method, use of in-place sum of np arrays

Fixed improper use of list comprehension; Added docstrings

f38bc45

Added standard header

a677115

Created stub rst for the analysis module and added link under Volumet…

5605394

…ric Analysis

mattiafelice-palermo added 4 commits February 6, 2016 22:05

Added myself to authors

a3a9550

Modified changelog

38d09a2

Lineardensity does not take universe as argument anymore and is compa…

90c574c

…tible with AnalysisBase of MDAnalysis 0.13.0

Test added

fd44b72

mattiafelice-palermo force-pushed the feature-LinearDensity branch from 3b103e0 to fd44b72 Compare February 6, 2016 21:06

Trajectory name taken from variable filename and not dcdfilename, rem…

0f0aefa

…oved print statements from test

richardjgowers added a commit that referenced this pull request Feb 11, 2016

Merge pull request #670 from mattihappy/feature-LinearDensity

8e06aa5

Analysis - Linear Density module

richardjgowers merged commit 8e06aa5 into MDAnalysis:develop Feb 11, 2016

Conversation

mattiafelice-palermo commented Jan 27, 2016

Uh oh!

richardjgowers Jan 27, 2016

Choose a reason for hiding this comment

Uh oh!

mattiafelice-palermo commented Jan 27, 2016

Uh oh!

mattiafelice-palermo commented Jan 27, 2016

Uh oh!

richardjgowers commented Jan 27, 2016

Uh oh!

mattiafelice-palermo commented Jan 27, 2016

Uh oh!

richardjgowers commented Jan 27, 2016

Uh oh!

richardjgowers Jan 28, 2016

Choose a reason for hiding this comment

Uh oh!

richardjgowers commented Jan 28, 2016

Uh oh!

mattiafelice-palermo commented Jan 31, 2016

Uh oh!

richardjgowers Jan 31, 2016

Choose a reason for hiding this comment

Uh oh!

richardjgowers commented Jan 31, 2016

Uh oh!

richardjgowers commented Jan 31, 2016

Uh oh!

mattiafelice-palermo commented Feb 3, 2016

Uh oh!

richardjgowers Feb 3, 2016

Choose a reason for hiding this comment

Uh oh!

richardjgowers commented Feb 3, 2016

Uh oh!

mattiafelice-palermo commented Feb 6, 2016

Uh oh!

richardjgowers commented Feb 11, 2016

Uh oh!

mattiafelice-palermo commented Feb 12, 2016

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants