Skip to content

Analysis - Linear Density module#670

Merged
richardjgowers merged 10 commits intoMDAnalysis:developfrom
mattiafelice-palermo:feature-LinearDensity
Feb 11, 2016
Merged

Analysis - Linear Density module#670
richardjgowers merged 10 commits intoMDAnalysis:developfrom
mattiafelice-palermo:feature-LinearDensity

Conversation

@mattiafelice-palermo
Copy link
Contributor

As discussed in here.

This is a quite simple analysis module that, given a NVT trajectory in an orthorombic simulation cell, outputs a column based text file with mass and charge densities along the three axes of the cell.

A minimal working example would be as follows:

import MDAnalysis as mda
from MDAnalysisTests.datafiles import DCD, PSF
from lineardensity import LinearDensity 

universe  = mda.Universe(PSF,DCD)
sel_string = 'all'
selection = universe.select_atoms(sel_string)

job = LinearDensity(universe, selection, description=sel_string, grouping='atoms', binsize=0.25)
job.run()

where grouping specifies the group of atoms on which the density distribution will be computed (it takes the usual arguments atoms, residues, fragments etc.) and binsize is the resolution of the computed histogram.
The analysis module outputs a file named dcdfilename.description_grouping.ldens

#1 coord [Ang] 2-7 mass density (x,sx,y,sz,z,sz) [g/cm^3] 8-13 charge density (x,sx,y,sz,z,sz) [e/A^3]
#  Average density: 0.193193630269 g/cm3
   0.00000    0.32147    0.02698    0.31465    0.02528    0.00000    0.00000    0.00008    0.00071   -0.00011    0.00065    0.00000    0.00000
   0.25011    0.32246    0.02996    0.31789    0.02887    0.00000    0.00000    0.00001    0.00070    0.00012    0.00067    0.00000    0.00000
   0.50022    0.32115    0.02701    0.31733    0.02785    0.00000    0.00000    0.00001    0.00071    0.00007    0.00068    0.00000    0.00000
   0.75033    0.32143    0.03006    0.31959    0.02885    0.00000    0.00000    0.00005    0.00066    0.00013    0.00067    0.00000    0.00000
   1.00045    0.31694    0.02879    0.31757    0.02736    0.00000    0.00000   -0.00007    0.00069    0.00014    0.00064    0.00000    0.00000
   1.25056    0.32110    0.02822    0.31744    0.02814    0.00000    0.00000   -0.00008    0.00068   -0.00001    0.00070    0.00000    0.00000
   1.50067    0.31358    0.02598    0.31731    0.02730    0.00000    0.00000    0.00000    0.00067   -0.00000    0.00073    0.00000    0.00000
   1.75078    0.31366    0.02861    0.32024    0.02787    0.00000    0.00000    0.00004    0.00065   -0.00004    0.00069    0.00000    0.00000
   2.00089    0.31297    0.02913    0.31877    0.02777    0.00000    0.00000    0.00002    0.00060   -0.00008    0.00073    0.00000    0.00000
   2.25100    0.31165    0.02863    0.31673    0.02692    0.00000    0.00000    0.00004    0.00064   -0.00008    0.00069    0.00000    0.00000
   2.50112    0.31333    0.02710    0.31816    0.02520    0.00000    0.00000    0.00005    0.00068   -0.00000    0.00067    0.00000    0.00000
   2.75123    0.31668    0.02364    0.32286    0.02717    0.00000    0.00000    0.00002    0.00072   -0.00001    0.00069    0.00000    0.00000
   3.00134    0.31658    0.02897    0.31936    0.02593    0.00000    0.00000   -0.00006    0.00061   -0.00000    0.00071    0.00000    0.00000
   3.25145    0.31632    0.03075    0.32089    0.02807    0.00000    0.00000   -0.00001    0.00068    0.00004    0.00066    0.00000    0.00000
   3.50156    0.31390    0.02752    0.31770    0.02602    0.00000    0.00000    0.00000    0.00075   -0.00003    0.00072    0.00000    0.00000
[...]

from which it is easy to create plots of density profiles. I don't know if this module replicates the functionality of the already present density module (I could not understand from here) so I apologize in advance if this is a replica of already existing work!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, you don't want to use lists [a, b, c, d] for numerical stuff, use a numpy array np.array([elem.total_mass() for elem in group]). When you put this into weights= in numpy histogram, it's likely being converted into an array there (if not already an array), so with each loop iteration you're having to make this array many times.

@mattiafelice-palermo
Copy link
Contributor Author

Thanks @richardjgowers for the very useful inputs! I will look into them and commit once it's done

@mattiafelice-palermo
Copy link
Contributor Author

I modified the code according to Richard's suggestions, I hope it is fine! Let me know if anything else can be improved. For example, when computing the density profiles on groups of atoms, the analysis is much slower. I wonder if it is only because of the additional computational load due to the calculation of the COMs or if there is any part of the code that has been written poorly/in a non efficient way.

@richardjgowers
Copy link
Member

I think the groups are slow because they're done individually rather than all together. What would be really cool, (and is a separate issue), would be a subroutine which takes a series of groupings, and returns the center of mass/geometry for each.

So for groups [1, 2, 4, 5], [6, 8, 10] and [11, 15] we could do

# Array of all particles
[1, 2, 4, 5, 6, 8, 10, 11, 15]
# Pointers to starts and stops within the above array
starts = [0, 4, 8]
stops = [4, 8, -1]

do i = 1, nCOGs:
    do j = starts(i), stops(i):
        COG(i) = COG(i) + positions(j)

Then we'd probably write it in C like these are:
https://github.com/MDAnalysis/mdanalysis/blob/develop/package/MDAnalysis/lib/include/calc_distances.h

@mattiafelice-palermo
Copy link
Contributor Author

That was actually my main guess. I think it would be cool to be able to obtain, at once, an array of properties for all residues/segments in an atom group as follows:

universe  = mda.Universe(PSF,DCD)
selection = universe.select_atoms('all')

# arrays containing properties for each residue or each segment
residues_COMs = selection.residues.centers_of_mass
segments_principal_axes = selection.segments.principal_axes

# instead of the slow
residues_COMs = np.array([elem.center_of_mass for elem in selection.residues])
segments_principal_axes = np.array([elem.principalAxes() for elem in selection.segments])

At least in our field of work this would come very handy as it often happens that we need to access these properties at once. Do you think this would be feasible and that the syntax above would be intuitive?

@richardjgowers
Copy link
Member

Yeah, I think @dotsdl and I have talked about this before. In the upcoming topology revolution, we've done this for masses, so ResidueGroup.masses returns the sum over all atoms in each Residue. Then maybe ResidueGroup.positions should return the COM per Residue like we're talking about.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to call np.array on these because they're always returned as arrays anyway

@richardjgowers
Copy link
Member

@richardjgowers richardjgowers added this to the 0.14 milestone Jan 28, 2016
@mattiafelice-palermo
Copy link
Contributor Author

Hi! I have fixed hopefully most of the smaller issues and added docstrings wherever necessary. Being my first time writing docstrings in numpy style, please let me know if they can be improved in any way.

ResidueGroup.positions sounds nice. When you decide to implement this, let me know if I can be of help (I've never wrote anything in C/cython but could be a good place to start).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to copy the same thing from the top of all the other files here

@richardjgowers
Copy link
Member

Ok, so for the auto documentation to work, you need to go to package/doc/sphinx/source/documentation_pages, add a stub rst file in the analysis directory (it should just be a one line file, copy another one in there). Then you also need to add a link to analysis_modules.rst, maybe under Volumetric Analysis, again just follow the pattern with all the other analysis modules. This is what tells the doc builder to read the .py source to make this page:

http://pythonhosted.org/MDAnalysis/documentation_pages/analysis_modules.html

@richardjgowers
Copy link
Member

Ok almost done,

  • add yourself to package/AUTHORS
  • add yourself alphabetically to package/doc/sphinx/source/conf.py in the authors variable
  • add a line or two in CHANGELOG mentioning the new module.

If you could add a quick and easy regression test using the PSF/DCD system that would be good. Just run it once on that trajectory, and add the results as a test so if it stops working we know.

Finally, if you want, you can add a few lines to the next release blog post, you can clone that repo and checkout the next-release branch and add a PR onto that branch.
MDAnalysis/MDAnalysis.github.io#8

Then maybe have another @MDAnalysis/coredevs have a look at this too

@mattiafelice-palermo
Copy link
Contributor Author

So, I am writing a test case but I found an inconsistency with the new parallel version of AnalysisBase. In the new version, _setup_frames takes a universe object (previously it was a trajectory) - I don't remember if I modified that for a particular reason? Anyway, the test won't work because _setup_frames expects a trajectory object.

If there are no objections I guess it would be better to restore the previous behaviour (_setup_frames taking a trajectory object) and modify LinearDensity accordingly, unless there was a particular reason for that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to ask for Universe here, as it's always available as selection.universe

@richardjgowers
Copy link
Member

Yeah, I think we needed the Universe in setup frames for parallel, can't remember why either. And yeah, for now just make this work in serial, then we can go back and fix it when parallel happens.

@mattiafelice-palermo
Copy link
Contributor Author

Ok so I think the option is now working as expected. If any other coredev wants to have a look at it please feel free :) Thanks!

richardjgowers added a commit that referenced this pull request Feb 11, 2016
@richardjgowers richardjgowers merged commit 8e06aa5 into MDAnalysis:develop Feb 11, 2016
@richardjgowers
Copy link
Member

Hey Mattia. Looking good, congrats on your first PR!

@mattiafelice-palermo
Copy link
Contributor Author

That's great, thank you Richard! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants