Conversation
|
I cleaned up some code in |
|
fixed gnm issue, rebased and force-pushed |
|
... and travis exceeded time so I restarted the job. (We really need to reduce the test time #1191 ) |
|
Full test exceeded time again. I'll try rolling back whatever small code changes I made in analysis.gnm. (Locally takes 122.4 s on a single 3.1 GHz i7 core with most of the time spent on |
package/MDAnalysis/analysis/gnm.py
Outdated
| if jcounter > icounter and _dsq(positions[icounter], positions[jcounter]) <= cutoffsq: | ||
| iresidue, jresidue = residue_index_map[icounter], residue_index_map[jcounter] | ||
| if self.MassWeight: | ||
| contact = 1.0 / (len(self.ca.residues[iresidue].atoms) * len(self.ca.residues[jresidue].atoms)) ** 0.5 |
This line was responsible for the huge amount of time that the GNM test TestGNM.test_closeContactGNMAnalysis took.
Can you push your fix to a new branch separate from this?
This logic also appears in the other class in this file. It would be nice if you could change it there too.
Will fix elsewhere; apparently it only mattered in the instance where I fixed it because many more atoms were used.
I can try to disentangle, but there were also some cleanups in gnm in an earlier commit. I will look at a local interactive rebase and see if I can merge these commits.
Er, actually, where did you see another instance of this problem? GNMAnalysis.generate_kirchoff() does not use custom weights. I am not sure where else this needs fixing.
Looks like I misremembered. I thought both classes use the custom weights.
See PR #1272 – all analysis.gnm changes were removed from this PR.
@richardjgowers / @dotsdl would you expect a lookup such as len(self.ca.residues[iresidue].atoms) to be slow? In particular, is self.ca.residues the bottleneck?
So assuming self.ca is an AtomGroup:
- ca.residues will be performing a np.unique on all the residue indices in the group,
- residues[iresidue] is just slicing a numpy array,
- residue.atoms will be looking up a single entry in a residue->atoms table.
So nothing too offensively slow from what I can see, but you might be able to calculate them all ahead of time (as these memberships won't change) if you're looping over these things a lot.
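The suggestion to compute the per-residue values ahead of time can be sketched as follows. This is a minimal illustration, not the code from the PR: `residue_sizes` is a plain list standing in for the atom counts that the original code looks up via `len(self.ca.residues[i].atoms)`, and both function names are hypothetical.

```python
import math


def contact_weights_naive(residue_sizes, pairs):
    """Recompute the square root for every contact pair."""
    return [1.0 / math.sqrt(residue_sizes[i] * residue_sizes[j])
            for i, j in pairs]


def contact_weights_cached(residue_sizes, pairs):
    """Take each square root once, then reuse it for every pair."""
    # sqrt(a) * sqrt(b) == sqrt(a * b) for non-negative a and b, so the
    # cached values give the same weights up to floating-point rounding.
    sqrt_sizes = [math.sqrt(n) for n in residue_sizes]
    return [1.0 / (sqrt_sizes[i] * sqrt_sizes[j]) for i, j in pairs]


# Hypothetical residue sizes (atoms per residue) and contact pairs.
residue_sizes = [5, 8, 11, 4]
pairs = [(0, 1), (0, 2), (1, 3), (2, 3)]

naive = contact_weights_naive(residue_sizes, pairs)
cached = contact_weights_cached(residue_sizes, pairs)
```

In this sketch the lookup is already cheap, so the gain is small. In the real loop the expensive part is the repeated `self.ca.residues` access, which a one-time cache avoids entirely.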
|
Optimized a single line in (Also rebased and force pushed.) |
package/MDAnalysis/analysis/gnm.py
Outdated
| cutoffsq = self.cutoff ** 2 | ||
|
|
||
| # cache sqrt of residue sizes (slow) so that sr[i]*sr[j] == sqrt(r[i]*r[j]) | ||
| sqrt_res_sizes = np.sqrt([r.atoms.n_atoms for r in self.ca.residues]) if self.MassWeight else None |
Btw, is that really a weight defined by the mass? It looks like only an estimate of the actual mass. Would size be a better name?
Since you are refactoring this code already, it would be nice if this variable was renamed weights and then had the options None or 'size'.
It's not really a weight, but that's what was in the original code.
I'll rename the variable, leave the kwarg as it is, and add a note.
|
I moved the GNM improvements to separate PR #1272 and rebased and force-pushed. |
|
Yes, that's what's being done now. Speed up of about 5 times.
…--
Oliver Beckstein
email: orbeckst@gmail.com
On Mar 30, 2017, at 5:43, Richard Gowers ***@***.***> wrote:
but you might be able to calculate them all ahead of time (as these memberships won't change) if you're looping over these things a
|
|
rebased and force pushed |
|
All analysis docs are now numpy style. |
|
Someone has to review it, otherwise it can't be merged... |
| .. _`10.1002/prot.340090204`: http://dx.doi.org/10.1002/prot.340090204 | ||
|
|
||
|
|
||
| Example |
Just FYI: with numpy docs, never use Example or Examples in a normal section context; sphinx napoleon rewrites it. So you have to be a bit creative with labelling sections in the main part of the page.
kain88-de
left a comment
I only found minor issues skimming over this. Good work!
|
|
||
| .. NOTE:: If failure occurs be sure to check the segment identification. | ||
|
|
||
| .. note:: If failure occurs be sure to check the segment identification. |
Capital N? Why not use the numpy Notes section?
Changed to Notes section (rebased so it now shows up in commit 056f097 )
| >>> hausdorff_wavg(P,Q[::-1]) # weighted avg hausdorff dist w/ Q reversed | ||
| 2.5669644353703447 | ||
|
|
||
| Notes |
Yes, it's Notes as in https://github.com/numpy/numpy/blob/master/doc/example.py
|
|
||
|
|
||
| .. NOTE:: If failure occurs be sure to check the segment identification. | ||
| .. Note:: If failure occurs be sure to check the segment identification. |
Why not use the numpy Notes section?
Changed to Notes section (rebased so it now shows up in commit 056f097 )
Do not use Examples as a heading UNLESS inside a function/class doc because the NumPy reST parser changes it to a rubric heading. This breaks document structure.
- converted all docs to numpy style
- added additional references
- see also: @tylerjereddy's scipy.spatial.distance.directed_hausdorff()

- analysis.align: formatting fixes
- analysis.contacts: formatting fixes
- analysis.diffusionmap: formatting fixes and section headers (cannot use 'Examples' as a normal section header because it is rewritten by sphinx.ext.napoleon as a rubric)
- analysis.distances: numpyfied
- analysis.hbonds.hbond_analysis: numpyfied

- numpified docs
- removed kwargs start and end for resid selection from helanal_main() and helanal_trajectory() because this can easily be done inside the selection string and neither start nor end is used further in the code. helanal_trajectory() uses the resids of the first and last residue extensively for reporting, so we now get these resids from the selection itself.
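The numpy-style conventions discussed in this review (a Notes section instead of a `.. note::` directive, a See Also section instead of `.. SeeAlso::`, and Examples used only inside a function or class docstring) can be illustrated with a small sketch. The function `scaled_sum` is hypothetical and is not part of MDAnalysis.

```python
def scaled_sum(values, factor=1.0):
    """Return the sum of `values` multiplied by `factor`.

    Parameters
    ----------
    values : iterable of float
        Numbers to add up.
    factor : float, optional
        Multiplier applied to the sum.

    Returns
    -------
    float
        The scaled sum.

    See Also
    --------
    sum : built-in used for the unscaled sum

    Notes
    -----
    An empty `values` gives 0 times `factor`.

    Examples
    --------
    >>> scaled_sum([1.0, 2.0, 3.0], factor=2.0)
    12.0
    """
    return factor * sum(values)
```

Inside a docstring like this one, the Examples header is safe to use. The restriction mentioned in the commit message applies to section headers in the main body of a module-level page.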
|
Incorporated @kain88-de 's suggestions, rebased into the appropriate commit, and rewrote some of the commit messages. I think I am done. When @jbarnoud finishes PR #1247 then we will have transitioned all our docs to numpy style. |
| :func:`sequence_alignment`, which does not require external | ||
| programs. | ||
|
|
||
| .. SeeAlso:: |
Any idea how to change those automatically to the old style? My first try find . -name '*py' -exec sed -i "s/.. SeeAlso::/See Also \n--------\n/" {} \; doesn't work. It can't deal with potential indentation of the initial see also paragraph.
I had a quick look and I ended up with this:
sed -re 's/^(( *).. SeeAlso::( *))/\2See Also\n\2--------\n\2/g' package/MDAnalysis/lib/util.py | less
You need the -r or you will have to escape all the parentheses, and you may not have the \2 syntax. The \2 is there to carry over the indentation.
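For readers who prefer not to rely on sed, the same indentation-preserving substitution can be sketched with Python's `re` module. This is an illustrative equivalent and not what was run in the PR; the helper name `convert_seealso` is hypothetical.

```python
import re

# Capture the leading spaces so they can be repeated on each new line.
# The dots of the directive are escaped here so they match literally.
_SEEALSO = re.compile(r"^( *)\.\. SeeAlso:: *", flags=re.MULTILINE)


def convert_seealso(text):
    """Rewrite '.. SeeAlso::' directives as numpy-style 'See Also' sections."""
    # \1 re-inserts the captured indentation before the header, the
    # underline, and the remainder of the original line.
    return _SEEALSO.sub(r"\1See Also\n\1--------\n\1", text)


before = "    Some text.\n\n    .. SeeAlso:: :func:`rotation_matrix`\n"
after = convert_seealso(before)
```

As with the sed command, content that starts on the line after the directive keeps its original, deeper indentation and has to be fixed by hand.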
* replace .. SeeAlso:: with numpy section
* fix obvious formatting errors
* addressed comments from @jbarnoud
* fix rendering issues
* some more usability changes
* add ExtendedPDBReader to docs
* fix last link issue

I used the following command to replace all occurrences.

find . -name '*py' -exec sed -rie 's/^(( *).. SeeAlso::( *))/\2See Also\n\2--------\n\2/g' {} \;

Following a comment of @jbarnoud at #1240 (comment)
Use this WIP PR to accumulate doc fixes for analysis. When we have enough we can merge or squash-merge. I didn't want to do a proper PR for a single simple reST fix...
Changes made in this Pull Request:
Note that @jbarnoud has been tackling all other docs in PR #1247.
PR Checklist