Fixed the run() method of analysis.hydrogenbonds.hbond_analysis method to use atom indices instead of ids#2572
Conversation
Fixed the attribute error (MDAnalysis#2396). Added test cases that check the ids and indices of atoms in a hydrogen bond. Co-authored-by: p-j-smith <paul.smith@kcl.ac.uk>
Codecov Report
@@ Coverage Diff @@
## develop #2572 +/- ##
===========================================
- Coverage 90.71% 90.70% -0.02%
===========================================
Files 174 174
Lines 23555 23554 -1
Branches 3073 3073
===========================================
- Hits 21369 21365 -4
- Misses 1565 1569 +4
+ Partials 621 620 -1
Continue to review full report at Codecov.
|
orbeckst
left a comment
There was a problem hiding this comment.
Minor issues, see comments.
Please also update CHANGELOG as this fixes an issue.
testsuite/MDAnalysisTests/analysis/test_hydrogenbonds_analysis.py
Outdated
Show resolved
Hide resolved
testsuite/MDAnalysisTests/analysis/test_hydrogenbonds_analysis.py
Outdated
Show resolved
Hide resolved
Tidied up the mock universe in TestHydrogenBondAnalysisMock. Using numpy asserts rather than bare asserts.
Co-authored-by: p-j-smith <paul.smith@kcl.ac.uk>
There was a problem hiding this comment.
Maybe replace hasattr(u, 'bonds') by hasattr(u._topology, 'bonds') for performance reasons (see comment below). Since I don't know how long the analysis usually takes (per frame), I don't know if this is an issue. So I leave the decision up to you. If you decide to follow my suggestion, the reason should be commented in the code.
|
The bonds lookup is because it actually calculates them too right? I think
that example is why I wanted to make Topology first class, so you could do
“‘bonds’ in u.topology”, else there’s not a clean/canonical way to check
for existence of attributes.
…On Sun, Mar 8, 2020 at 21:14, Johannes Zeman ***@***.***> wrote:
***@***.**** requested changes on this pull request.
Maybe replace hasattr(u, 'bonds') by hasattr(u._topology, 'bonds') for
performance reasons (see comment). Since I don't know how long the analysis
usually takes (per frame), I don't know if this is an issue. So I leave the
decision up to you. If you decide to follow my suggestion, the reason
should be commented in the code.
------------------------------
In package/MDAnalysis/analysis/hydrogenbonds/hbond_analysis.py
<#2572 (comment)>
:
> @@ -408,7 +409,7 @@ def _get_dh_pairs(self):
# If donors_sel is not provided, use topology to find d-h pairs
if not self.donors_sel:
if not (hasattr(self.u, 'bonds') and len(self.u.bonds) != 0):
I also posted
<#2396 (comment)>
this in the related issue for reference:
hasattr(u, 'bonds') is an *incredibly slow* test because it involves a
lot of object instantiations under the hood. hasattr(u._topology, 'bonds')
is *much* faster:
import MDAnalysis as mda
from MDAnalysis.tests.datafiles import GRO, TPR
u = mda.Universe(GRO) # universe *without* bonds
%timeit hasattr(u, 'bonds')
# 758 µs ± 4.33 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit hasattr(u._topology, 'bonds')
# 216 ns ± 2.88 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
u = mda.Universe(TPR) # universe *with* bonds
%timeit hasattr(u, 'bonds')
# 88.8 ms ± 524 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) <== AAARGH!
%timeit hasattr(u._topology, 'bonds')
# 86.4 ns ± 0.266 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
You see that in the latter example, hasattr(u, 'bonds') is *a million
times slower* than hasattr(u._topology, 'bonds'). This may be irrelevant
for a setup routine that is run only once, but when being called in a loop, hasattr(u,
'bonds') is a nice way of very thoroughly killing performance. 😁
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2572?email_source=notifications&email_token=ACGSGB5L4LVQ27RQ5Y6P7DTRGQDEPA5CNFSM4K7F64K2YY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOCYNNRQA#pullrequestreview-370858176>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACGSGB7WAQRDWCWR4RY5NXTRGQDEPANCNFSM4K7F64KQ>
.
|
It is a million times faster to access. See MDAnalysis#2396 (comment)
|
@zemanj are you happy with the changes? I'll leave it to you to shepherd the PR to the final merge. |
zemanj
left a comment
There was a problem hiding this comment.
Looks awesome, just found some minor thingies.
Your tests are really neat, especially the ASCII art 😁👍
testsuite/MDAnalysisTests/analysis/test_hydrogenbonds_analysis.py
Outdated
Show resolved
Hide resolved
testsuite/MDAnalysisTests/analysis/test_hydrogenbonds_analysis.py
Outdated
Show resolved
Hide resolved
testsuite/MDAnalysisTests/analysis/test_hydrogenbonds_analysis.py
Outdated
Show resolved
Hide resolved
|
Thank you @bieniekmateusz @p-j-smith! |
Changes made in this Pull Request:
analysis.hydrogenbonds.hbond_analysis.HydrogenBondsAnalysiscurrently stores the ids of atoms in a hydrogen bond, rather than the indices. The documentation states, and the helper methods assume, that it is atom indices that are stored.PR Checklist
Co-authored-by: @p-j-smith paul.smith@kcl.ac.uk