Make the TPR parser a bit faster by jbarnoud · Pull Request #2804 · MDAnalysis/mdanalysis

jbarnoud · 2020-06-29T16:14:18Z

A function in the TPR parser calls list.pop thousands of times, which is
slow. This commit avoids that expensive call.

Taking the TPR from
https://github.com/bioexcel/covid_modelling_simulation_data/tree/master/spike_protein/full_spike/trimer
the parsing time on my computer goes from 25.6s to 9.39s. On a more
pathological TPR file, it goes from 3 minutes to about 6s.
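
For readers without the diff at hand, here is a minimal sketch of the kind of pattern involved. It is not the code from this PR. It assumes the slow call is `list.pop(0)`, which shifts every remaining element and so costs O(n) per call, and that the data is a flat list of records shaped as one type value followed by `natoms` atom indices. The function names `chunks_with_pop` and `chunks_with_index` are hypothetical.

```python
def chunks_with_pop(flat, natoms):
    """Slow pattern: consume records by popping from the front of the list."""
    flat = list(flat)  # work on a copy, because pop mutates the list
    while flat:
        flat.pop(0)  # discard the type value; each pop(0) is O(len(flat))
        yield [flat.pop(0) for _ in range(natoms)]


def chunks_with_index(flat, natoms):
    """Same records, read by slicing at computed offsets, with no mutation."""
    step = natoms + 1  # one type value plus natoms atom indices per record
    for start in range(0, len(flat), step):
        yield flat[start + 1:start + step]


# Hypothetical input: three records of (type, atom1, atom2).
flat = [7, 0, 1, 7, 1, 2, 7, 2, 3]
print(list(chunks_with_pop(flat, 2)))
print(list(chunks_with_index(flat, 2)))
```

Both generators yield the same records as long as the list length is a multiple of `natoms + 1`. The index-based version does a constant amount of work per element, so its total cost grows linearly with the list length instead of quadratically.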

Changes made in this Pull Request:

PR Checklist

~~- [ ] Tests?~~
~~- [ ] Docs?~~

CHANGELOG updated?
Issue raised/referenced?


richardjgowers

@jbarnoud looks solid, thanks!

codecov · 2020-06-29T20:36:29Z

Codecov Report

Merging #2804 into develop will decrease coverage by 0.13%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop    #2804      +/-   ##
===========================================
- Coverage    92.22%   92.08%   -0.14%     
===========================================
  Files          184      183       -1     
  Lines        24141    23670     -471     
  Branches      3123     3083      -40     
===========================================
- Hits         22263    21797     -466     
+ Misses        1813     1808       -5     
  Partials        65       65

Impacted Files	Coverage Δ
package/MDAnalysis/topology/tpr/obj.py	`96.72% <100.00%> (-0.06%)`	⬇️
package/MDAnalysis/auxiliary/base.py	`90.75% <0.00%> (-0.57%)`	⬇️
package/MDAnalysis/coordinates/chain.py	`91.94% <0.00%> (-0.37%)`	⬇️
package/MDAnalysis/coordinates/base.py	`93.88% <0.00%> (-0.33%)`	⬇️
package/MDAnalysis/coordinates/TRZ.py	`88.16% <0.00%> (-0.27%)`	⬇️
package/MDAnalysis/coordinates/GSD.py	`88.63% <0.00%> (-0.26%)`	⬇️
package/MDAnalysis/coordinates/chemfiles.py	`88.12% <0.00%> (-0.22%)`	⬇️
package/MDAnalysis/coordinates/INPCRD.py	`93.33% <0.00%> (-0.22%)`	⬇️
package/MDAnalysis/coordinates/GMS.py	`92.30% <0.00%> (-0.16%)`	⬇️
package/MDAnalysis/coordinates/TXYZ.py	`93.18% <0.00%> (-0.16%)`	⬇️
... and 38 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 8314f7d...2fc42b3.

orbeckst · 2020-06-29T20:54:18Z

Could be backported to 1.0.1 via PR #2798.

* Make the TPR parser a bit faster

  A function in the TPR parser calls list.pop thousands of times, which is slow. This commit avoids that expensive call. Taking the TPR from https://github.com/bioexcel/covid_modelling_simulation_data/tree/master/spike_protein/full_spike/trimer the parsing time on my computer goes from 25.6s to 9.39s. On a more pathological TPR file, it goes from 3 minutes to about 6s.

* Update changelog for #2804

Co-authored-by: Richard Gowers <richardjgowers@gmail.com>


jbarnoud added 2 commits June 29, 2020 17:05

Update changelog for #2804

174dd40

richardjgowers approved these changes Jun 29, 2020


orbeckst assigned richardjgowers Jun 29, 2020

Merge branch 'develop' into slightly-faster-tpr

2fc42b3

richardjgowers merged commit 61e236d into develop Jul 2, 2020

richardjgowers deleted the slightly-faster-tpr branch July 2, 2020 17:16

fiona-naughton added enhancement Component-Readers labels Sep 25, 2023

richardjgowers merged 3 commits into develop from slightly-faster-tpr

4 participants
