Skip to content

Conversation

@richardjgowers
Copy link
Contributor

  • makes the step through the trajectory for analysis by default gather 500 frames of data
  • for standard settings openmm rfe runs, this is a step of 10
  • improve performance by loading PDB topology once, not 11 (=nlambda) times
  • add progress bar to cli call for sanity

- makes the step through the trajectory for analysis by default gather 500 frames of data
- for standard settings openmm rfe runs, this is a step of 10
- improve performance by loading PDB topology once, not 11 (=nlambda) times
- add progress bar to cli call for sanity
@richardjgowers
Copy link
Contributor Author

@mikemhenry this reduces the total runtime for the dataset you put up from ~40 minutes to 45 seconds for me. If you can check that this patch gives a similar perf bump to acceptable levels then we should be good to go.

The stdout produced is 27MB, would you still want this bouncing off a file instead or is piping that much data OK?

@codecov
Copy link

codecov bot commented Nov 15, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

❗ No coverage uploaded for pull request base (main@bddd32d). Click here to learn what that means.

Additional details and impacted files
@@           Coverage Diff           @@
##             main      #26   +/-   ##
=======================================
  Coverage        ?   60.06%           
=======================================
  Files           ?        6           
  Lines           ?      293           
  Branches        ?        0           
=======================================
  Hits            ?      176           
  Misses          ?      117           
  Partials        ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mikemhenry
Copy link
Contributor

I think that is fine, but with tdqm, will the output be something that we can easily read in with a logging tool? or is tdqm smart and doesn't output if its not going to stdout

@mikemhenry
Copy link
Contributor

I will test it on the cluster to see how long it takes

@mikemhenry
Copy link
Contributor

Very fast now and it looks like the TDQM bar doesn't end up in the output so calling the command like this $ openfe_analysis . | tee output.log lets you view the progress bar and save the output.

$ head -c 500 output.log 
{"protein_RMSD": [[0.0, 1.2808896885129466, 1.3386231474354495, 1.3704502925096826, 1.340629459943356, 1.2933647321006816, 1.0575695876290574, 1.1609251163130452, 1.1765957326639387, 1.325367601887441, 1.284754975758849, 1.2320423906041964, 1.217751234670653, 1.4551268444519825, 1.3134513541802377, 1.1945956838969465, 1.2136881911195208, 1.0910438893996461, 1.4721771533412353, 1.2923550390274625, 1.1123489761886183, 1.2387564476170243, 1.0531361145744993, 1.2627956618008394, 1.487708349464851, 1

@richardjgowers
Copy link
Contributor Author

Tqdm goes to stderr, probably for this reason?

@richardjgowers richardjgowers merged commit c0059c3 into main Nov 15, 2023
@richardjgowers richardjgowers deleted the perf_pass branch November 15, 2023 16:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants