From b58efe95975a19e41706c3d6b350de2583d9943a Mon Sep 17 00:00:00 2001 From: "David W.H. Swenson" Date: Wed, 12 Apr 2023 08:42:17 -0500 Subject: [PATCH 1/6] CLI tutorial (markdown) This adds a basic CLI tutorial. The goal is to walk the user through the process of setting up, running, and gathering results using the CLI. Audience is people for whom this is their first experience with OpenFE, and they're not super into Python. They already know the science, however, and may know some of the tools in this space (LOMAP, etc). --- easyCampaign/cli-tutorial.md | 155 +++++++++++++++++++++++++++++++++++ 1 file changed, 155 insertions(+) create mode 100644 easyCampaign/cli-tutorial.md diff --git a/easyCampaign/cli-tutorial.md b/easyCampaign/cli-tutorial.md new file mode 100644 index 0000000..462f7a9 --- /dev/null +++ b/easyCampaign/cli-tutorial.md @@ -0,0 +1,155 @@ +# Relative Free Energies with the OpenFE CLI + +This tutorial will show how to use the OpenFE command line interface to get +free energies -- with no Python at all! This will work for simple setups; you +may need to use the Python interface for more complicated ones. + +The entire process of running the campaign of simulations is split into 3 +stages, each of which corresponds to a CLI command: + +1. Setting up the campaign by creating files that describe each of the individual + simulations to run. +2. Running the simulations. +3. Gathering the results of separate simulations into a single table. + +To work through this tutorial, let's start out with a fresh directory +containing files from the tutorial in our [examples +repository](https://github.com/OpenFreeEnergy/ExampleNotebooks). 
If +`$EXAMPLES_REPO` is a path to a local copy of that repository, then after +switching to an empty directory, you can get the files with: + +```bash +cp $EXAMPLES_REPO/easyCampaign/molecules/rbfe/* ./ +``` + +Then when you run `ls`, you should see that your directory has two files in it: +`p38_old_ligands.sdf` and `p38_old_protein.pdb`. That will be the starting +point for the tutorial. + +## Setting up the campaign + +The CLI makes setting up the simulation very easy -- it's just a single CLI +command. There are separate commands for binding free energy and hydration free +energy setups. + +For RBFE campaigns, the relevant command is `openfe plan-rbfe-network`. For +RHFE, the command is `openfe plan-rhfe-network`. They work mostly the same, +except that the RHFE planner does not take a protein. In this tutorial, we'll +do an RBFE calculation. The only difference for RHFE is in the setup stage -- +running the simulations and gathering the results are the same. + +To run the setup, we'll specify the protein using the `-p` option, and we'll +tell it search for SDF/MOL2 files in the current directory using `-M ./`. We'll tell it to output into the same directory that we're working in with the `-o ./` option. + +```bash +openfe plan-rbfe-network -p p38_old_protein.pdb -M ./ -o ./ +``` + +Planning the campaign may a take a few minutes, as it tries to find the best +network from all possible transformations. This will create a file for each +leg that we will calculate, all within a directory called `transformations`. +Now you're ready to run the simulations! 
Let's look at the structure of the +`transformations` directory: + + + + +```text +transformations +├── lig_p38a_2aa_lig_p38a_2z +│   ├── complex +│   │   └── openfe-tutorial_easy_rbfe_lig_p38a_2aa_receptor_lig_p38a_2z_receptor.json +│   └── solvent +│   └── openfe-tutorial_easy_rbfe_lig_p38a_2aa_solvent_lig_p38a_2z_solvent.json +├── lig_p38a_2aa_lig_p38a_3fly +│   ├── complex +│   │   └── openfe-tutorial_easy_rbfe_lig_p38a_2aa_receptor_lig_p38a_3fly_receptor.json +│   └── solvent +│   └── openfe-tutorial_easy_rbfe_lig_p38a_2aa_solvent_lig_p38a_3fly_solvent.json +[continues] +``` + +There is a subdirectory for each edge, named according to the ligand pair. +Within that, there are directories for the two "legs" associated with this +ligand transformation: the ligand transformation in solvent, and the ligand +tranformation complexed with the receptor. Each JSON file represents a single +leg to run, and contains all the necessary information to run that leg. + +Note that this specific setup makes a number of choices for you. All of +these choices can be customized in the Python API, and some can be customized +using the CLI. To see additional CLI options, use `openfe plan-rbfe-network +--help`. Here are the specifics on how these simulation are set up: + +1. LOMAP is used to generate the atom mappings between ligands. +2. The network is a minimal spanning tree, with the default LOMAP score used to + score the mappings. +3. Solvent is water with NaCl at an ionic strength of 0.15 M (neutralized). +4. The protocol used is OpenFE's OpenMM RFE protocol, with default settings. + + + + +## Running the simulations + +In principle, you can run each simulation on your local machine with something +like: + +``` +# this will take a long time! 
+for file in transformations/*/*/*.json; do + relpath=${file:16} # strip off "transformations/" + dirpath=${relpath%.*} # strip off final ".json" + openfe quickrun $file -o results/$relpath -d results/$dirpath +done +``` + +In practice, you probably want to submit these to a queue. In that case, you'll +want to create a new job script for each simulation JSON file, and the core of +that job script will be to run the `openfe quickrun` command above. + +Details of what information is needed in that job script will depend on your +computing center. Here is an example of a very simple script that will create +and submit a job script for the simplest SLURM use case: + +``` +for file in transformations/*/*/*.json; do + relpath=${file:16} # strip off "transformations/" + dirpath=${relpath%.*} # strip off final ".json" + jobpath="transformations/${dirpath}.job" + cmd="openfe quickrun $file -o results/$relpath -d results/$dirpath" + echo -e "#!/usr/bin/env bash\n${cmd}" > $jobpath + sbatch $jobpath +done +``` + +## Gathering the results + +Once the simulations have been run, you will see many results in the results +directory. For each simulation, there will be a result JSON file, as well as a +directory that includes files created during the simulation, with the names as +given to the `openfe quickrun` command. + + + +The JSON results file contains not only the calculated $\Delta G$ and its +uncertainty estimate, but also important metadata about what happened during +the simulation. In particular, it will contain information about any errors or +failures that occurred -- these errors will not cause the entire campaign to +fail, and will be recorded so you can later analyze what went wrong. 
+ +To gather all the $\Delta G$ estimates into a single file, use the `openfe +gather` command from within the working directory used above: + +``` +openfe gather ./results/ -o final_results.tsv +``` + +This will write out a tab-separated table of results, including both the +$\Delta G$ for each leg and the $\Delta\Delta G$ computed from pairs of legs. +The first column labels the data, e.g., `DGcomplex(ligandB,ligandA)` for the +$\Delta G$ of the transformation of ligand A into ligand B while in complex +with the protein, or `DDGbind(ligandB,ligandA)` for the $\Delta\Delta G$ of +binding ligand A vs. ligand B: $\Delta G$bind, $B$$ - \Delta +G$bind$A$. + + From 920d87d8cb53cb1f9380dcada318b179e6c987d2 Mon Sep 17 00:00:00 2001 From: "David W.H. Swenson" Date: Wed, 12 Apr 2023 11:40:03 -0500 Subject: [PATCH 2/6] try overriding markdown editor --- .binder/overrides.json | 6 ++++++ .binder/postBuild | 5 +++++ 2 files changed, 11 insertions(+) create mode 100644 .binder/overrides.json create mode 100644 .binder/postBuild diff --git a/.binder/overrides.json b/.binder/overrides.json new file mode 100644 index 0000000..af18c04 --- /dev/null +++ b/.binder/overrides.json @@ -0,0 +1,6 @@ +{ + "defaultViewers": { + "markdown": "Markdown Preview" + } +} + diff --git a/.binder/postBuild b/.binder/postBuild new file mode 100644 index 0000000..587c8e6 --- /dev/null +++ b/.binder/postBuild @@ -0,0 +1,5 @@ +#!/usr/bin/env bash +set -eux + +mkdir -p ${NB_PYTHON_PREFIX}/share/jupyter/lab/settings +cp overrides.json ${NB_PYTHON_PREFIX}/share/jupyter/lab/settings From 86357adf67bc2b40e18ec72b61cdcf82e9dcce51 Mon Sep 17 00:00:00 2001 From: "David W.H. 
Swenson" Date: Wed, 12 Apr 2023 11:41:22 -0500 Subject: [PATCH 3/6] correct path --- .binder/postBuild | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.binder/postBuild b/.binder/postBuild index 587c8e6..3a6057f 100644 --- a/.binder/postBuild +++ b/.binder/postBuild @@ -2,4 +2,4 @@ set -eux mkdir -p ${NB_PYTHON_PREFIX}/share/jupyter/lab/settings -cp overrides.json ${NB_PYTHON_PREFIX}/share/jupyter/lab/settings +cp .binder/overrides.json ${NB_PYTHON_PREFIX}/share/jupyter/lab/settings From 15ebdc06a30f6d608e4ca28f96ad424da6e57ea5 Mon Sep 17 00:00:00 2001 From: "David W.H. Swenson" Date: Wed, 12 Apr 2023 11:53:05 -0500 Subject: [PATCH 4/6] specify which settings --- .binder/overrides.json | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/.binder/overrides.json b/.binder/overrides.json index af18c04..7748745 100644 --- a/.binder/overrides.json +++ b/.binder/overrides.json @@ -1,6 +1,8 @@ { - "defaultViewers": { - "markdown": "Markdown Preview" + "@jupyterlab/docmanager-extension:plugin": { + "defaultViewers": { + "markdown": "Markdown Preview" + } } } From 65c2a81650014b65d22db2bf5278f65b1c5a9d5e Mon Sep 17 00:00:00 2001 From: "David W.H. Swenson" Date: Wed, 12 Apr 2023 14:54:29 -0500 Subject: [PATCH 5/6] Switch to RHFE --- easyCampaign/cli-tutorial.md | 53 ++++++++++++++++++------------------ 1 file changed, 26 insertions(+), 27 deletions(-) diff --git a/easyCampaign/cli-tutorial.md b/easyCampaign/cli-tutorial.md index 462f7a9..5741e49 100644 --- a/easyCampaign/cli-tutorial.md +++ b/easyCampaign/cli-tutorial.md @@ -19,12 +19,11 @@ repository](https://github.com/OpenFreeEnergy/ExampleNotebooks). If switching to an empty directory, you can get the files with: ```bash -cp $EXAMPLES_REPO/easyCampaign/molecules/rbfe/* ./ +cp $EXAMPLES_REPO/easyCampaign/molecules/rhfe/* ./ ``` -Then when you run `ls`, you should see that your directory has two files in it: -`p38_old_ligands.sdf` and `p38_old_protein.pdb`. 
That will be the starting -point for the tutorial. +Then when you run `ls`, you should see that your directory has one file in it: +`benzenes_RHFE.sdf`. That will be the starting point for the tutorial. ## Setting up the campaign @@ -35,14 +34,15 @@ energy setups. For RBFE campaigns, the relevant command is `openfe plan-rbfe-network`. For RHFE, the command is `openfe plan-rhfe-network`. They work mostly the same, except that the RHFE planner does not take a protein. In this tutorial, we'll -do an RBFE calculation. The only difference for RHFE is in the setup stage -- +do an RHFE calculation. The only difference for RBFE is in the setup stage -- running the simulations and gathering the results are the same. -To run the setup, we'll specify the protein using the `-p` option, and we'll -tell it search for SDF/MOL2 files in the current directory using `-M ./`. We'll tell it to output into the same directory that we're working in with the `-o ./` option. +To run the setup, we'll tell it to search for SDF/MOL2 files in the current +directory using `-M ./`. We'll tell it to output into the same directory that +we're working in with the `-o ./` option. ```bash -openfe plan-rbfe-network -p p38_old_protein.pdb -M ./ -o ./ +openfe plan-rhfe-network -M ./ -o ./ ``` Planning the campaign may a take a few minutes, as it tries to find the best @@ -56,28 +56,28 @@ Now you're ready to run the simulations! 
Let's look at the structure of the ```text transformations -├── lig_p38a_2aa_lig_p38a_2z -│   ├── complex -│   │   └── openfe-tutorial_easy_rbfe_lig_p38a_2aa_receptor_lig_p38a_2z_receptor.json -│   └── solvent -│   └── openfe-tutorial_easy_rbfe_lig_p38a_2aa_solvent_lig_p38a_2z_solvent.json -├── lig_p38a_2aa_lig_p38a_3fly -│   ├── complex -│   │   └── openfe-tutorial_easy_rbfe_lig_p38a_2aa_receptor_lig_p38a_3fly_receptor.json -│   └── solvent -│   └── openfe-tutorial_easy_rbfe_lig_p38a_2aa_solvent_lig_p38a_3fly_solvent.json +├── lig_10_lig_15 +│   ├── solvent +│   │   └── openfe-tutorial_easy_rhfe_lig_10_solvent_lig_15_solvent.json +│   └── vacuum +│   └── openfe-tutorial_easy_rhfe_lig_10_vacuum_lig_15_vacuum.json +├── lig_10_lig_5 +│   ├── solvent +│   │   └── openfe-tutorial_easy_rhfe_lig_5_solvent_lig_10_solvent.json +│   └── vacuum +│   └── openfe-tutorial_easy_rhfe_lig_5_vacuum_lig_10_vacuum.json:w [continues] ``` There is a subdirectory for each edge, named according to the ligand pair. Within that, there are directories for the two "legs" associated with this -ligand transformation: the ligand transformation in solvent, and the ligand -tranformation complexed with the receptor. Each JSON file represents a single -leg to run, and contains all the necessary information to run that leg. +ligand transformation: the ligand transformation in solvent and in vacuum. +Each JSON file represents a single leg to run, and contains all the necessary +information to run that leg. Note that this specific setup makes a number of choices for you. All of these choices can be customized in the Python API, and some can be customized -using the CLI. To see additional CLI options, use `openfe plan-rbfe-network +using the CLI. To see additional CLI options, use `openfe plan-rhfe-network --help`. Here are the specifics on how these simulation are set up: 1. LOMAP is used to generate the atom mappings between ligands. 
@@ -146,10 +146,9 @@ openfe gather ./results/ -o final_results.tsv This will write out a tab-separated table of results, including both the $\Delta G$ for each leg and the $\Delta\Delta G$ computed from pairs of legs. -The first column labels the data, e.g., `DGcomplex(ligandB,ligandA)` for the -$\Delta G$ of the transformation of ligand A into ligand B while in complex -with the protein, or `DDGbind(ligandB,ligandA)` for the $\Delta\Delta G$ of -binding ligand A vs. ligand B: $\Delta G$bind, $B$$ - \Delta -G$bind$A$. +The first column labels the data, e.g., `DGvacuum(ligandB,ligandA)` for the +$\Delta G$ of the transformation of ligand A into ligand B in vacuum, or +`DDGsolv(ligandB,ligandA)` for the $\Delta\Delta G$ of solvation of ligand B +relative to ligand A: $\Delta G_{\mathrm{solv},B} - \Delta G_{\mathrm{solv},A}$. From 61bc61e580fc625f8fad23f62f4af0e0582ed31f Mon Sep 17 00:00:00 2001 From: Richard Gowers Date: Tue, 18 Apr 2023 18:05:14 +0100 Subject: [PATCH 6/6] vi strikes again --- easyCampaign/cli-tutorial.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/easyCampaign/cli-tutorial.md b/easyCampaign/cli-tutorial.md index 5741e49..2944def 100644 --- a/easyCampaign/cli-tutorial.md +++ b/easyCampaign/cli-tutorial.md @@ -65,7 +65,7 @@ transformations │   ├── solvent │   │   └── openfe-tutorial_easy_rhfe_lig_5_solvent_lig_10_solvent.json │   └── vacuum -│   └── openfe-tutorial_easy_rhfe_lig_5_vacuum_lig_10_vacuum.json:w +│   └── openfe-tutorial_easy_rhfe_lig_5_vacuum_lig_10_vacuum.json [continues] ```
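
Review note on the `openfe quickrun` loops in the tutorial added by this series: the path handling can be checked without running any simulation. The sketch below is a dry run under stated assumptions. The directory and file names are hypothetical stand-ins for the planner output (shortened from the tree shown in the tutorial), and the commands are only printed, never executed. It strips the prefix with `${file#transformations/}`, which should give the same result as `${file:16}` because `transformations/` is 16 characters long, but does not depend on that count. It also uses an explicit two-level glob (edge directory, then leg directory), which does not rely on recursive `**` matching; in bash that requires `shopt -s globstar`.

```bash
# Dry run of the path handling: print each quickrun command instead of running it.
# The directory and file names below are hypothetical stand-ins for planner output.
set -eu

workdir=$(mktemp -d)
cd "$workdir"
mkdir -p transformations/lig_10_lig_15/solvent transformations/lig_10_lig_15/vacuum
touch transformations/lig_10_lig_15/solvent/example_solvent.json
touch transformations/lig_10_lig_15/vacuum/example_vacuum.json

print_quickrun_commands() {
  # Explicit two-level glob: one level for the edge, one for the leg.
  for file in transformations/*/*/*.json; do
    relpath=${file#transformations/}   # strip the leading "transformations/"
    dirpath=${relpath%.*}              # strip the final ".json"
    echo "openfe quickrun $file -o results/$relpath -d results/$dirpath"
  done
}

print_quickrun_commands
```

Each printed line pairs one transformation JSON with a result file and a working directory that mirror its location under `transformations/`, so the later `openfe gather ./results/` step finds every leg under a single `results` tree.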