From db8ece50d66de0a92e74fca01f7423974a270b13 Mon Sep 17 00:00:00 2001 From: Irfan Alibay Date: Sun, 1 Jun 2025 16:28:03 +0100 Subject: [PATCH 1/8] Start adding details about --n-protocol-repeats --- rbfe_tutorial/cli_tutorial.md | 29 ++++++++++++++++++++--------- 1 file changed, 20 insertions(+), 9 deletions(-) diff --git a/rbfe_tutorial/cli_tutorial.md b/rbfe_tutorial/cli_tutorial.md index 8454c3b..4dc7d6e 100644 --- a/rbfe_tutorial/cli_tutorial.md +++ b/rbfe_tutorial/cli_tutorial.md @@ -38,7 +38,7 @@ running the simulations and gathering the results are the same. With the single command: ```bash -openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup +openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup --n-protocol-repeats 1 ``` we do the following: @@ -49,6 +49,12 @@ we do the following: - Pass a PDB of the protein target (TYK2) with `-p tyk2_protein.pdb`. - Instruct `openfe` to output files into a directory called `network_setup` with the `-o network_setup` option. +- Instruct `openfe` to only run one full repeat of the alchemical simulation per + `quickrun` call using `--n-protocol-repeats 1`. + **Note:** `openfe`'s default behaviour is that it needs three + repeats to calculate the uncertainty (i.e. standard deviation) in an estimate. By + settings `--n-protocol-repeats` to 1, you must execute the transformation a minimum + of 3 times. 
Planning the campaign may take some time due to the complex series of tasks involved: @@ -61,7 +67,7 @@ The partial charge generation can take advantage of multiprocessing which offers the number of processors available using the `-n` flag: ```bash -openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup -n 4 +openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup --n-protocol-repeats 1 -n 4 ``` This will result in a directory called `network_setup/`, which is structured like this: @@ -146,7 +152,7 @@ partial_charge: 2. Plan your rbfe network with an additional `-s` flag for passing the settings: ```bash -openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup -s settings.yaml +openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup --n-protocol-repeats 1 -s settings.yaml ``` 3. The output of the CLI program will now reflect the changes made: @@ -214,7 +220,7 @@ where `path/to/transformation.json` is the path to one of the files created abov When running a complete network of simulations, it is important to ensure that the file name for the result JSON and name of the working directory are -different for each leg, otherwise you'll overwrite results. We recommend doing +different for each leg and each repeat, otherwise you'll overwrite results. We recommend doing that with something like the following, which uses the fact that the JSON files in `network_setup/transformations/` have unique names, and creates directories and result JSON files based on those names. 
To run all legs sequentially (not @@ -225,7 +231,10 @@ recommended) you could do something like: for file in network_setup/transformations/*.json; do relpath=${file:30} # strip off "network_setup/transformations/" dirpath=${relpath%.*} # strip off final ".json" - openfe quickrun $file -o results/$relpath -d results/$dirpath + # loop over three repeats + for repeat in {1..3}; do + openfe quickrun $file -o results/repeat${repeat}/$relpath -d results/repeat${repeat}/$dirpath + done done ``` @@ -241,10 +250,12 @@ and submit a job script for the simplest SLURM use case: for file in network_setup/transformations/*.json; do relpath=${file:30} # strip off "network_setup/transformations/" dirpath=${relpath%.*} # strip off final ".json" - jobpath="network_setup/transformations/${dirpath}.job" - cmd="openfe quickrun $file -o results/$relpath -d results/$dirpath" - echo -e "#!/usr/bin/env bash\n${cmd}" > $jobpath - sbatch $jobpath + for repeat in {1..3}; do + jobpath="network_setup/transformations/${dirpath}_${repeat}.job" + cmd="openfe quickrun $file -o results/repeat${repeat}/$relpath -d results/repeat${repeat}/$dirpath" + echo -e "#!/usr/bin/env bash\n${cmd}" > $jobpath + sbatch $jobpath + done done ``` From 00719fb228059de329af8e00106bb77c85c09bb1 Mon Sep 17 00:00:00 2001 From: Alyssa Travitz <31974495+atravitz@users.noreply.github.com> Date: Wed, 11 Jun 2025 14:46:19 -0700 Subject: [PATCH 2/8] Update rbfe_tutorial/cli_tutorial.md --- rbfe_tutorial/cli_tutorial.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/rbfe_tutorial/cli_tutorial.md b/rbfe_tutorial/cli_tutorial.md index 4dc7d6e..43708c0 100644 --- a/rbfe_tutorial/cli_tutorial.md +++ b/rbfe_tutorial/cli_tutorial.md @@ -49,7 +49,7 @@ we do the following: - Pass a PDB of the protein target (TYK2) with `-p tyk2_protein.pdb`. - Instruct `openfe` to output files into a directory called `network_setup` with the `-o network_setup` option. 
-- Instruct `openfe` to only run one full repeat of the alchemical simulation per +- Instruct `openfe` to only run one repeat of the alchemical simulation per `quickrun` call using `--n-protocol-repeats 1`. **Note:** `openfe`'s default behaviour is that it needs three repeats to calculate the uncertainty (i.e. standard deviation) in an estimate. By From 2ce71e3c4f009811f5ee052089a0bf5cd0777041 Mon Sep 17 00:00:00 2001 From: Alyssa Travitz <31974495+atravitz@users.noreply.github.com> Date: Wed, 11 Jun 2025 14:46:26 -0700 Subject: [PATCH 3/8] Update rbfe_tutorial/cli_tutorial.md --- rbfe_tutorial/cli_tutorial.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/rbfe_tutorial/cli_tutorial.md b/rbfe_tutorial/cli_tutorial.md index 43708c0..d9c70a1 100644 --- a/rbfe_tutorial/cli_tutorial.md +++ b/rbfe_tutorial/cli_tutorial.md @@ -50,7 +50,7 @@ we do the following: - Instruct `openfe` to output files into a directory called `network_setup` with the `-o network_setup` option. - Instruct `openfe` to only run one repeat of the alchemical simulation per - `quickrun` call using `--n-protocol-repeats 1`. + `quickrun` call using `--n-protocol-repeats=1`. **Note:** `openfe`'s default behaviour is that it needs three repeats to calculate the uncertainty (i.e. standard deviation) in an estimate. By settings `--n-protocol-repeats` to 1, you must execute the transformation a minimum From 62707fe2a211250e1ddf95408fdf5b45703fc9ac Mon Sep 17 00:00:00 2001 From: Alyssa Travitz <31974495+atravitz@users.noreply.github.com> Date: Wed, 11 Jun 2025 14:46:55 -0700 Subject: [PATCH 4/8] Apply suggestions from code review --- rbfe_tutorial/cli_tutorial.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/rbfe_tutorial/cli_tutorial.md b/rbfe_tutorial/cli_tutorial.md index d9c70a1..d42eca1 100644 --- a/rbfe_tutorial/cli_tutorial.md +++ b/rbfe_tutorial/cli_tutorial.md @@ -38,7 +38,7 @@ running the simulations and gathering the results are the same. 
With the single command: ```bash -openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup --n-protocol-repeats 1 +openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup --n-protocol-repeats=1 ``` we do the following: @@ -53,7 +53,7 @@ we do the following: `quickrun` call using `--n-protocol-repeats=1`. **Note:** `openfe`'s default behaviour is that it needs three repeats to calculate the uncertainty (i.e. standard deviation) in an estimate. By - settings `--n-protocol-repeats` to 1, you must execute the transformation a minimum + setting `--n-protocol-repeats` to 1, you must execute the transformation a minimum of 3 times. Planning the campaign may take some time due to the complex series of tasks involved: @@ -67,7 +67,7 @@ The partial charge generation can take advantage of multiprocessing which offers the number of processors available using the `-n` flag: ```bash -openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup --n-protocol-repeats 1 -n 4 +openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup --n-protocol-repeats=1 -n 4 ``` This will result in a directory called `network_setup/`, which is structured like this: @@ -152,7 +152,7 @@ partial_charge: 2. Plan your rbfe network with an additional `-s` flag for passing the settings: ```bash -openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup --n-protocol-repeats 1 -s settings.yaml +openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup --n-protocol-repeats=1 -s settings.yaml ``` 3. 
The output of the CLI program will now reflect the changes made: From 0ddd3208536748f9213115cc2089937d79753bda Mon Sep 17 00:00:00 2001 From: Alyssa Travitz Date: Thu, 12 Jun 2025 13:06:21 -0700 Subject: [PATCH 5/8] update result tree --- rbfe_tutorial/cli_tutorial.md | 61 +++++++++++++++++++++-------------- 1 file changed, 37 insertions(+), 24 deletions(-) diff --git a/rbfe_tutorial/cli_tutorial.md b/rbfe_tutorial/cli_tutorial.md index d42eca1..c081144 100644 --- a/rbfe_tutorial/cli_tutorial.md +++ b/rbfe_tutorial/cli_tutorial.md @@ -86,7 +86,7 @@ network_setup ├── rbfe_lig_ejm_31_complex_lig_ejm_50_complex.json ├── rbfe_lig_ejm_31_solvent_lig_ejm_42_solvent.json ├── rbfe_lig_ejm_31_solvent_lig_ejm_46_solvent.json -[continues] + ... ``` The `ligand_network.graphml` file describes the atom mappings between the @@ -284,29 +284,42 @@ openfe ```text results -├── rbfe_lig_ejm_31_complex_lig_ejm_42_complex -│   ├── shared_RelativeHybridTopologyProtocolUnit-3ea82011-75f0-4bb6-b415-e7d05bd012f6 -│   │   ├── checkpoint.nc -│   │   └── simulation.nc -│   ├── shared_RelativeHybridTopologyProtocolUnit-5262feb6-cb50-4bb2-90a2-359810c2bb9c -│   │   ├── checkpoint.nc -│   │   └── simulation.nc -│   └── shared_RelativeHybridTopologyProtocolUnit-7a6def34-2967-4452-8d47-483bc7219c06 -│   ├── checkpoint.nc -│   └── simulation.nc -├── rbfe_lig_ejm_31_complex_lig_ejm_42_complex.json -├── rbfe_lig_ejm_31_complex_lig_ejm_46_complex -│   ├── shared_RelativeHybridTopologyProtocolUnit-ad113e55-5636-474e-9be3-ee77fe887e77 -│   │   ├── checkpoint.nc -│   │   └── simulation.nc -│   ├── shared_RelativeHybridTopologyProtocolUnit-ca74ad3c-2ac8-4961-be7c-fa802a1ec76b -│   │   ├── checkpoint.nc -│   │   └── simulation.nc -│   └── shared_RelativeHybridTopologyProtocolUnit-f848e671-fdd3-4b8d-8bd2-6eb5140e3ed3 -│   ├── checkpoint.nc -│   └── simulation.nc -├── rbfe_lig_ejm_31_complex_lig_ejm_46_complex.json -[continues] +├── replicate_0 +│   ├── rbfe_lig_ejm_31_complex_lig_ejm_42_complex +│   │   ├── 
shared_RelativeHybridTopologyProtocolUnit-79c279f04ec84218b7935bc0447539a9_attempt_0 +│   │   │   ├── checkpoint.nc +│   │   │   ├── db.json +│   │   │   ├── simulation_real_time_analysis.yaml +│   │   │   └── simulation.nc +│   │   ├── shared_RelativeHybridTopologyProtocolUnit-a3cef34132aa4e9cbb824fcbcd043b0e_attempt_0 +│   │   │   ├── checkpoint.nc +│   │   │   ├── db.json +│   │   │   ├── simulation_real_time_analysis.yaml +│   │   │   └── simulation.nc +│   │   └── shared_RelativeHybridTopologyProtocolUnit-abb2b104151c45fc8b0993fa0a7ee0af_attempt_0 +│   │   ├── checkpoint.nc +│   │   ├── db.json +│   │   ├── simulation_real_time_analysis.yaml +│   │   └── simulation.nc +│   ├── rbfe_lig_ejm_31_complex_lig_ejm_42_complex.json +│   ├── rbfe_lig_ejm_31_complex_lig_ejm_46_complex +│   │   ├── shared_RelativeHybridTopologyProtocolUnit-361500fe831c431aa830efd207db0955_attempt_0 +│   │   │   ├── checkpoint.nc +│   │   │   ├── db.json +│   │   │   ├── simulation_real_time_analysis.yaml +│   │   │   └── simulation.nc +│   │   ├── shared_RelativeHybridTopologyProtocolUnit-5a6176cfbf074f92bc76caac91b1c1bf_attempt_0 +│   │   │   ├── checkpoint.nc +│   │   │   ├── db.json +│   │   │   ├── simulation_real_time_analysis.yaml +│   │   │   └── simulation.nc +│   │   └── shared_RelativeHybridTopologyProtocolUnit-e16de73f07964e9096f34611e0c874ca_attempt_0 +│   │   ├── checkpoint.nc +│   │   ├── db.json +│   │   ├── simulation_real_time_analysis.yaml +│   │   └── simulation.nc +│   ├── rbfe_lig_ejm_31_complex_lig_ejm_46_complex.json +... 
``` The JSON results file contains not only the calculated $\Delta G$, and From 4f45399fc3542470b1bf666d58da310292c12bdf Mon Sep 17 00:00:00 2001 From: Alyssa Travitz Date: Wed, 18 Jun 2025 11:17:21 -0700 Subject: [PATCH 6/8] update language --- rbfe_tutorial/cli_tutorial.md | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/rbfe_tutorial/cli_tutorial.md b/rbfe_tutorial/cli_tutorial.md index c081144..56bad21 100644 --- a/rbfe_tutorial/cli_tutorial.md +++ b/rbfe_tutorial/cli_tutorial.md @@ -51,10 +51,9 @@ we do the following: with the `-o network_setup` option. - Instruct `openfe` to only run one repeat of the alchemical simulation per `quickrun` call using `--n-protocol-repeats=1`. - **Note:** `openfe`'s default behaviour is that it needs three - repeats to calculate the uncertainty (i.e. standard deviation) in an estimate. By - setting `--n-protocol-repeats` to 1, you must execute the transformation a minimum - of 3 times. + **Note:** `openfe`'s default behaviour is to use three + repeats to calculate the uncertainty (i.e. standard deviation) in an estimate. When + setting `--n-protocol-repeats=1`, you must execute the transformation multiple times - at minimum 2, but best practie is 3 independent repeats. 
Planning the campaign may take some time due to the complex series of tasks involved: From 05823a1473d793146b294ee7c91368a942443e3d Mon Sep 17 00:00:00 2001 From: Alyssa Travitz Date: Wed, 18 Jun 2025 11:59:55 -0700 Subject: [PATCH 7/8] updating output example --- rbfe_tutorial/cli_tutorial.md | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/rbfe_tutorial/cli_tutorial.md b/rbfe_tutorial/cli_tutorial.md index 56bad21..eb86228 100644 --- a/rbfe_tutorial/cli_tutorial.md +++ b/rbfe_tutorial/cli_tutorial.md @@ -161,17 +161,19 @@ RBFE-NETWORK PLANNER ______________________ Parsing in Files: - Got input: - Small Molecules: SmallMoleculeComponent(name=lig_ejm_54) SmallMoleculeComponent(name=lig_jmc_23) SmallMoleculeComponent(name=lig_ejm_47) SmallMoleculeComponent(name=lig_jmc_27) SmallMoleculeComponent(name=lig_ejm_46) SmallMoleculeComponent(name=lig_ejm_31) SmallMoleculeComponent(name=lig_ejm_42) SmallMoleculeComponent(name=lig_ejm_50) SmallMoleculeComponent(name=lig_ejm_45) SmallMoleculeComponent(name=lig_jmc_28) SmallMoleculeComponent(name=lig_ejm_55) SmallMoleculeComponent(name=lig_ejm_43) SmallMoleculeComponent(name=lig_ejm_48) - Protein: ProteinComponent(name=) - Cofactors: [] - Solvent: SolventComponent(name=O, Na+, Cl-) + Got input: + Small Molecules: SmallMoleculeComponent(name=lig_ejm_31) SmallMoleculeComponent(name=lig_ejm_42) SmallMoleculeComponent(name=lig_ejm_43) SmallMoleculeComponent(name=lig_ejm_46) SmallMoleculeComponent(name=lig_ejm_47) SmallMoleculeComponent(name=lig_ejm_48) SmallMoleculeComponent(name=lig_ejm_50) SmallMoleculeComponent(name=lig_jmc_23) SmallMoleculeComponent(name=lig_jmc_27) SmallMoleculeComponent(name=lig_jmc_28) + Protein: ProteinComponent(name=) + Cofactors: [] + Solvent: SolventComponent(name=O, Na+, Cl-) Using Options: - Mapper: - Mapping Scorer: - Networker: functools.partial() - Partial Charge Generation: am1bccelf10 + Mapper: + Mapping Scorer: + Network Generation: + Partial Charge 
Generation: am1bcc + + n_protocol_repeats=1 (1 simulation repeat(s) per transformation) ``` That concludes the straightforward process of tailoring your OpenFE setup to your specifications. From 5220299da79028126d710cdbaaaec5e393665006 Mon Sep 17 00:00:00 2001 From: Alyssa Travitz Date: Mon, 23 Jun 2025 09:29:05 -0700 Subject: [PATCH 8/8] =1 -> 1 --- rbfe_tutorial/cli_tutorial.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/rbfe_tutorial/cli_tutorial.md b/rbfe_tutorial/cli_tutorial.md index eb86228..529314b 100644 --- a/rbfe_tutorial/cli_tutorial.md +++ b/rbfe_tutorial/cli_tutorial.md @@ -38,7 +38,7 @@ running the simulations and gathering the results are the same. With the single command: ```bash -openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup --n-protocol-repeats=1 +openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup --n-protocol-repeats 1 ``` we do the following: @@ -50,10 +50,10 @@ we do the following: - Instruct `openfe` to output files into a directory called `network_setup` with the `-o network_setup` option. - Instruct `openfe` to only run one repeat of the alchemical simulation per - `quickrun` call using `--n-protocol-repeats=1`. + `quickrun` call using `--n-protocol-repeats 1`. **Note:** `openfe`'s default behaviour is to use three repeats to calculate the uncertainty (i.e. standard deviation) in an estimate. When - setting `--n-protocol-repeats=1`, you must execute the transformation multiple times - at minimum 2, but best practie is 3 independent repeats. + setting `--n-protocol-repeats 1`, you must execute the transformation multiple times - at minimum 2, but best practice is 3 independent repeats. 
Planning the campaign may take some time due to the complex series of tasks involved: @@ -66,7 +66,7 @@ The partial charge generation can take advantage of multiprocessing which offers the number of processors available using the `-n` flag: ```bash -openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup --n-protocol-repeats=1 -n 4 +openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup --n-protocol-repeats 1 -n 4 ``` This will result in a directory called `network_setup/`, which is structured like this: @@ -151,7 +151,7 @@ partial_charge: 2. Plan your rbfe network with an additional `-s` flag for passing the settings: ```bash -openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup --n-protocol-repeats=1 -s settings.yaml +openfe plan-rbfe-network -M tyk2_ligands.sdf -p tyk2_protein.pdb -o network_setup --n-protocol-repeats 1 -s settings.yaml ``` 3. The output of the CLI program will now reflect the changes made: