Update mutation rate to mutation density #307
- reimplement with click - partial renaming - simplification of sample_name logic
Pull Request Overview
This PR replaces the term “mutation rate” with “mutation density” across scripts and configurations, removes per-KB normalization, and updates the compute step to use a dedicated compute_mutdensity.py for SNVs, indels, combined SNV+indel, and all-mutation densities.
- Renamed output directories, script names, process labels, and file suffixes from mutrate/mutrates to mutdensity/mutdensities
- Updated configuration parameters (mutationrate → mutationdensity, mutrate_regressions → mutdensity_regressions) and included the new mutdensity.config
- Changed the compute process from MUTRATE to MUTATION_DENSITY, switching from compute_mutrate.py to compute_mutdensity.py
Reviewed Changes
Copilot reviewed 35 out of 35 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| modules/local/mutated_cells_expected/main.nf | Renamed output dir and script argument to expected_mutdensity |
| modules/local/computemutdensity/main.nf | Renamed process, outputs, script call, and output file suffix to density |
| modules/local/indels/main.nf | Updated output filename for indels comparisons |
| conf/tools/regressions.config | Renamed regression parameter to mutdensity_regressions |
| conf/modules.config | Swapped in mutdensity.config for the old mutrate.config |
| (multiple conf/*.config) | Changed mutationrate flags to mutationdensity |
Comments suppressed due to low confidence (4)
modules/local/bbgtools/oncodrive3d/plot_chimerax/main.nf:20
- [nitpick] The args variable was removed; if users previously passed custom arguments via task.ext.args, consider documenting this change or restoring support for extensibility.
script:
modules/local/mutated_cells_expected/main.nf:28
- The directory is renamed from expected_mutrate to expected_mutdensity; ensure mutgenomes_expected_mutrisk.R has been updated to output into this new directory name.
mkdir expected_mutdensity
modules/local/computemutdensity/main.nf:16
- Changing prefix to sample_name alters the output filename pattern; verify downstream processes expect ${sample_name}.${panel_version}.mutdensities.tsv and update any file-matching logic accordingly.
def sample_name = "${meta.id}"
conf/tools/regressions.config:8
- After renaming mutrate_regressions to mutdensity_regressions, verify all pipeline components that previously referenced the old parameter are updated to use the new one to avoid runtime errors.
mutdensity_regressions = null
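The rename checks suggested in the suppressed comments above could be automated with a small scan for leftover old identifiers. The following is only a minimal sketch: the function name find_stale_names and the RENAMES table are hypothetical helpers, not part of the pipeline, and the mapping only contains renames mentioned in this PR.

```python
import re

# Old -> new identifiers, taken from the renames listed in this PR.
# RENAMES and find_stale_names are hypothetical names used for illustration.
RENAMES = {
    "mutationrate": "mutationdensity",
    "mutrate_regressions": "mutdensity_regressions",
    "expected_mutrate": "expected_mutdensity",
    "compute_mutrate.py": "compute_mutdensity.py",
}


def find_stale_names(text, renames=RENAMES):
    """Return (line_number, old_name, new_name) for every leftover old identifier."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for old, new in renames.items():
            # Match the old name only as a whole token, so that
            # "mutdensity_regressions" is never reported as stale.
            pattern = rf"(?<![A-Za-z0-9_]){re.escape(old)}(?![A-Za-z0-9_])"
            if re.search(pattern, line):
                hits.append((lineno, old, new))
    return hits


if __name__ == "__main__":
    sample = "mutrate_regressions = null\nmkdir expected_mutdensity\n"
    for lineno, old, new in find_stale_names(sample):
        print(f"line {lineno}: '{old}' should probably be '{new}'")
```

Running it over the text of each config and module file would flag any place where an old parameter or directory name survived the rename.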
Functional testing works. I have not tested the results, but they should be the same.
FedericaBrando left a comment
Very good Ferriol! 🚀
Given our talk here I summarize the comments:
- Add a docstring at the beginning of the file explaining in very few sentences the definition of mutation density (gene and sample)
- Modularize compute_mutdensity() into 3 parts:
  - preprocessing, where you read and load the dfs, subset depths, etc.
  - call mutdensity_sample() (once)
  - call mutdensity_gene() (once)
- Define a variable with the mutation types (SNV, INSERTION, DELETION, etc.) at the top of the script, to be shared between the mutdensity functions (DRY principle)
- Avoid calling the mutdensity functions several times; include the concats in the functions themselves.
This would improve readability and modularity of the script! Almost there 😎
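The structure requested in this review could look roughly like the sketch below. It is not the actual compute_mutdensity.py: the record layout (plain dicts with gene, type, and depth keys), the group names in MUTATION_GROUPS, the ALL_GENES label, and the density definition (mutation count divided by summed depth, with no per-kb scaling) are all assumptions made for illustration.

```python
from collections import defaultdict

# Mutation-type groups defined once at the top and shared by both functions (DRY).
# Group names and type labels are assumptions for this sketch.
MUTATION_GROUPS = {
    "SNV": ("SNV",),
    "INDEL": ("INSERTION", "DELETION"),
    "SNV_INDEL": ("SNV", "INSERTION", "DELETION"),
}


def preprocess(mutations, depths, genes):
    """Part 1: restrict mutations and depth records to the requested genes."""
    genes = set(genes)
    muts = [m for m in mutations if m["gene"] in genes]
    deps = [d for d in depths if d["gene"] in genes]
    return muts, deps


def _density(n_mutations, total_depth):
    # Assumed definition: mutations per sequenced base, no per-kb normalization.
    return n_mutations / total_depth if total_depth else float("nan")


def mutdensity_sample(muts, deps):
    """Part 2: one row per mutation group for the whole sample."""
    total_depth = sum(d["depth"] for d in deps)
    rows = []
    for group, types in MUTATION_GROUPS.items():
        n = sum(1 for m in muts if m["type"] in types)
        rows.append({"gene": "ALL_GENES", "group": group,
                     "n_mutations": n, "density": _density(n, total_depth)})
    return rows


def mutdensity_gene(muts, deps):
    """Part 3: one row per gene and mutation group; the loop over groups lives here."""
    depth_by_gene = defaultdict(int)
    for d in deps:
        depth_by_gene[d["gene"]] += d["depth"]
    rows = []
    for gene in sorted(depth_by_gene):
        for group, types in MUTATION_GROUPS.items():
            n = sum(1 for m in muts if m["gene"] == gene and m["type"] in types)
            rows.append({"gene": gene, "group": group,
                         "n_mutations": n, "density": _density(n, depth_by_gene[gene])})
    return rows


def compute_mutdensity(mutations, depths, genes):
    """Preprocess once, then call each density function exactly once."""
    muts, deps = preprocess(mutations, depths, genes)
    return mutdensity_sample(muts, deps) + mutdensity_gene(muts, deps)
```

Because each function loops over MUTATION_GROUPS internally and returns all of its rows, the caller never has to invoke it once per mutation type and concatenate the results afterwards.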
- clean code - add explanation on mutation density
FedericaBrando left a comment
Way cleaner this way! Thanks for the effort! 💯
I added a small comment and then, after testing that nothing changes, it can be merged into dev!
commit 10c12aa
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Thu Jul 24 12:35:40 2025 +0200
update gnomAD threshold to 0.001
- ignore errors in omega plot

commit 31e741e
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Tue Jul 22 18:34:11 2025 +0200
update description and fix broken link

commit 7b0fd9b
Merge: 72dc4d9 cb18adc
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Sun Jul 20 22:34:46 2025 +0200
Merge pull request #315 from bbglab/dev

commit cb18adc
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Sat Jul 19 12:40:39 2025 +0200
fix bug in mut density & update omega container

commit 1b4a52d
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Tue Jul 15 15:52:07 2025 +0200
fix broken path for test_real

commit db6d640
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Fri Jul 11 15:50:45 2025 +0200
Allow gene selection in consensus (#316)
* update consensus building to filter genes
  - add consensus compliance param
  - add list of genes param
  - NOT tested
* tested gene filter implementation
  - consensus panels implemented in polars
  - allowing subset for specific genes

commit b92c2b9
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Thu Jul 10 23:56:47 2025 +0200
add metro map

commit 11fb1c5
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Thu Jul 10 08:25:31 2025 +0200
update description in main README

commit 628f282
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Mon Jul 7 19:14:21 2025 +0200
Add more complete docs (#306)
* first doc update
* update in usage documentation
  - list params
  - list structural parameters and files
* backbone of output docs
* update usage description with custom sets of mutations
* fix headers
* docs: Update usage with vep information
* update output description
* update order of usage information
* update distribution of information in the docs
* fix typo in docs/output.md
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix markdown linting
* remove unnecessary validation params from config
  - REQUIRES TESTING
* remove remaining references to download VEP cache
* update dag format to mmd
* update in usage
* update documentation of file formatting and some params
* add examples in file formatting docs
* apply review comments
* remove Nextflow parameters section
* minor fix in nextflow.config
---------
Co-authored-by: Miquel L. Grau <miguel.grau@irbbarcelona.org>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

commit 72dc4d9
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Mon Jul 7 15:53:47 2025 +0200
temporary LICENSE definition

commit 14c5246
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Fri Jun 27 11:19:45 2025 +0200
fix bug in panel_annotation (#313)
reimplement it with click solved the problem
- only_canonical boolean working
- not tested

commit 764782a
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Thu Jun 26 23:23:05 2025 +0200
Update mutation rate to mutation density (#307)
* rename mutrate to mut density
  - reimplement with click
  - partial renaming
  - simplification of sample_name logic
* full update of mutation rate to mutation density
* define other_sample_SNP based on all VAF
* update mutation density functions
  - clean code
  - add explanation on mutation density
* apply review changes

commit aae8f0b
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Thu Jun 26 10:49:00 2025 +0200
Ensure POSTPROCESSVEPPANEL in output (#311)
* fix relative mutabilities output
* explicitly define postprocessveppanel outdir
* force outputting postprocesspanel
* update storing fixes

commit dd25b9b
Merge: 05c80ee 26d5d9b
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Wed Jun 4 23:09:29 2025 +0200
Merge pull request #303 from bbglab/dev
First pre-release merge

commit 05c80ee
Merge: ea9a301 e06218a
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Tue Apr 29 10:30:23 2025 +0200
Merge pull request #289 from bbglab/tmp-dev
First release

commit e06218a
Merge: 7559d7f ea9a301
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Tue Apr 29 10:11:05 2025 +0200
Merge branch 'main' into tmp-dev

commit ea9a301
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Thu Jul 25 16:05:10 2024 +0200
update schema
* plotting wishlist
  - subworkflow definition
  - shortlisting plots to add
  missing:
  - nf scripts for the modules
  - python scripts for the plots
* add raw version of supplementary figure plotting
* update omega plotting
* plotting update: omega & needles & stacked
* update plotting cohort plots working
* clean code list inputs
* saturation data loading working
* gene saturation all tracks working with TP53
* tested additional complementary plots
  missing:
  - handle sample information input files
  - handle reference datasets
  - handle multiple genes
* update gene saturation inputs from pipeline
  - not tested
  - pending to decide creation of unique_splice_sites
* Squashed commit of the following: commits 10c12aa through ea9a301, the same log as listed above
* update stacked plot needles
* minor fixes after merge
  - NOT WORKING
* update linewidth and size
  - add mutation types
* fix plot saturation within pipeline
  - collect sitecomparisons
  - remove reference to ddg. requires external
  - fix input unique to keep header and minimal information
  - tested and works
* fix o3d logs output
* plot multiple genes, not only TP53
* apply review suggestions
  - temporary solution to domain information loading
* update domain definition and plotting
  - subset domains to in_panel ones
  - update domain name definition
  - plotting modules works
  - missing autoexons plot
  - update signatures output
* - separate generation of depth per exon
  - add general error handling to all plotting modules
  - generate exons bedfile within panel
* batch update of exon definitions
  - use correct exons definition
  - update custom bedfile name
  - not tested
* fix exons panel generation
* add error handling to omega plot
Decide if we want to update the term for mutated reads rate, and call it mutation burden or whatever we prefer? POSTPONED

AI generated summary
This pull request introduces a major refactor across multiple configuration and module files to replace the term "mutation rate" with "mutation density." The changes are primarily aimed at improving clarity and consistency in terminology throughout the codebase. Additionally, several files and processes have been renamed to align with this updated terminology.
Terminology Update: Mutation Rate to Mutation Density
- Renamed mutationrate to mutationdensity across multiple configuration files, including conf/bladder.config, conf/kidney.config, conf/lung.config, and others. This ensures consistency in terminology.
- Renamed mutrate_regressions to mutdensity_regressions in regression-related configuration files to reflect the updated terminology.

File and Process Renaming
- Renamed conf/tools/mutrate.config to conf/tools/mutdensity.config and updated process names within the file, such as SUBSETMUTRATE to SUBSETMUTDENSITY.
- Renamed modules/local/computemutrate/main.nf to modules/local/computemutdensity/main.nf and updated process names and scripts accordingly, including switching from compute_mutrate.py to compute_mutdensity.py.

Script and Argument Adjustments
- Renamed expected_mutrate to expected_mutdensity in the EXPECTED_MUTATED_CELLS process.
- Renamed mutrates_per_gene.tsv to mutdensities_per_gene.tsv in the OMEGA_PREPROCESS process.

Removal of Redundant Code
- Removed the args variable in the ONCODRIVE3D_PLOT_CHIMERAX process, simplifying the script.

These changes collectively improve the clarity of the codebase by unifying terminology and aligning file and process names with the updated concept of "mutation density."