
Update mutation rate to mutation density #307

Merged
FerriolCalvet merged 6 commits into dev from mutrate2mutdensity on Jun 26, 2025

Conversation

@FerriolCalvet
Collaborator

@FerriolCalvet FerriolCalvet commented Jun 17, 2025

  • Update naming everywhere
  • Remove the per-kb normalization from the output file, since it is redundant and not used anywhere.
  • Compute mutation density separately for SNVs only, indels only, and SNVs + indels only, and then for all mutations together.

Decide whether to rename "mutated reads rate" as well, for example to "mutation burden" or another term we prefer. POSTPONED

  • Define other_sample_SNP based on all VAF values

AI generated summary

This pull request introduces a major refactor across multiple configuration and module files to replace the term "mutation rate" with "mutation density." The changes are primarily aimed at improving clarity and consistency in terminology throughout the codebase. Additionally, several files and processes have been renamed to align with this updated terminology.

Terminology Update: Mutation Rate to Mutation Density

  • Configuration Files: Updated parameter names from mutationrate to mutationdensity across multiple configuration files, including conf/bladder.config, conf/kidney.config, conf/lung.config, and others. This ensures consistency in terminology.
  • Regression Configurations: Renamed mutrate_regressions to mutdensity_regressions in regression-related configuration files to reflect the updated terminology.

File and Process Renaming

  • Tool Configurations: Renamed conf/tools/mutrate.config to conf/tools/mutdensity.config and updated process names within the file, such as SUBSETMUTRATE to SUBSETMUTDENSITY.
  • Main Modules: Renamed modules/local/computemutrate/main.nf to modules/local/computemutdensity/main.nf and updated process names and scripts accordingly, including switching from compute_mutrate.py to compute_mutdensity.py.

Script and Argument Adjustments

  • Module Processes: Updated arguments and folder names in scripts to reflect the new terminology, such as changing expected_mutrate to expected_mutdensity in the EXPECTED_MUTATED_CELLS process.
  • Omega Preprocessing: Modified the synonymous mutation rates file reference from mutrates_per_gene.tsv to mutdensities_per_gene.tsv in the OMEGA_PREPROCESS process.

Removal of Redundant Code

  • Oncodrive3D Module: Removed unused args variable in the ONCODRIVE3D_PLOT_CHIMERAX process, simplifying the script.

These changes collectively improve the clarity of the codebase by unifying terminology and aligning file and process names with the updated concept of "mutation density."

@FerriolCalvet FerriolCalvet requested a review from Copilot June 17, 2025 23:42
Contributor

Copilot AI left a comment

Pull Request Overview

This PR replaces the term “mutation rate” with “mutation density” across scripts and configurations, removes per-KB normalization, and updates the compute step to use a dedicated compute_mutdensity.py for SNVs, indels, combined SNV+indel, and all-mutation densities.

  • Renamed output directories, script names, process labels, and file suffixes from mutrate/mutrates to mutdensity/mutdensities
  • Updated configuration parameters (mutationrate → mutationdensity, mutrate_regressions → mutdensity_regressions) and included the new mutdensity.config
  • Changed the compute process from MUTRATE to MUTATION_DENSITY, switching from compute_mutrate.py to compute_mutdensity.py

Reviewed Changes

Copilot reviewed 35 out of 35 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| modules/local/mutated_cells_expected/main.nf | Renamed output dir and script argument to expected_mutdensity |
| modules/local/computemutdensity/main.nf | Renamed process, outputs, script call, and output file suffix to density |
| modules/local/indels/main.nf | Updated output filename for indels comparisons |
| conf/tools/regressions.config | Renamed regression parameter to mutdensity_regressions |
| conf/modules.config | Swapped in mutdensity.config for the old mutrate.config |
| (multiple conf/*.config) | Changed mutationrate flags to mutationdensity |

Comments suppressed due to low confidence (4)

modules/local/bbgtools/oncodrive3d/plot_chimerax/main.nf:20

  • [nitpick] The args variable was removed; if users previously passed custom arguments via task.ext.args, consider documenting this change or restoring support for extensibility.
    script:

modules/local/mutated_cells_expected/main.nf:28

  • The directory is renamed from expected_mutrate to expected_mutdensity; ensure mutgenomes_expected_mutrisk.R has been updated to output into this new directory name.
    mkdir expected_mutdensity

modules/local/computemutdensity/main.nf:16

  • Changing prefix to sample_name alters the output filename pattern; verify downstream processes expect ${sample_name}.${panel_version}.mutdensities.tsv and update any file‐matching logic accordingly.
    def sample_name = "${meta.id}"
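
One way to check file-matching logic against the new naming is a small parser for the pattern quoted in the comment above. This is only a sketch: the helper name `parse_mutdensity_filename` is hypothetical, and it assumes the panel version contains no dots (sample names may).

```python
import re

# Pattern from the review comment: <sample_name>.<panel_version>.mutdensities.tsv
# ASSUMPTION: panel_version has no dots; everything before it is the sample name.
MUTDENSITY_FILE = re.compile(
    r"^(?P<sample>.+)\.(?P<panel>[^.]+)\.mutdensities\.tsv$"
)


def parse_mutdensity_filename(filename):
    """Return (sample_name, panel_version), or None if the name does not match."""
    match = MUTDENSITY_FILE.match(filename)
    if match is None:
        return None
    return match.group("sample"), match.group("panel")
```

A file that still carries an old suffix such as `.mutrates.tsv` returns `None`, which makes stale outputs easy to flag.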

conf/tools/regressions.config:8

  • After renaming mutrate_regressions to mutdensity_regressions, verify all pipeline components that previously referenced the old parameter are updated to use the new one to avoid runtime errors.
    mutdensity_regressions                = null
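
A quick scan for leftover old names can help with this kind of verification. The sketch below is not part of the pipeline: the function name `find_stale_references` and the list of file suffixes are assumptions, and the search terms are the old identifiers mentioned in this PR.

```python
import re
from pathlib import Path

# Old identifiers named in the PR; 'mutrate' also covers 'mutrates',
# 'mutrate_regressions', 'SUBSETMUTRATE' and 'compute_mutrate.py'.
OLD_NAMES = re.compile(r"mutationrate|mutrate", re.IGNORECASE)


def find_stale_references(root, suffixes=(".nf", ".config", ".py", ".R", ".json")):
    """Yield (path, line_number, line) for each line that still uses an old name."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if OLD_NAMES.search(line):
                yield path, number, line.strip()
```

Running `for hit in find_stale_references("."): print(*hit)` from the repository root lists every remaining occurrence; an empty result suggests the rename is complete for the scanned file types.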

@bbglab bbglab deleted a comment from Copilot AI Jun 18, 2025
@FerriolCalvet
Collaborator Author

functional testing works

I have not tested the results, but they should be the same

Member

@FedericaBrando FedericaBrando left a comment

Very good Ferriol! 🚀

Following our discussion, here is a summary of the comments:

  • Add a docstring at the top of the file that defines mutation density (per gene and per sample) in a few sentences.
  • Modularize compute_mutdensity() into three parts:
    • preprocessing, where you read and load the dataframes, subset the depths, etc.
    • call mutdensity_sample() (once)
    • call mutdensity_gene() (once)
  • Define the mutation types (SNV, INSERTION, DELETION, etc.) in a variable at the top of the script so that the mutdensity functions can share it (DRY principle).
  • Avoid calling the mutdensity functions several times; do the concatenations inside the functions themselves.

This would improve readability and modularity of the script! Almost there 😎
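
The structure suggested above could look roughly like the following. This is a sketch, not the code in compute_mutdensity.py: the input shapes, the `"MNV"`-style extra types, the `ALL_GENES` label, and the formula (mutations divided by total sequenced depth, scaled per megabase) are all assumptions made for illustration.

```python
from collections import Counter

# Shared by both density functions (DRY). Group names are hypothetical;
# the type labels follow the review comment.
MUTATION_GROUPS = {
    "SNV": {"SNV"},
    "INDEL": {"INSERTION", "DELETION"},
    "SNV-INDEL": {"SNV", "INSERTION", "DELETION"},
    "ALL": None,  # None means: do not filter by type
}


def _densities(mutations, depth):
    """One row per mutation group. ASSUMPTION: density = n / depth * 1e6."""
    counts = Counter(m["type"] for m in mutations)
    rows = []
    for group, types in MUTATION_GROUPS.items():
        n = sum(c for t, c in counts.items() if types is None or t in types)
        density = n / depth * 1e6 if depth else float("nan")
        rows.append({"group": group, "n_mutations": n, "density_per_mb": density})
    return rows


def mutdensity_sample(mutations, depth_by_gene):
    """Sample-level densities for every group, using depth summed over genes."""
    total_depth = sum(depth_by_gene.values())
    return [{"gene": "ALL_GENES", **row} for row in _densities(mutations, total_depth)]


def mutdensity_gene(mutations, depth_by_gene):
    """Gene-level densities; the concatenation across genes happens in here."""
    rows = []
    for gene, depth in depth_by_gene.items():
        gene_muts = [m for m in mutations if m["gene"] == gene]
        rows.extend({"gene": gene, **row} for row in _densities(gene_muts, depth))
    return rows


def compute_mutdensity(mutations, depth_by_gene):
    """Inputs are assumed already loaded; each density function is called once."""
    sample_rows = mutdensity_sample(mutations, depth_by_gene)
    gene_rows = mutdensity_gene(mutations, depth_by_gene)
    return sample_rows + gene_rows
```

Keeping the group definitions in one module-level mapping means a new grouping only has to be added in one place, and both the sample-level and gene-level outputs pick it up.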

- clean code
- add explanation on mutation density
@FerriolCalvet
Copy link
Collaborator Author

  • Check that the outputs are the same after this code upgrade
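
A check like this can be scripted by comparing the output tables from before and after the refactor. The sketch below assumes tab-separated files with a header row and the same row order in both; the function names are hypothetical, and numeric cells are compared with a small relative tolerance.

```python
import csv
import math


def load_tsv(path):
    """Read a tab-separated file with a header into a list of row dicts."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def cells_match(a, b, rel_tol):
    """Identical strings match; otherwise compare as floats within rel_tol."""
    if a == b:
        return True
    try:
        return math.isclose(float(a), float(b), rel_tol=rel_tol)
    except (TypeError, ValueError):
        return False


def tables_match(old_rows, new_rows, rel_tol=1e-9):
    """True if both tables have the same columns and matching cells, row by row."""
    if len(old_rows) != len(new_rows):
        return False
    for old, new in zip(old_rows, new_rows):
        if old.keys() != new.keys():
            return False
        if not all(cells_match(old[c], new[c], rel_tol) for c in old):
            return False
    return True
```

For example, `tables_match(load_tsv(old_path), load_tsv(new_path))` returns `True` only when the refactored script reproduces the earlier output up to floating-point formatting.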

Member

@FedericaBrando FedericaBrando left a comment

Way cleaner this way! Thanks for the effort! 💯

I added a small comment; once you have tested that nothing changes, it can be merged into dev!

@FerriolCalvet FerriolCalvet self-assigned this Jun 26, 2025
@FerriolCalvet FerriolCalvet merged commit 764782a into dev Jun 26, 2025
@FerriolCalvet FerriolCalvet deleted the mutrate2mutdensity branch June 26, 2025 21:23
FerriolCalvet added a commit that referenced this pull request Jul 28, 2025
commit 10c12aa
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date:   Thu Jul 24 12:35:40 2025 +0200

    update gnomAD threshold to 0.001

    - ignore errors in omega plot

commit 31e741e
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date:   Tue Jul 22 18:34:11 2025 +0200

    update description and fix broken link

commit 7b0fd9b
Merge: 72dc4d9 cb18adc
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date:   Sun Jul 20 22:34:46 2025 +0200

    Merge pull request #315 from bbglab/dev

commit cb18adc
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date:   Sat Jul 19 12:40:39 2025 +0200

    fix bug in mut density & update omega container

commit 1b4a52d
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date:   Tue Jul 15 15:52:07 2025 +0200

    fix broken path for test_real

commit db6d640
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date:   Fri Jul 11 15:50:45 2025 +0200

    Allow gene selection in consensus (#316)

    * update consensus building to filter genes

    - add consensus compliance param
    - add list of genes param
    - NOT tested

    * tested gene filter implementation

    - consensus panels implemented in polars
    - allowing subset for specific genes

commit b92c2b9
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date:   Thu Jul 10 23:56:47 2025 +0200

    add metro map

commit 11fb1c5
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date:   Thu Jul 10 08:25:31 2025 +0200

    update description in main README

commit 628f282
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date:   Mon Jul 7 19:14:21 2025 +0200

    Add more complete docs (#306)

    * first doc update

    * update in usage documentation

    - list params
    - list structural parameters and files

    * backbone of output docs

    * update usage description with custom sets of mutations

    * fix headers

    * docs: Update usage with vep information

    * update output description

    * update order of usage information

    * update distribution of information in the docs

    * fix typo in docs/output.md

    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

    * fix markdown linting

    * remove unnecessary validation params from config

    - REQUIRES TESTING

    * remove remaining references to download VEP cache

    * update dag format to mmd

    * update in usage

    * update documentation of file formatting and some params

    * add examples in file formatting docs

    * apply review comments

    * remove Nextflow parameters section

    * minor fix in nextflow.config

    ---------

    Co-authored-by: Miquel L. Grau <miguel.grau@irbbarcelona.org>
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

commit 72dc4d9
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date:   Mon Jul 7 15:53:47 2025 +0200

    temporary LICENSE definition

commit 14c5246
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date:   Fri Jun 27 11:19:45 2025 +0200

    fix bug in panel_annotation (#313)

    reimplement it with click solved the problem
    - only_canonical boolean working
    - not tested

commit 764782a
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date:   Thu Jun 26 23:23:05 2025 +0200

    Update mutation rate to mutation density (#307)

    * rename mutrate to mut density

    - reimplement with click
    - partial renaming
    - simplification of sample_name logic

    * full update of mutation rate to mutation density

    * define other_sample_SNP based on all VAF

    * update mutation density functions

    - clean code
    - add explanation on mutation density

    * apply review changes

commit aae8f0b
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date:   Thu Jun 26 10:49:00 2025 +0200

    Ensure POSTPROCESSVEPPANEL in output (#311)

    * fix relative mutabilities output

    * explicitly define postprocessveppanel outdir

    * force outputting postprocesspanel

    * update storing fixes

commit dd25b9b
Merge: 05c80ee 26d5d9b
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date:   Wed Jun 4 23:09:29 2025 +0200

    Merge pull request #303 from bbglab/dev

    First pre-release merge

commit 05c80ee
Merge: ea9a301 e06218a
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date:   Tue Apr 29 10:30:23 2025 +0200

    Merge pull request #289 from bbglab/tmp-dev

    First release

commit e06218a
Merge: 7559d7f ea9a301
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date:   Tue Apr 29 10:11:05 2025 +0200

    Merge branch 'main' into tmp-dev

commit ea9a301
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date:   Thu Jul 25 16:05:10 2024 +0200

    update schema
FerriolCalvet added a commit that referenced this pull request Aug 1, 2025
* plotting wishlist

- subworkflow definition
- shortlisting plots to add

missing:
- nf scripts for the modules
- python scripts for the plots

* add raw version of supplementary figure plotting

* update omega plotting

* plotting update: omega & needles & stacked

* update plotting cohort plots working

* clean code
list inputs

* saturation data loading working

* gene saturation all tracks working with TP53

* tested additional complementary plots

missing:
- handle sample information input files
- handle reference datasets
- handle multiple genes

* update gene saturation inputs from pipeline

- not tested
- pending to decide creation of unique_splice_sites

* Squashed commit of the following: 10c12aa through ea9a301 (the same commit log that is listed in full under the Jul 28, 2025 commit above)

* update stacked plot needles

* minor fixes after merge

- NOT WORKING

* update linewidth and size

- add mutation types

* fix plot saturation within pipeline
- collect sitecomparisons
- remove reference to ddg. requires external
- fix input unique to keep header and minimal information
- tested and works

* fix o3d logs output

* plot multiple genes, not only TP53

* apply review suggestions

- temporary solution to domain information loading

* update domain definition and plotting

- subset domains to in_panel ones
- update domain name definition
- plotting modules works
- missing autoexons plot
- update signatures output

* - separate generation of depth per exon
- add general error handling to all plotting modules
- generate exons bedfile within panel

* batch update of exon definitions

- use correct exons definition
- update custom bedfile name
- not tested

* fix exons panel generation

* add error handling to omega plot
3 participants