add plotting of profiles similarity #384
- separated plots for samples and groups
commit e7ace44
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Fri Oct 10 09:44:35 2025 +0200

v1.0.0 fixes (#380)
* fix syntax of optional
  - fix ambiguity in features list definition
* remove optional input definition

commit e409639
Merge: 14640cd 1a61fc9
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Wed Oct 8 15:09:55 2025 +0200

Merge pull request #377 from bbglab/dev
New release: v1.0.0 Ter

commit 1a61fc9
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Wed Oct 8 11:25:01 2025 +0200

update documentation tackling several issues

commit cc808de
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Wed Oct 8 10:23:45 2025 +0200

update naming of summary mutation plots

commit 2ab4e65
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Tue Oct 7 23:15:35 2025 +0200

fix typos and make inputs of expand regions optional

commit 6bb325e
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Tue Oct 7 23:03:33 2025 +0200

apply review suggestions

commit 872809d
Merge: 2ac1bdb 14640cd
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Sun Oct 5 16:07:02 2025 +0200

Merge branch 'main' into dev

commit 2ac1bdb
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Sat Oct 4 17:52:17 2025 +0200

add tools' explanation in docs
- add adjusted mutation density explanation
- rename subworkflow directory

commit 275bd68
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Fri Oct 3 23:16:31 2025 +0200

update features groups documentation

commit 67a902e
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Tue Sep 30 09:12:55 2025 +0200

add nanoseq masks to default filtering
- add also gnomAD_SNP
- add documentation on Nanoseq masks

commit b06e900
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Mon Sep 29 09:17:47 2025 +0200

add test setup and first tests (#375)
* first definition of tests
  - to be tested
* add first semi-working version of pipeline level tests
* add first module testing for EXPANDREGIONS
  - testing focussed in PPM1D gene
  - confirm preferred behaviour for this process
    if omega_withingene is true, but no option of subgenic element definition is activated it fails
  - stub mode set up pending
* tests working for EXPANDREGIONS
* update snapshot
* minor python fixes
* changes after PR review

commit 7caf653
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Thu Sep 25 16:43:38 2025 +0200

update deepCSA diagram

commit 8fa7998
Author: Marta Huertas <97596516+m-huertasp@users.noreply.github.com>
Date: Thu Sep 25 16:31:39 2025 +0200

Add Nanoseq masks as filters (#374)
* feature: add nanoseq masks to FILTERS
  This commit adds the possibility of using nanoseq masks in deepCSA. New parameters are added both in nextflow_schema and nextflow config. No major other changes are made as nanoseq masks use the same script as FILTEREXONS and FILTERPANELS.
* feature: add click to handle inputs
  These changes are copied from branch input-with-click, more specifically from commit 0af42a9.
* refactor: add positive parameter
  When using filterbed.py, if using positive you filter positions in the bed file and when using negative you filter positions not in the bed file. This commit adjusts the parameters in nanoseq filters to adjust to this behaviour.
* refactor: implement with click and add positive parameter
  The click implementation is useful to add the --positive flag for those bed files with the positive = true parameter in the modules.conf
* refactor: import nanoseq files in subworkflow
  This commit moves the import from the main workflow to the MUTATION_PREPROCESSING subworkflow. This is cleaner and easier to maintain.
* docs: add nanoseq masks paths
* refactor: remove debug printing
* refactor: move publish dir instructions
  Improve clarity.
* refactor: move nanoseq masks paths to cluster configuration
* refactor: simplify definitions and avoid non-intended output
  To avoid non-intended output, we define filtername empty if not defined instead of "covered".
* refactor: unify filters into one and remove non canonical chromosomes
  The functions negative_filter_panel_regions and positive_filter_panel_regions have been unified into one function: filter_panel. The logic is exactly the same. A new function is created to remove non canonical chromosomes in the positions dataframe (from the bed file). Non canonical chromosomes were giving problems when merging with sample_maf as "chr" was not detected.
* refactor: apply nanoseq masks individually with cleaner channel management
  The if statement for the nanoseq masks has been divided to handle them individually, in case only one is provided. Also, assigning a value to a channel twice is avoided by adding "else" statements.
* refactor: add one liner to create filtered maf panels variable
  Taking into account if nanoseq masks were applied.
* refactor: add one liner to create filtered maf panels variable
  Taking into account if nanoseq masks were applied.
* minor update in mut preprocessing style
  - fix paths in test_real
  - update order of variables in schema

---------
Co-authored-by: FerriolCalvet <ferriolcalvet@gmail.com>

commit 42c85f6
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Thu Sep 25 12:37:10 2025 +0200

Update container for HDP signature extraction (#362)
* update hdp_wrapper container
* add ignore strategy to compare signatures step
* add tmp fixes configs

commit 5b9ed08
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Fri Sep 19 21:45:57 2025 +0200

fix bug in redefinition of panel with subgenic elements (#373)
* fix bug in redefinition of exons and domains
  - now if a subgenic element is partially covered, it is still included in the expanded file, before it was not
  Missing:
  - documentation
* add docs
* fix bug in end coordinate when matching
* Apply suggestions from code review
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

commit 3d8fe1f
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Wed Sep 17 20:00:41 2025 +0200

Add globalloc synonymous numbers QC (#370)
* add globalloc synonymous numbers qc
  - added all the plots and correlation computations of obs. vs estimated numbers of synonymous mutations
* update omega syn qc
  - working version with plots and tsv outputs

commit 03a69f6
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Tue Sep 16 10:39:30 2025 +0200

fix bug that outputted empty maf files (#367)
* fix remove creation of empty MAFs
* address #337

commit 3fb032d
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Sat Sep 13 11:14:47 2025 +0200

add minor fix to plotting needles for groups

commit 9adbe74
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Sat Sep 13 11:05:11 2025 +0200

Allow the option to plot selection and saturation at the level of groups (#366)
* init plotting groups
* needle plots and selection working for groups
  - add param to plot only cohort or all custom groups
  - update groups.json generation
  missing:
  - pass site comparison plots & test saturation
* fix saturation plots working for groups
  - fix domain selection plotting as png not pdf

commit 14640cd
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Mon Jul 28 18:01:31 2025 +0200

minor updates documentation related
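For readers following the Nanoseq mask commits (#374), here is a rough sketch of how a unified filter with a positive/negative switch might behave. It is not the code in filterbed.py: the function names `drop_noncanonical`, `bed_positions` and the dictionary layout of a mutation are made up for illustration, and the reading that "filter" means "append a filter label to the mutation" is an assumption. Only the name `filter_panel`, the `positive` flag, and the removal of non-canonical chromosomes come from the commit messages.

```python
# Hypothetical sketch; data layout and helper names are assumptions.
CANONICAL = {f"chr{i}" for i in range(1, 23)} | {"chrX", "chrY"}


def drop_noncanonical(bed_rows):
    """Keep only BED rows on canonical chromosomes (chr1..chr22, chrX, chrY)."""
    return [row for row in bed_rows if row[0] in CANONICAL]


def bed_positions(bed_rows):
    """Expand BED intervals (0-based start, exclusive end) to 1-based positions."""
    positions = set()
    for chrom, start, end in bed_rows:
        for pos in range(start + 1, end + 1):
            positions.add((chrom, pos))
    return positions


def filter_panel(mutations, bed_rows, filter_name, positive=False):
    """Label mutations relative to a BED file.

    positive=True  -> label mutations that fall inside the BED regions
    positive=False -> label mutations that fall outside the BED regions
    """
    covered = bed_positions(drop_noncanonical(bed_rows))
    labelled = []
    for mut in mutations:
        in_bed = (mut["chrom"], mut["pos"]) in covered
        hit = in_bed if positive else not in_bed
        filters = list(mut["filters"]) + ([filter_name] if hit else [])
        labelled.append({**mut, "filters": filters})
    return labelled
```

With a mask-style BED file one would presumably call `filter_panel(muts, mask_rows, "mask_name", positive=True)`, while a panel or exon BED file would keep the default `positive=False`; the filter names here are placeholders.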
Pull Request Overview
This pull request adds functionality for plotting profile similarities and includes several improvements to the deepCSA pipeline configuration and documentation. The changes introduce new plotting capabilities, enhanced filtering options with Nanoseq masks, and improved process management.
- Added profile similarity plotting with heatmaps and clustering analysis
- Introduced Nanoseq mask filtering for SNPs and noisy genomic regions
- Enhanced group-based plotting capabilities with configurable options
Reviewed Changes
Copilot reviewed 44 out of 60 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| workflows/deepcsa.nf | Added group key extraction logic, updated mutation profile and plotting subworkflow calls to support group-based analysis |
| workflows/tests/deepcsa.nf.test | Added new workflow test for minimal features |
| subworkflows/local/mutationprofile/main.nf | Added profile concatenation and similarity analysis |
| subworkflows/local/mutationpreprocessing/main.nf | Implemented Nanoseq mask filtering for human samples |
| subworkflows/local/plottingsummary/main.nf | Added group-based plotting with configurable all-samples-only mode |
| subworkflows/local/omega/main.nf | Added QC evaluation for omega global/local estimation |
| modules/local/concatprofiles/main.nf | New module for concatenating profiles and computing similarity metrics |
| modules/local/plot/qc/globalloc_synonymous/main.nf | New module for omega synonymous QC plotting |
| modules/local/filterbed/main.nf | Enhanced filtering with positive/negative flag support |
| modules/local/plot/saturation/main.nf | Updated input parameters by combining site comparison with results |
| nextflow.config | Added plot_only_allsamples parameter and Nanoseq mask parameters |
| conf/modules.config | Added Nanoseq filtering process configurations |
| conf/general_files_IRB.config | Added Nanoseq mask file paths |
| conf/tmp_quick_fixes.config | Added error handling for specific processes |
| docs/usage.md | Added Nanoseq masks documentation section |
| docs/tools.md | New documentation explaining adjusted mutation density and other tools |
| test_data/modules/*.bed | Added test data files for PPM1D exons and domains |
Comments suppressed due to low confidence (1)
docs/file_formatting.md:1
- Corrected spelling of 'potenitally' to 'potentially'.
# File formats of inputs
Pull Request Overview
Copilot reviewed 3 out of 5 changed files in this pull request and generated 1 comment.
AI summary
This pull request introduces a new process for concatenating mutational profiles and integrates it into the existing mutational profiling workflow. The changes enable the aggregation of mutation profiles with group information and update the main workflow to support this functionality.
New process integration and workflow updates:
- Added a new CONCAT_PROFILES process in modules/local/concatprofiles/main.nf that aggregates mutation profiles and generates summary outputs, including heatmaps, clustermaps, cosine similarity tables, and compiled profiles.
- Integrated the CONCAT_PROFILES process into the MUTATIONAL_PROFILE subworkflow by importing it and updating the workflow to pass the required all_groups parameter and emit the compiled profiles output.

Main workflow parameter and invocation changes:
- Updated the DEEPCSA workflow in workflows/deepcsa.nf to pass the new TABLE2GROUP.out.json_allgroups parameter to all mutational profile subworkflows, ensuring group information is available for profile aggregation.
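As a reference for reviewers, the cosine similarity table mentioned above could be computed along these lines. This is a minimal sketch, not the script shipped with CONCAT_PROFILES: the function names and the representation of a profile as a plain list of per-channel values are assumptions, and the toy vectors have four channels only to keep the example short.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length profile vectors."""
    if len(u) != len(v):
        raise ValueError("profiles must have the same number of channels")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for empty profiles
    return dot / (norm_u * norm_v)


def similarity_table(profiles):
    """Pairwise cosine similarity for a dict of name -> profile vector."""
    names = sorted(profiles)
    return {
        a: {b: cosine_similarity(profiles[a], profiles[b]) for b in names}
        for a in names
    }


# Hypothetical sample and group names, toy 4-channel profiles.
profiles = {
    "sample_A": [0.4, 0.3, 0.2, 0.1],
    "sample_B": [0.1, 0.2, 0.3, 0.4],
    "group_all": [0.25, 0.25, 0.25, 0.25],
}
table = similarity_table(profiles)
```

The resulting nested dictionary is symmetric with ones on the diagonal, which is the shape a heatmap or clustermap would take as input.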