- to be tested
- testing focused on the PPM1D gene
- confirm the preferred behaviour for this process: if omega_withingene is true but no subgenic element definition option is activated, it fails
- stub mode setup pending
Pull Request Overview
This PR initializes the test infrastructure for the DEEPCSA pipeline using the nf-test framework. It sets up the testing environment and adds basic tests for both the entire pipeline and specific processes.
- Sets up nf-test configuration and test directory structure
- Adds pipeline-level tests with various parameter combinations (mostly commented out)
- Implements specific process tests for the EXPAND_REGIONS module
- Includes test data files and configurations for different execution environments
Reviewed Changes
Copilot reviewed 11 out of 14 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| workflows/tests/deepcsa.nf.test | Pipeline-level test definitions with minimal features test |
| tests/nextflow.config | Test-specific Nextflow configuration with cluster and Singularity settings |
| tests/main.nf.test | Basic workflow test template |
| test_data/modules/*.bed | Test data files for module testing |
| nf-test.config | Main nf-test configuration file |
| modules/local/expand_regions/tests/main.nf.test* | Process-specific tests and snapshots for EXPAND_REGIONS |
| modules/local/expand_regions/main.nf | Updated stub section for testing |
| conf/test.config | Simplified test profile configuration |
| conf/local.config | Updated local execution configuration |
Fede, I asked for your review in case you have general comments about the testing setup or anything important that I overlooked. For now I want to merge this so that the first tests are implemented in dev, even though, given the size of the tests, some of the data cannot be fully shared. I have a much smaller batch of test data ready to add that can be shared publicly. I would be happy to receive any feedback you can provide, and I am still debating how to store the data.

Is it realistic to upload this as tests in this repo?
Overall, if the tests run as expected then the implementation is fine. I added some minor comments on doubts I had while reading the code, plus some advice on commented-out code and pandas copy().
It is your choice whether to address them; if you confirm that the tests work, it can be merged.
🚀
bin/add_hotspots.py
    current_chr = ""
    region_counters = {}  # Dictionary to track exon numbers for each gene
    chr_data = panel_data
Here I advise using panel_data.copy(); working on a view of a DataFrame instead of a copy may give unexpected results.
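A minimal sketch of the concern, not the actual code in bin/add_hotspots.py: the table contents and column names below are made up for illustration. It shows that plain assignment only binds a second name to the same DataFrame, so edits made through that name also change the original, while `.copy()` gives an independent object.

```python
import pandas as pd

# Hypothetical stand-in for the panel table; columns and values are assumptions.
panel_data = pd.DataFrame(
    {
        "CHROM": ["chr17", "chr17", "chr2"],
        "POS": [100, 101, 200],
        "GENE": ["PPM1D", "PPM1D", "GENE_B"],
    }
)

# Plain assignment binds a second name to the same DataFrame object.
alias = panel_data
alias["REGION"] = "exon_1"
alias_leaked = "REGION" in panel_data.columns  # the original gained the column too

# Start again from a clean table and take an explicit copy instead.
panel_data = panel_data.drop(columns="REGION")
chr_data = panel_data.copy()
chr_data["REGION"] = "exon_1"
copy_leaked = "REGION" in panel_data.columns  # the original is left untouched

print(f"alias modified original: {alias_leaked}")
print(f"copy modified original: {copy_leaked}")
```

If the script never modifies `chr_data` in place, the copy is not strictly required, but it makes the intent explicit at a small memory cost.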
The first version of the tests works, so I am merging it.
FerriolCalvet left a comment
pending things listed above
commit e7ace44
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Fri Oct 10 09:44:35 2025 +0200

    v1.0.0 fixes (#380)

    * fix syntax of optional
      - fix ambiguity in features list definition
    * remove optional input definition

commit e409639
Merge: 14640cd 1a61fc9
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Wed Oct 8 15:09:55 2025 +0200

    Merge pull request #377 from bbglab/dev

    New release: v1.0.0 Ter

commit 1a61fc9
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Wed Oct 8 11:25:01 2025 +0200

    update documentation tackling several issues

commit cc808de
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Wed Oct 8 10:23:45 2025 +0200

    update naming of summary mutation plots

commit 2ab4e65
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Tue Oct 7 23:15:35 2025 +0200

    fix typos and make inputs of expand regions optional

commit 6bb325e
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Tue Oct 7 23:03:33 2025 +0200

    apply review suggestions

commit 872809d
Merge: 2ac1bdb 14640cd
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Sun Oct 5 16:07:02 2025 +0200

    Merge branch 'main' into dev

commit 2ac1bdb
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Sat Oct 4 17:52:17 2025 +0200

    add tools' explanation in docs

    - add adjusted mutation density explanation
    - rename subworkflow directory

commit 275bd68
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Fri Oct 3 23:16:31 2025 +0200

    update features groups documentation

commit 67a902e
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Tue Sep 30 09:12:55 2025 +0200

    add nanoseq masks to default filtering

    - add also gnomAD_SNP
    - add documentation on Nanoseq masks

commit b06e900
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Mon Sep 29 09:17:47 2025 +0200

    add test setup and first tests (#375)

    * first definition of tests
      - to be tested
    * add first semi-working version of pipeline level tests
    * add first module testing for EXPANDREGIONS
      - testing focussed in PPM1D gene
      - confirm preferred behaviour for this process if omega_withingene is true, but no option of subgenic element definition is activated it fails
      - stub mode set up pending
    * tests working for EXPANDREGIONS
    * update snapshot
    * minor python fixes
    * changes after PR review

commit 7caf653
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Thu Sep 25 16:43:38 2025 +0200

    update deepCSA diagram

commit 8fa7998
Author: Marta Huertas <97596516+m-huertasp@users.noreply.github.com>
Date: Thu Sep 25 16:31:39 2025 +0200

    Add Nanoseq masks as filters (#374)

    * feature: add nanoseq masks to FILTERS
      This commit adds the posibility of using nanoseq masks in deepCSA. New parameters are added both in nextflow_schema and nextflow config. No major other changes are made as nanoseq masks use the same script as FILTEREXONS and FILTERPANELS.
    * feature: add click to handle inputs
      These changes are copied from branch input-with-click, more specifically from commit 0af42a9.
    * refactor: add positive parameter
      When using filterbed.py, if using positive you filter positions in the bed file and when using negative you filter positions not in the bed file. This commit adjusts the parameters in nanoseq filters to adjust to this behaviour.
    * refactor: implement with click and add positive parameter
      The click implementation is usefull to add the --positive flag for those bed files with the positive = true parameter in the modules.conf
    * refactor: import nanoseq files in subworkflow
      This commits moves the import from the main workflow to the MUTATION_PREPROCESSING subworkflow. This is cleaner and easier to maintain.
    * docs: add nanoseq masks paths
    * refactor: remove debug printing
    * refactor: move publish dir instructions
      Improve clarity.
    * refactor: move nanoseq masks paths to cluster configuration
    * refactor: simplify definitions and avoid non-intended output
      To avoid non-intended output, we define filtername empty if not defined instead of "covered".
    * refactor: unify filters into one and remove non canonical chromosomes
      The functions negative_filter_panel_regions and positive_filter_panel_regions have been unified into one function: filter_panel. The logic is exactly the same. A new function is created to remove non canonical chromosomes in the positions dataframe (from the bed file). Non canonical chromosomes were giving problems when merging with sample_maf as "chr" was not detected.
    * refactor: apply nanoseq masks individually with cleaner channel management
      The if statement for the nanoseq masks has been divided to handle them individually, in case only one is provided. Also, assigning a value to a channel twive is avoided by adding "else" statements.
    * refactor: add one liner to create filtered maf panels variable
      Taking into account if nanoseq masks were applied.
    * refactor: add one liner to create filtered maf panels variable
      Taking into account if nanoseq masks were applied.
    * minor update in mut preprocessing style
      - fix paths in test_real
      - update order of variables in schema

    ---------

    Co-authored-by: FerriolCalvet <ferriolcalvet@gmail.com>

commit 42c85f6
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Thu Sep 25 12:37:10 2025 +0200

    Update container for HDP signature extraction (#362)

    * update hdp_wrapper container
    * add ignore strategy to compare signatures step
    * add tmp fixes configs

commit 5b9ed08
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Fri Sep 19 21:45:57 2025 +0200

    fix bug in redefinition of panel with subgenic elements (#373)

    * fix bug in redefinition of exons and domains
      - now if a subgenic element is partially covered, it is still included in the expanded file, before it was not
      Missing:
      - documentation
    * add docs
    * fix bug in end coordinate when matching
    * Apply suggestions from code review

    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

    ---------

    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

commit 3d8fe1f
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Wed Sep 17 20:00:41 2025 +0200

    Add globalloc synonymous numbers QC (#370)

    * add globalloc synonymous numbers qc
      - added all the plots and correlation computations of obs. vs estimated numbers of synonymous mutations
    * update omega syn qc
      - working version with plots and tsv outputs

commit 03a69f6
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Tue Sep 16 10:39:30 2025 +0200

    fix bug that outputted empty maf files (#367)

    * fix remove creation of empty MAFs
    * address #337

commit 3fb032d
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Sat Sep 13 11:14:47 2025 +0200

    add minor fix to plotting needles for groups

commit 9adbe74
Author: Ferriol Calvet <38539786+FerriolCalvet@users.noreply.github.com>
Date: Sat Sep 13 11:05:11 2025 +0200

    Allow the option to plot selection and saturation at the level of groups (#366)

    * init plotting groups
    * needle plots and selection working for groups
      - add param to plot only cohort or all custom groups
      - update groups.json generation
      missing:
      - pass site comparison plots & test saturation
    * fix saturation plots working for groups
      - fix domain selection plotting as png not pdf

commit 14640cd
Author: FerriolCalvet <ferriolcalvet@gmail.com>
Date: Mon Jul 28 18:01:31 2025 +0200

    minor updates documentation related
* add profile concatenation and cosine sim plotting -not tested
* fix concat profiles working
  - separated plots for samples and groups
* Squashed commit of the following: commits e7ace44 through 14640cd, as listed in the log above
* minor updates in plotting settings
* fix naming
Initialization of Nextflow tests with nf-test.