Add new pipeline of DynUNet #132

Merged

22 commits
- 53cb111 Update DynUNet (yiheng-wang-nv)
- d7bff9c Remove unused libraries (yiheng-wang-nv)
- bcb738e Add some cell outputs for reference (yiheng-wang-nv)
- 60ea525 Add py based dynunet pipeline (yiheng-wang-nv)
- a068750 Add comments for the deep supervision changes (yiheng-wang-nv)
- f447eb6 Add new pipeline of DynUNet for Decathlon tasks (yiheng-wang-nv)
- c412791 update train script (yiheng-wang-nv)
- bd79e03 Add inference (yiheng-wang-nv)
- b233e61 Add multigpu support (yiheng-wang-nv)
- cc3717b Add reference for transform (yiheng-wang-nv)
- 5d26183 Fix doc string error (yiheng-wang-nv)
- 0db476f Update transforms (yiheng-wang-nv)
- 862976b Add task 04 fold 0 scores (yiheng-wang-nv)
- b99ea8e set param fold in scripts (yiheng-wang-nv)
- e0897ba Add val results for task04 (yiheng-wang-nv)
- dd72c82 Add scripts for 10 tasks and results in readme (yiheng-wang-nv)
- d4f547b Update modules/dynunet_pipeline/README.md (yiheng-wang-nv)
- e82de1c Update modules/dynunet_pipeline/README.md (yiheng-wang-nv)
- 25ddb92 Update modules/dynunet_pipeline/create_datalist.py (yiheng-wang-nv)
- 60ff681 Update readme and add datalist json files (yiheng-wang-nv)
- 92d4498 Modify nib saver (yiheng-wang-nv)
- 0d21377 Merge branch 'master' into dynunet-new-pipeline (wyli)
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,97 @@ | ||
# Overview
This pipeline is modified from nnU-Net [1][2], which won the "Medical Segmentation Decathlon Challenge 2018" and is open sourced at https://github.com/MIC-DKFZ/nnUNet.

## Data
The source Decathlon datasets can be found at http://medicaldecathlon.com/.

After getting a dataset, please run `create_datalist.py` to create the datalists (please check the command line arguments first). The default seed reproduces the same 5-fold data splits that nnU-Net uses, and the created datalists are placed in `config/`.
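The deterministic split logic can be sketched as follows; this is only an illustration of seeded k-fold assignment (the function name, seed value, and case ids are hypothetical, not taken from `create_datalist.py`):

```python
import random

def make_folds(case_ids, num_folds=5, seed=12345):
    """Assign case ids to folds reproducibly: a fixed seed always
    produces the same shuffle, hence the same 5-fold split."""
    ids = sorted(case_ids)                # deterministic starting order
    random.Random(seed).shuffle(ids)      # seeded, repeatable shuffle
    return {i: ids[i::num_folds] for i in range(num_folds)}

folds = make_folds([f"case_{i:03d}" for i in range(10)])
```

Running the function twice with the same seed yields identical folds, which is the property the default seed relies on.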
My running environment:

- OS: Ubuntu 20.04.1 LTS
- Python: 3.8.5
- PyTorch: 1.8.0

To prevent inconsistency, all json files are already included in `config/`.
## Training
Please run `train.py` for training. Please modify the command line arguments according to your actual situation, e.g. `determinism_flag` for deterministic training and `amp` for automatic mixed precision.
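The flags used throughout `commands/` can be parsed with `argparse`; the sketch below is a hypothetical, reduced version of the argument handling (the real `train.py` defines many more options, and the boolean conversion shown is an assumption):

```python
import argparse

def str2bool(v):
    # the scripts pass booleans as the literal strings "True"/"False"
    return str(v).lower() in ("true", "1", "yes")

parser = argparse.ArgumentParser()
parser.add_argument("-fold", type=int, default=0)
parser.add_argument("-task_id", type=str, default="04")
parser.add_argument("-learning_rate", type=float, default=1e-1)
parser.add_argument("-max_epochs", type=int, default=500)
parser.add_argument("-mode", type=str, default="train", choices=("train", "val"))
parser.add_argument("-determinism_flag", type=str2bool, default=False)
parser.add_argument("-amp", type=str2bool, default=False)

args = parser.parse_args(
    ["-fold", "0", "-learning_rate", "1e-2", "-determinism_flag", "True"]
)
```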
## Validation
Please run `train.py` with the argument `mode` set to `val` for validation.

## Inference
Please run `inference.py` for inference.

## Examples
All training scripts for the 10 tasks are included in `commands/`. For instance:

- `train.sh` is used for training.
- `finetune.sh` is used for finetuning.
- `val.sh` is used for validation.
- `infer.sh` is used for inference.
- If you need to use multiple GPUs, please run the scripts that contain `multi_gpu`.
You can take task 04's scripts as a reference, since for the other tasks only the training parts are included. A task folder that contains `train.sh` needs only 1 GPU for training, while `train_multi_gpu.sh` needs at least 2 GPUs.

The devices I used for training all tasks are shown as follows:

| task | number of GPUs used (Tesla V100 32GB) |
|:----:|:-------------------------------------:|
| 1 | 2 |
| 2 | 1 |
| 3 | 4 |
| 4 | 1 |
| 5 | 1 |
| 6 | 1 |
| 7 | 2 |
| 8 | 2 |
| 9 | 1 |
| 10 | 1 |
I used these scripts to train all 5 folds for all 10 tasks. For the test set, I ensembled by averaging the 5 feature maps (coming from the 5 folds' models) before the `argmax` operation (for task 03, since the feature maps are very large, I instead did majority voting over the 5 final predictions). By submitting the ensembled results to the Decathlon Challenge's leaderboard, I got the following results:
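The two ensembling strategies described above can be sketched as follows (a pure-Python illustration on toy per-voxel class probabilities; the names and shapes are hypothetical):

```python
def argmax(xs):
    return max(range(len(xs)), key=xs.__getitem__)

def ensemble_average(fold_probs):
    """Average the folds' per-class maps first, then take argmax per voxel."""
    n = len(fold_probs)
    result = []
    for per_fold in zip(*fold_probs):                # one voxel across all folds
        mean = [sum(c) / n for c in zip(*per_fold)]  # mean per class
        result.append(argmax(mean))
    return result

def ensemble_vote(fold_labels):
    """Majority vote over the folds' final label maps (used for task 03)."""
    return [max(set(votes), key=votes.count) for votes in zip(*fold_labels)]

# two folds, three voxels, two classes
fold_a = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]]
fold_b = [[0.7, 0.3], [0.7, 0.3], [0.3, 0.7]]
```

Averaging before `argmax` lets a confident fold outvote an uncertain one, which is why it is preferred when the full probability maps fit in memory.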
| | DynUNet class 1 | 2 | 3 | nnU-Net class 1 | 2 | 3 |
|:-------:|:---------------:|:----:|:----:|:--------------:|:----:|:----:|
| task 01 | 0.68 | 0.47 | 0.69 | 0.68 | 0.47 | 0.68 |

| | DynUNet class 1 | nnU-Net class 1 |
|:-------:|:---------------:|:--------------:|
| task 02 | 0.93 | 0.93 |
| task 06 | 0.67 | 0.74 |
| task 09 | 0.96 | 0.97 |
| task 10 | 0.55 | 0.58 |

| | DynUNet class 1 | 2 | nnU-Net class 1 | 2 |
|:-------:|:---------------:|:----:|:--------------:|:----:|
| task 03 | 0.95 | 0.72 | 0.96 | 0.76 |
| task 04 | 0.90 | 0.88 | 0.90 | 0.89 |
| task 05 | 0.71 | 0.87 | 0.77 | 0.90 |
| task 07 | 0.81 | 0.54 | 0.82 | 0.53 |
| task 08 | 0.66 | 0.71 | 0.66 | 0.72 |
Comments:
- The DynUNet results come from the re-implemented `3D_fullres` version in MONAI, without postprocessing.

- The nnU-Net results come from different versions (`3D_fullres` for tasks 01, 02 and 04, `3D_cascade` for task 10, and an ensemble of two versions for the other tasks) and may include postprocessing [1].

- Therefore, the two sets of results may not be fully comparable, and the above tables are just for reference.

- After implementing this repository, I re-trained on task 04 and attached the validation results as follows; there the comparison between DynUNet and nnU-Net is for the single `3D_fullres` version on both sides.
As for task 04, with the default settings in `train.sh` and `finetune.sh`, you can get around the following validation results:

| | 0 | 1 | 2 | 3 | 4 | Mean | nnU-Net val |
|---------|--------|--------|--------|--------|--------|--------|------------|
| class 1 | 0.9007 | 0.8930 | 0.8985 | 0.8979 | 0.9015 | 0.8983 | 0.8975 |
| class 2 | 0.8835 | 0.8774 | 0.8826 | 0.8818 | 0.8828 | 0.8816 | 0.8807 |
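The Mean column is just the average of the five fold scores; as a quick check:

```python
class1 = [0.9007, 0.8930, 0.8985, 0.8979, 0.9015]
class2 = [0.8835, 0.8774, 0.8826, 0.8818, 0.8828]

mean1 = round(sum(class1) / len(class1), 4)  # 0.8983
mean2 = round(sum(class2) / len(class2), 4)  # 0.8816
```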
# References
[1] Isensee F, Jäger P F, Kohl S A A, et al. Automated design of deep learning methods for biomedical image segmentation[J]. arXiv preprint arXiv:1904.08128, 2019.

[2] Isensee F, Petersen J, Klein A, et al. nnU-Net: Self-adapting framework for U-Net-based medical image segmentation[J]. arXiv preprint arXiv:1809.10486, 2018.
modules/dynunet_pipeline/commands/task01/finetune_multi_gpu.sh

```sh
# train step 2, finetune with a small learning rate
# please replace the weight variable with your actual weight

lr=1e-2
fold=0
weight=model.pt

python -m torch.distributed.launch --nproc_per_node=2 --nnodes=1 --node_rank=0 \
    --master_addr="localhost" --master_port=1234 \
    train.py -fold $fold -train_num_workers 4 -interval 10 -num_samples 1 \
    -learning_rate $lr -max_epochs 1000 -task_id 01 -pos_sample_num 1 \
    -expr_name baseline -tta_val True -checkpoint $weight -multi_gpu True
```
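For the multi-GPU scripts, `torch.distributed.launch` starts one worker process per GPU (`--nproc_per_node=2`) and passes each worker its index via the `--local_rank` argument (the default behaviour in PyTorch 1.8). A minimal sketch of how a worker picks up that index, with no actual GPU work:

```python
import argparse

# torch.distributed.launch invokes each worker roughly as:
#   python train.py --local_rank=<worker index> <other args...>
parser = argparse.ArgumentParser()
parser.add_argument("--local_rank", type=int, default=0)
args, _ = parser.parse_known_args(["--local_rank=1"])  # simulated launcher args

# each worker would then pin itself to its own GPU, e.g. cuda:1
device = f"cuda:{args.local_rank}"
```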
modules/dynunet_pipeline/commands/task01/train_multi_gpu.sh

```sh
# train step 1, with a large learning rate
# although max_epochs here is 3000, my results showed that for all 5 folds
# the best epoch is less than 400, so you may manually stop early.

lr=1e-1
fold=0

python -m torch.distributed.launch --nproc_per_node=2 --nnodes=1 --node_rank=0 \
    --master_addr="localhost" --master_port=1234 \
    train.py -fold $fold -train_num_workers 4 -interval 10 -num_samples 1 \
    -learning_rate $lr -max_epochs 3000 -task_id 01 -pos_sample_num 1 \
    -expr_name baseline -tta_val True -multi_gpu True
```
```sh
# train step 2, finetune with a small learning rate
# please replace the weight variable with your actual weight

lr=1e-2
fold=0
weight=model.pt

python train.py -fold $fold -train_num_workers 4 -interval 1 -num_samples 4 \
    -learning_rate $lr -max_epochs 500 -task_id 02 -pos_sample_num 1 \
    -expr_name baseline -tta_val True -checkpoint $weight -determinism_flag True \
    -determinism_seed 0
```
```sh
# train step 1, with a large learning rate

lr=1e-1
fold=0

python train.py -fold $fold -train_num_workers 4 -interval 1 -num_samples 4 \
    -learning_rate $lr -max_epochs 3000 -task_id 02 -pos_sample_num 1 \
    -expr_name baseline -tta_val True -determinism_flag True -determinism_seed 0
```
modules/dynunet_pipeline/commands/task03/finetune_multi_gpu.sh

```sh
# train step 2, finetune with a small learning rate
# please replace the weight variable with your actual weight

lr=1e-2
fold=0
weight=model.pt

python -m torch.distributed.launch --nproc_per_node=2 --nnodes=1 --node_rank=0 \
    --master_addr="localhost" --master_port=1234 \
    train.py -fold $fold -train_num_workers 8 -interval 20 -num_samples 1 \
    -learning_rate $lr -max_epochs 2000 -task_id 03 -pos_sample_num 1 \
    -expr_name baseline -tta_val True -checkpoint $weight -multi_gpu True \
    -eval_overlap 0.5 -sw_batch_size 2 -batch_dice True
```
modules/dynunet_pipeline/commands/task03/train_multi_gpu.sh

```sh
# train step 1, with a large learning rate

lr=2e-2
fold=0

python -m torch.distributed.launch --nproc_per_node=2 --nnodes=1 --node_rank=0 \
    --master_addr="localhost" --master_port=1234 \
    train.py -fold $fold -train_num_workers 8 -interval 20 -num_samples 1 \
    -learning_rate $lr -max_epochs 3000 -task_id 03 -pos_sample_num 1 \
    -expr_name baseline -tta_val True -multi_gpu True -eval_overlap 0.1 \
    -sw_batch_size 2 -batch_dice True
```
```sh
# train step 2, finetune with a small learning rate
# please replace the weight variable with your actual weight

lr=1e-3
fold=0
weight=model.pt

python train.py -fold $fold -train_num_workers 4 -interval 1 -num_samples 1 \
    -learning_rate $lr -max_epochs 50 -task_id 04 -pos_sample_num 1 \
    -expr_name baseline -tta_val True -checkpoint $weight -determinism_flag True \
    -determinism_seed 0
```
modules/dynunet_pipeline/commands/task04/finetune_multi_gpu.sh

```sh
# train step 2, finetune with a small learning rate
# please replace the weight variable with your actual weight

lr=1e-3
fold=0
weight=model.pt

python -m torch.distributed.launch --nproc_per_node=2 --nnodes=1 --node_rank=0 \
    --master_addr="localhost" --master_port=1234 \
    train.py -fold $fold -train_num_workers 4 -interval 1 -num_samples 1 \
    -learning_rate $lr -max_epochs 50 -task_id 04 -pos_sample_num 1 \
    -expr_name baseline -tta_val True -checkpoint $weight -multi_gpu True
```
```sh
# please replace the weight variable with your actual weight

weight=model.pt
fold=0

python inference.py -fold $fold -expr_name baseline -task_id 04 -tta_val True \
    -checkpoint $weight
```
```sh
# please replace the weight variable with your actual weight

weight=model.pt
fold=0

python -m torch.distributed.launch --nproc_per_node=2 --nnodes=1 --node_rank=0 \
    --master_addr="localhost" --master_port=1234 \
    inference.py -fold $fold -expr_name baseline -task_id 04 -tta_val True \
    -checkpoint $weight -multi_gpu True
```
```sh
# train step 1, with a large learning rate

lr=1e-1
fold=0

python train.py -fold $fold -train_num_workers 4 -interval 1 -num_samples 1 \
    -learning_rate $lr -max_epochs 500 -task_id 04 -pos_sample_num 2 \
    -expr_name baseline -tta_val True -determinism_flag True -determinism_seed 0
```
modules/dynunet_pipeline/commands/task04/train_multi_gpu.sh

```sh
# train step 1, with a large learning rate

lr=1e-1
fold=0

python -m torch.distributed.launch --nproc_per_node=2 --nnodes=1 --node_rank=0 \
    --master_addr="localhost" --master_port=1234 \
    train.py -fold $fold -train_num_workers 4 -interval 1 -num_samples 1 \
    -learning_rate $lr -max_epochs 500 -task_id 04 -pos_sample_num 2 \
    -expr_name baseline -tta_val True -multi_gpu True
```
```sh
# please replace the weight variable with your actual weight

weight=model.pt
fold=0

python train.py -fold $fold -expr_name baseline -task_id 04 -tta_val True \
    -checkpoint $weight -mode val
```
```sh
# please replace the weight variable with your actual weight

weight=model.pt
fold=0

python -m torch.distributed.launch --nproc_per_node=2 --nnodes=1 --node_rank=0 \
    --master_addr="localhost" --master_port=1234 \
    train.py -fold $fold -expr_name baseline -task_id 04 -tta_val True \
    -checkpoint $weight -mode val -multi_gpu True
```
```sh
# train step 2, finetune with a small learning rate
# please replace the weight variable with your actual weight

lr=1e-2
fold=0
weight=model.pt

python train.py -fold $fold -train_num_workers 4 -interval 1 -num_samples 4 \
    -learning_rate $lr -max_epochs 1000 -task_id 05 -pos_sample_num 1 \
    -expr_name baseline -tta_val True -checkpoint $weight -determinism_flag True \
    -determinism_seed 0
```
```sh
# train step 1, with a large learning rate

lr=1e-1
fold=0

python train.py -fold $fold -train_num_workers 4 -interval 5 -num_samples 4 \
    -learning_rate $lr -max_epochs 1000 -task_id 05 -pos_sample_num 1 \
    -expr_name baseline -tta_val True -determinism_flag True -determinism_seed 0
```
```sh
# train step 2, finetune with a small learning rate
# please replace the weight variable with your actual weight

lr=1e-3
fold=0
weight=model.pt

python train.py -fold $fold -train_num_workers 4 -interval 5 -num_samples 1 \
    -learning_rate $lr -max_epochs 1000 -task_id 06 -pos_sample_num 1 \
    -expr_name baseline -tta_val True -checkpoint $weight -determinism_flag True \
    -determinism_seed 0 -batch_dice True
```
```sh
# train step 1, with a large learning rate

lr=1e-2
fold=0

python train.py -fold $fold -train_num_workers 4 -interval 10 -num_samples 1 \
    -learning_rate $lr -max_epochs 3000 -task_id 06 -pos_sample_num 1 \
    -expr_name baseline -tta_val True -determinism_flag True -determinism_seed 0 \
    -batch_dice True
```
modules/dynunet_pipeline/commands/task07/finetune_multi_gpu.sh

```sh
# train step 2, finetune with a small learning rate
# please replace the weight variable with your actual weight
# since this task uses a lr scheduler, please set the lr and max epochs
# here according to the step 1 training results. The value of max epochs
# equals 2000 minus the best epoch in step 1.

lr=5e-3
max_epochs=1000
fold=0
weight=model.pt

python -m torch.distributed.launch --nproc_per_node=2 --nnodes=1 --node_rank=0 \
    --master_addr="localhost" --master_port=1234 \
    train.py -fold $fold -train_num_workers 4 -interval 10 -num_samples 1 \
    -learning_rate $lr -max_epochs $max_epochs -task_id 07 -pos_sample_num 1 \
    -expr_name baseline -tta_val True -checkpoint $weight -multi_gpu True \
    -lr_decay True
```
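The comment above fixes the finetuning budget so that step 1 and step 2 together cover 2000 epochs; for example, with a hypothetical best step 1 epoch of 1000:

```python
total_epochs = 2000        # overall budget for this task (steps 1 and 2)
best_epoch_step1 = 1000    # hypothetical: read this from your step 1 logs

finetune_max_epochs = total_epochs - best_epoch_step1  # 1000
```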
modules/dynunet_pipeline/commands/task07/train_multi_gpu.sh

```sh
# train step 1, with a large learning rate

lr=1e-2
fold=0

python -m torch.distributed.launch --nproc_per_node=2 --nnodes=1 --node_rank=0 \
    --master_addr="localhost" --master_port=1234 \
    train.py -fold $fold -train_num_workers 4 -interval 10 -num_samples 1 \
    -learning_rate $lr -max_epochs 2000 -task_id 07 -pos_sample_num 1 \
    -expr_name baseline -tta_val True -multi_gpu True -lr_decay True
```
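`-lr_decay True` enables a learning rate scheduler for this task. nnU-Net uses polynomial ("poly") decay, so a plausible sketch of the schedule is shown below; whether the pipeline uses exactly this formula and exponent is an assumption:

```python
def poly_lr(base_lr, epoch, max_epochs, exponent=0.9):
    """Polynomial decay: starts at base_lr and falls to 0 at max_epochs."""
    return base_lr * (1 - epoch / max_epochs) ** exponent

# with this script's settings: lr=1e-2, max_epochs=2000
lrs = [poly_lr(1e-2, e, 2000) for e in (0, 1000, 1999)]
```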
modules/dynunet_pipeline/commands/task08/finetune_multi_gpu.sh

```sh
# train step 2, finetune with a small learning rate
# please replace the weight variable with your actual weight
# since this task uses a lr scheduler, please set the lr and max epochs
# here according to the step 1 training results. The value of max epochs
# equals 2000 minus the best epoch in step 1.

lr=5e-3
max_epochs=1000
fold=0
weight=model.pt

python -m torch.distributed.launch --nproc_per_node=2 --nnodes=1 --node_rank=0 \
    --master_addr="localhost" --master_port=1234 \
    train.py -fold $fold -train_num_workers 4 -interval 10 -num_samples 1 \
    -learning_rate $lr -max_epochs $max_epochs -task_id 08 -pos_sample_num 1 \
    -expr_name baseline -tta_val True -checkpoint $weight -multi_gpu True \
    -lr_decay True
```
modules/dynunet_pipeline/commands/task08/train_multi_gpu.sh

```sh
# train step 1, with a large learning rate

lr=1e-2
fold=0

python -m torch.distributed.launch --nproc_per_node=2 --nnodes=1 --node_rank=0 \
    --master_addr="localhost" --master_port=1234 \
    train.py -fold $fold -train_num_workers 4 -interval 10 -num_samples 1 \
    -learning_rate $lr -max_epochs 2000 -task_id 08 -pos_sample_num 1 \
    -expr_name baseline -tta_val True -multi_gpu True -lr_decay True
```
```sh
# train step 2, finetune with a small learning rate
# please replace the weight variable with your actual weight

lr=1e-2
fold=0
weight=model.pt

python train.py -fold $fold -train_num_workers 4 -interval 5 -num_samples 3 \
    -learning_rate $lr -max_epochs 1000 -task_id 09 -pos_sample_num 2 \
    -expr_name baseline -tta_val True -checkpoint $weight -determinism_flag True \
    -determinism_seed 0 -lr_decay True -batch_dice True
```
```sh
# train step 1, with a large learning rate

lr=1e-2
fold=0

python train.py -fold $fold -train_num_workers 4 -interval 10 -num_samples 3 \
    -learning_rate $lr -max_epochs 3000 -task_id 09 -pos_sample_num 2 \
    -expr_name baseline -tta_val True -determinism_flag True -determinism_seed 0 \
    -batch_dice True
```