From f25631f8636fbc6f5071bc7b0e0843999be95b36 Mon Sep 17 00:00:00 2001 From: Eric Kerfoot Date: Mon, 21 Feb 2022 17:26:11 +0000 Subject: [PATCH 01/10] Adding archive specification document Signed-off-by: Eric Kerfoot --- docs/source/mar_specification.rst | 138 ++++++++++++++++++++++++++++++ 1 file changed, 138 insertions(+) create mode 100644 docs/source/mar_specification.rst diff --git a/docs/source/mar_specification.rst b/docs/source/mar_specification.rst new file mode 100644 index 0000000000..93763cd3cc --- /dev/null +++ b/docs/source/mar_specification.rst @@ -0,0 +1,138 @@ + +=========================== +MONAI Archive Specification +=========================== + +Overview +======== + +This is the specification for the MONAI Archive (MAR) format of portable described deep learning models. The objective of a MAR is to define a packaged network or model which includes the critical information necessary to allow users and other programs to understand how the model is used and for what purpose. A MAR includes the stored weights of a model as a state dictionary and/or a Torchscript object. Additional JSON files are included to store metadata about the model, information for constructing training, inference, and post-processing transform sequences, plain-text description, legal information, and other data the model creator wishes to include. + +This specification defines the directory structure a MAR must have and the necessary files it must contain. Additional files may be included and the directory packaged into a zip file or included as extra files directly in a Torchscript file. + +Directory Structure +=================== + +A MAR package is defined primarily as a directory with a set of specifically named subdirectories containing the model and metadata files. 
The root directory should be named for the model, given as "ModelName", and should contain the following structure: + +:: + ModelName + ┣━ configs + ┃ ┗━ metadata.json + ┣━ models + ┃ ┣━ model.pt + ┃ ┗━ model.ts + ┗━ docs + ┣━ README.md + ┗━ license.txt + + +These files mostly are required to be present with the given names for the directory to define a valid MAR: + +* **metadata.json**: netadata information in JSON format relating to the type of model, definition of input and output tensors, versions of the model and used software, and other information described below. +* **model.pt**: the state dictionary of a saved model, the information to instantiate the model must be found in the metadata file. +* **model.ts**: the Torchscript saved model if the model is compatible with being saved correctly in this format. +* **README.md**: plain-language information on the model, how to use it, author information, etc. in Markdown format. +* **license.txt**: software license attached to the model, can be left blank if no license needed. + +Archive Format +============== + +The MAR directory and its contents can be compressed into a zip file to constitute a single file package. When unzipped into a directory this file will reproduce the above directory structure, and should itself also be named after the model it contains. + +The Torchscript file format is also just a zip file with a specific structure. When creating such an archive with `save_net_with_metadata` a MAR-compliant Torchscript file can be created by including the contents of `metadata.json` as the `meta_values` argument of the function, and other files included as `more_extra_files` entries. These will be stored in a `extras` directory in the zip file and can be retrieved with `load_net_with_metadata` or with any other library/tool that can read zip data. In this format the `model.*` files are obviously not needed by `README.md` and `license.txt` can be added as more extra files. 
+ +metadata.json File +================== + +This file contains the metadata information relating to the model, including what the shape and format of inputs and outputs are, what the meaning of the outputs are, what type of model is present, and other information. The JSON structure is a dictionary containing a defined set of keys with additional user-specified keys. The mandatory keys are as follows: + +* **version**: version of the stored model. +* **monai_version**: version of MONAI the MAR was generated on, later versions expected to work. +* **pytorch_version**: version of Pytorch the MAR was generated on, later versions expected to work. +* **numpy_version**: version of Numpy the MAR was generated on, later versions expected to work. +* **optional_packages_version**: dictionary relating optional package names to their versions, these packages are not needed but are recommended to be isntalled with this stated minimum version. +* **task**: plain-language description of what the model is meant to do. +* **description**: longer form plain-language description of what the model is, what it does, etc. +* **authorship**: state author(s) of the model. +* **copyright**: state model copyright. +* **network_data_format**: defines the format, shape, and meaning of inputs and outputs to the model, contains keys "inputs" and "outputs" relating named inputs/outputs to their format specifiers (defined below). + +Tensor format specifiers are used to define input and output tensors and their meanings, and must be a dictionary containing at least these keys: +* **type**: what sort of data the tensor represents: "image", "label", etc. +* **format**: what format of information is stored: "magnitude", "hounsfield", "kspace", "segmentation", "multiclass", etc. +* **num_channels**: number of channels the tensor has, assumed channel dimension first. 
+* **spatial_shape**: shape of the spatial dimensions of the form "[H]", "[H, W]", or "[H, W, D]" +* **dtype**: data type of tensor, eg. "float32", "int32" +* **value_range**: minimum and maximum values the input data is expected to have of the form "[MIN, MAX]" or "[]" if not known. +* **is_patch_data**: "true" if the data is a patch of an input/output tensor or the entirely of the tensor, "false" otherwise. +* **channel_def**: dictionary relating channel indices to plain-language description of what the channel contains. + +Optional keys: +* **changelog**: dictionary relating previous version names to strings describing the version. +* **intended_use**: what the model is to be used for, ie. what task it accomplishes. +* **data_source**: description of where training/validation can be sourced. +* **data_type**: type of source data used for training/validation. +* **references**: list of published referenced relating to the model. + +A JSON schema for this file can be found at https://github.com/Project-MONAI/MONAI/blob/3049e280f2424962bb2a69261389fcc0b98e0036/monai/apps/mmars/schema/metadata.json + +An example JSON metadata file: + +:: + { + "version": "0.1.0", + "changelog": { + "0.1.0": "complete the model package", + "0.0.1": "initialize the model package structure" + }, + "monai_version": "0.8.0", + "pytorch_version": "1.10.0", + "numpy_version": "1.21.2", + "optional_packages_version": {"nibabel": "3.2.1"}, + "task": "Decathlon spleen segmentation", + "description": "A pre-trained model for volumetric (3D) segmentation of the spleen from CT image", + "authorship": "MONAI team", + "copyright": "Copyright (c) MONAI Consortium", + "data_source": "Task09_Spleen.tar from http://medicaldecathlon.com/", + "data_type": "dicom", + "dataset_dir": "/workspace/data/Task09_Spleen", + "image_classes": "single channel data, intensity scaled to [0, 1]", + "label_classes": "single channel data, 1 is spleen, 0 is everything else", + "pred_classes": "2 channels OneHot data, 
channel 1 is spleen, channel 0 is background", + "eval_metrics": { + "mean_dice": 0.96 + }, + "intended_use": "This is an example, not to be used for diagnostic purposes", + "references": [ + "Xia, Yingda, et al. '3D Semi-Supervised Learning with Uncertainty-Aware Multi-View Co-Training.' arXiv preprint arXiv:1811.12506 (2018). https://arxiv.org/abs/1811.12506.", + "Kerfoot E., Clough J., Oksuz I., Lee J., King A.P., Schnabel J.A. (2019) Left-Ventricle Quantification Using Residual U-Net. In: Pop M. et al. (eds) Statistical Atlases and Computational Models of the Heart. Atrial Segmentation and LV Quantification Challenges. STACOM 2018. Lecture Notes in Computer Science, vol 11395. Springer, Cham. https://doi.org/10.1007/978-3-030-12029-0_40" + ], + "network_data_format":{ + "inputs": { + "image": { + "type": "image", + "format": "magnitude", + "num_channels": 1, + "spatial_shape": [160, 160, 160], + "dtype": "float32", + "value_range": [0, 1], + "is_patch_data": false, + "channel_def": {0: "image"} + } + }, + "outputs":{ + "pred": { + "type": "image", + "format": "segmentation", + "num_channels": 2, + "spatial_shape": [160, 160, 160], + "dtype": "float32", + "value_range": [0, 1], + "is_patch_data": false, + "channel_def": {0: "background", 1: "spleen"} + } + } + } + } + From f0ea2f8240074df090ff040abecf86bf56bff56e Mon Sep 17 00:00:00 2001 From: Eric Kerfoot Date: Mon, 21 Feb 2022 17:28:17 +0000 Subject: [PATCH 02/10] Adding archive specification document Signed-off-by: Eric Kerfoot --- docs/source/mar_specification.rst | 2 -- 1 file changed, 2 deletions(-) diff --git a/docs/source/mar_specification.rst b/docs/source/mar_specification.rst index 93763cd3cc..4a1a6319df 100644 --- a/docs/source/mar_specification.rst +++ b/docs/source/mar_specification.rst @@ -14,7 +14,6 @@ Directory Structure =================== A MAR package is defined primarily as a directory with a set of specifically named subdirectories containing the model and metadata files. 
The root directory should be named for the model, given as "ModelName", and should contain the following structure: - :: ModelName ┣━ configs @@ -78,7 +77,6 @@ Optional keys: A JSON schema for this file can be found at https://github.com/Project-MONAI/MONAI/blob/3049e280f2424962bb2a69261389fcc0b98e0036/monai/apps/mmars/schema/metadata.json An example JSON metadata file: - :: { "version": "0.1.0", From 3e2c78d475f3e657268b9100c254f5be41468cc8 Mon Sep 17 00:00:00 2001 From: Eric Kerfoot Date: Mon, 21 Feb 2022 17:28:49 +0000 Subject: [PATCH 03/10] Adding archive specification document Signed-off-by: Eric Kerfoot --- docs/source/mar_specification.rst | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/source/mar_specification.rst b/docs/source/mar_specification.rst index 4a1a6319df..720adfa631 100644 --- a/docs/source/mar_specification.rst +++ b/docs/source/mar_specification.rst @@ -58,6 +58,7 @@ This file contains the metadata information relating to the model, including wha * **network_data_format**: defines the format, shape, and meaning of inputs and outputs to the model, contains keys "inputs" and "outputs" relating named inputs/outputs to their format specifiers (defined below). Tensor format specifiers are used to define input and output tensors and their meanings, and must be a dictionary containing at least these keys: + * **type**: what sort of data the tensor represents: "image", "label", etc. * **format**: what format of information is stored: "magnitude", "hounsfield", "kspace", "segmentation", "multiclass", etc. * **num_channels**: number of channels the tensor has, assumed channel dimension first. @@ -68,6 +69,7 @@ Tensor format specifiers are used to define input and output tensors and their m * **channel_def**: dictionary relating channel indices to plain-language description of what the channel contains. Optional keys: + * **changelog**: dictionary relating previous version names to strings describing the version. 
* **intended_use**: what the model is to be used for, ie. what task it accomplishes. * **data_source**: description of where training/validation can be sourced. From 0a654801f4e8ae37cbdb8df460e80f8b99478b68 Mon Sep 17 00:00:00 2001 From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Mon, 21 Feb 2022 17:39:00 +0000 Subject: [PATCH 04/10] [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --- docs/source/mar_specification.rst | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/docs/source/mar_specification.rst b/docs/source/mar_specification.rst index 720adfa631..22dcb076c2 100644 --- a/docs/source/mar_specification.rst +++ b/docs/source/mar_specification.rst @@ -6,14 +6,14 @@ MONAI Archive Specification Overview ======== -This is the specification for the MONAI Archive (MAR) format of portable described deep learning models. The objective of a MAR is to define a packaged network or model which includes the critical information necessary to allow users and other programs to understand how the model is used and for what purpose. A MAR includes the stored weights of a model as a state dictionary and/or a Torchscript object. Additional JSON files are included to store metadata about the model, information for constructing training, inference, and post-processing transform sequences, plain-text description, legal information, and other data the model creator wishes to include. +This is the specification for the MONAI Archive (MAR) format of portable described deep learning models. The objective of a MAR is to define a packaged network or model which includes the critical information necessary to allow users and other programs to understand how the model is used and for what purpose. A MAR includes the stored weights of a model as a state dictionary and/or a Torchscript object. 
Additional JSON files are included to store metadata about the model, information for constructing training, inference, and post-processing transform sequences, plain-text description, legal information, and other data the model creator wishes to include. -This specification defines the directory structure a MAR must have and the necessary files it must contain. Additional files may be included and the directory packaged into a zip file or included as extra files directly in a Torchscript file. +This specification defines the directory structure a MAR must have and the necessary files it must contain. Additional files may be included and the directory packaged into a zip file or included as extra files directly in a Torchscript file. Directory Structure =================== -A MAR package is defined primarily as a directory with a set of specifically named subdirectories containing the model and metadata files. The root directory should be named for the model, given as "ModelName", and should contain the following structure: +A MAR package is defined primarily as a directory with a set of specifically named subdirectories containing the model and metadata files. The root directory should be named for the model, given as "ModelName", and should contain the following structure: :: ModelName ┣━ configs @@ -54,7 +54,7 @@ This file contains the metadata information relating to the model, including wha * **task**: plain-language description of what the model is meant to do. * **description**: longer form plain-language description of what the model is, what it does, etc. * **authorship**: state author(s) of the model. -* **copyright**: state model copyright. +* **copyright**: state model copyright. * **network_data_format**: defines the format, shape, and meaning of inputs and outputs to the model, contains keys "inputs" and "outputs" relating named inputs/outputs to their format specifiers (defined below). 
Tensor format specifiers are used to define input and output tensors and their meanings, and must be a dictionary containing at least these keys: @@ -135,4 +135,3 @@ An example JSON metadata file: } } } - From ebbb801e904b4d41856aa6af7a3e3d7dc046e8f9 Mon Sep 17 00:00:00 2001 From: Eric Kerfoot Date: Mon, 21 Feb 2022 17:52:11 +0000 Subject: [PATCH 05/10] Adding archive specification document Signed-off-by: Eric Kerfoot --- docs/source/mar_specification.rst | 77 +++++++++++++++++++++++++++---- 1 file changed, 68 insertions(+), 9 deletions(-) diff --git a/docs/source/mar_specification.rst b/docs/source/mar_specification.rst index 22dcb076c2..3f6508206c 100644 --- a/docs/source/mar_specification.rst +++ b/docs/source/mar_specification.rst @@ -15,15 +15,15 @@ Directory Structure A MAR package is defined primarily as a directory with a set of specifically named subdirectories containing the model and metadata files. The root directory should be named for the model, given as "ModelName", and should contain the following structure: :: - ModelName - ┣━ configs - ┃ ┗━ metadata.json - ┣━ models - ┃ ┣━ model.pt - ┃ ┗━ model.ts - ┗━ docs - ┣━ README.md - ┗━ license.txt + ModelName + ┣━ configs + ┃ ┗━ metadata.json + ┣━ models + ┃ ┣━ model.pt + ┃ ┗━ model.ts + ┗━ docs + ┣━ README.md + ┗━ license.txt These files mostly are required to be present with the given names for the directory to define a valid MAR: @@ -80,6 +80,7 @@ A JSON schema for this file can be found at https://github.com/Project-MONAI/MON An example JSON metadata file: :: +<<<<<<< HEAD { "version": "0.1.0", "changelog": { @@ -135,3 +136,61 @@ An example JSON metadata file: } } } +======= + { + "version": "0.1.0", + "changelog": { + "0.1.0": "complete the model package", + "0.0.1": "initialize the model package structure" + }, + "monai_version": "0.8.0", + "pytorch_version": "1.10.0", + "numpy_version": "1.21.2", + "optional_packages_version": {"nibabel": "3.2.1"}, + "task": "Decathlon spleen segmentation", + 
"description": "A pre-trained model for volumetric (3D) segmentation of the spleen from CT image", + "authorship": "MONAI team", + "copyright": "Copyright (c) MONAI Consortium", + "data_source": "Task09_Spleen.tar from http://medicaldecathlon.com/", + "data_type": "dicom", + "dataset_dir": "/workspace/data/Task09_Spleen", + "image_classes": "single channel data, intensity scaled to [0, 1]", + "label_classes": "single channel data, 1 is spleen, 0 is everything else", + "pred_classes": "2 channels OneHot data, channel 1 is spleen, channel 0 is background", + "eval_metrics": { + "mean_dice": 0.96 + }, + "intended_use": "This is an example, not to be used for diagnostic purposes", + "references": [ + "Xia, Yingda, et al. '3D Semi-Supervised Learning with Uncertainty-Aware Multi-View Co-Training.' arXiv preprint arXiv:1811.12506 (2018). https://arxiv.org/abs/1811.12506.", + "Kerfoot E., Clough J., Oksuz I., Lee J., King A.P., Schnabel J.A. (2019) Left-Ventricle Quantification Using Residual U-Net. In: Pop M. et al. (eds) Statistical Atlases and Computational Models of the Heart. Atrial Segmentation and LV Quantification Challenges. STACOM 2018. Lecture Notes in Computer Science, vol 11395. Springer, Cham. 
https://doi.org/10.1007/978-3-030-12029-0_40" + ], + "network_data_format":{ + "inputs": { + "image": { + "type": "image", + "format": "magnitude", + "num_channels": 1, + "spatial_shape": [160, 160, 160], + "dtype": "float32", + "value_range": [0, 1], + "is_patch_data": false, + "channel_def": {0: "image"} + } + }, + "outputs":{ + "pred": { + "type": "image", + "format": "segmentation", + "num_channels": 2, + "spatial_shape": [160, 160, 160], + "dtype": "float32", + "value_range": [0, 1], + "is_patch_data": false, + "channel_def": {0: "background", 1: "spleen"} + } + } + } + } + +>>>>>>> Adding archive specification document From c974092bdd6022fcce45a4b1d088e4f6598fc64c Mon Sep 17 00:00:00 2001 From: Eric Kerfoot Date: Mon, 21 Feb 2022 17:54:35 +0000 Subject: [PATCH 06/10] Adding archive specification document Signed-off-by: Eric Kerfoot --- docs/source/mar_specification.rst | 59 ------------------------------- 1 file changed, 59 deletions(-) diff --git a/docs/source/mar_specification.rst b/docs/source/mar_specification.rst index 3f6508206c..671ecbc1bb 100644 --- a/docs/source/mar_specification.rst +++ b/docs/source/mar_specification.rst @@ -80,63 +80,6 @@ A JSON schema for this file can be found at https://github.com/Project-MONAI/MON An example JSON metadata file: :: -<<<<<<< HEAD - { - "version": "0.1.0", - "changelog": { - "0.1.0": "complete the model package", - "0.0.1": "initialize the model package structure" - }, - "monai_version": "0.8.0", - "pytorch_version": "1.10.0", - "numpy_version": "1.21.2", - "optional_packages_version": {"nibabel": "3.2.1"}, - "task": "Decathlon spleen segmentation", - "description": "A pre-trained model for volumetric (3D) segmentation of the spleen from CT image", - "authorship": "MONAI team", - "copyright": "Copyright (c) MONAI Consortium", - "data_source": "Task09_Spleen.tar from http://medicaldecathlon.com/", - "data_type": "dicom", - "dataset_dir": "/workspace/data/Task09_Spleen", - "image_classes": "single channel data, 
intensity scaled to [0, 1]", - "label_classes": "single channel data, 1 is spleen, 0 is everything else", - "pred_classes": "2 channels OneHot data, channel 1 is spleen, channel 0 is background", - "eval_metrics": { - "mean_dice": 0.96 - }, - "intended_use": "This is an example, not to be used for diagnostic purposes", - "references": [ - "Xia, Yingda, et al. '3D Semi-Supervised Learning with Uncertainty-Aware Multi-View Co-Training.' arXiv preprint arXiv:1811.12506 (2018). https://arxiv.org/abs/1811.12506.", - "Kerfoot E., Clough J., Oksuz I., Lee J., King A.P., Schnabel J.A. (2019) Left-Ventricle Quantification Using Residual U-Net. In: Pop M. et al. (eds) Statistical Atlases and Computational Models of the Heart. Atrial Segmentation and LV Quantification Challenges. STACOM 2018. Lecture Notes in Computer Science, vol 11395. Springer, Cham. https://doi.org/10.1007/978-3-030-12029-0_40" - ], - "network_data_format":{ - "inputs": { - "image": { - "type": "image", - "format": "magnitude", - "num_channels": 1, - "spatial_shape": [160, 160, 160], - "dtype": "float32", - "value_range": [0, 1], - "is_patch_data": false, - "channel_def": {0: "image"} - } - }, - "outputs":{ - "pred": { - "type": "image", - "format": "segmentation", - "num_channels": 2, - "spatial_shape": [160, 160, 160], - "dtype": "float32", - "value_range": [0, 1], - "is_patch_data": false, - "channel_def": {0: "background", 1: "spleen"} - } - } - } - } -======= { "version": "0.1.0", "changelog": { @@ -192,5 +135,3 @@ An example JSON metadata file: } } } - ->>>>>>> Adding archive specification document From 08225a62cee82dce641d105fa8f6ba513cf34982 Mon Sep 17 00:00:00 2001 From: Eric Kerfoot Date: Mon, 21 Feb 2022 18:10:10 +0000 Subject: [PATCH 07/10] Adding archive specification document Signed-off-by: Eric Kerfoot --- docs/source/mar_specification.rst | 128 +++++++++++++++--------------- 1 file changed, 64 insertions(+), 64 deletions(-) diff --git a/docs/source/mar_specification.rst 
b/docs/source/mar_specification.rst index 671ecbc1bb..072cb9db8c 100644 --- a/docs/source/mar_specification.rst +++ b/docs/source/mar_specification.rst @@ -15,15 +15,15 @@ Directory Structure A MAR package is defined primarily as a directory with a set of specifically named subdirectories containing the model and metadata files. The root directory should be named for the model, given as "ModelName", and should contain the following structure: :: - ModelName - ┣━ configs - ┃ ┗━ metadata.json - ┣━ models - ┃ ┣━ model.pt - ┃ ┗━ model.ts - ┗━ docs - ┣━ README.md - ┗━ license.txt + ModelName + ┣━ configs + ┃ ┗━ metadata.json + ┣━ models + ┃ ┣━ model.pt + ┃ ┗━ model.ts + ┗━ docs + ┣━ README.md + ┗━ license.txt These files mostly are required to be present with the given names for the directory to define a valid MAR: @@ -80,58 +80,58 @@ A JSON schema for this file can be found at https://github.com/Project-MONAI/MON An example JSON metadata file: :: - { - "version": "0.1.0", - "changelog": { - "0.1.0": "complete the model package", - "0.0.1": "initialize the model package structure" - }, - "monai_version": "0.8.0", - "pytorch_version": "1.10.0", - "numpy_version": "1.21.2", - "optional_packages_version": {"nibabel": "3.2.1"}, - "task": "Decathlon spleen segmentation", - "description": "A pre-trained model for volumetric (3D) segmentation of the spleen from CT image", - "authorship": "MONAI team", - "copyright": "Copyright (c) MONAI Consortium", - "data_source": "Task09_Spleen.tar from http://medicaldecathlon.com/", - "data_type": "dicom", - "dataset_dir": "/workspace/data/Task09_Spleen", - "image_classes": "single channel data, intensity scaled to [0, 1]", - "label_classes": "single channel data, 1 is spleen, 0 is everything else", - "pred_classes": "2 channels OneHot data, channel 1 is spleen, channel 0 is background", - "eval_metrics": { - "mean_dice": 0.96 - }, - "intended_use": "This is an example, not to be used for diagnostic purposes", - "references": [ - "Xia, 
Yingda, et al. '3D Semi-Supervised Learning with Uncertainty-Aware Multi-View Co-Training.' arXiv preprint arXiv:1811.12506 (2018). https://arxiv.org/abs/1811.12506.", - "Kerfoot E., Clough J., Oksuz I., Lee J., King A.P., Schnabel J.A. (2019) Left-Ventricle Quantification Using Residual U-Net. In: Pop M. et al. (eds) Statistical Atlases and Computational Models of the Heart. Atrial Segmentation and LV Quantification Challenges. STACOM 2018. Lecture Notes in Computer Science, vol 11395. Springer, Cham. https://doi.org/10.1007/978-3-030-12029-0_40" - ], - "network_data_format":{ - "inputs": { - "image": { - "type": "image", - "format": "magnitude", - "num_channels": 1, - "spatial_shape": [160, 160, 160], - "dtype": "float32", - "value_range": [0, 1], - "is_patch_data": false, - "channel_def": {0: "image"} - } - }, - "outputs":{ - "pred": { - "type": "image", - "format": "segmentation", - "num_channels": 2, - "spatial_shape": [160, 160, 160], - "dtype": "float32", - "value_range": [0, 1], - "is_patch_data": false, - "channel_def": {0: "background", 1: "spleen"} - } - } - } - } + { + "version": "0.1.0", + "changelog": { + "0.1.0": "complete the model package", + "0.0.1": "initialize the model package structure" + }, + "monai_version": "0.8.0", + "pytorch_version": "1.10.0", + "numpy_version": "1.21.2", + "optional_packages_version": {"nibabel": "3.2.1"}, + "task": "Decathlon spleen segmentation", + "description": "A pre-trained model for volumetric (3D) segmentation of the spleen from CT image", + "authorship": "MONAI team", + "copyright": "Copyright (c) MONAI Consortium", + "data_source": "Task09_Spleen.tar from http://medicaldecathlon.com/", + "data_type": "dicom", + "dataset_dir": "/workspace/data/Task09_Spleen", + "image_classes": "single channel data, intensity scaled to [0, 1]", + "label_classes": "single channel data, 1 is spleen, 0 is everything else", + "pred_classes": "2 channels OneHot data, channel 1 is spleen, channel 0 is background", + "eval_metrics": { 
+ "mean_dice": 0.96 + }, + "intended_use": "This is an example, not to be used for diagnostic purposes", + "references": [ + "Xia, Yingda, et al. '3D Semi-Supervised Learning with Uncertainty-Aware Multi-View Co-Training.' arXiv preprint arXiv:1811.12506 (2018). https://arxiv.org/abs/1811.12506.", + "Kerfoot E., Clough J., Oksuz I., Lee J., King A.P., Schnabel J.A. (2019) Left-Ventricle Quantification Using Residual U-Net. In: Pop M. et al. (eds) Statistical Atlases and Computational Models of the Heart. Atrial Segmentation and LV Quantification Challenges. STACOM 2018. Lecture Notes in Computer Science, vol 11395. Springer, Cham. https://doi.org/10.1007/978-3-030-12029-0_40" + ], + "network_data_format":{ + "inputs": { + "image": { + "type": "image", + "format": "magnitude", + "num_channels": 1, + "spatial_shape": [160, 160, 160], + "dtype": "float32", + "value_range": [0, 1], + "is_patch_data": false, + "channel_def": {0: "image"} + } + }, + "outputs":{ + "pred": { + "type": "image", + "format": "segmentation", + "num_channels": 2, + "spatial_shape": [160, 160, 160], + "dtype": "float32", + "value_range": [0, 1], + "is_patch_data": false, + "channel_def": {0: "background", 1: "spleen"} + } + } + } + } From d7660ef0929b74728585434dd6714e83406232e6 Mon Sep 17 00:00:00 2001 From: Eric Kerfoot Date: Wed, 9 Mar 2022 19:11:47 +0000 Subject: [PATCH 08/10] Updated specification Signed-off-by: Eric Kerfoot --- ...specification.rst => mb_specification.rst} | 46 +++++++++++-------- 1 file changed, 26 insertions(+), 20 deletions(-) rename docs/source/{mar_specification.rst => mb_specification.rst} (62%) diff --git a/docs/source/mar_specification.rst b/docs/source/mb_specification.rst similarity index 62% rename from docs/source/mar_specification.rst rename to docs/source/mb_specification.rst index 072cb9db8c..a79678f7b4 100644 --- a/docs/source/mar_specification.rst +++ b/docs/source/mb_specification.rst @@ -1,20 +1,22 @@ -=========================== -MONAI Archive 
Specification -=========================== +========================== +MONAI Bundle Specification +========================== Overview ======== -This is the specification for the MONAI Archive (MAR) format of portable described deep learning models. The objective of a MAR is to define a packaged network or model which includes the critical information necessary to allow users and other programs to understand how the model is used and for what purpose. A MAR includes the stored weights of a model as a state dictionary and/or a Torchscript object. Additional JSON files are included to store metadata about the model, information for constructing training, inference, and post-processing transform sequences, plain-text description, legal information, and other data the model creator wishes to include. +This is the specification for the MONAI Bundle (MB) format of portable described deep learning models. The objective of a MB is to define a packaged network or model which includes the critical information necessary to allow users and programs to understand how the model is used and for what purpose. A bundle includes the stored weights of a model as a pickled state dictionary and/or a Torchscript object. Additional JSON files are included to store metadata about the model, information for constructing training, inference, and post-processing transform sequences, plain-text description, legal information, and other data the model creator wishes to include. -This specification defines the directory structure a MAR must have and the necessary files it must contain. Additional files may be included and the directory packaged into a zip file or included as extra files directly in a Torchscript file. +This specification defines the directory structure a bundle must have and the necessary files it must contain. Additional files may be included and the directory packaged into a zip file or included as extra files directly in a Torchscript file. 
Directory Structure =================== -A MAR package is defined primarily as a directory with a set of specifically named subdirectories containing the model and metadata files. The root directory should be named for the model, given as "ModelName", and should contain the following structure: +A MONAI Bundle is defined primarily as a directory with a set of specifically named subdirectories containing the model and metadata files. The root directory should be named for the model, given as "ModelName" in this example, and should contain the following structure: + :: + ModelName ┣━ configs ┃ ┗━ metadata.json @@ -26,9 +28,9 @@ A MAR package is defined primarily as a directory with a set of specifically nam ┗━ license.txt -These files mostly are required to be present with the given names for the directory to define a valid MAR: +These files mostly are required to be present with the given names for the directory to define a valid bundle: -* **metadata.json**: netadata information in JSON format relating to the type of model, definition of input and output tensors, versions of the model and used software, and other information described below. +* **metadata.json**: metadata information in JSON format relating to the type of model, definition of input and output tensors, versions of the model and used software, and other information described below. * **model.pt**: the state dictionary of a saved model, the information to instantiate the model must be found in the metadata file. * **model.ts**: the Torchscript saved model if the model is compatible with being saved correctly in this format. * **README.md**: plain-language information on the model, how to use it, author information, etc. in Markdown format. @@ -37,20 +39,20 @@ Archive Format ============== -The MAR directory and its contents can be compressed into a zip file to constitute a single file package. 
When unzipped into a directory this file will reproduce the above directory structure, and should itself also be named after the model it contains. +The bundle directory and its contents can be compressed into a zip file to constitute a single file package. When unzipped into a directory this file will reproduce the above directory structure, and should itself also be named after the model it contains. -The Torchscript file format is also just a zip file with a specific structure. When creating such an archive with `save_net_with_metadata` a MAR-compliant Torchscript file can be created by including the contents of `metadata.json` as the `meta_values` argument of the function, and other files included as `more_extra_files` entries. These will be stored in a `extras` directory in the zip file and can be retrieved with `load_net_with_metadata` or with any other library/tool that can read zip data. In this format the `model.*` files are obviously not needed by `README.md` and `license.txt` can be added as more extra files. +The Torchscript file format is also just a zip file with a specific structure. When creating such an archive with `save_net_with_metadata` a MB-compliant Torchscript file can be created by including the contents of `metadata.json` as the `meta_values` argument of the function, and other files included as `more_extra_files` entries. These will be stored in an `extras` directory in the zip file and can be retrieved with `load_net_with_metadata` or with any other library/tool that can read zip data. In this format the `model.*` files are obviously not needed but `README.md` and `license.txt` can be added as more extra files. metadata.json File ================== This file contains the metadata information relating to the model, including what the shape and format of inputs and outputs are, what the meaning of the outputs is, what type of model is present, and other information.
The JSON structure is a dictionary containing a defined set of keys with additional user-specified keys. The mandatory keys are as follows: -* **version**: version of the stored model. -* **monai_version**: version of MONAI the MAR was generated on, later versions expected to work. -* **pytorch_version**: version of Pytorch the MAR was generated on, later versions expected to work. -* **numpy_version**: version of Numpy the MAR was generated on, later versions expected to work. -* **optional_packages_version**: dictionary relating optional package names to their versions, these packages are not needed but are recommended to be isntalled with this stated minimum version. +* **version**: version of the stored model, this allows multiple versions of the same model to be differentiated. +* **monai_version**: version of MONAI the bundle was generated on, later versions expected to work. +* **pytorch_version**: version of Pytorch the bundle was generated on, later versions expected to work. +* **numpy_version**: version of Numpy the bundle was generated on, later versions expected to work. +* **optional_packages_version**: dictionary relating optional package names to their versions, these packages are not needed but are recommended to be installed with this stated minimum version. * **task**: plain-language description of what the model is meant to do. * **description**: longer form plain-language description of what the model is, what it does, etc. * **authorship**: state author(s) of the model. @@ -61,12 +63,12 @@ Tensor format specifiers are used to define input and output tensors and their m * **type**: what sort of data the tensor represents: "image", "label", etc. * **format**: what format of information is stored: "magnitude", "hounsfield", "kspace", "segmentation", "multiclass", etc. -* **num_channels**: number of channels the tensor has, assumed channel dimension first. 
-* **spatial_shape**: shape of the spatial dimensions of the form "[H]", "[H, W]", or "[H, W, D]" +* **num_channels**: number of channels the tensor has, assumed channel dimension first +* **spatial_shape**: shape of the spatial dimensions of the form "[H]", "[H, W]", or "[H, W, D]", see below for possible values of H, W, and D * **dtype**: data type of tensor, e.g. "float32", "int32" -* **value_range**: minimum and maximum values the input data is expected to have of the form "[MIN, MAX]" or "[]" if not known. -* **is_patch_data**: "true" if the data is a patch of an input/output tensor or the entirely of the tensor, "false" otherwise. -* **channel_def**: dictionary relating channel indices to plain-language description of what the channel contains. +* **value_range**: minimum and maximum values the input data is expected to have of the form "[MIN, MAX]" or "[]" if not known +* **is_patch_data**: "true" if the data is a patch of an input/output tensor or the entirety of the tensor, "false" otherwise +* **channel_def**: dictionary relating channel indices to plain-language description of what the channel contains Optional keys: @@ -76,10 +78,14 @@ Optional keys: * **data_type**: type of source data used for training/validation. * **references**: list of published references relating to the model. +Spatial shape definition can be complex for models accepting inputs of varying shapes, especially if there are specific conditions on what those shapes can be. Shapes are specified as lists of either positive integers for fixed sizes or strings containing expressions defining the condition a size depends on. This can be "*" to mean any size, or use an expression with Python mathematical operators and one character variables to represent dependence on an unknown quantity. For example, "2**n" represents a size which must be a power of 2, "2**n*m" must be a multiple of a power of 2.
Variables are shared between dimension expressions, so a spatial shape of `["2**n", "2**n"]` states that the dimensions must be the same powers of 2 given by `n`. + A JSON schema for this file can be found at https://github.com/Project-MONAI/MONAI/blob/3049e280f2424962bb2a69261389fcc0b98e0036/monai/apps/mmars/schema/metadata.json An example JSON metadata file: + :: + { "version": "0.1.0", "changelog": { From 83277a5c58818212fb82165941f628898c176ac8 Mon Sep 17 00:00:00 2001 From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Wed, 9 Mar 2022 19:12:23 +0000 Subject: [PATCH 09/10] [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --- docs/source/mb_specification.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/source/mb_specification.rst b/docs/source/mb_specification.rst index a79678f7b4..53976f70db 100644 --- a/docs/source/mb_specification.rst +++ b/docs/source/mb_specification.rst @@ -16,7 +16,7 @@ Directory Structure A MONAI Bundle is defined primarily as a directory with a set of specifically named subdirectories containing the model and metadata files. The root directory should be named for the model, given as "ModelName" in this example, and should contain the following structure: :: - + ModelName ┣━ configs ┃ ┗━ metadata.json @@ -48,7 +48,7 @@ metadata.json File This file contains the metadata information relating to the model, including what the shape and format of inputs and outputs are, what the meaning of the outputs is, what type of model is present, and other information. The JSON structure is a dictionary containing a defined set of keys with additional user-specified keys. The mandatory keys are as follows: -* **version**: version of the stored model, this allows multiple versions of the same model to be differentiated. +* **version**: version of the stored model, this allows multiple versions of the same model to be differentiated.
* **monai_version**: version of MONAI the bundle was generated on, later versions expected to work. * **pytorch_version**: version of Pytorch the bundle was generated on, later versions expected to work. * **numpy_version**: version of Numpy the bundle was generated on, later versions expected to work. @@ -78,14 +78,14 @@ Optional keys: * **data_type**: type of source data used for training/validation. * **references**: list of published references relating to the model. -Spatial shape definition can be complex for models accepting inputs of varying shapes, especially if there are specific conditions on what those shapes can be. Shapes are specified as lists of either positive integers for fixed sizes or strings containing expressions defining the condition a size depends on. This can be "*" to mean any size, or use an expression with Python mathematical operators and one character variables to represent dependence on an unknown quantity. For example, "2**n" represents a size which must be a power of 2, "2**n*m" must be a multiple of a power of 2. Variables are shared between dimension expressions, so a spatial shape of `["2**n", "2**n"]` states that the dimensions must be the same powers of 2 given by `n`. +Spatial shape definition can be complex for models accepting inputs of varying shapes, especially if there are specific conditions on what those shapes can be. Shapes are specified as lists of either positive integers for fixed sizes or strings containing expressions defining the condition a size depends on. This can be "*" to mean any size, or use an expression with Python mathematical operators and one character variables to represent dependence on an unknown quantity. For example, "2**n" represents a size which must be a power of 2, "2**n*m" must be a multiple of a power of 2. Variables are shared between dimension expressions, so a spatial shape of `["2**n", "2**n"]` states that the dimensions must be the same powers of 2 given by `n`.
A JSON schema for this file can be found at https://github.com/Project-MONAI/MONAI/blob/3049e280f2424962bb2a69261389fcc0b98e0036/monai/apps/mmars/schema/metadata.json An example JSON metadata file: :: - + { "version": "0.1.0", "changelog": { From a3787a7ac89ac0ba259985c820b4e181ea77dee2 Mon Sep 17 00:00:00 2001 From: Eric Kerfoot Date: Wed, 9 Mar 2022 19:29:28 +0000 Subject: [PATCH 10/10] Updated specification Signed-off-by: Eric Kerfoot --- docs/source/index.rst | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/docs/source/index.rst b/docs/source/index.rst index 76ba003c8d..1a4263db0d 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -66,6 +66,11 @@ Technical documentation is available at `docs.monai.io `_ contrib +.. toctree:: + :maxdepth: 1 + :caption: Specifications + + mb_specification Links -----
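
The spatial shape rules added in this series (fixed integers, "*" for any size, and expressions with shared one-character variables) could be checked mechanically. The following is only a minimal sketch and not part of MONAI: `shape_matches`, `_evaluate`, and `max_value` are hypothetical names, the set of permitted operators is an assumption, and variables are assumed to range over positive integers up to a small bound because the specification does not state their range.

```python
import ast
import itertools
import operator

# Operators permitted in a shape expression. This set is an assumption; the
# specification only says "Python mathematical operators".
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Pow: operator.pow,
}


def _evaluate(node, env):
    """Evaluate a parsed expression built from integers, variables, and _OPS."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body, env)
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return node.value
    if isinstance(node, ast.Name):
        return env[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, env)
        right = _evaluate(node.right, env)
        return _OPS[type(node.op)](left, right)
    raise ValueError("unsupported element in shape expression")


def shape_matches(spec, shape, max_value=16):
    """Return True if `shape` satisfies `spec` for some variable assignment.

    Variables are searched by brute force over 1..max_value (an assumed range).
    """
    if len(spec) != len(shape):
        return False
    constraints = []
    for dim_spec, size in zip(spec, shape):
        if isinstance(dim_spec, int):
            # Fixed sizes must match exactly.
            if dim_spec != size:
                return False
        elif dim_spec != "*":
            # "*" accepts any size; anything else is an expression to solve.
            constraints.append((ast.parse(dim_spec, mode="eval"), size))
    # Variables are shared between dimensions, so collect them across all expressions.
    names = sorted(
        {n.id for tree, _ in constraints for n in ast.walk(tree) if isinstance(n, ast.Name)}
    )
    for values in itertools.product(range(1, max_value + 1), repeat=len(names)):
        env = dict(zip(names, values))
        if all(_evaluate(tree, env) == size for tree, size in constraints):
            return True
    return False
```

Under these assumptions, `shape_matches(["2**n", "2**n"], [64, 64])` holds with `n = 6`, while `[64, 32]` is rejected because no single `n` satisfies both dimensions. The expressions are walked through `ast` rather than passed to `eval` so that a metadata file cannot run arbitrary code.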