From b8273de28253d6a64e9efa3ddbc4ac1d0c99d387 Mon Sep 17 00:00:00 2001 From: Samuel Farrens Date: Wed, 23 Mar 2022 17:53:53 +0100 Subject: [PATCH 1/2] added md for understanding the api docs --- docs/source/about.md | 2 +- docs/source/basic_execution.md | 4 ++ docs/source/conf.py | 4 +- docs/source/configuration.md | 26 +++++-- docs/source/contributing.md | 6 +- docs/source/installation.md | 18 +++-- docs/source/module_develop.md | 119 ++++++++++++++++++++++++++----- docs/source/module_example.md | 66 +++++++++++------ docs/source/toc.rst | 1 + docs/source/understanding_api.md | 46 ++++++++++++ 10 files changed, 239 insertions(+), 53 deletions(-) create mode 100644 docs/source/understanding_api.md diff --git a/docs/source/about.md b/docs/source/about.md index e9510fd9e..abbeccc97 100644 --- a/docs/source/about.md +++ b/docs/source/about.md @@ -4,7 +4,7 @@ ShapePipe an open-source and modular weak-lensing measurement, analysis and validation pipeline written in Python. The current version of ShapePipe starts with reduced survey images and ends by -providing gaaxy shape measurements along with all of the information required for +providing galaxy shape measurements along with all of the information required for shear calibration. It includes various validation tools and a novel point spread function (PSF) modelling technique. The code has been designed to facilitate the inclusion of new or improved processing steps to adapt to advances made diff --git a/docs/source/basic_execution.md b/docs/source/basic_execution.md index 00d558420..fade00674 100644 --- a/docs/source/basic_execution.md +++ b/docs/source/basic_execution.md @@ -9,6 +9,10 @@ option: shapepipe_run --help ``` +```{warning} +The `shapepipe` environment will need to be built and activated in order to run this script (see [Installation](installation.md)). +``` + The options for defining a pipeline are managed via a [configuration file](configuration.md). 
diff --git a/docs/source/conf.py b/docs/source/conf.py index 09acf71a1..3ca876fdc 100644 --- a/docs/source/conf.py +++ b/docs/source/conf.py @@ -19,8 +19,8 @@ # -- Project information ----------------------------------------------------- project = 'ShapePipe' -copyright = '2018, Samuel Farrens' -author = 'Samuel Farrens' +copyright = '2022, CosmoStat' +author = 'CosmoStat' # The short X.Y version version = '0.0' diff --git a/docs/source/configuration.md b/docs/source/configuration.md index 900fe1a14..11497f283 100644 --- a/docs/source/configuration.md +++ b/docs/source/configuration.md @@ -103,7 +103,8 @@ ShapePipe will look for files of the form `galaxy-00-0.fits`, `galaxy-00-1.fits`, etc. in the directory `/home/username/my_input_dir` and save outputs to `/home/username/my_output_dir`. Note that the `FILE_PATTERN` does not need to be complete, in other words files of the form -`mygalaxy-00-0.fits` would equally be found with the above options. +`mygalaxy-00-0.fits` would equally be found if `CORRECT_FILE_PATTERN = True` +was added to the options above. Conversely, with the options @@ -173,7 +174,7 @@ All ShapePipe modules accept the following options Additional module-specific options can be added using the following structure ```ini -[MODULE_NAME] +[MODULE_NAME_RUNNER] PARAMETER = PARAMETER VALUE ``` @@ -181,16 +182,31 @@ This mechanism can also be used to modify module decorator properties or append additional values to list properties as follows ```ini -[MODULE_NAME] +[MODULE_NAME_RUNNER] ADD_PARAMETER = PARAMETER VALUE ``` +### Multiple Module Runs + If a given module is run more than once, run specific parameter values can be specified as follows ```ini -[MODULE_NAME_RUN_X] +[MODULE_NAME_RUNNER_RUN_X] PARAMETER = PARAMETER VALUE ``` -where ``X`` is an integer greater than or equal to ``1``. +where ``X`` is an integer greater than or equal to ``1``. This feature can be combined with the ``INPUT_DIR`` options. 
For example, for module *B* to access the first of two runs of module *A* you could set up something like the configuration below. + +```ini + +[MODULE_A_RUNNER_RUN_1] +... + +[MODULE_A_RUNNER_RUN_2] +... + +[MODULE_B_RUNNER] +INPUT_DIR = last:module_a_run_1 + +``` diff --git a/docs/source/contributing.md b/docs/source/contributing.md index 073290ed5..99cf85a09 100644 --- a/docs/source/contributing.md +++ b/docs/source/contributing.md @@ -1,11 +1,11 @@ # Contribution Guidelines -ShapePipe is a fully open source project and we welcome contributions. +ShapePipe is a fully open-source project and we welcome contributions. Pleas read our -[contribution guidelines](https://github.com/CosmoStat/shapepipe/CONTRIBUTING.md). +[contribution guidelines](https://github.com/CosmoStat/shapepipe/blob/develop/CONTRIBUTING.md) for details on how to contribute to the development of this package. All contributors are kindly asked to adhere to the -[code of conduct](https://github.com/CosmoStat/shapepipe/CODE_OF_CONDUCT.md) +[code of conduct](https://github.com/CosmoStat/shapepipe/blob/develop/CODE_OF_CONDUCT.md) at all times to ensure a safe and inclusive environment for everyone. diff --git a/docs/source/installation.md b/docs/source/installation.md index ae06943b8..2c9873b49 100644 --- a/docs/source/installation.md +++ b/docs/source/installation.md @@ -22,11 +22,21 @@ The ShapePipe package should first be cloned (or downloaded) from the [GitHub repository](https://github.com/CosmoStat/shapepipe). ```bash -git clone git@github.com:CosmoStat/shapepipe.git +git clone -b --depth 1 git@github.com:CosmoStat/shapepipe.git cd shapepipe ``` -Then the entire ShapePipe environment, including dependencies, can be built +where `` is a +[tagged release](https://github.com/CosmoStat/shapepipe/releases) of ShapePipe +(e.g. `v1.0.0`). It is recommended to use the +[latest release](https://github.com/CosmoStat/shapepipe/releases/latest) +unless you want to reproduce an older set of results. 
+ +```{note} +Developers should simply clone the repository as usual. +``` + +Then, the entire ShapePipe environment, including dependencies, can be built using the `install_shapepipe` script as follows. ```bash @@ -84,10 +94,10 @@ entire ShapePipe environment. ## Installing the ShapePipe Library Only ```{warning} -Note, this method will not include any executable scripts or examples. +Note, this method will not include any system executables or examples. ``` -The ShapePipe library, *i.e.* the core package not including module +The ShapePipe library, i.e. the core package not including module dependencies, can be installed in the following ways. After cloning the repository. diff --git a/docs/source/module_develop.md b/docs/source/module_develop.md index af3321308..dd82d2e2a 100644 --- a/docs/source/module_develop.md +++ b/docs/source/module_develop.md @@ -1,42 +1,127 @@ # Module Development -New modules can be implemented in the pipeline by simply writing a *module runner*. +This page provides details on how new modules can be implemented in ShapePipe. -The basic requirement for a new module runner is a single function decorated with the `module_runner` wrapper that outputs the module `stdout` and `stderr`. *e.g.*: +## Module Package + +Every ShapePipe module requires a *module package*, which is simply a directory +with the module name followed by `_package`, e.g. for a module called +`my_new_module` you would create a new directory called `my_new_module_package` +and put it in `shapepipe/modules`. Inside this directory you will need to +include a Python file (ideally named after your module, e.g. +`my_new_module.py`) and a `__init__.py` file with the following content. ```python -@module_runner() -def example_module(*args) +"""MY NEW MODULE PACKAGE. + +This package contains the module for ``my_new_module``. 
+ +:Author: + +:Parent module: + +:Input: + +:Output: + +Description +=========== + + +Module-specific config file entries +=================================== + +""" + +__all__ = ['my_new_module'] +``` + +You should provide a description of what your module does, what the config file +options are, the inputs and outputs, and any modules this module may depend on. + +The Python file (e.g. `my_new_module.py`) will contain all of the code for +implementing the module operations. + +## Module Runner + +In addition to the module package, new modules will also require a +*module runner*. -# DO SOMETHING +The basic requirement for a new module runner is a single function decorated +with the `module_runner` wrapper that outputs the module `stdout` and +`stderr`. + +```python +from shapepipe.modules.module_decorator import module_runner +from shapepipe.modules.module_name_package.module_name import ... + + +@module_runner(*args) +def example_module(*args): + + # DO SOMETHING + + return stdout, stderr +``` + +In the specific case of a module that executes an executable available on the +system, the module runner should also import the `execute` function. + +```python +from shapepipe.modules.module_decorator import module_runner +from shapepipe.modules.module_name_package.module_name import ... +from shapepipe.pipeline.execute import execute + + +@module_runner(*args) +def example_module(*args): + + # DO SOMETHING + command_line = ... + # Execute command line + stderr, stdout = execute(command_line) + + return stdout, stderr +``` -return stdout, stderr +```{note} +If no `stdout` or `stderr` are provided by the given module, the module +runner should simply return `None, None`. ``` The module runner decorator takes the following keyword arguments: 1. `version` : (`str`) The module version. Default value is `'0.0'`. -2. `input_module` : (`str` or `list`) The name of a preceding module(s) whose output provide(s) the input to this module. Default value is `None`. 
-3. `file_pattern` : (`str` or `list`) The input file pattern(s) to look for. Default value is `''`. -4. `file_ext` : (`str` or `list`) The input file extensions(s) to look for. Default value is `''`. -5. `depends` : (`str` or `list`) The Python package(s) the module depends on. Default value is `[]`. -6. `executes` : (`str` or `list`) The system executable(s) the module implements. Default value is `[]`. -7. `numbering_scheme` : (`str`) The numbering scheme implemented by the module to find input files. -9. `run_method` : (`str`) The method by which the given module should be run. The options are `parallel` and `serial`. Default value is `parallel`. +2. `input_module` : (`str` or `list`) The name of a preceding module(s) + whose output provide(s) the input to this module. +3. `file_pattern` : (`str` or `list`) The input file pattern(s) to look for. +4. `file_ext` : (`str` or `list`) The input file extension(s) to look for. +5. `depends` : (`str` or `list`) The Python package(s) the module depends on. +6. `executes` : (`str` or `list`) The system executable(s) the module + implements. +7. `numbering_scheme` : (`str`) The numbering scheme implemented by the module + to find input files. +8. `run_method` : (`str`) The method by which the given module should be run. + The options are `parallel` and `serial`. Default value is `parallel`. The arguments passed to the module runner are the following: 1. `input_file_list` : The list of input files. 2. `run_dirs` : The run directories for the module output files. -3. `file_number_string` : The number pattern corresponding to the current process. -4. `config` : The config parser instance, which provides access to the configuration file parameter values. Module specific parameters can be passed using the following structure: +3. `file_number_string` : The number pattern corresponding to the current + process. +4. `config` : The config parser instance, which provides access to the + configuration file parameter values. 
Module specific parameters can be + passed using the following structure: ```python parameter_value = config.get(module_config_sec, 'PARAMETER') ``` -5. `module_config_sec` : The name of the configuration file section for the current module. -6. `w_log` : The worker log instance, which can be used to record additional messages in the module output logs using the following structure: +5. `module_config_sec` : The name of the configuration file section for the + current module. +6. `w_log` : The worker log instance, which can be used to record additional + messages in the module output logs using the following structure: ```python w_log.info('MESSAGE') diff --git a/docs/source/module_example.md b/docs/source/module_example.md index a9823455f..4edae36ce 100644 --- a/docs/source/module_example.md +++ b/docs/source/module_example.md @@ -10,12 +10,24 @@ As this module does not implement any system executable, it is not necessary to return `None, None`. ```python +"""PYTHON MODULE EXAMPLE. + +This module defines methods for an example Python module. 
+ +:Author: Samuel Farrens + +""" + +from shapepipe.modules.module_decorator import module_runner +from shapepipe.modules.python_example_package import python_example + + @module_runner( - version='1.0', - file_pattern=['numbers', 'letters'], - file_ext='.txt', - depends='numpy', - run_method='parallel', + version='1.1', + file_pattern=['numbers', 'letters'], + file_ext='.txt', + depends='numpy', + run_method='parallel', ) def python_example_runner( input_file_list, @@ -25,26 +37,26 @@ def python_example_runner( module_config_sec, w_log, ): - """Define The Python Example Runner.""" - # Set output file name - output_file_name = ( - f'{run_dirs["output"]}/pyex_output{file_number_string}.cat' - ) + """Define The Python Example Runner.""" + # Set output file name + output_file_name = ( + f'{run_dirs["output"]}/pyex_output{file_number_string}.cat' + ) - # Retrieve log message from config file - message = config.get(module_config_sec, 'MESSAGE') + # Retrieve log message from config file + message = config.get(module_config_sec, 'MESSAGE') - # Create an instance of the Python example class - py_ex_inst = python_example.PythonExample(0) + # Create an instance of the Python example class + py_ex_inst = python_example.PythonExample(0) - # Read input files - py_ex_inst.read_files(*input_file_list) + # Read input files + py_ex_inst.read_files(*input_file_list) - # Write output files - py_ex_inst.write_file(output_file_name, message) + # Write output files + py_ex_inst.write_file(output_file_name, message) - # Return file content and no stderr - return py_ex_inst.content, None + # Return file content and no stderr + return py_ex_inst.content, None ``` ## Executable Example @@ -52,9 +64,21 @@ def python_example_runner( In this example the module runner call the system executable `head`. This module read input files from the `python_example` module output that match the file pattern `'process'` with file extension `'.cat'`. ```python +"""EXECUTE MODULE EXAMPLE. 
+ +This module defines methods for an example command line execution module. + +:Author: Samuel Farrens + +""" + +from shapepipe.modules.module_decorator import module_runner +from shapepipe.pipeline.execute import execute + + +@module_runner( - version='1.0', input_module='python_example_runner', + version='1.0', file_pattern='pyex_output', file_ext='.cat', executes='head', diff --git a/docs/source/toc.rst b/docs/source/toc.rst index 18d5e9993..da8f06f25 100644 --- a/docs/source/toc.rst +++ b/docs/source/toc.rst @@ -39,6 +39,7 @@ :titlesonly: :caption: API Documentation + understanding_api shapepipe .. toctree:: diff --git a/docs/source/understanding_api.md b/docs/source/understanding_api.md new file mode 100644 index 000000000..bcf8c6d2e --- /dev/null +++ b/docs/source/understanding_api.md @@ -0,0 +1,46 @@ +# Understanding the API Documentation + +This page aims to help ShapePipe users and developers understand the +Application Programming Interface (API) documentation. + +```{note} +If you are already familiar with this type of documentation you can skip this page. +``` + +## What Are API Docs? + +The API documentation is designed to communicate in a clear way what each class +and function in the package does. For example, what inputs they expect, what +outputs they provide, and what the various options do. These pages are +automatically generated from docstrings (i.e. the sections starting and ending with three double quotes `"""`) written in the code. + +## Standard API Docs + +All the classes/functions include a short description of what they do. This is followed by a `Parameters` section containing a bullet point list of the expected input arguments. For each parameter you will see in brackets the expected input type (e.g. `int`, `float`, `list`, etc.) followed by a brief description of what the argument is for. Parameters listed as *optional* in the brackets do not need to be provided and will default to some predefined value. 
+ +```{note} +If an optional argument does not explicitly specify the default parameter value then the user should expect that this means the default will be `None`, `''`, `[]`, etc. depending on the input data type. +``` + +For functions that return objects a `Returns` section will follow the `Parameters` section. This will provide a brief description of what is provided by this function. Following `Returns` you will always find `Return type`, which specifies the data type of the returned object. + +If the function raises an exception under certain conditions you will find a `Raises` section containing a bullet point list of the type of exceptions raised. Each point is followed by a short description of the conditions that will lead to the exception being raised. + +Some objects may contain a `Notes` section providing further details for the class/function. Some objects may also contain a `See Also` section that links to another related object. + +Finally, all objects include `[source]` button that will allow you to view the code implementation for the given class/function. You can switch back to the API docs by clicking the corresponding `[docs]` button. + +## ShapePipe Module Docs + +In addition to the stand API docs, we provide a description of each ShapePipe module defined in the `__init__.py` file for each module package. On these pages you will find a header that specifies the package author, the `Parent module` (i.e. a module that should be run before the module in question), and a brief description of the inputs and outputs. This is followed by a description of the module. Finally, there is a section that defines all of the module-specific [config file options](configuration.md). Each option is listed with the exact name that should be added to the config file, followed by the expected value type. Underneath this is a brief description of what this option does. 
For example, the following option for a module called `my_module` + +**MY_OPTION : *int*** + +would be included in the config file as follows. + +```ini +[MY_MODULE_RUNNER] +MY_OPTION = 1 +``` + +Similarly to the standard API docs, parameters listed as *optional* do not need to be provided and will default to some predefined value. From 06b72a622d3e6bb9e3803a19f32e3d6e9a5645c7 Mon Sep 17 00:00:00 2001 From: Samuel Farrens Date: Thu, 24 Mar 2022 16:51:09 +0100 Subject: [PATCH 2/2] updates following reviwer comments --- docs/source/configuration.md | 2 +- docs/source/understanding_api.md | 50 +++++++++++++++++++++++++------- 2 files changed, 41 insertions(+), 11 deletions(-) diff --git a/docs/source/configuration.md b/docs/source/configuration.md index 11497f283..dd2b195a9 100644 --- a/docs/source/configuration.md +++ b/docs/source/configuration.md @@ -104,7 +104,7 @@ ShapePipe will look for files of the form `galaxy-00-0.fits`, save outputs to `/home/username/my_output_dir`. Note that the `FILE_PATTERN` does not need to be complete, in other words files of the form `mygalaxy-00-0.fits` would equally be found if `CORRECT_FILE_PATTERN = True` -was added to the options above. +were added to the options above. Conversely, with the options diff --git a/docs/source/understanding_api.md b/docs/source/understanding_api.md index bcf8c6d2e..08b6decb0 100644 --- a/docs/source/understanding_api.md +++ b/docs/source/understanding_api.md @@ -4,7 +4,8 @@ This page aims to help ShapePipe users and developers understand the Application Programming Interface (API) documentation. ```{note} -If you are already familiar with this type of documentation you can skip this page. +If you are already familiar with this type of documentation you can skip this +page. ``` ## What Are API Docs? 
@@ -12,27 +13,55 @@ If you are already familiar with this type of documentation you can skip this pa The API documentation is designed to communicate in a clear way what each class and function in the package does. For example, what inputs they expect, what outputs they provide, and what the various options do. These pages are -automatically generated from docstrings (i.e. the sections starting and ending with three double quotes `"""`) written in the code. +automatically generated from docstrings (i.e. the sections starting and ending +with three double quotes `"""`) written in the code. ## Standard API Docs -All the classes/functions include a short description of what they do. This is followed by a `Parameters` section containing a bullet point list of the expected input arguments. For each parameter you will see in brackets the expected input type (e.g. `int`, `float`, `list`, etc.) followed by a brief description of what the argument is for. Parameters listed as *optional* in the brackets do not need to be provided and will default to some predefined value. +All the classes/functions include a short description of what they do. This is +followed by a `Parameters` section containing a bullet point list of the +expected input arguments. For each parameter you will see in brackets the +expected input type (e.g. `int`, `float`, `list`, etc.) followed by a brief +description of what the argument is for. Parameters listed as *optional* in the +brackets do not need to be provided and will default to some predefined value. ```{note} -If an optional argument does not explicitly specify the default parameter value then the user should expect that this means the default will be `None`, `''`, `[]`, etc. depending on the input data type. +If an optional argument does not explicitly specify the default parameter value +then the user should expect that this means the default will be `None`, `''`, +`[]`, etc. depending on the input data type. 
``` -For functions that return objects a `Returns` section will follow the `Parameters` section. This will provide a brief description of what is provided by this function. Following `Returns` you will always find `Return type`, which specifies the data type of the returned object. +For functions that return objects a `Returns` section will follow the +`Parameters` section. This will provide a brief description of what is provided +by this function. Following `Returns` you will always find `Return type`, which +specifies the data type of the returned object. -If the function raises an exception under certain conditions you will find a `Raises` section containing a bullet point list of the type of exceptions raised. Each point is followed by a short description of the conditions that will lead to the exception being raised. +If the function raises an exception under certain conditions you will find a +`Raises` section containing a bullet point list of the type of exceptions +raised. Each point is followed by a short description of the conditions that +will lead to the exception being raised. -Some objects may contain a `Notes` section providing further details for the class/function. Some objects may also contain a `See Also` section that links to another related object. +Some objects may contain a `Notes` section providing further details for the +class/function. Some objects may also contain a `See Also` section that links +to another related object. -Finally, all objects include `[source]` button that will allow you to view the code implementation for the given class/function. You can switch back to the API docs by clicking the corresponding `[docs]` button. +Finally, all objects include a `[source]` button that will allow you to view the +code implementation for the given class/function. You can switch back to the +API docs by clicking the corresponding `[docs]` button. 
## ShapePipe Module Docs -In addition to the stand API docs, we provide a description of each ShapePipe module defined in the `__init__.py` file for each module package. On these pages you will find a header that specifies the package author, the `Parent module` (i.e. a module that should be run before the module in question), and a brief description of the inputs and outputs. This is followed by a description of the module. Finally, there is a section that defines all of the module-specific [config file options](configuration.md). Each option is listed with the exact name that should be added to the config file, followed by the expected value type. Underneath this is a brief description of what this option does. For example, the following option for a module called `my_module` +In addition to the standard API docs, we provide a description of each ShapePipe +module defined in the `__init__.py` file for each module package. On these +pages you will find a header that specifies the package author, the +`Parent module` (i.e. a module that should be run before the module in + question), and a brief description of the inputs and outputs. This is + followed by a description of the module. Finally, there is a section that + defines all of the module-specific [config file options](configuration.md). + Each option is listed with the exact name that should be added to the config + file, followed by the expected value type. Underneath this is a brief + description of what this option does. For example, the following option for + a module called `my_module`: **MY_OPTION : *int*** @@ -43,4 +72,5 @@ would be included in the config file as follows. MY_OPTION = 1 ``` -Similarly to the standard API docs, parameters listed as *optional* do not need to be provided and will default to some predefined value. +Similarly to the standard API docs, parameters listed as *optional* do not need +to be provided and will default to some predefined value.
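
To illustrate how the `MY_OPTION` entry from the `[MY_MODULE_RUNNER]` example, together with an *optional* entry, might be read inside a module runner, here is a minimal sketch. It uses Python's standard `configparser` as a stand-in for the ShapePipe config parser instance, which is an assumption. `MY_OPTIONAL_FLAG` and its default of `False` are hypothetical, chosen only for illustration; they do not come from the patch.

```python
from configparser import ConfigParser

# Config text mirroring the documented example; in a real run this would
# live in the pipeline configuration file.
CONFIG_TEXT = """
[MY_MODULE_RUNNER]
MY_OPTION = 1
"""


def read_my_module_options(config_text):
    """Return (MY_OPTION, MY_OPTIONAL_FLAG) for the example section."""
    # Standard-library parser used as a stand-in (assumption).
    config = ConfigParser()
    config.read_string(config_text)
    section = 'MY_MODULE_RUNNER'

    # Required option: raises NoOptionError if it is missing.
    my_option = config.getint(section, 'MY_OPTION')

    # Optional option (hypothetical name): uses the fallback when absent.
    my_flag = config.getboolean(section, 'MY_OPTIONAL_FLAG', fallback=False)

    return my_option, my_flag


print(read_my_module_options(CONFIG_TEXT))
```

With the config text above the sketch prints `(1, False)`: `MY_OPTION` is taken from the file, while the absent optional entry falls back to its default. The real default for any given option is the one stated in that module's documentation.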