From e0f572e9f3deeeeaca5c31a7eeb799b4453e701b Mon Sep 17 00:00:00 2001 From: Daniel Weindl Date: Fri, 18 Mar 2022 15:06:58 +0100 Subject: [PATCH 01/11] Proposal: Different languages for model specification --- doc/_static/petab_schema.yaml | 31 ++++++++--- doc/documentation_data_format.rst | 86 +++++++++++++++++++++++++------ 2 files changed, 92 insertions(+), 25 deletions(-) diff --git a/doc/_static/petab_schema.yaml b/doc/_static/petab_schema.yaml index 107e54fd..f73b277e 100644 --- a/doc/_static/petab_schema.yaml +++ b/doc/_static/petab_schema.yaml @@ -38,13 +38,24 @@ properties: files and optional visualization files. properties: - sbml_files: - type: array - description: List of PEtab SBML files. - - items: - type: string - description: PEtab SBML file name or URL. + models: + type: object + description: One or multiple models + + # the model ID + patternProperties: ^[a-zA-Z_]\w*$ + type: object + properties: + location: + type: string + description: Model file name or URL + language: + type: string + description: | + Model language, e.g., 'sbml', 'cellml', 'bngl', 'pysb' + required: + - location + - language measurement_files: type: array @@ -78,8 +89,12 @@ properties: type: string description: PEtab visualization file name or URL. + mapping_file: + type: string + description: Optional PEtab mapping file name or URL. + required: - - sbml_files + - models - observable_files - measurement_files - condition_files diff --git a/doc/documentation_data_format.rst b/doc/documentation_data_format.rst index 64bf5a39..a91753cd 100644 --- a/doc/documentation_data_format.rst +++ b/doc/documentation_data_format.rst @@ -2,7 +2,7 @@ PEtab data format specification =============================== -Format version: 1 +Format version: 2.0.0 This document explains the PEtab data format. 
@@ -41,12 +41,11 @@ Overview --------- The PEtab data format specifies a parameter estimation problem using a number -of text-based files (`Systems Biology Markup Language (SBML) `_ -and +of text-based files ( `Tab-Separated Values (TSV) `_) (Figure 2), i.e. -- An SBML model [SBML] +- An model - A measurement file to fit the model to [TSV] @@ -67,6 +66,9 @@ and - (optional) A visualization file, which contains specifications how the data and/or simulations should be plotted by the visualization routines [TSV] +- (optional) A mapping file, which allows mapping PEtab entity IDs to entity + IDs in the model, which might not have valid PEtab IDs themselves [TSV] + .. figure:: gfx/petab_files.png :alt: Files constituting a PEtab problem @@ -91,11 +93,11 @@ problem as such. - Fields in "[]" are optional and may be left empty. -SBML model definition ---------------------- - -The model must be specified as valid SBML. There are no further restrictions. +Model definition +---------------- +PEtab 2.0.0 is agnostic of specific model formats. A model file is referenced +in the PEtab problem description (YAML) via its file name or a URL. Condition table --------------- @@ -107,7 +109,7 @@ different experimental conditions). This is specified as a tab-separated value file in the following way: +--------------+------------------+------------------------------------+-----+---------------------------------------+ -| conditionId | [conditionName] | parameterOrSpeciesOrCompartmentId1 | ... | parameterOrSpeciesOrCompartmentId${n} | +| conditionId | [conditionName] | modelEntityId1 | ... | modelEntityId1${n} | +==============+==================+====================================+=====+=======================================+ | STRING | [STRING] | NUMERIC\|STRING | ... 
| NUMERIC\|STRING | +--------------+------------------+------------------------------------+-----+---------------------------------------+ @@ -140,12 +142,13 @@ Detailed field description Condition names are arbitrary strings to describe the given condition. They may be used for reporting or visualization. -- ``${parameterOrSpeciesOrCompartmentId1}`` +- ``${modelEntityId}`` - Further columns may be global parameter IDs, IDs of species or compartments - as defined in the SBML model. Only one column is allowed per ID. + Further columns may be the IDs of model entities that have globally unique + IDs, such as parameters, species or compartments defined in the model. + Only one column is allowed per ID. Values for these condition parameters may be provided either as numeric - values, or as IDs defined in the SBML model, the parameter table or both. + values, or as IDs defined in the model, the parameter table or both. - ``${parameterId}`` @@ -154,11 +157,11 @@ Detailed field description - ``${speciesId}`` If a species ID is provided, it is interpreted as the initial - condition of that species (as amount if `hasOnlySubstanceUnits` is set to `True` - for the respective species, as concentration otherwise) and will override the + condition of that species (as amount if `hasOnlySubstanceUnits` is set to `True` + for the respective species, as concentration otherwise) and will override the initial condition given in the SBML model or given by a preequilibration condition. If ``NaN`` is provided for a condition, the result of the - preequilibration (or initial condition from the SBML model, if + preequilibration (or initial condition from the model, if no preequilibration is defined) is used. - ``${compartmentId}`` @@ -166,6 +169,11 @@ Detailed field description If a compartment ID is provided, it is interpreted as the initial compartment size. + - For all other entities, values are statically replaced at all time points. 
+ For entities that assign values to other entities, such as SBML + `AssignmentRule`s, the value of the target of that rule is statically + replaced at all time points. + Measurement table ----------------- @@ -693,6 +701,50 @@ Detailed field description legend and which defaults to the value in ``datasetId``. +Mapping table +------------- + +Mapping PEtab entity IDs to entity IDs in the model. This optional file may be +used to reference model entities in PEtab files where the ID in the model would +not be a valid identifier in PEtab (e.g., due to containing blanks, dots, or +other special characters). + +The tsv file has two mandatory columns, ``petabEntityId`` and +``modelEntityId``. Additional columns are allowed. + ++---------------+---------------+ +| petabEntityId | modelEntityId | ++===============+===============+ +| STRING | STRING | ++---------------+---------------+ +| e.g. | ... | ++---------------+---------------+ +| reaction1_k1 | reaction1.k1 | ++---------------+---------------+ + + +Detailed field description +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +- ``petabEntityId`` [STRING, NOT NULL] + + A valid PEtab identifier that is not defined in any other part of the PEtab + problem. This identifier may be referenced in condition, measurement, + parameter and observable tables, but cannot be referenced in the model + itself. + +- ``modelEntityId`` [STRING, NOT NULL] + + A globally unique identifier defined in the model. + + For example, in SBML, local parameters may be referenced as + ``$reactionId.$localParameterId``, which are not valid PEtab IDs as they + contain a ``.`` character. Similarly, this table may be used to reference + specific species in a BGNL model which may contain many unsupported + characters such as ``,``, ``(`` or ``.``. However, please note that IDs must + exactly match the species names in the BNGL generated network file and no + pattern matching will be performed. 
+ Extensions ~~~~~~~~~~ @@ -731,7 +783,7 @@ Parameter estimation problems combining multiple models ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Parameter estimation problems can comprise multiple models. For now, PEtab -allows to specify multiple SBML models with corresponding condition and +allows to specify multiple models with corresponding condition and measurement tables, and one joint parameter table. This means that the parameter namespace is global. Therefore, parameters with the same ID in different models will be considered identical. From 01fe4502e9feaf7490a11537420fe09148f54b3e Mon Sep 17 00:00:00 2001 From: Daniel Weindl Date: Wed, 23 Mar 2022 12:29:49 +0100 Subject: [PATCH 02/11] Apply suggestions from code review Co-authored-by: Dilan Pathirana <59329744+dilpath@users.noreply.github.com> --- doc/documentation_data_format.rst | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/doc/documentation_data_format.rst b/doc/documentation_data_format.rst index a91753cd..7cbb3264 100644 --- a/doc/documentation_data_format.rst +++ b/doc/documentation_data_format.rst @@ -45,7 +45,7 @@ of text-based files ( `Tab-Separated Values (TSV) `_) (Figure 2), i.e. -- An model +- A model - A measurement file to fit the model to [TSV] @@ -109,7 +109,7 @@ different experimental conditions). This is specified as a tab-separated value file in the following way: +--------------+------------------+------------------------------------+-----+---------------------------------------+ -| conditionId | [conditionName] | modelEntityId1 | ... | modelEntityId1${n} | +| conditionId | [conditionName] | modelEntityId1 | ... | modelEntityId${n} | +==============+==================+====================================+=====+=======================================+ | STRING | [STRING] | NUMERIC\|STRING | ... 
| NUMERIC\|STRING | +--------------+------------------+------------------------------------+-----+---------------------------------------+ @@ -147,7 +147,7 @@ Detailed field description Further columns may be the IDs of model entities that have globally unique IDs, such as parameters, species or compartments defined in the model. Only one column is allowed per ID. - Values for these condition parameters may be provided either as numeric + Values for these condition entities may be provided either as numeric values, or as IDs defined in the model, the parameter table or both. - ``${parameterId}`` @@ -706,10 +706,10 @@ Mapping table Mapping PEtab entity IDs to entity IDs in the model. This optional file may be used to reference model entities in PEtab files where the ID in the model would -not be a valid identifier in PEtab (e.g., due to containing blanks, dots, or +not be a valid identifier in PEtab (e.g., due to inclusion of blanks, dots, or other special characters). -The tsv file has two mandatory columns, ``petabEntityId`` and +The TSV file has two mandatory columns, ``petabEntityId`` and ``modelEntityId``. Additional columns are allowed. +---------------+---------------+ @@ -740,9 +740,9 @@ Detailed field description For example, in SBML, local parameters may be referenced as ``$reactionId.$localParameterId``, which are not valid PEtab IDs as they contain a ``.`` character. Similarly, this table may be used to reference - specific species in a BGNL model which may contain many unsupported + specific species in a BNGL model that may contain many unsupported characters such as ``,``, ``(`` or ``.``. However, please note that IDs must - exactly match the species names in the BNGL generated network file and no + exactly match the species names in the BNGL-generated network file, and no pattern matching will be performed. 
Extensions @@ -783,7 +783,7 @@ Parameter estimation problems combining multiple models ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Parameter estimation problems can comprise multiple models. For now, PEtab -allows to specify multiple models with corresponding condition and +allows one to specify multiple models with corresponding condition and measurement tables, and one joint parameter table. This means that the parameter namespace is global. Therefore, parameters with the same ID in different models will be considered identical. From a3fced55d7a3f519594e21c49c0331ef689ea295 Mon Sep 17 00:00:00 2001 From: Daniel Weindl Date: Tue, 26 Apr 2022 12:12:59 +0200 Subject: [PATCH 03/11] Update doc/documentation_data_format.rst Co-authored-by: Dilan Pathirana <59329744+dilpath@users.noreply.github.com> --- doc/documentation_data_format.rst | 2 -- 1 file changed, 2 deletions(-) diff --git a/doc/documentation_data_format.rst b/doc/documentation_data_format.rst index 7cbb3264..f02a5f09 100644 --- a/doc/documentation_data_format.rst +++ b/doc/documentation_data_format.rst @@ -717,8 +717,6 @@ The TSV file has two mandatory columns, ``petabEntityId`` and +===============+===============+ | STRING | STRING | +---------------+---------------+ -| e.g. | ... | -+---------------+---------------+ | reaction1_k1 | reaction1.k1 | +---------------+---------------+ From 349c5abb21438a54527b23fa2316a1efc9394bfb Mon Sep 17 00:00:00 2001 From: Daniel Weindl Date: Tue, 26 Apr 2022 12:16:36 +0200 Subject: [PATCH 04/11] model -> model_files --- doc/_static/petab_schema.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/_static/petab_schema.yaml b/doc/_static/petab_schema.yaml index f73b277e..75742142 100644 --- a/doc/_static/petab_schema.yaml +++ b/doc/_static/petab_schema.yaml @@ -38,7 +38,7 @@ properties: files and optional visualization files. 
properties: - models: + model_files: type: object description: One or multiple models @@ -94,7 +94,7 @@ properties: description: Optional PEtab mapping file name or URL. required: - - models + - model_files - observable_files - measurement_files - condition_files From db489a0666f9ddd645901a6129610575ad59b621 Mon Sep 17 00:00:00 2001 From: Daniel Weindl Date: Tue, 26 Apr 2022 12:19:11 +0200 Subject: [PATCH 05/11] mapping table --- doc/documentation_data_format.rst | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/doc/documentation_data_format.rst b/doc/documentation_data_format.rst index f02a5f09..899ed78d 100644 --- a/doc/documentation_data_format.rst +++ b/doc/documentation_data_format.rst @@ -148,7 +148,8 @@ Detailed field description IDs, such as parameters, species or compartments defined in the model. Only one column is allowed per ID. Values for these condition entities may be provided either as numeric - values, or as IDs defined in the model, the parameter table or both. + values, or as IDs defined in the model, the mapping table or the parameter + table. - ``${parameterId}`` From 17c853f5e128df81e7c50fad76bc6624ac4688b5 Mon Sep 17 00:00:00 2001 From: Daniel Weindl Date: Thu, 19 May 2022 10:16:06 +0200 Subject: [PATCH 06/11] Update condition table --- doc/documentation_data_format.rst | 60 ++++++++++++++++--------------- 1 file changed, 31 insertions(+), 29 deletions(-) diff --git a/doc/documentation_data_format.rst b/doc/documentation_data_format.rst index 899ed78d..2eda89c4 100644 --- a/doc/documentation_data_format.rst +++ b/doc/documentation_data_format.rst @@ -145,35 +145,37 @@ Detailed field description - ``${modelEntityId}`` Further columns may be the IDs of model entities that have globally unique - IDs, such as parameters, species or compartments defined in the model. - Only one column is allowed per ID. 
- Values for these condition entities may be provided either as numeric - values, or as IDs defined in the model, the mapping table or the parameter - table. - - - ``${parameterId}`` - - The values will override any parameter values specified in the model. - - - ``${speciesId}`` - - If a species ID is provided, it is interpreted as the initial - condition of that species (as amount if `hasOnlySubstanceUnits` is set to `True` - for the respective species, as concentration otherwise) and will override the - initial condition given in the SBML model or given by a preequilibration - condition. If ``NaN`` is provided for a condition, the result of the - preequilibration (or initial condition from the model, if - no preequilibration is defined) is used. - - - ``${compartmentId}`` - - If a compartment ID is provided, it is interpreted as the initial - compartment size. - - - For all other entities, values are statically replaced at all time points. - For entities that assign values to other entities, such as SBML - `AssignmentRule`s, the value of the target of that rule is statically - replaced at all time points. + IDs, such as parameters, species or compartments defined in the model to set + condition-specific values. Only one column is allowed per ID. + Values for these entities may be provided either as numeric values, or as IDs + of globally unique entity IDs as defined in the model, the mapping table or + the parameter table. + + The value in the condition table either replaces the initial value or the + value at all timepoints based on whether the model entity has a rate law + assigned or not: + + * For model entities that have constant algebraic assignments + (but not necessarily constant values), i.e., that do not have a rate of + change with respect to time assigned and that are not subject to event + assignments, the algebraic assignment is replaced statically at all + timepoints. Examples for such model entities are the targets of SBML + `AssignmentRules`. 
+ + * For all other entities, e.g., those that are assigned by SBML `RateRules`, + only the initial value can be assigned in the condition table. If an + assignment of the rate of change with respect to time or event assignment + is desired, the values of model entities that are used to define rate of + change or event assignments must be assigned in the condition table. + If no such model entities exist, assignment is not possible. + + Any non-``NaN`` value will override the original values of the model, or if + preequilibration was used, they will override the value obtained from + preequilibration. A ``NaN`` value indicates that the original value of the + model is to be used (when used in the preequilibration condition, or in the + simulation condition if no preequilibration is used) or that the result of + preequilibration is to be used (when used in the simulation condition after + preequilibration). Measurement table From 5236cb05726517c8e40a21075f4a379f44fac467 Mon Sep 17 00:00:00 2001 From: Daniel Weindl Date: Mon, 11 Jul 2022 14:33:26 +0200 Subject: [PATCH 07/11] Fixup schema --- doc/_static/petab_schema.yaml | 28 +++++++++++++++------------- 1 file changed, 15 insertions(+), 13 deletions(-) diff --git a/doc/_static/petab_schema.yaml b/doc/_static/petab_schema.yaml index 75742142..0c95e605 100644 --- a/doc/_static/petab_schema.yaml +++ b/doc/_static/petab_schema.yaml @@ -43,19 +43,21 @@ properties: description: One or multiple models # the model ID - patternProperties: ^[a-zA-Z_]\w*$ - type: object - properties: - location: - type: string - description: Model file name or URL - language: - type: string - description: | - Model language, e.g., 'sbml', 'cellml', 'bngl', 'pysb' - required: - - location - - language + patternProperties: + "^[a-zA-Z_]\\w*$": + type: object + properties: + location: + type: string + description: Model file name or URL + language: + type: string + description: | + Model language, e.g., 'sbml', 'cellml', 'bngl', 'pysb' + 
required: + - location + - language + additionalProperties: false measurement_files: type: array From b5c28f910ca0b21a6b58e02805ec27609c7c9509 Mon Sep 17 00:00:00 2001 From: Daniel Weindl Date: Wed, 8 Mar 2023 20:20:08 +0100 Subject: [PATCH 08/11] list of mapping files --- doc/_static/petab_schema.yaml | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/doc/_static/petab_schema.yaml b/doc/_static/petab_schema.yaml index 0c95e605..95316be0 100644 --- a/doc/_static/petab_schema.yaml +++ b/doc/_static/petab_schema.yaml @@ -91,9 +91,13 @@ properties: type: string description: PEtab visualization file name or URL. - mapping_file: + mapping_files: + type: array + description: List of PEtab mapping files. + + items: type: string - description: Optional PEtab mapping file name or URL. + description: PEtab mapping file name or URL. required: - model_files From 5090a9c15c4b9bffa0220e38a00fecb00a522290 Mon Sep 17 00:00:00 2001 From: Daniel Weindl Date: Tue, 2 Jul 2024 13:57:36 +0200 Subject: [PATCH 09/11] fix table --- doc/documentation_data_format.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/documentation_data_format.rst b/doc/documentation_data_format.rst index cb14201b..cf11ede0 100644 --- a/doc/documentation_data_format.rst +++ b/doc/documentation_data_format.rst @@ -109,7 +109,7 @@ different experimental conditions). This is specified as a tab-separated value file in the following way: +--------------+------------------+------------------------------------+-----+---------------------------------------+ -| conditionId | [conditionName] | modelEntityId1 | ... | modelEntityId${n} | +| conditionId | [conditionName] | modelEntityId1 | ... | modelEntityId${n} | +==============+==================+====================================+=====+=======================================+ | STRING | [STRING] | NUMERIC\|STRING | ... 
| NUMERIC\|STRING | +--------------+------------------+------------------------------------+-----+---------------------------------------+ From 673d067395c91179096a736df3044277e39a6b7f Mon Sep 17 00:00:00 2001 From: Daniel Weindl Date: Wed, 3 Jul 2024 17:01:46 +0200 Subject: [PATCH 10/11] Update doc/documentation_data_format.rst --- doc/documentation_data_format.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/documentation_data_format.rst b/doc/documentation_data_format.rst index cf11ede0..90945bd0 100644 --- a/doc/documentation_data_format.rst +++ b/doc/documentation_data_format.rst @@ -751,7 +751,7 @@ Detailed field description - ``modelEntityId`` [STRING, NOT NULL] - A globally unique identifier defined in the model. + A globally unique identifier defined in the model, *that is not a valid PEtab ID*. For example, in SBML, local parameters may be referenced as ``$reactionId.$localParameterId``, which are not valid PEtab IDs as they From a69c01713c7178d7f7d5d6b9afc7804591f994ac Mon Sep 17 00:00:00 2001 From: Daniel Weindl Date: Wed, 3 Jul 2024 17:03:52 +0200 Subject: [PATCH 11/11] .. --- doc/documentation_data_format.rst | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/doc/documentation_data_format.rst b/doc/documentation_data_format.rst index 90945bd0..79e32368 100644 --- a/doc/documentation_data_format.rst +++ b/doc/documentation_data_format.rst @@ -751,7 +751,8 @@ Detailed field description - ``modelEntityId`` [STRING, NOT NULL] - A globally unique identifier defined in the model, *that is not a valid PEtab ID*. + A globally unique identifier defined in the model, + *that is not a valid PEtab ID* (see :ref:`identifiers`). For example, in SBML, local parameters may be referenced as ``$reactionId.$localParameterId``, which are not valid PEtab IDs as they @@ -1126,6 +1127,8 @@ float values are demoted to boolean values. For example, in ``1 + true``, the expression is interpreted as ``true && true = true``. +.. 
_identifiers: + Identifiers -----------
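
Note for reviewers: the mapping-table rules introduced in this series can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the patch and not the API of any PEtab library. The function names `load_mapping` and `resolve` are hypothetical, and the identifier pattern is borrowed from the model-ID pattern in `petab_schema.yaml`, on the assumption that general PEtab identifiers follow the same rule.

```python
import csv
import io
import re

# Assumption: PEtab identifiers follow the same pattern that the schema in
# this series uses for model IDs (letter or underscore, then word characters).
PETAB_ID = re.compile(r"[a-zA-Z_]\w*", re.ASCII)

REQUIRED_COLUMNS = {"petabEntityId", "modelEntityId"}


def load_mapping(tsv_text):
    """Parse a mapping table (TSV text) into {petabEntityId: modelEntityId}.

    Hypothetical helper: extra columns are ignored, since the proposal
    allows additional columns in the mapping table.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing mandatory columns: {sorted(missing)}")

    mapping = {}
    for row in reader:
        petab_id = row["petabEntityId"]
        model_id = row["modelEntityId"]
        # petabEntityId must be a valid PEtab identifier.
        if not PETAB_ID.fullmatch(petab_id):
            raise ValueError(f"not a valid PEtab ID: {petab_id!r}")
        # Both columns are declared NOT NULL.
        if not model_id:
            raise ValueError(f"empty modelEntityId for {petab_id!r}")
        if petab_id in mapping:
            raise ValueError(f"duplicate petabEntityId: {petab_id!r}")
        mapping[petab_id] = model_id
    return mapping


def resolve(entity_id, mapping):
    """Return the model entity ID for a PEtab ID, or the ID itself if unmapped."""
    return mapping.get(entity_id, entity_id)


if __name__ == "__main__":
    table = "petabEntityId\tmodelEntityId\nreaction1_k1\treaction1.k1\n"
    mapping = load_mapping(table)
    print(resolve("reaction1_k1", mapping))
    print(resolve("k2", mapping))
```

The sketch does exact lookups only, in line with the statement that no pattern matching is performed on model entity IDs. It does not check that `modelEntityId` exists in the model or that it is not itself a valid PEtab ID; both checks would need access to the model and are left out here.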