From 6bfda0e981dc204079eb5ac3e81b61827ee62344 Mon Sep 17 00:00:00 2001 From: Daniel Weindl Date: Wed, 26 Mar 2025 14:01:04 +0100 Subject: [PATCH 1/5] v2: Merge observableTransformation and noiseDistribution See also https://github.com/PEtab-dev/PEtab/discussions/618#discussioncomment-12628162 --- doc/v2/documentation_data_format.rst | 74 ++++++++++++++++++---------- 1 file changed, 47 insertions(+), 27 deletions(-) diff --git a/doc/v2/documentation_data_format.rst b/doc/v2/documentation_data_format.rst index a0ffe6b5..255681af 100644 --- a/doc/v2/documentation_data_format.rst +++ b/doc/v2/documentation_data_format.rst @@ -153,6 +153,11 @@ PEtab 2.0.0 is a major update of the PEtab format. The main changes are: the PEtab format. * The admissible values for ``estimate`` in the :ref:`v2_parameters_table` are now ``true`` and ``false`` instead of ``1`` and ``0``. +* The ``observableTransformation`` column of the :ref:`v2_observables_table` + has been combined with the ``noiseDistribution`` column to make its intent + clearer. The ``log10`` transformation has been removed, since this was mostly + relevant for visualization purposes, and the same effect can be achieved by + rescaling the parameters of the respective (natural) log-distributions. .. _v2_model: .. _v2_model_entities: @@ -564,17 +569,17 @@ The observable table has the following columns: *(wrapped for readability)* -+-----+----------------------------+---------------------------------------+-----------------------+ -| ... | [observableTransformation] | noiseFormula | [noiseDistribution] | -+=====+============================+=======================================+=======================+ -| ... | [lin(default)\|log\|log10] | STRING\|NUMBER | [laplace\|normal] | -+-----+----------------------------+---------------------------------------+-----------------------+ -| ... | e.g. | | | -+-----+----------------------------+---------------------------------------+-----------------------+ -| ... 
| lin | noiseParameter1_relativeTotalProtein1 | normal | -+-----+----------------------------+---------------------------------------+-----------------------+ -| ... | ... | ... | ... | -+-----+----------------------------+---------------------------------------+-----------------------+ ++-----+---------------------------------------+-----------------------+ +| ... | noiseFormula | [noiseDistribution] | ++=====+=======================================+=======================+ +| ... | STRING\|NUMBER | *see below* | ++-----+---------------------------------------+-----------------------+ +| ... | | | ++-----+---------------------------------------+-----------------------+ +| ... | noiseParameter1_relativeTotalProtein1 | normal | ++-----+---------------------------------------+-----------------------+ +| ... | ... | ... | ++-----+---------------------------------------+-----------------------+ Detailed field description @@ -603,13 +608,6 @@ Detailed field description which are overridden by ``observableParameters`` in the measurement table (see description there). -- ``observableTransformation`` [STRING, OPTIONAL] - - Transformation of the observable and measurement for computing the objective - function. Must be one of ``lin``, ``log`` or ``log10``. Defaults to ``lin``. - The measurements and model outputs are both assumed to be provided in linear - space. - * ``noiseFormula`` [NUMERIC|STRING] Measurement noise can be specified as a numerical value which will @@ -644,20 +642,42 @@ Detailed field description observable formula contains an override, and a proportional noise model is used, which means the observable formula also appears in the noise formula. -- ``noiseDistribution`` [STRING: 'normal' or 'laplace', OPTIONAL] +* ``noiseDistribution`` [STRING, OPTIONAL] + + Assumed noise distribution for the measurements of the given observable. + + Supported options are: + + * ``normal``: Gaussian noise model with standard deviation as specified in + ``noiseFormula``. 
+ * ``log-normal``: Log-normal noise model with standard deviation as specified + in ``noiseFormula``. I.e., the logarithm of the measurements is normally + distributed. + * ``log10-normal``: Log10-normal noise model with standard deviation as + specified in ``noiseFormula``. I.e., the logarithm to the base 10 of the + measurement is normally distributed. + * ``laplace``: Laplace noise model with scale parameter as specified in + ``noiseFormula``. + * ``log-laplace``: Log-Laplace noise model with scale parameter as specified + in ``noiseFormula``. I.e., the logarithm of the measurements is Laplace + distributed. + * ``log10-laplace``: Log10-Laplace noise model with scale parameter as + specified in ``noiseFormula``. I.e., the logarithm to the base 10 of the + measurement is Laplace distributed. + + The respective probability density functions are given in the + :ref:`noise distributions section `. - Assumed noise distribution for the given measurement. Only normally or - Laplace distributed noise is currently allowed (log-normal and - log-Laplace are obtained by setting ``observableTransformation`` to ``log``, similarly for ``log10``). - Defaults to ``normal``. If ``normal``, the specified ``noiseParameters`` will be - interpreted as standard deviation (*not* variance). If ``Laplace`` ist specified, the specified ``noiseParameter`` will be interpreted as the scale, or diversity, parameter. -.. _noise_distributions: +.. _v2_noise_distributions: Noise distributions ~~~~~~~~~~~~~~~~~~~ -For ``noiseDistribution``, ``normal`` and ``laplace`` are supported. For ``observableTransformation``, ``lin``, ``log`` and ``log10`` are supported. Denote by :math:`y` the simulation, :math:`m` the measurement, and :math:`\sigma` the standard deviation of a normal, or the scale parameter of a laplace model, as given via the ``noiseFormula`` field. Then we have the following effective noise distributions. 
+Denote by :math:`y` the simulation, :math:`m` the measurement, +and :math:`\sigma` the standard deviation of a normal, or the scale parameter +of a laplace model, as given via the ``noiseFormula`` field. +Then we have the following effective noise distributions: - Normal distribution: @@ -1070,7 +1090,7 @@ parameters : \theta_{\text{ML}} = \arg\min_{\theta} \mathcal{L}_{\text{ML}}(\theta) Where :math:`p(\mathcal{D} \mid \mathcal{M}, \theta)` is the likelihood -as described under :ref:`noise_distributions`. +as described under :ref:`v2_noise_distributions`. For MAP estimation, the objective function is the unnormalized negative log-posterior of the data given the model and the parameters: From e2eccd477de009794f3588c6d82b2f45c1542105 Mon Sep 17 00:00:00 2001 From: Daniel Weindl Date: Thu, 27 Mar 2025 16:55:29 +0100 Subject: [PATCH 2/5] drop log10 --- doc/v2/documentation_data_format.rst | 17 ----------------- 1 file changed, 17 deletions(-) diff --git a/doc/v2/documentation_data_format.rst b/doc/v2/documentation_data_format.rst index 255681af..63889b61 100644 --- a/doc/v2/documentation_data_format.rst +++ b/doc/v2/documentation_data_format.rst @@ -653,17 +653,11 @@ Detailed field description * ``log-normal``: Log-normal noise model with standard deviation as specified in ``noiseFormula``. I.e., the logarithm of the measurements is normally distributed. - * ``log10-normal``: Log10-normal noise model with standard deviation as - specified in ``noiseFormula``. I.e., the logarithm to the base 10 of the - measurement is normally distributed. * ``laplace``: Laplace noise model with scale parameter as specified in ``noiseFormula``. * ``log-laplace``: Log-Laplace noise model with scale parameter as specified in ``noiseFormula``. I.e., the logarithm of the measurements is Laplace distributed. - * ``log10-laplace``: Log10-Laplace noise model with scale parameter as - specified in ``noiseFormula``. 
I.e., the logarithm to the base 10 of the - measurement is Laplace distributed. The respective probability density functions are given in the :ref:`noise distributions section `. @@ -689,11 +683,6 @@ Then we have the following effective noise distributions: .. math:: \pi(m|y,\sigma) = \frac{1}{\sqrt{2\pi}\sigma m}\exp\left(-\frac{(\log m - \log y)^2}{2\sigma^2}\right) -- Log10-normal distribution (i.e. log10(m) is normally distributed): - - .. math:: - \pi(m|y,\sigma) = \frac{1}{\sqrt{2\pi}\sigma m \log(10)}\exp\left(-\frac{(\log_{10} m - \log_{10} y)^2}{2\sigma^2}\right) - - Laplace distribution: .. math:: @@ -704,12 +693,6 @@ Then we have the following effective noise distributions: .. math:: \pi(m|y,\sigma) = \frac{1}{2\sigma m}\exp\left(-\frac{|\log m - \log y|}{\sigma}\right) -- Log10-Laplace distribution (i.e. log10(m) is Laplace distributed): - - .. math:: - \pi(m|y,\sigma) = \frac{1}{2\sigma m \log(10)}\exp\left(-\frac{|\log_{10} m - \log_{10} y|}{\sigma}\right) - - The distributions above are for a single data point. For a collection :math:`D=\{m_i\}_i` of data points and corresponding simulations :math:`Y=\{y_i\}_i` and noise parameters :math:`\Sigma=\{\sigma_i\}_i`, the current specification assumes independence, i.e. the full distributions is .. math:: From a2e97a8d16e99763ff165fd2c923a46539ef5a48 Mon Sep 17 00:00:00 2001 From: Daniel Weindl Date: Wed, 23 Apr 2025 09:44:48 +0200 Subject: [PATCH 3/5] .. --- doc/v2/documentation_data_format.rst | 75 +++++++++++++--------------- 1 file changed, 36 insertions(+), 39 deletions(-) diff --git a/doc/v2/documentation_data_format.rst b/doc/v2/documentation_data_format.rst index 356367e5..d9984472 100644 --- a/doc/v2/documentation_data_format.rst +++ b/doc/v2/documentation_data_format.rst @@ -617,6 +617,8 @@ Detailed field description * ``noiseFormula`` [NUMERIC|STRING] + The scale parameter of the noise distribution for the given observable. 
+ Measurement noise can be specified as a numerical value which will default to a Gaussian noise model if not specified differently in ``noiseDistribution`` with standard deviation as provided here. In this case, @@ -652,55 +654,50 @@ Detailed field description * ``noiseDistribution`` [STRING, OPTIONAL] Assumed noise distribution for the measurements of the given observable. - - Supported options are: - - * ``normal``: Gaussian noise model with standard deviation as specified in - ``noiseFormula``. - * ``log-normal``: Log-normal noise model with standard deviation as specified - in ``noiseFormula``. I.e., the logarithm of the measurements is normally - distributed. - * ``laplace``: Laplace noise model with scale parameter as specified in - ``noiseFormula``. - * ``log-laplace``: Log-Laplace noise model with scale parameter as specified - in ``noiseFormula``. I.e., the logarithm of the measurements is Laplace - distributed. - - The respective probability density functions are given in the - :ref:`noise distributions section `. - + The supported :ref:`noise distributions ` and the + respective interpretation of ``noiseFormula`` are given in the table below. .. _v2_noise_distributions: Noise distributions ~~~~~~~~~~~~~~~~~~~ -Denote by :math:`y` the simulation, :math:`m` the measurement, -and :math:`\sigma` the standard deviation of a normal, or the scale parameter -of a laplace model, as given via the ``noiseFormula`` field. +Denote by :math:`m` the measurement, :math:`y` the simulation +(the location parameter of the noise distribution), +and :math:`\sigma` the scale parameter of the noise distribution +as given via the ``noiseFormula`` field (the standard deviation of a normal, +or the scale parameter of a laplace model). Then we have the following effective noise distributions: -- Normal distribution: - - .. math:: - \pi(m|y,\sigma) = \frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(m-y)^2}{2\sigma^2}\right) - -- Log-normal distribution (i.e. 
log(m) is normally distributed): - - .. math:: - \pi(m|y,\sigma) = \frac{1}{\sqrt{2\pi}\sigma m}\exp\left(-\frac{(\log m - \log y)^2}{2\sigma^2}\right) - -- Laplace distribution: - - .. math:: - \pi(m|y,\sigma) = \frac{1}{2\sigma}\exp\left(-\frac{|m-y|}{\sigma}\right) - -- Log-Laplace distribution (i.e. log(m) is Laplace distributed): +.. list-table:: + :header-rows: 1 + :widths: 10 10 80 - .. math:: - \pi(m|y,\sigma) = \frac{1}{2\sigma m}\exp\left(-\frac{|\log m - \log y|}{\sigma}\right) + * - Type + - ``noiseDistribution`` + - Probability density function (PDF) + * - Gaussian distribution + - ``normal`` + - .. math:: + \pi(m|y,\sigma) = \frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(m-y)^2}{2\sigma^2}\right) + * - Log-normal distribution (i.e. log(m) is normally distributed) + - ``log-normal`` + - .. math:: + \pi(m|y,\sigma) = \frac{1}{\sqrt{2\pi}\sigma m}\exp\left(-\frac{(\log m - \log y)^2}{2\sigma^2}\right) + * - Laplace distribution + - ``laplace`` + - .. math:: + \pi(m|y,\sigma) = \frac{1}{2\sigma}\exp\left(-\frac{|m-y|}{\sigma}\right) + * - Log-Laplace distribution (i.e. log(m) is Laplace distributed) + - ``log-laplace`` + - .. math:: + \pi(m|y,\sigma) = \frac{1}{2\sigma m}\exp\left(-\frac{|\log m - \log y|}{\sigma}\right) -The distributions above are for a single data point. For a collection :math:`D=\{m_i\}_i` of data points and corresponding simulations :math:`Y=\{y_i\}_i` and noise parameters :math:`\Sigma=\{\sigma_i\}_i`, the current specification assumes independence, i.e. the full distributions is +The distributions above are for a single data point. +For a collection :math:`D=\{m_i\}_i` of data points and corresponding +simulations :math:`Y=\{y_i\}_i` +and noise parameters :math:`\Sigma=\{\sigma_i\}_i`, +the current specification assumes independence, i.e. the full distributions is .. 
math:: \pi(D|Y,\Sigma) = \prod_i\pi(m_i|y_i,\sigma_i) From 750e10c7e200bf791bcfedb42e4da0495c94d3d6 Mon Sep 17 00:00:00 2001 From: Daniel Weindl Date: Thu, 24 Apr 2025 08:54:26 +0200 Subject: [PATCH 4/5] Apply suggestions from code review Co-authored-by: Dilan Pathirana <59329744+dilpath@users.noreply.github.com> --- doc/v2/documentation_data_format.rst | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/doc/v2/documentation_data_format.rst b/doc/v2/documentation_data_format.rst index d9984472..62b0575e 100644 --- a/doc/v2/documentation_data_format.rst +++ b/doc/v2/documentation_data_format.rst @@ -662,7 +662,7 @@ Detailed field description Noise distributions ~~~~~~~~~~~~~~~~~~~ -Denote by :math:`m` the measurement, :math:`y` the simulation +Denote by :math:`m` the measurement, :math:`y:=\text{observableFormula}` the simulation (the location parameter of the noise distribution), and :math:`\sigma` the scale parameter of the noise distribution as given via the ``noiseFormula`` field (the standard deviation of a normal, @@ -680,7 +680,8 @@ Then we have the following effective noise distributions: - ``normal`` - .. math:: \pi(m|y,\sigma) = \frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(m-y)^2}{2\sigma^2}\right) - * - Log-normal distribution (i.e. log(m) is normally distributed) + * - | Log-normal distribution + | (i.e. log(m) is normally distributed) - ``log-normal`` - .. math:: \pi(m|y,\sigma) = \frac{1}{\sqrt{2\pi}\sigma m}\exp\left(-\frac{(\log m - \log y)^2}{2\sigma^2}\right) @@ -697,7 +698,7 @@ The distributions above are for a single data point. For a collection :math:`D=\{m_i\}_i` of data points and corresponding simulations :math:`Y=\{y_i\}_i` and noise parameters :math:`\Sigma=\{\sigma_i\}_i`, -the current specification assumes independence, i.e. the full distributions is +the current specification assumes independence, i.e. the full distribution is .. 
math:: \pi(D|Y,\Sigma) = \prod_i\pi(m_i|y_i,\sigma_i) From 5af4fea8e39cdb8bc6f32ef6df517d13e70da650 Mon Sep 17 00:00:00 2001 From: Daniel Weindl Date: Thu, 24 Apr 2025 09:00:36 +0200 Subject: [PATCH 5/5] .. --- doc/v2/documentation_data_format.rst | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/doc/v2/documentation_data_format.rst b/doc/v2/documentation_data_format.rst index 62b0575e..d0c14a2a 100644 --- a/doc/v2/documentation_data_format.rst +++ b/doc/v2/documentation_data_format.rst @@ -662,7 +662,8 @@ Detailed field description Noise distributions ~~~~~~~~~~~~~~~~~~~ -Denote by :math:`m` the measurement, :math:`y:=\text{observableFormula}` the simulation +Denote by :math:`m` the measured value, +:math:`y:=\text{observableFormula}` the simulated value (the location parameter of the noise distribution), and :math:`\sigma` the scale parameter of the noise distribution as given via the ``noiseFormula`` field (the standard deviation of a normal, @@ -681,7 +682,7 @@ Then we have the following effective noise distributions: - .. math:: \pi(m|y,\sigma) = \frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(m-y)^2}{2\sigma^2}\right) * - | Log-normal distribution - | (i.e. log(m) is normally distributed) + | (i.e., :math:`\log(m)` is normally distributed) - ``log-normal`` - .. math:: \pi(m|y,\sigma) = \frac{1}{\sqrt{2\pi}\sigma m}\exp\left(-\frac{(\log m - \log y)^2}{2\sigma^2}\right) @@ -689,7 +690,8 @@ Then we have the following effective noise distributions: - ``laplace`` - .. math:: \pi(m|y,\sigma) = \frac{1}{2\sigma}\exp\left(-\frac{|m-y|}{\sigma}\right) - * - Log-Laplace distribution (i.e. log(m) is Laplace distributed) + * - | Log-Laplace distribution + | (i.e., :math:`\log(m)` is Laplace distributed) - ``log-laplace`` - .. math:: \pi(m|y,\sigma) = \frac{1}{2\sigma m}\exp\left(-\frac{|\log m - \log y|}{\sigma}\right)
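
The four densities in the table above, and the independence assumption for a collection of data points, can be sketched in Python as follows. This is a minimal illustration only: the function names ``noise_pdf`` and ``negative_log_likelihood`` are hypothetical and not part of PEtab or any PEtab library. The ``distribution`` argument takes the ``noiseDistribution`` values from the table, and ``sigma`` stands for the value of ``noiseFormula``.

```python
import math


def noise_pdf(m, y, sigma, distribution="normal"):
    """Density pi(m | y, sigma) of a single data point.

    Hypothetical helper for illustration; `distribution` is one of
    "normal", "log-normal", "laplace", "log-laplace".
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")

    if distribution.startswith("log-"):
        # Log-distributions: log(m) follows the base distribution with
        # location log(y); the factor 1/m comes from the change of variables.
        if m <= 0 or y <= 0:
            raise ValueError("log-distributions require m > 0 and y > 0")
        residual = math.log(m) - math.log(y)
        jacobian = 1.0 / m
        base = distribution[len("log-"):]
    else:
        residual = m - y
        jacobian = 1.0
        base = distribution

    if base == "normal":
        # sigma is the standard deviation (not the variance).
        density = math.exp(-residual**2 / (2 * sigma**2)) / (
            math.sqrt(2 * math.pi) * sigma
        )
    elif base == "laplace":
        # sigma is the scale parameter.
        density = math.exp(-abs(residual) / sigma) / (2 * sigma)
    else:
        raise ValueError(f"unsupported noise distribution: {distribution}")

    return jacobian * density


def negative_log_likelihood(measurements, simulations, sigmas,
                            distribution="normal"):
    """-log pi(D | Y, Sigma), assuming independent data points."""
    return -sum(
        math.log(noise_pdf(m, y, s, distribution))
        for m, y, s in zip(measurements, simulations, sigmas)
    )
```

Because the data points are assumed independent, the product of the single-point densities becomes a sum of log-densities, which is what ``negative_log_likelihood`` computes. The sketch takes the logarithm of each density directly, so it is not numerically robust for residuals large enough to make a density underflow to zero.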