From c35f3a04721bf0b9b474bd4df5aa8e8da863ad32 Mon Sep 17 00:00:00 2001
From: juacrumar <juacrumar@lairen.eu>
Date: Thu, 30 Mar 2023 17:40:58 +0200
Subject: [PATCH 1/4] squash first draft of commondata documentation as
 starting point

first draft of the documented new commondata format

add definition of the version key

add explanation for the variants and the theory

Apply suggestions from code review

Co-authored-by: Felix Hekhorn <felixhekhorn@users.noreply.github.com>

Update new-commondata.rst

update docs

Update doc/sphinx/source/data/new-commondata.rst

Update doc/sphinx/source/data/new-commondata.rst

Update doc/sphinx/source/data/new-commondata.rst

Update new-commondata.rst

update docs with the definition of the old:new mapping

Update doc/sphinx/source/data/new-commondata.rst
---
 ...ntion.md => dataset-naming-convention.rst} |  55 ++--
 doc/sphinx/source/data/new-commondata.rst     | 234 ++++++++++++++++++
 2 files changed, 261 insertions(+), 28 deletions(-)
 rename doc/sphinx/source/data/{dataset-naming-convention.md => dataset-naming-convention.rst} (55%)
 create mode 100644 doc/sphinx/source/data/new-commondata.rst
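
Note for reviewers: a minimal Python sketch of the two naming rules documented in
new-commondata.rst. The folder name is recovered by splitting the dataset name on
the last "_", and an old name maps either to a new name or to a dataset/variant
pair. The function names and the example dataset names are hypothetical and are
not part of validphys.

```python
def split_dataset_name(name):
    """Split '<setname>_<observable>' on the last underscore."""
    setname, sep, observable = name.rpartition("_")
    if not sep or not setname or not observable:
        raise ValueError(f"{name!r} is not of the form <setname>_<observable>")
    return setname, observable


def resolve_old_name(name, mapping):
    """Translate an old dataset name into (new_name, variant).

    `mapping` mirrors the documented dataset_names.yml layout: each value is
    either a plain string or a dict with 'dataset' and an optional 'variant'.
    Names that are not in the mapping are returned unchanged.
    """
    entry = mapping.get(name, name)
    if isinstance(entry, str):
        return entry, None
    return entry["dataset"], entry.get("variant")


# Hypothetical names, for illustration only.
mapping = {
    "OLD_A": "EXP_PROC_13TEV_OBS",
    "OLD_B": {"dataset": "EXP_PROC_8TEV_EXTRA_OBS", "variant": "legacy"},
}
new_name, variant = resolve_old_name("OLD_B", mapping)
folder, observable = split_dataset_name(new_name)
print(folder, observable, variant)
```

Splitting on the last underscore only works because observable names contain no
underscore of their own, which is why the sketch rejects names without one.
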
diff --git a/doc/sphinx/source/data/dataset-naming-convention.md b/doc/sphinx/source/data/dataset-naming-convention.rst
similarity index 55%
rename from doc/sphinx/source/data/dataset-naming-convention.md
rename to doc/sphinx/source/data/dataset-naming-convention.rst
index 38cd86f5e3..daed57803d 100644
--- a/doc/sphinx/source/data/dataset-naming-convention.md
+++ b/doc/sphinx/source/data/dataset-naming-convention.rst
@@ -3,58 +3,57 @@ NNPDF's dataset naming convention
 =================================
 
 Each dataset implemented in NNPDF must have a unique name, which is a string
-constructed following this [Backus–Naur form]:
+constructed following this [Backus–Naur form]::
 
-```
-<valid dataset name> ::= <experiment> "_" <process>
-                       | <experiment> "_" <process> "_" <energy>
-                       | <experiment> "_" <process> "_" <variant>
-                       | <experiment> "_" <process> "_" <energy> "_" <variant>
+  <valid dataset name> ::= <experiment> "_" <process>
+                        | <experiment> "_" <process> "_" <energy>
+                        | <experiment> "_" <process> "_" <variant>
+                        | <experiment> "_" <process> "_" <energy> "_" <variant>
 
-<experiment> ::= "ATLAS" | "BCDMS" | "CHORUS" | "CMS" | "E605" | "E866"
-               | "E906" | "EMC" | "HERA" | "LHCB" | "NMC" | "NNPDF" | "NUTEV"
+  <experiment> ::= "ATLAS" | "BCDMS" | "CHORUS" | "CMS" | "E605" | "E866"
+                | "E906" | "EMC" | "HERA" | "LHCB" | "NMC" | "NNPDF" | "NUTEV"
 
-<process> ::= "1JET" | "2JET" | "CC" | "DY" | "H" | "HVBF" | "INTEG" | "NC"
-            | "POS" | "TTB" | "WM" | "WMWP" | "WP" | "WPZ" | "ZPT"
+  <process> ::= "1JET" | "2JET" | "CC" | "DY" | "H" | "HVBF" | "INTEG" | "NC"
+              | "POS" | "TTB" | "WM" | "WMWP" | "WP" | "WPZ" | "ZPT"
 
-<integer> ::= TODO
+  <integer> ::= TODO
 
-<string> ::= TODO
+  <string> ::= TODO
 
-<energy> ::= <integer> "GEV" | <integer> "TEV"
+  <energy> ::= <integer> "GEV" | <integer> "TEV"
 
-<variant> ::= <string>
-            | <string> "_" <string>
-            | <string> "_" <string> "_" <string>
-            | <string> "_" <string> "_" <string> "_" <string>
+  <variant> ::= <string>
+              | <string> "_" <string>
+              | <string> "_" <string> "_" <string>
+              | <string> "_" <string> "_" <string> "_" <string>
 
-```
 
 Experiments
 ===========
 
-- [`ATLAS`](https://home.cern/science/experiments/atlas): A Large Toroidal
+- `ATLAS <https://home.cern/science/experiments/atlas>`_: A Large Toroidal
   Aparatus
 - BCDMS: TODO
 - CHORUS: TODO
-- [`CMS`](https://home.cern/science/experiments/cms): Compact Muon Solenoid
+- `CMS <https://home.cern/science/experiments/cms>`_: Compact Muon Solenoid
 - E605: TODO
 - E866: TODO
 - E906: TODO
 - EMC: TODO
-- [`HERA`](https://dphep.web.cern.ch/accelerators/hera): Hadron Elektron Ring
+- `HERA <https://dphep.web.cern.ch/accelerators/hera>`_: Hadron Elektron Ring
   Anlage. While technically speaking this is an accelerator, this string is
   used for the combined analyses of H1 and ZEUS.
-- [`LHCB`](https://home.cern/science/experiments/lhcb):
+- `LHCB <https://home.cern/science/experiments/lhcb>`_:
 - NMC: TODO
-- [`NNPDF`](https://nnpdf.mi.infn.it/): This experiment name is used for two
+- `NNPDF <https://nnpdf.mi.infn.it/>`_: This experiment name is used for two
   purposes:
-  1. for auxiliary datasets needed in the PDF fit, for instance `INTEG` and
-     `POS`
-  2. for predictions used in NNPDF papers to study the impact of PDFs in
-     processes not included in its PDF fit
+
+  1. for auxiliary datasets needed in the PDF fit, for instance `INTEG` and `POS`
+  2. for predictions used in NNPDF papers to study the impact of PDFs in processes not included in its PDF fit
 - NUTEV: TODO
 
+
+
 Processes
 =========
 
@@ -81,4 +80,4 @@ Processes
 - `ZPT`: production of two same-flavor opposite-sign leptons with non-zero
   total transverse momentum (Z-boson pt spectrum)
 
-[Backus–Naur form](https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form)
+`Backus–Naur form <https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form>`_
diff --git a/doc/sphinx/source/data/new-commondata.rst b/doc/sphinx/source/data/new-commondata.rst
new file mode 100644
index 0000000000..ef549a0c9d
--- /dev/null
+++ b/doc/sphinx/source/data/new-commondata.rst
@@ -0,0 +1,234 @@
+Naming convention and organization of the datasets
+--------------------------------------------------
+
+All datasets in the new data format follow the exact same naming convention::
+
+    <experiment>_<process>_<energy>{_<extras>}_<observable>
+
+The data is contained in folders, each folder containing one single hepdata publication. 
+In all cases one can reconstruct the name of the folder by separating the observable name on the last ``_``, i.e., the folder will always be named::
+
+    <experiment>_<process>_<energy>{_<extras>}
+
+All observables contained in one hepdata entry are distinguished by their observable name.
+
+Each folder will contain one single metadata file named ``metadata.yaml`` which defines all observables implemented for a given dataset.
+
+In order to keep backward compatibility and ease the comparison between new and old commondata, the ``buildmaster/dataset_names.yml`` file keeps a mapping of the datasets implemented in both formats.
+When a ``legacy`` variant is available, using the old name automatically enables that variant. The format of this mapping is as follows (the second form selects a variant):
+
+..  code-block:: yaml
+
+    old_name_1: new_name_1
+    old_name_2:
+        dataset: new_name_2
+        variant: this_particular_variant
+
+
+Metadata Format
+---------------
+
+This ``metadata.yaml`` file contains a first portion of general information which might be shared by several sets and a list of ``implemented_observables`` which define the separate observables.
+
+
+..  code-block:: yaml
+
+    setname: "EXPERIMENT_PROCESS_ENERGY{_EXTRA}"
+
+    version: 1
+    version_comment: "Initial implementation"
+
+    # References
+    arXiv:
+        url: ""
+    iNSPIRE:
+        url: "https://inspirehep.net/literature/302822"
+    hepdata:
+        url: "https://www.hepdata.net/record/ins302822"
+        version: 1
+
+    nnpdf_metadata:
+        nnpdf31_process: "PROCESS"
+        experiment: "EXPERIMENT_NAME"
+
+    implemented_observables:
+      - observable_name: "OBS"
+        observable:
+            description: "Description of the observable"
+            label: "Latex label for the observable"
+            units: "[u]"
+        ndata: n_of_datapoints
+        tables: [n, j, k] # (optional) corresponding tables in the hepdata entry
+        npoints: [n, j, k] # (optional) number of points per table
+        process_type: INC # for instance, INC, JET, DIJET, etc
+
+        # Plotting information (for instance, the kinematics variable could be pt, mt, q2)
+        plotting:
+            dataset_label: "Label to be used in reports"
+            kinematics_override: identity
+            x_scale: log
+            plot_x: var_1
+            figure_by:
+                - var_2
+
+        kinematic_coverage: [var_1, var_2, var_3]
+
+        kinematics:
+            variables:
+                var_1: {description: "Description of var", label: "latex", units: "u"}
+                var_2: {description: "Description of var", label: "latex", units: "u"}
+                var_3: {description: "Description of var", label: "latex", units: "u"}
+            file: kinematics.yaml
+
+        data_central: data.yaml
+        data_uncertainties:
+            - uncertainties.yaml
+            - uncertainties_2.yaml
+
+        # Having variants is optional
+        # variants can overwrite the data_uncertainties 
+        variants:
+            different_errors:
+                data_uncertainties:
+                    - uncertainties.yaml
+                    - uncertainties_3.yaml
+
+        # The theory field is always optional
+        theory: 
+            FK_tables:
+                - - DYE605
+            operation: 'null'
+
+
+
+
+Versioning
+~~~~~~~~~~
+
+The initial version of a dataset should be set to ``version: 1``.
+Any change to a dataset should *always* be accompanied by a version bump and a ``version_comment`` explaining the update.
+This makes it possible to track every change to every dataset exactly, even as datasets change over time.
+
+Variants
+~~~~~~~~
+
+On some occasions we might want to maintain two variations of the same observable.
+For instance, we might have two incompatible sources of uncertainties. In such a case a variant can be added.
+The syntax of the ``variants`` entry is:
+
+..  code-block:: yaml
+
+    data_uncertainties:
+        - uncertainties.yaml
+
+    variants:
+        name_of_the_variant:
+            data_uncertainties:
+                - uncertainties.yaml
+                - extra_uncertainties.yaml
+        another_variant:
+            data_uncertainties:
+                - different_uncertainties.yaml
+
+When loading this dataset with no variant, only the ``uncertainties.yaml`` file will be read.
+Instead, when choosing ``variant: name_of_the_variant``, both ``uncertainties.yaml`` and ``extra_uncertainties.yaml`` will be loaded.
+Note that if we want to substitute the default set of uncertainties we just need to not include it in the variant (as done in ``another_variant``).
+
+
+Theory
+~~~~~~
+
+The theory field defines how predictions for the dataset are to be computed.
+It includes two entries:
+
+- ``FK_tables``: this is a list of lists which defines the FK Tables to be loaded. The outermost list contains the operands (in case an operation is needed to recover the observable, more on that below). Each innermost list contains the grids that are to be concatenated in order to form one operand.
+- ``operation``: operation to be applied in order to compute the observable
+
+Example:
+
+..  code-block:: yaml
+
+    theory:
+        FK_tables:
+            - - Z_contribution
+              - Wp_contribution
+              - Wm_total
+            - - total_xs
+        operation: 'ratio'
+
+In this case the ``fktables`` for the Z, W+ and W- contributions will be concatenated (the dataset might include predictions for all three contributions).
+After that, the final observable will be computed by taking the ratio of the concatenation of all those observables and the total cross section (``total_xs``).
+
+
+Data
+----
+
+The data is stored in a ``yaml`` file with an entry ``data_central``, which lists the values for all bins.
+
+..  code-block:: yaml
+
+    data_central:
+        - val1
+        - val2
+        - val3
+
+Uncertainties
+-------------
+
+The uncertainties are (also) ``.yaml`` files. 
+Note that in the ``metadata.yaml`` the ``data_uncertainties`` entry is given as a list. 
+When using more than one uncertainty file they will be concatenated. 
+This allows the user the flexibility of creating variants where only a subset of the uncertainties are modified.
+
+The uncertainty files contain two fields: a ``definitions`` field with metadata about all the uncertainties (their name, their treatment (``ADD`` or ``MULT``) and their type), and a ``bins`` field, which is a list of mappings with as many entries as ``data_central``, each holding the named uncertainties.
+
+Note that, regardless of their treatment type, the uncertainties should always be written as absolute values and not relative to the data values.
+
+..  code-block:: yaml
+
+    definitions:
+        stat:
+            description:
+            treatment:
+            type:
+        error_name:
+            description:
+            treatment:
+            type:
+        error_name_2:
+            description:
+            treatment:
+            type:
+    bins:
+        - stat:
+          error_name:
+          error_name_2:
+
+Kinematics
+----------
+The kinematics file follows a convention very similar to the uncertainties file, except that the ``definitions`` field is skipped since that information is already contained in the parent ``metadata.yaml`` file.
+
+Therefore, we have a list of ``bins`` (of the same size as the list for ``data_central``) and for each entry we have the information of all the variables.
+
+..  code-block:: yaml
+
+    bins:
+        - var_1:
+            min: 0
+            max: 1
+            mid: 0.5
+          var_2:
+            min: 0
+            max: 1
+            mid: 0.5
+
+Plotting
+~~~~~~~~
+
+The ``plotting`` section defines the plotting style inside ``validphys``.
+In previous implementations there were per-process options that defined plotting options for families of processes.
+In the commondata format defined in this page every plotting option must be defined in the ``plotting`` section of each observable.
+
+Internally within ``validphys`` only 3 kinematic variables are taken into account. The 3 selected variables (and their order) are defined by the ``kinematic_coverage`` entry of the observable.
+
+The names of the variables (which in this example are ``var_1``, ``var_2``, ``var_3``) need to be the same in the plotting and kinematics sections.

From af1be35341d3d93c7102edc06cd7dedd54c94584 Mon Sep 17 00:00:00 2001
From: juacrumar <juacrumar@lairen.eu>
Date: Sun, 3 Mar 2024 12:51:13 +0100
Subject: [PATCH 2/4] add documentation for the new commondata format ; remove
 documentation for the old format

---
 conda-recipe/meta.yaml                        |   2 +-
 doc/sphinx/source/conf.py                     |   2 +-
 doc/sphinx/source/data/commondata.rst         | 355 ++++++++++++++++++
 doc/sphinx/source/data/data-config.rst        |  32 +-
 .../source/data/dataset-naming-convention.rst |  35 +-
 .../source/data/example-fk-preamble.rst       | 212 -----------
 doc/sphinx/source/data/exp-data-files.rst     | 245 ------------
 .../source/data/fk-config-variables.rst       |  18 -
 doc/sphinx/source/data/index.rst              |   7 +-
 doc/sphinx/source/data/intro.rst              |  44 +--
 doc/sphinx/source/data/new-commondata.rst     | 234 ------------
 doc/sphinx/source/data/plotting-format.rst    | 268 +++++++++++++
 doc/sphinx/source/data/plotting_format.md     | 303 ---------------
 doc/sphinx/source/external-code/apfelcomb.md  |  59 ---
 doc/sphinx/source/external-code/index.rst     |   1 -
 doc/sphinx/source/tutorials/apfelcomb.md      | 300 ---------------
 doc/sphinx/source/tutorials/index.rst         |   2 -
 pyproject.toml                                |   2 +-
 18 files changed, 668 insertions(+), 1453 deletions(-)
 create mode 100644 doc/sphinx/source/data/commondata.rst
 delete mode 100644 doc/sphinx/source/data/example-fk-preamble.rst
 delete mode 100644 doc/sphinx/source/data/exp-data-files.rst
 delete mode 100644 doc/sphinx/source/data/fk-config-variables.rst
 delete mode 100644 doc/sphinx/source/data/new-commondata.rst
 create mode 100644 doc/sphinx/source/data/plotting-format.rst
 delete mode 100644 doc/sphinx/source/data/plotting_format.md
 delete mode 100644 doc/sphinx/source/external-code/apfelcomb.md
 delete mode 100644 doc/sphinx/source/tutorials/apfelcomb.md
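
Note for reviewers: a minimal Python sketch of the variant and theory semantics
documented in commondata.rst. A variant replaces whole keys of the observable
(only data_central, theory and data_uncertainties), and FK_tables is a list of
operands whose inner lists are concatenated before the operation is applied.
The function names and the numbers are hypothetical, and the element-wise ratio
is an assumption about how 'ratio' combines operands; this is not the validphys
implementation.

```python
ALLOWED_VARIANT_KEYS = {"data_central", "theory", "data_uncertainties"}


def apply_variant(observable, variant=None):
    """Return the observable metadata with the chosen variant applied."""
    resolved = {k: v for k, v in observable.items() if k != "variants"}
    if variant is None:
        return resolved
    overrides = observable.get("variants", {})[variant]  # KeyError if unknown
    unknown = set(overrides) - ALLOWED_VARIANT_KEYS
    if unknown:
        raise ValueError(f"variant cannot overwrite: {sorted(unknown)}")
    resolved.update(overrides)  # whole keys are replaced, never merged
    return resolved


def combine(fk_tables, operation, predictions):
    """Concatenate each operand's tables, then apply the operation."""
    operands = [
        [value for name in operand for value in predictions[name]]
        for operand in fk_tables
    ]
    if operation == "null":
        (result,) = operands  # exactly one operand is expected
        return result
    if operation == "ratio":
        numerator, denominator = operands
        if len(numerator) != len(denominator):
            raise ValueError("operands must have the same length")
        return [n / d for n, d in zip(numerator, denominator)]
    raise ValueError(f"unsupported operation: {operation!r}")


observable = {
    "data_central": "data.yaml",
    "data_uncertainties": ["uncertainties.yaml"],
    "variants": {
        "name_of_the_variant": {
            "data_uncertainties": ["uncertainties.yaml", "extra_uncertainties.yaml"]
        },
        "another_variant": {
            "data_central": "different_data.yaml",
            "data_uncertainties": ["different_uncertainties.yaml"],
        },
    },
}
resolved = apply_variant(observable, "another_variant")

fk_tables = [["Z_contribution", "Wp_contribution", "Wm_total"], ["total_xs"]]
predictions = {  # made-up numbers, one bin per contribution
    "Z_contribution": [1.0],
    "Wp_contribution": [2.0],
    "Wm_total": [3.0],
    "total_xs": [2.0, 2.0, 2.0],
}
ratios = combine(fk_tables, "ratio", predictions)
```

Because keys are replaced wholesale, a variant that wants to keep the default
uncertainties has to list them again, as name_of_the_variant does.
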

diff --git a/conda-recipe/meta.yaml b/conda-recipe/meta.yaml
index 163a88fa78..3e4a615a98 100644
--- a/conda-recipe/meta.yaml
+++ b/conda-recipe/meta.yaml
@@ -50,7 +50,7 @@ requirements:
         - requests
         - prompt_toolkit
         - validobj
-        - sphinx >=4.0.2 # documentation. Needs pinning becasue https://github.com/sphinx-doc/sphinx/issues/9216
+        - sphinx >=5.0.2,<6 # documentation. Needs pinning temporarily due to markdown
         - recommonmark
         - sphinx_rtd_theme >0.5
         - sphinxcontrib-bibtex
diff --git a/doc/sphinx/source/conf.py b/doc/sphinx/source/conf.py
index a155d04c18..f6caf9cac3 100644
--- a/doc/sphinx/source/conf.py
+++ b/doc/sphinx/source/conf.py
@@ -85,7 +85,7 @@
 #
 # This is also used if you do content translation via gettext catalogs.
 # Usually you set "language" from the command line for these cases.
-language = None
+language = "en"
 
 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
diff --git a/doc/sphinx/source/data/commondata.rst b/doc/sphinx/source/data/commondata.rst
new file mode 100644
index 0000000000..0ea37f666d
--- /dev/null
+++ b/doc/sphinx/source/data/commondata.rst
@@ -0,0 +1,355 @@
+.. _commondata:
+
+=======================
+Experimental data files
+=======================
+
+Data made available by experimental collaborations comes in a variety of
+formats. For use in a fitting code, this data must be converted into a common
+format that contains all the required information for use in PDF fitting.
+Existing formats commonly used by the community, such as those in `HepData <https://www.hepdata.net/>`_,
+are generally unsuitable, principally because they often do not fully describe the
+breakdown of systematic uncertainties. Therefore, over several years, an NNPDF
+standard data format has been iteratively developed, now denoted ``CommonData``.
+
+This documentation describes the ``CommonData`` format
+used in NNPDF starting from code version 4.0.10 and compatible with releases beyond 4.0.
+
+
+Naming convention and organization of the datasets
+--------------------------------------------------
+
+All datasets in the new data format follow the exact same naming convention::
+
+    <setname>_<observable>
+
+where the setname is defined by::
+
+    <experiment>_<process>_<energy>{_<extras>}
+
+The naming convention for the set names is defined in the :ref:`naming convention documentation<dataset-naming-convention>`.
+
+Each ``<setname>`` defines a folder in which the data is contained.
+While the separation of data in different folders can be arbitrary,
+a folder cannot contain more than one hepdata entry
+or datasets that mix different processes, energies or experiments.
+Due to historical reasons and for backwards compatibility the special energy ``NOTFIXED`` is used
+for datasets where more than one center of mass energy is used.
+When in doubt, it is preferable to utilize two different folders.
+The ``<extras>`` string is free and can be used to disambiguate.
+
+The data downloaded or parsed from hepdata or other sources is kept in the
+``<setname>/<rawdata>`` folder and it is not installed with the rest of the code.
+Each folder must contain a ``<setname>/metadata.yaml`` file which will define
+all datasets implemented within the folder and that will be described below.
+Only ``.yaml`` files are allowed to be installed together with the ``nnpdf`` code.
+
+In order to keep backward compatibility and allow the reproducibility of the 4.0 family of fits
+a ``dataset_names.yml`` file keeps a mapping of the datasets that were used in 4.0.
+When using the old names in a runcard, ``validphys`` will automatically translate
+them using this file.
+The format of this mapping is as follows:
+
+..  code-block:: yaml
+
+    old_name_1:
+        dataset: new_name_1
+        variant: legacy
+
+
+CommonData Metadata specification
+---------------------------------
+
+The ``metadata.yaml`` file defines unequivocally the datasets implemented within a folder.
+The general structure is a first portion of general information (references, name of the set)
+and a list of ``implemented_observables`` which define separate datasets.
+
+
+Shared information
+==================
+
+
+..  code-block:: yaml
+
+    setname: "EXPERIMENT_PROCESS_ENERGY{_EXTRA}"
+
+    version: 1
+    version_comment: "A comment about this version"
+
+    # References
+    arXiv:
+        url: "https://arxiv.org/abs/XYZ.ABC"
+    iNSPIRE:
+        url: "https://inspirehep.net/literature/XYZ"
+    hepdata:
+        url: "https://www.hepdata.net/record/insXYZ"
+        version: 1
+
+    nnpdf_metadata:
+        nnpdf31_process: "PROCESS"
+        experiment: "EXPERIMENT_NAME"
+
+    implemented_observables:
+      - observable_metadata_1
+      - observable_metadata_2
+
+
+The header of the ``metadata.yaml`` file contains information shared among different datasets.
+
+Setname
+~~~~~~~
+
+Corresponds to the name of the set and must be equal to the folder name. It acts as a sanity check.
+
+Versioning
+~~~~~~~~~~
+
+The initial version of a dataset should be set to ``version: 1``.
+Any change to a dataset should *always* be accompanied by a version bump and a ``version_comment`` explaining the update.
+This makes it possible to track every change to every dataset exactly, even as datasets change over time due to bugs, updates in hepdata, etc.
+
+References
+~~~~~~~~~~
+
+References to the original source of the data. 
+This can be ``arXiv``, ``iNSPIRE`` or ``hepdata``.
+All information must be provided unless it is explicitly missing.
+
+nnpdf_metadata
+~~~~~~~~~~~~~~
+
+Grouping information used internally by ``validphys`` up to NNPDF4.0.
+It accepts the keys ``experiment``, which should in general coincide
+with the ``EXPERIMENT`` key in the ``<setname>`` and the key ``nnpdf31_process``
+which is the process grouping information used in the 3.1 and 4.0 MHOU papers.
+
+Observable specific information
+===============================
+
+Within a ``metadata.yaml`` we can find one or more implemented datasets.
+These correspond to different observables of a single measurement.
+For instance, the LHCB publication of Z rapidity measurements at 13 TeV
+(``setname: LHCB_Z0_13TEV``) contains two observables: Z decay into two electrons
+and Z decay into two muons.
+This setname contains two datasets: ``LHCB_Z0_13TEV_DIELECTRON-Y`` and ``LHCB_Z0_13TEV_DIMUON-Y``.
+
+In the following we describe the metadata corresponding to the observable within the ``metadata.yaml`` file.
+
+
+..  code-block:: yaml
+    
+   implemented_observables:
+    - observable_name: "DIMUON-Y"
+      process_type: "EWK_RAP"
+      tables: [5]
+      ndata: 18
+      observable:
+        description: "Differential cross-section of Z-->µµ as a function of Z-rapidity"
+        label: '$d\sigma / d|y|$'
+        units: "[fb]"
+      kinematics:
+        file: kinematics_dimuon.yaml
+        variables:
+          y: {description: "Z boson rapidity", label: "$y$", units: ""}
+          M2: {description: "Z boson Mass", label: "$M^2$", units: "$GeV^2$"}
+          sqrts: {description: "Center of Mass Energy", label: '$\sqrt{s}$', units: "$GeV$"}
+      kinematic_coverage: [y, M2, sqrts]
+      data_central: data_dimuon.yaml
+      data_uncertainties:
+        - uncertainties_dimuon.yaml
+      variants:
+        example_variant:
+          data_uncertainties:
+            - uncertainties_different_treatment.yaml
+      theory:
+        FK_tables:
+          - - LHCB_DY_13TEV_DIMUON
+        operation: 'null'
+        conversion_factor: 1000.0
+      # Plotting information
+      plotting:
+        dataset_label: "LHCb $Z\\to µµ$"
+        plot_x: y
+        y_label: '$d\sigma_{Z}/dy$ (fb)'
+
+``observable_name``
+~~~~~~~~~~~~~~~~~~~
+The observable name is used to construct the full name of the dataset ``<setname>_<observable_name>``.
+It must be unique within a set and contain no ``_`` (as it could lead to confusion).
+
+``process_type``
+~~~~~~~~~~~~~~~~
+One of the processes defined in the ``process_options`` module at
+``validphys/src/validphys2/process_options.py``.
+This is used internally by validphys to describe the combination of observable
+and process in various plots, to check that the kinematic variables utilized by the
+dataset are sensible and to generate derived plots such as the ``x-q2`` kinematic coverage plots.
+
+``tables``
+~~~~~~~~~~
+Tables from the hepdata entries that have been used to construct the dataset.
+
+``ndata``
+~~~~~~~~~
+Number of datapoints in the dataset.
+While this quantity could be derived from the data itself,
+many other pieces (crucially backwards compatibility with cuts and theories) require
+the number of datapoints to be set in stone.
+If an update requires changing the number of datapoints,
+it should be added as a separate observable.
+
+``observable``
+~~~~~~~~~~~~~~
+This is a dictionary with the entries ``description``, ``label`` and ``units``.
+All entries must be latex-compilable as they are used by various plotting routines in ``validphys``.
+
+``kinematics::file``
+~~~~~~~~~~~~~~~~~~~~
+A reference to a ``.yaml`` file containing all kinematic information.
+The file contains a list of ``ndata`` ``bins``, each of which includes
+information about all variables.
+When ``mid`` is not given, it will be automatically filled with the midpoint between min and max.
+Only ``mid`` is used for cuts, while ``min`` and ``max`` may be used for plotting routines.
+
+..  code-block:: yaml
+
+    bins:
+        - var_1:
+            min: 0
+            max: 1
+            mid: 0.5
+          var_2:
+            min: 0
+            max: 1
+            mid: 0.5
+
+``kinematics::variables``
+~~~~~~~~~~~~~~~~~~~~~~~~~
+Metadata for each of the variables contained in the ``kinematics::file``.
+The accepted keys are ``description``, ``label`` and ``units``.
+Latex syntax is accepted and encouraged since these entries are used by plotting routines.
+
+..  code-block:: yaml
+
+    variables:
+      var_1: {description: "my var 1", label: "$m$", units: "GeV"}
+
+
+``kinematic_coverage``
+~~~~~~~~~~~~~~~~~~~~~~
+A list of the variables within the kinematic files.
+
+
+``data_central``
+~~~~~~~~~~~~~~~~
+A reference to a ``yaml`` file containing the measurement central data.
+The data is stored in a ``yaml`` file with an entry ``data_central``, which
+lists the values for all bins.
+
+..  code-block:: yaml
+
+    data_central:
+        - val1
+        - val2
+        - val3
+
+``data_uncertainties``
+~~~~~~~~~~~~~~~~~~~~~~
+A list of ``.yaml`` files containing the uncertainty information for the measurement.
+When using more than one uncertainty file they will be concatenated. 
+This allows the user the flexibility of creating variants
+where only a subset of the uncertainties are modified.
+
+The uncertainty files contain two fields: a ``definitions`` field with
+metadata about all the uncertainties (name, treatment (``ADD`` or ``MULT``) and type),
+and a ``bins`` field, which is a list of ``ndata`` mappings
+holding the named uncertainties.
+
+Note that, regardless of their treatment, uncertainties should always be written as absolute values
+and not relative to the data values. If the data should be updated, the uncertainties should be too.
+
+..  code-block:: yaml
+
+    definitions:
+        stat:
+            description:
+            treatment:
+            type:
+        error_name:
+            description:
+            treatment:
+            type:
+        error_name_2:
+            description:
+            treatment:
+            type:
+    bins:
+        - stat:
+          error_name:
+          error_name_2:
+
+
+
+
+``variants``
+~~~~~~~~~~~~
+
+On some occasions we might want to maintain two variations of the same observable.
+For instance, we might have two incompatible sources of uncertainties. In such a case a variant can be added.
+These variants can overwrite certain keys if necessary.
+When a variant is used, the key under the variant will be used instead of the key defined in the observable.
+
+A ``variant`` can only overwrite the entries ``data_central``, ``theory`` and ``data_uncertainties``.
+Example:
+
+..  code-block:: yaml
+
+    data_uncertainties:
+        - uncertainties.yaml
+
+    variants:
+        name_of_the_variant:
+            data_uncertainties:
+                - uncertainties.yaml
+                - extra_uncertainties.yaml
+        another_variant:
+            data_central: different_data.yaml
+            data_uncertainties:
+                - different_uncertainties.yaml
+
+When loading this dataset with no variant, only the ``uncertainties.yaml`` file will be read.
+Instead, when choosing ``variant: name_of_the_variant``, both ``uncertainties.yaml`` and ``extra_uncertainties.yaml`` will be loaded.
+If we select ``variant: another_variant`` both the ``data_uncertainties`` and the ``data_central`` keys will be substituted.
+Note that if we want to substitute the default set of uncertainties we just need to not include it in the variant (as done in ``another_variant``).
+
+``theory``
+~~~~~~~~~~
+
+The theory field defines how predictions for the dataset are to be computed.
+It includes two entries:
+
+- ``FK_tables``: this is a list of lists which defines the FK Tables to be loaded. The outermost list are the operands (in case an operation is needed to recover the observable, more on that below). The innermost list are the grids that are to be concatenated in order to form the operands.
+- ``operation``: operation to be applied in order to compute the observable
+
+Example:
+
+..  code-block:: yaml
+
+    theory:
+        FK_tables:
+            - - Z_contribution
+              - Wp_contribution
+              - Wm_total
+            - - total_xs
+        operation: 'ratio'
+
+In this case the ``fktables`` for the Z, W+ and W- contributions will be concatenated (the dataset might include predictions for all three contributions).
+After that, the final observable will be computed by taking the ratio of the concatenation of all those observables and the total cross section (``total_xs``).
+
+``plotting``
+~~~~~~~~~~~~
+
+The ``plotting`` section defines the plotting style inside ``validphys``
+and is described in detail in :ref:`plotting-format`.
+
+Note that the names of the variables need to be the same in the plotting and kinematics sections.
diff --git a/doc/sphinx/source/data/data-config.rst b/doc/sphinx/source/data/data-config.rst
index 86c6d65539..9e8c909fd2 100644
--- a/doc/sphinx/source/data/data-config.rst
+++ b/doc/sphinx/source/data/data-config.rst
@@ -22,25 +22,10 @@ located in the ``nnpdf`` git repository at
 	``validphys/src/validphys2/datafiles/commondata``
 
 where a separate ``CommonData`` file is stored for each *Dataset* with the
-filename format
-
-	``DATA_<SETNAME>.dat``
-
-Information on the treatment of systematic uncertainties, provided in
-``SYSTYPE`` files, is located in the subdirectory
-
-	``commondata/systypes``
+filename format described in :ref:`dataset-naming-convention`.
+The data is installed as part of the python package of ``nnpdf``;
+all data files to be installed must have a ``.yaml`` extension.
 
-Here several ``SYSTYPE`` files may be supplied for each *Dataset*. The
-various options are enumerated by suffix to the filename. The filename format
-for ``SYSTYPE`` files is therefore
-
-	``SYSTYPE_<SETNAME>_<SYSID>.dat``
-
-Where the default systematic ID is **DEFAULT**. As an example, consider
-the first ``SYSTYPE`` file for the D0ZRAP *Dataset*:
-
-	``SYSTYPE_D0ZRAP_DEFAULT.dat``
 
 Theory lookup table
 ===================
@@ -78,25 +63,20 @@ contains the following directory structure
 
 	| ``theory_X/``
 	|	``-cfactor/``
-	|	``-compound/``
 	|	``-fastkernel/``
 
 Inside the directory ``theory_X/cfactor/`` are stored ``CFACTOR`` files
 with the filename format
 
-	``CF_<TYP>_<SETNAME>.dat``
+	``CF_<TYP>_<FKNAME>.dat``
 
 where ``<TYP>`` is a three-letter designation for the source of the C-factor
-(e.g. EWK or QCD) and ``<SETNAME>`` is the typical *Dataset* designation.
-The directory ``theory_X/compound/`` contains the ``COMPOUND`` files
-described earlier, this time with the filename format
-
-	``FK_<SETNAME>-COMPOUND.dat``
+(e.g. EWK or QCD) and ``<FKNAME>`` is the name of the ``FK`` table to which it should be applied.
 
 Finally the ``FK`` tables themselves are stored in ``theory_X/fastkernel/``
 with the filename format
 
-	``FK_<SETNAME>.dat``
+	``<FKNAME>.pineappl.lz4``
 
 Naturally, all of the FastKernel and C-factor files within the directory
 ``theory_X/`` have been determined with the theoretical parameters specified in
diff --git a/doc/sphinx/source/data/dataset-naming-convention.rst b/doc/sphinx/source/data/dataset-naming-convention.rst
index daed57803d..87c43c3278 100644
--- a/doc/sphinx/source/data/dataset-naming-convention.rst
+++ b/doc/sphinx/source/data/dataset-naming-convention.rst
@@ -1,3 +1,6 @@
+.. _dataset-naming-convention:
+
+
 =================================
 NNPDF's dataset naming convention
 =================================
@@ -5,27 +8,21 @@ NNPDF's dataset naming convention
 Each dataset implemented in NNPDF must have a unique name, which is a string
 constructed following this [Backus–Naur form]::
 
-  <valid dataset name> ::= <experiment> "_" <process>
-                        | <experiment> "_" <process> "_" <energy>
-                        | <experiment> "_" <process> "_" <variant>
-                        | <experiment> "_" <process> "_" <energy> "_" <variant>
-
-  <experiment> ::= "ATLAS" | "BCDMS" | "CHORUS" | "CMS" | "E605" | "E866"
-                | "E906" | "EMC" | "HERA" | "LHCB" | "NMC" | "NNPDF" | "NUTEV"
+  <set name> ::= <experiment> "_" <process> "_" <energy>
+               | <experiment> "_" <process> "_" <energy> "_" <extra_information>
 
-  <process> ::= "1JET" | "2JET" | "CC" | "DY" | "H" | "HVBF" | "INTEG" | "NC"
-              | "POS" | "TTB" | "WM" | "WMWP" | "WP" | "WPZ" | "ZPT"
+  <valid dataset name> ::= <set name> "_" <observable name>
 
-  <integer> ::= TODO
+  <experiment> ::= "ATLAS" | "BCDMS" | "CDF" | "CHORUS" | "CMS" | "D0" | "DYE605" | "DYE866"
+                | "DYE906" | "EMC" | "H1" | "HERA" | "LHCB" | "NMC" | "NNPDF" | "NUTEV" | "SLAC"
+                | "ZEUS"
 
-  <string> ::= TODO
+  <process> ::= "1JET" | "2JET" | "CC" | "DY" | "INTEG" | "NC" | "PH" | "POS" | "SINGLETOP"
+              | "TTBAR" | "WCHARM" | "WJ" | "WMWP" | "WP" | "WPWM" | "Z0" | "Z0J"
 
-  <energy> ::= <integer> "GEV" | <integer> "TEV"
+  <energy> ::= <integer> <unit> | <integer> "P" <integer> <unit> | "NOTFIXED"
 
-  <variant> ::= <string>
-              | <string> "_" <string>
-              | <string> "_" <string> "_" <string>
-              | <string> "_" <string> "_" <string> "_" <string>
+  <extra_information> ::= <string>
 
 
 Experiments
@@ -60,7 +57,7 @@ Processes
 - `1JET`: single-jet inclusive production
 - `2JET`: dijet production
 - `CC`: DIS charged-current
-- `DY`: lepton-pair production (neutral current off-shell Drell–Yan)
+- `Z0`: lepton-pair production (neutral current off-shell Drell–Yan)
 - `H`: on-shell Higgs-boson production
 - `HVBF`: production of an on-shell Higgs-boson with two jets (vector-boson
   fusion)
@@ -69,7 +66,7 @@ Processes
 - `NC`: DIS neutral-current
 - `POS`: auxiliary dataset for positivity constraints; only valid for
   `NNPDF` experiment
-- `TTB`: top–anti-top production
+- `TTBAR`: top–anti-top production
 - `WM`: production of a single negatively-charged lepton (charged current
   off-shell Drell–Yan)
 - `WMWP`: production of two opposite-sign different flavor leptons (W-diboson
@@ -77,7 +74,7 @@ Processes
 - `WP`: production of a single positively-charged lepton (charged current
   off-shell Drell–Yan)
 - `WPZ`: production of three leptons (WZ-diboson production)
-- `ZPT`: production of two same-flavor opposite-sign leptons with non-zero
+- `Z0PT`: production of two same-flavor opposite-sign leptons with non-zero
   total transverse momentum (Z-boson pt spectrum)
 
 `Backus–Naur form <https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form>`_
diff --git a/doc/sphinx/source/data/example-fk-preamble.rst b/doc/sphinx/source/data/example-fk-preamble.rst
deleted file mode 100644
index a5736644c3..0000000000
--- a/doc/sphinx/source/data/example-fk-preamble.rst
+++ /dev/null
@@ -1,212 +0,0 @@
-.. _example_fk_preamble:
-
-========================
-Example: ``FK`` preamble
-========================
-
-DIS preamble - BCDMSD
-=====================
-
-	| {GridDesc___________________________________________________
-	| -------------------------------
-	| FK_BCDMSD.dat
-	| -------------------------------
-	| _VersionInfo________________________________________________
-	| *APFEL: 2.6.1
-	| *libnnpdf: 1.1.0b
-	| _GridInfo___________________________________________________
-	| *HADRONIC: 0
-	| *NDATA: 254
-	| *NX: 50
-	| *SETNAME: BCDMSD
-	| {FlavourMap_________________________________________________
-	| 0 1 1 0 0 0 0 0 0 0 1 1 0 0
-	| _TheoryInfo_________________________________________________
-	| *DAMP: 1
-	| *FNS: FONLL-C
-	| *GF: 1.16638e-05
-	| *HQ: MSBAR
-	| *IC: 0
-	| *MP: 0.938
-	| *MW: 80.398
-	| *MZ: 91.1876
-	| *MaxNfAs: 5
-	| *MaxNfPdf: 5
-	| *ModEv: TRN
-	| *NfFF: 5
-	| *PTO: 2
-	| *Q0: 1
-	| *QED: 0
-	| *Qedref: 1.777
-	| *Qmb: 4.18
-	| *Qmc: 3
-	| *Qmt: 162.7
-	| *Qref: 91.2
-	| *SIN2TW: 0.23126
-	| *SxOrd: LL
-	| *SxRes: 0
-	| *TMC: 1
-	| *TheoryID: 7
-	| *XIF: 1
-	| *XIR: 1
-	| *alphaqed: 0.00749625
-	| *alphas: 0.118
-	| *mb: 4.18
-	| *mc: 0.986
-	| *mt: 162.7
-	| {xGrid______________________________________________________
-	| 6.9265888619991195e-02
-	| 7.7677574001058236e-02
-	| 8.6760599033455912e-02
-	| 9.6515727077269992e-02
-	| 1.0693847246838524e-01
-	| 1.1801962180968653e-01
-	| 1.2974586013120251e-01
-	| 1.4210045166737728e-01
-	| 1.5506393063634324e-01
-	| 1.6861476611854062e-01
-	| 1.8272997502743873e-01
-	| 1.9738566676226815e-01
-	| 2.1255751145471796e-01
-	| 2.2822113029454361e-01
-	| 2.4435241115381084e-01
-	| 2.6092775579054239e-01
-	| 2.7792426659347097e-01
-	| 2.9531988146590743e-01
-	| 3.1309346535041777e-01
-	| 3.3122486633420206e-01
-	| 3.4969494345265562e-01
-	| 3.6848557237516494e-01
-	| 3.8757963421332448e-01
-	| 4.0696099179998674e-01
-	| 4.2661445698241623e-01
-	| 4.4652575176931059e-01
-	| 4.6668146557197077e-01
-	| 4.8706901027900074e-01
-	| 5.0767657449372061e-01
-	| 5.2849307792672917e-01
-	| 5.4950812667484750e-01
-	| 5.7071196990123374e-01
-	| 5.9209545827198862e-01
-	| 6.1365000437161166e-01
-	| 6.3536754522794392e-01
-	| 6.5724050700057512e-01
-	| 6.7926177183385794e-01
-	| 7.0142464683629069e-01
-	| 7.2372283512038826e-01
-	| 7.4615040881848282e-01
-	| 7.6870178397770295e-01
-	| 7.9137169723166323e-01
-	| 8.1415518414141896e-01
-	| 8.3704755910070550e-01
-	| 8.6004439670038091e-01
-	| 8.8314151445118372e-01
-	| 9.0633495676848319e-01
-	| 9.2962098012648797e-01
-	| 9.5299603929602150e-01
-	| 9.7645677458414570e-01
-	| {FastKernel_________________________________________________
-
-Hadronic preamble - CDFR2KT
-===========================
-
-	| {GridDesc___________________________________________________
-	| -----------------------------------------------------------
-	| FK_CDFR2KT.dat
-	| -----------------------------------------------------------
-	| _VersionInfo________________________________________________
-	| *APFEL: 2.6.1
-	| *libnnpdf: 1.1.0b
-	| {Readme_____________________________________________________
-	| ***********************************************************************
-	| ExpName: CDFR2KT
-	| Author: FastNLO authors
-	| Date: 2010
-	| CodesUsed: NLOjet++/FastNLO (scenario fnt2004 from FastNLO webpage)
-	| AdditionalInfo: incl. jets, kT algo D=0.7
-	| ***********************************************************************
-	| _GridInfo___________________________________________________
-	| *HADRONIC: 1
-	| *NDATA: 76
-	| *NX: 30
-	| *SETNAME: CDFR2KT
-	| {FlavourMap_________________________________________________
-	| 0 0 0 0 0 0 0 0 0 0 0 0 0 0
-	| 0 1 1 1 0 0 1 0 0 0 1 1 0 0
-	| 0 1 1 1 0 0 1 0 0 0 1 1 0 0
-	| 0 1 1 1 0 1 1 0 0 0 0 1 0 0
-	| 0 0 0 0 1 0 0 0 0 0 0 0 0 0
-	| 0 0 0 1 0 1 1 0 0 0 1 0 0 0
-	| 0 1 1 1 0 1 1 0 0 0 0 1 0 0
-	| 0 0 0 0 0 0 0 0 0 0 0 0 0 0
-	| 0 0 0 0 0 0 0 0 0 0 0 0 0 0
-	| 0 0 0 0 0 0 0 0 0 1 0 0 0 0
-	| 0 1 1 0 0 1 0 0 0 0 1 1 0 0
-	| 0 1 1 1 0 0 1 0 0 0 1 1 0 0
-	| 0 0 0 0 0 0 0 0 0 0 0 0 0 0
-	| 0 0 0 0 0 0 0 0 0 0 0 0 0 0
-	| _TheoryInfo_________________________________________________
-	| *DAMP: 1
-	| *FNS: FONLL-C
-	| *GF: 1.16638e-05
-	| *HQ: MSBAR
-	| *IC: 0
-	| *MP: 0.938
-	| *MW: 80.398
-	| *MZ: 91.1876
-	| *MaxNfAs: 5
-	| *MaxNfPdf: 5
-	| *ModEv: TRN
-	| *NfFF: 5
-	| *PTO: 2
-	| *Q0: 1
-	| *QED: 0
-	| *Qedref: 1.777
-	| *Qmb: 4.18
-	| *Qmc: 3
-	| *Qmt: 162.7
-	| *Qref: 91.2
-	| *SIN2TW: 0.23126
-	| *SxOrd: LL
-	| *SxRes: 0
-	| *TMC: 1
-	| *TheoryID: 7
-	| *XIF: 1
-	| *XIR: 1
-	| *alphaqed: 0.00749625
-	| *alphas: 0.118
-	| *mb: 4.18
-	| *mc: 0.986
-	| *mt: 162.7
-	| {xGrid______________________________________________________
-	| 4.0941945000024672e-03
-	| 5.9356426849003037e-03
-	| 8.5647477735742213e-03
-	| 1.2278230204351056e-02
-	| 1.7448602544560710e-02
-	| 2.4515641282009264e-02
-	| 3.3957625320032526e-02
-	| 4.6241012256902900e-02
-	| 6.1757804939792604e-02
-	| 8.0769759935090835e-02
-	| 1.0337895878919207e-01
-	| 1.2953267418094364e-01
-	| 1.5905525671030885e-01
-	| 1.9169158055350388e-01
-	| 2.2714813737177489e-01
-	| 2.6512436628283159e-01
-	| 3.0533281023729242e-01
-	| 3.4750997595899380e-01
-	| 3.9142071068612511e-01
-	| 4.3685860760309952e-01
-	| 4.8364426537988547e-01
-	| 5.3162257521672562e-01
-	| 5.8065972288573631e-01
-	| 6.3064027352226959e-01
-	| 6.8146451295139832e-01
-	| 7.3304610913825119e-01
-	| 7.8531009886079706e-01
-	| 8.3819117643580765e-01
-	| 8.9163224991215573e-01
-	| 9.4558322764065939e-01
-	| {FastKernel_________________________________________________
diff --git a/doc/sphinx/source/data/exp-data-files.rst b/doc/sphinx/source/data/exp-data-files.rst
deleted file mode 100644
index 016523c96a..0000000000
--- a/doc/sphinx/source/data/exp-data-files.rst
+++ /dev/null
@@ -1,245 +0,0 @@
-.. _exp_data_files:
-
-=======================
-Experimental data files
-=======================
-
-Data made available by experimental collaborations comes in a variety of
-formats. For use in a fitting code, this data must be converted into a common
-format that contains all the required information for use in PDF fitting.
-Existing formats commonly used by the community, such as in `HepData <https://www.hepdata.net/>`_,
-are generally unsuitable. Principally as they often do not fully describe the
-breakdown of systematic uncertainties. Therefore over several years an NNPDF
-standard data format has been iteratively developed, now denoted
-``CommonData``. In addition to the ``CommonData`` files themselves, in the
-``nnpdf++`` project the user has the ability to vary the treatment of individual
-systematic errors by use of parameter files denoted ``SYSTYPE`` files. In this
-section we shall detail the specifications of these two files.
-
-In principle, the file specification and classes described in this section are
-independent of the ``nnpdf++`` project and may be generated by whatever means
-the user sees fit.  In practice, the ``CommonData`` and ``SYSTYPE`` files
-are generated by the ``buildmaster`` project of ``nnpdf++`` from the raw
-experimental data files.
-
-.. _process_type_label:
-
-Process types and kinematics
-============================
-
-Before going into the file formats, we shall summarise the identifying features
-used for data in the ``nnpdf++`` code.
-
-Each data point has an associated *process type* string. This can be
-specified by the user, but **must** begin with the appropriate identifying
-base process type. Additionally for each data point three kinematic values are
-given, the *process type* being primarily to identify the nature of these
-values. Typically the first kinematic variable is the principal differential
-quantity used in the measurement. The second kinematic variable defines the
-scale of the process. The third is generally the centre-of-mass energy of the
-process, or inelasticity in the case of DIS. The allowed basic process types,
-and their corresponding three kinematic variables are outlined below.
-
-* **DIS** - Deep inelastic scattering measurements: :math:`(x,Q^2,y)`
-* **DYP** - Fixed-target Drell-Yan measurements: :math:`(y,M^2,\sqrt{s})`
-* **JET** - Jet production: :math:`(\eta,p_T^2,\sqrt{s})`
-* **DIJET** - Dijet production: :math:`(\eta,m_{12},\sqrt{s})`
-* **PHT** - Photon production: :math:`(\eta_\gamma,E_{T,\gamma}^2,\sqrt{s})`
-* **INC** - A total inclusive cross-section: :math:`(0,\mu^2,\sqrt{s})`
-* **EWK\_RAP** - Collider electroweak rapidity distribution: :math:`(\eta/y,M^2,\sqrt{s})`
-* **EWK\_PT** - Collider electroweak :math:`p_T` distribution: :math:`(p_T,M^2,\sqrt{s})`
-* **EWK\_PTRAP** - Collider electroweak :math:`p_T, y` distribution: :math:`(\eta/y, p_T^2,\sqrt{s})`
-* **EWK\_MLL** - Collider electroweak lepton-pair mass distribution: :math:`(M_{ll},M_{ll}^2,\sqrt{s})`
-* **EWJ\_(J)RAP** - Collider electroweak + jet boson(jet) rapidity distribution: :math:`(\eta/y,M^2,\sqrt{s})`
-* **EWJ\_(J)PT** - Collider electroweak + jet boson(jet) :math:`p_T` distribution: :math:`(p_T,M^2,\sqrt{s})`
-* **EWJ\_(J)PTRAP** - Collider electroweak + jet boson(jet) :math:`p_T, y` distribution: :math:`(\eta/y, p_T^2,\sqrt{s})`
-* **EWJ\_MLL** - Collider electroweak+jet lepton-pair mass distribution: :math:`(M_{ll},M_{ll}^2,\sqrt{s})`
-* **HQP\_YQQ** - Heavy diquark system rapidity :math:`(y^{QQ},\mu^2,\sqrt{s})`
-* **HQP\_MQQ** - Heavy diquark system mass :math:`(M^{QQ},\mu^2,\sqrt{s})`
-* **HQP\_PTQQ** - Heavy diquark system :math:`p_T` :math:`(p_T^{QQ},\mu^2,\sqrt{s})`
-* **HQP\_YQ** - Heavy quark rapidity :math:`(y^Q,\mu^2,\sqrt{s})`
-* **HQP\_PTQ** - Heavy quark :math:`p_T` :math:`(p_T^Q,\mu^2,\sqrt{s})`
-* **HIG\_RAP** - Higgs boson rapidity distribution :math:`(y,M_H^2,\sqrt{s})`
-
-As examples of *process type* strings, consider **EWK\_RAP** for a
-collider :math:`W` boson asymmetry measurement binned in rapidity, and
-**DIS\_F2P** for the :math:`F_2^p` structure function in DIS. The user is free to
-choose something identifying for the second segment of the process type, the
-important feature being the basic process type. However, users are encouraged to
-only use this freedom when absolutely necessary (such as when used in
-combination with APFEL).
-
-One special case is that of :math:`W` boson lepton asymmetry measurements, which being
-cross-section asymmetries may occasionally have negative data points. Therefore
-asymmetry measurements must have the final tag **ASY** to ensure that
-artificial data generation permits negative data values. An example
-*process type* string would be **EWK\_RAP\_ASY**.
-
-Notes for the future
---------------------
-
-In the future it would be nice to have a more flexible treatment of the
-kinematic variables, both in their number and labelling.
-
-``CommonData`` file format
-==============================
-
-Each experimental *Dataset* has its own ``CommonData`` file.
-``CommonData`` files contain the bulk of the experimental information used in the
-``nnpdf++`` project, with the only other experimental data files controlling
-the treatment and correlation of systematic errors. Each ``CommonData`` file
-is a plaintext file whose layout is described in the following.
-
-The first line begins with the *Dataset* name, the number of systematic
-errors, and the number of data points in the set, whitespace separated. For
-example, for the ATLAS 2010 jet measurement the first line of the file reads:
-
-	ATLASR04JETS36PB        91      90
-
-Which demonstrates that the set *name* is 'ATLASR04JETS36PB', that there
-are 91 sources of systematic uncertainty, 90 data points, one associated ``FK``
-table, and that the ``FK`` table corresponds to a proton initial state. As
-another example, consider the NMCPD *Dataset*:
-
-	NMCPD   5       211
-
-Here there are 5 sources of systematic uncertainty and 211 data points.
-Following this, each line specifies the details of a single data point. The first
-value being the data point index :math:`1< i_{\text{dat}} \leq N_{\mathrm{dat}}`,
-followed by the *process type* string as outlined above, and the three
-kinematic variables in order. These are followed by the value of the
-experimental data point itself, and the value of the statistical uncertainty
-associated with it (absolute value). Finally the systematic uncertainties are
-specified. The layout per data point is therefore
-
-	:math:`i_{\mathrm{dat}}`   *ProcessType* :math:`\text{kin}_1 \text{kin}_2 \text{kin}_3` data\_value stat\_error  :math:`[..` systematics :math:`..]`
-
-For example, in the case of a DIS data point from the BCDMSD *Dataset*:
-
-	1    DIS\_F2D 7.0e-02   8.75e+00   5.666e-01   3.6575e-01   6.43e-03 :math:`[..` systematics :math:`..]`
-
-In these lines the systematic uncertainties are laid out as so. For each
-uncertainty, additive and multiplicative versions are given. The additive
-uncertainty is given by absolute value, and the multiplicative as a percentage
-of the data value (that is, relative error multiplied by 100). The systematics
-string is formed by the sequence of :math:`N_{\text{sys}}` pairs of systematic
-uncertainties:
-
-	:math:`[..` systematics :math:`..] =  \sigma^{\mathrm{add}}_0 \quad  \sigma^{\mathrm{mul}}_0\quad \sigma^{\mathrm{add}}_1 \quad \sigma^{\mathrm{mul}}_1 \quad....\quad \sigma^{\mathrm{add}}_n  \quad\sigma^{\mathrm{mul}}_n`
-
-where :math:`\sigma^{\mathrm{add}}_i` and :math:`\sigma^{\mathrm{mul}}_i` are the additive
-and multiplicative versions respectively of the systematic uncertainty arising
-from the :math:`i\text{th}` source. While it may seem at first that the multiplicative error
-is spurious given the presence of the additive error and data central value,
-this may not be the case. For example, in a closure test scenario, the data
-central values may have been replaced in the ``CommonData`` file by
-theoretical predictions. Therefore if you wish to use a covariance matrix
-generated with the original multiplicative uncertainties via the :math:`t_0` method,
-you must also store the original multiplicative (percentage) error. For
-flexibility and ease of I/O this is therefore done in the ``CommonData`` file
-itself.
-
-For a *Dataset* with :math:`N_{\text{dat}}` data points and :math:`N_{\text{sys}}`
-sources of systematic uncertainty, the total ``CommonData`` file should
-therefore be :math:`N_{\text{dat}}+1` lines long. Its first line contains the set
-parameters, and every subsequent line should consist of the description of a
-single data point. Each data point line should therefore contain :math:`7 +
-2N_{\text{sys}}` columns.
-
-``SYSTYPE`` file format
-=======================
-
-The explicit presentation of the systematic uncertainties in the
-``CommonData`` file allows for a great deal of flexibility in the treatment of
-these errors. Specifically, whether they should be treated as additive or
-multiplicative uncertainties, and how they are correlated, both within the
-*Dataset* and within a larger *Experiment*. A specification for how
-the systematic uncertainties should be treated is provided by a ``SYSTYPE``
-file. As there is not always an unambiguous method for the treatment of these
-uncertainties, these information is kept outside the (unambiguous)
-``CommonData`` file. Several options for this treatment are often provided in the
-form of multiple ``SYSTYPE`` files which may be selected between in the fit.
-
-Each ``SYSTYPE`` file begins with a line specifying the total number of
-systematics. Naturally this must match with the :math:`N_{\text{sys}}` variable
-specified in the associated ``CommonData`` file. This is presented as a single
-integer. For example, in the case of the BCDMSD ``SYSTYPE`` files, the first line is
-
-	8
-
-as there are :math:`N_{\text{sys}}=8` sources of systematic uncertainty for this
-*Dataset*. Following this line there are :math:`N_{\text{sys}}` lines describing each
-source of systematic uncertainty. For each source two parameters are provided,
-the *uncertainty treatment* and the *uncertainty description*. These
-are laid out for each systematic as:
-
-	:math:`i_{\text{sys}}`	[*uncertainty treatment*]	[*uncertainty description*]
-
-where :math:`1< i_{\text{sys}} \leq N_{\mathrm{sys}}` enumerates each systematic. The
-*uncertainty treatment* determines whether the uncertainty should be
-treated as additive, multiplicative, or in cases where the choice is unclear, as
-randomised on a replica by replica basis. These choices are selected by using
-the strings **ADD**, **MULT**, or **RAND**. The *uncertainty
-description* specifies how the systematic is to be correlated with other
-data points. There are three special cases for the *uncertainty
-description*, specified by the strings **CORR**, **UNCORR**,
-**THEORYCORR**, **THEORYUNCORR** and **SKIP**. The first two
-specify whether the systematic is fully correlated **only** within the
-*Dataset* (**CORR**), or whether the systematic is totally
-uncorrelated (**UNCORR**). The **THEORY** descriptor is used to
-describe theoretical systematics due to e.g missing NNLO corrections, which are
-treated as either **CORR** or **UNCORR** according to their suffix,
-but are not included in the generation of artificial replicas (their only
-contribution is to the fitting error function). If the user wishes to correlate
-a specific uncertainty between multiple *Datasets* within an
-*Experiment*, then they should use a custom *uncertainty description*.
-When building a covariance matrix for an *Experiment*, the ``nnpdf++``
-code checks for matches between the *uncertainty descriptions* of
-systematics of its constituent *Datasets*. If a match is found, the code
-will correlate those systematics over the relevant datasets. The **SKIP**
-descriptor removes the systematic from the covariance matrices for debugging
-purposes.
-
-As an example, let us consider an NNPDF2.3 standard ``SYSTYPE`` for the BCDMSD
-*Dataset*:
-
-	| 8
-	| 1    ADD    BCDMSFB
-	| 2    ADD    BCDMSFS
-	| 3    ADD    BCDMSFR
-	| 4    MULT    BCDMSNORM
-	| 5    MULT    BCDMSRELNORMTARGET
-	| 6    MULT    CORR
-	| 7    MULT    CORR
-	| 8    MULT    CORR
-
-Here the first five systematics have custom *uncertainty descriptions*,
-thereby allowing them to be cross-correlated with other *Datasets* in a
-larger *Experiment*. Systematics six to eight are specified as being fully
-correlated, but only within the BCDMSD  *Dataset*. Additionally note that
-the first three systematics are specified as additive, and the remainder are
-multiplicative. If we compare now to the equivalent ``SYSTYPE`` file for the
-BCDMSP *Dataset*:
-
-	| 11
-	| 1    ADD    BCDMSFB
-	| 2    ADD    BCDMSFS
-	| 3    ADD    BCDMSFR
-	| 4    MULT    BCDMSNORM
-	| 5    MULT    BCDMSRELNORMTARGET
-	| 6    MULT    CORR
-	| 7    MULT    CORR
-	| 8    MULT    CORR
-	| 9    MULT    CORR
-	| 10    MULT    CORR
-	| 11    MULT    CORR
-
-it is clear that the first five systematics are the same as in the BCDMSD
-*Dataset*, and therefore should the two sets be combined into a common
-*Experiment*, the code will cross-correlate them appropriately. The
-combination of ``SYSTYPE`` and ``CommonData`` is quite flexible. As stated
-previously, once generated from the original raw experimental data, the
-``CommonData`` file is fixed and should not be altered apart from for the purpose
-of correcting errors. In practice the full details on the systematic correlation
-and their treatment is often not precisely specified. This system allows for the
-safe variation of these parameters for testing purposes.
diff --git a/doc/sphinx/source/data/fk-config-variables.rst b/doc/sphinx/source/data/fk-config-variables.rst
deleted file mode 100644
index 142ba92b19..0000000000
--- a/doc/sphinx/source/data/fk-config-variables.rst
+++ /dev/null
@@ -1,18 +0,0 @@
-.. _fk_config_variables:
-
-==============================
-``FK`` configuration variables
-==============================
-
-Table specifying the required elements of the GridInfo ``FK`` header
-segment. The Key column specifies the exact format of the Key in the K-V pair
-used in the GridInfo segment.
-
-========  =======  ======================  ==================================
-Key       Type     Description             Comments
-========  =======  ======================  ==================================
-SETNAME   String   *SetName*               N/A
-HADRONIC  Boolean  Hadronic flag           0 or 1
-NDATA     Integer  :math:`N_{\text{dat}}`  Number of data points
-NX        Integer  :math:`N_x`             Number of :math:`x`-points in grid
-========  =======  ======================  ==================================
diff --git a/doc/sphinx/source/data/index.rst b/doc/sphinx/source/data/index.rst
index d18ed4b31d..3483436b5f 100644
--- a/doc/sphinx/source/data/index.rst
+++ b/doc/sphinx/source/data/index.rst
@@ -8,10 +8,9 @@ namely data files and the corresponding files containing theoretical predictions
    :maxdepth: 1
 
    ./intro
-   ./exp-data-files
+   ./commondata
+   ./dataset-naming-convention
    ./th-data-files
    ./data-config
-   ./fk-config-variables
    ./example-cfactor-file
-   ./example-fk-preamble
-   ./plotting_format
+   ./plotting-format
diff --git a/doc/sphinx/source/data/intro.rst b/doc/sphinx/source/data/intro.rst
index 30428a450c..7fdddc0a6f 100644
--- a/doc/sphinx/source/data/intro.rst
+++ b/doc/sphinx/source/data/intro.rst
@@ -2,19 +2,20 @@
 Introduction
 ============
 
-In the ``nnpdf++`` project, data files used by the code may be grouped into
+In the ``nnpdf`` project, data files used by the code may be grouped into
 two categories, theory and experiment. Experimental data and the information
-pertaining to the treatment of systematic errors are held in ``CommonData``
-and ``SYSTYPE`` files. ``FK`` tables, ``COMPOUND`` and ``CFACTOR`` files
+pertaining to the treatment of systematic errors are held in the ``CommonData`` files.
+``FK`` tables and ``CFACTOR`` files
 store the precomputed information for use when calculating theoretical
-predictions corresponding to information held in the equivalent ``CommonData``
-file. In this section the file formats and naming conventions for these files
+predictions corresponding to information held in the equivalent ``CommonData`` file.
+In this section the file formats and naming conventions for these files
 will be detailed, along with the directory structure employed by the
-``nnpdf++`` code.
+``nnpdf`` code.
 
-For NNPDF3.1 and later fits, a considerably larger number of theory options will
-be explored than in previous determinations. In NNPDF3.0 the main theory
-variations used were perturbative order, value of the strong coupling and the
+For NNPDF4.0 and later fits, a considerably larger number of theory options will
+be explored than in previous determinations.
+The current theory documentation refers only to NNPDF4.0 and earlier fits and is thus outdated.
+In NNPDF3.0 the main theory variations used were perturbative order, value of the strong coupling and the
 number of active flavours in the VFNS. For NNPDF3.1 and later, it has been necessary to
 accommodate variations in additional parameters, such as treatments of the heavy
 quark mass (pole vs MS-bar), scale variations, intrinsic charm, resummation
@@ -24,9 +25,9 @@ here.
 
 This section will begin by detailing the specifications for the file formats
 used by the code, first with the experimental data file formats and layouts in
-:ref:`exp_data_files` and secondly with the file formats used for
+:ref:`commondata` and secondly with the file formats used for
 theoretical predictions in :ref:`th_data_files`. Finally the organisation of
-these files within the ``nnpdf++`` structure will be described in
+these files within the ``nnpdf`` structure will be described in
 :ref:`org_data_files`.
 
 Important definitions
@@ -39,10 +40,10 @@ terminological points to note.
 -------------------------
 
 When referring to a collection of data points two words are used in the
-``nnpdf++`` code which have specific meanings. *Dataset* refers to the result
+``nnpdf`` code which have specific meanings. *Dataset* refers to the result
 of a specific measurement, typically associated with a single experimental paper
-and corresponds to the *DataSet* class in the ``nnpdf++`` code.
-*Experiment* refers to a collection of *Datasets* which are associated
+and corresponds to the *DataSet* class in the ``nnpdf`` code.
+*Experiment* refers to a collection of *Datasets* which might be associated
 by experimental cross-correlations. For example, the ATLAS 2010 R=0.4 inclusive
 jet measurement and the ATLAS 2011 high-mass Drell-Yan measurement are both
 examples of *Datasets* as used in the NNPDF3.0 analysis. Both of these
@@ -50,19 +51,8 @@ datasets are grouped into the ATLAS *Experiment* as they have systematic
 uncertainties that are cross-correlated with each other. In this document, when
 using these terms in this sense, they will be italicised for clarity.
 
-Note however that the concept of an *Experiment* is being phased out in the NNPDF
-code. For more information on this see :ref:`data_specification`.
-
-*Dataset* and *Experiment* names
---------------------------------
-
-When referred to, the *Dataset* and *Experiment* names refer to the
-short identifying string used in the code for each *Dataset* and
-*Experiment*.  For example, the *Dataset* name for the aforementioned
-ATLAS 2010 inclusive jet measurement with R=0.4 is ATLASR04JETS36PB.
-
-New dataset naming conventions
-------------------------------
+Dataset naming conventions
+--------------------------
 
 See :ref:`dataset_naming_convention` for a definition of how datasets should be
 named.
diff --git a/doc/sphinx/source/data/new-commondata.rst b/doc/sphinx/source/data/new-commondata.rst
deleted file mode 100644
index ef549a0c9d..0000000000
--- a/doc/sphinx/source/data/new-commondata.rst
+++ /dev/null
@@ -1,234 +0,0 @@
-Naming convention and organization of the datasets
---------------------------------------------------
-
-All datasets in the new data format follow the exact same naming convention::
-
-    <experiment>_<process>_<energy>{_<extras>}_<observable>
-
-The data is contained in folders, each folder containing one single hepdata publication. 
-In all cases one can reconstruct the name of the folder by separating the observable name on the last ``_``, i.e., the folder will always be named::
-
-    <experiment>_<process>_<energy>{_<extras>}
-
-Where all observables contained in one hepdata entry are separated by their observable name.
-
-Each folder will contain one single metadata file named ``metadata.yaml`` which defines all observables implemented for a given dataset.
-
-In order to keep backward compatibility and ease the comparison between new and old commondata, the ``buildmaster/dataset_names.yml`` file keeps a mapping of the datasets implemented in both formats.
-When a ``legacy`` variant is available, the usage of the old name automatically enables such variants. The format of this mapping is as follow (which enables using variants):
-
-..  code-block:: yaml
-
-    old_name_1: new_name_1
-    old_name_2:
-        dataset: new_name_2
-        variant: this_particular_variant
-
-
-Metadata Format
----------------
-
-This ``metadata.yaml`` file contains a first portion of general information which might be shared by several sets and a list of ``implemented_observables`` which define the separate observables.
-
-
-..  code-block:: yaml
-
-    setname: "EXPERIMENT_PROCESS_ENERGY{_EXTRA}"
-
-    version: 1
-    version_comment: "Initial implementation"
-
-    # References
-    arXiv:
-        url: ""
-    iNSPIRE:
-        url: "https://inspirehep.net/literature/302822"
-    hepdata:
-        url: "https://www.hepdata.net/record/ins302822"
-        version: 1
-
-    nnpdf_metadata:
-        nnpdf31_process: "PROCESS"
-        experiment: "EXPERIMENT_NAME"
-
-    implemented_observables:
-      - observable_name: "OBS"
-        observable:
-            description: "Description of the observable"
-            label: "Latex label for the observable"
-            units: "[u]"
-        ndata: n_of_datapoints
-        tables: [n, j, k] # (optional) corresponding tables in the hepdata entry
-        npoints: [n, j, k] # (optional) number of points per table
-        process_type: INC # for instance, INC, JET, DIJET, etc
-
-        # Plotting information (for instance, the kinematics variable could be pt, mt, q2)
-        plotting:
-            dataset_label: "Label to be used in reports"
-            kinematics_override: identity
-            x_scale: log
-            plot_x: var_1
-            figure_by:
-                - var_2
-
-        kinematic_coverage: [var_1, var_2, var_3]
-
-        kinematics:
-            variables:
-                var_1: {description: "Description of var", label: "latex", units: "u"}
-                var_2: {description: "Description of var", label: "latex", units: "u"}
-                var_3: {description: "Description of var", label: "latex", units: "u"}
-            file: kinematics.yaml
-
-        data_central: data.yaml
-        data_uncertainties:
-            - uncertainties.yaml
-            - uncertainties_2.yaml
-
-        # Having variants is optional
-        # variants can overwrite the data_uncertainties 
-        variants:
-            different_errors:
-                data_uncertainties:
-                    - uncertainties.yaml
-                    - uncertainties_3.yaml
-
-        # The theory field is always optional
-        theory: 
-            FK_tables:
-                - - DYE605
-            operation: 'null'
-
-
-
-
-Versioning
-~~~~~~~~~~
-
-The initial version of a dataset should be set to ``version: 1``.
-Any change on a dataset should be *always* accompanied of a version bump and a ``version_comment`` explaining the update.
-This will allow to keep an exact tracking of all changes to every dataset even if they change over time.
-
-Variants
-~~~~~~~~
-
-In some occasions we might want to maintain two variations of the same observable.
-For instance, we might have two incompatible sources of uncertainties. In such case a variant can be added.
-The syntax of the ``variants`` is.
-
-Theory
-~~~~~~
-
-The theory field defines how predictions for the dataset are to be computed.
-It includes two entries:
-
-- ``FK_tables``: this is a list of lists which defines the FK Tables to be loaded. The outermost list are the operands (in case an operation is needed to recover the observable, more on that below). The innermost list are the grids that are to be concatenated in order to form the operands.
-- ``operaton``: operation to be applied in order to compute the observable
-
-Example:
-
-..  code-block:: yaml
-            theory: 
-            FK_tables:
-                - - Z_contribution
-                  - Wp_contribution
-                  - Wm_total
-                - - total_xs
-            operation: 'ratio'
-
-In this case the ``fktables`` for the Z, W+ and W- contributions will be concatenated (the dataset might include predictions for all three contributions).
-After that, the final observable will be computed by taking the ratio of the concatenation of all those observables and the total cross section (``total_xs``).
-
-
-..  code-block:: yaml
-
-    data_uncertainties:
-        - uncertainties.yaml
-
-    variants:
-        name_of_the_variant:
-            data_uncertainties:
-                - uncertainties.yaml
-                - extra_uncertainties.yaml
-        another_variant:
-            data_uncertainties:
-                - different_uncertainties.yaml
-
-
-When loading this dataset with no variant only the ``uncertainties.yaml`` file will be read.
-Instead, when choosing ``variant: name_of_the_variant``, both ``uncertainties.yaml`` and  ``extra_uncertainties.yaml`` will be loaded.
-Note that if we want to substitute the default set of uncertainties we just need to not include it in the variant (as done in ``another_variant``).
-
-
-Data
-----
-
-The format of the data is a ``yaml`` file with an entry ```data_central``` which is a list for all values for all bins.
-
-..  code-block:: yaml
-
-    data_central:
-        - val1
-        - val2
-        - val3
-
-Uncertainties
--------------
-
-The uncertainties are (also) ``.yaml`` files. 
-Note that in the ``metadata.yaml`` the ``data_uncertainties`` entry is given as a list. 
-When using more than one uncertainty file they will be concatenated. 
-This allows the user the flexibility of creating variants where only a subset of the uncertainties are modified.
-
-The format of the uncertainty files is of two fields, a ``definitions`` field that contains metadata about all the uncertainties (their name, their treatment (``ADD`` or ``MULT``) and their type) and a second field ``bins`` which is a list of mappings with as many entries as the `data_central` with the named uncertainties.
-
-Note that, regardless of their treatment type, the uncertainties should always be written as absolute values and not relative to the data values.
-
-..  code-block:: yaml
-
-    definitions:
-        stat:
-            description:
-            treatment:
-            type:
-        error_name:
-            description:
-            treatment:
-            type:
-        error_name_2:
-            description:
-            treatment:
-            type:
-    bins:
-        - stat:
-          error_name:
-          error_name_2:
-
-Kinematics:
------------
-The kinematics file follow a convention very similar to the uncertainties file, where the ``definitions`` field is skipped since that information is already contained in the parent ``metadata.yaml`` file.
-
-Therefore, we have a list of ``bins`` (of the same size as the list for `data_central`) and for each entry we have the information of all the variables.
-
-..  code-block:: yaml
-
-    bins:
-        - var_1:
-            min: 0
-            max: 1
-            mid: 0.5
-          var_2:
-            min: 0
-            max: 1
-            mid: 0.5
-
-Plotting
-~~~~~~~~
-
-The ``plotting`` section defines the plotting style inside ``validphys``.
-In previous implementations there were per-process options that defined plotting options for family of processes.
-In the commondata format defined in this page every plotting option must be defined in the ``plotting`` section of each observable.
-
-Internally within ``validphys`` only 3 kinematic variables are taken into account. The 3 selected variables (and their order) is defined by ``plotting::kinematic_coverage``.
-
-The name of the variables (which in this example are `var_1`, `var_2`, `var_3`) need to be the same in the plotting and kinematics.
diff --git a/doc/sphinx/source/data/plotting-format.rst b/doc/sphinx/source/data/plotting-format.rst
new file mode 100644
index 0000000000..62e31c4df0
--- /dev/null
+++ b/doc/sphinx/source/data/plotting-format.rst
@@ -0,0 +1,268 @@
+.. _plotting-format:
+
+===============
+Plotting format
+===============
+
+The ``plotting`` dictionary within the metadata of a dataset
+defines a set of options that are used for analysis
+and representation purposes, particularly to determine how datasets
+should be represented in plots.
+
+.. warning:: the information in this page is not up to date
+
+Format
+======
+
+The plotting file specifies the variable against which the data
+is to be plotted (on the *x* axis) as well as the variables
+by which the data will be split into different lines in the
+same figure or into different figures. The possible variables
+('*kinematic labels*') are described below.
+
+The format also allows the control of several plotting properties, such
+as whether to use log scale, or the axes labels.
+
+Kinematic labels
+================
+
+.. note:: very outdated information that only applies to legacy data
+
+When a dataset has been ported from the old implementation and thus
+it has no well defined kinematic variables (but instead just k1, k2, k3)
+the default kinematic variables are inferred from the *process type*
+declared in the commondata files (more specifically from
+a substring). Currently they are:
+
+.. code-block:: python
+
+  'DIS': ('$x$', '$Q^2 (GeV^2)$', '$y$'),
+  'DYP': ('$y$', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'EWJ_JPT': ('$p_T (GeV)$', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'EWJ_JRAP': ('$\\eta/y$', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'EWJ_MLL': ('$M_{ll} (GeV)$', '$M_{ll}^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'EWJ_PT': ('$p_T (GeV)$', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'EWJ_PTRAP': ('$\\eta/y$', '$p_T^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'EWJ_RAP': ('$\\eta/y$', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'EWK_MLL': ('$M_{ll} (GeV)$', '$M_{ll}^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'EWK_PT': ('$p_T$ (GeV)', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'EWK_PTRAP': ('$\\eta/y$', '$p_T^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'EWK_RAP': ('$\\eta/y$', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'HIG_RAP': ('$y$', '$M_H^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'HQP_MQQ': ('$M^{QQ} (GeV)$', '$\\mu^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'HQP_PTQ': ('$p_T^Q (GeV)$', '$\\mu^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'HQP_PTQQ': ('$p_T^{QQ} (GeV)$', '$\\mu^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'HQP_YQ': ('$y^Q$', '$\\mu^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'HQP_YQQ': ('$y^{QQ} (GeV)$', '$\\mu^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'INC': ('$0$', '$\\mu^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'JET': ('$\\eta$', '$p_T^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'PHT': ('$\\eta_\\gamma$', '$E_{T,\\gamma}^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
+  'SIA': ('$z$', '$Q^2 (GeV^2)$', '$y$')
+
+
+The three kinematic variables are referred to as `k1`, `k2` and `k3`
+in the plotting files. For example, for DIS processes, `k1` refers to `x`,
+`k2` to `Q²`, and `k3` to `y`.
+
+These kinematic values can be overridden by some transformation of
+them. For that purpose, it is possible to define
+a `kinematics_override` key.  The value must be a class defined
+in: `validphys2/src/validphys/plotoptions/kintransforms.py`
+
+The class must have a `__call__` method that takes three parameters:
+`(k1, k2, k3)` as defined in the dataset implementation, and returns
+three new values `(k1', k2', k3')` which are the "transformed"
+kinematical variables, which will be used for plotting purposes every
+time the kinematic variables `k1`, `k2` and `k3` are referred to.
+Additionally, the class must implement a `new_labels` method, that
+takes the old labels and returns the new ones, and an `xq2map`
+function that takes the kinematic variables and returns a tuple of (x,
+Q²) with some approximate values. An example of such transform is:
+
+.. code-block:: python
+
+  class dis_sqrt_scale:
+      def __call__(self, k1, k2, k3):
+          ecm = sqrt(k2/(k1*k3))
+          return k1, sqrt(k2), ceil(ecm)
+
+      def new_labels(self, *old_labels):
+          return ('$x$', '$Q$ (GeV)', r'$\sqrt{s} (GeV)$')
+
+      def xq2map(self, k1, k2, k3, **extra_labels):
+          return k1, k2*k2
+
+
+Additional labels
+=================
+Additional labels can be specified by declaring an **extra_labels**
+key in the plotting file, and specifying for each new label a value
+for each point in the dataset.
+
+For example:
+
+..  code-block:: yaml
+  
+   extra_labels:
+      idat2bin:  [0, 0, 0, 0, 0, 0, 0, 0, 100, 100, 100, 100, 100, 200, 200, 200, 300, 300, 300, 400, 400, 400, 500, 500, 600, 600, 700, 700, 800, 800, 900, 1000, 1000, 1100]
+
+defines one label where the values for each of the datapoints are
+given in the list. Note that the name of the extra_label (in this case
+`idat2bin`) is completely arbitrary, and will be used for plotting
+purposes (LaTeX math syntax is allowed as well). However, adding labels
+manually for each point can be tedious. This should only be reserved
+for information that cannot be recovered from the kinematics as
+defined in the CommonData file. Instead, new labels can be generated
+programmatically: every function defined in `validphys2/src/validphys/plotoptions/labelers.py`
+is a valid label. These functions take as keyword arguments the
+(possibly transformed) kinematical variables, as well as any extra
+label declared in the plotting file. For example, one might declare:
+
+.. code-block:: python
+
+  def high_xq(k1, k2, k3, **kwargs):
+      return (k1 > 1e-2) & (k2 > 1000)
+
+
+Note that it is convenient to always declare the `**kwargs`
+parameter so that the code doesn't crash when the function is called
+with extra arguments. Similarly to the kinematics transforms, it is
+possible to decorate them with a `@label` describing a nicer LaTeX
+label than the function name. For example:
+
+.. code-block:: python
+
+  @label(r"$I(x>10^{-2})\times I(Q > 1000 GeV)$")
+  def high_xq(k1, k2, k3, **kwargs):
+      return (k1 > 1e-2) & (k2 > 1000)
+
+
+Plotting and grouping
+=====================
+
+The variable in which the data is plotted is simply
+declared as
+
+..  code-block:: yaml
+    
+  x: <label>
+
+For example:
+
+..  code-block:: yaml
+    
+  x: k1
+
+If a `line_by` key is specified, points with different values for
+each of the labels listed will be represented as different lines. For
+example,
+
+
+..  code-block:: yaml
+    
+  line_by:
+    - k2
+
+for DIS would mean that the data in the same Q² bin is plotted on the
+same line.
+
+Similarly, it is possible to define a `figure_by` key: Points
+with different values for the listed keys will be split across
+separate figures. For example:
+
+..  code-block:: yaml
+    
+  figure_by:
+    - idat2bin
+    - high_xq
+
+
+Transforming the result
+=======================
+
+.. note:: very outdated information that only applies to legacy data
+
+By default the *y* axis represents the central value and error. However,
+it is possible to define a `result_transform` key in the plotting file:
+
+..  code-block:: yaml
+    
+  result_transform: qbinexp
+
+The value must be a function declared in
+`validphys2/src/validphys/plotoptions/results_transform.py`
+taking the central value, the error, as well as all the labels, and
+returning a new central value and error. For example:
+
+..  code-block:: python
+    
+  def qbinexp(cv, error, **labels):
+      q = labels['k2']
+      qbin = bins(q)
+      return 10**qbin*cv, 10**qbin*error
+
+Plotting options
+================
+
+Several plotting options can be specified.
+These include
+
+ - x/y_scale: 'linear' or 'log'.
+ - x/y_label: Any string, possibly LaTeX formatted. Note that the
+   x_label will be deduced automatically.
+
+Overriding configuration for normalized plots
+=============================================
+
+When the results are to be plotted as a ratio, it may be convenient to
+alter the configuration of the plots, for example by changing the
+`line_by` labels into `figure_by` (because otherwise the points would
+overlap), or by changing the scale from log to linear. To do so, we
+specify the options we want to override in a `normalize` key.
+Everything defined inside will take precedence when we produce a ratio
+plot and will be ignored for absolute value plots. For example:
+
+..  code-block:: yaml
+    
+  x: k1
+
+  x_label: '$\left|\eta/y\right|$'
+
+  y_label: '$d\sigma/dy$ (fb)'
+
+  line_by:
+    - Boson
+
+  normalize:
+      figure_by:
+          - Boson
+
+  extra_labels:
+    Boson:  ["$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$Z$","$Z$","$Z$","$Z$","$Z$","$Z$","$Z$","$Z$"]
+
+Here, ratio plots are split into a separate figure for each
+unique value of the key `Boson` (which is defined explicitly as an
+`extra_label`), while absolute value plots produce a single figure
+with the three bosons drawn as different lines.
+
+Example
+=======
+
+..  code-block:: yaml
+
+  dataset_label: "Some hypothetical dataset"
+  x: k3
+  x_scale: log
+  kinematics_override: dummy_transform #defined in kintransforms.py
+  line_by:
+    - k2
+
+  figure_by:
+    - idat2bin #defined below
+    - high_xq  #defined in labelers.py
+
+  normalize: # Change the scale for ratio plots
+      x_scale: linear
+
+  extra_labels:
+      idat2bin:  [0, 0, 0, 0, 0, 0, 0, 0, 100, 100, 100, 100, 100, 200, 200, 200, 300, 300, 300, 400, 400, 400, 500, 500, 600, 600, 700, 700, 800, 800, 900, 1000, 1000, 1100]
diff --git a/doc/sphinx/source/data/plotting_format.md b/doc/sphinx/source/data/plotting_format.md
deleted file mode 100644
index 069691b3b5..0000000000
--- a/doc/sphinx/source/data/plotting_format.md
+++ /dev/null
@@ -1,303 +0,0 @@
-```eval_rst
-.. _plotting-format:
-```
-Plotting format
-===============
-
-A *plotting file* defines a set of options that are used for analysis
-and representation purposes, particularly to determine how datasets
-should be represented in plots and how they should be grouped
-together according to various criteria. The plotting files should be
-considered part of the implementation of the dataset, and should be
-read by various tools that want to sensibly represent the data.
-
-## Naming convention
-
-Plotting files are located in the `commondata`
-folder (`nnpdfcpp/data/commondata`).
-For a dataset labeled `<DATASET>`, the corresponding file name is
-`PLOTTING_<DATASET>.yaml` or `PLOTTING_<DATASET>.yml`
-
-For example, given the dataset "HERA1CCEP", the corresponding
-plotting file name is:
-
-````
-PLOTTING_HERA1CCEP.yaml
-````
-
-Additionally, the configuration is loaded from a per-process-type file
-called:
-
-```
-PLOTTINGTYPE_<type>.yaml
-```
-
-See [kinematic labels](#kinematic-labels) below for a list of defined types. When a key
-is present both in the dataset-specific and the per-process-type file, the
-dataset-specific one always takes precedence.
-
-
-## Format
-
-The plotting file specifies the variable in which the data
-is to be plotted (in the  *x* axis) as well as the variables
-in which the data will be split in different lines in the
-same figure or in different figures. The possible variables
-('*kinematic labels*') are described below.
-
-The format also allows the control of several plotting properties, such
-as whether to use log scale, or the axes labels.
-
-### Data label
-
-A key called `dataset_label` can be used to specify a nice plotting
-and display label for each dataset. LaTeX math is allowed between
-dollar signs. See the [example](#example) plotting file for usage.
-
-### Kinematic labels
-
-The default kinematic variables are inferred from the *process type*
-declared in the commondata files (more specifically from
-a substring). Currently they are:
-
-```
-'DIS': ('$x$', '$Q^2 (GeV^2)$', '$y$'),
-'DYP': ('$y$', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'EWJ_JPT': ('$p_T (GeV)$', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'EWJ_JRAP': ('$\\eta/y$', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'EWJ_MLL': ('$M_{ll} (GeV)$', '$M_{ll}^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'EWJ_PT': ('$p_T (GeV)$', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'EWJ_PTRAP': ('$\\eta/y$', '$p_T^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'EWJ_RAP': ('$\\eta/y$', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'EWK_MLL': ('$M_{ll} (GeV)$', '$M_{ll}^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'EWK_PT': ('$p_T$ (GeV)', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'EWK_PTRAP': ('$\\eta/y$', '$p_T^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'EWK_RAP': ('$\\eta/y$', '$M^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'HIG_RAP': ('$y$', '$M_H^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'HQP_MQQ': ('$M^{QQ} (GeV)$', '$\\mu^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'HQP_PTQ': ('$p_T^Q (GeV)$', '$\\mu^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'HQP_PTQQ': ('$p_T^{QQ} (GeV)$', '$\\mu^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'HQP_YQ': ('$y^Q$', '$\\mu^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'HQP_YQQ': ('$y^{QQ} (GeV)$', '$\\mu^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'INC': ('$0$', '$\\mu^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'JET': ('$\\eta$', '$p_T^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'PHT': ('$\\eta_\\gamma$', '$E_{T,\\gamma}^2 (GeV^2)$', '$\\sqrt{s} (GeV)$'),
-'SIA': ('$z$', '$Q^2 (GeV^2)$', '$y$')
-```
-
-This mapping is declared as `validphys.commondataparser.KINLABEL_LATEX`
-in the python code.
-
-The three kinematic variables are referred to as `k1`, `k2` and `k3`
-in the plotting files. For example, for DIS processes, `k1` refers to `x`,
-`k2` to `Q`, and `k3` to `y`.
-
-These kinematic values can be overridden by some transformation of
-them. For that purpose, it is possible to define
-a `kinematics_override` key.  The value must be a class defined
-in: `validphys2/src/validphys/plotoptions/kintransforms.py`
-
-The class must have a `__call__` method that takes three parameters:
-`(k1, k2 k3)` as defined in the dataset implementation, and returns
-three new values `('k1', 'k2', k3')` which are the "transformed"
-kinematical variables, which will be used for plotting purposes every
-time the kinematic variables `k1`, `k2` and `k3` are referred to.
-Additionally, the class must implement a `new_labels` method, that
-takes the old labels and returns the new ones, and an `xq2map`
-function that takes the kinematic variables and returns a tuple of (x,
-Q²) with some approximate values. An example of such transform is:
-
-````python
-class dis_sqrt_scale:
-    def __call__(self, k1, k2, k3):
-        ecm = sqrt(k2/(k1*k3))
-        return k1, sqrt(k2), ceil(ecm)
-
-    def new_labels(self, *old_labels):
-        return ('$x$', '$Q$ (GeV)', r'$\sqrt{s} (GeV)$')
-
-    def xq2map(self, k1, k2, k3, **extra_labels):
-        return k1, k2*k2
-````
-
-
-Additional labels can be specified by declaring an **extra_labels**
-key in the plotting file, and specifying for each new label a value
-for each point in the dataset.
-
-For example:
-
-````
-extra_labels:
-    idat2bin:  [0, 0, 0, 0, 0, 0, 0, 0, 100, 100, 100, 100, 100, 200, 200, 200, 300, 300, 300, 400, 400, 400, 500, 500, 600, 600, 700, 700, 800, 800, 900, 1000, 1000, 1100]
-````
-
-defines one label where the values for each of the datapoints are
-given in the list. Note that the name of the extra_label (in this case
-`idat2bin` is completely arbitrary, and will be used for plotting
-purposes (LaTeX math syntax is allowed as well). However, adding labels
-manually for each point can be tedious. This should only be reserved
-for information that cannot be recovered from the kinematics as
-defined in the CommonData file. Instead, new labels can be generated
-programmatically: every function defined in `validphys2/src/validphys/plotoptions/labelers.py`
-is a valid label. These functions take as keyword arguments the
-(possibly transformed) kinematical variables, as well as any extra
-label declared in the plotting file. For example, one might declare:
-
-````
-def high_xq(k1, k2, k3, **kwargs):
-    return k1 > 1e-2 and k2 > 1000
-
-````
-
-Note that it is convenient to always declare the `**kwargs`
-parameter so that the code doesn't crash when the function is called
-with extra arguments. Similarly to the kinematics transforms, it is
-possible to decorate them with a `@label` describing a nicer latex
-label than the function name. For example:
-
-````
-@label(r"$I(x>10^{-2})\times I(Q > 1000 GeV)$")
-def high_xq(k1, k2, k3, **kwargs):
-    return (k1 > 1e-2) & (k2 > 1000)
-
-````
-
-### Plotting and grouping
-
-The variable in which the data is plotted is simply
-declared as
-
-````
-x: <label>
-````
-
-For example:
-
-````
-x: k1
-````
-
-If a `line_by` key is specified, variables with different values for
-each of the labels listed, will be represented as different lines. For
-example,
-
-````
-line_by:
-  - k2
-````
-
-for DIS would mean that the data in the same Q bin is plotted in the
-same line.
-
-Similarly, it is possible to define a `figure_by` key: Points
-with different values for the listed keys will be split across
-separated figures. For example:
-
-````
-figure_by:
-  - idat2bin
-  - high_xq
-````
-
-### Transforming the result
-
-By default the *y* axis represents the central value and error. However,
-it is possible to define a results_transform in the plotting file:
-
-````
-result_transform: qbinexp
-````
-
-The value must be a function declared in
-`validphys2/src/validphys/plotoptions/results_transform.py`
-taking the error, the central value, as well as all the labels, and
-returning a new error and central value. For example:
-
-````
-def qbinexp(cv, error, **labels):
-    q = labels['k2']
-    qbin = bins(q)
-    return 10**qbin*cv, 10**qbin*error
-````
-
-### Plotting options
-
-Several plotting options can be specified.
-These include
-
- - x/y_scale: 'linear' or 'log'.
- - x/y_label: Any string, possibly latex formatted. Note that the
-	 x_label will be deduced automatically.
-
-### Overriding configuration for normalized plots
-
-When the results are to be plotted as a ratio, it may be convenient to
-alter the configuration of the plots, for example by changing the
-`line_by` labels into `figure_by` (because otherwise the points would
-overlap), or by changing the scale from log to linear. To do so, we
-specify the options we want to override in a `normalize` key.
-Everything defined inside will take precedence when we produce a ratio
-plot and will be ignored for absolute value plots. For example:
-```yaml
-x: k1
-
-x_label: '$\left\|\eta/y\right|$'
-
-y_label: '$d\sigma/dy$ (fb)'
-
-line_by:
-  - Boson
-
-normalize:
-    figure_by:
-        - Boson
-
-extra_labels:
-   Boson:  ["$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^+$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$W^-$","$Z$","$Z$","$Z$","$Z$","$Z$","$Z$","$Z$","$Z$"]
-
-```
-Here, we would split the data by different figure files for each
-unique value of the key `Boson` (which is defined explicitly as an
-`extra_label`), but only one plot with the three bosons split across
-different lines will be produced in absolute value plots.
-
-### Metadata keys
-
-Plotting files are also used to define metadata related to the various
-datasets. These keys include:
-
-  - `experiment` (string): The experiment which produced the experimental data.
-  - `process_description` (string): A description of the physical process
-  associated to the dataset. This would typically be defined in the
-  `PLOTTINGTYPE` files.
-  - `data_reference` (string): a LaTeX key corresponding to the
-  reference of the experimental paper.
-  - `theory_reference` (string): a LaTeX key corresponding to the
-  codes used to compute the theory predictions.
-
-## Example
-
-A complete example (all keys are optional) looks like this:
-
-```yaml
-
-dataset_label: "Some hypothetical dataset"
-experiment: ATLAS
-x: k3
-x_scale: log
-kinematics_override: dummy_transform #defined in transforms.py
-line_by:
-  - k2
-
-figure_by:
-  - idat2bin #defined below
-  - high_xq  #defined in labelers.py
-
-normalize: # Change the scale for ratio plots
-    x_scale: linear
-
-extra_labels:
-    idat2bin:  [0, 0, 0, 0, 0, 0, 0, 0, 100, 100, 100, 100, 100, 200, 200, 200, 300, 300, 300, 400, 400, 400, 500, 500, 600, 600, 700, 700, 800, 800, 900, 1000, 1000, 1100]
-
-````
diff --git a/doc/sphinx/source/external-code/apfelcomb.md b/doc/sphinx/source/external-code/apfelcomb.md
deleted file mode 100644
index 5d99cc5bd0..0000000000
--- a/doc/sphinx/source/external-code/apfelcomb.md
+++ /dev/null
@@ -1,59 +0,0 @@
-```eval_rst
-.. _apfelcomb:
-```
-# Using APFELcomb
-
-APFELcomb is the project that allows the user to generate ``FK`` tables. These
-are lookup tables that contain the relevant information to compute theoretical 
-predicitons in the NNPDF framework. Broadly speaking, this is achieved by 
-taking DGLAP evolution kernels from ``APFEL`` and combining them with
-interpolated parton-level observable kernels of various forms. The mechanism
-behind APFELcomb is presented in [arXiv:1605.02070]().
-
-APFELcomb is available from 
-```
-https://github.com/NNPDF/apfelcomb
-````
-The various data formats used in APFELcomb are described in [Experimental data files](../data/exp-data-files.html#exp-data-files).
-
-
-APFELcomb depends on the following libraries
-* [APFEL](https://github.com/scarrazza/apfel)
-* [nnpdf](https://github.com/NNPDF/nnpdf)
-* [APPLgrid](https://github.com/NNPDF/external/tree/master/applgrid-1.4.70-nnpdf)
-
-and data files from
-
-* [applgrids](https://github.com/NNPDF/applgrids)
-
-There are various ways of generating the latter, as explained in [How to 
-generate applgrids](../tutorials/APPLgrids.md).
-
-Once the above libraries and data files are set up, the APFELcomb project can be
-compiled as follows
-```
-make 
-```
-Compilation flags and various paths are defined in `Makefile.inc`. These are
-mostly inferred from `<package>-config` files with the exception of
-* `RESULTS_PATH` (default `./results`) Defines the path results are written to.
-* `APPL_PATH` (default `../applgrids`) Defines the path to the `applgrid` repository.
-* `DB_PATH` (default `.db`) Defines the path to the APFELcomb database.
-
-The defaults are configured assuming that both the `nnpdf` and `applgrid` 
-repositories are located at `../`.
-
-Each `FK` table is generated piecewise in one or more `subgrids`. The `subgrids`
-implemented in APFELcomb can be displayed by running the script
-```
-./scripts/disp_grids.py
-```
-Typically DIS and `FKGenerator` DY tables are made of only one subgrid, whereas
-FK tables generated from APPLgrids have one subgrid per APPLgrid file. 
-How subgrids are merged into grids, and the generation parameters of each 
-subgrid, is specified in the `db/apfelcomb.db` database. The database itself 
-is not stored in the repository, but it is built from the sqlite dump at 
-`db/apfelcomb.dat`. This is done automatically by the APFELcomb makefile.
-Detailed instructions to generate/implement `FK` tables for individual 
-experiments and/or a compelte theory are provided in 
-[How to generate/implement FK tables](../tutorials/apfelcomb.md).
diff --git a/doc/sphinx/source/external-code/index.rst b/doc/sphinx/source/external-code/index.rst
index d5340ef808..824dbb10b5 100644
--- a/doc/sphinx/source/external-code/index.rst
+++ b/doc/sphinx/source/external-code/index.rst
@@ -18,4 +18,3 @@ various external codes that you will frequently encounter are described.
    ./pdf-codes.md
    ./grids.md
    ./cross-secs.md
-   ./apfelcomb
diff --git a/doc/sphinx/source/tutorials/apfelcomb.md b/doc/sphinx/source/tutorials/apfelcomb.md
deleted file mode 100644
index c196607146..0000000000
--- a/doc/sphinx/source/tutorials/apfelcomb.md
+++ /dev/null
@@ -1,300 +0,0 @@
-```eval_rst
-.. _tutorialfktables:
-```
-
-# How to generate and implement FK tables
-
-APFELcomb is the project that allows the user to generate `FK` tables.
-These are lookup tables that contain the relevant information to compute
-theoretical predicitons in the NNPDF framework. Broadly speaking, this is
-achieved by taking DGLAP evolution kernels from ``APFEL`` and combining them
-with interpolated parton-level observable kernels in the APPLgrid or
-FastNLO format
-(see [How to generate APPLgrid and fastNLO tables](../tutorials/APPLgrids.md)).
-The various data formats used in APFELcomb are described in [Experimental data files](../data/exp-data-files.html#exp-data-files).
-
-The user is strongly encouraged to go through that note with care, in order to
-familiarise himself with the features and the structure of the APFELcomb
-project.
-
-## Generate a FK table
-Each `FK` table is generated piecewise in one or more `subgrids`. The `subgrids`
-implemented in APFELcomb can be displayed by running the script
-```
-./scripts/disp_grids.py
-```
-The generation of each subgrid can by achieved with the following command
-```
-./apfel_comb <source=app/dis/dyp> <subgrid id> <theory id>
-```
-where `<app/dis/dyp>` specifies whether the subgrid is in the APP, DIS or
-DYP subgrid categories in the database (`db/apfelcomb.dat`), where:
-- APP: refers to applgrids, partonic cross sections produced externally by a MonteCarlo generator.
-- DIS: Deep Inelastic Scatting, coefficient fucnctions computed by `APFEL`.
-- DYP: Drell-Yan, partonic cross sections computed by `APFEL`.
-
-`<subgrid id>` is the corresponding ID in that database (visible in the `disp\_grids` script)
-and `<theory id>` specifies the desired NNPDF theory index (the entry in
-nnpdf/nnpdfcpp/data/theory.db). As an example:
-```Shell
-./apfel_comb app 500 53
-```
-will generate the subgrid for CDFZRAP and theory 53
-(NNPDF3.1 NNLO fitted charm).
-The resulting FK subgrid
-will be written out to
-
-```
-$RESULTS_PATH/theory_<theory id>/subgrids/FK_<setname>_<subgrid id>.dat.
-```
-
-APPLgrids and FastNLO tables should be properly stored in the `applgrids` folder by means of
-[Git LFS](https://git-lfs.github.com/) (see [here](storage) for details).
-
-Once all the relevant subgrids for the desired dataset(s) are generated,
-one should run
-```
-./merge_allgrids.py <theory id>
-```
-which will loop over all datasets and attempt to merge their subgrids into a
-complete `FK` table. The resulting final `FK` table should be stored at
-```
-$RESULTS_PATH/theory_<theory id>/fastkernel/FK_<setname>.dat.
-```
-
-## Implement a new FK table
-Whenever a new dataset is implemented, it should be accompanied by the
-corresponding `FK` table. To implement a new `FK` table, one must first add
-a corresponding entry into the apfelcomb database (by editing the
-`./db/apfelcomb.dat` file) under the `grids` table.
-These entries are comprised of the following fields.
-- **id**		- The primary key identifier of the FK table.
-- **setname**		- The COMMONDATA set name of the corresponding dataset.
-- **name**		- The name of the FK table.
-- **description**	- A one-line description of the FK table.
-- **nx**		- The number of x-grid interpolation points.
-- **positivity**	- A flag specifying if the FK table is a positivity set.
-- **source**		- Specifies if the corresponding subgrids are [APP/DIS/DYP].
-
-Note that **setname** and **name** may be different in the case of compound
-observables such as ratios, where multiple FK tables are required to compute
-predictions for a single dataset. The `nx` parameter specifies the
-interpolation accuracy of the dataset (this must currently be tuned by hand,
-e.g. by making sure that the native applgrid and the generated FK tables lead
-to numerically equivalent results once they are convolved with the same PDF
-set). The `positivity` parameter restricts the observable to NLO matrix
-elements and disables target-mass corrections.
-Once this entry is complete, one must move on to adding entries in the
-corresponding subgrid table.
-
-### Implementing a new APPLgrid/FastNLO subgrid
-
-To add a new APPLgrid- or FastNLO--based subgrid, one must add a corresponding entry into
-the `app\_subgrids` table of the apfelcomb database. One entry should be added
-for each APPLgrid making up the final target `FK` table.
-The entries have the following fields:
-- **id** 	- The primary key identifier of the subgrid.
-- **fktarget**	- The name of the FK table this subgrid belongs to.
-- **applgrid**	- The filename of the corresponding APPLgrid.
-- **fnlobin**   - The fastNLO index if the table is a fastNLO grid, or -1 if not.
-- **ptmin**	- The minimum perturbative order (1 when the LO is zero, 0 if not).
-- **pdfwgt**	- A boolean flag, 1 if the APPLgrid has PDF weighting, 0 if not (depending on how the native applgrid was generated).
-- **ppbar**	- A boolean flag, 1 if the APPLgrid should be transformed to *ppbar* beams, 0 if not.
-- **mask**	- A boolean mask, specifying which APPLgrid entries should be considered data points.
-- **operators** - A list of operators to handle certain special cases (see below).
-The mask should have as many entries as APPLgrid bins and each boolean value
-should be separated by a space. For example, for an applgrid with five bins
-where we want to exclude the penultimate bin, the mask would be:
-```
-1 1 1 0 1
-```
-Note that there is no way to know a priori whether `pdfwgt` should be set to 0 or to 1, that is
-whether the grid is unweighted or weighted. However, this can easily be checked a posteriori, since
-setting `pdfwgt` to the wrong value should lead to `./apfel_comb` failing due to a large relative
-error between the value in the APPLgrid and that in the FK table.
-
-The applgrid filename assumes that the grid can be found at
-```
-$APPL_PATH/<setname>/<applgrid>
-```
-where `APPL_PATH` is defined in Makefile.am, `<setname>` is the corresponding
-`COMMONDATA` set name specified in the grids table (that should match the name
-used in the [buildmaster](../tutorials/buildmaster.md) implementation), and `<applgrid>`
-is specified in the field described above.
-
-### Implementing a new DIS or DYP subgrid
-New DIS or DYP subgrids should be entered respectively into the
-`dis_subgrids` or `dyp_subgrids` tables of the apfelcomb database.
-Typically only one subgrid is needed per DIS or DYP FK table.
-Each subgrid entry has the following fields:
-- **id**	- The primary key identifier of the subgrid
-- **fktarget**	- The name of the FK table this subgrid belongs to
-- **operators** - A list of operators to handle certain special cases (see Subgrid operators).
-For DIS there is one additional field:
-- **process**	- The process string of the observable (e.g DIS\_F2P, see DIS Processes in APFEL below)
-
-### DIS Processes in APFEL
-
-For DIS processes and since the coefficient functions are computed solely with APFEL, one needs to specify the process of the observable, in `dis_subgrids` following `APFEL`'s nomenclature.
-The list of processes below can be found in `apfel/src/DIS/FKObservables.f` in the headers corresponding to the different observables called.
-
-**Deep Inelastic Scattering Structure Functions**:
-- DIS_F2L: [EM] Light structure function F2light (electron-proton)
-- DIS_F2U: [EM] Up structure function F2u (electron-proton[up])
-- DIS_F2d: [EM] Down structure function F2d (electron-proton[down])
-- DIS_F2S: [EM] Strange structure function F2s (electron-proton[strange])
-- DIS_F2C: [EM] Charm structure function F2charm (electron-proton)
-- DIS_F2B: [EM] Bottom structure function F2bottom (electron-proton)
-- DIS_F2T: [EM] Top structure function F2top (electron-proton)
-- DIS_F2D: [EM] Deuteron structure function F2 (electron-isoscalar)
-- DIS_FLL: [EM] Light structure function FLlight (electron-proton)
-- DIS_FLC: [EM] Charm structure function FLcharm (electron-proton)
-- DIS_FLB: [EM] Bottom structure function FLbottom (electron-proton)
-- DIS_FLT: [EM] Top structure function FLtop (electron-proton)
-- DIS_FLD: [EM] Deuteron structure function FL (electron-isoscalar)
-- DIS_F2P_NC: [NC] Proton structure function F2 (electron-isoscalar)
-- DIS_F2P: [EM] Proton structure function F2 (electron-proton)
-- DIS_FLP_NC: [NC] Proton structure function FL (electron-proton)
-- DIS_FLP_CON_NC: [NC] Proton structure function FL (electron-proton)
-- DIS_FLP: [EM] Proton structure function FL (electron-proton)
-- DIS_F3P_NC: [NC] F3 structure function (electron-proton)
-
-**Deep Inelastic Scattering Reduced Cross-Sections**:
-- DIS_NCE_L: [NC] Electron scattering Reduced Cross-Section, light (electron-proton)
-- DIS_NCP_L: [NC] Positron scattering Reduced Cross-Section, light (positron-proton)
-- DIS_NCE_CH: [NC] Electron scattering Reduced Cross-Section, charm (electron-proton)
-- DIS_NCP_CH: [NC] Positron scattering Reduced Cross-Section, charm (positron-proton)
-- DIS_NCE_BT: [NC] Electron scattering Reduced Cross-Section, bottom (electron-proton)
-- DIS_NCP_BT: [NC] Positron scattering Reduced Cross-Section, bottom (positron-proton)
-- DIS_NCE_TP: [NC] Electron scattering Reduced Cross-Section, top (electron-proton)
-- DIS_NCP_TP: [NC] Positron scattering Reduced Cross-Section, top (positron-proton)
-- DIS_NCE_D: [NC] Electron scattering Reduced Cross-Section on deuteron, inclusive (electron-isosclar)
-- DIS_NCP_D: [NC] Positron scattering Reduced Cross-Section on deuteron, inclusive (positron-isoscalar)
-- DIS_NCE: [NC] Electron scattering Reduced Cross-Section, inclusive (electron-proton)
-- DIS_NCP: [NC] Positron scattering Reduced Cross-Section, inclusive (positron-proton)
-- DIS_CCE_L: [CC] Electron scattering Reduced Cross-Section, light (electron-proton)
-- DIS_CCP_L: [CC] Positron scattering Reduced Cross-Section, light (positron-proton)
-- DIS_CCE_C: [CC] Electron scattering Reduced Cross-Section, charm (electron-proton)
-- DIS_CCP_C: [CC] Positron scattering Reduced Cross-Section, charm (positron-proton)
-- DIS_CCE: [CC] Electron scattering Reduced Cross-Section, inclusive (electron-proton)
-- DIS_CCP: [CC] Positron scattering Reduced Cross-Section, inclusive (positron-proton)
-
-**Deep Inelastic Scattering Reduced Cross-Sections (heavy-ion)**:
-- DIS_SNU_L_Pb: [CC] Neutrino scattering Reduced Cross-Section, light (neutrino-lead)
-- DIS_SNB_L_Pb: [CC] Antineutrino scattering Reduced Cross-Section, light (antineutrino-lead)
-- DIS_SNU_C_Pb: [CC] Neutrino scattering Reduced Cross-Section, charm (neutrino-lead)
-- DIS_SNB_C_Pb: [CC] Antineutrino scattering Reduced Cross-Section, charm (antineutrino-lead)
-- DIS_SNU_Pb: [CC] Neutrino scattering Reduced Cross-Section, inclusive (neutrino-lead)
-- DIS_SNB_Pb: [CC] Antineutrino scattering Reduced Cross-Section, inclusive (antineutrino-lead)
-- DIS_SNU_L: [CC] Neutrino scattering Reduced Cross-Section, light (neutrino-isoscalar)
-- DIS_SNB_L: [CC] Antineutrino scattering Reduced Cross-Section, light (antineutrino-isoscalar)
-- DIS_SNU_C: [CC] Neutrino scattering Reduced Cross-Section, charm (neutrino-isoscalar)
-- DIS_SNB_C: [CC] Antineutrino scattering Reduced Cross-Section, charm (antineutrino-isoscalar)
-- DIS_SNU: [CC] Neutrino scattering Reduced Cross-Section, inclusive (neutrino-isoscalar)
-- DIS_SNB: [CC] Antineutrino scattering Reduced Cross-Section, inclusive (antineutrino-isoscalar)
-- DIS_DM_NU: [CC] Dimuon neutrino cross section (neutrino-iron)
-- DIS_DM_NB: [CC] Dimuon anti-neutrino cross section (antineutrino-iron)
-
-**Single-Inclusive electron-positron annihilation, Time-Like Evolution (SIA)**:
-- SIA_F2: [NC] SIA structure function F2 =  FT + FL (electron-proton)
-- SIA_FL: [NC] SIA structure function FL (electron-proton)
-- SIA_FA: [NC] SIA structure function FA (electron-proton)
-- SIA_XSEC_NF4: [NC] SIA absolute cross section (nf=4) (electron-proton)
-- SIA_XSEC: [NC] SIA absolute cross section (electron-proton)
-- SIA_NORM_XSEC_LONG_L: [NC] SIA normalized light longitudinal cross section (electron-proton)
-- SIA_NORM_XSEC_LONG_BT: [NC] SIA normalized bottom longitudinal cross section (electron-proton)
-- SIA_NORM_XSEC_LONG: [NC] SIA normalized total longitudinal cross section (electron-proton)
-- SIA_NORM_XSEC_L: [NC] SIA normalized light cross section (electron-proton)
-- SIA_NORM_XSEC_CH: [NC] SIA normalized charm cross section (electron-proton)
-- SIA_NORM_XSEC_BT: [NC] SIA normalized bottom cross section (electron-proton)
-- SIA_NORM_XSEC_TP: [NC] SIA normalized top cross section (electron-proton)
-- SIA_NORM_XSEC_NF4: [NC] SIA normalized total cross section (nf=4) (electron-proton)
-- SIA_NORM_XSEC: [NC] SIA normalized total cross section (electron-proton)
-
-
-### Subgrid operators
-Subgrid operators are used to provide certain subgrid-wide transformations that
-can be useful in certain circumstances. They are formed by a key-value pair
-with syntax:
-```
-<KEY>:<VALUE>
-```
-If using multiple operators, they should be comma-separated. Currently these
-operators are implemented:
-- \*:*V* - Duplicate the subgrid data point (there must be only one for this operator) *V* times.
-- +:*V*  - Increment the starting data point index of this subgrid by *V*.
-- N:*V*  - Normalise all data points in this subgrid by *V*.
-
-The \* operator is typically used for normalised cross-sections, where the
-total cross-section computation (a single data point) must be duplicated
-*N\_dat* times to correspond to the size of the `COMMONDATA` file.
-The + operator is typically used to compensate for missing subgrids,
-for example when a `COMMONDATA` file begins with several data points that
-cannot yet be computed from theory, the + operator can be used to skip those
-points. The N operator is used to perform unit conversions or the like.
-
-### Compound files and C-factors
-If the new dataset is a compound observable (that is, theory predictions are a
-function of more than one FK-product), then one should write a corresponding
-`COMPOUND` file as described in [Theory data files](../data/th-data-files.html#compound-file-format). This compound file should be stored
-in the APFELcomb repository under the `compound` directory.
-
-C-factors should be in the format specified in [Theory data files](../data/th-data-files.html#cfactor-file-format) and stored in the nnpdfcpp
-repository under
-```
-nnpdf/nnpdfcpp/data/N*LOCFAC/
-```
-directory.
-
-### Important note on subgrid ordering
-If the FK table consists of more than one subgrid to be merged into a single
-table, then the ordering of the subgrids in their subgrid **id** is vital.
-The `merge_allgrids.py` script will merge the subgrids
-in order of their **id**. So if one is constructing an FK table for a merged
-W+/W-/Z dataset, it is crucial that the ordering of the corresponding W+/W-/Z
-subgrids in id matches the ordering in `COMMONDATA`.
-
-### Important note on committing changes
-If one makes a modification to the `apfelcomb.db` database, once he is happy
-with it one *must* export it to the plain-text dump file at `db/apfelcomb.dat`.
-This file must then be committed. It is important to note that the binary
-sqlite database is not stored in the repository.
-
-A helper script is provided to do this. If you want to convert your binary
-database to the text dump, run `db/generate_dump.sh` and then commit the
-resulting `apfelcomb.dat` file.
-
-Also, note that, if one conversely modifies the `apfelcomb.dat` file, one has
-to delete and re-generate the sqlite database `apfelcomb.db` This is easily
-done by running `db/generate_database.sh`.
-
-## Helper scripts
-
-Several helper scripts are provided to make using APFELcomb easier
-(particularly when generating a full set of FK tables for a particular theory).
-- `scripts/disp_grids.py` displays a full list of APPLgrid/FastNLO, DIS or DYP subgrids
-implemented in APFELcomb.
-- `run_allgrids.py [theoryID] [job script]` scans the results directory and
-submits jobs for all missing subgrids for the specified theory.
-- `test_submit.py` is an example [job script] to be used for `run\_allgrids.py`.
-These scripts specify how jobs are launched on a given cluster.
-- `hydra_submit.py` is the [job script] for the HYDRA cluster in Oxford.
-- `merge_allgrids.py [theoryID]` merges all subgrids in the results directory
-for a specified theory into final FK tables. This does not delete subgrids.
-- `finalise.sh [theoryID]` runs C-factor scaling, copies `COMPOUND` files,
-deletes the subgrids, and finally compresses the result into a theory.tgz file
-ready for upload.
-- `results/upload_theories` automatically upload to the server all the
-theory.tgz files that have been generated.
-
-## Generating a complete theory
-
-The general workflow for generating a complete version of a given theory (on
-a cluster) cluster is then:
-```
-./run_allgrids.py <theoryID> ./hydra_submit.sh # Submit all APFELcomb subgrid-jobs
-# Once all subgrid jobs have successfully finished
-./merge_allgrids.py <theoryID> # Merge subgrids into FK tables
-# If merging is successful
-./finalise.sh <theoryID>
-# Results in a final theory at ./results/theory_<theoryID>.tgz
\ No newline at end of file
diff --git a/doc/sphinx/source/tutorials/index.rst b/doc/sphinx/source/tutorials/index.rst
index e6de646702..32a89d51dc 100644
--- a/doc/sphinx/source/tutorials/index.rst
+++ b/doc/sphinx/source/tutorials/index.rst
@@ -12,7 +12,6 @@ Running fits
    :maxdepth: 1
 
    ./run-fit.md
-   ./run-legacy-fit.rst
    ./run-iterated-fit.rst
    ./general_th_covmat.rst
    ./thcov_tutorial.rst
@@ -38,7 +37,6 @@ Adding new data
    ./buildmaster.rst
    ./APPLgrids.md
    ./APPLgrids_comp.md
-   ./apfelcomb.md
 
 Closure tests
 -------------
diff --git a/pyproject.toml b/pyproject.toml
index d681d1bfca..fb8f155847 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -92,7 +92,7 @@ hypothesis = {version = "*", optional = true}
 recommonmark = {version = "*", optional = true}
 sphinxcontrib-bibtex = {version = "*", optional = true}
 sphinx_rtd_theme = {version = "*", optional = true}
-sphinx = {version = "^4.0.2", optional = true}
+sphinx = {version = "^5.0", optional = true}
 # qed
 fiatlux = {version = "*", optional = true}
 # without lhapdf

From 1f1f6763693f1e61e3ecb6132f4636bc14c22069 Mon Sep 17 00:00:00 2001
From: juacrumar <juacrumar@lairen.eu>
Date: Sun, 3 Mar 2024 13:13:52 +0100
Subject: [PATCH 3/4] fix miscellaneous warnings and errors

---
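Note: most of the PTevol.rst changes in this patch follow one mechanical
pattern, in which a one-line ``.. math:: <expr>`` directive becomes a bare
``.. math::`` line, a blank line, and the expression indented by four spaces.
The sketch below shows one way such a rewrite could be scripted. It is not the
tool used to produce this patch; the function name ``expand_math_directives``
and the regular expression are assumptions made for illustration only.

```python
import re

# Matches a one-line directive such as ".. math:: a = b" and keeps its indentation.
# A bare ".. math::" line (already in block form) does not match.
ONE_LINE_MATH = re.compile(r"^(\s*)\.\. math::[ \t]+(\S.*)$")


def expand_math_directives(text: str, indent: str = "    ") -> str:
    """Rewrite one-line ``.. math:: expr`` directives into block form.

    Hypothetical helper: every other line is passed through unchanged.
    """
    out = []
    for line in text.splitlines():
        match = ONE_LINE_MATH.match(line)
        if match is None:
            out.append(line)
            continue
        lead, expr = match.groups()
        # Directive line, blank separator, then the indented expression.
        out.extend([f"{lead}.. math::", "", f"{lead}{indent}{expr.rstrip()}"])
    result = "\n".join(out)
    # Preserve a trailing newline if the input had one.
    return result + "\n" if text.endswith("\n") else result


if __name__ == "__main__":
    print(expand_math_directives(".. math:: a_{0} = a_{s}(Q_{0}^{2})"))
```

The sketch only handles the simple one-line case. Hunks that also reorder or
merge neighbouring blank lines would still need to be checked by hand.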
 doc/sphinx/source/get-started/index.rst       |   2 +
 doc/sphinx/source/index.rst                   |   2 +-
 doc/sphinx/source/theory/PTevol.rst           | 306 +++++++++++++-----
 doc/sphinx/source/theory/index.rst            |   1 +
 doc/sphinx/source/theory/theoryindex.rst      |   2 +-
 doc/sphinx/source/tutorials/index.rst         |   1 +
 pyproject.toml                                |   3 +-
 .../src/validphys/datafiles/disp_theory.py    |  13 -
 8 files changed, 238 insertions(+), 92 deletions(-)

diff --git a/doc/sphinx/source/get-started/index.rst b/doc/sphinx/source/get-started/index.rst
index ee9b382c13..59c3a0c6b7 100644
--- a/doc/sphinx/source/get-started/index.rst
+++ b/doc/sphinx/source/get-started/index.rst
@@ -1,3 +1,5 @@
+.. _get-started:
+
 Getting started
 ===============
 
diff --git a/doc/sphinx/source/index.rst b/doc/sphinx/source/index.rst
index 41d2a21d1a..482da3e389 100644
--- a/doc/sphinx/source/index.rst
+++ b/doc/sphinx/source/index.rst
@@ -24,7 +24,7 @@ repositories. Along with this online documentation, we release the
 
 The code can be used to produce the ingredients needed for PDF fits, to run the fits themselves, and to analyse the results. This is the first framework used to produce a global PDF fit made publicly  available, enabling for detailed external validation and reproducibility of the NNPDF4.0 analysis. Moreover, the code enables the user to explore a number of phenomenological applications, such as the assessment of the impact of new experimental data on PDFs, the effect of changes in theory settings on the resulting PDFs and a fast quantitative comparison between theoretical predictions and experimental data over a broad range of observables.
 
-If you are a new user head along to :ref:`getstarted` and check out the :ref:`Tutorials`.
+If you are a new user head along to :ref:`get-started` and check out the :ref:`Tutorials`.
 
 The NNPDF team
 ==============
diff --git a/doc/sphinx/source/theory/PTevol.rst b/doc/sphinx/source/theory/PTevol.rst
index 76a21add7e..03cd9433d7 100644
--- a/doc/sphinx/source/theory/PTevol.rst
+++ b/doc/sphinx/source/theory/PTevol.rst
@@ -1,3 +1,5 @@
+.. _ptoevolution:
+
 | **Notes on Perturbative Evolution and PDF flavor decomposition**
 | AG & MU (with the help of Jacob Haddo, summer student) 
 
@@ -11,32 +13,48 @@ Notation
 
 Define the coupling
 
-.. math:: a_{s} = \frac{\alpha_{s}(Q^{2})}{4\pi}
+.. math::
+
+    a_{s} = \frac{\alpha_{s}(Q^{2})}{4\pi}
+
+.. math::
 
-.. math:: a_{0} = a_{s}(Q_{0}^{2})
+    a_{0} = a_{s}(Q_{0}^{2})
 
 which satisfies the Renormalisation Group Equation
 
-.. math:: \frac{da_{s}}{d\ln\mu^{2}} = \beta(a_{s}) = - \sum_{n = 0}^{\infty}\beta_{n}a_{s}^{n + 2}\,,
+.. math::
+
+    \frac{da_{s}}{d\ln\mu^{2}} = \beta(a_{s}) = - \sum_{n = 0}^{\infty}\beta_{n}a_{s}^{n + 2}\,,
 
 where
 
-.. math:: \beta_0 = \frac{11}{3}C_A - \frac{4}{3}T_FN_f
+.. math::
 
-.. math:: \beta_1 = \frac{34}{3}C^2_A - 4C_FT_FN_f - \frac{20}{3}C_AT_FN_f
+    \beta_0 = \frac{11}{3}C_A - \frac{4}{3}T_FN_f
 
-.. math:: \beta_2 = \frac{2857}{54}C^3_A + 2C^2_FT_FN_f - \frac{205}{9}C_FC_AT_FN_f - \frac{1415}{27}C^2_AT_FN_f - \frac{44}{9}C_FT^2_FN^2_f - \frac{158}{27}C_AT^2_FN^2_f.
+.. math::
+
+    \beta_1 = \frac{34}{3}C^2_A - 4C_FT_FN_f - \frac{20}{3}C_AT_FN_f
+
+.. math::
+
+    \beta_2 = \frac{2857}{54}C^3_A + 2C^2_FT_FN_f - \frac{205}{9}C_FC_AT_FN_f - \frac{1415}{27}C^2_AT_FN_f - \frac{44}{9}C_FT^2_FN^2_f - \frac{158}{27}C_AT^2_FN^2_f.
 
 
 * **Mellin transform**
 
 The Mellin transform of a function is defined as
 
-.. math:: f(N,Q^{2}) = \int_{0}^{1}dx\, x^{N - 1}f(x,Q^{2})\,,
+.. math::
+
+    f(N,Q^{2}) = \int_{0}^{1}dx\, x^{N - 1}f(x,Q^{2})\,,
 
 and we can get back the x-space distribution as
 
-.. math:: f(x,Q^{2}) = \int_{c - i\infty}^{c + i\infty}\mspace{6mu}\frac{dN}{2\pi i}\, x^{- N}f(N,Q^{2})\,,
+.. math::
+
+    f(x,Q^{2}) = \int_{c - i\infty}^{c + i\infty}\mspace{6mu}\frac{dN}{2\pi i}\, x^{- N}f(N,Q^{2})\,,
 
 where the intercept c of integration contour is chosen to be to the
 right of all singularities of f(N,Q2) in the complex N plane.
@@ -47,12 +65,16 @@ Parton evolution
 The scale dependence of the parton distribution functions is described
 by the renormalisation group equations for mass factorisation (DGLAP)
 
-.. math:: \mu^{2}\frac{\partial}{\partial\mu^{2}}f_{i}(x,\mu^{2}) = P_{ij}(x,\mu^{2}) \otimes f(x,\mu^{2})\,
+.. math::
+
+    \mu^{2}\frac{\partial}{\partial\mu^{2}}f_{i}(x,\mu^{2}) = P_{ij}(x,\mu^{2}) \otimes f(x,\mu^{2})\,
 
 where f\ :sub:`i` is the generic parton distribution function, P\ :sub:`ij` are the
 Altarelli-Parisi kernels and :math:`\otimes` denotes the Mellin convolution 
 
-.. math:: f(x) \otimes g(x) \equiv \int_{x}^{1}dyf(y)g\left( \frac{x}{y} \right)
+.. math::
+
+    f(x) \otimes g(x) \equiv \int_{x}^{1}dyf(y)g\left( \frac{x}{y} \right)
 
 We have a system of (2n\ :sub:`f` + 1) coupled
 integro-differential equations, where the summation over the parton
@@ -60,7 +82,9 @@ species j is understood.
 
 The N\ :sup:`m`\ LO approximation for the splitting functions :math:`P_{ij}(x,\mu^2)`
 
-.. math:: P_{ij}^{N^{m}LO}(x,\mu^{2}) = \sum_{k = 0}^{m}a_{s}^{k + 1}(\mu^{2})P_{ij}^{(k)}(x)
+.. math::
+
+    P_{ij}^{N^{m}LO}(x,\mu^{2}) = \sum_{k = 0}^{m}a_{s}^{k + 1}(\mu^{2})P_{ij}^{(k)}(x)
 
 where we note that the only dependence on the scale :math:`\mu^2`
 is through the coupling constant :math:`a_s(\mu^2)`. The splitting
@@ -81,9 +105,13 @@ is possible to rewrite the system of equations as :math:`(2N_f - 1)` equations
 describing the
 independent evolution of the non-singlet quark asymmetries and
 
-.. math:: q_{NS,ij}^\pm = q_i \pm Q_i - (q_j \pm Q_j)
+.. math::
+
+    q_{NS,ij}^\pm = q_i \pm Q_i - (q_j \pm Q_j)
+
+.. math::
 
-.. math:: q_{NS}^v = \sum_{i = 1}^{N_f}(q_i - Q_i)
+    q_{NS}^v = \sum_{i = 1}^{N_f}(q_i - Q_i)
 
 and a system of 2 equations describing the coupled evolution of the
 singlet and gluon parton distributions.
@@ -106,7 +134,9 @@ singlet and gluon parton distributions.
 
 where the singlet combination, :math:`\Sigma`, is defined as
 
-.. math:: \Sigma = \sum_{i = 1}^{N_{f}}(q_{i} + {\overline{q}}_{i})\,,
+.. math::
+
+    \Sigma = \sum_{i = 1}^{N_{f}}(q_{i} + {\overline{q}}_{i})\,,
 
 where :math:`N_{f}` is the number of *light flavors*, *i.e.* the number
 of flavors with :math:`m_{q}^{2} < Q^{2}`.
@@ -121,27 +151,49 @@ The evolution of the individual quark distributions with the scale can
 be computed by introducing the following set of non-singlet
 distributions:
 
-.. math:: \begin{matrix} V & = & u^{-} + d^{-} + s^{-} + c^{-} + b^{-} + t^{-} \\ \end{matrix}
+.. math::
+
+    \begin{matrix} V & = & u^{-} + d^{-} + s^{-} + c^{-} + b^{-} + t^{-} \\ \end{matrix}
+
+.. math::
+
+    \begin{matrix} V_{3} & = & u^{-} - d^{-} \\ \end{matrix}
+
+.. math::
+
+    \begin{matrix} V_{8} & = & u^{-} + d^{-} - 2s^{-} \\ \end{matrix}
+
+.. math::
+
+    \begin{matrix} V_{15} & = & u^{-} + d^{-} + s^{-} - 3c^{-} \\ \end{matrix}
+
+.. math::
+
+    \begin{matrix} V_{24} & = & u^{-} + d^{-} + s^{-} + c^{-} - 4b^{-} \\ \end{matrix}
 
-.. math:: \begin{matrix} V_{3} & = & u^{-} - d^{-} \\ \end{matrix}
+.. math::
 
-.. math:: \begin{matrix} V_{8} & = & u^{-} + d^{-} - 2s^{-} \\ \end{matrix}
+    \begin{matrix} V_{35} & = & u^{-} + d^{-} + s^{-} + c^{-} + b^{-} - 5t^{-} \\ \end{matrix}
 
-.. math:: \begin{matrix} V_{15} & = & u^{-} + d^{-} + s^{-} - 3c^{-} \\ \end{matrix}
+.. math::
 
-.. math:: \begin{matrix} V_{24} & = & u^{-} + d^{-} + s^{-} + c^{-} - 4b^{-} \\ \end{matrix}
+    \begin{matrix} T_{3} & = & u^{+} - d^{+} \\ \end{matrix}
 
-.. math:: \begin{matrix} V_{35} & = & u^{-} + d^{-} + s^{-} + c^{-} + b^{-} - 5t^{-} \\ \end{matrix}
+.. math::
 
-.. math:: \begin{matrix} T_{3} & = & u^{+} - d^{+} \\ \end{matrix}
+    \begin{matrix} T_{8} & = & u^{+} + d^{+} - 2s^{+} \\ \end{matrix}
 
-.. math:: \begin{matrix} T_{8} & = & u^{+} + d^{+} - 2s^{+} \\ \end{matrix}
+.. math::
 
-.. math:: \begin{matrix} T_{15} & = & u^{+} + d^{+} + s^{+} - 3c^{+} \\ \end{matrix}
+    \begin{matrix} T_{15} & = & u^{+} + d^{+} + s^{+} - 3c^{+} \\ \end{matrix}
 
-.. math:: \begin{matrix} T_{24} & = & u^{+} + d^{+} + s^{+} + c^{+} - 4b^{+} \\ \end{matrix}
+.. math::
 
-.. math:: \begin{matrix} T_{35} & = & u^{+} + d^{+} + s^{+} + c^{+} + b^{+} - 5t^{+} \\ \end{matrix}
+    \begin{matrix} T_{24} & = & u^{+} + d^{+} + s^{+} + c^{+} - 4b^{+} \\ \end{matrix}
+
+.. math::
+
+    \begin{matrix} T_{35} & = & u^{+} + d^{+} + s^{+} + c^{+} + b^{+} - 5t^{+} \\ \end{matrix}
 
 where :math:`q_{i}^{\pm} = q_{i} \pm {\overline{q}}_{i}`, and
 :math:`u,d,s,c,b,t` are the various flavour distributions.
@@ -177,14 +229,18 @@ that all scales are the same, in particular that the renormalization
 :math:`\mu_{R}^{2}` and factorization scales :math:`\mu_{F}^{2}` are the
 same that the hard scale of the problem :math:`\mu^{2}`,
 
-.. math:: \mu_{R}^{2} = \mu_{F}^{2} = \mu^{2}\ .
+.. math::
+
+    \mu_{R}^{2} = \mu_{F}^{2} = \mu^{2}\ .
 
 However, if this is not the case, Eq. `[eq:pmlo] <#eq:pmlo>`__ has to be
 modified as follows:
 
 -  Singlet case : up to NNLO one has
 
-.. math:: \mathbf{P}(x, \alpha_s(\mu^2_R), L_R) = \alpha_s(\mu^2_R)\mathbf{P}^{(0)}(x) + \alpha^2_s(\mu^2_R)[\mathbf{P}^{(1)}(x) - \beta_0L_R\mathbf{P}^{(0)}(x)] +\alpha^3_s(\mu^2_R)[\mathbf{P}^{(2)}(x) - 2\beta_0L_R\mathbf{P}^{(1)}(x) - (\beta_1L_R - \beta^2_0L^2_R)\mathbf{P}^{(0)}(x)]
+.. math::
+
+    \mathbf{P}(x, \alpha_s(\mu^2_R), L_R) = \alpha_s(\mu^2_R)\mathbf{P}^{(0)}(x) + \alpha^2_s(\mu^2_R)[\mathbf{P}^{(1)}(x) - \beta_0L_R\mathbf{P}^{(0)}(x)] +\alpha^3_s(\mu^2_R)[\mathbf{P}^{(2)}(x) - 2\beta_0L_R\mathbf{P}^{(1)}(x) - (\beta_1L_R - \beta^2_0L^2_R)\mathbf{P}^{(0)}(x)]
 
 -  with :math:`\mathbf{P}^{(k)}` the matrix of singlet splitting functions (in
    the :math:`\mu_{R}^{2} = \mu_{F}^{2} = \mu^{2}` case ) as defined in
@@ -195,7 +251,9 @@ modified as follows:
 -  Non-singlet case . In analogy with the singlet case, up to NNLO one
    has
 
-.. math:: P^{\pm, v}_{NS}(x, \alpha_s(\mu^2_R), L_R) = \alpha_s(\mu^2_R)P^{\pm, v(0)}_{NS}(x) + \alpha^2_s(\mu^2_R)[P^{\pm, v(1)}_{NS}(x) - \beta_0L_RP^{\pm, v(0)}_{NS}(x)] + \alpha^3_s(\mu^2_R)[P^{\pm, v(2)}_{NS}(x) - 2\beta_0L_RP^{\pm, v(1)}_{NS}(x) - (\beta_1L_R - \beta^2_0L^2_R)P^{\pm,v(0)}_{NS}(x)] 
+.. math::
+
+    P^{\pm, v}_{NS}(x, \alpha_s(\mu^2_R), L_R) = \alpha_s(\mu^2_R)P^{\pm, v(0)}_{NS}(x) + \alpha^2_s(\mu^2_R)[P^{\pm, v(1)}_{NS}(x) - \beta_0L_RP^{\pm, v(0)}_{NS}(x)] + \alpha^3_s(\mu^2_R)[P^{\pm, v(2)}_{NS}(x) - 2\beta_0L_RP^{\pm, v(1)}_{NS}(x) - (\beta_1L_R - \beta^2_0L^2_R)P^{\pm,v(0)}_{NS}(x)] 
 	  
 -  with the same conventions as in the singlet case and where the
    various combinations of non-singlet quark densities and associated
@@ -203,7 +261,9 @@ modified as follows:
    `[eq:nonsinglet] <#eq:nonsinglet>`__. Note that at NLO one has some
    simplifications:
 
-.. math:: P^{\pm, v}_{NS}(x, \alpha_s(\mu^2_R), L_R) = \alpha_s(\mu^2_R)P^{(0)}_{NS}(x) + \alpha^2_s(\mu^2_R)[P^{\pm(1)}_{NS}(x) - \beta_0L_RP^{(0)}_{NS}(x)]
+.. math::
+
+    P^{\pm, v}_{NS}(x, \alpha_s(\mu^2_R), L_R) = \alpha_s(\mu^2_R)P^{(0)}_{NS}(x) + \alpha^2_s(\mu^2_R)[P^{\pm(1)}_{NS}(x) - \beta_0L_RP^{(0)}_{NS}(x)]
 
 The DGLAP evolution equations with variations of the renormalization
 scale can be benchmarked againts the usual LH tables.
@@ -216,7 +276,9 @@ following we write the expressions of the NLO coefficient functions
 explicitly the dependence on the factorization and renormalization
 scales, :math:`\mu_{r}^{2}` and :math:`\mu_{f}^{2}`.
 
-.. math:: C_{a}^{\pm}(N,\alpha_{s}(\mu_{f}^{2}),Q^{2}/\mu_{r}^{2},\mu_{f}^{2}/\mu_{r}^{2}) = 1 + a_{s}(\mu_{r}^{2})\left\lbrack c_{a,NS}^{(1)}(N) + \gamma_{NS}^{(0)}(N)\log\left( \frac{Q^{2}}{\mu_{f}^{2}} \right) \right\rbrack + \mathcal{O}(a_{s}^{2})
+.. math::
+
+    C_{a}^{\pm}(N,\alpha_{s}(\mu_{f}^{2}),Q^{2}/\mu_{r}^{2},\mu_{f}^{2}/\mu_{r}^{2}) = 1 + a_{s}(\mu_{r}^{2})\left\lbrack c_{a,NS}^{(1)}(N) + \gamma_{NS}^{(0)}(N)\log\left( \frac{Q^{2}}{\mu_{f}^{2}} \right) \right\rbrack + \mathcal{O}(a_{s}^{2})
 
 .. math::
 
@@ -228,19 +290,33 @@ scales, :math:`\mu_{r}^{2}` and :math:`\mu_{f}^{2}`.
 we can write down the explicit expression for all the NLo coefficient
 functions:
 
-.. math:: C_2^{NS}(N,a_s(\mu_r^2),Q^2/\mu_f^2) = 1 + a_s(\mu_r^2)\cdot C_F\bigg[2S_1(N)^2 - 2 S_2(N) + 3S_1(N) - 2\frac{S_1(N)}{N(N+1)}+\frac{3}{N}+\frac{4}{N+1}+\frac{2}{N^2}-9 +\log(\frac{Q^2}{\mu_f^2})(3 - 4 S_1(N) +\frac{2}{N(N+1)}\bigg]
+.. math::
+
+    C_2^{NS}(N,a_s(\mu_r^2),Q^2/\mu_f^2) = 1 + a_s(\mu_r^2)\cdot C_F\bigg[2S_1(N)^2 - 2 S_2(N) + 3S_1(N) - 2\frac{S_1(N)}{N(N+1)}+\frac{3}{N}+\frac{4}{N+1}+\frac{2}{N^2}-9 +\log(\frac{Q^2}{\mu_f^2})(3 - 4 S_1(N) +\frac{2}{N(N+1)}\bigg]
 	  
-.. math:: C_2^q(N,a_s(\mu_r^2),Q^2/\mu_f^2) = C_2^{NS}(N,a_s(\mu_r^2),Q^2/\mu_f^2)
+.. math::
 
-.. math:: C_2^g(N,a_s(\mu_r^2),Q^2/\mu_f^2) = a_s(\mu_r^2)\cdot 4n_fT_R\bigg[\frac{4}{N+1} - \frac{4}{N+2} - (1+S_1(N))\cdot \frac{N^2+N+2}{N(N+1)(N+2)}+\frac{1}{N_1} +\log(\frac{Q^2}{\mu_f^2})\frac{N^2+N+2}{N(N+1)(N+2)}\bigg]
+    C_2^q(N,a_s(\mu_r^2),Q^2/\mu_f^2) = C_2^{NS}(N,a_s(\mu_r^2),Q^2/\mu_f^2)
 
-.. math:: C_L^{NS}(N,a_s(\mu_r^2)) = a_s(\mu_r^2)\cdot C_F \frac{4}{N+1}
+.. math::
 
-.. math:: C_L^q(N,a_s(\mu_r^2)) = C_L^{NS}(N,a_s(\mu_r^2))
+    C_2^g(N,a_s(\mu_r^2),Q^2/\mu_f^2) = a_s(\mu_r^2)\cdot 4n_fT_R\bigg[\frac{4}{N+1} - \frac{4}{N+2} - (1+S_1(N))\cdot \frac{N^2+N+2}{N(N+1)(N+2)}+\frac{1}{N_1} +\log(\frac{Q^2}{\mu_f^2})\frac{N^2+N+2}{N(N+1)(N+2)}\bigg]
 
-.. math:: C_L^g(N,a_s(\mu_r^2)) = a_s(\mu_r^2)\cdot 4n_fT_R \frac{4}{(N+1)(N+2)}
+.. math::
 
-.. math:: C_3^{NS}(N,a_s(\mu_r^2),Q^2/\mu_f^2) = 1 + a_s(\mu_r^2)\cdot C_F\bigg[2S_1(N)^2 - 2 S_2(N) + 3S_1(N)- 2\frac{S_1(N)}{N(N+1)} +\frac{3}{N}+\frac{4}{N+1} +\frac{2}{N^2}-9 -\frac{4N+2}{N(N+1)} +\log(\frac{Q^2}{\mu_f^2})(3 - 4 S_1(N) +\frac{2}{N(N+1)})\bigg]
+    C_L^{NS}(N,a_s(\mu_r^2)) = a_s(\mu_r^2)\cdot C_F \frac{4}{N+1}
+
+.. math::
+
+    C_L^q(N,a_s(\mu_r^2)) = C_L^{NS}(N,a_s(\mu_r^2))
+
+.. math::
+
+    C_L^g(N,a_s(\mu_r^2)) = a_s(\mu_r^2)\cdot 4n_fT_R \frac{4}{(N+1)(N+2)}
+
+.. math::
+
+    C_3^{NS}(N,a_s(\mu_r^2),Q^2/\mu_f^2) = 1 + a_s(\mu_r^2)\cdot C_F\bigg[2S_1(N)^2 - 2 S_2(N) + 3S_1(N)- 2\frac{S_1(N)}{N(N+1)} +\frac{3}{N}+\frac{4}{N+1} +\frac{2}{N^2}-9 -\frac{4N+2}{N(N+1)} +\log(\frac{Q^2}{\mu_f^2})(3 - 4 S_1(N) +\frac{2}{N(N+1)})\bigg]
 
 * **Implementation of the heavy quarks**
 
@@ -256,7 +332,9 @@ code.
    scale :math:`Q^{2} > m_{c}^{2}` according to the NS evolution
    equation:
 
-.. math:: T_{15}(Q^{2},x) = \Gamma_{NS}^{+}(Q_{0}^{2},Q^{2},x) \otimes T_{15}(Q_{0}^{2},x).
+.. math::
+
+    T_{15}(Q^{2},x) = \Gamma_{NS}^{+}(Q_{0}^{2},Q^{2},x) \otimes T_{15}(Q_{0}^{2},x).
 
 -  Instead the :math:`T_{24}` parton distribution defined in Eq. (15)
    coincides with the Singlet distribution up to the bottom threshold,
@@ -320,7 +398,9 @@ code.
    scale to any final scale :math:`Q^{2} > m_{c}^{2}` according to the
    NS minus evolution equation:
 
-.. math:: V_{15}(Q^{2},x) = \Gamma_{NS}^{-}(Q_{0}^{2},Q^{2},x) \otimes V_{15}(Q_{0}^{2},x).
+.. math::
+
+    V_{15}(Q^{2},x) = \Gamma_{NS}^{-}(Q_{0}^{2},Q^{2},x) \otimes V_{15}(Q_{0}^{2},x).
 
 -  Instead the :math:`V_{24}` parton distribution defined in Eq. (15)
    coincides with the total valence distribution :math:`V` up to the
@@ -342,7 +422,9 @@ code.
    well as the :math:`\Gamma_{NS}^{q,24}` and :math:`\Gamma_{NS}^{g,24}`
    kernels, a :math:`\Gamma_{NS}^{- ,24}` kernel as:
 
-.. math:: \Gamma_{NS}^{- ,24}(Q_{0}^{2},Q^{2},N) = \Gamma_{NS}^{-}(m_{b}^{2},Q^{2},N)\Gamma_{NS}^{v}(Q_{0}^{2},m_{b}^{2},N)
+.. math::
+
+    \Gamma_{NS}^{- ,24}(Q_{0}^{2},Q^{2},N) = \Gamma_{NS}^{-}(m_{b}^{2},Q^{2},N)\Gamma_{NS}^{v}(Q_{0}^{2},m_{b}^{2},N)
 
 -  In the same way we can write explicitely the evolution of the
    :math:`V_{35}` parton distribution function up to a scale
@@ -358,7 +440,9 @@ code.
 
 -  In our code we must define :math:`\Gamma_{NS}^{- ,35}` as
 
-.. math:: \Gamma_{NS}^{- ,35}(Q_{0}^{2},Q^{2},N) = \Gamma_{NS}^{-}(m_{t}^{2},Q^{2},N)\Gamma_{NS}^{v}(m_{t}^{2},Q^{2},N)
+.. math::
+
+    \Gamma_{NS}^{- ,35}(Q_{0}^{2},Q^{2},N) = \Gamma_{NS}^{-}(m_{t}^{2},Q^{2},N)\Gamma_{NS}^{v}(Q_{0}^{2},m_{t}^{2},N)
 
 -  Case II: general case :math:`Q_{0}^{2} < m_{c}^{2}`
 
@@ -486,7 +570,9 @@ code.
 
 -  In our code we must define :math:`\Gamma_{NS}^{- ,15}` as:
 
-.. math:: \Gamma_{NS}^{- ,15}(Q_{0}^{2},Q^{2},N) = \Gamma_{NS}^{-}(m_{c}^{2},Q^{2},N)\Gamma_{NS}^{v}(Q_{0}^{2},m_{c}^{2},N)
+.. math::
+
+    \Gamma_{NS}^{- ,15}(Q_{0}^{2},Q^{2},N) = \Gamma_{NS}^{-}(m_{c}^{2},Q^{2},N)\Gamma_{NS}^{v}(Q_{0}^{2},m_{c}^{2},N)
 
 -  In the same way, if :math:`Q^{2} > m_{b}^{2}` the :math:`V_{24}`
    parton distribution coincides with the Total valence distribution up
@@ -507,7 +593,9 @@ code.
    can be easily used for a NNLO evolution code we should define a
    :math:`\Gamma_{NS}^{- ,24}` kernel as:
 
-.. math:: \Gamma_{NS}^{- ,24}(Q_{0}^{2},Q^{2},N) = \Gamma_{NS}^{-}(m_{b}^{2},Q^{2},N)\Gamma_{NS}^{v}(Q_{0}^{2},m_{b}^{2},N)
+.. math::
+
+    \Gamma_{NS}^{- ,24}(Q_{0}^{2},Q^{2},N) = \Gamma_{NS}^{-}(m_{b}^{2},Q^{2},N)\Gamma_{NS}^{v}(Q_{0}^{2},m_{b}^{2},N)
 
 -  In the same way we can write explicitely the evolution of the
    :math:`V_{35}` parton distribution function up to a scale
@@ -524,7 +612,9 @@ code.
 -  Correspondingly, in our code we should define
    :math:`\Gamma_{NS}^{- ,35}` as
 
-.. math:: \Gamma_{NS}^{- ,35}(Q_{0}^{2},Q^{2},N) = \Gamma_{NS}^{-}(m_{t}^{2},Q^{2},N)\Gamma_{NS}^{v}(m_{t}^{2},Q^{2},N)
+.. math::
+
+    \Gamma_{NS}^{- ,35}(Q_{0}^{2},Q^{2},N) = \Gamma_{NS}^{-}(m_{t}^{2},Q^{2},N)\Gamma_{NS}^{v}(Q_{0}^{2},m_{t}^{2},N)
 
 N space solutions to the evolution equations (Ref. )
 ----------------------------------------------------
@@ -541,76 +631,112 @@ N space solutions to the evolution equations (Ref. )
    rewrite the DGLAP evolution equation for the quark-singlet and gluon
    distributions, in Mellin-\ :math:`N` space, as.
 
-   .. math:: a_s\frac{\partial}{\partial a_s} \binom{\Sigma}{g}(N, a_s) = -\mathbf{R} \cdot \binom{\Sigma}{g}(N, a_s),
+   .. math::
+
+    a_s\frac{\partial}{\partial a_s} \binom{\Sigma}{g}(N, a_s) = -\mathbf{R} \cdot \binom{\Sigma}{g}(N, a_s),
 
 -  where the matrix **R** has the following perturbative expansion
 
-.. math:: \mathbf{R} = \mathbf{R}_0+a_s\mathbf{R}_1+a_s\mathbf{R}_2 + \dots
+.. math::
+
+    \mathbf{R} = \mathbf{R}_0+a_s\mathbf{R}_1+a_s^2\mathbf{R}_2 + \dots
 
 -  with
 
-.. math:: \mathbf{R}_0 \equiv \frac{\boldsymbol{\gamma}^{(0)}}{\beta_0}
+.. math::
+
+    \mathbf{R}_0 \equiv \frac{\boldsymbol{\gamma}^{(0)}}{\beta_0}
+
+.. math::
 
-.. math:: \mathbf{R}_k \equiv \frac{\boldsymbol{\gamma}^{(k)}}{\beta_0} - \sum_{i=1}^k \frac{\beta_i}{\beta_0}R_{k-i}
+    \mathbf{R}_k \equiv \frac{\boldsymbol{\gamma}^{(k)}}{\beta_0} - \sum_{i=1}^k \frac{\beta_i}{\beta_0}\mathbf{R}_{k-i}
 
 -  where the :math:`\mathbf{\gamma}` stands for the matrix of anomalous
    dimensions.
 
    The solution of the singlet evolution equation at leading order is:
 
-.. math:: \mathbf{q}_{LO}(x,Q^2) = \mathbf{L}(a_s,a_0,N)\mathbf{q}_{LO}(x,Q_0^2).
+.. math::
+
+    \mathbf{q}_{LO}(x,Q^2) = \mathbf{L}(a_s,a_0,N)\mathbf{q}_{LO}(x,Q_0^2).
 
 -  The leading order evolution operator :math:`\mathbf{L}` is written, in terms
    of the eigenvalues of the leading order anomalous dimension matrix
 
-.. math:: \lambda_{\pm} = \frac{1}{2\beta_{0}}\left\lbrack \gamma_{qq}^{0} + \gamma_{gg}^{0} \pm \sqrt{\left( \gamma_{qq}^{0} - \gamma_{gg}^{0} \right)^{2} + 4\gamma_{qg}^{0}\gamma_{gq}^{0}} \right\rbrack
+.. math::
+
+    \lambda_{\pm} = \frac{1}{2\beta_{0}}\left\lbrack \gamma_{qq}^{0} + \gamma_{gg}^{0} \pm \sqrt{\left( \gamma_{qq}^{0} - \gamma_{gg}^{0} \right)^{2} + 4\gamma_{qg}^{0}\gamma_{gq}^{0}} \right\rbrack
 
 -  and the corresponding projector matrices
 
-.. math:: \mathbf{e}_\pm=\frac{\pm 1}{\lambda_+ - \lambda_-}(R^{(0)}-\lambda_\mp\mathbb{I}),
+.. math::
+
+    \mathbf{e}_\pm=\frac{\pm 1}{\lambda_+ - \lambda_-}(\mathbf{R}_0-\lambda_\mp\mathbb{I}),
 
 -  in the following form
 
-.. math:: \mathbf{L}(a_s,a_0,N)= \mathbf{e}_-(\frac{a_s}{a_0})^{-\lambda_{-(N)}} + \mathbf{e}_+(\frac{a_s}{a_0})^{-\lambda_{+(N)}}.
+.. math::
+
+    \mathbf{L}(a_s,a_0,N)= \mathbf{e}_-(\frac{a_s}{a_0})^{-\lambda_{-}(N)} + \mathbf{e}_+(\frac{a_s}{a_0})^{-\lambda_{+}(N)}.
 
 -  We express the solution of the evolution equation
    `[eq:stdevol] <#eq:stdevol>`__ as a perturbative expansion around the
    LO solution :math:`\mathbf{L}(a_s,a_0,N)`
 
-.. math:: \binom{\Sigma}{g}(N,a_s) = \bigg[\mathbb{I}+\sum_{k=1}^{\infty}a_s^kU_k(N)\bigg] \mathbf{L}(a_s,a_0,N)\bigg[\mathbb{I}+\sum_{k=1}^{\infty}a_0^kU_k(N)\bigg]^{-1}\binom{\Sigma}{g}(N,a_0)\equiv \mathbf{\Gamma}_S(N,a_s,a_0)\binom{\Sigma}{g}(N,a_0)
+.. math::
+
+    \binom{\Sigma}{g}(N,a_s) = \bigg[\mathbb{I}+\sum_{k=1}^{\infty}a_s^kU_k(N)\bigg] \mathbf{L}(a_s,a_0,N)\bigg[\mathbb{I}+\sum_{k=1}^{\infty}a_0^kU_k(N)\bigg]^{-1}\binom{\Sigma}{g}(N,a_0)\equiv \mathbf{\Gamma}_S(N,a_s,a_0)\binom{\Sigma}{g}(N,a_0)
 
 -  The *fully truncated*\  [1]_ expression of the matrix evolution
    kernel up to NNLO reads
 
-.. math:: \mathbf{\Gamma}_S(N) = \big[\mathbf{L} + a_s\mathbf{U}_1\mathbf{L} - a_0\mathbf{LU}_1 + a_s^2 \mathbf{U}_2\mathbf{L} - a_sa_0 \mathbf{U}_1\mathbf{LU}_1 + a_0^2\mathbf{L}(\mathbf{U}_1^2 - \mathbf{U}_2)\big].
+.. math::
+
+    \mathbf{\Gamma}_S(N) = \big[\mathbf{L} + a_s\mathbf{U}_1\mathbf{L} - a_0\mathbf{LU}_1 + a_s^2 \mathbf{U}_2\mathbf{L} - a_sa_0 \mathbf{U}_1\mathbf{LU}_1 + a_0^2\mathbf{L}(\mathbf{U}_1^2 - \mathbf{U}_2)\big].
 
 -  The :math:`U` matrices introduced in the previous equation are
    defined by this commutation relations
 
-.. math:: \big[ \mathbf{U}_1, \mathbf{R}_0 \big] =  \mathbf{R}_1 + \mathbf{R}_1
+.. math::
+
+    \big[ \mathbf{U}_1, \mathbf{R}_0 \big] =  \mathbf{R}_1 + \mathbf{U}_1
+
+.. math::
+
+    \big[ \mathbf{U}_2, \mathbf{R}_0 \big]  =  \mathbf{R}_2 +\mathbf{R}_1 \mathbf{U}_1 + 2 \mathbf{U}_2
+
+.. math::
 
-.. math:: \big[ \mathbf{R}_2, \mathbf{R}_0 \big]  =  \mathbf{R}_2 +\mathbf{R}_1 \mathbf{U}_1 + 2 \mathbf{U}_2
+    \vdots
 
-.. math:: \vdots
+.. math::
 
-.. math:: \big[ \mathbf{U}_k, \mathbf{R}_0 \big] =  \mathbf{R}_k + \sum_{i=1}^{k-1} \mathbf{R}_{k-i} \mathbf{U}_i + k \mathbf{U}_k \equiv\ \widetilde{\mathbf{R}}_k + k \mathbf{U}_k.
+    \big[ \mathbf{U}_k, \mathbf{R}_0 \big] =  \mathbf{R}_k + \sum_{i=1}^{k-1} \mathbf{R}_{k-i} \mathbf{U}_i + k \mathbf{U}_k \equiv\ \widetilde{\mathbf{R}}_k + k \mathbf{U}_k.
 
 -  as
 
-.. math:: \mathbf{U}_k=-\frac{1}{k}[e_+\widetilde{\mathbf{R}}_ke_+ + e_-\widetilde{\mathbf{R}}_ke_-] + \frac{e_+ \widetilde{\mathbf{R}}_k e_-}{\lambda_- -\lambda_+ - k} + \frac{e_-\widetilde{\mathbf{R}}_ke_+}{\lambda_+ -\lambda_- - k}
+.. math::
+
+    \mathbf{U}_k=-\frac{1}{k}[e_+\widetilde{\mathbf{R}}_ke_+ + e_-\widetilde{\mathbf{R}}_ke_-] + \frac{e_+ \widetilde{\mathbf{R}}_k e_-}{\lambda_- -\lambda_+ - k} + \frac{e_-\widetilde{\mathbf{R}}_ke_+}{\lambda_+ -\lambda_- - k}
 
 -  where
 
-.. math:: \widetilde{\mathbf{R}}_k = \mathbf{R}_k+\sum_{i=1}^{k-1}\mathbf{R}_{k-i}\mathbf{U}_i.
+.. math::
+
+    \widetilde{\mathbf{R}}_k = \mathbf{R}_k+\sum_{i=1}^{k-1}\mathbf{R}_{k-i}\mathbf{U}_i.
 
 -  By solving recursively equations
    `[eq:ukexplicit] <#eq:ukexplicit>`__,
    `[eq:rtwiddle] <#eq:rtwiddle>`__ and the NLO approximation of
    eq.\ `[eq:r] <#eq:r>`__:
 
-.. math:: \mathbf{R}_0 \equiv \frac{\boldsymbol{\gamma}^{(0)}}{\beta_0}
+.. math::
+
+    \mathbf{R}_0 \equiv \frac{\boldsymbol{\gamma}^{(0)}}{\beta_0}
+
+.. math::
 
-.. math:: \mathbf{R}_k\equiv - b_1 \mathbf{R}_{k-1} + \mathcal{O}(\textrm{NNLO})
+    \mathbf{R}_k\equiv - b_1 \mathbf{R}_{k-1} + \mathcal{O}(\textrm{NNLO})
 
 -  the NLO full solution (corresponding to IMODEV=1 in ref.) can be
    easily implemented into the code. Practically the sum in
@@ -633,12 +759,16 @@ N space solutions to the evolution equations (Ref. )
    allows us to wrote down explicitly for $U_k^{\,\rm ns}$. At LO the
    solution simply reads as:
 
-.. math:: \Gamma_{NS,LO}^{\pm,v}(N,a_s,a_0)= (\frac{a_s}{a_0})^{-R_0^{ns}}
+.. math::
+
+    \Gamma_{NS,LO}^{\pm,v}(N,a_s,a_0)= (\frac{a_s}{a_0})^{-R_0^{ns}}
 
 -  Both iterated and truncated non-singlet solutions can be written down
    in a compact closed form at NLO as well. Iterated solution:
 
-.. math:: \Gamma^{\pm,v}_{NS,NLO}(N,a_s,a_0) =\exp\bigg{\frac{U^{\pm,v}_1} {b_1}\ln(\frac{1+b_1a_s}{1+b_1 a_0})\bigg}(\frac{a_s}{a_0})^{-R_0^{ns}}.
+.. math::
+
+    \Gamma^{\pm,v}_{NS,NLO}(N,a_s,a_0) =\exp\bigg\{\frac{U^{\pm,v}_1} {b_1}\ln(\frac{1+b_1a_s}{1+b_1 a_0})\bigg\}(\frac{a_s}{a_0})^{-R_0^{ns}}.
 
 -  Truncated solution:
 
@@ -671,7 +801,9 @@ The evolution kernels :math:`\Gamma(x)` are defined as the inverse
 Mellin transforms of the evolution factors introduced in eqs.
 (`[eq:solutionexpand] <#eq:solutionexpand>`__)
 
-.. math:: \Gamma_{S}(x,a_{s},a_{0}) = \int_{c - i\infty}^{c_{+}i\infty}\frac{dN}{2\pi i}x^{- N}\Gamma_{S}(N,a_{s},a_{0})
+.. math::
+
+    \Gamma_{S}(x,a_{s},a_{0}) = \int_{c - i\infty}^{c + i\infty}\frac{dN}{2\pi i}x^{- N}\Gamma_{S}(N,a_{s},a_{0})
 
 Note however that all splitting functions, except the off-diagonal
 entries of the singlet matrix, diverge when :math:`x = 1`, this implies
@@ -683,11 +815,15 @@ can be defined as distributions. To this purpose consider the generic
 evolution factor :math:`\Gamma` such that (omitting the explicit
 dependence of :math:`\Gamma` on the coupling :math:`a_{s}`)
 
-.. math:: f(x,Q^{2}) = \int_{x}^{1}\frac{dy}{y}\Gamma(y)f\left( \frac{x}{y},Q_{0}^{2} \right)\,.
+.. math::
+
+    f(x,Q^{2}) = \int_{x}^{1}\frac{dy}{y}\Gamma(y)f\left( \frac{x}{y},Q_{0}^{2} \right)\,.
 
 Defining the distribution
 
-.. math:: \Gamma_{+}(x) = \Gamma(x) - \gamma\delta(1 - x)\,,\text{\quad\quad}where\quad\gamma = \int_{0}^{1}dx\Gamma(x)\,.
+.. math::
+
+    \Gamma_{+}(x) = \Gamma(x) - \gamma\delta(1 - x)\,,\quad\quad\text{where}\quad\gamma = \int_{0}^{1}dx\Gamma(x)\,.
 
 Equation (`[eq:gengamma] <#eq:gengamma>`__) can then be rewritten as
 
@@ -705,14 +841,18 @@ the parton distribution functions in :math:`x` space, determining
 :math:`\Gamma` numerically from eq.\ `[eq:xkernels] <#eq:xkernels>`__
 and :math:`\gamma` as
 
-.. math:: \gamma = \int_{0}^{1}dx\int_{c - i\infty}^{c + i\infty}\frac{dN}{2\pi i}x^{- N}\Gamma(N) = \int_{c - i\infty}^{c + i\infty}\frac{dN}{2\pi i}\frac{\Gamma(N)}{1 - N}\,.
+.. math::
+
+    \gamma = \int_{0}^{1}dx\int_{c - i\infty}^{c + i\infty}\frac{dN}{2\pi i}x^{- N}\Gamma(N) = \int_{c - i\infty}^{c + i\infty}\frac{dN}{2\pi i}\frac{\Gamma(N)}{1 - N}\,.
 
 In this singlet case, however this prescription has been slightly
 modified because :math:`\Gamma(N)|_{N = 1}` is indeed infinite. So
 eq.\ `[eq:genexp] <#eq:genexp>`__ is rewritten in another equivalent
 form. Let us define
 
-.. math:: f^{(1)}(x,Q^{2}) = x\, f(x,Q^{2})\text{\quad\quad}\Gamma^{(1)}(x,Q_{0}^{2},Q^{2}) = x\Gamma(x,Q_{0}^{2},Q^{2}).
+.. math::
+
+    f^{(1)}(x,Q^{2}) = x\, f(x,Q^{2})\quad\quad\Gamma^{(1)}(x,Q_{0}^{2},Q^{2}) = x\Gamma(x,Q_{0}^{2},Q^{2}).
 
 Thus
 
@@ -737,7 +877,9 @@ From Eq. (4.19) of Ref. , if we identify :math:`F` with
 equation in the limit of zero target mass, we obtain the expression of
 the NLT correction to the structure function :math:`F_{2}:`
 
-.. math:: F_{2}^{NLT}(x,Q^{2}) = \frac{x^{2}}{\tau^{3/2}}\frac{F_{2}^{LT}(\xi,Q^{2})}{\xi^{2}} + 6\frac{M^{2}}{Q^{2}}\frac{x^{3}}{\tau^{2}}I_{2}(\xi,Q^{2})
+.. math::
+
+    F_{2}^{NLT}(x,Q^{2}) = \frac{x^{2}}{\tau^{3/2}}\frac{F_{2}^{LT}(\xi,Q^{2})}{\xi^{2}} + 6\frac{M^{2}}{Q^{2}}\frac{x^{3}}{\tau^{2}}I_{2}(\xi,Q^{2})
 
 where
 
@@ -753,7 +895,9 @@ Now let us Mellin transform and antitransform
 :math:`F_{2}^{LT}(\xi,Q^{2})` and :math:`I_{2}(\xi,Q^{2})` with respect
 to the variable :math:`\xi`:
 
-.. math:: F_{2}^{LT}(\xi,Q^{2}) = \int\frac{dN}{2\pi i}\,\xi^{- N}\Gamma(N,Q_{0}^{2},Q^{2})\, f\left( N,Q_{0}^{2} \right)
+.. math::
+
+    F_{2}^{LT}(\xi,Q^{2}) = \int\frac{dN}{2\pi i}\,\xi^{- N}\Gamma(N,Q_{0}^{2},Q^{2})\, f\left( N,Q_{0}^{2} \right)
 
 while
 
@@ -784,7 +928,9 @@ Now we can reinterpret the factor in front of
 coefficient function, which can be written as a function of
 :math:`\tau`:
 
-.. math:: C_{2}^{TMC}(N,\alpha_{s}(Q^{2})) = \frac{(1 + \sqrt{\tau})^{2}}{4\tau^{3/2}}\left( 1 + \frac{3\left( 1 - 1/\sqrt{\tau} \right)}{N + 1} \right)C_{2}(N,\alpha_{s}(Q^{2})).
+.. math::
+
+    C_{2}^{TMC}(N,\alpha_{s}(Q^{2})) = \frac{(1 + \sqrt{\tau})^{2}}{4\tau^{3/2}}\left( 1 + \frac{3\left( 1 - 1/\sqrt{\tau} \right)}{N + 1} \right)C_{2}(N,\alpha_{s}(Q^{2})).
 
 Notice that into the limit
 :math:`M_{p}/Q \rightarrow 0,\,\tau \rightarrow 1`,
@@ -796,11 +942,15 @@ corrections to the :math:`F_{L}` and :math:`F_{3}` structure functions.
 
 Starting from formula (4.21b) of Ref. , being
 
-.. math:: \frac{\nu W_{2}}{M} = F_{2}\text{\quad\quad}W_{1} = F_{1}\text{\quad\quad}F_{L} = \frac{\nu W_{2}}{M} - 2xW_{1} = 2xW_{L} - \frac{4x^{2}M^{2}}{Q^{2}}\frac{\nu W_{2}}{M},
+.. math::
+
+    \frac{\nu W_{2}}{M} = F_{2}\quad\quad W_{1} = F_{1}\quad\quad F_{L} = \frac{\nu W_{2}}{M} - 2xW_{1} = 2xW_{L} - \frac{4x^{2}M^{2}}{Q^{2}}\frac{\nu W_{2}}{M},
 
 we find
 
-.. math:: F_{L}^{NLT}(x,Q^{2}) = F_{L}^{LT}(x,Q^{2}) + \frac{x^{2}(1 - \tau)}{\tau^{3/2}}\frac{F_{2}^{LT}(\xi,Q^{2})}{\xi^{2}} + \frac{M^{2}}{Q^{2}}\frac{x^{3}(6 - 2\tau)}{\tau^{2}}I_{2}(\xi,Q^{2})
+.. math::
+
+    F_{L}^{NLT}(x,Q^{2}) = F_{L}^{LT}(x,Q^{2}) + \frac{x^{2}(1 - \tau)}{\tau^{3/2}}\frac{F_{2}^{LT}(\xi,Q^{2})}{\xi^{2}} + \frac{M^{2}}{Q^{2}}\frac{x^{3}(6 - 2\tau)}{\tau^{2}}I_{2}(\xi,Q^{2})
 
 where :math:`I_{2}` is defined in Eq. \ `[eq:i2] <#eq:i2>`__. With the
 same calculations as in the :math:`F_{2}` case we obtain the following
@@ -830,11 +980,15 @@ Ref. , where :math:`F = 2F_{3}(y)/y` as we can see by comparing the left
 and right hand side members of the equation in the limit of
 :math:`M \rightarrow 0`:
 
-.. math:: F_{L}^{NLT}(x,Q^{2}) = \frac{x}{\tau}\frac{F_{3}^{LT}(\xi,Q^{2})}{\xi} + \frac{2M^{2}}{Q^{2}}\frac{x^{2}}{\tau^{3/2}}I_{3}(\xi,Q^{2})
+.. math::
+
+    F_{3}^{NLT}(x,Q^{2}) = \frac{x}{\tau}\frac{F_{3}^{LT}(\xi,Q^{2})}{\xi} + \frac{2M^{2}}{Q^{2}}\frac{x^{2}}{\tau^{3/2}}I_{3}(\xi,Q^{2})
 
 where
 
-.. math:: I_{3}(\xi,Q^{2}) = \int_{\xi}^{1}\, dz\,\frac{2F_{3}^{LT}(z,Q^{2})}{z}.
+.. math::
+
+    I_{3}(\xi,Q^{2}) = \int_{\xi}^{1}\, dz\,\frac{2F_{3}^{LT}(z,Q^{2})}{z}.
 
 With the same calculations as in the :math:`F_{2}` case and by noticing
 that
diff --git a/doc/sphinx/source/theory/index.rst b/doc/sphinx/source/theory/index.rst
index fa8da85280..87a70ba17f 100644
--- a/doc/sphinx/source/theory/index.rst
+++ b/doc/sphinx/source/theory/index.rst
@@ -15,3 +15,4 @@ this information using the NNPDF code itself is detailed.
    ./theoryparamsdefinitions
    ./theoryindex
    ./theoryparamsinfo
+   ./PTevol
diff --git a/doc/sphinx/source/theory/theoryindex.rst b/doc/sphinx/source/theory/theoryindex.rst
index 8de7984ebb..8e5d3372b5 100644
--- a/doc/sphinx/source/theory/theoryindex.rst
+++ b/doc/sphinx/source/theory/theoryindex.rst
@@ -5,7 +5,7 @@ Theory indexes
 
 In the table below you can see explicit content of the ``TheoryIndex`` table
 from the ``theory.db`` file. Note that not every theory listed in the database
-is available to be downloaded from the :ref:`NNPDF-server`. In particular,
+is available to be downloaded from the NNPDF :ref:`server`. In particular,
 theories that are outdated are not stored on the server, but their settings will
 remain in the database. To find a list of the theories that are available on the
 server, one can use the vp-list script (see :ref:`vp-list`) as so: :code:`vp-
diff --git a/doc/sphinx/source/tutorials/index.rst b/doc/sphinx/source/tutorials/index.rst
index 32a89d51dc..6e515bd0f4 100644
--- a/doc/sphinx/source/tutorials/index.rst
+++ b/doc/sphinx/source/tutorials/index.rst
@@ -13,6 +13,7 @@ Running fits
 
    ./run-fit.md
    ./run-iterated-fit.rst
+   ./run-qed-fit.rst
    ./general_th_covmat.rst
    ./thcov_tutorial.rst
 
diff --git a/pyproject.toml b/pyproject.toml
index fb8f155847..3cae1bbbe8 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -93,6 +93,7 @@ recommonmark = {version = "*", optional = true}
 sphinxcontrib-bibtex = {version = "*", optional = true}
 sphinx_rtd_theme = {version = "*", optional = true}
 sphinx = {version = "^5.0", optional = true}
+tabulate = {version = "*", optional = true}
 # qed
 fiatlux = {version = "*", optional = true}
 # without lhapdf
@@ -102,7 +103,7 @@ lhapdf-management = {version = "^0.5", optional = true}
 # Optional dependencies
 [tool.poetry.extras]
 tests = ["pytest", "pytest-mpl", "hypothesis"]
-docs = ["recommonmark", "sphinxcontrib", "sphinx-rtd-theme", "sphinx"]
+docs = ["recommonmark", "sphinxcontrib", "sphinx-rtd-theme", "sphinx", "tabulate"]
 qed = ["fiatlux"]
 nolha = ["pdfflow", "lhapdf-management"]
 
diff --git a/validphys2/src/validphys/datafiles/disp_theory.py b/validphys2/src/validphys/datafiles/disp_theory.py
index 78c0db26e5..31bdd181a6 100755
--- a/validphys2/src/validphys/datafiles/disp_theory.py
+++ b/validphys2/src/validphys/datafiles/disp_theory.py
@@ -1,19 +1,8 @@
 #!/usr/bin/env python
 
 import sqlite3 as lite
-import sys,os
 
 # Attempt to find tablulate
-import imp
-try:
-    imp.find_module('tabulate')
-    found = True
-except ImportError:
-    found = False
-
-# Install/import tabulate
-if found == False:
-    os.system("pip install tabulate --user")
 from tabulate import tabulate
 
 # sqlite con
@@ -33,7 +22,6 @@
     col_names = [cn[0] for cn in cur.description]
     col_sub = [col_names[0], col_names[33]]
 
-
     table = []
     rows = cur.fetchall()
     for row in rows:
@@ -44,7 +32,6 @@
 except lite.Error as e:
 
     print("Error %s:" % e.args[0])
-    sys.exit(1)
 
 finally:
 

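Reviewer note on the evolution-equation formulas reformatted in this patch: the LO singlet operator is built from the eigenvalues and projector matrices of R_0. The sketch below checks that construction numerically on a 2x2 matrix. It is only an illustration under stated assumptions: the function names `matmul` and `lo_operator` are hypothetical, the input matrix is taken to be R_0 (that is, already divided by beta_0), its eigenvalues are assumed real and distinct, and the numbers in the demo are arbitrary rather than actual anomalous dimensions.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def lo_operator(r0, ratio):
    """Return (L, e_minus, e_plus) for a 2x2 matrix r0 and ratio = a_s / a_0.

    Assumes r0 has two real, distinct eigenvalues.
    """
    (qq, qg), (gq, gg) = r0
    disc = math.sqrt((qq - gg) ** 2 + 4.0 * qg * gq)
    lam_p = 0.5 * (qq + gg + disc)
    lam_m = 0.5 * (qq + gg - disc)

    def projector(sign, lam_other):
        # e_pm = +-1 / (lam_+ - lam_-) * (R_0 - lam_mp * identity)
        return [[sign * (r0[i][j] - (lam_other if i == j else 0.0))
                 / (lam_p - lam_m) for j in range(2)] for i in range(2)]

    e_p = projector(+1.0, lam_m)
    e_m = projector(-1.0, lam_p)
    # L = e_- (a_s/a_0)^(-lam_-) + e_+ (a_s/a_0)^(-lam_+)
    lo = [[e_m[i][j] * ratio ** (-lam_m) + e_p[i][j] * ratio ** (-lam_p)
           for j in range(2)] for i in range(2)]
    return lo, e_m, e_p


# Arbitrary demo values, chosen only so that the eigenvalues are real.
demo_r0 = [[1.0, 0.5], [0.8, 2.0]]
demo_lo, demo_e_minus, demo_e_plus = lo_operator(demo_r0, 0.7)
```

Since the two projectors sum to the identity, the operator reduces to the identity when a_s = a_0, which is a convenient sanity check for any implementation.
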
From 2a0fedce4e771d002dbb4047adafa9631dc6a0fc Mon Sep 17 00:00:00 2001
From: "Juan M. Cruz-Martinez" <juacrumar@lairen.eu>
Date: Mon, 4 Mar 2024 13:52:30 +0100
Subject: [PATCH 4/4] Apply suggestions from code review

Co-authored-by: Felix Hekhorn <felixhekhorn@users.noreply.github.com>
---
 doc/sphinx/source/data/commondata.rst | 32 +++++++++++++--------------
 1 file changed, 16 insertions(+), 16 deletions(-)
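
Reviewer note: the ``data_uncertainties`` section touched by this patch says that several uncertainty files are concatenated. A minimal sketch of what that could look like is given here, assuming each file has already been parsed into a dict with the documented ``definitions`` and ``bins`` keys. The function name `concatenate_uncertainties` and the choice to reject duplicated uncertainty names are assumptions made for illustration, not the validphys implementation.

```python
def concatenate_uncertainties(files):
    """Merge parsed uncertainty files (dicts with 'definitions' and 'bins')."""
    definitions = {}
    bins = None
    for content in files:
        for name, definition in content["definitions"].items():
            if name in definitions:
                # Assumption: a name may be defined in one file only.
                raise ValueError(f"uncertainty {name!r} defined more than once")
            definitions[name] = definition
        if bins is None:
            bins = [dict(entry) for entry in content["bins"]]
        else:
            if len(bins) != len(content["bins"]):
                raise ValueError("all files must have the same number of bins")
            for merged, new in zip(bins, content["bins"]):
                merged.update(new)
    return {"definitions": definitions, "bins": bins or []}


# Two hypothetical files for a dataset with a single bin (absolute values).
first = {
    "definitions": {
        "stat": {"description": "statistical error", "treatment": "ADD"},
        "error_name": {"description": "an additive uncertainty", "treatment": "ADD"},
    },
    "bins": [{"stat": 1.0, "error_name": 2.0}],
}
second = {
    "definitions": {
        "error_name_2": {
            "description": "a multiplicative uncertainty",
            "treatment": "MULT",
        },
    },
    "bins": [{"error_name_2": 3.0}],
}
merged = concatenate_uncertainties([first, second])
```

Keeping the files separate in this way is what lets a variant swap one file while leaving the others untouched.
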

diff --git a/doc/sphinx/source/data/commondata.rst b/doc/sphinx/source/data/commondata.rst
index 0ea37f666d..d3b6a52253 100644
--- a/doc/sphinx/source/data/commondata.rst
+++ b/doc/sphinx/source/data/commondata.rst
@@ -42,7 +42,7 @@ The data downloaded or parsed from hepdata or other sources is kept in the
 ``<setname>/<rawdata>`` folder and it is not installed with the rest of the code.
 Each folder must contain a ``<setname>/metadata.yaml`` file which will define
 all datasets implemented within the folder and that will be described below.
-Only ``.yaml`` file are allowed to be installed together with the ``nnpdf`` code.
+Only ``.yaml`` files are allowed to be installed together with the ``nnpdf`` code.
 
 In order to keep backward compatibility and allow the reproducibility of the 4.0 family of fits
 a ``dataset_names.yml`` file keeps a mapping of the datasets that were used in 4.0.
@@ -99,7 +99,7 @@ The header of the ``metadata.yaml`` file contains information shared among diffe
 Setname
 ~~~~~~~
 
-Correspond to the name of the set and must be equal to the folder. It acts a s a sanity check.
+Corresponds to the name of the set and must match the name of the folder. It acts as a sanity check.
 
 Versioning
 ~~~~~~~~~~
@@ -206,7 +206,7 @@ All entries must be latex-compilable as they are used by various plotting routin
 ``kinematics::file``
 ~~~~~~~~~~~~~~~~~~~~
 A reference to a ``.yaml`` file containing all kinematic information.
-The file contain a list of ``ndata`` ``bins`` for which information about all variables
+The file contains a list of ``ndata`` ``bins`` for which information about all variables
 is included for all bins.
 When ``mid`` is not given, it will be automatically filled with the midpoint between min and max.
 Only ``mid`` is used for cuts, while ``min`` and ``max`` may be used for plotting routines.
@@ -255,7 +255,7 @@ list for all values for all bins.
 
 ``data_uncertainties``
 ~~~~~~~~~~~~~~~~~~~~~~
-A list of ``.yaml`` file containing the uncertainty information for the measurement.
+A list of ``.yaml`` files containing the uncertainty information for the measurement.
 When using more than one uncertainty file they will be concatenated. 
 This allows the user the flexibility of creating variants
 where only a subset of the uncertainties are modified.
@@ -266,27 +266,27 @@ and a second field ``bins`` which is a list of mappings with ``ndata`` entries
 with the named uncertainties.
 
 Note that, regardless of their treatment, uncertainties should always be written as absolute values
-and not relative to the data values. If the data should be updated, the uncertainties should be too.
+and not relative to the data values. If the data is updated, the uncertainties must be updated as well.
 
 ..  code-block:: yaml
 
     definitions:
         stat:
-            description:
-            treatment:
+            description: statistical error
+            treatment: ADD
             type:
         error_name:
-            description:
-            treatment:
+            description: an additive uncertainty
+            treatment: ADD
             type:
         error_name_2:
-            description:
-            treatment:
+            description: a multiplicative uncertainty
+            treatment: MULT
             type:
     bins:
-        - stat:
-          error_name:
-          error_name_2:
+        - stat: 1.0
+          error_name: 2.0
+          error_name_2: 3.0
 
 
 
@@ -329,7 +329,7 @@ The theory field defines how predictions for the dataset are to be computed.
 It includes two entries:
 
 - ``FK_tables``: this is a list of lists which defines the FK Tables to be loaded. The outermost list are the operands (in case an operation is needed to recover the observable, more on that below). The innermost list are the grids that are to be concatenated in order to form the operands.
-- ``operaton``: operation to be applied in order to compute the observable
+- ``operation``: operation to be applied in order to compute the observable. If no operation is needed it can be written as ``null`` or ``None``. ``validphys`` currently supports ``RATIO``, ``ASY``, ``ADD``, ``SMN``, ``COM``, ``SMT`` and ``NULL``.
 
 Example:
 
@@ -352,4 +352,4 @@ After that, the final observable will be computed by taking the ratio of the con
 The ``plotting`` section defines the plotting style inside ``validphys``
 and is described in detail in :ref:`plotting-format`.
 
-Note that name of the variables need to be the same in the plotting and kinematics.
+Note that the names of the variables need to be the same in the plotting and kinematics.
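
Reviewer note on the ``theory`` entry documented in this patch: the nested ``FK_tables`` list (outer list of operands, inner list of grids to concatenate) can be illustrated with a short sketch. It works on plain lists of per-grid predictions and covers only ``NULL``, ``ADD`` and ``RATIO``. The function name `compute_observable` is hypothetical, the element-wise meaning assumed for ``ADD`` is an assumption, and this is not the validphys implementation.

```python
def compute_observable(operand_predictions, operation="NULL"):
    """Combine per-grid predictions into the final observable.

    operand_predictions mirrors the FK_tables layout: a list of operands,
    each a list of grids, each a list of per-bin predictions.
    """
    # Concatenate the grids of every operand.
    operands = [[value for grid in operand for value in grid]
                for operand in operand_predictions]
    op = "NULL" if operation is None else str(operation).upper()
    if op == "NULL":
        if len(operands) != 1:
            raise ValueError("NULL expects exactly one operand")
        return operands[0]
    if op == "ADD":
        # Assumption: element-wise sum of all operands.
        return [sum(values) for values in zip(*operands)]
    if op == "RATIO":
        if len(operands) != 2:
            raise ValueError("RATIO expects exactly two operands")
        numerator, denominator = operands
        return [n / d for n, d in zip(numerator, denominator)]
    raise NotImplementedError(f"operation {op} is not covered by this sketch")


# Two operands, each made of two grids with made-up predictions.
ratio = compute_observable(
    [[[1.0, 2.0], [3.0]], [[2.0], [4.0, 6.0]]], operation="RATIO"
)
```

The remaining operations listed in the documentation are left out on purpose, since their definitions are not given in this patch.
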