GitHub - FPPDF/fppdf: Interface code for fixed parameterisation fit to NNPDF code

FPPDF
-----

This is the code for `FPPDF` (Fixed Parametrization PDF), an open-source program to perform parametric fits of Parton Distribution Functions (PDFs) using a fixed parametrization
(with an arbitrary functional form).
Uncertainties are estimated as Hessian uncertainties via eigenvector scans (although a replica-based approach can also be taken by repeating the central-value estimation for an ensemble of data replicas).

PDF fits are performed at a fixed scale, using the theory and data made available by the [NNPDF collaboration](https://github.com/nnpdf/nnpdf).
This code also uses the NNPDF framework to perform theory predictions and data-theory comparisons required during the optimization.
LHAPDF grids are generated by evolving the fixed-scale PDF with [eko](https://github.com/nnpdf/eko).

# Installation

Cloning this repository and installing with pip should install both this code and all necessary dependencies:

```
pip install .
```

If a Python environment is also required, we recommend creating it with `conda`:

```
conda create -n fppdf_environment nnpdf -c conda-forge
conda activate fppdf_environment
python -m pip install . --no-deps
```
This code adds no dependencies beyond those of `nnpdf` (which is itself a dependency of this code), so this should be enough to get this code and all its dependencies into the environment.

After installation, several commands prefixed with `fppdf_` will be available: `fppdf_setupfit`, `fppdf_fitpdf`, `fppdf_hessianerr` and `fppdf_evolve`.
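
A quick way to confirm that the installation exposed these commands is to look them up on the `PATH`. The snippet below is a minimal sketch, not part of `fppdf`: the helper name `missing_commands` is hypothetical, and only the four command names come from this README.

```python
import shutil

# Command names as listed in this README.
FPPDF_COMMANDS = ("fppdf_setupfit", "fppdf_fitpdf", "fppdf_hessianerr", "fppdf_evolve")


def missing_commands(which=shutil.which):
    """Return the fppdf commands that cannot be found on the PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [name for name in FPPDF_COMMANDS if which(name) is None]
```

Calling `missing_commands()` inside the activated environment should return an empty list if the installation succeeded.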

# Running the code

## Runcard preparation

The optimization procedure, minimization, and creation of the LHAPDF grid are governed by a YAML runcard (`.yml`), of which there are several examples in this repository (e.g., `example_full_short.yml`).
Below we describe some of the options in these runcards:

### Global options
- `dataset_inputs`: a list of datasets that will be considered in the fit. It follows the conventions of the NNPDF runcards (see [here](https://github.com/NNPDF/nnpdf/tree/master/n3fit/runcards/examples)) to allow for an easier comparison between our fits and NNPDF fits.
- `posdatasets`: same as `dataset_inputs`, for the list of positivity datasets
- `added_filter_rules`: extra cuts to add beyond the [default set of cuts](https://github.com/NNPDF/nnpdf/blob/master/validphys2/src/validphys/cuts/filters.yaml) of the NNPDF data
- `theoryid`: ID defining the theory assumptions under which the optimization is performed. The full database of theory IDs is available [here](https://docs.nnpdf.science/theory/theoryindex.html)
- `genrep`: whether MC replicas should be generated (default `False`; it should be `False` for Hessian uncertainties)
- `mcseed`: seed used to generate the MC replicas

### `inout_parameters`
Parameters to define the input / output of a run
- `inputnam`: file with the starting parameters. Some examples can be seen in the `input` folder. In these files the second column indicates whether a parameter is free (1) or fixed (0). If they are all fixed no optimization can occur.
- `label`: label of the run (can be arbitrary). Output files might carry a suffix depending on whether dynamic tolerance is being used.
- `covinput`: input covariance matrix file (if used)
- `readcov`: whether to read in a covariance matrix file to do an eigenvector scan. Default `False`.
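
Since a file in which every parameter is fixed leaves nothing to optimize, it can be useful to count the free parameters before starting a run. The sketch below is illustrative only: the helper `count_free_parameters` is hypothetical, and it assumes whitespace-separated columns and that rows without an integer flag in the second column (blank lines, comments, headers) can be skipped. Only the meaning of the second column (1 = free, 0 = fixed) is taken from this README.

```python
def count_free_parameters(lines):
    """Count rows whose second column is 1 (free parameter).

    Assumptions (not documented file format): columns are separated by
    whitespace, and rows without an integer flag in the second column
    are not parameter rows.
    """
    free = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue  # blank line, comment, or single-column row
        try:
            flag = int(fields[1])
        except ValueError:
            continue  # second column is not an integer flag
        if flag == 1:
            free += 1
    return free
```

The function accepts any iterable of lines, so an open file handle for the `inputnam` file can be passed directly. A result of `0` means no optimization can occur.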

### `fit_parameters`
Parameters to change specific options of the optimization.
- `fixpar`: if `True`, don't perform an optimization, just evaluate the different quantities needed during the fit. Useful for debugging. Default `False`.
- `nnpdf_pos`: whether to impose NNPDF positivity in the fit

### `chi2_parameters`
These parameters control how the uncertainties will be computed.
- `dynamic_tol`: whether to use dynamic tolerance or not
- `t2_err`: tolerance (squared) to use when `dynamic_tol` is set to `False`
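
For orientation, the textbook fixed-tolerance Hessian criterion can be written as follows. This is the standard form, stated under the assumption that `t2_err` plays the role of $T^2$; the symbols $a_0$ (best-fit parameters), $e_k$ (the $k$-th eigenvector direction) and $t_k^{\pm}$ (the displacements along it) are introduced here for illustration and are not names used by the code.

$$
\Delta\chi^2 \equiv \chi^2\left(a_0 + t_k^{\pm}\, e_k\right) - \chi^2\left(a_0\right) = T^2
$$

Each eigenvector direction is scanned in both senses until the global $\chi^2$ has risen by $T^2$ above its minimum. With dynamic tolerance, the allowed increase is instead determined separately for each eigenvector.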

### `theorycovmatconfig`
If this namespace exists, the theory covariance matrix will be computed with the following parameters:
- `point_prescriptions`: The point prescription(s) to be used for the computation of the theory covariance matrix (e.g., 7-point scale variations: `["7 points"]`)
- `pdf`: Which PDF to use to compute the theory covariance matrix.
- `theoryid`: Theory ID to use for the theory covariance matrix

# Step-by-step fit

## 0. Set up the fit (only if missing higher-order uncertainties are included in the covariance matrix).

Run the following command to prepare the run and compute the theory covariance matrix.
Running this command will download any grids necessary to compute the theory predictions for each of the theories involved in the computation of the covariance matrix and will prepare the theory covariance matrix to be used during optimization.
```
fppdf_setupfit <runcard.yml>
```

## 1. Perform the minimization.

This step performs the fit. If the theory covariance matrix is to be used, `fppdf_setupfit` (see above) must be run first.
This procedure might take some hours, depending on the number of datasets specified and how far from the minimum the fit starts.
At the end of this procedure the best fit to the central data will be produced.
```
fppdf_fitpdf <runcard.yml>
```

## 2. Compute the Hessian error members.

Once the fit has finished, use the fit parameters found in the previous step to compute the Hessian error members using the chosen prescription.
```
fppdf_hessianerr <runcard.yml>
```

## 3. Evolve and create the final grid.

After the central PDF and the eigenvectors have been computed, they need to be evolved to generate a full LHAPDF grid for all scales. This is done with:
```
fppdf_evolve <runcard.yml>
```
which will generate an LHAPDF grid in the `outputs/evgrids/` folder.
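
The steps above can be chained in a small driver script. This is a sketch under stated assumptions: the functions `fit_commands` and `run_fit` are hypothetical helpers, and the only facts taken from this README are the four command names, their order, and that each takes the runcard as its argument.

```python
import subprocess


def fit_commands(runcard, with_theory_covmat=False):
    """Return the steps of a fit as argv lists, in execution order."""
    steps = []
    if with_theory_covmat:
        # Step 0 is only needed when the theory covariance matrix is used.
        steps.append(["fppdf_setupfit", runcard])
    steps.append(["fppdf_fitpdf", runcard])      # step 1: minimization
    steps.append(["fppdf_hessianerr", runcard])  # step 2: Hessian error members
    steps.append(["fppdf_evolve", runcard])      # step 3: evolution to an LHAPDF grid
    return steps


def run_fit(runcard, with_theory_covmat=False, runner=subprocess.run):
    """Run each step in turn, stopping at the first failure."""
    for argv in fit_commands(runcard, with_theory_covmat):
        # check=True raises CalledProcessError on a non-zero exit status,
        # so later steps never run on the output of a failed one.
        runner(argv, check=True)
```

For example, `run_fit("example_full_short.yml")` would run steps 1 to 3 on one of the example runcards, and passing `with_theory_covmat=True` would prepend step 0.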

The outputs are in the `outputs/` folder:

- `evgrids/`: the output PDF grids, which can then be evolved using the NNPDF evolution (to be tidied/documented)
- `buffer/`: outputs from running the code
- `cov/`: covariance matrix (used for error calculation)
- `evscans/`: outputs of the eigenvector scan, if one is done
- `pars/`: the PDF parameters