Add an overview of loss functions to the documentation #1495
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
8 commits, all by RoyStegeman:

- 883ca61 add description of different figures of merit to docs
- 6e1ed87 update chi2 overview docs
- 51c88d5 add instruction for integrability in the runcard to docs
- 2a2c7ab make t0 a subsection of the basis chi squared description in docs
- 845a4fb replace chi square distribution with chi square statistic
- 2dc7aa0 replace arxiv links with internal citation in docs
- 5aa7da1 Merge branch 'docs_chi2_definitions' of github.com:NNPDF/nnpdf into d…
- 9668104 Apply suggestions from code review
Chi square figures of merit
================================================================================

Within the NNPDF methodology various figures of merit are used, each of which
applies in a different situation. To avoid confusion, it is important to
understand the differences between the various figures of merit, and to
know which definition is being referred to in a given context. In
particular, it is worth stressing that whenever a figure of merit is discussed,
the :math:`t_0` method (discussed below) applies.

Here we provide an overview of the different figures of merit and discuss
when each of them is used.

The basis of the loss functions: 𝜒²
--------------------------------------------------------------------------------
The :math:`\chi^2` figures of merit used in the NNPDF methodology are all
based on the chi square statistic:

.. math::
   \chi^{2}=\sum_{i, j}^{N_{\text{dat}}}(D-P)_{i} C_{i j}^{-1}(D-P)_{j},

where :math:`D_i` is the :math:`i`-th datapoint, :math:`P_i` is the prediction
for the corresponding datapoint, calculated from the convolution product
between the :ref:`FastKernel tables<fktables>` for point :math:`i` and the PDF
model, and :math:`C_{ij}` is the covariance between datapoints :math:`i`
and :math:`j`.

The covariance matrix accounts for correlated systematic uncertainties,
normalization uncertainties, and statistical uncertainties as provided by the
experimental collaborations.

.. note::
   This definition of :math:`\chi^2` is not used as a figure of merit
   anywhere in NNPDF fits. Instead, the variations discussed below
   are used.
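For illustration, the statistic above can be evaluated with a Cholesky solve rather than an explicit matrix inverse, which is cheaper and numerically more stable. This is a minimal sketch with toy numbers; ``chi2`` is a hypothetical helper, not the validphys implementation:

```python
import numpy as np

def chi2(data, predictions, covmat):
    """Compute chi2 = (D - P)^T C^{-1} (D - P).

    Hypothetical helper illustrating the formula above. Solving against
    the Cholesky factor of C avoids forming C^{-1} explicitly.
    """
    diff = np.asarray(data) - np.asarray(predictions)
    L = np.linalg.cholesky(covmat)   # C = L L^T
    x = np.linalg.solve(L, diff)     # x = L^{-1} (D - P)
    return float(x @ x)              # |x|^2 = (D-P)^T C^{-1} (D-P)

# Toy example: two correlated datapoints (numbers are made up)
D = np.array([1.0, 2.0])
P = np.array([0.9, 2.2])
C = np.array([[0.04, 0.01],
              [0.01, 0.09]])
result = chi2(D, P, C)  # ≈ 0.8286
```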
Avoiding bias: t₀ method
~~~~~~~~~~~~~~~~~~~~~~~~
The :math:`t_0` method introduced in :cite:p:`Ball:2009qv` aims to
remove the systematic bias that results from a naive treatment of multiplicative
uncertainties. This is done by redefining the covariance matrix in the
definition of :math:`\chi^2`, resulting in a :math:`t_0` covariance matrix
:math:`C_{t_0}` and a corresponding figure of merit sometimes denoted by
:math:`\chi^2_{t_0}`, though often simply written as :math:`\chi^2`.

.. note::
   From NNPDF2.0 onwards the t₀ formalism has been used to define the figure of
   merit used during the fitting of the PDFs.
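The essence of the construction can be sketched as follows: multiplicative (fractional) uncertainties are converted to absolute shifts using predictions from a :math:`t_0` PDF set rather than the measured central values. This is a conceptual sketch only, with hypothetical argument names; the actual construction lives in validphys:

```python
import numpy as np

def t0_covmat(stat_unc, mult_unc_frac, t0_predictions):
    """Sketch of a t0 covariance matrix (hypothetical helper).

    stat_unc:        per-point additive (statistical) uncertainties
    mult_unc_frac:   (n_sys, n_dat) fractional multiplicative uncertainties
    t0_predictions:  predictions from the t0 PDF set

    Multiplicative uncertainties are rescaled by the t0 predictions,
    not by the data, which avoids the D'Agostini bias.
    """
    C = np.diag(np.asarray(stat_unc, dtype=float) ** 2)
    for sys in np.atleast_2d(mult_unc_frac):
        beta = np.asarray(sys) * np.asarray(t0_predictions)  # absolute shifts
        C += np.outer(beta, beta)  # fully correlated systematic source
    return C

# Toy numbers: 2 datapoints, one 5% multiplicative systematic
C_t0 = t0_covmat([0.1, 0.2], [[0.05, 0.05]], [1.0, 2.0])
```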
Missing higher order uncertainties
--------------------------------------------------------------------------------
Another source of uncertainty that we may want to include in the covariance
matrix is theoretical uncertainty, in particular missing higher order
uncertainties estimated through scale variations. These uncertainties can be
included in the figure of merit through the implementation of a 'theory
covariance matrix'. The formalism is discussed in
:cite:p:`AbdulKhalek:2019bux`. For a tutorial see
:ref:`How to include a theory covariance matrix in a fit <thcov_tutorial>`.
Future test: including PDF errors
--------------------------------------------------------------------------------
To test the generalization power of the NNPDF fitting framework in the region
where PDFs are not constrained by data, the 'future test' has been developed.
The figure of merit considered in a future test is again the :math:`\chi^2`;
however, in this case the covariance matrix is not just the covariance matrix
corresponding to the datasets, but rather the sum of the covariance
matrix describing the data uncertainties and the covariance matrix describing
the PDF uncertainties.

For a more detailed discussion of the future test formalism see e.g.
:cite:p:`Cruz-Martinez:2021rgy`, or learn
:ref:`How to run a Future Test <futuretests>`.
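Schematically, the PDF covariance matrix can be estimated from the spread of predictions over replicas and added to the experimental one before computing the :math:`\chi^2`. A rough sketch under that assumption (hypothetical helper, not the validphys code):

```python
import numpy as np

def future_test_chi2(data, predictions_per_replica, C_exp):
    """Sketch of a future-test chi2 using C_tot = C_exp + C_pdf.

    predictions_per_replica: array of shape (n_rep, n_dat); the PDF
    covariance matrix is estimated from the replica spread.
    """
    preds = np.asarray(predictions_per_replica, dtype=float)
    central = preds.mean(axis=0)
    C_pdf = np.cov(preds, rowvar=False)     # covariance over replicas
    C_tot = np.asarray(C_exp) + C_pdf       # data + PDF uncertainties
    diff = np.asarray(data) - central
    return float(diff @ np.linalg.solve(C_tot, diff))
```

When all replicas give identical predictions, `C_pdf` vanishes and the result reduces to the ordinary data-only :math:`\chi^2`.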
Regularized covariance matrices
--------------------------------------------------------------------------------
Information about the accuracy of the experimental uncertainties is generally not
available; nevertheless, inaccuracies in an experimental covariance matrix can
lead to problems during optimization. Simply making a conservative estimate of
the correlations does not always avoid this problem, and this is
where the regularized covariance matrix comes in: it aims to provide a matrix
which is closely related to the original experimental covariance matrix while
avoiding the problems during optimization.

The function that performs the regularization is
:py:meth:`validphys.calcutils.regularize_l2`. A regularized covariance matrix
cannot be generated while performing a fit, as it is necessary to produce the
corresponding :ref:`FastKernel tables<fktables>` and include the dataset in the
theory as a separate dataset. For instructions on how to do this see
:ref:`tutorialfktables`.

A more detailed discussion of the regularization procedure, and how it is used
within NNPDF, can be found in sections 4.2 and 8.7 of the NNPDF4.0 paper
:cite:p:`nnpdf40`.
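As a generic illustration of one common regularization approach (eigenvalue clipping of the correlation matrix), the sketch below shows the idea; the exact procedure implemented in ``validphys.calcutils.regularize_l2`` is described in the NNPDF4.0 paper, and all details here (function name, threshold convention) are assumptions:

```python
import numpy as np

def regularize_correlations(C, norm_threshold=4.0):
    """Illustrative regularization sketch (hypothetical helper).

    Works on the correlation matrix: eigenvalues below
    1/norm_threshold**2 are clipped from below, taming directions
    with artificially small uncertainty that destabilize optimization.
    """
    sigma = np.sqrt(np.diag(C))
    corr = C / np.outer(sigma, sigma)
    vals, vecs = np.linalg.eigh(corr)
    vals = np.clip(vals, 1.0 / norm_threshold**2, None)
    corr_reg = vecs @ np.diag(vals) @ vecs.T
    return corr_reg * np.outer(sigma, sigma)
```

A well-conditioned matrix (all correlation eigenvalues above the cutoff) passes through unchanged; only nearly singular directions are modified.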
The weighted fit method
--------------------------------------------------------------------------------
To determine whether a specific dataset shows inconsistencies with the
global dataset, one can produce a PDF determination in which that measurement
is given an increased weight (usually equal to the combined weight of the other
datasets). The idea is that if the agreement with the other datasets
deteriorates in order to accommodate the dataset under investigation, this
dataset is likely inconsistent with the global dataset.

When performing a weighted fit the figure of merit is hence redefined as

.. math::
   \chi^{2}=\frac{1}{N_{\text{dat}}-N_{\text{dat}}^{(j)}}
   \sum_{i \neq j}^{n_{\text{exp}}}N_{\text{dat}}^{(i)}\chi_{i}^{2}
   +w^{(j)} \chi_{j}^{2}

with :math:`w^{(j)}=N_{\rm dat}/N^{(j)}_{\rm dat}`.

A dataset can be given an additional weight by explicitly writing a weight key
for a given dataset in the :ref:`n3fit runcard <runcard-detailed>`. For example,
while the default weight is 1, one can set the weight of the
HERACOMB_SIGMARED_C dataset to 100 by adding the following to the runcard:

.. code-block:: yaml

   dataset_inputs:
   - {dataset: HERACOMB_SIGMARED_C, frac: 0.75, weight: 100}
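The redefined figure of merit can be transcribed directly, reading :math:`\chi^2_i` as per-experiment, per-point losses. This is a literal sketch of the equation above with hypothetical argument names, not the n3fit implementation:

```python
import numpy as np

def weighted_chi2(chi2_per_exp, ndat_per_exp, j):
    """Literal transcription of the weighted-fit figure of merit (sketch).

    chi2_per_exp: per-experiment (per-point) losses chi2_i
    ndat_per_exp: datapoints per experiment, N_dat^(i)
    j:            index of the dataset given weight w^(j) = N_dat / N_dat^(j)
    """
    chi2 = np.asarray(chi2_per_exp, dtype=float)
    ndat = np.asarray(ndat_per_exp, dtype=float)
    N = ndat.sum()
    mask = np.arange(len(ndat)) != j
    w_j = N / ndat[j]
    return float((ndat[mask] * chi2[mask]).sum() / (N - ndat[j]) + w_j * chi2[j])
```

For instance, with three experiments of 10, 20 and 30 points all at :math:`\chi^2_i = 1`, weighting the third gives :math:`1 + (60/30)\cdot 1 = 3`.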
Experimental, validation, and training 𝜒²
--------------------------------------------------------------------------------
When performing a PDF fit we generally distinguish three different definitions
of the :math:`\chi^2` loss function, namely the experimental loss
:math:`\chi^2_{\rm exp}`, the training loss :math:`\chi^2_{\rm tr}` and the
validation loss :math:`\chi^2_{\rm val}`, all of which are defined using the
:math:`t_0` method. Here the experimental loss is calculated with respect to the
experimental covariance matrix and corresponding central values, while the
training and validation losses are defined with respect to the central values
of the pseudodata replicas.

The training and validation losses are used for cross-validation in the
early stopping algorithm, and can further be adjusted to ensure positivity and
integrability of the resulting PDFs by adding a component to the
loss function (see :ref:`below <lagrange-multipliers>`).

More details of these loss functions and the role they play within the training
of the neural network can be found in the :ref:`methodology overview
<methodology>`.
.. _lagrange-multipliers:

Positivity and integrability: Lagrange multipliers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Generally in an NNPDF fit we will want to ensure positivity and integrability of
the resulting PDFs. This is enforced by means of Lagrange multipliers, which
provide an additional contribution to the definition of the chi squared
loss function.

For a discussion of how exactly the loss function is adjusted upon including
the Lagrange multipliers, see sections 3.1.3 and 3.1.4 of the NNPDF4.0 paper
:cite:p:`nnpdf40`.

An explanation of how the runcard should be adjusted to include the additional
positivity Lagrange multiplier can be found :ref:`here <positivity-label>`,
while the analogous information for integrability can be found
:ref:`here <integrability-label>`.
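Schematically, such a contribution penalizes violations of the constraint at a set of test points, scaled by a Lagrange multiplier. The sketch below uses a simple rectified-linear penalty for illustration; the exact functional form used by n3fit is given in the NNPDF4.0 paper:

```python
import numpy as np

def loss_with_positivity(chi2_value, pos_observables, lagrange_mult):
    """Schematic Lagrange-multiplier penalty (not the exact NNPDF form).

    pos_observables: values of positivity observables at test points;
    only negative values contribute, pushing the fit towards
    positive PDFs as the multiplier grows.
    """
    violations = np.maximum(0.0, -np.asarray(pos_observables, dtype=float))
    return chi2_value + lagrange_mult * violations.sum()

# Toy example: one of two test points violates positivity by 0.2
total = loss_with_positivity(1.0, [0.5, -0.2], 10.0)  # 1.0 + 10 * 0.2 = 3.0
```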
Hyperoptimized figure of merit
--------------------------------------------------------------------------------
To test the generalization power of a given methodology (a specific set of
hyperparameter values), we employ hyperoptimization; specifically we use
K-folds cross-validation. The idea of K-folds cross-validation is to create
subsets of data representative of the global dataset, and then perform a
fit to :math:`K-1` subsets while using the :math:`K^{\rm th}` subset as a test
set to check the generalization performance after the neural network has been
trained. The figure of merit that is minimized during the hyperoptimization
routine is obtained by summing the :math:`K` test losses obtained
after performing :math:`K` fits, one to each possible combination of
:math:`K-1` subsets.

For a more detailed description of the hyperoptimization loss see the
documentation of the :ref:`hyperoptimization algorithm<hyperoptimization>`.
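The K-folds procedure described above can be sketched generically as follows, with ``fit_and_evaluate`` standing in for running a fit on the training folds and returning the test loss on the held-out fold (the interface is hypothetical, not the n3fit hyperopt code):

```python
import numpy as np

def kfold_hyperopt_loss(datasets, fit_and_evaluate, K=4):
    """Sketch of a K-folds hyperoptimization loss (hypothetical interface).

    Performs K fits, each excluding one fold, and sums the losses
    obtained on the held-out folds.
    """
    folds = np.array_split(np.asarray(datasets, dtype=object), K)
    total = 0.0
    for k in range(K):
        test = list(folds[k])
        train = [d for i, fold in enumerate(folds) if i != k for d in fold]
        total += fit_and_evaluate(train, test)  # test loss of the k-th fold
    return total
```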
validphys2/src/validphys/theorycovariance/theorycovarianceutils.py
(2 changes: 1 addition & 1 deletion):

  """
- theorycovariance.py
+ theorycovarianceutils.py

  Low level utilities for theorycovariance module
  """
Review comments:

- I feel that here it is somewhat ambiguous that we are discussing the so-called "experimental" chi2, and it seems to me the word "experimental" should appear in a few places.

- So this is something that confused me a bit about the discussion for the referee reply, perhaps you can clear this up. What we call the "experimental" chi2 in n3fit uses the same t0 covmats as the validation and training losses, however based on the mail exchange this week it seems that the "experimental" chi2 doesn't use the t0 prescription?

- Perhaps I can chime in here. For all our fits we use the covariance matrix constructed under the t0 prescription. That is: multiplicative uncertainties are multiplied by predictions from the t0 PDF (as opposed to the experimental measurement) to transform them into additive uncertainties before constructing the covariance matrix.
  The reason we do this has to do with the D'Agostini bias: if we attempt to use the non-t0 covmat (which is the experimental covmat) to compute the loss, then our results will be a biased estimator.
  The discussion for the reply to the referee pertains to which covmat we use to compute the chi2 tables in the reports. The experimental chi2 uses the experimental covmat; the t0 chi2 similarly uses the t0 covmat. Really we should look at the t0 chi2 because that is what the replicas are optimising for.

- Thanks, but this is indeed my point: what you now call the "t0 chi2" is simply called "experimental chi2" (or sometimes "true chi2" iirc) inside n3fit.
  For external communications I suppose that's fine, as long as we're consistent in distinguishing t0 chi2 from experimental chi2, but in the docs this double use of the name can cause problems/confusion.