Add an overview of loss functions to the documentation #1495
Zaharid left a comment
Looks good. Would like to amend the introduction a bit and perhaps think on one or two more substantive comments.
Co-authored-by: Zaharid <zk261@cam.ac.uk>
What comments do you think are missing?
Replace arXiv links with internal BibTeX citations in the figure of merit page.
…ocs_chi2_definitions
\chi^{2}=\sum_{i, j}^{N_{\text{dat}}}(D-P)_{i} C_{i j}^{-1}(D-P)_{j},

where :math:`D_i` is the :math:`i`-th datapoint, :math:`P_i` is the prediction
of the corresponding datapoint calculated by performing the convolution product
This could be worded more generally so that it is also correct for compound observables, although we might want to delete it altogether.
- of the corresponding datapoint calculated by performing the convolution product
+ of the corresponding datapoint calculated from the convolution product
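For reference, the quadratic form in the hunk above can be evaluated numerically as in the following minimal sketch. It assumes NumPy is available; the function name `chi2` and the toy arrays are hypothetical and are not taken from the NNPDF code. It only illustrates the formula, not how predictions or covariance matrices are actually produced.

```python
import numpy as np


def chi2(data, prediction, covmat):
    """Evaluate sum_ij (D - P)_i C^{-1}_ij (D - P)_j for 1-D arrays D, P."""
    diff = np.asarray(data, dtype=float) - np.asarray(prediction, dtype=float)
    # Solve C x = (D - P) instead of forming the explicit inverse of C.
    return float(diff @ np.linalg.solve(np.asarray(covmat, dtype=float), diff))


# Toy inputs (illustrative only): two uncorrelated points.
toy_data = np.array([1.0, 2.0])
toy_prediction = np.array([0.0, 0.0])
toy_covmat = np.diag([1.0, 4.0])

toy_chi2 = chi2(toy_data, toy_prediction, toy_covmat)
```

With a diagonal covariance matrix this reduces to the familiar sum of squared residuals divided by the variances; solving the linear system rather than inverting the matrix is a numerical-stability choice of this sketch, not something stated in the page under review.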
when each of them is used.

The basis of the loss functions: 𝜒²
I feel that here it is somewhat ambiguous that we are discussing the so-called "experimental" chi2, and it seems to me that the word "experimental" should appear in a few places.
So this is something that confused me a bit about the discussion for the referee reply, perhaps you can clear this up. What we call the "experimental" chi2 in n3fit uses the same t0 covmats as the validation and training losses, however based on the mail exchange this week it seems that the "experimental" chi2 doesn't use the t0 prescription?
Perhaps I can chime in here. For all our fits we use the covariance matrix constructed under the t0 prescription. That is: multiplicative uncertainties are multiplied by predictions from the t0 PDF (as opposed to the experimental measurement) to transform them to additive uncertainties before constructing the covariance matrix.
The reason we do this has to do with the D'Agostini bias: if we attempt to use the non-t0 covmat (which is the experimental covmat) to compute the loss, then our results will be biased.
The discussion for the reply to the referee pertains to what covmat we use to compute the chi2 tables in the reports. The experimental chi2 uses the experimental covmat; the t0 chi2 similarly uses the t0 covmat. Really we should look at the t0 chi2 because that is what the replicas are optimising for.
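The difference between the two covmats described above can be sketched as follows. This is an illustration under stated assumptions, not the validphys or n3fit implementation: the function name `build_covmat`, the convention that multiplicative uncertainties are given as relative fractions, and all toy numbers are hypothetical. The only thing that changes between the two calls is which central values rescale the multiplicative uncertainties.

```python
import numpy as np


def build_covmat(stat, add_sys, mult_sys, central):
    """Toy covariance matrix.

    stat:     (N,) uncorrelated statistical uncertainties (absolute)
    add_sys:  (N, n_add) correlated additive uncertainties (absolute)
    mult_sys: (N, n_mult) correlated multiplicative uncertainties,
              assumed here to be relative fractions
    central:  (N,) values that turn multiplicative uncertainties into
              absolute ones: the data (experimental covmat) or the
              t0 predictions (t0 covmat)
    """
    stat = np.asarray(stat, dtype=float)
    central = np.asarray(central, dtype=float)
    mult_abs = np.asarray(mult_sys, dtype=float) * central[:, None]
    all_sys = np.hstack([np.asarray(add_sys, dtype=float), mult_abs])
    return np.diag(stat**2) + all_sys @ all_sys.T


# Toy inputs (illustrative only).
data = np.array([10.0, 20.0])
t0_prediction = np.array([11.0, 19.0])
stat = np.array([1.0, 2.0])
add_sys = np.array([[0.5], [0.5]])
mult_sys = np.array([[0.1], [0.1]])  # a 10% normalisation uncertainty

exp_covmat = build_covmat(stat, add_sys, mult_sys, central=data)
t0_covmat = build_covmat(stat, add_sys, mult_sys, central=t0_prediction)
```

In this sketch the "experimental chi2" and the "t0 chi2" of the report tables would differ only through which of these two matrices enters the quadratic form.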
Thanks, but this is indeed my point: what you now call the "t0 chi2" is simply called "experimental chi2" (or sometimes "true chi2" iirc) inside n3fit.
For external communications I suppose that's fine, as long as we're consistent in distinguishing t0 chi2 from experimental chi2, but in the docs this double use of the name can cause problems/confusion.
comments from ZK on the chi squared overview
Co-authored-by: Zaharid <zk261@cam.ac.uk>
This addresses #1478.
This adds an overview of the loss functions we use in NNPDF to the documentation. The page itself does not go into much detail, but I think it links to the relevant resources, so it can be used as a reference in case someone is confused about the definition of the chi squared used in various places.
@Zaharid is this roughly what you had in mind?
Closes #1478