Add an overview of loss functions to the documentation#1495

Merged
scarrazza merged 8 commits into master from docs_chi2_definitions on Jan 12, 2022

Member

@RoyStegeman RoyStegeman commented Jan 5, 2022

This addresses #1478.

This adds an overview of the loss functions we use in NNPDF to the documentation. The page itself does not go into much detail, but I think it links to the relevant resources, so it can serve as a reference in case someone is confused about which definition of the chi squared is used in which place.

@Zaharid is this roughly what you had in mind?

Closes #1478

@RoyStegeman RoyStegeman requested a review from Zaharid January 5, 2022 12:50
@RoyStegeman RoyStegeman added the documentation Issues and PRs related to documentation label Jan 5, 2022
Contributor

@Zaharid Zaharid left a comment

Looks good. Would like to amend the introduction a bit and perhaps think on one or two more substantive comments.

Comment thread doc/sphinx/source/figuresofmerit/index.rst Outdated
Comment thread doc/sphinx/source/figuresofmerit/index.rst Outdated
Comment thread doc/sphinx/source/figuresofmerit/index.rst Outdated
Co-authored-by: Zaharid <zk261@cam.ac.uk>
@RoyStegeman
Member Author

> Looks good. Would like to amend the introduction a bit and perhaps think on one or two more substantive comments.

What comments do you think are missing?

Replace arxiv links with internal BibTex citations in the figure of merit page.
Comment thread doc/sphinx/source/figuresofmerit/index.rst Outdated
\chi^{2}=\sum_{i, j}^{N_{\text {dat }}}(D-P)_{i} C_{i j}^{-1}(D-P)_{j},

where :math:`D_i` is the :math:`i`-th datapoint, :math:`P_i` is the prediction
of the corresponding datapoint calculated by performing the convolution product
Contributor

This could be worded a bit more loosely so that it is also correct for compound observables, although we might want to delete it altogether.

Suggested change
- of the corresponding datapoint calculated by performing the convolution product
+ of the corresponding datapoint calculated from the convolution product
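
For reference, the definition quoted in this thread can be written out numerically. The following is a minimal NumPy sketch and not the validphys or n3fit implementation; the function name `chi2` and the toy numbers are assumptions made purely for illustration.

```python
import numpy as np


def chi2(data, predictions, covmat):
    """Return (D - P)^T C^{-1} (D - P) for an invertible covariance matrix C."""
    diff = np.asarray(data, dtype=float) - np.asarray(predictions, dtype=float)
    # Solve C x = (D - P) rather than forming the explicit inverse of C.
    solved = np.linalg.solve(np.asarray(covmat, dtype=float), diff)
    return float(diff @ solved)


# Toy inputs (hypothetical values, for illustration only).
data = np.array([1.0, 2.0, 3.0])
predictions = np.array([1.1, 1.9, 3.2])
covmat = np.diag([0.01, 0.04, 0.04])  # uncorrelated uncertainties

value = chi2(data, predictions, covmat)
```

With a diagonal covariance matrix, as in this toy case, the expression reduces to the sum of squared differences divided by the variances; correlated uncertainties enter through the off-diagonal elements of `covmat`.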

when each of them is used.


The basis of the loss functions: 𝜒²
Contributor

I feel that it is somewhat ambiguous here that we are discussing the so-called "experimental" chi2, and it seems to me that the word "experimental" should appear in a few places.

Member Author

So this is something that confused me a bit about the discussion for the referee reply; perhaps you can clear this up. What we call the "experimental" chi2 in n3fit uses the same t0 covmats as the validation and training losses. However, based on the mail exchange this week, it seems that the "experimental" chi2 doesn't use the t0 prescription?

Contributor

Perhaps I can chime in here. For all our fits we use the covariance matrix constructed under the t0 prescription. That is, multiplicative uncertainties are multiplied by predictions from the t0 PDF (as opposed to the experimental measurement) to transform them into additive uncertainties before constructing the covariance matrix.

The reason we do this has to do with the D'Agostini bias: if we attempt to use the non-t0 covmat (which is the experimental covmat) to compute the loss, then our result will be a biased estimator.

The discussion for the reply to the referee pertains to what covmat we use to compute the chi2 tables in the reports. The experimental chi2 uses the experimental covmat; the t0 chi2 similarly uses the t0 covmat. Really we should look at the t0 chi2 because that is what the replicas are optimising for.
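
To make the distinction concrete, here is a toy sketch of how the two covariance matrices differ. It is not how validphys builds covmats: the function `build_covmat`, its arguments, and all numbers are hypothetical, and every systematic is assumed to be fully correlated across data points. The only difference between the two matrices is which central values rescale the relative multiplicative uncertainties.

```python
import numpy as np


def build_covmat(stat, add_sys, mult_sys, central):
    """Toy covariance matrix (hypothetical helper, simplified layout).

    stat:     (ndat,) absolute uncorrelated uncertainties
    add_sys:  (ndat, nadd) absolute additive systematics
    mult_sys: (ndat, nmult) relative multiplicative systematics
    central:  (ndat,) values used to turn mult_sys into absolute shifts
    """
    # Relative multiplicative uncertainties become absolute ones here.
    mult_abs = mult_sys * central[:, np.newaxis]
    return np.diag(stat**2) + add_sys @ add_sys.T + mult_abs @ mult_abs.T


# Toy inputs (hypothetical values, for illustration only).
data = np.array([10.0, 20.0])
t0_predictions = np.array([11.0, 19.0])
stat = np.array([0.5, 0.8])
add_sys = np.array([[0.2], [0.3]])
mult_sys = np.array([[0.05], [0.05]])  # a 5% normalisation uncertainty

# Experimental covmat: multiplicative uncertainties scaled by the data.
exp_covmat = build_covmat(stat, add_sys, mult_sys, central=data)
# t0 covmat: multiplicative uncertainties scaled by the t0 predictions.
t0_covmat = build_covmat(stat, add_sys, mult_sys, central=t0_predictions)
```

In this sketch, a chi2 computed with `exp_covmat` would correspond to the "experimental chi2" in the sense used in the report tables, and one computed with `t0_covmat` to the "t0 chi2".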

Member Author

Thanks, but this is indeed my point: what you now call the "t0 chi2" is simply called "experimental chi2" (or sometimes "true chi2" iirc) inside n3fit.

For external communications I suppose that's fine, as long as we're consistent in distinguishing t0 chi2 from experimental chi2, but in the docs this double use of the name can cause problems/confusion.

Comment thread doc/sphinx/source/figuresofmerit/index.rst Outdated
comments from ZK on the chi squared overview

Co-authored-by: Zaharid <zk261@cam.ac.uk>
@scarrazza scarrazza merged commit 0f9820e into master Jan 12, 2022
@scarrazza scarrazza deleted the docs_chi2_definitions branch January 12, 2022 15:23
Labels

documentation Issues and PRs related to documentation

Development

Successfully merging this pull request may close these issues.

Add tutorial on computing chi²

4 participants