
Hardcode theories needed for scale variations#664

Merged
scarrazza merged 23 commits into master from hardcode_scale_var_theories on Apr 22, 2020

Conversation

@voisey
Contributor

@voisey voisey commented Feb 25, 2020

Aims to close #454.

This is still a WIP because among other things the docs need to be updated, but I'll do this once we have settled on the code.

A runcard that worked with the old setup is this:

default_theory:
   - theoryid: 163

theoryids:
   - 163
   - 180
   - 173

fit: 190315_ern_nlo_central_163_global
use_cuts: "fromfit"

pdf:
    from_: fit

experiments:
  - experiment: NMC
    datasets:
      - dataset: NMCPD
      - dataset: NMC
  - experiment: SLAC
    datasets:
      - dataset: SLACP
      - dataset: SLACD

template_text: |

   {@with default_theory@}
      {@chi2_impact_custom@}
   {@endwith@}

actions_:
  - report(main=true)

whereas now we can have something like this:

default_theory:
   - theoryid: 163

theoryid: 163
point_prescription: '3 point'

theoryids:
    from_: scale_variation_theories

fit: 190315_ern_nlo_central_163_global
use_cuts: "fromfit"

pdf:
    from_: fit

experiments:
  - experiment: NMC
    datasets:
      - dataset: NMCPD
      - dataset: NMC
  - experiment: SLAC
    datasets:
      - dataset: SLACP
      - dataset: SLACD

template_text: |

   {@with default_theory@}
      {@chi2_impact_custom@}
   {@endwith@}

actions_:
  - report(main=true)

where you get an error if you try to use a point_prescription with a theoryid other than 163, and there are six allowed point_prescriptions: '3 point', '5 point', '5bar point', '7 point', '7 point (original)' and '9 point'. Also, note that the first runcard will still work with the new setup.

Let me know what you think. I was also wondering whether it would be sensible for us to change it so that the user no longer has to explicitly define what the default_theory is, but rather this is hardcoded too?
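For concreteness, the resolution the new runcard relies on can be sketched like this. Everything below is illustrative only: the dictionary contents and the function name stand in for the hardcoded mappings in the PR, and the assignment of scale combinations to particular theoryids is an assumption, not taken from the actual yaml files.

```python
# Illustrative sketch only: the mapping contents and names stand in for the
# PR's real hardcoded data rather than reproducing it.

# Point prescription -> ordered list of (muF, muR) scale multipliers
POINT_PRESCRIPTIONS = {
    "3 point": [(1, 1), (2, 2), (0.5, 0.5)],
}

# Central theoryid -> {(muF, muR): theoryid of the scale-varied theory}.
# The ids mirror the example runcard above; which id carries which scales
# is assumed here for illustration.
SCALE_VARIATION_THEORYIDS = {
    163: {(1, 1): 163, (2, 2): 180, (0.5, 0.5): 173},
}

def scale_variation_theories(theoryid, point_prescription):
    """Return theoryids in the order fixed by the point prescription."""
    if point_prescription not in POINT_PRESCRIPTIONS:
        raise ValueError(f"Unknown point prescription: {point_prescription!r}")
    if theoryid not in SCALE_VARIATION_THEORYIDS:
        raise ValueError(f"No scale variations recorded for theory {theoryid}")
    variations = SCALE_VARIATION_THEORYIDS[theoryid]
    return [variations[s] for s in POINT_PRESCRIPTIONS[point_prescription]]

print(scale_variation_theories(163, "3 point"))  # -> [163, 180, 173]
```

The point is that `theoryids: {from_: scale_variation_theories}` replaces a hand-ordered list: the ordering lives in one place instead of in every runcard.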

Comment thread validphys2/src/validphys/config.py Outdated
Contributor

@RosalynLP RosalynLP left a comment

This looks good, when we merge we should update the docs with the new runcard layout and point prescription flags.

Contributor

@wilsonmr wilsonmr left a comment

looks good to me, just update the docs as Rosalyn mentioned

@Zaharid
Contributor

Zaharid commented Mar 3, 2020

I think the relation that needs to be stored is which theory is a scale variation of which other rather than how to do the point prescription for a specific theory.

The format could be something like

scale_variations_for:
  - theoryid: 163
    variations:
        - scales: {"muF": 0.5, "muR": 1}
          theoryid: <some id> #This is likely wrong

Given the above, one can compute e.g. point prescriptions for any theory that has the required variations without having to write a hard to debug code like the one in this PR.

It would be nice if the specification was in a yaml file, a bit like it is done in the filters. Alternatively it could be a bunch of tables in the theorydb, but at some point we said we will move away from that due to it being unfriendly to git (but then didn't act on it...).

@voisey
Contributor Author

voisey commented Mar 12, 2020

I don't understand the benefit of your suggestion. Why would the user want to have multiple point prescriptions for a given theory and scale choice? At the end of the day, the user will want to use some specific point prescription for a certain theory set-up (hence the central theoryid). Surely then specifying only the point prescription and the central theory is the simplest thing to do?

Also, I don't see how this would prevent the code being hard to debug - right now the code depends on you giving the theoryids in a particular order, but that would be the case with your suggestion too, no?

@Zaharid
Contributor

Zaharid commented Mar 12, 2020

I don't understand the benefit of your suggestion. Why would the user want to have multiple point prescriptions for a given theory and scale choice? At the end of the day, the user will want to use

If we lived in a universe where NNLO scale variations would ever happen, you would have to specify which are the scale variations for theory 53 (or whatever), which is straightforward and easily machine-checked. You can also reuse the same specifications for things that are not point prescriptions, such as mcscales.

Also, I don't see how this would prevent the code being hard to debug - right now the code depends on you giving the theoryids in a particular order, but that would be the case with your suggestion too, no?

You don't have to find meaning in the formula of point prescriptions every time. You can write it once in terms of scale multipliers and have that work for arbitrary theories. Instead it is easy enough to check that a given scale varied theory has the right multipliers.

@voisey
Contributor Author

voisey commented Mar 12, 2020

I'm confused. Can you please write scale_variations_for in full as you would for a specific example, say 3pt?

@wilsonmr
Contributor

wilsonmr commented Mar 12, 2020

Surely there can be both?

So if I understand Zahari correctly: we currently only have fully scale-varied theories for NLO, right? So in that sense 5 point does mean:

elif pp == '5 point':
    thids = [163, 177, 176, 179, 174]

However, what would be great is if, for some base theory (in this case theory 163), we had all of the associated scale-varied theories stored in a yaml file somewhere sensible:

scale_variations_for:
  - theoryid: 163
    variations:
        - scales: {"muF": 0.5, "muR": 1}
          theoryid: <some id> # maybe 170 or whatever
        - scales: {"muF": 2, "muR": 1}
          theoryid: <some id> # next id

EDIT: clearly this is nothing to do with point prescription but is just a useful way of associating other theories with 163

so on so on... Then one could even also store point prescriptions, but in word format:

point_prescriptions:
  - name: 3 point
    scales: [{muF: 1, muR: 1}, {muF: 0.5, muR: 0.5}, etc.]

or whatever they are. Then it's easier to see qualitatively what the point prescription was supposed to be combining and if there is a mistake then it was in the labelling of the theory. Also it extends to the future when you have:

scale_variations_for:
  ...
  - theoryid: 2501 # NNNLO theory
    variations:
        - scales: {"muF": 0.5, "muR": 1}
          theoryid: 2561 # NNNLO scale varied
        etc.

I don't know if I have misunderstood, but it seems like both your ideas could be beneficial and are not mutually exclusive.

@Zaharid
Contributor

Zaharid commented Mar 16, 2020

Yeah, it is exactly what @wilsonmr says.
I think the variations are what we want to store (in the format of the comment by @wilsonmr above), and the point prescriptions should be computed given a set of scale variations.

I don't have a strong opinion as to whether there should be a yaml file specifying the point prescriptions or whether these should be directly hardcoded in Python in a way that works for any set of scale variations. In any case, it will be much easier for someone to discover what the point prescriptions do if they are expressed in terms of the variations rather than some convention of indexes in a list.
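One thing this buys can be sketched as follows (all names and scale values here are illustrative, not taken from the repository): given the variations stored for a theory, it becomes mechanical to check which prescriptions the available theories can support, with no index convention to decode.

```python
# Hypothetical stored variations for one central theory: each entry maps a
# (muF, muR) multiplier pair to the id of the corresponding scale-varied theory.
available = {(1.0, 1.0): 163, (0.5, 0.5): 173, (2.0, 2.0): 180}

# Prescriptions expressed directly as sets of scale-multiplier pairs
# (illustrative values for 3 point and 5 point).
prescriptions = {
    "3 point": {(1.0, 1.0), (0.5, 0.5), (2.0, 2.0)},
    "5 point": {(1.0, 1.0), (0.5, 1.0), (2.0, 1.0), (1.0, 0.5), (1.0, 2.0)},
}

def supported(available, prescriptions):
    """Names of the prescriptions whose every scale pair has a matching theory."""
    return [
        name for name, pairs in prescriptions.items()
        if pairs <= available.keys()
    ]

print(supported(available, prescriptions))  # -> ['3 point']
```

With the index-list convention this check has to be done by eye; here a missing theory simply drops the prescription out of the supported set.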

@scarrazza
Member

@voisey could you please implement the suggestion from @wilsonmr and @Zaharid, so we can proceed and merge this PR?

@voisey
Contributor Author

voisey commented Mar 31, 2020

@scarrazza Yes, it's on my to-do list

@voisey
Contributor Author

voisey commented Apr 1, 2020

Where would be the best place to store the yaml file that hardcodes the theory-scales correspondence? Are you happy with validphys/theorycovariance for the moment?

@voisey
Contributor Author

voisey commented Apr 1, 2020

Also, @wilsonmr why did you suggest having the point prescription-scales correspondence in text rather than yaml?

@Zaharid
Contributor

Zaharid commented Apr 1, 2020

@voisey the yaml setup should be modelled after the filters. See in particular

https://github.com/NNPDF/nnpdf/tree/master/validphys2/src/validphys/cuts

from importlib.resources import read_text

def default_filter_settings():
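The pattern being pointed at, as a sketch: ship the yaml inside the package and read it through importlib.resources, the same way default_filter_settings reads the cuts files. The package and file names in the comment are assumptions about what this PR would call them.

```python
from importlib.resources import read_text

def load_packaged_resource(package, resource):
    """Read a text data file that ships inside an installed package, so the
    lookup works for a conda install as well as a git checkout."""
    return read_text(package, resource)

# In this PR it would be something along the lines of (names assumed):
#   yaml.safe_load(read_text("validphys.scalevariations", "pointprescriptions.yaml"))
# Demonstration against a file guaranteed to exist in the standard library:
text = load_packaged_resource("importlib", "__init__.py")
print(isinstance(text, str) and len(text) > 0)  # -> True
```

The advantage over open() with a path relative to __file__ is that it works regardless of how the package is installed (zip, egg, site-packages).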

@wilsonmr
Contributor

wilsonmr commented Apr 1, 2020

Also, @wilsonmr why did you suggest having the point prescription-scales correspondence in text rather than yaml?

No no, it's still a yaml file; I mean word format as opposed to a list of numbers

@voisey
Contributor Author

voisey commented Apr 1, 2020

Yep, understood!

voisey added 2 commits April 2, 2020 12:26
…and point prescription-scale variation correspondence
…red point prescription instead of defining them in the production rule
@voisey
Contributor Author

voisey commented Apr 2, 2020

@Zaharid @wilsonmr Can you please tell me what you think of this now?

Comment thread validphys2/src/validphys/config.py Outdated
@wilsonmr
Contributor

wilsonmr commented Apr 2, 2020

does this definitely install the new files you added? I seem to remember that when you add new files in a new directory you might have to update setup.py around this part:

'cuts': ['*'],

because you're using a development installation, the code is looking at your git repo and finding these files, but when the corresponding conda package is installed, setup doesn't know that it needs to copy these files across unless you explicitly tell it to there.
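The setup.py change being described would look roughly like the fragment below. Only the 'cuts' entry is quoted from the thread; the 'scalevariations' key is a hypothetical name for the new data directory.

```python
# Sketch of the package_data mapping in setup.py: data files are only copied
# into an installed (e.g. conda) package if their directory is declared here.
package_data = {
    'cuts': ['*'],                  # existing entry, quoted above
    'scalevariations': ['*.yaml'],  # hypothetical entry for the new yaml files
}
```

Without this, a development (pip install -e) checkout finds the files via the git tree, but a regular install silently omits them, which is exactly the failure mode wilsonmr is warning about.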

@voisey
Contributor Author

voisey commented Apr 2, 2020

@wilsonmr Thanks for pointing that out too! I've implemented the changes you wanted

@Zaharid
Contributor

Zaharid commented Apr 3, 2020

This looks fine. It could use some documentation at this point.

@voisey
Contributor Author

voisey commented Apr 8, 2020

@Zaharid, regarding #664 (comment), what does "use this somewhere" mean? And by the latter bit, do you mean we should have a test that the yaml files I created exist/can be opened?

@voisey voisey closed this Apr 14, 2020
@voisey voisey reopened this Apr 14, 2020
@voisey
Contributor Author

voisey commented Apr 14, 2020

@Zaharid #664 (comment) ?

@@ -0,0 +1,15 @@
# IMPORTANT: scale combinations must be listed according to (muF, muR) in the following order:
# (1,1), (2,1), (0.5,1), (1,2), (1,0.5), (2,2), (0.5,0.5), (2,0.5), (0.5,2)
Contributor

Could we get away without requiring this? We should have enough information with the scales dictionary.

Contributor Author

@Zaharid @RosalynLP Let me know what you think about this comment. As I understand it, with all runcards involving the theory covmat in the past, one had to specify a list of theoryids in a particular order, otherwise one would get nonsense results. All this PR does is hardcode the mapping between point prescriptions and scale combinations, and then between scale combinations and theoryids, so that the user can specify a central theoryid and a point prescription and get back a list of theoryids in the correct order. The correct order is set in one of the yaml files (pointprescriptions.yaml), in which things are hardcoded. I would argue that this isn't too bad, considering it's not every day that someone will define a new point prescription, so I don't think that file will be touched very much

Comment thread validphys2/src/validphys/config.py Outdated
variations = [
    i['variations'] for i in scalevarsfor_list if i['theoryid'] == int(th)
][0]
thids = [j['theoryid'] for i in scales for j in variations if i == j['scales']]
Contributor

Also here, it seems it would be quite a bit clearer to build a dict indexed by a tuple of two scale multipliers and index with that.

Furthermore, we probably want an error message if a theory required for some point prescription is missing from scalevariationtheoryids.yaml. E.g. we may only have the theories for 3 point for a while.
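Both suggestions together can be sketched as follows (the data and file name are assumed, mirroring the shape of the yaml discussed above): build the dict once, keyed by the multiplier tuple, and fail with a pointer at the yaml file when a required theory is missing.

```python
# Hypothetical variations list in the shape used by scalevariationtheoryids.yaml
variations = [
    {"scales": {"muF": 0.5, "muR": 0.5}, "theoryid": 173},
    {"scales": {"muF": 2.0, "muR": 2.0}, "theoryid": 180},
]

# Index once by the (muF, muR) tuple instead of scanning nested lists
by_scales = {
    (v["scales"]["muF"], v["scales"]["muR"]): v["theoryid"] for v in variations
}

def theoryid_for(muf, mur):
    """Look up the scale-varied theory, with a readable error when absent."""
    try:
        return by_scales[(muf, mur)]
    except KeyError:
        raise KeyError(
            f"No scale-varied theory with (muF, muR) = ({muf}, {mur}); "
            "is it listed in scalevariationtheoryids.yaml?"
        ) from None

print(theoryid_for(0.5, 0.5))  # -> 173
```

Compared with the double list comprehension above, the dict lookup makes the missing-theory case a single except clause rather than a silently shorter list.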

Contributor Author

FYI I've now tried to address both of these points

@Zaharid
Contributor

Zaharid commented Apr 14, 2020

@Zaharid #664 (comment) ?

Never mind. I found my question answered!

Comment thread validphys2/src/validphys/config.py Outdated
Comment thread doc/sphinx/source/vp/theorycov/summary.rst Outdated
voisey added 3 commits April 15, 2020 15:38
…ted for a given central theoryid and the users wants to use one that is not implemented. Also, change notation of scale multipliers from muF and muR to k_F and k_R.
@voisey voisey changed the title [WIP] Hardcode theories needed for scale variations Hardcode theories needed for scale variations Apr 16, 2020
@voisey
Contributor Author

voisey commented Apr 16, 2020

It looks like the build is failing because it can't find yaml... https://travis-ci.com/github/NNPDF/nnpdf/jobs/320040651#L43119

Comment thread validphys2/src/validphys/config.py Outdated
import numbers
import copy
import os
import yaml
Contributor

I think you should do from reportengine.compat import yaml no?

Contributor Author

Thanks Mikey! Didn't realise this

Contributor

yeah I'm not sure it's documented anywhere or if that solves the problem but everywhere else in the code does it like that :P

@voisey
Contributor Author

voisey commented Apr 16, 2020

This now passes on Linux but not on Mac because of a timeout (quelle surprise). What's our official policy on this now? Is this okay?

@voisey
Contributor Author

voisey commented Apr 17, 2020

Having merged #725, both builds now pass

@RosalynLP
Contributor

@voisey I took a look at this again and think it looks good, and that we do need to keep the order of the scale variations as specified: this ensures that when we collect results over theoryids they end up in the right order, and therefore the deltas end up in the right order, so when constructing the prescriptions we get the right thing.

@voisey
Contributor Author

voisey commented Apr 17, 2020

Thanks for checking this @RosalynLP. Are you happy with this @Zaharid?

@scarrazza scarrazza merged commit 8f36222 into master Apr 22, 2020
@scarrazza scarrazza deleted the hardcode_scale_var_theories branch April 22, 2020 15:33
Development

Successfully merging this pull request may close these issues.

Hard code scale variation theories in validphys