
Hardcode theories needed for scale variations#664

Merged
scarrazza merged 23 commits into master from hardcode_scale_var_theories on Apr 22, 2020

Conversation

@voisey
Contributor

@voisey voisey commented Feb 25, 2020

Aims to close #454.

This is still a WIP because among other things the docs need to be updated, but I'll do this once we have settled on the code.

A runcard that worked with the old setup is this:

default_theory:
   - theoryid: 163

theoryids:
   - 163
   - 180
   - 173

fit: 190315_ern_nlo_central_163_global
use_cuts: "fromfit"

pdf:
    from_: fit

experiments:
  - experiment: NMC
    datasets:
      - dataset: NMCPD
      - dataset: NMC
  - experiment: SLAC
    datasets:
      - dataset: SLACP
      - dataset: SLACD

template_text: |

   {@with default_theory@}
      {@chi2_impact_custom@}
   {@endwith@}

actions_:
  - report(main=true)

whereas now we can have something like this:

default_theory:
   - theoryid: 163

theoryid: 163
point_prescription: '3 point'

theoryids:
    from_: scale_variation_theories

fit: 190315_ern_nlo_central_163_global
use_cuts: "fromfit"

pdf:
    from_: fit

experiments:
  - experiment: NMC
    datasets:
      - dataset: NMCPD
      - dataset: NMC
  - experiment: SLAC
    datasets:
      - dataset: SLACP
      - dataset: SLACD

template_text: |

   {@with default_theory@}
      {@chi2_impact_custom@}
   {@endwith@}

actions_:
  - report(main=true)

where you get an error if you try to use a point_prescription with a theoryid other than 163, and there are six allowed point_prescriptions: '3 point', '5 point', '5bar point', '7 point', '7 point (original)' and '9 point'. Also, note that the first runcard will still work with the new setup.

Let me know what you think. I was also wondering whether it would be sensible for us to change it so that the user no longer has to explicitly define what the default_theory is, but rather this is hardcoded too?
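For concreteness, the resolution the new runcard relies on can be sketched like this. Everything below is illustrative only: the dictionary contents and the function name stand in for the hardcoded mappings in the PR, and the assignment of scale combinations to particular theoryids is an assumption, not taken from the actual yaml files.

```python
# Illustrative sketch only: the mapping contents and names stand in for the
# PR's real hardcoded data rather than reproducing it.

# Point prescription -> ordered list of (muF, muR) scale multipliers
POINT_PRESCRIPTIONS = {
    "3 point": [(1, 1), (2, 2), (0.5, 0.5)],
}

# Central theoryid -> {(muF, muR): theoryid of the scale-varied theory}.
# The ids mirror the example runcard above; which id carries which scales
# is assumed here for illustration.
SCALE_VARIATION_THEORYIDS = {
    163: {(1, 1): 163, (2, 2): 180, (0.5, 0.5): 173},
}

def scale_variation_theories(theoryid, point_prescription):
    """Return theoryids in the order fixed by the point prescription."""
    if point_prescription not in POINT_PRESCRIPTIONS:
        raise ValueError(f"Unknown point prescription: {point_prescription!r}")
    if theoryid not in SCALE_VARIATION_THEORYIDS:
        raise ValueError(f"No scale variations recorded for theory {theoryid}")
    variations = SCALE_VARIATION_THEORYIDS[theoryid]
    return [variations[s] for s in POINT_PRESCRIPTIONS[point_prescription]]

print(scale_variation_theories(163, "3 point"))  # -> [163, 180, 173]
```

The point is that `theoryids: {from_: scale_variation_theories}` replaces a hand-ordered list: the ordering lives in one place instead of in every runcard.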

Comment thread validphys2/src/validphys/config.py Outdated
Contributor

@RosalynLP RosalynLP left a comment

This looks good, when we merge we should update the docs with the new runcard layout and point prescription flags.

Contributor

@wilsonmr wilsonmr left a comment

looks good to me, just update the docs as Rosalyn mentioned

@Zaharid
Contributor

Zaharid commented Mar 3, 2020

I think the relation that needs to be stored is which theory is a scale variation of which other rather than how to do the point prescription for a specific theory.

The format could be something like

scale_variations_for:
  - theoryid: 163
    variations:
        - scales: {"muF": 0.5, "muR": 1}
          theoryid: <some id> #This is likely wrong

Given the above, one can compute e.g. point prescriptions for any theory that has the required variations without having to write a hard to debug code like the one in this PR.

It would be nice if the specification was in a yaml file, a bit like it is done in the filters. Alternatively it could be a bunch of tables in the theorydb, but at some point we said we will move away from that due to it being unfriendly to git (but then didn't act on it...).

@voisey
Contributor Author

voisey commented Mar 12, 2020

I don't understand the benefit of your suggestion. Why would the user want to have multiple point prescriptions for a given theory and scale choice? At the end of the day, the user will want to use some specific point prescription for a certain theory set-up (hence the central theoryid). Surely then specifying only the point prescription and the central theory is the simplest thing to do?

Also, I don't see how this would prevent the code being hard to debug - right now the code depends on you giving the theoryids in a particular order, but that would be the case with your suggestion too, no?

@Zaharid
Contributor

Zaharid commented Mar 12, 2020

I don't understand the benefit of your suggestion. Why would the user want to have multiple point prescriptions for a given theory and scale choice? At the end of the day, the user will want to use

If we lived in a universe where NNLO scale variations would ever happen, you would have to specify which are the scale variations for theory 53 (or whatever), which is straightforward and easily machine-checked. You can also reuse the same specifications for things that are not point prescriptions, such as mcscales.

Also, I don't see how this would prevent the code being hard to debug - right now the code depends on you giving the theoryids in a particular order, but that would be the case with your suggestion too, no?

You don't have to find meaning in the formula of point prescriptions every time. You can write it once in terms of scale multipliers and have that work for arbitrary theories. Instead it is easy enough to check that a given scale varied theory has the right multipliers.

@voisey
Contributor Author

voisey commented Mar 12, 2020

I'm confused. Can you please write scale_variations_for in full as you would for a specific example, say 3pt?

@wilsonmr
Contributor

wilsonmr commented Mar 12, 2020

Surely there can be both?

So if I understand Zahari correctly: we currently only have fully scale-varied theories for NLO, right? So in that sense 5 point does mean:

elif pp == '5 point':
    thids = [163, 177, 176, 179, 174]

However, what would be great is if, for some base theory (in this case theory 163), we had all of the associated scale-varied theories stored in a yaml file somewhere sensible:

scale_variations_for:
  - theoryid: 163
    variations:
        - scales: {"muF": 0.5, "muR": 1}
          theoryid: <some id> # maybe 170 or whatever
        - scales: {"muF": 2, "muR": 1}
          theoryid: <some id> # next id

EDIT: clearly this is nothing to do with point prescription but is just a useful way of associating other theories with 163

so on so on... Then one could even also store point prescriptions, but in word format:

point_prescriptions:
  - name: 3 point
    scales: [{muF: 1, muR: 1}, {muF: 0.5, muR: 0.5}, etc.]

or whatever they are. Then it's easier to see qualitatively what the point prescription was supposed to be combining and if there is a mistake then it was in the labelling of the theory. Also it extends to the future when you have:

scale_variations_for:
  ...
  - theoryid: 2501 # NNNLO theory
    variations:
        - scales: {"muF": 0.5, "muR": 1}
          theoryid: 2561 # NNNLO scale varied
        etc.

I don't know if I have misunderstood, but it seems like both your ideas could be beneficial and are not mutually exclusive.

@Zaharid
Contributor

Zaharid commented Mar 16, 2020

Yeah, it is exactly what @wilsonmr says.
I think the variations are what we want to store (in the format of the comment by @wilsonmr above), and the point prescriptions should be computed given a set of scale variations.

I don't have a strong opinion as to whether there should be a yaml file specifying the point prescriptions or whether these should be directly hardcoded in Python in a way that works for any set of scale variations. In any case, it will be much easier for someone to discover what the point prescriptions do if they are expressed in terms of the variations rather than some convention of indexes in a list.
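One thing this buys can be sketched as follows (all names and scale values here are illustrative, not taken from the repository): given the variations stored for a theory, it becomes mechanical to check which prescriptions the available theories can support, with no index convention to decode.

```python
# Hypothetical stored variations for one central theory: each entry maps a
# (muF, muR) multiplier pair to the id of the corresponding scale-varied theory.
available = {(1.0, 1.0): 163, (0.5, 0.5): 173, (2.0, 2.0): 180}

# Prescriptions expressed directly as sets of scale-multiplier pairs
# (illustrative values for 3 point and 5 point).
prescriptions = {
    "3 point": {(1.0, 1.0), (0.5, 0.5), (2.0, 2.0)},
    "5 point": {(1.0, 1.0), (0.5, 1.0), (2.0, 1.0), (1.0, 0.5), (1.0, 2.0)},
}

def supported(available, prescriptions):
    """Names of the prescriptions whose every scale pair has a matching theory."""
    return [
        name for name, pairs in prescriptions.items()
        if pairs <= available.keys()
    ]

print(supported(available, prescriptions))  # -> ['3 point']
```

With the index-list convention this check has to be done by eye; here a missing theory simply drops the prescription out of the supported set.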

@scarrazza
Member

@voisey could you please implement the suggestion from @wilsonmr and @Zaharid, so we can proceed and merge this PR?

@voisey
Contributor Author

voisey commented Mar 31, 2020

@scarrazza Yes, it's on my to-do list

@voisey
Contributor Author

voisey commented Apr 1, 2020

Where would be the best place to store the yaml file that hardcodes the theory-scales correspondence? Are you happy with validphys/theorycovariance for the moment?

@voisey
Contributor Author

voisey commented Apr 1, 2020

Also, @wilsonmr why did you suggest having the point prescription-scales correspondence in text rather than yaml?

@Zaharid
Contributor

Zaharid commented Apr 1, 2020

@voisey the yaml setup should be modelled after the filters. See in particular

https://github.com/NNPDF/nnpdf/tree/master/validphys2/src/validphys/cuts

from importlib.resources import read_text

def default_filter_settings():
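The pattern being pointed at, as a sketch: ship the yaml inside the package and read it through importlib.resources, the same way default_filter_settings reads the cuts files. The package and file names in the comment are assumptions about what this PR would call them.

```python
from importlib.resources import read_text

def load_packaged_resource(package, resource):
    """Read a text data file that ships inside an installed package, so the
    lookup works for a conda install as well as a git checkout."""
    return read_text(package, resource)

# In this PR it would be something along the lines of (names assumed):
#   yaml.safe_load(read_text("validphys.scalevariations", "pointprescriptions.yaml"))
# Demonstration against a file guaranteed to exist in the standard library:
text = load_packaged_resource("importlib", "__init__.py")
print(isinstance(text, str) and len(text) > 0)  # -> True
```

The advantage over open() with a path relative to __file__ is that it works regardless of how the package is installed (zip, egg, site-packages).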

@wilsonmr
Contributor

wilsonmr commented Apr 1, 2020

Also, @wilsonmr why did you suggest having the point prescription-scales correspondence in text rather than yaml?

No no, it's still a yaml file; I mean word format as opposed to a list of numbers

@voisey
Contributor Author

voisey commented Apr 1, 2020

Yep, understood!

voisey added 2 commits April 2, 2020 12:26
…and point prescription-scale variation correspondence
…red point prescription instead of defining them in the production rule
@voisey
Contributor Author

voisey commented Apr 2, 2020

@Zaharid @wilsonmr Can you please tell me what you think of this now?

Comment thread validphys2/src/validphys/config.py Outdated
@wilsonmr
Contributor

wilsonmr commented Apr 2, 2020

does this definitely install the new files you added? I seem to remember that when you add new files in a new directory you might have to update setup.py around this part:

'cuts': ['*'],

because you're using a development installation, the code is looking at your git repo and finding these files, but when the corresponding conda package is installed, setup doesn't know that it needs to copy these files across unless you explicitly tell it to there.
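The setup.py change being described would look roughly like the fragment below. Only the 'cuts' entry is quoted from the thread; the 'scalevariations' key is a hypothetical name for the new data directory.

```python
# Sketch of the package_data mapping in setup.py: data files are only copied
# into an installed (e.g. conda) package if their directory is declared here.
package_data = {
    'cuts': ['*'],                  # existing entry, quoted above
    'scalevariations': ['*.yaml'],  # hypothetical entry for the new yaml files
}
```

Without this, a development (pip install -e) checkout finds the files via the git tree, but a regular install silently omits them, which is exactly the failure mode wilsonmr is warning about.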

@voisey
Contributor Author

voisey commented Apr 2, 2020

@wilsonmr Thanks for pointing that out too! I've implemented the changes you wanted

@Zaharid
Contributor

Zaharid commented Apr 3, 2020

This looks fine. It could use some documentation at this point.

@voisey
Contributor Author

voisey commented Apr 8, 2020

@Zaharid, regarding #664 (comment), what does "use this somewhere" mean? And by the latter bit, do you mean we should have a test that the yaml files I created exist/can be opened?

@voisey voisey closed this Apr 14, 2020
@voisey voisey reopened this Apr 14, 2020
@voisey
Contributor Author

voisey commented Apr 14, 2020

@Zaharid #664 (comment) ?

@@ -0,0 +1,15 @@
# IMPORTANT: scale combinations must be listed according to (muF, muR) in the following order:
# (1,1), (2,1), (0.5,1), (1,2), (1,0.5), (2,2), (0.5,0.5), (2,0.5), (0.5,2)
Contributor

Could we get away without requiring this? We should have enough information with the scales dictionary.

Contributor Author

@Zaharid @RosalynLP Let me know what you think about this comment. As I understand it, with all runcards involving the theory covmat in the past, one had to specify a list of theoryids in a particular order, otherwise one would get nonsense results. All this PR does is hardcode the mapping between point prescriptions and scale combinations, and then between scale combinations and theoryids, so that the user can specify a central theoryid and a point prescription and get back a list of theoryids in the correct order. The correct order is set in one of the yaml files (pointprescriptions.yaml), in which things are hardcoded. I would argue that this isn't too bad, considering it's not every day that someone will define a new point prescription, so I don't think that file will be touched very much

Comment thread validphys2/src/validphys/config.py Outdated
variations = [
    i['variations'] for i in scalevarsfor_list if i['theoryid'] == int(th)
][0]
thids = [j['theoryid'] for i in scales for j in variations if i == j['scales']]
Contributor

Also here, it seems it would be quite a bit clearer to build a dict indexed by a tuple of two scale multipliers and index with that.

Furthermore, we probably want an error message if a theory required for some point prescription is missing from scalevariationtheoryids.yaml. E.g. we may only have the theories for 3 point for a while.
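Both suggestions together can be sketched as follows (the data and file name are assumed, mirroring the shape of the yaml discussed above): build the dict once, keyed by the multiplier tuple, and fail with a pointer at the yaml file when a required theory is missing.

```python
# Hypothetical variations list in the shape used by scalevariationtheoryids.yaml
variations = [
    {"scales": {"muF": 0.5, "muR": 0.5}, "theoryid": 173},
    {"scales": {"muF": 2.0, "muR": 2.0}, "theoryid": 180},
]

# Index once by the (muF, muR) tuple instead of scanning nested lists
by_scales = {
    (v["scales"]["muF"], v["scales"]["muR"]): v["theoryid"] for v in variations
}

def theoryid_for(muf, mur):
    """Look up the scale-varied theory, with a readable error when absent."""
    try:
        return by_scales[(muf, mur)]
    except KeyError:
        raise KeyError(
            f"No scale-varied theory with (muF, muR) = ({muf}, {mur}); "
            "is it listed in scalevariationtheoryids.yaml?"
        ) from None

print(theoryid_for(0.5, 0.5))  # -> 173
```

Compared with the double list comprehension above, the dict lookup makes the missing-theory case a single except clause rather than a silently shorter list.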

Contributor Author

FYI I've now tried to address both of these points

@Zaharid
Contributor

Zaharid commented Apr 14, 2020

@Zaharid #664 (comment) ?

Never mind. I found my question answered!

Comment thread validphys2/src/validphys/config.py Outdated
Comment thread doc/sphinx/source/vp/theorycov/summary.rst Outdated
voisey added 3 commits April 15, 2020 15:38
…ted for a given central theoryid and the users wants to use one that is not implemented. Also, change notation of scale multipliers from muF and muR to k_F and k_R.
@voisey voisey changed the title [WIP] Hardcode theories needed for scale variations Hardcode theories needed for scale variations Apr 16, 2020
@voisey
Contributor Author

voisey commented Apr 16, 2020

It looks like the build is failing because it can't find yaml... https://travis-ci.com/github/NNPDF/nnpdf/jobs/320040651#L43119

Comment thread validphys2/src/validphys/config.py Outdated
import numbers
import copy
import os
import yaml
Contributor

I think you should do from reportengine.compat import yaml no?

Contributor Author

Thanks Mikey! Didn't realise this

Contributor

yeah I'm not sure it's documented anywhere or if that solves the problem but everywhere else in the code does it like that :P

@voisey
Contributor Author

voisey commented Apr 16, 2020

This now passes on Linux but not on Mac because of a timeout (quelle surprise). What's our official policy on this now? Is this okay?

@voisey
Contributor Author

voisey commented Apr 17, 2020

Having merged #725, both builds now pass

@RosalynLP
Contributor

@voisey I took a look at this again and think it looks good, and that we do need to keep the order of the scale variations as specified: this ensures that when we collect results over theoryids they end up in the right order, and therefore the deltas end up in the right order, so when constructing the prescriptions we get the right thing.

@voisey
Contributor Author

voisey commented Apr 17, 2020

Thanks for checking this @RosalynLP. Are you happy with this @Zaharid?

@scarrazza scarrazza merged commit 8f36222 into master Apr 22, 2020
@scarrazza scarrazza deleted the hardcode_scale_var_theories branch April 22, 2020 15:33
Development

Successfully merging this pull request may close these issues.

Hard code scale variation theories in validphys