Only require necessary info in runcards by voisey · Pull Request #973 · NNPDF/nnpdf

voisey · 2020-10-16T08:49:27Z

One step towards closing #536.

So far this just removes the requirement for closuretest and fakedata to be specified in a runcard when running vp-setupfit. I can imagine there being a reason for what I've done being suboptimal, but I thought I'd just open it to the floor anyway so that you can just tell me 😄

…pecified

voisey · 2020-10-16T09:09:57Z

@scarlehoff Maybe it would be useful if you could list the parameters that you think the code requires you to have in the runcard but that n3fit does not actually require, so that we can look at editing the checks?

wilsonmr · 2020-10-16T09:12:45Z

Seems reasonable enough to me. I think there are some cases in code which possibly need to be updated to not assume that key exists:

I think some of the closuretest.closure_checks assume that this key is always present so will fail with odd errors. @Zaharid can one control which order checks get executed? Would be good if we had the check if fit is closure firing first and then the rest can assume that closuretest key exists. But if not all the checks will need to be changed.

Also the eff_exponents.iterated_runcard_yaml assumes this key is there.

scarlehoff · 2020-10-16T09:22:25Z

The seed and rnalgo flags are not used by n3fit. Also the stopping dictionary that was used in nnfit.

voisey · 2020-10-16T09:36:50Z

Seems reasonable enough to me. I think there are some cases in code which possibly need to be updated to not assume that key exists:

I think some of the closuretest.closure_checks assume that this key is always present so will fail with odd errors. @Zaharid can one control which order checks get executed? Would be good if we had the check if fit is closure firing first and then the rest can assume that closuretest key exists. But if not all the checks will need to be changed.

Also the eff_exponents.iterated_runcard_yaml assumes this key is there.

Good points!

voisey · 2020-10-16T09:40:41Z

The seed and rnalgo flags are not used by n3fit. Also the stopping dictionary that was used in nnfit.

Okay, as a first thing vp-setupfit seems to run fine when I remove all of these. I guess you're concerned that if these are removed from the runcard something else breaks further down the line, e.g. some vp actions that are needed in processing/analysing the fit after it's been run?

scarlehoff · 2020-10-16T09:43:15Z

To be honest I don't remember if I didn't remove them because something broke a long time ago or whether I was just afraid to do so.

Zaharid · 2020-10-16T11:57:45Z

@wilsonmr The check order on a given action is deterministic and is inner to outer, I believe.

…closuretest namespace is always there

voisey · 2020-10-16T12:15:47Z

@wilsonmr The check order on a given action is deterministic and is inner to outer, I believe.

What does this mean for what we should do here? Don't the following checks mean we're okay? https://github.com/NNPDF/nnpdf/blob/master/validphys2/src/validphys/closuretest/closure_checks.py#L24 and https://github.com/NNPDF/nnpdf/blob/master/validphys2/src/validphys/closuretest/closure_checks.py#L52. I've probably just not fully understood

wilsonmr · 2020-10-16T12:33:29Z

can I get back to you on that? I need to check in more detail

voisey · 2020-10-16T12:38:00Z

@wilsonmr Yep, of course, there's no rush!

voisey · 2020-10-16T12:50:51Z

To be honest I don't remember if I didn't remove them because something broke a long time ago or whether I was just afraid to do so.

Yep, I understand. From having a (quick) look at where those three things appear in the code I didn't see anything that would obviously break when getting rid of them, but someone else should check too. Just to be sure, n3fit doesn't care about these two lines, does it?

nnpdf/nnpdfcpp/src/common/src/nnpdfsettings.cc

Line 221 in fc5be3f

RandomGenerator::InitRNG(Get("fitting","rngalgo").as<int>(),

wilsonmr · 2020-10-20T10:23:31Z

Hi @voisey in closuretest.closure_checks.py the check if fit is closure needs to be changed to first get the closuretest from input and if None then raise check error and then check that fakedata inside of it is True

I'm not sure why in closuretest.closure-results the action bootstrap_bias_experiment isn't decorated with this check. Although I think the production rule fitunderlyinglaw "safely" checks for this (in the sense that it will raise a config error if closuretest doesn't exist) - I suppose that this could in theory have a better error message but is fine for now.

I think that the check_fits_areclosures needs to be moved further in on a few functions (and will also need to be edited like check_fit_isclosure in fact it should probably call that check on each fit rather than reproducing the code)

I think that in closuretest.multiclosure the check also willl need to be moved further in but then should be fine. Finally in closuretest.multiclosure_pdfs there are no checks which was probably an oversight although I think it's fairly safe because of the multiclosure_underlyinglaw production rule also raising config errors..

Are you happy to look into this? I don't think it's too much work - just refining some checks and rearranging the order in which they are executed mainly. If not I can take a look at some point

voisey · 2020-10-21T10:40:29Z

Thanks for looking into this @wilsonmr. I'm happy to look at doing the things you mention. I'm not sure when but it's on my to do list

wilsonmr · 2020-10-21T10:49:53Z

I guess this isn't a blocker for anything

voisey · 2020-10-26T16:06:39Z

@wilsonmr I'd like to check something with you regarding what you say about moving checks further in. For example, if I run fits_datasets_bias_variance on a fit where I've removed the closuretest namespace from the runcard, I get the following:

[ERROR]: Bad configuration encountered:
fit: 191015-mw-002 does not have a `closuretest` namespace in runcard

where I suppose the error comes from the production rule here: https://github.com/NNPDF/nnpdf/blob/master/validphys2/src/validphys/config.py#L269.

Now I suppose what you're saying is that I should move check_fits_areclosures so that it is checked before the production rule is executed, is that correct? If so, could you tell me specifically where it should be moved to? I've tried moving it around but I always seem to get the error from the production rule.

wilsonmr · 2020-10-26T16:36:54Z

Although I think the production rule fitunderlyinglaw "safely" checks for this (in the sense that it will raise a config error if closuretest doesn't exist) - I suppose that this could in theory have a better error message but is fine for now.

By this I meant that the production rule raising the error is fine (still happens at production time and the error explains pretty clearly what has happened) however you are welcome to write more in the error message if you think it wouldn't be clear to a user why this has happened (I suppose it doesn't say why we want closuretest namespace)

I don't think any moving of the checks could remedy this because the checks look at the arguments of the action (but the resources are clearly resolved at that point because e.g check_pdf_is_montecarlo receives an actual PDF object not just the name of the pdf you specified in the runcard, and so any production rule which is a dependency of that action has probably already been executed)

I was worried about potential cases where there was no dependency on the underlying law production rule but checks which assumed that there was a closuretest namespace, however it appears these don't exist so probably we are fine actually (the production rule will catch everything)

So actually the only thing to worry about is probably in closure_results because for example variance_dataset has check_fit_isclosure which currently will look like this if closuretest doesn't exist:

$ validphys test.yml -o testout
[WARNING]: Output folder exists: /Users/michael/Documents/NNPDF_Coding/validphysapi/testout Overwriting contents

----

Traceback (most recent call last):
  File "/Users/michael/conda/envs/nnpdf/bin/validphys", line 33, in <module>
    sys.exit(load_entry_point('validphys', 'console_scripts', 'validphys')())
  File "/Users/michael/nnpdfgit/nnpdf/validphys2/src/validphys/scripts/main.py", line 10, in main
    vp.main()
  File "/Users/michael/nnpdfgit/nnpdf/validphys2/src/validphys/app.py", line 153, in main
    a.main()
  File "/Users/michael/conda/envs/nnpdf/lib/python3.7/site-packages/reportengine/app.py", line 376, in main
    self.run()
  File "/Users/michael/nnpdfgit/nnpdf/validphys2/src/validphys/app.py", line 149, in run
    super().run()
  File "/Users/michael/conda/envs/nnpdf/lib/python3.7/site-packages/reportengine/app.py", line 342, in run
    rb.resolve_fuzzytargets()
  File "/Users/michael/conda/envs/nnpdf/lib/python3.7/site-packages/reportengine/resourcebuilder.py", line 370, in resolve_fuzzytargets
    self.resolve_fuzzytarget(target)
  File "/Users/michael/conda/envs/nnpdf/lib/python3.7/site-packages/reportengine/resourcebuilder.py", line 379, in resolve_fuzzytarget
    self.process_targetspec(fuzzytarget.name, spec, fuzzytarget.extraargs)
  File "/Users/michael/conda/envs/nnpdf/lib/python3.7/site-packages/reportengine/resourcebuilder.py", line 390, in process_targetspec
    gen.send(None)
  File "/Users/michael/conda/envs/nnpdf/lib/python3.7/site-packages/reportengine/resourcebuilder.py", line 445, in _process_requirement
    yield from self._make_node(name, nsspec, extraargs, parents)
  File "/Users/michael/conda/envs/nnpdf/lib/python3.7/site-packages/reportengine/resourcebuilder.py", line 461, in _make_node
    yield from self._make_callspec(f, name, nsspec, extraargs, parents)
  File "/Users/michael/conda/envs/nnpdf/lib/python3.7/site-packages/reportengine/resourcebuilder.py", line 551, in _make_callspec
    environment=self.environment)
  File "/Users/michael/conda/envs/nnpdf/lib/python3.7/site-packages/reportengine/checks.py", line 131, in check
    res = saturate(check_func, ns)
  File "/Users/michael/conda/envs/nnpdf/lib/python3.7/site-packages/reportengine/utils.py", line 53, in saturate
    return func(**kwargs)
  File "/Users/michael/nnpdfgit/nnpdf/validphys2/src/validphys/closuretest/closure_checks.py", line 26, in check_fit_isclosure
    if not fit.as_input()["closuretest"]["fakedata"]:
KeyError: 'closuretest'

----

[CRITICAL]: A critical error ocurred. This is likely due to one of the following reasons:

 - A bug in validphys.
 - Corruption of the provided resources (e.g. incorrect plotting files).
 - Cosmic rays hitting your CPU and altering the registers.

The traceback above should help determine the cause of the problem. If you
believe this is a bug in validphys (please discard the cosmic rays first),
please open an issue on GitHub<https://github.com/NNPDF/nnpdf/issues>,
including the contents of the following file:

/var/folders/jf/s7bvdj957y925nr1kkn1fr400000gn/T/validphys-crash-dpqfksdv

voisey · 2020-11-11T10:50:33Z

@wilsonmr I'm trying to call check_fit_isclosure in check_fits_areclosures to simplify things by doing something like:

@make_argcheck
def check_fits_areclosures(fits):
    for fit in fits:
        check_fit_isclosure(fit)

but this doesn't work in that the check isn't executed and the code just runs. Do you know what I'm doing wrong here? Probably something silly

wilsonmr · 2020-11-11T11:02:07Z

oh yeah @Zaharid showed me something like this before. The decorator means the check can't be called directly.

Instead try check_fit_isclosure.__wrapped__(fit)

see #977 (comment) for a more intelligent explanation

voisey · 2020-11-11T11:22:33Z

Great, that works now, thanks!

voisey · 2020-11-11T12:01:57Z

@wilsonmr I've gone through closure_results and multiclosure and with 31be492 I think I've taken care of the relevant cases, but please tell me what you think. Regarding multiclosure_pdf, you said that you thoughts things were okay there, since the multiclosure_underlyinglaw production rule raises config errors, but I just tested a few functions and I was able to get things like fits_covariance_matrix_totalpdf to run without any errors being raised when using a fit with fakedata: false. Do you agree that checks should therefore be added to this file?

voisey · 2020-12-02T11:58:46Z

@wilsonmr Do you have any thoughts on the above?

wilsonmr · 2020-12-08T17:00:10Z

Sorry for ignoring you, my github notifications were going straight to spam for a while. So I think those funcitons are ok because they're technically nit unique to closure tests. So I think we can leave them! A covmat is a covmat after all..

voisey · 2020-12-09T09:58:54Z

No problem! Ah of course, I hadn't thought of that! At this point I think it could be good to produce a fit with all of these pieces removed (closuretest, seed, rnalgo, stopping) to check that nothing breaks, do you agree @wilsonmr? If so, would it be cheeky to remove these from n3fit/runcards/developing.yml on this branch, get the fit bot to run it and then remove the commit?

voisey · 2020-12-09T10:01:31Z

Actually, if the bot were successful I suppose we may want to keep the changes to the runcard so that it's as simple as possible, so the last point could be scrapped

wilsonmr · 2020-12-09T10:06:19Z

Actually, if the bot were successful I suppose we may want to keep the changes to the runcard so that it's as simple as possible, so the last point could be scrapped

I was about to say this

wilsonmr · 2020-12-09T10:07:21Z

I'll double check the closure stuff by running a few reports but it looks good to me

…veloping.yml

voisey · 2020-12-09T10:31:53Z

I can also remove several other things from developing.yml I think, namely fitmethod, nmutants, nnodes, and I'm guessing ngen and paramtype. Can you please confirm @scarlehoff?

voisey · 2020-12-09T10:32:30Z

Sounds good, thanks @wilsonmr!

github-actions · 2020-12-10T13:58:49Z

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Fit Name: NNBOT-e6e6e7a1-2020-12-10
Fit Report: https://vp.nnpdf.science/F7JmFCxOSpawS1jDvJfgHQ==
Fit Data: https://data.nnpdf.science/fits/NNBOT-e6e6e7a1-2020-12-10.tar.gz

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

wilsonmr · 2020-12-10T14:03:29Z

Hmm starting to think we should actually update the fit the bot compares to instead of having to dig out #1019 everytime to check nothing was changed

voisey · 2020-12-11T10:14:54Z

That certainly seems like the safer thing to do. In any case, this shows that nothing breaks in the fit or in comparefits when removing this stuff from the runcard. At this point, are you happy for me to go through the default runcards and remove all of this info @scarlehoff @wilsonmr?

scarlehoff · 2020-12-11T10:16:32Z

Yes, please do. wrt the bot, we have #1025, but in the meantime I agree, it makes sense to update the current baseline.

voisey · 2020-12-14T11:40:12Z

If the fit bot tag is attached to a PR does it run with every commit? I've removed it here just to be sure

scarlehoff · 2020-12-14T11:41:15Z

No, it should only run when tagged (and to run it again you have to untag-tag)

voisey · 2020-12-14T11:41:47Z

Okay, good to know, thanks!

voisey · 2020-12-14T11:48:32Z

I've just removed the unnecessary fields from the runcards in the n3fit/runcards folders. Can a couple of people please review this now? I think it would be good to have this merged before the 4.0 baseline is run and to make sure we're careful to remove these fields from the 4.0 runcards to avoid unnecessary info being propagated to all future 4.0 variants

scarlehoff · 2020-12-14T12:11:00Z

+            f"The `fakedata` key does not exist in the `closuretest` namespace of {fit}'s runcard. "
+            f"{fit} is therefore not suitable for closure-test studies."
+        )
+    if not fitinfo["closuretest"]["fakedata"]:


Ok, maybe it is not related specifically to this PR but what's the point of having a key that is mandatory and that is only allowed to be set to true?

because it essentially is checking that somebody didn't give a normal fit but try to calc a closure estimator. Because otherwise it could fail silently since - up until this PR - all fits have had a fakepdf which does absolutely nothing.

As the name of the check suggests

Remove requirement by vp-setupfit that closuretest and fakedata are s…

5a3a3e8

…pecified

voisey requested review from Zaharid, scarlehoff and wilsonmr October 16, 2020 08:49

Update eff_exponents.iterated_runcard_yaml so it doesn't assume that …

43bba1f

…closuretest namespace is always there

Make n3fit run without closuretest namespace in runcard

9777ac4

Expand check_fit_isclosure so that it gives useful error messages

d496832

Call check_fit_isclosure in check_fits_areclosures to simplify things

74c1b0d

Add extra checks to closure_results.py

31be492

voisey added 2 commits December 9, 2020 10:24

Merge branch 'master' into only_require_necessary_info

1295b6b

Remove unnecessary info from n3fit runcard that fit bot sees, i.e. de…

258f480

…veloping.yml

voisey added 2 commits December 9, 2020 15:57

Merge branch 'master' into only_require_necessary_info

a54abab

Remove more unnecessary info from developing.yml

b59b0e9

voisey added the run-fit-bot Starts fit bot from a PR. label Dec 10, 2020

voisey removed the run-fit-bot Starts fit bot from a PR. label Dec 14, 2020

Remove unnecessary fields from n3fit runcards

595ebd4

scarlehoff approved these changes Dec 14, 2020

View reviewed changes

RosalynLP self-requested a review December 16, 2020 15:31

scarrazza merged commit 47c1a11 into master Dec 16, 2020

scarrazza deleted the only_require_necessary_info branch December 16, 2020 15:32

voisey mentioned this pull request Dec 16, 2020

Fit runcards should not have innecesary information #536

Closed

wilsonmr mentioned this pull request Mar 5, 2021

Checks for common errors / defaults #1129

Closed

Conversation

voisey commented Oct 16, 2020

Uh oh!

voisey commented Oct 16, 2020

Uh oh!

wilsonmr commented Oct 16, 2020

Uh oh!

scarlehoff commented Oct 16, 2020

Uh oh!

voisey commented Oct 16, 2020

Uh oh!

voisey commented Oct 16, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

scarlehoff commented Oct 16, 2020

Uh oh!

Zaharid commented Oct 16, 2020

Uh oh!

voisey commented Oct 16, 2020

Uh oh!

wilsonmr commented Oct 16, 2020

Uh oh!

voisey commented Oct 16, 2020

Uh oh!

voisey commented Oct 16, 2020

Uh oh!

wilsonmr commented Oct 20, 2020

Uh oh!

voisey commented Oct 21, 2020

Uh oh!

wilsonmr commented Oct 21, 2020

Uh oh!

voisey commented Oct 26, 2020

Uh oh!

wilsonmr commented Oct 26, 2020

Uh oh!

voisey commented Nov 11, 2020

Uh oh!

wilsonmr commented Nov 11, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

voisey commented Nov 11, 2020

Uh oh!

voisey commented Nov 11, 2020

Uh oh!

voisey commented Dec 2, 2020

Uh oh!

wilsonmr commented Dec 8, 2020

Uh oh!

voisey commented Dec 9, 2020

Uh oh!

voisey commented Dec 9, 2020

Uh oh!

wilsonmr commented Dec 9, 2020

Uh oh!

wilsonmr commented Dec 9, 2020

Uh oh!

voisey commented Dec 9, 2020

Uh oh!

voisey commented Dec 9, 2020

Uh oh!

github-actions Bot commented Dec 10, 2020

Uh oh!

wilsonmr commented Dec 10, 2020

Uh oh!

voisey commented Dec 11, 2020

Uh oh!

scarlehoff commented Dec 11, 2020

Uh oh!

voisey commented Dec 14, 2020

Uh oh!

scarlehoff commented Dec 14, 2020

Uh oh!

voisey commented Dec 14, 2020

Uh oh!

voisey commented Dec 14, 2020

Uh oh!

scarlehoff Dec 14, 2020

Choose a reason for hiding this comment

Uh oh!

voisey commented Oct 16, 2020 •

edited

Loading

wilsonmr commented Nov 11, 2020 •

edited

Loading