Remove useless namespaces / change runcard names by scarlehoff · Pull Request #1211 · NNPDF/nnpdf

scarlehoff · 2021-05-04T11:50:49Z

In this PR I will try to address #1166 so that the runcard specification when the code is as sensible as possible.

I was not sure whether it was better to have a different PR for every specific change which would make it easier to review but since the runcards will look differently before and after this PR I think it might be better to have them all in one...

(or all separated but they need to be merged at the same time with potential for out of sync...)

The plan is that after this PR (if) is merged the runcards before and after will be incompatible, which means we 100% want to do this before going public...

I will list here the changes I've implemented. I'll try to separate them in different commits so they can be easily cherry picked

poslambda -> maxlambda: this value stopped long ago being the "positivity lambda" to become the "max. positivity lambda" and with the inclusion of integrability not even the "positivity" part is right anymore.
Moved tensorboard and save/load outside of the fitting dictionary. Not even sure what I was thinking about when I put them in there. I guess encapsulating the n3fit piece or smt.
Moved parameters outside fitting.

Things I'll do between tomorrow and Friday I guess so people can complain beforehand:

Put the fitbasis and basis inside the same namespace (I find it annoying that they are two different items) and add the sum_rules also there (and modify the check so that only the right sum rules can be used!) --> this is fitting now.
Move genrep outside the fitting dictionary

Feel free to suggest other changes I might have forgotten about (this is the time to do so!)

"hacks" done to keep backwards compatibility will be marked with a #BCH comment.

Note: not fixing the test for now as it requires to take a decision on how hard to break backwards compatibility.
Note to self: before merging all the runcards need to be rechecked to ensure they are not out of sync (and that includes the ones in the runcards PR!)

Zaharid · 2021-05-04T15:46:31Z

Seems to me one PR is better here. But seems to me if we want to do this (which I think we do), then it is a blocker for the next tag.

scarlehoff · 2021-05-05T09:57:34Z

I actually branched out from #1153 thinking it could be the "open soruce tag" but not necessarily the "paper fits tag" but it should be easy enough to rebase to master.

I won't be able to attend the CM today but take my official stance as "whatever you guys decide".

wilsonmr · 2021-05-05T11:31:31Z

combining the namespaces seems sensible. There is currently a slightly annoying feature that validphys sometimes uses these as namespaces but also uses them as dictionaries which are taken from the runcard directly. I think the latter can be changed (e.g. in effective exponents) to use the former. We'll have to take care to keep bakwards compatibility at least until 4.0 because we still want to be able to run old fits until then.

Same applies to moving genrep outside of the fitting dictionary. We just need to work out how we can expect it either inside or outside (for now)

I think the filterseed should possibly stay with the closure "dict" for now since it really only applies to that, but the other seeds can, by all means, be grouped in a more sensible manner.

scarrazza · 2021-05-05T14:44:55Z

@scarlehoff we have discussed about this today, and we agree that we should include the proposed changes asap.

scarlehoff · 2021-05-06T07:31:23Z

@scarlehoff we have discussed about this today, and we agree that we should include the proposed changes asap.

What was the agreement on how hard to break backwards compatibility? If I don't need to care about nnfit or old runcards I can finish this much quicker.

Zaharid · 2021-05-06T08:14:34Z

@scarlehoff I think we can live without older n3fit runcards, but if nnfit could still work at least with comparefits, that would be ideal. It seems to me it should be possible to hack fit.as_input to support everything though.

wilsonmr · 2021-05-06T08:54:04Z

@scarlehoff I think we can live without older n3fit runcards, but if nnfit could still work at least with comparefits, that would be ideal. It seems to me it should be possible to hack fit.as_input to support everything though.

This is true. I think what I had in mind yesterday was definitely take all n3fit specific things outside of the bits like fitting etc. Then we should take a look at things like genrep and see what the damage is:

EDIT: sorry I said fitbasis but that's actually not an n3fit thing, I meant the fitting dict

git grep genrep
<some runcards>
...
<n3fit (only a few places) plus it's handled properly (prsed from namespace) so I don't think things will break>
validphys2/src/validphys/n3fit_data.py:def replica_mcseed(replica, mcseed, genrep):
validphys2/src/validphys/n3fit_data.py:    if not genrep:
validphys2/src/validphys/n3fit_data.py:    >>> API.training_mask_table(dataset_inputs=ds_inp, replicas=reps, trvlseed=123, theoryid=162, use_cuts="nocuts", mcseed=None, genrep=False)
validphys2/src/validphys/pseudodata.py:    trvlseed, mcseed, genrep = [
validphys2/src/validphys/pseudodata.py:        for i in ["trvlseed", "mcseed", "genrep"]
validphys2/src/validphys/pseudodata.py:    # common_data_reader expects None if genrep is False
validphys2/src/validphys/pseudodata.py:    if genrep:
validphys2/src/validphys/pseudodata.py:            replica_mcseed(rep, mcseed, genrep) for rep in replica
validphys2/src/validphys/tests/regressions/dummy_closure_runcard.yaml:  genrep: true          # true = generate MC replicas, false = use real data

The pseudodata function should probably be replaced at some point to use the new functions which already exist in validphys (n3fit_data). But otherwise I think nothing would break so we can basically just change the runcards..

I think there will only be very few cases where we have to change something in validphys or choose to leave it:

e.g.

git grep fitbasis
<runcards>
<collects over production rule>
...
src/validphys/config.py:        is set using the key ``fitbasis``, but it is exposed to validphys
src/validphys/config.py:        set to not conflict with the existing ``fitbasis`` runcard key.
src/validphys/config.py:        basis = fitting["fitbasis"]
src/validphys/eff_exponents.py:        pdfs. If this flavour doesn't exist in the current pdf's fitbasis or
src/validphys/eff_exponents.py:        if fitting["fitbasis"] == basis:
src/validphys/eff_exponents.py:        fitting["fitbasis"],
src/validphys/eff_exponents.py:    if fit is not None and fit.as_input()["fitting"]["fitbasis"] == basis:
src/validphys/eff_exponents.py:    basis = filtermap["fitting"]["fitbasis"]
<not actions>
<runcards>

The stuff in config can be changed to use the parse_from properly and/or we can maybe just get old runcards and put the fitbasis in the base namespace. As @Zaharid just suggested.

wilsonmr · 2021-05-06T08:55:37Z

The short version of that is that I think you can be quite brutal and not too many things break, if they do I don't think they will be hard to fix.

scarlehoff · 2021-05-06T09:24:41Z

Ok, I'll do my best to first get the "final version" and then I'll encapsulate the backwards compatibility hacks as much as I can so when we decide to pull the plug it can be done easily.

scarlehoff · 2021-05-06T10:27:31Z

At this point the test pass in my computer, I wonder whether Travis will. Not sure about the seeds namespace since it will be just a container... might make more sense to have them simply in the outside. I don't know.

Edit: also, we might want it to change in order to force inconsistency but I think it is actually fine if the basis-fitbasis-sumrules dictionary is called fitting...

scarlehoff · 2021-05-07T08:13:50Z

@scarrazza why can't the tests be restarted anymore? This last commit passed everything but one (and because of docker)

Zaharid · 2021-05-07T08:24:47Z

On 7 May 2021, at 10:14, Juan M. Cruz-Martinez ***@***.***> wrote: @scarrazza why can't the tests be restarted anymore? This last commit passed everything but one (and because of docker)

Bitcoin most likely.

…

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

scarlehoff · 2021-05-07T12:09:11Z

Some questions and requests:

Could we merge Fit many replicas in parallel #1153 ?
Is it fine to leave the basis/fitbasis/sum_rule dictionary as fitting or would you rather use some other name so it is explicitly different?
I think the docs/code is already finished, I will go through the runcards now and change them all, but the code/docs can already be reviewed.

Zaharid · 2021-05-07T12:50:41Z

I guess at this point delaying #1153 is counter productive and besides I suppose this counts as a separate change wrt the data additions which is what we agreed to hold it for.

How does the maxlambda change affect compatibility? Can we run vp with old fits?

scarlehoff · 2021-05-07T12:55:17Z

How does the maxlambda change affect compatibility? Can we run vp with old fits?

No, you're right. I should not "fix the error" but rather allow it (and maybe print a "deprecated" warning)

github-actions · 2021-05-07T19:31:30Z

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Fit Name: NNBOT-9fab19dbd-2021-05-07
Fit Report: https://vp.nnpdf.science/nbmOVrj2RQqemgvCNEgpSQ==
Fit Data: https://data.nnpdf.science/fits/NNBOT-9fab19dbd-2021-05-07.tar.gz

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

wilsonmr

some small comments but on the whole looks fine.

Zaharid · 2021-05-14T10:08:43Z

Would be good to have explicit green light from at least @wilsonmr.

To clarify, is it correct that:

Old nnfit runcards work with no changes.
All fits that used to work in comparefits still do.

Zaharid · 2021-05-14T10:11:58Z

FWIW the code changes look reasonable to me.

scarlehoff · 2021-05-14T10:21:02Z

Old nnfit runcards work with no changes.

Yes. Tested with nnpdfcpp/config/NNPDF31_nlo_as_0118.yml (haven't run a full fit, just check that it can run)

All fits that used to work in comparefits still do.

All fits that I've tested do. I've run the bots a few times (it's running now) and I just sent a fit to the cluster which will be compared to a March fit.

github-actions · 2021-05-14T13:09:46Z

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Fit Name: NNBOT-ff104b583-2021-05-14
Fit Report: https://vp.nnpdf.science/OyjtJSVDQmCSJZlRPeGQTg==
Fit Data: https://data.nnpdf.science/fits/NNBOT-ff104b583-2021-05-14.tar.gz

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

scarlehoff · 2021-05-14T13:38:53Z

The full (only 50 replicas) fit seems reasonable as well: https://vp.nnpdf.science/qcvzhloyTJKn0eByukqlsg==/

wilsonmr · 2021-05-14T13:42:55Z

Sorry had a busy day I will look at this ASAP

wilsonmr

subject to the suggestions I made (or strong objection to them) I think this looks really good, thanks @scarlehoff !

Co-authored-by: wilsonmr <33907451+wilsonmr@users.noreply.github.com>

joining the two commits for ease of revert in case Co-authored-by: wilsonmr <33907451+wilsonmr@users.noreply.github.com>

scarlehoff · 2021-05-14T21:46:08Z

Not sure about the check-core thing but if people feel strongly about it I'll do the top-level variable.

Some runcards don't run but it is because of the theory (#1227)

Rerunning the bot because it's free.

On Monday morning if there are no other requests I'll merge everything (adding the unresolved comment if needed)

github-actions · 2021-05-15T01:27:15Z

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Fit Name: NNBOT-88930b32e-2021-05-14
Fit Report: https://vp.nnpdf.science/o-Su-VeYThCKhzG1Mqc56A==
Fit Data: https://data.nnpdf.science/fits/NNBOT-88930b32e-2021-05-14.tar.gz

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

scarlehoff requested a review from wilsonmr May 4, 2021 11:50

scarlehoff force-pushed the remove_useless_namespaces branch from 1bd4eeb to aabae26 Compare May 4, 2021 11:54

scarlehoff linked an issue May 6, 2021 that may be closed by this pull request

Remove useless fit runcard namespaces #1166

Closed

Zaharid added the 4.0-blocker label May 7, 2021

scarlehoff force-pushed the multireplica_n3fit_mk3 branch from 04f2a22 to b24abb0 Compare May 7, 2021 12:02

scarlehoff force-pushed the remove_useless_namespaces branch from 47a31a3 to 45cb546 Compare May 7, 2021 12:05

scarlehoff marked this pull request as ready for review May 7, 2021 12:09

scarlehoff force-pushed the remove_useless_namespaces branch from 45cb546 to e86eb8d Compare May 7, 2021 12:10

scarlehoff added the run-fit-bot Starts fit bot from a PR. label May 7, 2021

scarlehoff removed the run-fit-bot Starts fit bot from a PR. label May 7, 2021

scarlehoff added the run-fit-bot Starts fit bot from a PR. label May 7, 2021

wilsonmr reviewed May 11, 2021

View reviewed changes

Comment thread doc/sphinx/source/tutorials/thcov_tutorial.rst Outdated

wilsonmr reviewed May 11, 2021

View reviewed changes

Comment thread validphys2/src/validphys/pseudodata.py

wilsonmr reviewed May 11, 2021

View reviewed changes

scarlehoff mentioned this pull request May 14, 2021

Fit many replicas in parallel #1153

Merged