[WIP] Running n3fit on the flavour basis #689
Conversation
@scarlehoff the code is fine, the
8e98667 to 9fed37e
@scarlehoff at some point in #684 you were mentioning some documentation. Has it been done? Or should it be part of this PR? Also, I'm not able to find the fits you ran in the flavour basis on the server, like for example 270320_DIS_flavbas_jcm. Could you please upload it?
I don't think so...
Would be appreciated!
Sure, I've uploaded
uhm, and what have you used to produce the effective preprocessing exponents table appearing in this report? Looking at pdf.md I see I get
The basis in the runcard of the fit is fake, I'm afraid. I ran that with a Frankenstein code. The ranges I took from using #684 with the 3.1 fit. You would need to run a new fit with this code to get something useful.
@scarlehoff @Zaharid is the last commit similar to what you had in mind?
That's dangerous because it will only work as long as you are using pure python things, and it can easily break if things come in a different order, for instance. Thinking about it, I guess what I have now in mind (which, mind you, is different from last week) is to let
Some transpositions might be missing, but this would be the idea. That way the rotation is 100% done by
The docs for tensordot: https://www.tensorflow.org/api_docs/python/tf/tensordot
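As a rough numpy sketch of the tensordot idea (np.tensordot has the same contraction semantics as tf.tensordot; all shapes and names below are illustrative stand-ins, not the real n3fit tensors):

```python
import numpy as np

# Illustrative shapes: 1 replica, 40 x-points, 8 NN outputs rotated
# into 8 physical flavours with a single tensordot contraction.
nn_output = np.random.rand(1, 40, 8)   # (batch, x, nn_flavours)
rotation = np.random.rand(8, 8)        # (nn_flavours, flavours)

# Contract the flavour axis of the NN output with the first axis of
# the rotation matrix; the result keeps the (batch, x, flavours) layout.
pdf = np.tensordot(nn_output, rotation, axes=[[2], [0]])
assert pdf.shape == (1, 40, 8)
```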
ok, will have a go
@scarlehoff which flavour ordering should I consider to construct the rotation matrix in pdfbases? I mean, when you wrote the
where I guess you are referring to the fact that
If we choose that the input and
I can hardcode it in pdfbases.
Yes exactly.
The order is completely arbitrary, it is just the output of the NN and by itself doesn't mean anything. It "receives" a meaning when you apply the rotation, so you just have to make sure that the rotation is following the same ordering as in the runcard.
Actually, I think it is better to construct the rotation matrix by looking at the basis dictionary; that way, if someone messes up the ordering of the runcard it will still work the same. A way you can do that is by constructing the matrix using the dictionary in PDF basis (nnpdf/validphys2/src/validphys/pdfbases.py, line 381 at fb26145), so that the first line of the matrix (the one corresponding to sigma) is constructed automatically with the basis dictionary. So if the dictionary is:
Hope that makes sense!
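A sketch of what "row by row from the dictionary" could look like; the flavour ordering and the basis dictionary below are made-up stand-ins for the real ones in validphys/pdfbases.py:

```python
import numpy as np

# Hypothetical flavour ordering and basis dictionary, for illustration only.
FLAVOURS = ["u", "ubar", "d", "dbar", "s", "sbar", "g"]
BASIS = {
    "sigma": {"u": 1, "ubar": 1, "d": 1, "dbar": 1, "s": 1, "sbar": 1},
    "g": {"g": 1},
    "v": {"u": 1, "ubar": -1, "d": 1, "dbar": -1, "s": 1, "sbar": -1},
}

def rotation_matrix(basis, flavours):
    """Each row of the matrix comes from one entry of the basis dict,
    so a reshuffled runcard ordering cannot break the rotation."""
    mat = np.zeros((len(basis), len(flavours)))
    for row, coeffs in enumerate(basis.values()):
        for flav, coeff in coeffs.items():
            mat[row, flavours.index(flav)] = coeff
    return mat

mat = rotation_matrix(BASIS, FLAVOURS)
assert mat[0].sum() == 6   # sigma row: all six quark coefficients of 1
```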
@scarlehoff ok thanks, I've tried to do something like that, let me know if it might work
Perfect. Haven't tested but it looks good. The only thing is that instead of reshaping the output (which can lead to mistakes as you might reshape in the wrong order, and forces you to know the shape of the x at compile time, which is not always true) it is better to reorganize the tensor product so that the output already has the right shape. If I'm not wrong, here I think you can avoid the first transpose and then invert the order of the arguments in the tensor product; then the output will automatically have the right shape.
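A minimal numpy illustration of why the argument order matters (shapes are hypothetical, not the actual n3fit ones):

```python
import numpy as np

pdf = np.random.rand(1, 40, 8)   # (batch, x, flavours_in)
rot = np.random.rand(8, 8)       # (flavours_in, flavours_out)

# Rotation first: the contraction leaves (flavours_out, batch, x),
# which then needs a transpose (and knowing the x shape) to fix.
out_a = np.tensordot(rot, pdf, axes=[[0], [2]])
assert out_a.shape == (8, 1, 40)

# PDF first: the output is already (batch, x, flavours_out),
# so no reshape is needed and the x shape never has to be known.
out_b = np.tensordot(pdf, rot, axes=[[2], [0]])
assert out_b.shape == (1, 40, 8)
assert np.allclose(np.moveaxis(out_a, 0, -1), out_b)
```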
ok, then if you're happy with the code I'll start testing it with a DIS-only fit, and if everything looks fine I'll move on to iterate the global ones
I guess it looks fine
Thanks! The code looks quite good, I like it. It is basically what I had in mind. Wrt the report, I am a bit worried about the low-x behaviour for some of the flavours, but it might be because of it being a DIS fit. Maybe it makes sense to do a 3.1 global fit with the same parameters. As a wish list for the future:
sure, it's here
should I use the same runcard for the iteration? or should I change some of the settings?
No, that's fine. I was just wondering whether there were any different settings.
some update:
Not sure how to proceed, we can discuss later at the pc |
I am worried about the large-x behaviour actually, the u-quark for instance is completely out... I think we want to do some kind of check to ensure we are fitting what we think we are fitting. The first two things that come to mind are
Edit: the sum rules enter here (nnpdf/n3fit/src/n3fit/ModelTrainer.py, line 450 at 7b5b01d), in the fitbasis layer (which is the layer after applying the rotation, so it should be ok).
@scarlehoff I've computed the sum rules for the fits in the evolution and flavour bases using vp2
I'm not sure how to proceed for 1). I guess I can try to use the function
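As a hedged toy version of such a sum-rule check (the exponents, normalisation, and grid below are made up for illustration, not taken from any fit): fix a valence-like f(x) = N x^(alpha-1) (1-x)^beta analytically via the Beta function so the integral is 2, then verify it numerically, integrating in t = log(x) so the small-x region is well sampled.

```python
import numpy as np
from math import gamma

alpha, beta = 0.7, 3.0
# N chosen so that the valence sum rule integral is exactly 2.
norm = 2.0 * gamma(alpha + beta + 1) / (gamma(alpha) * gamma(beta + 1))

def valence(x):
    return norm * x ** (alpha - 1) * (1 - x) ** beta

# Trapezoid rule on a log-spaced grid; the extra x comes from dx = x dt.
t = np.linspace(np.log(1e-30), 0.0, 20_000)
x = np.exp(t)
f = valence(x) * x
sum_rule = np.sum((f[1:] + f[:-1]) * np.diff(t)) / 2
assert abs(sum_rule - 2.0) < 1e-3
```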
Ok, so the sum rules are indeed all over the place. This is good! (having a culprit for your problems is always good :P) About the possibility of having a problem in the rotation, of course this has to be checked to ensure everything is ok (maybe the gluon in the rotation matrix coming from vp is in the 3rd position and in n3fit in the 2nd position, for instance) but:
I think you are probably right here.
As first approximation for this we can just add more points to the integration (I think right now it is like 1000, we can try 10 times more, the fit will be slower but we'll get a lot of information from there)
I think this would be much better in the long run. We can try first adding more points with our eyes closed and see how the results change (just to have some more info). Wrt the suggestion in today's PC about running without preprocessing, the easiest thing would be to just set the range for alpha and beta to [0.0, 0.0]
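A toy illustration of how much the number of integration points can matter for the sum-rule normalisation (exponents and grid are illustrative, not the n3fit ones): integrate x^(alpha-1) (1-x)^beta, whose exact value is known from the Beta function, on log-spaced grids of 1k and 10k points.

```python
import numpy as np
from math import gamma

alpha, beta = 0.7, 3.0
exact = gamma(alpha) * gamma(beta + 1) / gamma(alpha + beta + 1)

def trapezoid_in_logx(npoints):
    # Trapezoid rule in t = log(x); the extra x comes from dx = x dt.
    t = np.linspace(np.log(1e-30), 0.0, npoints)
    x = np.exp(t)
    f = x ** (alpha - 1) * (1 - x) ** beta * x
    return np.sum((f[1:] + f[:-1]) * np.diff(t)) / 2

err_1k = abs(trapezoid_in_logx(1_000) - exact)
err_10k = abs(trapezoid_in_logx(10_000) - exact)
assert err_10k < err_1k   # ten times the points, much smaller error
```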
ok, so having in mind today's pc I think I could proceed as follows:
if up to this point the results in the two bases are compatible, then everything is fine and we can move on to studying the best way to implement the sum rules, and whether or not the integrator is the problem. To do this
I thought that to run a fit without sumrules it would have been enough to set
I think that the problem is that, when setting
thinking again about what was discussed at the pc on Friday, I'm writing here some considerations or I will forget:
The first option is what is implemented right now, and we have seen that it doesn't work well. Having to choose between 2) and 3), I would start with 3)
Yeah, the impose sum rule flag has been broken for a while sadly (I basically never thought we'd need it, so I didn't worry whether something had broken it...).
Maybe in theory, but in reality the preprocessing is also driving the behavior of the fit at small (and large) x, so a fit without preprocessing will produce very different results.
The problem with 3) is that I am not sure whether it counts as "fitting in the flavour basis". That said, as a first test it might make sense and it should work out of the box because that rotation should be easily "absorbed" by the neural network. i.e., I think in 3) at first approximation you should get exactly the same results as before.
I guess that the point of the flavour basis is about imposing positivity of each single flavour, and in this way you could do that imposing the positivity of each neural net, without messing with preprocessing and sumrules. However I guess it makes sense to discuss this on friday
yes that s what I was thinking, with the only difference that now you can impose positivity of each neural net. I ll start with this and then we can look at 2) as well |
This is what doesn't convince me, because you are imposing positivity on some function which is not the final flavour function, and then later on multiplying the preprocessing and normalization in a different basis. I think positivity will not be preserved if you rotate back to the flavour basis. Actually, thinking about it, I am not even sure you can apply preprocessing before the normalization of the sum rules... But it is just a conjecture at this point, maybe it works fine.
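A two-component toy version of this worry (the angle and vector are arbitrary): a vector that is positive component-wise in one basis can acquire negative components after a rotation, so component-wise positivity before the rotation says nothing about positivity after it.

```python
import numpy as np

theta = 2.0
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
v = np.array([1.0, 0.1])       # positive in the fitting basis
rotated = R @ v
assert (v > 0).all()
assert (rotated < 0).any()     # positivity is lost after rotating
```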
Ok, I've changed my mind :) I'm now thinking of implementing 2), which is what was discussed at the pc. Not sure if this is the best way; I've replaced the list
The problem is, inside the
All that said, being such a small dictionary it doesn't really matter, and also we have decided not to train the preprocessing so it matters even less, but:
tl;dr, I'd do the "digestion" of the basis in
ah I see, yeah I've broken everything... ok, I'll give it another go
Uhm, here the order is still the one which comes from the runcard, right? I mean, just like the rotation matrix of the
btw, here there are some funny looking fits without preproc and without sumrules in both the flavour and evolution bases
@scarlehoff I was trying to implement the subtraction at
From a practical point of view it could, but we should discuss on Wednesday with @scarrazza how to go about this, as this is no longer about "flavour basis" but rather about preprocessing. imo we should do a second branch to deal with the preprocessing, and once that's dealt with we come back here (you get an operation for the
But let's talk on Wednesday.
```python
def dense_me(x):
    """ Takes an input tensor `x` and applies all layers
    from the `list_of_pdf_layers` in order """
    x0 = operations.m_tensor_ones_like(x)
```
Instead of creating a full tensor of the same size as x every time, it'd be better to just feed a 1.0
uhm, I don't understand... what do you mean by just feeding 1.0?
Instead of passing an array of, say, 50 1.0s, you can just pass a single 1.0 and it should be the same.
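A small numpy sketch of the point being made (the layer and all shapes are toy stand-ins, not n3fit code): if every component of x0 is 1.0, a single broadcast 1.0 run through the layer gives the same numbers as a full ones-tensor the size of x.

```python
import numpy as np

x = np.linspace(1e-3, 1.0, 50).reshape(1, 50, 1)

def toy_layer(x0):
    w = 0.3                    # stand-in for a layer's weights
    return w * x0

full = toy_layer(np.ones_like(x))       # shape (1, 50, 1)
single = toy_layer(np.ones((1, 1, 1)))  # just one 1.0
assert np.allclose(full, single)        # broadcasting makes them equal
```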
To merge the flavour part (rotation) it would be better to have a new branch with only that.
ok, I've created a new branch off master where I've cherry-picked the first commits of this branch regarding the implementation of the flavour basis. The corresponding PR is #749. We can merge that one while we keep working on preprocessing and sumrules in this one
This is a template commit with the two or three changes needed to run n3fit in the flavour basis.
The basis rotation needs to be implemented in n3fit/src/n3fit/layers/rotations.py; I've added an example. Ideally it would be a class taking the basis information and preparing the rotation dynamically instead of having it fixed. The basis rotation should, however, be limited to a class (or a number of classes with a switch).
In n3fit/src/n3fit/model_gen.py we need to have the information about the basis. Be it the name of the basis or the basis list from the runcard. Anything that can tell n3fit "ey, this is the basis you'll be using" but nothing more.
Assigning @scarrazza as placeholder developer for now.