[New commondata format] Jets#1699
| @author: Mark N. Costantini | ||
| """ | ||
| import yaml |
As a general note, I'd prefer using the ruamel.yaml library (or `from reportengine.compat import yaml`).
It doesn't matter much if you only want to parse into a simple dict, but it is better to use a single yaml library across the whole codebase. The ability to track things like line numbers and comments is also useful for cases like this: https://validobj.readthedocs.io/en/latest/examples.html#yaml-line-numbers
Relatedly, I wonder if the processing of the hepdata raw files could be done with validobj (like we already do for the commondata new format). May or may not be simpler, but very likely more reusable.
@enocera @Radonirinaunimi what was the status of the thing that auto-downloaded the tables? Shouldn't it be used in this kind of script?
| dat_file = '/Users/markcostantini/codes/nnpdfgit/nnpdf/buildmaster/results/DATA_ATLAS_2JET_7TEV_R06.dat' | ||
| sys_file = '/Users/markcostantini/codes/nnpdfgit/nnpdf/buildmaster/results/systypes/SYSTYPE_ATLAS_2JET_7TEV_R06_DEFAULT.dat' |
These hard-coded absolute paths should be generalized.
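One possible way to do this, sketched under assumptions: build both paths from a base directory and the dataset name with `pathlib`. The function name `dataset_paths` and the relative default directory are hypothetical; only the file-name patterns come from the two paths in the diff.

```python
from pathlib import Path


def dataset_paths(results_dir, setname):
    """Return (dat_file, sys_file) for a dataset under a buildmaster results dir.

    Hypothetical helper: the naming patterns mirror the hard-coded paths
    in the diff, nothing else about the layout is assumed.
    """
    results_dir = Path(results_dir)
    dat_file = results_dir / f"DATA_{setname}.dat"
    sys_file = results_dir / "systypes" / f"SYSTYPE_{setname}_DEFAULT.dat"
    return dat_file, sys_file


# The base directory here is an assumed relative location, not a confirmed one
dat_file, sys_file = dataset_paths("buildmaster/results", "ATLAS_2JET_7TEV_R06")
print(dat_file)
print(sys_file)
```

The base directory could then come from a command-line argument or a configuration file rather than from one user's home directory.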
| Clum = np.einsum('ij,kj->ik',Alum,Alum) | ||
| # construct Block diagonal statistical Covariance matrix | ||
| ndata = [21, 21, 19, 17, 8, 4] |
There are a bunch of magic numbers here that feel like they should be read from somewhere.
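As a rough sketch of what "read from somewhere" could look like: count the data rows of each raw table and derive the block boundaries from those counts. The layout of the raw files is not shown in this thread, so the `#` comment convention and both helper names are assumptions made purely for illustration.

```python
import io


def count_data_rows(stream):
    """Count non-empty lines that are not comments (assumed to start with '#')."""
    return sum(
        1 for line in stream
        if line.strip() and not line.lstrip().startswith("#")
    )


def block_offsets(ndata):
    """Return the (start, stop) index range of each block on the diagonal."""
    offsets = []
    start = 0
    for n in ndata:
        offsets.append((start, start + n))
        start += n
    return offsets


# Stand-ins for the raw tables; real files would be opened with open(...)
tables = [
    io.StringIO("# pt sigma\n1 2\n3 4\n"),
    io.StringIO("# pt sigma\n5 6\n"),
]
ndata = [count_data_rows(table) for table in tables]
offsets = block_offsets(ndata)
print(ndata, offsets)
```

With the counts derived this way, the slices used to fill the block-diagonal matrix follow from `offsets` instead of from a hard-coded list.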
| # construct Block diagonal statistical Covariance matrix | ||
| ndata = [21, 21, 19, 17, 8, 4] | ||
| stat = pd.read_csv(f"rawdata/dijet_statcov/hepcov_R06_Eta0.txt", sep = " ", header = None) | ||
| BD_stat = stat.loc[13:,1:ndata[0]].to_numpy().astype(float) |
Why are these casts needed?
| with open("uncertainties_weaker.yaml",'w') as file: | ||
| yaml.dump(uncertainties_yaml,file, sort_keys=False) | ||
| return covmat |
It doesn't look like we are using the return value.
It must indeed be part of a PR in which the changes in #1693 are introduced. AFAIC, adding this feature will just amount to moving the implementation in #1500 and asking people to test it. However, I am not sure whether the utils in #1693 already point to the correct branch (based on the exchanges, that seems not to be the case yet).
…r the three correlation scenarios
b2b0b14 to b0ce842
PR for the implementation of existing Jet datasets into new CommonData format.