
New ht thcovmat #2126

Open
achiefa wants to merge 41 commits into master from new_ht_thcovmat

Conversation

@achiefa
Contributor

@achiefa achiefa commented Jul 15, 2024

This PR allows for the inclusion of theory uncertainties due to power corrections. The theory covariance matrix is constructed by computing shifts for the theoretical predictions, as is done for the MHOUs. The shift is computed at the level of the structure functions; the structure-function shifts are then combined to reconstruct the shift for the cross section. For this reason, the calculation of the shift depends on the dataset. Currently, only 1-JET and DIS (NC and CC) are supported.

At the level of the runcard, power corrections are specified as follows:

theorycovmatconfig:
  point_prescriptions: ["power corrections"]
  pc_parameters:
  - {ht: H2p, yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
  - {ht: H2d, yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
  - {ht: HLp, yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
  - {ht: HLd, yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
  - {ht: H3p, yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
  - {ht: H3d, yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
  - {ht: Hj, yshift: [2.0, 2.0, 2.0, 2.0, 2.0, 2.0], nodes: [0.25, 0.75, 1.25, 1.75, 2.25, 2.75]}
  pc_included_procs: ["JETS", "DIS NC", "DIS CC"]
  pc_excluded_exps: [HERA_NC_318GEV_EAVG_CHARM-SIGMARED,
                     HERA_NC_318GEV_EAVG_BOTTOM-SIGMARED]
  pdf: 240701-02-rs-nnpdf40-baseline
  use_thcovmat_in_fitting: true
  use_thcovmat_in_sampling: true

For each process/structure function implemented, the following information must be specified:

  1. The name of the power correction, under the key ht. This identifies the type of power correction, in particular when computing the posterior.
  2. The values in yshift are the magnitudes of the prior at each node.
  3. nodes contains the points at which the prior is shifted.
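To make the yshift/nodes pairing concrete, here is a minimal sketch of how a prior magnitude could be evaluated at an arbitrary point from those two arrays. Linear interpolation between nodes is an assumption for illustration; the actual functional form lives in higher_twist_functions.py and is still under discussion in this PR.

```python
import numpy as np

def prior_shift(nodes, yshift):
    """Hypothetical sketch: build a callable returning the prior magnitude
    at any x, given the `nodes` and `yshift` arrays from pc_parameters.
    Linear interpolation is an assumption here, not the PR's actual choice."""
    nodes = np.asarray(nodes, dtype=float)
    yshift = np.asarray(yshift, dtype=float)

    def evaluate(x):
        return np.interp(x, nodes, yshift)

    return evaluate

# H2p prior from the runcard above: 0.2 everywhere, vanishing at x = 1
h2p = prior_shift(
    [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1],
    [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0],
)
print(h2p(0.05))  # between the 0.01 and 0.1 nodes -> 0.2
print(h2p(1.0))   # vanishes at x = 1 -> 0.0
```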

The array pc_included_procs specifies the processes for which the shifts are computed. I've also implemented the possibility to exclude particular datasets within the selected processes, by specifying their names in pc_excluded_exps.

The key func_type is temporary and will be deleted once we decide which function to use to construct the prior.

All the relevant details for this PR are contained in the module higher_twist_functions.py. For each observable for which the shift has to be computed, I implemented a factory that constructs the function that will then compute the shift. This was necessary to make the shift depend on the shift parameters (yshift and nodes) and on the prescription according to which we vary the parameters (which for now is fixed), and not on the kinematics (I think this is known as currying in computer science).
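A toy version of that factory pattern (all names and the power-correction shape are illustrative, not the actual API in higher_twist_functions.py): the factory binds the prior parameters once and returns a callable of the kinematics only.

```python
import numpy as np

def make_shift_function(yshift, nodes):
    """Illustrative factory mirroring the currying idea described above:
    bind the prior parameters (yshift, nodes) once, and return a callable
    that only takes the kinematics. The real factories in
    higher_twist_functions.py are observable-specific."""
    yshift = np.asarray(yshift, dtype=float)
    nodes = np.asarray(nodes, dtype=float)

    def shift(x, q2):
        # Toy power-correction shape, prior(x) / Q^2, purely for illustration
        return np.interp(x, nodes, yshift) / q2

    return shift

# Build once per parameter set, then reuse across all kinematic points
delta = make_shift_function([0.2, 0.0], [0.0, 1.0])
values = [delta(x, q2) for x, q2 in [(0.1, 4.0), (0.5, 10.0)]]
```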

TO DO

  • [ ] Add documentation
  • [x] Remove func_type
  • [ ] What happens if power_corrections is specified but the parameters are not given?
  • [ ] Change the name ht to name (?)
  • [ ] Something else?

@achiefa achiefa added the run-fit-bot Starts fit bot from a PR. label Jul 15, 2024
@achiefa achiefa requested a review from RoyStegeman July 15, 2024 10:11
@achiefa achiefa self-assigned this Jul 15, 2024
@github-actions

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

@scarlehoff scarlehoff marked this pull request as draft July 17, 2024 14:26
Comment thread n3fit/src/n3fit/layers/DIS.py Outdated
Comment thread n3fit/src/n3fit/layers/DIS.py Outdated
Comment thread n3fit/src/n3fit/layers/DIS.py Outdated
Comment thread n3fit/src/n3fit/scripts/vp_setupfit.py Outdated
Comment thread n3fit/src/n3fit/scripts/n3fit_exec.py Outdated
Comment thread validphys2/src/validphys/results.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
@achiefa achiefa force-pushed the new_ht_thcovmat branch 2 times, most recently from 28662e8 to 6b71fae Compare October 11, 2024 22:16
@achiefa achiefa force-pushed the new_ht_thcovmat branch 2 times, most recently from 348be41 to 9ac473c Compare January 14, 2025 14:06
@achiefa
Contributor Author

achiefa commented Jan 14, 2025

Hi @roy, I've updated the code so that the covmat for power corrections can now be constructed as in the case of scale variations. Please have a look and let me know if something is unclear. I'd appreciate it if you could double-check the functions that compute the shifts, in particular where normalisation and conversion factors are used.

There are a few things here and there that I still don't like, but feel free to propose modifications.

@achiefa achiefa added run-fit-bot Starts fit bot from a PR. and removed run-fit-bot Starts fit bot from a PR. labels Jan 14, 2025
Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
@achiefa achiefa removed the run-fit-bot Starts fit bot from a PR. label Jan 14, 2025
Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
@github-actions

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

@achiefa achiefa marked this pull request as ready for review January 30, 2025 14:13
@achiefa achiefa requested a review from scarlehoff February 21, 2025 15:37
Member

@scarlehoff scarlehoff left a comment


Hi @achiefa thanks for this.

I've left a few comments on the code. I haven't looked into the 1000-line higher_twist_functions in much detail, but one thing I'd like to mention: please avoid using commondata_table. This is basically an object containing all the commondata information, like in the old days; it was necessary to add it for some of the validphys functions, but the idea is to instead get only the information you want.

For instance, the process type is now part of the metadata (no need to load the data and uncertainties!).
Same for the kinematics (although by the time you need the kinematics you have often already read the whole commondata). You can get the kinematics directly with cd.metadata.load_kinematics() and then you already have x, Q, and y for DIS, as god intended!

Comment thread n3fit/runcards/examples/Basic_runcard_pc_covmat.yml
Comment thread validphys2/examples/theory_covariance/chi2table_ht.yaml Outdated
Comment thread validphys2/src/validphys/commondata.py Outdated
Comment thread validphys2/src/validphys/config.py Outdated
Comment thread validphys2/src/validphys/config.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/higher_twist_functions.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/higher_twist_functions.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/higher_twist_functions.py Outdated
achiefa added 16 commits March 9, 2026 14:09
Excluded HERACOMB

Hacking NMC dataset

Grouping kinematics

Reimplementing thcovmat

Added comment in HT for DIS

Corrected normalisation for SIGMARED DIS NC data sets

Removing unused code

Added HT for F2C data (EMC) - removed deprecated function

Corrected EMC data iron target

Removed deprecated code

Refactoring + DIS CC

Corrected bug - ready for cc test

Corrected bug - ready

Removing unnecessary code

Corrected bug after rebase

Add normalisation in CC x-secs

Correct normalisation

Restore n3fit files from master

remove _PB suffix from process type

format a bit

Correct typo + example runcard

Update for new thcovmat construction + refactor + docstrings

Correct bug

Correct nuclear factors for nuclear targets

First implementation of jet data

Change pc jet dependence from pT to eta

Allowing step-function for the prior

Vectorize step_function + docstring

Correct bug in step function

Correct bug in step function

Produce covs_pt_prescrip

Allow different funcs for posterior

Adjusting linear triangular function

Correct bug in linear function

Implement multiplicative PC for jet

Correct docstring

Remove unused vp runcard

Dijet + clean-up + checks for pc dict

Remove translation layer for pc parameters

Remove copy of the same collection procs_data -> groups_data_by_process + clean-up

Remove unused collect

Update basic runcard

Allow generation of L1 data

Jets with single parameters

Adjust format

Restoring nodes for jets and dijets
Member

@scarlehoff scarlehoff left a comment


Left a few comments but haven't run the code yet. I've also looked at higher_twist_functions only very superficially for now; I just got curious about the cuts.

I'm a bit worried about all the if conditions that this one prescription adds, and whether it breaks some assumptions in the code, in particular the one where theoryid(s) is a one-item list...

process_type: DIJET
# Plotting
- kinematic_coverage: [ydiff, m_jj, sqrts]
+ kinematic_coverage: [ymax, m_jj, sqrts]
Member


Are these changes necessary / should they be in this PR?

Contributor Author


I checked this with Emanuele, and we concluded that ydiff is equivalent to ymax. Since the other dijet datasets adopt ymax, and the variable names are processed in the construction of the theory covmat, I changed it to ymax.

Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
size2 = deltas2[0].size
s = np.zeros(shape=(size1, size2))
for shift1, shift2 in zip(deltas1, deltas2):
    s += np.outer(shift1, shift2)
Member


It says that it is using the 5 point prescription, but this is different from the covmat_5pt function. Why?

Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be fair, I don't know what to call this. Richard called it generalised 5-pt. The point is that we have a shift for each ht parameter. Moreover, since each shift is linear in the parameter, the outer products of positive and negative shifts are equivalent. That's why there is no 0.5 factor. In a few words, it is a generalised 5-pt prescription where positive and negative shifts are equivalent. Happy to change the name.
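A toy numerical check of that statement (not the actual covmat_5pt code, and with made-up shift vectors): for shifts linear in the parameter, the negative shift is just the negated positive one, so averaging the outer products over +/- shifts equals using the positive shift alone, and the 0.5 factor drops out.

```python
import numpy as np

rng = np.random.default_rng(0)
deltas = [rng.normal(size=4) for _ in range(3)]  # one shift per ht parameter

# Generalised prescription discussed above: sum of outer products, no 0.5 factor
s = np.zeros((4, 4))
for d in deltas:
    s += np.outer(d, d)

# Explicit "average over +/- shifts" form: since the shift is linear in the
# parameter, the negative shift is -d and both outer products agree
s_pm = np.zeros((4, 4))
for d in deltas:
    s_pm += 0.5 * (np.outer(d, d) + np.outer(-d, -d))

assert np.allclose(s, s_pm)  # the two constructions coincide
```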

Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem with the name; with a mention of the difference in the docstring I'd be happy.

def covs_pt_prescrip_mhou(combine_by_type, point_prescription):
"""Produces the sub-matrices of the theory covariance matrix according
to a point prescription which matches the number of input theories.
If 5 theories are provided, a scheme 'bar' or 'nobar' must be
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why has this single line fallen?

Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what happened here. I'll restore this line.

start_proc_by_exp = defaultdict(list)
for exp_name, data_spec in datagroup_spec.items():
    start_proc_by_exp[exp_name] = running_index
    running_index += data_spec.load_commondata().ndata
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit worried here: you are loading the commondata and taking the number of data points.
But then in the loop below, inside compute_deltas_pc, you are cutting the data...

Are you sure this is being done consistently? (In other words, are you sure that load_commondata here happens after cuts?)

Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this gets me off the hook, right?

def load_commondata(self):
    """Strips the commondata loading from `load`"""
    cd = self.commondata.load()
    if self.cuts is not None:
        loaded_cuts = self.cuts.load()
        if not (hasattr(loaded_cuts, '_full') and loaded_cuts._full):
            intmask = [int(ele) for ele in loaded_cuts]
            cd = cd.with_cuts(intmask)
    return cd

Member

@scarlehoff scarlehoff May 13, 2026


I see. I guess we never load the data without cuts now. That's good then :)

Experience is a prison sometimes.


pc_type = get_pc_type(exp_name, process_type, experiment=experiment, pc_dict=pc_dict)

if process_type.startswith('DIS'):
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Counting the days until someone calls a process DISASTER for some reason 🙈

I would perhaps also guard against processes that have no x or no Q2.

This hard codes the theories needed for each prescription to avoid user error."""
th = t0id.id
if point_prescription == 'power corrections':
    return NSList([t0id], nskey="theoryid")
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how many things across the code will assume that a theory covmat needs more than one theory...

Co-authored-by: Juan M. Cruz-Martinez <juacrumar@lairen.eu>
