83 changes: 81 additions & 2 deletions validphys2/src/validphys/sumrules.py
@@ -24,6 +24,19 @@
LIMS = [(1e-9, 1e-5), (1e-5, 1e-3), (1e-3, 1)]
POL_LIMS = ((1e-4, 1e-3), (1e-3, 1))

# For high Mellin moments x**n * f(x) the integrand is dominated by the
# large-x region; slice (1e-3, 1) further so quad does not have to span
# many orders of magnitude in a single sub-interval.
Member

For me this comment and the code below are at odds.
We don't need to span many orders of magnitude so we break it down even further into each slice of order of magnitude.

Contributor Author

I'm not sure I understand this. The intervals (1e-3, 1e-2) and (1e-5, 1e-4) span two different orders of magnitude. So if we included these two sub-intervals in one integration, then the quadrature would range across two different orders of magnitude and the resolution of the integration would be affected. Am I wrong?

Member

You are not, that's why we separate into sub-intervals for the others. But I don't understand why the separation is even finer here if the lower sub-intervals are less important.

Contributor Author

Ah right, now I got your point. If the small-x sub-intervals are really subdominant, why are we slicing them even further? Honestly, I didn't spend too much time thinking about it: I just took the original intervals and added more sub-intervals at large-x. We can collapse the small-x ones into one sub-interval.

Member (@scarlehoff, May 15, 2026)

Ok, so, next question. Do the intervals at large x make a difference at all?
At low-x it does, because of the different orders of magnitude, as you said. For large-x instead, if you are sampling linearly there's not much of a difference, is there? Or does it help quad a lot?

edit: trying to understand the logic here. I just found it confusing.

Contributor Author

Here is a concrete example:

[screenshot attached, 2026-05-15]

So the small-x region is negligible, ok. The reason for the different sub-intervals at large-x is more phenomenological than algorithmic. For example, I was interested in seeing which region at large-x contributes the most. I think this is relevant for lattice determinations; that's why I split the large-x interval. Note that the user can always specify a custom list of sub-intervals, so I'm fine with removing the global MELLIN_LIMS and using LIMS as default. Regarding the impact on quad, I haven't checked.

Member

The reason for the different sub-intervals at large-x is more phenomenological than algorithmic

Okok. This is enough for me, tbh. I just wanted to understand the reason for the choice. If you could add that to the comment, that'd be great.
I'm fine with keeping the sub-intervals at both small-x and large-x; it doesn't really make a big difference. I was just very confused because you kept the small ones while saying they were not relevant, and added a few more in a region where, numerically-integrating-wise, they wouldn't make a difference.

But "I'm interested in this particular partial split" is good enough for me.

MELLIN_LIMS = [
(1e-9, 1e-5),
(1e-5, 1e-3),
(1e-3, 1e-1),
(1e-1, 0.3),
(0.3, 0.6),
(0.6, 0.9),
(0.9, 1.0),
]
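The point debated in the thread above can be sketched with a toy integrand. The valence-like shape `x**0.5 * (1 - x)**3` below is invented for illustration (a real PDF would come from LHAPDF's `xfxQ`); it just mimics an integrand whose high Mellin moments are dominated by large x:

```python
# Toy check of the slicing discussed above: for a high moment (n = 4),
# the small-x slices are negligible and the large-x region dominates,
# while slicing leaves the total integral unchanged.
from scipy.integrate import quad

MELLIN_LIMS = [
    (1e-9, 1e-5),
    (1e-5, 1e-3),
    (1e-3, 1e-1),
    (1e-1, 0.3),
    (0.3, 0.6),
    (0.6, 0.9),
    (0.9, 1.0),
]

def integrand(x, n=4):
    # weight x**n times a made-up valence-like x*f(x) = x**0.5 * (1 - x)**3
    return x**n * x**0.5 * (1 - x) ** 3

single, _ = quad(integrand, 1e-9, 1)
parts = [quad(integrand, a, b)[0] for a, b in MELLIN_LIMS]

# the two slices below 1e-3 contribute essentially nothing for n = 4 ...
small_x = sum(parts[:2])
# ... while x > 0.1 carries virtually the whole moment
large_x = sum(parts[3:])
```

This supports both sides of the thread: the extra large-x slices are a phenomenological bookkeeping choice rather than something quad needs, and the small-x slices could indeed be collapsed.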


def _momentum_sum_rule_integrand(x, lpdf, Q):
xqvals = lpdf.xfxQ(x, Q)
@@ -90,6 +103,24 @@ def f(x, lpdf, Q):
return f


def _make_mellin_moment_integrand(fldict, n: int):
"""Make an integrand for the n-th Mellin moment ``∫ x**n · f(x) dx`` of the linear
combination defined by ``fldict``.

Note that LHAPDF returns ``x*f(x)``, so we multiply by ``x**(n-1)``. Reduces to
:py:func:`_make_pdf_integrand` for n=0 and to :py:func:`_make_momentum_fraction_integrand`
for n=1.
Member

I would propose merging these three functions and just apply the right value of n in each case.

"""
fldict = {parse_flarr([k])[0]: v for k, v in fldict.items()}
exponent = n - 1

def f(x, lpdf, Q):
xfvals = lpdf.xfxQ(x, Q)
return x**exponent * sum(multiplier * xfvals[fl] for fl, multiplier in fldict.items())

return f
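The `x**(n-1)` weighting in the docstring can be checked against known integrals with a minimal stand-in for an LHAPDF object. `ToyPDF`, its flavour id, and its `f(x) = 1 - x` content are all invented for this sketch; only the `xfxQ` convention (returning `x*f(x)`) mirrors the real library:

```python
# Minimal sketch: an LHAPDF-style xfxQ returns x*f(x) per flavour id, so
# multiplying by x**(n-1) yields the integrand of ∫ x**n f(x) dx.
from scipy.integrate import quad

class ToyPDF:
    """Invented stand-in for an LHAPDF object; flavour 2 carries f(x) = 1 - x."""
    def xfxQ(self, x, Q):
        return {2: x * (1 - x)}  # x*f(x)

def make_integrand(fl, n):
    exponent = n - 1
    def f(x, lpdf, Q):
        return x**exponent * lpdf.xfxQ(x, Q)[fl]
    return f

lpdf = ToyPDF()
# n=1 (momentum fraction): ∫ x (1 - x) dx over (0, 1) = 1/6
m1, _ = quad(make_integrand(2, n=1), 0, 1, args=(lpdf, 1.65))
# n=2: ∫ x**2 (1 - x) dx = 1/12
m2, _ = quad(make_integrand(2, n=2), 0, 1, args=(lpdf, 1.65))
```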


KNOWN_SUM_RULES = {
"momentum": _momentum_sum_rule_integrand,
"uvalence": _make_pdf_integrand({"u": 1, "ubar": -1}),
@@ -122,6 +153,14 @@ def f(x, lpdf, Q):
"xV3": _make_momentum_fraction_integrand({'u': 1, 'ubar': -1, 'd': -1, 'dbar': 1}),
}

MELLIN_MOMENTS = {
"m_5": _make_mellin_moment_integrand({"u": 1, "ubar": -1, "d": -1, "dbar": 1}, n=4),
"m_4": _make_mellin_moment_integrand({"u": 1, "ubar": -1, "d": -1, "dbar": 1}, n=3),
"m_3": _make_mellin_moment_integrand({"u": 1, "ubar": -1, "d": -1, "dbar": 1}, n=2),
"m_2": _make_mellin_moment_integrand({"u": 1, "ubar": -1, "d": -1, "dbar": 1}, n=1),
# "Test: u momentum fraction": _make_mellin_moment_integrand({"u": 1}, n=1)
}


KNOWN_SUM_RULES_EXPECTED = {
'momentum': 1,
@@ -206,17 +245,38 @@ def unknown_sum_rules(pdf: PDF, Q: numbers.Real):
return _combine_limits(_sum_rules(UNKNOWN_SUM_RULES, lpdf, Q))


@check_positive('Q')
def partial_mellin_moments(pdf: PDF, Q: numbers.Real, lims: list = MELLIN_LIMS):
"""Per-interval contributions to the Mellin moments defined in
``MELLIN_MOMENTS``. Returns a list (one entry per element of ``lims``) of dicts
``{moment_name: [value_per_member]}``.

Useful for diagnosing integration stability: comparing the size of each
sub-interval's contribution to its quad tolerance reveals whether a
single (1e-3, 1) interval is good enough or whether the large-x region
needs to be sliced more finely.
"""
lpdf = pdf.load()
return _sum_rules(MELLIN_MOMENTS, lpdf, Q, lims=lims)


def mellin_moments(partial_mellin_moments):
"""Sum the partial Mellin-moment integrals over all x sub-intervals."""
return _combine_limits(partial_mellin_moments)
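The relation between the two providers can be sketched on made-up numbers. This assumes `_combine_limits` simply sums each moment's per-member arrays over the list of sub-interval results, as the docstring states; the real helper lives elsewhere in validphys and may differ in detail:

```python
# Hedged sketch: one dict per sub-interval, each mapping a moment name to
# an array of per-member values; combining means elementwise summation.
import numpy as np

# invented per-interval results for a single moment "m_2"
partials = [
    {"m_2": np.array([1.2e-3, 1.1e-3])},   # small-x slice, per PDF member
    {"m_2": np.array([1.50e-1, 1.49e-1])}, # large-x slice
]

def combine_limits(partials):
    # assumed behaviour of _combine_limits: sum over the interval dicts
    return {k: sum(d[k] for d in partials) for k in partials[0]}

total = combine_limits(partials)
```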


def _simple_description(data, pdf):
"""Return a table with simple descriptive statistics over the members of the PDF.
The statistics used depend on the type of PDF (MC or Hessian)."""
The statistics used depend on the type of PDF (MC or Hessian)."""
res = {}
stats_class = pdf.stats_class
for k, arr in data.items():
res[k] = stats_dict = {}
# Each entry in `data` is expected to be a vector with shape
# (central_member + error_members,), hence we reshape to (-1, 1) before
# passing to the stats class.
stats = stats_class(arr.reshape(-1,1))
stats = stats_class(arr.reshape(-1, 1))
stats_dict["mean"] = stats.central_value().item()
stats_dict["std"] = stats.std_error().item()
stats_dict["min"] = np.min(arr)
@@ -265,6 +325,25 @@ def unknown_sum_rules_table(unknown_sum_rules):
return _err_mean_table(unknown_sum_rules)


@table
def mellin_moments_table(mellin_moments, pdf):
"""Mean ± std (over PDF members) of the higher Mellin moments."""
return _simple_description(mellin_moments, pdf)


@table
def partial_mellin_moments_table(partial_mellin_moments):
"""Per-interval Mellin-moment contributions, averaged over PDF members.
One row per moment, one column per ``MELLIN_LIMS`` interval.
Use this to assess whether the integration is stable against slicing.
"""
rows = {}
for interval_dict, lim in zip(partial_mellin_moments, MELLIN_LIMS):
col = f"({lim[0]:.0e}, {lim[1]:.0e})"
rows[col] = {k: float(np.mean(v)) for k, v in interval_dict.items()}
return pd.DataFrame(rows)


@table
def bad_replica_sumrules(pdf, sum_rules, threshold: numbers.Real = 0.01):
"""Return a table with the sum rules for the replica where some sum rule is