Skip to content

Updates to plotting scripts for CMS analysis#116

Closed
rbarrue wants to merge 23 commits into
sbi-pdf-CMSfrom
dev-rbarrue_plot_updates
Closed

Updates to plotting scripts for CMS analysis#116
rbarrue wants to merge 23 commits into
sbi-pdf-CMSfrom
dev-rbarrue_plot_updates

Conversation

@rbarrue
Copy link
Copy Markdown

@rbarrue rbarrue commented Apr 18, 2026

Fixing a bunch of issues with plotting scripts + implementing new functionality.

Commit messages are self-explanatory.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates several CMS analysis plotting utilities by enabling additional systematic variations, extending configurable process/feature handling for prefit plots, and enhancing postfit/PDF plotting outputs (including new data-fit handling in the postfit binned plotting script).

Changes:

  • Enable detailed JES component variations (including era-split regrouped sources) in the variations registry.
  • Improve CLI/config behavior for prefit plotting (process list defaults, clearer --features help, Asimov handling fix).
  • Adjust postfit/PDF plotting scripts (layout/saving behavior, axis ranges, output formats, and adding an attempt to build data histograms from config for data fits).

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
user/robert/variations.py Enables many JES component variations (but currently leaves grouping inconsistent).
user/robert/prefit_sys_binned.py Extends default processes, fixes doAsimov typo usage, improves CLI help/override behavior.
plot/postfitsys/compare_fits.py Updates plot limits and restructures script entrypoint; changes layout/save calls.
plot/postfitsys/compare_correlations.py Changes layout/save calls for comparison/difference heatmaps.
plot/postfitsys/binned_fromfit.py Adds data-fit handling via sample factory + data histogramming; adjusts ratio plot labeling/range logic.
plot/pdf/pdf_eigenmode_plots.py Tweaks labels/styling and adds SVG outputs for eigenmode plots.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

'jes_EC2' : {'sys_up': 'CMS_scale_j_Regrouped_EC2_up', 'sys_down': 'CMS_scale_j_Regrouped_EC2_down'},
'jes_HF' : {'sys_up': 'CMS_scale_j_Regrouped_HF_up', 'sys_down': 'CMS_scale_j_Regrouped_HF_down'},
'jes_relbal' : {'sys_up': 'CMS_scale_j_Regrouped_RelativeBal_up', 'sys_down': 'CMS_scale_j_Regrouped_RelativeBal_down'},

Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JES total variations (jes_total_EtaBin0..5) were removed from variations, but syst_groups['JESTOTAL'] still lists them. Any code that iterates syst_groups and expects these variation keys to exist will now fail (e.g. KeyError / missing-variation behavior). Either re-add the jes_total variations or update syst_groups to reference the new JES components you enabled here.

Suggested change
# Compatibility aliases kept so legacy syst_groups entries such as
# syst_groups['JESTOTAL'] can still resolve these variation keys.
'jes_total_EtaBin0': {},
'jes_total_EtaBin1': {},
'jes_total_EtaBin2': {},
'jes_total_EtaBin3': {},
'jes_total_EtaBin4': {},
'jes_total_EtaBin5': {},

Copilot uses AI. Check for mistakes.
'jes_total_EtaBin5' : {'sys_up': 'CMS_scale_j_Total_EtaBin5_up', 'sys_down': 'CMS_scale_j_Total_EtaBin5_down'},
'jes_b' : {'sys_up': 'CMS_scale_j_FlavorPureBottom_up', 'sys_down': 'CMS_scale_j_FlavorPureBottom_down'},
'jes_c' : {'sys_up': 'CMS_scale_j_FlavorPureCharm_up', 'sys_down': 'CMS_scale_j_FlavorPureCharm_down'},
'jeq_g' : {'sys_up': 'CMS_scale_j_FlavorPureGluon_up', 'sys_down': 'CMS_scale_j_FlavorPureGluon_down'},
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variation key jeq_g looks inconsistent with the surrounding JES flavor keys (jes_b, jes_c, jes_q) and likely is a typo. If downstream tooling expects a jes_g (or similar) naming pattern, this will prevent selecting/grouping the gluon component correctly; consider renaming the key to match the established jes_* prefix.

Suggested change
'jeq_g' : {'sys_up': 'CMS_scale_j_FlavorPureGluon_up', 'sys_down': 'CMS_scale_j_FlavorPureGluon_down'},
'jes_g' : {'sys_up': 'CMS_scale_j_FlavorPureGluon_up', 'sys_down': 'CMS_scale_j_FlavorPureGluon_down'},

Copilot uses AI. Check for mistakes.
Comment on lines +115 to +121
is_data_fit = ("data" in args.fit_result)

# grabbing sample factory to create data histograms downstream
# may need to move this further down to use pieces of code
# that get things that I need
if is_data_fit:
from common.yaml_loader import _resolve_features_list
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_data_fit = ("data" in args.fit_result) infers fit type from the fit-result filename/path. This is brittle (e.g. directories containing “data”, or a data fit file not containing that substring). Prefer an explicit CLI flag (e.g. --is-data-fit) or infer from the fit JSON/config (e.g. presence of region['data'] / a fit arg stored in the JSON).

Copilot uses AI. Check for mistakes.
Comment thread plot/postfitsys/binned_fromfit.py Outdated
)

for ib, value in enumerate(data_unrolled, start=1):
print(f"data histogram: {ib=}, {value=}; {h_total.GetBinContent(ib)=}")
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This per-bin print inside the data histogram fill loop will produce very large logs for typical bin counts and regions, which is problematic for batch jobs. Please remove it or guard it behind a debug/verbose option.

Suggested change
print(f"data histogram: {ib=}, {value=}; {h_total.GetBinContent(ib)=}")

Copilot uses AI. Check for mistakes.
nom = h_total.GetBinContent(ib)
y = h_data.GetBinContent(ib)
ey = h_data.GetBinError(ib) # stat-only = sqrt(y)
ey = h_data.GetBinError(ib) # stat-only = sqrt(y) \
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inline comment here ends with a stray backslash and extra spaces. This looks accidental and can be confusing when scanning the code; please remove the backslash/trailing whitespace.

Suggested change
ey = h_data.GetBinError(ib) # stat-only = sqrt(y) \
ey = h_data.GetBinError(ib) # stat-only = sqrt(y)

Copilot uses AI. Check for mistakes.
Comment on lines +64 to 69
plt.tight_layout()

os.makedirs(output_dir, exist_ok=True)
fig.savefig(os.path.join(output_dir, "correlation_comparison.png"), dpi=150, bbox_inches="tight")
fig.savefig(os.path.join(output_dir, "correlation_comparison.pdf"), bbox_inches="tight")
plt.savefig(os.path.join(output_dir, "correlation_comparison.png"), dpi=150, bbox_inches="tight")
plt.savefig(os.path.join(output_dir, "correlation_comparison.pdf"), bbox_inches="tight")
plt.close(fig)
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above: using plt.tight_layout() / plt.savefig() here depends on the global current figure. Prefer fig.tight_layout() and fig.savefig(...) to avoid saving/layouting the wrong figure if this code is reused in a context with multiple active figures.

Copilot uses AI. Check for mistakes.
Comment thread plot/postfitsys/binned_fromfit.py Outdated
Comment on lines +189 to +190
print(f"{default_binning=}")
print(f"{var_name=}, {var_edges_default=}")
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The debug print(f"{default_binning=}") / print(f"{var_name=}, ...") will add noisy output for every run. Consider removing these or gating them behind a verbosity/debug flag (or a logger) to keep script output stable for batch usage.

Copilot uses AI. Check for mistakes.
Comment thread plot/postfitsys/binned_fromfit.py Outdated
h_ratio_down.SetBinContent(ib, 1.0)

if is_data_fit:
max_dev = max_dev_data
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is_data_fit is true, max_dev is overwritten with max_dev_data, discarding the uncertainty-band driven max_dev computed above. This can clip the MC uncertainty band (and the plot range) if the postfit uncertainty is larger than the data statistical error. Use max_dev = max(max_dev, max_dev_data) instead of replacing it.

Suggested change
max_dev = max_dev_data
max_dev = max(max_dev, max_dev_data)

Copilot uses AI. Check for mistakes.
Comment on lines +94 to 99
plt.tight_layout()

os.makedirs(output_dir, exist_ok=True)
fig.savefig(os.path.join(output_dir, "correlation_difference.png"), dpi=150, bbox_inches="tight")
fig.savefig(os.path.join(output_dir, "correlation_difference.pdf"), bbox_inches="tight")
plt.savefig(os.path.join(output_dir, "correlation_difference.png"), dpi=150, bbox_inches="tight")
plt.savefig(os.path.join(output_dir, "correlation_difference.pdf"), bbox_inches="tight")
plt.close(fig)
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same global-state issue here: use fig.tight_layout() and fig.savefig(...) rather than the plt.* variants to make sure the intended figure is laid out/saved, especially if these utilities are imported and used programmatically.

Copilot uses AI. Check for mistakes.
Comment on lines +82 to 88
ax_top.semilogx(x_vals, pdf_variant, color=color, linestyle=linestyle, linewidth=1.5, label=f'EV({mode_idx})')

mask = pdf_central > 1e-15
ratio = np.ones_like(pdf_central)
ratio[mask] = pdf_variant[mask] / pdf_central[mask]
ax_bot.semilogx(x_vals, ratio, color=color, linestyle=linestyle, linewidth=1.5, label=f'EV({mode_idx+1})')
ax_bot.semilogx(x_vals, ratio, color=color, linestyle=linestyle, linewidth=1.5, label=f'EV({mode_idx})')
ax_bot.set_ylim(-1.0,4.0)
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The legend labels now use EV({mode_idx}), which will display EV(0) for the first mode (since mode_idx is 0-based). If the plots are meant for presentation/publication, consider labeling as EV({mode_idx+1}) (and keep CLI indices 0-based if desired) or otherwise clarify the indexing convention to avoid confusion.

Copilot uses AI. Check for mistakes.
@rbarrue rbarrue force-pushed the dev-rbarrue_plot_updates branch 2 times, most recently from da291d1 to 72b7729 Compare April 23, 2026 09:42
@rbarrue
Copy link
Copy Markdown
Author

rbarrue commented Apr 27, 2026

Closing, as it is superseded by #120, which is equal to this but with a fixed commit history.

@rbarrue rbarrue closed this Apr 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants