Add Lightcurve.acf() and plot_acf() for data-driven and GP-implied autocorrelation #146
Conversation
Commits:
- Agent-Logs-Url: https://github.com/ICSM/pgmuvi/sessions/26a6281f-1a5e-42c3-85e0-d816898da200 (Co-authored-by: sundarjhu <15619959+sundarjhu@users.noreply.github.com>)
- …D GP guard, fitted-state check, mock tests (Agent-Logs-Url: https://github.com/ICSM/pgmuvi/sessions/8ee59df0-dfab-4d1d-b0de-42d7488907c0; Co-authored-by: sundarjhu <15619959+sundarjhu@users.noreply.github.com>)
- …nce with normalize=True (Agent-Logs-Url: https://github.com/ICSM/pgmuvi/sessions/8e79cae9-eadf-4596-887d-3985c1bb6447; Co-authored-by: sundarjhu <15619959+sundarjhu@users.noreply.github.com>)
Pull request overview
Adds an autocorrelation-function API to pgmuvi.Lightcurve, supporting both a data-driven estimator (pairwise lag binning for uneven sampling) and a GP-implied ACF derived from the fitted covariance model, plus plotting and tests.
Changes:
- Introduces `ACFResult` plus `Lightcurve.acf()` and `Lightcurve.plot_acf()`.
- Implements `_acf_data()` (pairwise binned estimator) and `_acf_gp()` (covariance-derived ACF).
- Adds a new `tests/test_acf.py` covering core behaviors, 2D preconditions, and constant-series normalization errors.
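For orientation, a hypothetical usage sketch. `acf()`, `plot_acf()`, `n_lags`, and `normalize` come from the PR itself; the constructor call, the `method` keyword, and the `ACFResult` field names (`lags`, `acf`, `counts`) are assumptions for illustration:

```python
import torch
from pgmuvi.lightcurve import Lightcurve

# Unevenly sampled synthetic light curve.
t = torch.sort(torch.rand(200) * 100.0).values
y = torch.sin(2 * torch.pi * t / 12.0) + 0.1 * torch.randn(200)

# Constructor arguments shown positionally; see pgmuvi's Lightcurve docs.
lc = Lightcurve(t, y)

# Data-driven estimate: pairwise products of mean-subtracted fluxes,
# binned by |t_j - t_i|.  The `method` keyword is an assumption here.
result = lc.acf(method="data", n_lags=20, normalize=True)

# ACFResult field names are assumptions for this sketch.
print(result.lags, result.acf, result.counts)

# Quick-look plot of the same estimate.
lc.plot_acf(method="data", n_lags=20)
```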
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| `pgmuvi/lightcurve.py` | Adds the ACF result container and the `acf()`/`plot_acf()` implementations (data + GP paths). |
| `tests/test_acf.py` | Adds unit tests for the new ACF API, including error cases and basic plotting. |
```python
self._validate_n_lags(n_lags)

t = self._get_time_axis().double()
y = self.ydata.double()
```
Already implemented in commit 0e137aa: `_validate_n_lags()` is now called only inside the `lag_edges is None` branch, so passing `lag_edges` with an irrelevant `n_lags` value no longer raises.
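For context, a minimal standalone sketch of that guard pattern (the helper name, default lag span, and the inline validation are assumptions; only the `lag_edges is None` branching mirrors the diff):

```python
import torch

def resolve_lag_edges(t, n_lags=None, lag_edges=None):
    """Sketch: n_lags is validated only when default bins are built."""
    if lag_edges is None:
        # Stand-in for _validate_n_lags(); only meaningful on this branch.
        if n_lags is None or n_lags < 1:
            raise ValueError("n_lags must be a positive integer.")
        # Default span (half the baseline) is an assumption for this sketch.
        max_lag = 0.5 * (t.max() - t.min()).item()
        return torch.linspace(0.0, max_lag, n_lags + 1, dtype=torch.float64)
    # Caller-supplied edges win; n_lags is never inspected, so a stale
    # value can no longer raise.
    return torch.as_tensor(lag_edges, dtype=torch.float64)
```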
```python
raise ValueError(
    "lag_edges must be strictly increasing."
)
max_lag = float(edges[-1])
```
Already implemented in commit 0e137aa: `float(edges[-1])` replaced with `edges[-1].detach().item()`.
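The distinction, in a minimal standalone example (both forms yield a Python float; the rewrite just makes the two steps explicit):

```python
import torch

edges = torch.linspace(0.0, 10.0, 11, requires_grad=True)

# .detach() leaves the autograd graph, .item() extracts the scalar;
# float(edges[-1]) does the same implicitly for a one-element tensor.
max_lag = edges[-1].detach().item()   # -> 10.0, a plain Python float
```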
```python
# ------------------------------------------------------------------
# Zero-lag entry: all n self-pairs
# ------------------------------------------------------------------
variance = float((y_centered**2).mean())
t_ref = float(t.mean()) if reference_time is None else float(reference_time)
```
Already implemented in commit 0e137aa: `float(t.mean())` replaced with `t.mean().detach().item()`.
```python
if normalize:
    x_ref = x1[:1]
    lazy_var = self.model.covar_module(x_ref, x_ref)
    variance = float(lazy_var.diagonal()[0].detach())
```
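For reference, the covariance-derived ACF reduces to evaluating the fitted kernel at a grid of lags and dividing by the zero-lag variance, r(τ) = k(τ)/k(0). A minimal sketch assuming a stationary gpytorch kernel (the function name, anchor point, and `.to_dense()` usage are illustrative, not the PR's code):

```python
import torch

def gp_implied_acf(covar_module, lags, normalize=True):
    """Sketch: evaluate a stationary gpytorch kernel at a grid of lags."""
    lags = torch.as_tensor(lags, dtype=torch.float64).reshape(-1, 1)
    x_ref = torch.zeros(1, 1, dtype=torch.float64)       # arbitrary anchor point
    # k(x_ref, x_ref + tau) for each lag tau; shape (1, n_lags) once dense.
    k_tau = covar_module(x_ref, x_ref + lags).to_dense().squeeze(0)
    if normalize:
        # r(tau) = k(tau) / k(0): unit value at zero lag by construction.
        k_zero = covar_module(x_ref, x_ref).to_dense().squeeze()
        k_tau = k_tau / k_zero
    return k_tau.detach()
```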
Commits:
- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
- … only when lag_edges is None (Agent-Logs-Url: https://github.com/ICSM/pgmuvi/sessions/abc3876e-6f2d-442d-85a8-cc0a6e833166; Co-authored-by: sundarjhu <15619959+sundarjhu@users.noreply.github.com>)
```python
# ------------------------------------------------------------------
# Vectorised pairwise computation (i < j pairs)
# ------------------------------------------------------------------
idx_i, idx_j = torch.triu_indices(n, n, offset=1, device=t.device)
dt = (t[idx_j] - t[idx_i]).abs()
vals = y_centered[idx_i] * y_centered[idx_j]

# Keep only pairs within max_lag
in_range = dt <= max_lag
dt = dt[in_range]
vals = vals[in_range]

# Bin dt into lag bins
bin_idx = torch.searchsorted(edges, dt) - 1
valid_bins = (bin_idx >= 0) & (bin_idx < n_bins)
bin_idx = bin_idx[valid_bins]
vals = vals[valid_bins]

bin_sums = torch.zeros(n_bins, dtype=torch.float64, device=t.device)
bin_counts = torch.zeros(n_bins, dtype=torch.float64, device=t.device)
bin_sums.scatter_add_(0, bin_idx, vals)
bin_counts.scatter_add_(0, bin_idx, torch.ones_like(vals))

# Normalise by counts
valid = bin_counts > 0
acf_vals = torch.zeros(n_bins, dtype=torch.float64, device=t.device)
acf_vals[valid] = bin_sums[valid] / bin_counts[valid]

if normalize:
    acf_vals = acf_vals / variance
    zero_lag_val = 1.0

lag_centers = (edges[:-1] + edges[1:]) / 2.0

# Prepend zero-lag point
zero_lag_tensor = torch.zeros(1, dtype=torch.float64, device=t.device)
zero_acf_tensor = torch.tensor(
    [zero_lag_val], dtype=torch.float64, device=t.device
)
zero_count_tensor = torch.tensor(
    [float(n)], dtype=torch.float64, device=t.device
)

full_lag = torch.cat([zero_lag_tensor, lag_centers])
full_acf = torch.cat([zero_acf_tensor, acf_vals])
full_counts = torch.cat([zero_count_tensor, bin_counts])
```
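As an independent sanity check of the same pairwise-binning scheme, a standalone sketch (not the PR's test suite): white noise on a regular grid should give a near-zero ACF at every nonzero lag.

```python
import torch

torch.manual_seed(0)
n = 500
t = torch.arange(n, dtype=torch.float64)          # regular sampling as a sanity case
y = torch.randn(n, dtype=torch.float64)
yc = y - y.mean()

# All i < j pairs, products of centred values, binned by |t_j - t_i|.
idx_i, idx_j = torch.triu_indices(n, n, offset=1)
dt = (t[idx_j] - t[idx_i]).abs()
vals = yc[idx_i] * yc[idx_j]

# Unit-width bins centred on integer lags 1..10.
edges = torch.arange(0.5, 11.5, 1.0, dtype=torch.float64)
bin_idx = torch.searchsorted(edges, dt) - 1
keep = (bin_idx >= 0) & (bin_idx < 10)
sums = torch.zeros(10, dtype=torch.float64).scatter_add_(0, bin_idx[keep], vals[keep])
counts = torch.zeros(10, dtype=torch.float64).scatter_add_(
    0, bin_idx[keep], torch.ones_like(vals[keep]))

# For white noise, every entry should be close to zero.
acf = (sums / counts) / yc.pow(2).mean()
print(acf)
```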
I notice this is the brute-force ACF calculation, which is an O(n²) algorithm. Is it worth switching to one of the O(n log n) algorithms? For example, pyro has one (https://docs.pyro.ai/en/stable/ops.html#pyro.ops.stats.autocorrelation), and since pyro is a (future) dependency for MCMC anyway, is it worth just using that? The complication is that that implementation doesn't know about the time sampling of the data. If that's a bottleneck, the FFT algorithm could perhaps be implemented using finufft? This might be a problem for a future issue, however.
Good point. I was working with 100-point LCs so I didn't even consider this. Which of the two would you like to consider as a replacement (or offer the user a choice)? We can add it as a feature request.
I guess we have irregularly-sampled times, so the pyro implementation is probably not an option? @copilot please comment on the practicality of different ways to implement O(n logn) ACF calculations, using either/both FFTs or convolutions instead of brute-force calculations.
You're right that the current estimator is O(n²). Here's a summary of the O(n log n) options for irregularly-sampled data:
- Pyro's `autocorrelation` uses an FFT via the Wiener–Khinchin theorem, but assumes evenly-spaced samples, so it's not directly applicable here without gridding first.
- Grid + FFT (O(n log n) after gridding): snap or kernel-average the irregular samples onto a regular time grid, then use `torch.fft.rfft`/`irfft` on the gridded series. This is practical and requires no new dependencies. The main caveat is that the gridding step introduces a bias proportional to the gap between the true sample times and the grid nodes, which worsens in sparsely or irregularly sampled regions.
- NUFFT via finufft computes the autocorrelation directly from the irregular sample positions (essentially evaluating the structure function on a frequency grid without gridding). It achieves true O(n log n) with no gridding bias, but adds finufft as a new dependency, and the API is considerably more involved.
- Lomb–Scargle + inverse FFT: the LS periodogram is already an O(n log n) spectral estimate for unevenly-sampled data, and by the Wiener–Khinchin theorem the ACF is (approximately) its inverse Fourier transform. Since pgmuvi already has `multiband_ls_significance.py`, this could reuse existing infrastructure. The catch is that the LS window function introduces sidelobes in the recovered ACF unless a debiasing step is added.

Recommendation for a future issue: for the typical astronomical light curve (n ~ 10²–10³), the O(n²) cost is rarely the bottleneck in practice. When it becomes one, the gridding + FFT route is the lowest-friction path (no new dependencies, easily expressed in pure PyTorch). The LS-based route is more principled for very sparse or irregular sampling. I'd suggest filing this as a feature request and noting gridding + FFT as the preferred first step, with finufft as the stretch goal for truly publication-quality bias control.
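To make the gridding + FFT route concrete, a minimal sketch assuming nearest-node snapping and zero-filled empty nodes (the helper name, grid size, and defaults are all illustrative; a production version would also track per-bin counts and debias empty nodes):

```python
import torch

def gridded_acf(t, y, n_grid=1024, n_lags=50):
    """O(n log n) ACF sketch: snap irregular samples to a grid, then FFT.

    Illustrates the Wiener-Khinchin route discussed above; the snapping
    step biases lags by up to half a grid cell, so n_grid sets the
    bias/speed trade-off.  Not the PR's implementation.
    """
    t = torch.as_tensor(t, dtype=torch.float64)
    y = torch.as_tensor(y, dtype=torch.float64)
    yc = y - y.mean()

    # Nearest-node gridding, averaging samples that land on the same node.
    frac = (t - t.min()) / (t.max() - t.min())
    idx = torch.round(frac * (n_grid - 1)).long()
    grid = torch.zeros(n_grid, dtype=torch.float64).scatter_add_(0, idx, yc)
    counts = torch.zeros(n_grid, dtype=torch.float64).scatter_add_(
        0, idx, torch.ones_like(yc)
    )
    filled = counts > 0
    grid[filled] = grid[filled] / counts[filled]  # empty nodes stay at zero

    # Wiener-Khinchin: the autocovariance is the inverse FFT of the power
    # spectrum.  Zero-pad to 2 * n_grid to avoid circular wrap-around.
    spec = torch.fft.rfft(grid, n=2 * n_grid)
    acov = torch.fft.irfft(spec * spec.conj(), n=2 * n_grid)[:n_lags]
    # Lag k corresponds to k * (t.max() - t.min()) / (n_grid - 1) time units.
    return acov / acov[0]  # normalized so r(0) = 1
```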
Remaining review items:
- `acf()` docstring: clarify the `band` param (required for 2D + data; 2D + gp is unsupported and raises `NotImplementedError`)
- `_acf_data()` when `normalize=True`