Conversation
This commit includes a function to produce gifs from multiband lightcurves, and some helpers. It will need some adaptation for the rest of pgmuvi to interact with it more effectively.
There was a problem hiding this comment.
Pull request overview
This pull request adds visualization capabilities for multiband lightcurve data by introducing a new pgmuvi.viz module with GIF generation functionality, along with a new pgmuvi.utils module containing helper functions. The PR enables phase-folded SED (Spectral Energy Distribution) GIF creation from multiband data, which will help astronomers visualize variability across different wavelengths over time.
Changes:
- Added
pgmuvi/viz/gif.pywithmake_sed_gif_phasefunction for generating phase-folded SED animations - Added
pgmuvi/utils/helpers.pywith utility functions for directory creation, magnitude detection, and error cleaning - Created module initialization files for both new subpackages
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 19 comments.
| File | Description |
|---|---|
| pgmuvi/viz/gif.py | Implements main GIF generation function with phase-folding logic and helper functions; includes misplaced CLI harmonization function |
| pgmuvi/viz/init.py | Exports the main visualization function |
| pgmuvi/utils/helpers.py | Provides utility functions for file operations and data validation |
| pgmuvi/utils/init.py | Empty initialization file for utils subpackage |
| if mu.size: | ||
| y_frame[i] = float(np.nanmean(mu)) | ||
| overlay_vals = [] | ||
| if overlay_dphi and overlay_dphi > 0: |
There was a problem hiding this comment.
The condition overlay_dphi and overlay_dphi > 0 is redundant. If overlay_dphi is a float, checking if it's truthy (non-zero) is sufficient since the second check already ensures it's positive. If overlay_dphi can be None, it would be clearer to write: if overlay_dphi is not None and overlay_dphi > 0:.
However, looking at the function signature, overlay_dphi is typed as float (line 24), so it cannot be None, making the truthiness check even more confusing.
| def make_sed_gif_phase(object_id: str, | ||
| lam_unique: np.ndarray, | ||
| band_predictors: list, | ||
| t: np.ndarray, y: np.ndarray, yerr: np.ndarray, | ||
| period_days: float, t0: float, rcfg, outdir: str, | ||
| is_mag_val: bool, overlay_dphi: float, n_phase: int, | ||
| cycles: int | None, model_label: str, | ||
| meta_sidecar: dict | None=None, | ||
| lam_full: np.ndarray=None) -> str: | ||
| """ | ||
| Build a phase-folded SED GIF. | ||
| - Model drives the SED at every phase for every band (smooth evolution). | ||
| - Optional overlay: one *median* data point per band per frame, | ||
| computed from observations with phase within ±overlay_dphi (in phase). | ||
| - Y-limits are computed robustly from DATA ONLY (same policy as time-GIF). | ||
| """ | ||
| if lam_full is None: | ||
| raise ValueError('make_sed_gif_phase requires lam_full (wavelength per data point).') # noqa: E501 | ||
| finite_y = np.isfinite(y) | ||
| if rcfg.sed_loglog and (not is_mag_val): | ||
| base = y[finite_y & (y > 0)] | ||
| if base.size: | ||
| y_p1, y_p99 = np.nanpercentile(base, [1.0, 99.0]) | ||
| y_lo, y_hi = (max(y_p1 / 1.5, 1e-06), y_p99 * 1.5) | ||
| if not np.isfinite(y_hi) or y_hi <= y_lo: | ||
| y_lo, y_hi = (1e-06, 1.0) | ||
| else: | ||
| y_lo, y_hi = (1e-06, 1.0) | ||
| else: | ||
| base = y[finite_y] | ||
| if base.size: | ||
| y_p1, y_p99 = np.nanpercentile(base, [1.0, 99.0]) | ||
| pad = 0.05 * (y_p99 - y_p1) if np.isfinite(y_p99 - y_p1) else 1.0 | ||
| y_lo = float(y_p1 - pad) | ||
| y_hi = float(y_p99 + pad) | ||
| if not np.isfinite(y_hi) or y_hi <= y_lo: | ||
| y_lo, y_hi = (float(np.nanmin(base) - 1.0), | ||
| float(np.nanmax(base) + 1.0)) | ||
| else: | ||
| y_lo, y_hi = (-1.0, 1.0) | ||
| phases = np.linspace(0.0, 1.0, int(max(4, n_phase)), endpoint=False) | ||
| if rcfg.phase_no_average: | ||
| k_list = [int(np.round((np.nanmedian(t) - t0) / period_days))] | ||
| else: | ||
| tmin, tmax = (float(np.nanmin(t)), float(np.nanmax(t))) | ||
| k0 = int(np.floor((tmin - t0) / period_days)) - 1 | ||
| k1 = int(np.ceil((tmax - t0) / period_days)) + 1 | ||
| k_range = list(range(k0, k1 + 1)) | ||
| if cycles is not None and cycles > 0 and (len(k_range) > cycles): | ||
| k_center = int(np.round((np.nanmedian(t) - t0) / period_days)) | ||
| half = cycles // 2 | ||
| k_list = list(range(k_center - half, k_center - half + cycles)) | ||
| else: | ||
| k_list = k_range | ||
| frames = [] | ||
| for phi in phases: | ||
| y_frame = np.full(len(lam_unique), np.nan, float) | ||
| for i in range(len(lam_unique)): | ||
| t_eval = np.asarray([t0 + (k + phi) * period_days for k in k_list], float) | ||
| mu = np.asarray(band_predictors[i](t_eval), float) | ||
| mu = mu[np.isfinite(mu)] | ||
| if mu.size: | ||
| y_frame[i] = float(np.nanmean(mu)) | ||
| overlay_vals = [] | ||
| if overlay_dphi and overlay_dphi > 0: | ||
| ph_all = _phase_of(t, period_days, t0) | ||
| lo = (phi - overlay_dphi) % 1.0 | ||
| hi = (phi + overlay_dphi) % 1.0 | ||
| in_bin = ((ph_all >= lo) & (ph_all <= hi) | ||
| if hi >= lo else (ph_all >= lo) | (ph_all <= hi)) | ||
| for lv in lam_unique: | ||
| m_band = lam_full == lv | ||
| sel = in_bin & m_band | ||
| if np.any(sel): | ||
| y_med = np.nanmedian(y[sel]) | ||
| eb_clean, _ = _clean_errors(yerr[sel], 0.0) | ||
| e_med = np.nanmedian(eb_clean) | ||
| overlay_vals.append((lv, float(y_med), float(e_med))) | ||
| fig, ax = plt.subplots(figsize=(6, 4)) | ||
| valid = np.isfinite(y_frame) | ||
| lam_plot = lam_unique[valid] | ||
| y_plot = y_frame[valid] | ||
| if rcfg.sed_loglog and (not is_mag_val): | ||
| pos = y_plot > 0 | ||
| lam_plot = lam_plot[pos] | ||
| y_plot = y_plot[pos] | ||
| ax.set_xscale('log') | ||
| ax.set_yscale('log') | ||
| ax.plot(lam_plot, y_plot, '-o', lw=2, ms=4, label='model') | ||
| if overlay_dphi and overlay_vals: | ||
| for j, (lv, y_med, e_med) in enumerate(overlay_vals): | ||
| if rcfg.sed_loglog and (not is_mag_val) and (y_med <= 0): | ||
| continue | ||
| ax.errorbar([lv], [y_med], yerr=[e_med], fmt='o', ms=4, alpha=0.85, | ||
| label='data (median)' if j == 0 else None) | ||
| ax.set_ylim(y_lo, y_hi) | ||
| ax.set_xlabel('Wavelength (μm)') | ||
| ax.set_ylabel('Magnitude' if is_mag_val else 'Flux') | ||
| ax.set_title(f'{object_id}: SED @ phase={phi:0.2f} (P={period_days:.2f} d, t0={t0:.1f}, {model_label})') # noqa: E501 | ||
| fig.tight_layout() | ||
| buf = io.BytesIO() | ||
| fig.savefig(buf, format='png', dpi=150) | ||
| plt.close(fig) | ||
| buf.seek(0) | ||
| frames.append(imageio.imread(buf)) | ||
| gif_path = os.path.join(outdir, | ||
| f'{object_id}_sed_phase_P{period_days:.2f}_t0{t0:.1f}.gif') | ||
| imageio.mimsave(gif_path, frames, duration=0.12) | ||
| meta = { | ||
| 'object_id': object_id, | ||
| 'P_days': period_days, | ||
| 't0': t0, | ||
| 'n_phase': n_phase, | ||
| 'overlay_dphi': overlay_dphi, | ||
| 'cycles': None if cycles is None else int(cycles), | ||
| 'model_label': model_label, | ||
| 'sed_loglog': bool(rcfg.sed_loglog), | ||
| } | ||
| if meta_sidecar: | ||
| meta |= meta_sidecar | ||
| meta['provenance'] = meta.get('provenance', | ||
| meta.get('period_provenance', '')) or '' | ||
| with contextlib.suppress(Exception): | ||
| _debug_dump_json(os.path.join(outdir, f'{object_id}_sed_phase_meta.json'), meta) | ||
| return gif_path |
There was a problem hiding this comment.
The new visualization functionality in make_sed_gif_phase lacks test coverage. The existing test suite in tests/tests.py has tests for other modules (Transformer, Lightcurve, etc.), but there are no tests for the new viz module.
Consider adding tests that:
- Verify the function generates a GIF file at the expected path
- Test edge cases like empty data arrays, NaN handling, and phase wrapping
- Verify the JSON metadata is written correctly
- Test the helper functions
_phase_ofand_debug_dump_json
| log = logging.getLogger("pgmuvi.viz.gif") | ||
|
|
||
| def make_sed_gif_phase(object_id: str, | ||
| lam_unique: np.ndarray, | ||
| band_predictors: list, | ||
| t: np.ndarray, y: np.ndarray, yerr: np.ndarray, | ||
| period_days: float, t0: float, rcfg, outdir: str, |
There was a problem hiding this comment.
The rcfg parameter lacks a type annotation, making it unclear what type of object is expected. This makes the function harder to use and understand. Based on the usage (accessing attributes like rcfg.sed_loglog and rcfg.phase_no_average), it appears to be a configuration object, but the type should be specified or at least documented in the docstring.
| log = logging.getLogger("pgmuvi.viz.gif") | |
| def make_sed_gif_phase(object_id: str, | |
| lam_unique: np.ndarray, | |
| band_predictors: list, | |
| t: np.ndarray, y: np.ndarray, yerr: np.ndarray, | |
| period_days: float, t0: float, rcfg, outdir: str, | |
| from typing import Protocol | |
| log = logging.getLogger("pgmuvi.viz.gif") | |
| class GifPhaseConfig(Protocol): | |
| """Configuration options required by ``make_sed_gif_phase``.""" | |
| sed_loglog: bool | |
| phase_no_average: bool | |
| def make_sed_gif_phase(object_id: str, | |
| lam_unique: np.ndarray, | |
| band_predictors: list, | |
| t: np.ndarray, y: np.ndarray, yerr: np.ndarray, | |
| period_days: float, t0: float, rcfg: GifPhaseConfig, | |
| outdir: str, |
| base = y[finite_y & (y > 0)] | ||
| if base.size: | ||
| y_p1, y_p99 = np.nanpercentile(base, [1.0, 99.0]) | ||
| y_lo, y_hi = (max(y_p1 / 1.5, 1e-06), y_p99 * 1.5) | ||
| if not np.isfinite(y_hi) or y_hi <= y_lo: | ||
| y_lo, y_hi = (1e-06, 1.0) | ||
| else: | ||
| y_lo, y_hi = (1e-06, 1.0) | ||
| else: | ||
| base = y[finite_y] | ||
| if base.size: | ||
| y_p1, y_p99 = np.nanpercentile(base, [1.0, 99.0]) | ||
| pad = 0.05 * (y_p99 - y_p1) if np.isfinite(y_p99 - y_p1) else 1.0 | ||
| y_lo = float(y_p1 - pad) | ||
| y_hi = float(y_p99 + pad) | ||
| if not np.isfinite(y_hi) or y_hi <= y_lo: | ||
| y_lo, y_hi = (float(np.nanmin(base) - 1.0), | ||
| float(np.nanmax(base) + 1.0)) |
There was a problem hiding this comment.
The variable name base is ambiguous in this context. It represents the subset of valid y-values used for computing percentiles, not a "base" value. Consider renaming to something more descriptive like valid_y_values or y_for_limits to improve code clarity.
| base = y[finite_y & (y > 0)] | |
| if base.size: | |
| y_p1, y_p99 = np.nanpercentile(base, [1.0, 99.0]) | |
| y_lo, y_hi = (max(y_p1 / 1.5, 1e-06), y_p99 * 1.5) | |
| if not np.isfinite(y_hi) or y_hi <= y_lo: | |
| y_lo, y_hi = (1e-06, 1.0) | |
| else: | |
| y_lo, y_hi = (1e-06, 1.0) | |
| else: | |
| base = y[finite_y] | |
| if base.size: | |
| y_p1, y_p99 = np.nanpercentile(base, [1.0, 99.0]) | |
| pad = 0.05 * (y_p99 - y_p1) if np.isfinite(y_p99 - y_p1) else 1.0 | |
| y_lo = float(y_p1 - pad) | |
| y_hi = float(y_p99 + pad) | |
| if not np.isfinite(y_hi) or y_hi <= y_lo: | |
| y_lo, y_hi = (float(np.nanmin(base) - 1.0), | |
| float(np.nanmax(base) + 1.0)) | |
| valid_y_values = y[finite_y & (y > 0)] | |
| if valid_y_values.size: | |
| y_p1, y_p99 = np.nanpercentile(valid_y_values, [1.0, 99.0]) | |
| y_lo, y_hi = (max(y_p1 / 1.5, 1e-06), y_p99 * 1.5) | |
| if not np.isfinite(y_hi) or y_hi <= y_lo: | |
| y_lo, y_hi = (1e-06, 1.0) | |
| else: | |
| y_lo, y_hi = (1e-06, 1.0) | |
| else: | |
| valid_y_values = y[finite_y] | |
| if valid_y_values.size: | |
| y_p1, y_p99 = np.nanpercentile(valid_y_values, [1.0, 99.0]) | |
| pad = 0.05 * (y_p99 - y_p1) if np.isfinite(y_p99 - y_p1) else 1.0 | |
| y_lo = float(y_p1 - pad) | |
| y_hi = float(y_p99 + pad) | |
| if not np.isfinite(y_hi) or y_hi <= y_lo: | |
| y_lo, y_hi = (float(np.nanmin(valid_y_values) - 1.0), | |
| float(np.nanmax(valid_y_values) + 1.0)) |
| lam_unique: np.ndarray, | ||
| band_predictors: list, | ||
| t: np.ndarray, y: np.ndarray, yerr: np.ndarray, | ||
| period_days: float, t0: float, rcfg, outdir: str, |
There was a problem hiding this comment.
The outdir parameter should be validated to ensure it exists or can be created. The function attempts to write files to this directory (lines 124-126, 142) but doesn't verify the directory exists. If the directory doesn't exist, os.path.join will succeed but imageio.mimsave and file operations will fail with unclear errors.
Consider adding a check at the beginning of the function: os.makedirs(outdir, exist_ok=True) or using the ensure_dir helper from pgmuvi.utils.helpers.
| lam_full: np.ndarray=None) -> str: | ||
| """ | ||
| Build a phase-folded SED GIF. | ||
| - Model drives the SED at every phase for every band (smooth evolution). | ||
| - Optional overlay: one *median* data point per band per frame, | ||
| computed from observations with phase within ±overlay_dphi (in phase). | ||
| - Y-limits are computed robustly from DATA ONLY (same policy as time-GIF). | ||
| """ | ||
| if lam_full is None: | ||
| raise ValueError('make_sed_gif_phase requires lam_full (wavelength per data point).') # noqa: E501 |
There was a problem hiding this comment.
The parameter lam_full has a default value of None but immediately raises a ValueError if it's None. This is misleading API design - if the parameter is required, it should not have a default value.
Remove =None from the parameter definition to make it clear that this is a required parameter.
| fig, ax = plt.subplots(figsize=(6, 4)) | ||
| valid = np.isfinite(y_frame) | ||
| lam_plot = lam_unique[valid] | ||
| y_plot = y_frame[valid] | ||
| if rcfg.sed_loglog and (not is_mag_val): | ||
| pos = y_plot > 0 | ||
| lam_plot = lam_plot[pos] | ||
| y_plot = y_plot[pos] | ||
| ax.set_xscale('log') | ||
| ax.set_yscale('log') | ||
| ax.plot(lam_plot, y_plot, '-o', lw=2, ms=4, label='model') | ||
| if overlay_dphi and overlay_vals: | ||
| for j, (lv, y_med, e_med) in enumerate(overlay_vals): | ||
| if rcfg.sed_loglog and (not is_mag_val) and (y_med <= 0): | ||
| continue | ||
| ax.errorbar([lv], [y_med], yerr=[e_med], fmt='o', ms=4, alpha=0.85, | ||
| label='data (median)' if j == 0 else None) | ||
| ax.set_ylim(y_lo, y_hi) | ||
| ax.set_xlabel('Wavelength (μm)') | ||
| ax.set_ylabel('Magnitude' if is_mag_val else 'Flux') | ||
| ax.set_title(f'{object_id}: SED @ phase={phi:0.2f} (P={period_days:.2f} d, t0={t0:.1f}, {model_label})') # noqa: E501 | ||
| fig.tight_layout() | ||
| buf = io.BytesIO() | ||
| fig.savefig(buf, format='png', dpi=150) |
There was a problem hiding this comment.
The hardcoded figure size figsize=(6, 4) and DPI dpi=150 may not be optimal for all use cases. Consider making these configurable parameters or documenting why these specific values were chosen. Different output contexts (presentations, papers, web) may require different dimensions.
| frames.append(imageio.imread(buf)) | ||
| gif_path = os.path.join(outdir, | ||
| f'{object_id}_sed_phase_P{period_days:.2f}_t0{t0:.1f}.gif') | ||
| imageio.mimsave(gif_path, frames, duration=0.12) |
There was a problem hiding this comment.
The hardcoded GIF duration duration=0.12 (120ms per frame, ~8 fps) may not be suitable for all cases. Consider making this a configurable parameter. Different numbers of phases might benefit from different frame rates - more phases might need slower playback for clarity, while fewer phases might work better with faster playback.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
This commit includes a function to produce gifs from multiband lightcurves, and some helpers. It will need some adaptation for the rest of pgmuvi to interact with it more effectively.