Skip to content

Add missing audit_spike_sorting.py to fix CI spike detection step#7

Merged
DaScient merged 3 commits intocopilot/fix-test-collection-errorfrom
copilot/fix-missing-audit-spike-sorting-file
Apr 10, 2026
Merged

Add missing audit_spike_sorting.py to fix CI spike detection step#7
DaScient merged 3 commits intocopilot/fix-test-collection-errorfrom
copilot/fix-missing-audit-spike-sorting-file

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 10, 2026

The bio-digital-ci-cd.yml workflow references src/signal_processing/audit_spike_sorting.py in the "Spike Detection Quality Audit" step, but the file never existed — causing every CI run to fail with [Errno 2] No such file or directory.

Changes

  • New script: src/signal_processing/audit_spike_sorting.py
    • Generates synthetic extracellular traces with ground-truth spike positions injected into Gaussian noise
    • Runs a threshold-based detector mirroring the SSD logic in template_matching.rs
    • Matches detections to ground-truth spikes within a ±30 frame (±1 ms @ 30 kHz) tolerance window
    • Asserts false-negative and false-positive rates stay ≤ 10%; exits non-zero on failure so CI fails loudly on regressions
    • Covers 4 benchmark scenarios: single spike (low/high noise), multiple spikes, and dense spike trains
# Example benchmark entry
("Dense spikes — medium noise",
 list(range(BENCHMARK_SPIKE_INTERVAL_FRAMES, BENCHMARK_DURATION_FRAMES, BENCHMARK_SPIKE_INTERVAL_FRAMES)),
 BENCHMARK_DURATION_FRAMES, 0.10, 0.9)

Follows the same structure and exit-code contract as the existing verify_latency.py audit script.

Copilot AI changed the title [WIP] Fix missing audit_spike_sorting.py file in project Add missing audit_spike_sorting.py to fix CI spike detection step Apr 10, 2026
Copilot AI requested a review from DaScient April 10, 2026 00:47
@DaScient DaScient marked this pull request as ready for review April 10, 2026 01:17
Copilot AI review requested due to automatic review settings April 10, 2026 01:18
@DaScient DaScient merged commit 981f4e4 into copilot/fix-test-collection-error Apr 10, 2026
4 checks passed
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds the missing spike-sorting audit script that the bio-digital-ci-cd.yml workflow already invokes, unblocking the “Spike Detection Quality Audit” CI step and providing a synthetic benchmark-based regression check for event detection.

Changes:

  • Introduces src/signal_processing/audit_spike_sorting.py to generate synthetic traces with injected spikes and evaluate detection quality (FN/FP rates).
  • Implements a simple threshold-crossing detector plus benchmark scenarios and CI-friendly exit codes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +71 to +77
"""
Detect positive threshold crossings with a refractory period.

Returns a list of frame indices at which a spike was detected.
This mirrors the SSD logic in RealTimeSpikeSorter: when the signal
exceeds the threshold the window is flagged as an event.
"""
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The detector implemented here is a simple amplitude threshold crossing with a refractory period, but both the module docstring and this function’s docstring claim it “mirrors” the SSD-based template matcher in template_matching.rs (which slides a fixed window and checks SSD < variance_tolerance, and has no refractory). This mismatch makes the audit less representative of the Rust implementation and can allow regressions in SSD logic to slip through; either implement an SSD-style windowed matcher against the synthetic waveform template, or update the documentation/references so the audit’s scope is accurate.

Copilot uses AI. Check for mistakes.
Comment on lines +26 to +35
DETECTION_TOLERANCE_FRAMES = 30 # ±30 frames (±1 ms at 30 kHz)
MAX_FALSE_NEGATIVE_RATE = 0.10 # at most 10 % of ground-truth spikes missed
MAX_FALSE_POSITIVE_RATE = 0.10 # at most 10 % of detections are spurious

SAMPLING_RATE_HZ = 30_000

# Benchmark timing constants (in frames at SAMPLING_RATE_HZ)
BENCHMARK_DURATION_FRAMES = 30_000 # 1 second
BENCHMARK_SPIKE_INTERVAL_FRAMES = 3_000 # 100 ms between spikes

Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SAMPLING_RATE_HZ is defined but not used, and several timing-related constants (tolerance, duration, interval) are hard-coded separately. To avoid inconsistencies if the sampling rate changes, derive these frame counts from SAMPLING_RATE_HZ (e.g., duration_frames = SAMPLING_RATE_HZ * seconds, tolerance = SAMPLING_RATE_HZ * 0.001) or remove SAMPLING_RATE_HZ if it’s intentionally informational-only.

Copilot uses AI. Check for mistakes.
("Single spike — low noise", [15_000], 30_000, 0.05, 0.9),
("Multiple spikes — low noise", [5_000, 12_000, 20_000, 27_000], 30_000, 0.05, 0.9),
("Single spike — high noise", [15_000], 30_000, 0.20, 0.9),
("Dense spikes — medium noise", list(range(BENCHMARK_SPIKE_INTERVAL_FRAMES, BENCHMARK_DURATION_FRAMES, BENCHMARK_SPIKE_INTERVAL_FRAMES)), BENCHMARK_DURATION_FRAMES, 0.10, 0.9),
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The “Dense spikes — medium noise” benchmark tuple is written as a single very long line, which hurts readability and makes diffs noisier when tweaking parameters. Consider formatting it across multiple lines (consistent with the other entries) so future edits stay clear.

Suggested change
("Dense spikes — medium noise", list(range(BENCHMARK_SPIKE_INTERVAL_FRAMES, BENCHMARK_DURATION_FRAMES, BENCHMARK_SPIKE_INTERVAL_FRAMES)), BENCHMARK_DURATION_FRAMES, 0.10, 0.9),
(
"Dense spikes — medium noise",
list(
range(
BENCHMARK_SPIKE_INTERVAL_FRAMES,
BENCHMARK_DURATION_FRAMES,
BENCHMARK_SPIKE_INTERVAL_FRAMES,
)
),
BENCHMARK_DURATION_FRAMES,
0.10,
0.9,
),

Copilot uses AI. Check for mistakes.
Comment on lines +53 to +57

for pos in spike_positions:
end = pos + len(waveform)
if end <= duration_frames:
trace[pos:end] += waveform
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_make_synthetic_trace silently skips injecting spikes whose waveform would extend past duration_frames (end > duration_frames). For an audit script this can mask misconfigured benchmarks and produce confusing FN/FP results. Consider validating spike_positions upfront (raise ValueError or log an explicit error) instead of silently ignoring out-of-bounds spikes.

Suggested change
for pos in spike_positions:
end = pos + len(waveform)
if end <= duration_frames:
trace[pos:end] += waveform
max_start = duration_frames - len(waveform)
invalid_positions = [
pos for pos in spike_positions
if pos < 0 or (pos + len(waveform)) > duration_frames
]
if invalid_positions:
raise ValueError(
"spike_positions contains out-of-bounds spike starts for "
f"duration_frames={duration_frames} and waveform_len={len(waveform)}: "
f"{invalid_positions}. Valid start positions are in the range "
f"[0, {max_start}]."
)
for pos in spike_positions:
end = pos + len(waveform)
trace[pos:end] += waveform

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants