Merged
4 changes: 1 addition & 3 deletions .github/workflows/ci.yml
@@ -113,11 +113,9 @@ jobs:
uses: actions-rust-lang/audit@v1
with:
# RUSTSEC-2023-0071: rsa crate (from sqlx-mysql, not used - we use postgres only)
-# RUSTSEC-2025-0111: tokio-tar (from testcontainers, dev dependency only)
-# RUSTSEC-2026-0066: astral-tokio-tar (renamed tokio-tar, same issue, testcontainers dev dep)
# RUSTSEC-2025-0134: rustls-pemfile unmaintained (transitive dep, awaiting upstream fix)
# RUSTSEC-2026-0097: rand unsoundness with custom logger (no fix available; we don't use custom logger)
-ignore: RUSTSEC-2023-0071,RUSTSEC-2025-0111,RUSTSEC-2026-0066,RUSTSEC-2025-0134,RUSTSEC-2026-0097
+ignore: RUSTSEC-2023-0071,RUSTSEC-2025-0134,RUSTSEC-2026-0097
denyWarnings: false
createIssues: false

1 change: 1 addition & 0 deletions AGENTS.md
@@ -167,6 +167,7 @@ See `docs/adr/` for all 19 Architecture Decision Records.
- `POST /v1/signals/webhook/{name}` - Generic webhook adapter (configured in `correlation.yaml`; JSONPath field mapping; HMAC/bearer/none auth)
- `POST /v1/signals/corroborator` - Corroborating signal adapter (ADR 021). Sources configured with `mode: corroborating` post dimension-tagged signals that strengthen open signal groups without ever triggering mitigations on their own. Declared `match_dimensions` are authoritative: only declared dimensions are consulted during matching. Rejected with 400 if the source is unknown, `mode: primary`, or no declared dimension is populated. Correlation engine must be enabled.
- `GET /v1/signals/corroborator/activity?minutes=N` - Per-source corroborator activity summary aggregated across the live cache and attached signal-group rows. Used by the Signals dashboard so `mode: corroborating` sources surface realistic `last_seen`/`count` instead of always reading as "never seen".
+- `GET /v1/signals/corroborator/cache?source=X&limit=N` - **Admin only.** Lists corroborating signals currently cached and unattached + unexpired (i.e. waiting for a matching primary event to arrive within `window_seconds`). Used by the Correlation dashboard's Cache tab; pair with the `prefixd_corroborator_cache_size{source}` gauge for alerting on runaway caches.
- `GET /v1/config/correlation` - Correlation config (admin, secrets redacted)
- `PUT /v1/config/correlation` - Update correlation config (admin only, writes YAML + hot-reload)
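
The listing behaviour described for `GET /v1/signals/corroborator/cache` can be sketched in Rust as a filter over cached rows. This is an illustrative sketch, not the daemon's handler: `CachedSignal`, `clamp_limit`, `list_cached`, the epoch-second timestamps, and the default limit of 100 are assumptions. Only the unattached + unexpired filter, the optional `source` filter, and the 1..1000 clamp (stated in this PR's CHANGELOG entry) come from the diff.

```rust
/// Hypothetical in-memory stand-in for one cached corroborating signal row.
#[derive(Debug, Clone, PartialEq)]
struct CachedSignal {
    source: String,
    attached: bool,
    /// Expiry as seconds since the epoch (assumed representation).
    expires_at: u64,
}

/// Clamp a caller-supplied `limit` into 1..=1000.
/// The fallback of 100 when `limit` is absent is an assumption.
fn clamp_limit(limit: Option<i64>) -> usize {
    limit.unwrap_or(100).clamp(1, 1000) as usize
}

/// Return rows that are unattached and unexpired, optionally restricted
/// to a single source, truncated to the clamped limit.
fn list_cached<'a>(
    rows: &'a [CachedSignal],
    source: Option<&str>,
    limit: Option<i64>,
    now: u64,
) -> Vec<&'a CachedSignal> {
    rows.iter()
        .filter(|r| !r.attached && r.expires_at > now)
        .filter(|r| source.map_or(true, |s| r.source == s))
        .take(clamp_limit(limit))
        .collect()
}
```

Clamping rather than rejecting out-of-range values keeps a dashboard request with `limit=0` or an oversized `limit` from failing; whether the real handler clamps before or after the database query is not visible in this diff.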

9 changes: 9 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

+### Added
+
+- **Corroborating signals v2 (PR B)** — Follow-up to ADR 021's initial ship that addresses the four review-deferred items as a coordinated set:
+- **Playbook-aware late finalization.** Migration `012_signal_groups_playbook.sql` adds nullable `signal_groups.playbook_name`, populated by the daemon on the next primary event for each group via `COALESCE`. The corroborator-side aggregate recompute now re-resolves the playbook by name from live state and is allowed to flip `corroboration_met=true` using the override min_sources/threshold. Conservative fallback is preserved: a NULL or stale `playbook_name` still keeps the v0.16.0 no-flip behavior (the next primary event picks up the flag).
+- **Per-source attribution on `prefixd_corroborator_expired_total`.** The counter regains its `{source}` label set, with `delete_expired_corroborating_signals` collecting attribution in the same `DELETE … RETURNING` query that performs the delete. **Operator note:** this is a label change. PromQL queries written against the v0.16.0 unlabelled counter must add a `sum()` to recover the previous shape.
+- **New gauge `prefixd_corroborator_cache_size{source}`** updated by the reconcile loop after each sweep. Operators can alert on caches growing without bound (e.g. a source posting heavily while no matching primary event ever lands). Stale labels are explicitly zeroed when a source's cache drains between ticks.
+- **Cached-corroborators admin endpoint and dashboard panel.** `GET /v1/signals/corroborator/cache` (admin-only) returns `{ now, total, by_source[], signals[] }` filtered to unattached + unexpired rows, with optional `?source=` and `?limit=` (clamped to 1..1000). New "Cache" tab on the Correlation page renders per-source counts plus a dense table of cached signals with relative ingested/expires timestamps and dimension chips. Backed by a new `useCachedCorroborators` SWR hook (30s refresh).
+- **`CorroboratorResponse.cached` removed** in favor of the existing `status ∈ {attached, cached}` discriminator. The boolean was always `true` and added no information; status fully describes the outcome. Coordinated minor breaking change — bump to v0.17.0 — since the endpoint is new in this release line.
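
The stale-label zeroing described for `prefixd_corroborator_cache_size{source}` can be illustrated with a small Rust sketch. It assumes the reconcile loop remembers which source labels it set on the previous tick; `cache_size_updates` and both of its parameters are hypothetical names, and the real code presumably writes to a Prometheus gauge rather than returning a map.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Compute the value to set for every `source` label after a sweep.
///
/// `previously_labelled` holds the sources whose gauge was set on an earlier
/// tick (assumed bookkeeping); `current_counts` holds per-source cache sizes
/// observed now. Sources that were labelled before but have no cached rows
/// now are reported as 0 instead of being left at their last value.
fn cache_size_updates(
    previously_labelled: &BTreeSet<String>,
    current_counts: &BTreeMap<String, u64>,
) -> BTreeMap<String, u64> {
    let mut updates = current_counts.clone();
    for source in previously_labelled {
        // Only inserts when the source is absent from the current counts.
        updates.entry(source.clone()).or_insert(0);
    }
    updates
}
```

Without the explicit zero, a gauge label whose cache drained between ticks would keep exporting its last non-zero value, so an alert on cache growth could stay latched after the cache emptied.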

## [0.16.0] - 2026-04-19

### Added
81 changes: 39 additions & 42 deletions Cargo.lock

Some generated files are not rendered by default.

4 changes: 2 additions & 2 deletions Cargo.toml
@@ -84,8 +84,8 @@ prost-build = "0.14"
tokio-test = "0.4"
tempfile = "3"
criterion = { version = "0.8", features = ["async_tokio"] }
-testcontainers = "0.26"
-testcontainers-modules = { version = "0.14", features = ["postgres"] }
+testcontainers = "0.27"
+testcontainers-modules = { version = "0.15", features = ["postgres"] }
proptest = "1"

[[bench]]
12 changes: 6 additions & 6 deletions ROADMAP.md
@@ -289,12 +289,12 @@ Example: FastNetMon says UDP flood at 0.6 confidence + router CPU spiking + host
- [x] Corroboration requirements ("require 2+ sources")
- [x] Correlation explainability (`why` details in API/UI for each mitigation decision)
- [x] Corroborating-only signals v1 — coarse telemetry (router CPU, PoP interface, per-customer NetFlow) can strengthen groups without ever triggering mitigations on its own (ADR 021, PR #109)
-- [ ] **Corroborating signals v2 (PR B)** — follow-ups from the ADR 021 review:
-- [ ] Playbook-override-aware corroborator finalization: let a late corroborator promote `corroboration_met` → `true` on its own path, using the resolved override from the group's most recent primary event. Eliminates the current dependency on "another primary event fires within the window" for late corroborators to take effect.
-- [ ] Per-source attribution on `prefixd_corroborator_expired_total`: select expiring rows grouped by source before delete so the metric can be labelled without a full rewrite of the sweep path.
-- [ ] API cleanup: drop the redundant `cached: true` field from `CorroboratorResponse`; `status ∈ {attached, cached}` already fully describes the outcome. Coordinate with a minor API version bump since the endpoint is new in this release.
-- [ ] Dashboard "cached corroborators" panel: small widget on the Correlation dashboard showing live count of unattached-but-unexpired corroborators per source, sourced from a new `/v1/signals/corroborator/cache` listing endpoint (admin-only).
-- [ ] Gauge metric `prefixd_corroborator_cache_size{source}` updated by the reconcile loop for Prometheus alerting on runaway caches.
+- [x] **Corroborating signals v2 (PR B)** — follow-ups from the ADR 021 review (shipped in v0.17.0):
+- [x] Playbook-override-aware corroborator finalization: late corroborators now promote `corroboration_met` → `true` on their own path using the override resolved from `signal_groups.playbook_name` (migration 012). Conservative fallback preserved when the stored playbook is missing from live config.
+- [x] Per-source attribution on `prefixd_corroborator_expired_total{source}`: counter regains `&["source"]` label set; sweep collects attribution via `DELETE … RETURNING` grouped by source.
+- [x] API cleanup: dropped the redundant `cached: true` field from `CorroboratorResponse`; `status ∈ {attached, cached}` is the discriminator.
+- [x] Dashboard "cached corroborators" panel: new Cache tab on the Correlation page backed by `GET /v1/signals/corroborator/cache` (admin-only) showing per-source counts and a dense signal table.
+- [x] Gauge metric `prefixd_corroborator_cache_size{source}` updated by the reconcile loop after each sweep; stale labels zeroed on drain.
- [ ] Router telemetry adapter (JTI, gNMI) as the first production consumer of corroborator mode (already listed under Signal Adapters)
- [ ] Replay mode for tuning (simulate historical incidents without announcing FlowSpec rules)
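
The playbook-override-aware finalization item in the list above can be sketched as follows. This is a guess at the shape of the logic, not the shipped code: `PlaybookOverride`, `GroupAggregate`, `finalize_late`, and the idea that `threshold` is compared against an aggregate confidence are all assumptions. What the diff does state is that the playbook is re-resolved by the stored name from live state, and that a NULL or stale name leaves `corroboration_met` untouched.

```rust
use std::collections::HashMap;

/// Hypothetical per-playbook corroboration override.
struct PlaybookOverride {
    min_sources: usize,
    /// Assumed to be a minimum aggregate confidence.
    threshold: f64,
}

/// Hypothetical view of a signal group's aggregate state.
struct GroupAggregate {
    /// Mirrors the nullable `signal_groups.playbook_name` column.
    playbook_name: Option<String>,
    distinct_sources: usize,
    confidence: f64,
    corroboration_met: bool,
}

/// Try to flip `corroboration_met` after a late corroborator attaches.
/// Returns true only when this call performed the flip.
fn finalize_late(
    group: &mut GroupAggregate,
    live_playbooks: &HashMap<String, PlaybookOverride>,
) -> bool {
    // Conservative fallback: a NULL or stale name means no flip here.
    let playbook = match group
        .playbook_name
        .as_deref()
        .and_then(|name| live_playbooks.get(name))
    {
        Some(p) => p,
        None => return false,
    };
    if !group.corroboration_met
        && group.distinct_sources >= playbook.min_sources
        && group.confidence >= playbook.threshold
    {
        group.corroboration_met = true;
        return true;
    }
    false
}
```

Resolving by name against live config on each recompute means a hot-reloaded playbook takes effect immediately, while a renamed or deleted playbook degrades to the earlier behavior of waiting for the next primary event.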
