Parent epic: #22735
Depends on: #22755 (firewall policy enrichment)
Summary
Add gh aw audit diff <run-id-1> <run-id-2> to compare firewall behavior across two workflow runs. Answers "what changed?" — critical for detecting policy regressions, new unauthorized domains, and behavioral drift.
Output
- New domains — domains in run-2 not in run-1
- Removed domains — domains in run-1 not in run-2
- Status changes — domains that flipped allowed↔denied
- Volume changes — significant request count changes per domain (>100% threshold for MVP)
- Anomaly flags — new denied domains, previously-denied now allowed
Output formats: pretty (default), markdown, json
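The categories above could be computed roughly as follows. This is a minimal sketch, not the actual implementation: `domainStats`, `firewallDiff`, and `diffDomains` are hypothetical names, and `domainStats` is only a stand-in for the existing `DomainRequestStats` struct, whose real fields may differ. It also assumes a domain counts as denied if any of its requests were denied.

```go
package main

import "sort"

// domainStats is a hypothetical stand-in for the per-domain aggregate.
// The real DomainRequestStats in firewall_log.go may have different fields.
type domainStats struct {
	Allowed int // requests the firewall allowed
	Denied  int // requests the firewall denied
}

// status collapses the counts into one label. Assumption: any denied
// request marks the whole domain as denied.
func (s domainStats) status() string {
	if s.Denied > 0 {
		return "denied"
	}
	return "allowed"
}

// firewallDiff holds the domain names in each category, sorted by name.
type firewallDiff struct {
	New           []string // in run 2 but not in run 1
	Removed       []string // in run 1 but not in run 2
	StatusChanged []string // flipped between allowed and denied
	Anomalies     []string // new denied domains, or denied domains now allowed
}

// diffDomains compares per-domain stats from two runs.
func diffDomains(run1, run2 map[string]domainStats) firewallDiff {
	var d firewallDiff
	for domain, after := range run2 {
		before, seen := run1[domain]
		switch {
		case !seen:
			d.New = append(d.New, domain)
			if after.status() == "denied" {
				d.Anomalies = append(d.Anomalies, domain)
			}
		case before.status() != after.status():
			d.StatusChanged = append(d.StatusChanged, domain)
			if before.status() == "denied" {
				d.Anomalies = append(d.Anomalies, domain)
			}
		}
	}
	for domain := range run1 {
		if _, ok := run2[domain]; !ok {
			d.Removed = append(d.Removed, domain)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(d.New)
	sort.Strings(d.Removed)
	sort.Strings(d.StatusChanged)
	sort.Strings(d.Anomalies)
	return d
}
```

Volume changes are left out here because they depend on the threshold discussed under Implementation Notes.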
Usage
# Compare two runs
gh aw audit diff 12345 12346
# Markdown output for PR comments
gh aw audit diff 12345 12346 --format markdown
# JSON for CI integration
gh aw audit diff 12345 12346 --json
Example Output
### Firewall Diff: Run #12345 → Run #12346
**New domains (2)**
- ✅ `registry.npmjs.org` (15 requests, allowed)
- ❌ `telemetry.example.com` (2 requests, denied)
**Removed domains (1)**
- `old-api.internal.com` (was allowed, 8 requests in previous run)
**Status changes (1)**
- `staging.api.com`: ✅ allowed → ❌ denied (policy change?)
**Volume changes**
- `api.github.com`: 23 → 89 requests (+287%)
Implementation Notes
From the expert review on #22736:
Artifact Caching
The existing run_summary.json cache in logs_models.go saves processed results but not raw artifacts. Two options:
- Preferred: New cache layer for raw parsed data (avoids re-downloading AND re-parsing)
- Alternative: Extend RunSummary with the per-domain stats needed for diffing
Volume Change Threshold
Use a simple >100% increase threshold for MVP rather than statistical methods. Avoid scope creep into anomaly detection complexity.
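One way to express this threshold, as a sketch only: `percentChange`, `isSignificantIncrease`, and the constant name are hypothetical. It assumes that only increases are flagged and that a domain with zero requests in the first run is reported as a new domain rather than as a volume change.

```go
package main

// volumeChangeThreshold is the MVP cutoff: flag increases above 100%.
const volumeChangeThreshold = 100.0

// percentChange returns the relative change from before to after, in percent.
// ok is false when before is not positive, because the ratio is undefined.
func percentChange(before, after int) (pct float64, ok bool) {
	if before <= 0 {
		return 0, false
	}
	return float64(after-before) / float64(before) * 100, true
}

// isSignificantIncrease reports whether the request count grew by more
// than the threshold. Exactly 100% (a doubling) is not flagged.
func isSignificantIncrease(before, after int) bool {
	pct, ok := percentChange(before, after)
	return ok && pct > volumeChangeThreshold
}
```

With the numbers from the example output, 23 to 89 requests is a change of about 287%, so that domain would be flagged.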
Subcommand Structure
Use gh aw audit diff <run-1> <run-2> as positional args (not --diff flag). Follows cleaner CLI patterns.
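As an illustration of the positional-argument shape, here is a standard-library-only sketch of validating the two run IDs. `parseRunIDs` is a hypothetical helper; the real command would presumably use the project's existing CLI framework for argument validation, and the requirement that IDs be positive integers is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseRunIDs validates that exactly two positional arguments were given
// and that each is a positive integer run ID.
func parseRunIDs(args []string) (run1, run2 int64, err error) {
	if len(args) != 2 {
		return 0, 0, fmt.Errorf("expected 2 run IDs, got %d", len(args))
	}
	ids := make([]int64, 2)
	for i, arg := range args {
		id, perr := strconv.ParseInt(arg, 10, 64)
		if perr != nil {
			return 0, 0, fmt.Errorf("invalid run ID %q: %w", arg, perr)
		}
		if id <= 0 {
			return 0, 0, fmt.Errorf("invalid run ID %q: must be positive", arg)
		}
		ids[i] = id
	}
	return ids[0], ids[1], nil
}
```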
Key Implementation Details
- Reuse Phase 1's artifact downloading and JSONL/policy-manifest parsing
- Diff logic operates on aggregated DomainRequestStats from both runs (struct already exists in firewall_log.go)
- Follow existing CLI patterns: addRepoFlag(), addJSONFlag()
- Fallback chain: audit.jsonl → access.log → no firewall data (older runs may lack JSONL)
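The fallback chain could be sketched like this. The function names and the idea of a per-run artifact directory are assumptions for illustration; only the file names and their priority order come from the notes above.

```go
package main

import (
	"os"
	"path/filepath"
)

// firewallSources lists candidate log files in priority order.
var firewallSources = []string{"audit.jsonl", "access.log"}

// pickFirewallSource returns the first candidate for which exists reports
// true, or "" when the run has no firewall data.
func pickFirewallSource(exists func(name string) bool) string {
	for _, name := range firewallSources {
		if exists(name) {
			return name
		}
	}
	return ""
}

// findFirewallLog applies the chain to a directory on disk and returns the
// full path of the chosen file, or "" if neither file is present.
// dir is a hypothetical directory holding one run's downloaded artifacts.
func findFirewallLog(dir string) string {
	name := pickFirewallSource(func(n string) bool {
		info, err := os.Stat(filepath.Join(dir, n))
		return err == nil && !info.IsDir()
	})
	if name == "" {
		return ""
	}
	return filepath.Join(dir, name)
}
```

Separating the priority logic from the filesystem check keeps the chain easy to unit test without fixtures on disk.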
Tasks
- audit diff subcommand with two positional run-id arguments
- pkg/cli/testdata/