
feat: gh aw audit diff — compare firewall behavior across runs #22759

@Mossaka

Parent epic: #22735
Depends on: #22755 (firewall policy enrichment)

Summary

Add gh aw audit diff <run-id-1> <run-id-2> to compare firewall behavior across two workflow runs. Answers "what changed?" — critical for detecting policy regressions, new unauthorized domains, and behavioral drift.

Output

  • New domains — domains in run-2 not in run-1
  • Removed domains — domains in run-1 not in run-2
  • Status changes — domains that flipped allowed↔denied
  • Volume changes — significant request count changes per domain (>100% threshold for MVP)
  • Anomaly flags — new denied domains, previously-denied now allowed
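The categories above can be sketched as a set comparison over two per-domain maps. This is a minimal illustration, not the planned implementation: `DomainStats`, `Diff`, `StatusChange`, and `DiffDomains` are hypothetical names, the real `DomainRequestStats` struct may have different fields, and treating a domain as "denied" when any request was denied is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// DomainStats is a hypothetical, simplified stand-in for the per-domain
// aggregate; the real DomainRequestStats in firewall_log.go may differ.
type DomainStats struct {
	Allowed int
	Denied  int
}

// Status treats a domain as "denied" if any request was denied (an assumption).
func (s DomainStats) Status() string {
	if s.Denied > 0 {
		return "denied"
	}
	return "allowed"
}

// StatusChange records a domain whose status differs between the two runs.
type StatusChange struct {
	Domain   string
	From, To string
}

// Diff holds the domain-level differences between two runs.
type Diff struct {
	New, Removed  []string
	StatusChanges []StatusChange
	Anomalies     []string
}

// DiffDomains compares per-domain stats from run 1 (before) and run 2 (after).
func DiffDomains(before, after map[string]DomainStats) Diff {
	var d Diff
	for domain, a := range after {
		b, seen := before[domain]
		if !seen {
			d.New = append(d.New, domain)
			if a.Status() == "denied" {
				d.Anomalies = append(d.Anomalies, "new denied domain: "+domain)
			}
			continue
		}
		if b.Status() != a.Status() {
			d.StatusChanges = append(d.StatusChanges,
				StatusChange{Domain: domain, From: b.Status(), To: a.Status()})
			if b.Status() == "denied" {
				d.Anomalies = append(d.Anomalies, "previously denied, now allowed: "+domain)
			}
		}
	}
	for domain := range before {
		if _, seen := after[domain]; !seen {
			d.Removed = append(d.Removed, domain)
		}
	}
	// Sort for deterministic output, since map iteration order is random.
	sort.Strings(d.New)
	sort.Strings(d.Removed)
	sort.Strings(d.Anomalies)
	sort.Slice(d.StatusChanges, func(i, j int) bool {
		return d.StatusChanges[i].Domain < d.StatusChanges[j].Domain
	})
	return d
}

func main() {
	before := map[string]DomainStats{
		"old-api.internal.com": {Allowed: 8},
		"staging.api.com":      {Allowed: 3},
	}
	after := map[string]DomainStats{
		"registry.npmjs.org":    {Allowed: 15},
		"telemetry.example.com": {Denied: 2},
		"staging.api.com":       {Denied: 3},
	}
	d := DiffDomains(before, after)
	fmt.Println("new:", d.New)
	fmt.Println("removed:", d.Removed)
	fmt.Println("status changes:", d.StatusChanges)
	fmt.Println("anomalies:", d.Anomalies)
}
```

Sorting the result slices keeps the pretty, markdown, and JSON output stable across invocations, which matters when the diff is pasted into PR comments or compared in CI.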

Output formats: pretty (default), markdown, json

Usage

# Compare two runs
gh aw audit diff 12345 12346

# Markdown output for PR comments
gh aw audit diff 12345 12346 --format markdown

# JSON for CI integration
gh aw audit diff 12345 12346 --json

Example Output

### Firewall Diff: Run #12345 → Run #12346

**New domains (2)**
- `registry.npmjs.org` (15 requests, allowed)
- `telemetry.example.com` (2 requests, denied)

**Removed domains (1)**
- `old-api.internal.com` (was allowed, 8 requests in previous run)

**Status changes (1)**
- `staging.api.com`: ✅ allowed → ❌ denied (policy change?)

**Volume changes**
- `api.github.com`: 23 → 89 requests (+287%)
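As a rough illustration of how the markdown formatter might produce one section of the output above, the sketch below renders the "New domains" list. `DomainEntry` and `RenderNewDomains` are hypothetical names introduced here, not existing code in the repository.

```go
package main

import (
	"fmt"
	"strings"
)

// DomainEntry is a hypothetical row in the "New domains" section.
type DomainEntry struct {
	Domain   string
	Requests int
	Status   string // "allowed" or "denied"
}

// RenderNewDomains formats the section as a markdown heading plus bullet list.
func RenderNewDomains(entries []DomainEntry) string {
	var b strings.Builder
	fmt.Fprintf(&b, "**New domains (%d)**\n", len(entries))
	for _, e := range entries {
		fmt.Fprintf(&b, "- `%s` (%d requests, %s)\n", e.Domain, e.Requests, e.Status)
	}
	return b.String()
}

func main() {
	fmt.Print(RenderNewDomains([]DomainEntry{
		{Domain: "registry.npmjs.org", Requests: 15, Status: "allowed"},
		{Domain: "telemetry.example.com", Requests: 2, Status: "denied"},
	}))
}
```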

Implementation Notes

From the expert review on #22736:

Artifact Caching

The existing run_summary.json cache in logs_models.go saves processed results but not raw artifacts. Two options:

  • Preferred: New cache layer for raw parsed data (avoids re-downloading AND re-parsing)
  • Alternative: Extend RunSummary with the per-domain stats needed for diffing

Volume Change Threshold

Use a simple >100% increase threshold for MVP rather than statistical methods. Avoid scope creep into anomaly detection complexity.
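The threshold check is small enough to sketch directly. The names `PercentChange` and `IsSignificantIncrease` are hypothetical, and skipping domains with zero requests in the first run (they would show up as new domains instead) is an assumption.

```go
package main

import "fmt"

// volumeThresholdPct is the MVP threshold: flag increases above 100%.
const volumeThresholdPct = 100.0

// PercentChange returns the relative change from before to after, in percent.
// ok is false when before is not positive, because the change is undefined.
func PercentChange(before, after int) (pct float64, ok bool) {
	if before <= 0 {
		return 0, false
	}
	return float64(after-before) / float64(before) * 100, true
}

// IsSignificantIncrease reports whether requests grew by more than the threshold.
func IsSignificantIncrease(before, after int) bool {
	pct, ok := PercentChange(before, after)
	return ok && pct > volumeThresholdPct
}

func main() {
	pct, _ := PercentChange(23, 89)
	fmt.Printf("api.github.com: 23 → 89 requests (%+.0f%%)\n", pct)
	fmt.Println(IsSignificantIncrease(23, 89)) // true: well above 100%
	fmt.Println(IsSignificantIncrease(10, 20)) // false: exactly 100% is not above the threshold
}
```

With the figures from the example output, 23 → 89 requests is a change of (89 − 23) / 23 ≈ 287%, so `api.github.com` would be reported.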

Subcommand Structure

Take the two run IDs as positional arguments, gh aw audit diff <run-1> <run-2>, rather than adding a --diff flag. A dedicated subcommand keeps the CLI surface cleaner.
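Validation of the two positional arguments might look like the sketch below. `ParseRunIDs` is a hypothetical helper; it uses only the standard library and does not show how the subcommand would be registered with the CLI framework the project actually uses.

```go
package main

import (
	"fmt"
	"strconv"
)

// ParseRunIDs validates the two positional arguments of a hypothetical
// `audit diff` handler and converts them to numeric run IDs.
func ParseRunIDs(args []string) (run1, run2 int64, err error) {
	if len(args) != 2 {
		return 0, 0, fmt.Errorf("expected 2 run IDs, got %d", len(args))
	}
	ids := make([]int64, 2)
	for i, arg := range args {
		id, perr := strconv.ParseInt(arg, 10, 64)
		if perr != nil || id <= 0 {
			return 0, 0, fmt.Errorf("invalid run ID %q: must be a positive integer", arg)
		}
		ids[i] = id
	}
	return ids[0], ids[1], nil
}

func main() {
	run1, run2, err := ParseRunIDs([]string{"12345", "12346"})
	fmt.Println(run1, run2, err)

	_, _, err = ParseRunIDs([]string{"12345"})
	fmt.Println(err)
}
```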

Key Implementation Details

  • Reuse Phase 1's artifact downloading and JSONL/policy-manifest parsing
  • Diff logic operates on aggregated DomainRequestStats from both runs (struct already exists in firewall_log.go)
  • Follow existing CLI patterns: addRepoFlag(), addJSONFlag()
  • Fallback chain: audit.jsonl → access.log → no firewall data (older runs may lack JSONL)

Tasks

  • Add audit diff subcommand with two positional run-id arguments
  • Implement domain-level diff logic (new, removed, changed status, volume delta)
  • Add anomaly detection (new denied, status flips)
  • Implement artifact caching to avoid re-downloading
  • Add pretty, markdown, and JSON formatters
  • Unit tests for diff logic with fixture data in pkg/cli/testdata/
