feat: gh aw audit diff — compare firewall behavior across runs #22759

Parent epic: #22735
Depends on: #22755 (firewall policy enrichment)

## Summary

Add `gh aw audit diff <run-id-1> <run-id-2>` to compare firewall behavior across two workflow runs. Answers "what changed?" — critical for detecting policy regressions, new unauthorized domains, and behavioral drift.

## Output

- **New domains** — domains in run-2 not in run-1
- **Removed domains** — domains in run-1 not in run-2
- **Status changes** — domains that flipped allowed↔denied
- **Volume changes** — domains whose request count increased by more than 100% between runs (fixed threshold for the MVP)
- **Anomaly flags** — new denied domains, previously-denied now allowed

**Output formats**: `pretty` (default), `markdown`, `json`
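The categories above can be sketched as a set comparison over per-domain stats from both runs. This is only an illustration: `domainStats`, `firewallDiff`, and `diffDomains` are hypothetical names, the real `DomainRequestStats` struct may have different fields, and treating a domain as "denied" when any request was denied is an assumption. The sample counts for `staging.api.com` are invented for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// domainStats is a hypothetical stand-in for the aggregated per-domain data.
type domainStats struct {
	Allowed int
	Denied  int
}

// status treats a domain as denied if any request to it was denied (an assumption).
func (s domainStats) status() string {
	if s.Denied > 0 {
		return "denied"
	}
	return "allowed"
}

type statusChange struct {
	Domain   string
	From, To string
}

type firewallDiff struct {
	New, Removed  []string
	StatusChanges []statusChange
	Anomalies     []string
}

// diffDomains classifies domains as new, removed, or status-changed between
// run1 and run2, and flags new denied domains and denied-to-allowed flips.
func diffDomains(run1, run2 map[string]domainStats) firewallDiff {
	var d firewallDiff
	for domain, s2 := range run2 {
		s1, seen := run1[domain]
		if !seen {
			d.New = append(d.New, domain)
			if s2.status() == "denied" {
				d.Anomalies = append(d.Anomalies, "new denied domain: "+domain)
			}
			continue
		}
		if s1.status() != s2.status() {
			d.StatusChanges = append(d.StatusChanges,
				statusChange{Domain: domain, From: s1.status(), To: s2.status()})
			if s1.status() == "denied" {
				d.Anomalies = append(d.Anomalies, "previously denied, now allowed: "+domain)
			}
		}
	}
	for domain := range run1 {
		if _, seen := run2[domain]; !seen {
			d.Removed = append(d.Removed, domain)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(d.New)
	sort.Strings(d.Removed)
	sort.Strings(d.Anomalies)
	sort.Slice(d.StatusChanges, func(i, j int) bool {
		return d.StatusChanges[i].Domain < d.StatusChanges[j].Domain
	})
	return d
}

func main() {
	run1 := map[string]domainStats{
		"old-api.internal.com": {Allowed: 8},
		"staging.api.com":      {Allowed: 4},
		"api.github.com":       {Allowed: 23},
	}
	run2 := map[string]domainStats{
		"registry.npmjs.org":    {Allowed: 15},
		"telemetry.example.com": {Denied: 2},
		"staging.api.com":       {Denied: 4},
		"api.github.com":        {Allowed: 89},
	}
	d := diffDomains(run1, run2)
	fmt.Println("new:", d.New)
	fmt.Println("removed:", d.Removed)
	fmt.Println("status changes:", d.StatusChanges)
	fmt.Println("anomalies:", d.Anomalies)
}
```

Sorting the result slices keeps the output deterministic, which matters for markdown comments and for fixture-based tests.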

## Usage

```bash
# Compare two runs
gh aw audit diff 12345 12346

# Markdown output for PR comments
gh aw audit diff 12345 12346 --format markdown

# JSON for CI integration
gh aw audit diff 12345 12346 --json
```

### Example Output

```markdown
### Firewall Diff: Run #12345 → Run #12346

**New domains (2)**
- ✅ `registry.npmjs.org` (15 requests, allowed)
- ❌ `telemetry.example.com` (2 requests, denied)

**Removed domains (1)**
- `old-api.internal.com` (was allowed, 8 requests in previous run)

**Status changes (1)**
- `staging.api.com`: ✅ allowed → ❌ denied (policy change?)

**Volume changes**
- `api.github.com`: 23 → 89 requests (+287%)
```

## Implementation Notes

From the [expert review on #22736](https://github.com/github/gh-aw/issues/22736#issuecomment-4120480466):

### Artifact Caching
The existing `run_summary.json` cache in `logs_models.go` saves processed results but not raw artifacts. Two options:
- **Preferred**: New cache layer for raw parsed data (avoids re-downloading AND re-parsing)
- **Alternative**: Extend `RunSummary` with the per-domain stats needed for diffing

### Volume Change Threshold
For the MVP, flag a domain when its request count increases by more than 100% between runs, rather than using statistical methods. This avoids scope creep into full anomaly detection.
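A minimal sketch of the threshold check, assuming the percentage is computed relative to the earlier run and that domains absent from the earlier run are left to the "new domains" category. The function name `volumeChange` is hypothetical.

```go
package main

import "fmt"

// volumeChange returns the percentage change from prev to curr and whether
// it exceeds thresholdPct. A prev of zero is never flagged here, on the
// assumption that such domains are reported as new domains instead.
func volumeChange(prev, curr int, thresholdPct float64) (pct float64, flagged bool) {
	if prev <= 0 {
		return 0, false
	}
	pct = float64(curr-prev) / float64(prev) * 100
	return pct, pct > thresholdPct
}

func main() {
	// Figures from the example output: api.github.com went from 23 to 89 requests.
	pct, flagged := volumeChange(23, 89, 100)
	fmt.Printf("%+.0f%% flagged=%v\n", pct, flagged) // prints "+287% flagged=true"
}
```

Because the comparison is strict, a domain whose traffic exactly doubles (+100%) is not flagged.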

### Subcommand Structure
Use `gh aw audit diff <run-id-1> <run-id-2>` with two positional arguments rather than a `--diff` flag. A dedicated subcommand follows cleaner CLI patterns.

### Key Implementation Details
- Reuse Phase 1's artifact downloading and JSONL/policy-manifest parsing
- Diff logic operates on aggregated `DomainRequestStats` from both runs (struct already exists in `firewall_log.go`)
- Follow existing CLI patterns: `addRepoFlag()`, `addJSONFlag()`
- Fallback chain: `audit.jsonl` → `access.log` → no firewall data (older runs may lack JSONL)
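The fallback chain in the last bullet could be sketched as a preference-ordered lookup over the file names found in a run's artifacts. The names `firewallSourcePreference` and `selectFirewallSource` are hypothetical, and passing the artifact contents as a flat list of file names is an assumption about how Phase 1 exposes them.

```go
package main

import "fmt"

// firewallSourcePreference lists firewall log files in the order they should be tried.
var firewallSourcePreference = []string{"audit.jsonl", "access.log"}

// selectFirewallSource returns the most preferred log file present in
// available, or false when the run has no firewall data at all.
func selectFirewallSource(available []string) (string, bool) {
	present := make(map[string]bool, len(available))
	for _, name := range available {
		present[name] = true
	}
	for _, name := range firewallSourcePreference {
		if present[name] {
			return name, true
		}
	}
	return "", false
}

func main() {
	// An older run that only has access.log falls back to it.
	if src, ok := selectFirewallSource([]string{"access.log", "run_summary.json"}); ok {
		fmt.Println("using", src)
	}
	// A run with neither file has no firewall data to diff.
	if _, ok := selectFirewallSource([]string{"run_summary.json"}); !ok {
		fmt.Println("no firewall data")
	}
}
```

When the two runs resolve to different sources, the diff output may need to say so, since `access.log` and `audit.jsonl` might not carry the same level of detail.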

## Tasks

- [ ] Add `audit diff` subcommand with two positional run-id arguments
- [ ] Implement domain-level diff logic (new, removed, changed status, volume delta)
- [ ] Add anomaly detection (new denied, status flips)
- [ ] Implement artifact caching to avoid re-downloading
- [ ] Pretty, markdown, and JSON formatters
- [ ] Unit tests for diff logic with fixture data in `pkg/cli/testdata/`
