Merged
114 changes: 110 additions & 4 deletions .github/workflows/daily-firewall-report.md
@@ -235,6 +235,7 @@ The audit tool returns structured firewall analysis data including:
- Total requests, allowed requests, blocked requests
- Lists of allowed and blocked domains
- Request statistics per domain
- **Policy rule attribution** (when `policy-manifest.json` and `audit.jsonl` artifacts are present)

**Example of extracting firewall data from audit result:**
```javascript
@@ -243,20 +244,42 @@ result.firewall_analysis.blocked_domains // Array of blocked domain names
result.firewall_analysis.allowed_domains // Array of allowed domain names
result.firewall_analysis.total_requests // Total number of network requests
result.firewall_analysis.blocked_requests // Number of blocked requests

// Policy rule attribution (enriched data; `policy_analysis` is omitted from the JSON when the artifacts are absent, so check that it exists before reading these fields):
result.policy_analysis.policy_summary // e.g., "12 rules, SSL Bump disabled, DLP disabled"
result.policy_analysis.rule_hits // Array of {rule: {id, action, description, ...}, hits: N}
result.policy_analysis.denied_requests // Array of {ts, host, status, rule_id, action, reason}
result.policy_analysis.total_requests // Total enriched request count
result.policy_analysis.allowed_count // Allowed requests (rule-attributed)
result.policy_analysis.denied_count // Denied requests (rule-attributed)
result.policy_analysis.unique_domains // Unique domain count
```

**Important:** Do NOT manually download and parse firewall log files. Always use the `audit` tool which provides structured firewall analysis data.

### Step 3: Parse and Analyze Firewall Logs

Use the JSON output from the `audit` tool to extract firewall information. The `firewall_analysis` field in the audit JSON contains:
Use the JSON output from the `audit` tool to extract firewall information.

**Basic firewall analysis** — The `firewall_analysis` field in the audit JSON contains:
- `total_requests` - Total number of network requests
- `allowed_requests` - Count of allowed requests
- `blocked_requests` - Count of blocked requests
- `allowed_domains` - Array of unique allowed domains
- `blocked_domains` - Array of unique blocked domains
- `requests_by_domain` - Object mapping domains to request statistics (allowed/blocked counts)

**Policy rule attribution** — The `policy_analysis` field (when present) contains enriched data that attributes each request to a specific firewall policy rule:
- `policy_summary` - Human-readable summary (e.g., "12 rules, SSL Bump disabled, DLP disabled")
- `rule_hits` - Array of objects: `{rule: {id, order, action, aclName, protocol, domains, description}, hits: N}` — how many requests each rule handled
- `denied_requests` - Array of objects: `{ts, host, status, rule_id, action, reason}` — every denied request with its matching rule and reason
- `total_requests` - Total enriched request count
- `allowed_count` - Allowed requests count (rule-attributed)
- `denied_count` - Denied requests count (rule-attributed)
- `unique_domains` - Unique domain count

**Note:** `policy_analysis` is only present when the workflow run produced `policy-manifest.json` and `audit.jsonl` artifacts. If absent, fall back to `firewall_analysis` for basic domain-count data.
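
The fallback described in the note can be sketched in Go as follows. This is a minimal illustration, not the tool's own code: the local types `auditResult`, `firewallAnalysis`, and `policyAnalysis` and the helper `summarize` are hypothetical and cover only a few of the documented JSON fields. It relies on `policy_analysis` being left out of the JSON when no policy artifacts exist, which decodes to a nil pointer.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Illustrative types covering only a few of the documented JSON fields.
// They are assumptions for this sketch, not the real gh-aw structs.
type firewallAnalysis struct {
	TotalRequests   int `json:"total_requests"`
	BlockedRequests int `json:"blocked_requests"`
}

type policyAnalysis struct {
	PolicySummary string `json:"policy_summary"`
	TotalRequests int    `json:"total_requests"`
	DeniedCount   int    `json:"denied_count"`
}

type auditResult struct {
	FirewallAnalysis *firewallAnalysis `json:"firewall_analysis,omitempty"`
	PolicyAnalysis   *policyAnalysis   `json:"policy_analysis,omitempty"`
}

// summarize prefers rule-attributed data and falls back to the basic
// counts when policy_analysis is missing from the audit JSON.
func summarize(raw []byte) (string, error) {
	var r auditResult
	if err := json.Unmarshal(raw, &r); err != nil {
		return "", fmt.Errorf("decode audit JSON: %w", err)
	}
	switch {
	case r.PolicyAnalysis != nil:
		p := r.PolicyAnalysis
		return fmt.Sprintf("policy: %s; %d of %d denied", p.PolicySummary, p.DeniedCount, p.TotalRequests), nil
	case r.FirewallAnalysis != nil:
		f := r.FirewallAnalysis
		return fmt.Sprintf("firewall: %d of %d blocked", f.BlockedRequests, f.TotalRequests), nil
	default:
		return "no firewall data", nil
	}
}

func main() {
	// A run that produced no policy artifacts: only firewall_analysis is present.
	withoutPolicy := []byte(`{"firewall_analysis":{"total_requests":10,"blocked_requests":2}}`)
	s, err := summarize(withoutPolicy)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```

Using pointer fields keeps "section absent" distinguishable from "section present with zero counts", which matters when a run has policy data but no traffic.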

**Example jq filter for aggregating blocked domains:**
```bash
# Get only blocked domains across multiple runs
@@ -269,6 +292,19 @@ gh aw audit <run-id> --json | jq -r '
select(.value.blocked > 0) |
"\(.key): \(.value.blocked) blocked, \(.value.allowed) allowed"
'

# Get policy rule hit counts (when policy_analysis is available)
gh aw audit <run-id> --json | jq -r '
.policy_analysis.rule_hits // [] |
.[] | select(.hits > 0) |
"\(.rule.id) (\(.rule.action)): \(.hits) hits"
'

# Get denied requests with rule attribution
gh aw audit <run-id> --json | jq -r '
.policy_analysis.denied_requests // [] |
.[] | "\(.host) → \(.rule_id): \(.reason)"
'
```

For each workflow run with firewall data (see standardized metric names in scratchpad/metrics-glossary.md):
@@ -279,6 +315,10 @@ For each workflow run with firewall data (see standardized metric names in scrat
- Blocked requests count (`firewall_requests_blocked`)
- List of unique blocked domains (`firewall_domains_blocked`)
- Domain-level statistics (from `requests_by_domain`)
3. If `policy_analysis` is present, also track:
- Policy rule hit counts (which rules are handling traffic)
- Denied requests with rule attribution (which rule denied each request and why)
- Policy summary (rules count, SSL Bump/DLP status)

### Step 4: Aggregate Results

@@ -292,6 +332,18 @@ Combine data from all workflows (using standardized metric names):
- Total blocked domains (`firewall_domains_blocked`) - unique count
- Total blocked requests (`firewall_requests_blocked`)

**Policy rule attribution aggregation** (when `policy_analysis` data is available):
5. Aggregate policy rule hit counts across all runs:
- Build a cross-run rule hit table: rule ID → total hits across all runs
- Identify the most active allow rules and deny rules
6. Aggregate denied requests with rule attribution:
- Collect all denied requests across runs with their matching rule and reason
- Group by rule ID to show which deny rules are doing the most work
- Group by domain to show which domains trigger which deny rules
7. Track policy configuration across runs:
- Note any runs with SSL Bump or DLP enabled
- Note any differences in policy rule counts between runs
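
The cross-run rule hit table from step 5 could be built along these lines. This is a sketch under assumptions: `rule`, `ruleHit`, `ruleTotal`, and `aggregateRuleHits` are hypothetical names, and the structs carry only the `id`, `action`, and `description` fields of each `rule_hits` entry.

```go
package main

import (
	"fmt"
	"sort"
)

// rule and ruleHit are trimmed-down, hypothetical mirrors of one
// rule_hits entry: {rule: {id, action, description, ...}, hits: N}.
type rule struct {
	ID          string
	Action      string
	Description string
}

type ruleHit struct {
	Rule rule
	Hits int
}

// ruleTotal is one row of the cross-run table.
type ruleTotal struct {
	Rule  rule
	Total int
}

// aggregateRuleHits sums hits per rule ID across runs and returns the rows
// sorted by total hits (descending), breaking ties by rule ID.
func aggregateRuleHits(runs [][]ruleHit) []ruleTotal {
	totals := map[string]*ruleTotal{}
	for _, hits := range runs {
		for _, h := range hits {
			t, ok := totals[h.Rule.ID]
			if !ok {
				t = &ruleTotal{Rule: h.Rule}
				totals[h.Rule.ID] = t
			}
			t.Total += h.Hits
		}
	}
	out := make([]ruleTotal, 0, len(totals))
	for _, t := range totals {
		out = append(out, *t)
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Total != out[j].Total {
			return out[i].Total > out[j].Total
		}
		return out[i].Rule.ID < out[j].Rule.ID
	})
	return out
}

func main() {
	runs := [][]ruleHit{
		{{Rule: rule{ID: "allow-github", Action: "allow"}, Hits: 500}, {Rule: rule{ID: "deny-default", Action: "deny"}, Hits: 2}},
		{{Rule: rule{ID: "allow-github", Action: "allow"}, Hits: 23}, {Rule: rule{ID: "deny-default", Action: "deny"}, Hits: 1}},
	}
	for _, row := range aggregateRuleHits(runs) {
		fmt.Printf("%s (%s): %d hits\n", row.Rule.ID, row.Rule.Action, row.Total)
	}
}
```

Keying on rule ID assumes that IDs are stable across runs; if policies differ between runs (step 7), the same ID may carry a different description, and this sketch keeps the first one it sees.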

### Step 5: Generate Report

Create a comprehensive markdown report following the formatting guidelines above. Structure your report as follows:
@@ -326,7 +378,60 @@ A table showing the most frequently blocked domains:

Sort by frequency (most blocked first), show top 20.

#### Section 4: Detailed Request Patterns (In `<details>` Tags)
#### Section 4: Policy Rule Attribution (Always Visible — when data available)

**Include this section when `policy_analysis` data was available for at least one run.**

This section provides rule-level insights that go beyond simple domain counts, showing *which policy rules* are handling traffic and *why* specific requests were denied.

**4a. Policy Configuration**

Show the policy summary from the most recent run:
- Number of rules, SSL Bump status, DLP status
- Example: "📋 Policy: 12 rules, SSL Bump disabled, DLP disabled"

**4b. Policy Rule Hit Table**

Show aggregated rule hit counts across all analyzed runs:

```markdown
| Rule | Action | Description | Total Hits |
|------|--------|-------------|------------|
| allow-github | 🟢 allow | Allow GitHub domains | 523 |
| allow-npm | 🟢 allow | Allow npm registry | 187 |
| deny-blocked-plain | 🔴 deny | Deny all other HTTP/HTTPS | 12 |
| deny-default | 🔴 deny | Default deny | 3 |
```

- Sort by hits (descending)
- Include all rules that had at least 1 hit
- Use 🟢 for allow rules and 🔴 for deny rules in the Action column

**4c. Denied Requests with Rule Attribution**

Show denied requests grouped by rule, with domain details:

```markdown
| Domain | Deny Rule | Reason | Occurrences |
|--------|-----------|--------|-------------|
| evil.com:443 | deny-blocked-plain | Domain not in allowlist | 5 |
| tracker.io:443 | deny-blocked-plain | Domain not in allowlist | 3 |
| unknown.host:80 | deny-default | Default deny | 1 |
```

- Group by domain + rule combination
- Sort by occurrences (descending)
- Show top 30 entries; wrap the full list in `<details>` if more than 30
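
One way to produce the rows for this table is sketched below. The names `deniedRequest`, `deniedGroup`, and `groupDenied` are hypothetical, and the struct carries only the `host`, `rule_id`, and `reason` fields of a `denied_requests` entry. The function groups by domain plus rule, sorts by occurrences, and reports whether the list was cut at the limit so the caller knows when to emit a `<details>` block.

```go
package main

import (
	"fmt"
	"sort"
)

// deniedRequest is a hypothetical, trimmed-down mirror of one
// denied_requests entry ({ts, host, status, rule_id, action, reason}).
type deniedRequest struct {
	Host   string
	RuleID string
	Reason string
}

// deniedGroup is one table row: a domain + rule combination.
type deniedGroup struct {
	Host        string
	RuleID      string
	Reason      string
	Occurrences int
}

// groupDenied groups requests by (host, rule ID), sorts by occurrences
// (descending), and returns at most limit rows. truncated reports whether
// rows were dropped. limit is assumed to be non-negative.
func groupDenied(reqs []deniedRequest, limit int) (shown []deniedGroup, truncated bool) {
	type key struct{ host, ruleID string }
	groups := map[key]*deniedGroup{}
	for _, r := range reqs {
		k := key{r.Host, r.RuleID}
		g, ok := groups[k]
		if !ok {
			g = &deniedGroup{Host: r.Host, RuleID: r.RuleID, Reason: r.Reason}
			groups[k] = g
		}
		g.Occurrences++
	}
	all := make([]deniedGroup, 0, len(groups))
	for _, g := range groups {
		all = append(all, *g)
	}
	sort.Slice(all, func(i, j int) bool {
		if all[i].Occurrences != all[j].Occurrences {
			return all[i].Occurrences > all[j].Occurrences
		}
		if all[i].Host != all[j].Host {
			return all[i].Host < all[j].Host
		}
		return all[i].RuleID < all[j].RuleID
	})
	if len(all) > limit {
		return all[:limit], true
	}
	return all, false
}

func main() {
	reqs := []deniedRequest{
		{Host: "evil.com:443", RuleID: "deny-blocked-plain", Reason: "Domain not in allowlist"},
		{Host: "tracker.io:443", RuleID: "deny-blocked-plain", Reason: "Domain not in allowlist"},
		{Host: "evil.com:443", RuleID: "deny-blocked-plain", Reason: "Domain not in allowlist"},
	}
	rows, truncated := groupDenied(reqs, 30)
	for _, r := range rows {
		fmt.Printf("| %s | %s | %s | %d |\n", r.Host, r.RuleID, r.Reason, r.Occurrences)
	}
	fmt.Println("truncated:", truncated)
}
```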

**4d. Rule Effectiveness Summary**

Provide a brief analysis:
- Which deny rules are doing the most work (catching the most unauthorized traffic)
- Which allow rules handle the most traffic (busiest legitimate pathways)
- Any rules with zero hits that could be removed or indicate unused policy entries
- Any `(implicit-deny)` attributions that indicate gaps in the policy (traffic denied without matching any explicit rule)
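
The last two checks can be sketched as below. This makes two assumptions that the text above does not confirm: that zero-hit rules still appear in `rule_hits` with `hits` equal to 0, and that `(implicit-deny)` shows up as the `rule_id` of a denied request. The type and function names are hypothetical.

```go
package main

import "fmt"

// Hypothetical, trimmed-down shapes of rule_hits and denied_requests entries.
type ruleHit struct {
	RuleID string
	Hits   int
}

type deniedRequest struct {
	Host   string
	RuleID string
}

// implicitDenyID is assumed to be the rule_id recorded when traffic was
// denied without matching any explicit rule.
const implicitDenyID = "(implicit-deny)"

// zeroHitRules returns the IDs of rules that handled no traffic,
// in the order they appear in the input.
func zeroHitRules(hits []ruleHit) []string {
	var unused []string
	for _, h := range hits {
		if h.Hits == 0 {
			unused = append(unused, h.RuleID)
		}
	}
	return unused
}

// implicitDenyHosts counts implicit-deny attributions per host, which
// points at domains the policy does not cover explicitly.
func implicitDenyHosts(denied []deniedRequest) map[string]int {
	hosts := map[string]int{}
	for _, d := range denied {
		if d.RuleID == implicitDenyID {
			hosts[d.Host]++
		}
	}
	return hosts
}

func main() {
	hits := []ruleHit{{RuleID: "allow-github", Hits: 523}, {RuleID: "allow-pypi", Hits: 0}}
	denied := []deniedRequest{
		{Host: "unknown.host:80", RuleID: implicitDenyID},
		{Host: "evil.com:443", RuleID: "deny-blocked-plain"},
	}
	fmt.Println("zero-hit rules:", zeroHitRules(hits))
	fmt.Println("implicit denies:", implicitDenyHosts(denied))
}
```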

#### Section 5: Detailed Request Patterns (In `<details>` Tags)
**IMPORTANT**: Wrap this entire section in a collapsible `<details>` block:

```markdown
@@ -351,7 +456,7 @@ For each workflow that had blocked domains, provide a detailed breakdown:
</details>
```

#### Section 5: Complete Blocked Domains List (In `<details>` Tags)
#### Section 6: Complete Blocked Domains List (In `<details>` Tags)
**IMPORTANT**: Wrap this entire section in a collapsible `<details>` block:

```markdown
@@ -368,12 +473,13 @@ An alphabetically sorted list of all unique blocked domains:
</details>
```

#### Section 6: Security Recommendations (Always Visible)
#### Section 7: Security Recommendations (Always Visible)
Based on the analysis, provide actionable insights:
- Domains that appear to be legitimate services that should be allowlisted
- Potential security concerns (e.g., suspicious domains)
- Suggestions for network permission improvements
- Workflows that might need their network permissions updated
- Policy rule suggestions (e.g., rules with zero hits that could be removed, domains that should be added to allow rules)

### Step 6: Create Discussion

8 changes: 8 additions & 0 deletions pkg/cli/audit.go
@@ -322,6 +322,12 @@ func AuditWorkflowRun(ctx context.Context, runID int64, owner, repo, hostname st
fmt.Fprintln(os.Stderr, console.FormatWarningMessage(fmt.Sprintf("Failed to analyze firewall logs: %v", err)))
}

// Analyze firewall policy artifacts if available (policy-manifest.json + audit.jsonl)
policyAnalysis, err := analyzeFirewallPolicy(runOutputDir, verbose)
if err != nil && verbose {
fmt.Fprintln(os.Stderr, console.FormatWarningMessage(fmt.Sprintf("Failed to analyze firewall policy: %v", err)))
}

// Analyze redacted domains if available
redactedDomainsAnalysis, err := analyzeRedactedDomains(runOutputDir, verbose)
if err != nil && verbose {
@@ -344,6 +350,7 @@ func AuditWorkflowRun(ctx context.Context, runID int64, owner, repo, hostname st
processedRun := ProcessedRun{
Run: run,
FirewallAnalysis: firewallAnalysis,
PolicyAnalysis: policyAnalysis,
RedactedDomainsAnalysis: redactedDomainsAnalysis,
MissingTools: missingTools,
MissingData: missingData,
@@ -415,6 +422,7 @@ func AuditWorkflowRun(ctx context.Context, runID int64, owner, repo, hostname st
Metrics: metrics,
AccessAnalysis: accessAnalysis,
FirewallAnalysis: firewallAnalysis,
PolicyAnalysis: policyAnalysis,
RedactedDomainsAnalysis: redactedDomainsAnalysis,
MissingTools: missingTools,
MissingData: missingData,
11 changes: 11 additions & 0 deletions pkg/cli/audit_report.go
@@ -32,6 +32,7 @@ type AuditData struct {
Noops []NoopReport `json:"noops,omitempty"`
MCPFailures []MCPFailureReport `json:"mcp_failures,omitempty"`
FirewallAnalysis *FirewallAnalysis `json:"firewall_analysis,omitempty"`
PolicyAnalysis *PolicyAnalysis `json:"policy_analysis,omitempty"`
RedactedDomainsAnalysis *RedactedDomainsAnalysis `json:"redacted_domains_analysis,omitempty"`
Errors []ErrorInfo `json:"errors,omitempty"`
Warnings []ErrorInfo `json:"warnings,omitempty"`
@@ -182,6 +183,15 @@ type MCPServerStats struct {
ErrorCount int `json:"error_count,omitempty" console:"header:Errors,omitempty"`
}

// PolicySummaryDisplay is a display-optimized version of PolicyAnalysis for console rendering
type PolicySummaryDisplay struct {
	Policy        string `console:"header:Policy"`
	TotalRequests int    `console:"header:Total Requests"`
	Allowed       int    `console:"header:Allowed"`
	Denied        int    `console:"header:Denied"`
	UniqueDomains int    `console:"header:Unique Domains"`
}

// OverviewDisplay is a display-optimized version of OverviewData for console rendering
type OverviewDisplay struct {
RunID int64 `console:"header:Run ID"`
@@ -329,6 +339,7 @@ func buildAuditData(processedRun ProcessedRun, metrics LogMetrics, mcpToolUsage
Noops: processedRun.Noops,
MCPFailures: processedRun.MCPFailures,
FirewallAnalysis: processedRun.FirewallAnalysis,
PolicyAnalysis: processedRun.PolicyAnalysis,
RedactedDomainsAnalysis: processedRun.RedactedDomainsAnalysis,
Errors: errors,
Warnings: warnings,
85 changes: 85 additions & 0 deletions pkg/cli/audit_report_render.go
@@ -3,9 +3,11 @@ package cli
import (
"encoding/json"
"fmt"
"math"
"os"
"path/filepath"
"strconv"
"time"

"github.com/github/gh-aw/pkg/console"
"github.com/github/gh-aw/pkg/sliceutil"
@@ -120,6 +122,13 @@ func renderConsole(data AuditData, logsPath string) {
renderFirewallAnalysis(data.FirewallAnalysis)
}

// Firewall Policy Analysis Section (enriched with rule attribution)
if data.PolicyAnalysis != nil && (len(data.PolicyAnalysis.RuleHits) > 0 || data.PolicyAnalysis.PolicySummary != "") {
fmt.Fprintln(os.Stderr, console.FormatSectionHeader("Firewall Policy Analysis"))
fmt.Fprintln(os.Stderr)
renderPolicyAnalysis(data.PolicyAnalysis)
}

// Redacted Domains Section
if data.RedactedDomainsAnalysis != nil && data.RedactedDomainsAnalysis.TotalDomains > 0 {
fmt.Fprintln(os.Stderr, console.FormatSectionHeader("🔒 Redacted URL Domains"))
@@ -563,3 +572,79 @@ func renderPerformanceMetrics(metrics *PerformanceMetrics) {

fmt.Fprintln(os.Stderr)
}

// renderPolicyAnalysis renders the enriched firewall policy analysis with rule attribution
func renderPolicyAnalysis(analysis *PolicyAnalysis) {
	auditReportLog.Printf("Rendering policy analysis: rules=%d, denied=%d", len(analysis.RuleHits), analysis.DeniedCount)

	// Policy summary using RenderStruct
	display := PolicySummaryDisplay{
		Policy:        analysis.PolicySummary,
		TotalRequests: analysis.TotalRequests,
		Allowed:       analysis.AllowedCount,
		Denied:        analysis.DeniedCount,
		UniqueDomains: analysis.UniqueDomains,
	}
	fmt.Fprint(os.Stderr, console.RenderStruct(display))
	fmt.Fprintln(os.Stderr)

	// Rule hit table
	if len(analysis.RuleHits) > 0 {
		fmt.Fprintln(os.Stderr, console.FormatInfoMessage("Policy Rules:"))
		fmt.Fprintln(os.Stderr)

		ruleConfig := console.TableConfig{
			Headers: []string{"Rule", "Action", "Description", "Hits"},
			Rows:    make([][]string, 0, len(analysis.RuleHits)),
		}

		for _, rh := range analysis.RuleHits {
			row := []string{
				stringutil.Truncate(rh.Rule.ID, 30),
				rh.Rule.Action,
				stringutil.Truncate(rh.Rule.Description, 50),
				strconv.Itoa(rh.Hits),
			}
			ruleConfig.Rows = append(ruleConfig.Rows, row)
		}

		fmt.Fprint(os.Stderr, console.RenderTable(ruleConfig))
		fmt.Fprintln(os.Stderr)
	}

	// Denied requests detail
	if len(analysis.DeniedRequests) > 0 {
		fmt.Fprintln(os.Stderr, console.FormatWarningMessage(fmt.Sprintf("Denied Requests (%d):", len(analysis.DeniedRequests))))
		fmt.Fprintln(os.Stderr)

		deniedConfig := console.TableConfig{
			Headers: []string{"Time", "Domain", "Rule", "Reason"},
			Rows:    make([][]string, 0, len(analysis.DeniedRequests)),
		}

		for _, req := range analysis.DeniedRequests {
			timeStr := formatUnixTimestamp(req.Timestamp)
			row := []string{
				timeStr,
				stringutil.Truncate(req.Host, 40),
				stringutil.Truncate(req.RuleID, 25),
				stringutil.Truncate(req.Reason, 40),
			}
			deniedConfig.Rows = append(deniedConfig.Rows, row)
		}

		fmt.Fprint(os.Stderr, console.RenderTable(deniedConfig))
		fmt.Fprintln(os.Stderr)
	}
}

// formatUnixTimestamp converts a Unix timestamp (float64) to a human-readable time string (HH:MM:SS).
func formatUnixTimestamp(ts float64) string {
	if ts <= 0 {
		return "-"
	}
	sec := int64(math.Floor(ts))
	nsec := int64((ts - float64(sec)) * 1e9)
	t := time.Unix(sec, nsec).UTC()
	return t.Format("15:04:05")
}
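
To see what `formatUnixTimestamp` produces, the function can be exercised in isolation. The standalone program below copies the function body from the diff unchanged; the sample timestamps and `main` are illustrative only.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// formatUnixTimestamp is copied from the diff above so that this example
// compiles on its own.
func formatUnixTimestamp(ts float64) string {
	if ts <= 0 {
		return "-"
	}
	sec := int64(math.Floor(ts))
	nsec := int64((ts - float64(sec)) * 1e9)
	t := time.Unix(sec, nsec).UTC()
	return t.Format("15:04:05")
}

func main() {
	fmt.Println(formatUnixTimestamp(0))            // "-": missing or non-positive timestamps
	fmt.Println(formatUnixTimestamp(3661))         // "01:01:01": 1h 1m 1s after the epoch, in UTC
	fmt.Println(formatUnixTimestamp(1700000000.5)) // "22:13:20": the fractional half second is not shown
}
```

Because the layout is `15:04:05`, the output carries no date and no sub-second part, so denied requests from different days render identically if they share a time of day.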