diff --git a/docs/adr/29360-multi-path-artifact-discovery-for-firewall-audit.md b/docs/adr/29360-multi-path-artifact-discovery-for-firewall-audit.md new file mode 100644 index 00000000000..0e54732d02e --- /dev/null +++ b/docs/adr/29360-multi-path-artifact-discovery-for-firewall-audit.md @@ -0,0 +1,73 @@ +# ADR-29360: Multi-Path Artifact Discovery Strategy for Firewall Audit Command + +**Date**: 2026-04-30 +**Status**: Draft +**Deciders**: pelikhan + +--- + +## Part 1 — Narrative (Human-Friendly) + +### Context + +The `gh aw audit` command relies on `detectFirewallAuditArtifacts` to locate `policy-manifest.json` and `audit.jsonl` before it can display firewall audit results. Artifact directories may arrive in several layouts depending on how they were obtained: some users run `gh run download` directly without passing the directory through `flattenUnifiedArtifact`, and the internal path structure of the downloaded artifact varies further by the version of `actions/upload-artifact` used (v4+ strips the `/tmp/gh-aw/` common prefix; older releases preserve it). On top of that, the artifact container itself may be named `agent`, `agent-artifacts`, or `-agent` depending on when the workflow was created. The original implementation only covered the post-`flattenUnifiedArtifact` path and the legacy `firewall-audit-logs` directory, leaving the non-flattened cases completely unsupported. + +### Decision + +We will extend `detectFirewallAuditArtifacts` to probe four candidate locations in a fixed priority order: (1) the post-flatten primary path, (2) the non-flattened new artifact structure, (3) the non-flattened old artifact structure, and (4) the legacy standalone `firewall-audit*` directory. The function stops as soon as both files are found and uses a shared `checkDir` closure to eliminate the repetitive stat-and-assign pattern across all search steps. This makes `gh aw audit` work correctly regardless of whether `flattenUnifiedArtifact` has been called and regardless of the `actions/upload-artifact` version in use. + +### Alternatives Considered + +#### Alternative 1: Require `flattenUnifiedArtifact` Before Audit + +Reject non-flattened directories and document that `gh aw audit` requires a pre-processed directory. This would keep the function simple but breaks the manual download workflow — users who run `gh run download` and point the command at the result directory would get a silent failure (no audit output) rather than a useful error. That silent failure is worse than added complexity. + +#### Alternative 2: Normalise on Download + +Run the path-normalisation step (`flattenUnifiedArtifact`) automatically inside `gh aw audit` before probing for files. This would work but couples the audit command to the artifact download pipeline, is harder to test in isolation, and could mutate the caller's directory unexpectedly. + +### Consequences + +#### Positive +- `gh aw audit` works for both the automated pipeline path and the manual `gh run download` path without requiring any user-side workaround. +- Fully backward-compatible: all existing search paths (post-flatten primary and legacy `firewall-audit*`) are preserved at the same priority they had before. +- The `checkDir` closure removes duplicated stat+assign logic, making each search step a single readable call. + +#### Negative +- `detectFirewallAuditArtifacts` now encodes knowledge of four distinct directory layouts; adding a fifth layout in the future requires understanding this ordering contract. +- The function implicitly depends on the naming conventions of `findArtifactDir` (suffixes `agent`, `agent-artifacts`, and `*-agent`); a rename of those artifacts without a corresponding update here would silently break step 2 and 3. + +#### Neutral +- Four new table-driven sub-tests are added to `TestDetectFirewallAuditArtifacts` to cover each newly supported layout; the test file grows by ~68 lines. +- The search order is documented in a block comment on the function signature, establishing a reference point for future readers. + +--- + +## Part 2 — Normative Specification (RFC 2119) + +> The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHALL**, **SHALL NOT**, **SHOULD**, **SHOULD NOT**, **RECOMMENDED**, **MAY**, and **OPTIONAL** in this section are to be interpreted as described in [RFC 2119](https://www.rfc-editor.org/rfc/rfc2119). + +### Artifact Discovery Search Order + +1. Implementations **MUST** probe candidate directories in the following fixed priority order and stop as soon as both `policy-manifest.json` and `audit.jsonl` are found: + 1. `/sandbox/firewall/audit/` — post-`flattenUnifiedArtifact` primary path + 2. `/sandbox/firewall/audit/` — non-flattened artifact, `actions/upload-artifact` v4+ (prefix stripped) + 3. `/tmp/gh-aw/sandbox/firewall/audit/` — non-flattened artifact, older upload action (prefix preserved) + 4. `/firewall-audit*/` — legacy standalone audit artifact +2. Implementations **MUST** resolve `` by searching `` for a subdirectory whose name is `agent`, `agent-artifacts`, or matches the pattern `*-agent` (workflow_call hash prefix). +3. Implementations **MUST NOT** mutate or reorder the caller-supplied `runDir` as a side-effect of artifact discovery. +4. Implementations **SHOULD** log the path of each file found and the search label where it was located to aid future debugging. + +### Shared Discovery Helper + +1. Implementations **MUST** use a single shared helper (closure or function) that checks one directory for both `policy-manifest.json` and `audit.jsonl`, so that the check logic is not duplicated across search steps. +2. The shared helper **MUST** only populate a result slot (manifest path or audit path) if that slot has not already been filled by an earlier search step. +3. The shared helper **MUST** return a boolean indicating whether both files have been found, so callers can short-circuit early. + +### Conformance + +An implementation is considered conformant with this ADR if it satisfies all **MUST** and **MUST NOT** requirements above. Failure to meet any **MUST** or **MUST NOT** requirement constitutes non-conformance. + +--- + +*This is a DRAFT ADR generated by the [Design Decision Gate](https://github.com/github/gh-aw/actions/runs/25181134127) workflow. The PR author must review, complete, and finalize this document before the PR can merge.* diff --git a/pkg/cli/firewall_policy.go b/pkg/cli/firewall_policy.go index 74477c4ecce..36a48667b22 100644 --- a/pkg/cli/firewall_policy.go +++ b/pkg/cli/firewall_policy.go @@ -405,55 +405,78 @@ func enrichWithPolicyRules(entries []AuditLogEntry, manifest *PolicyManifest) *P // detectFirewallAuditArtifacts looks for policy-manifest.json and audit.jsonl in the run directory. // Returns the paths if found, or empty strings if not present. +// +// Search order: +// 1. sandbox/firewall/audit/ — primary path after flattenUnifiedArtifact strips the /tmp/gh-aw/ prefix +// 2. agent/sandbox/firewall/audit/ — non-flattened unified agent artifact (new structure) +// 3. agent/tmp/gh-aw/sandbox/firewall/audit/ — non-flattened unified agent artifact (old structure) +// 4. firewall-audit*/ — legacy separate firewall-audit-logs artifact (backward compat) func detectFirewallAuditArtifacts(runDir string) (manifestPath, auditJSONLPath string) { firewallPolicyLog.Printf("Detecting firewall audit artifacts in: %s", runDir) - // Check for artifacts in sandbox/firewall/audit/ (primary path after artifact download) - auditDir := filepath.Join(runDir, "sandbox", "firewall", "audit") - manifestCandidate := filepath.Join(auditDir, "policy-manifest.json") - auditCandidate := filepath.Join(auditDir, "audit.jsonl") - - if _, err := os.Stat(manifestCandidate); err == nil { - manifestPath = manifestCandidate - firewallPolicyLog.Printf("Found policy manifest: %s", manifestPath) + // checkDir probes dir for policy-manifest.json and audit.jsonl, populating the + // return variables for any files not yet found. Returns true when both are found. + checkDir := func(dir, label string) bool { + if manifestPath == "" { + candidate := filepath.Join(dir, "policy-manifest.json") + if _, err := os.Stat(candidate); err == nil { + manifestPath = candidate + firewallPolicyLog.Printf("Found policy manifest in %s: %s", label, manifestPath) + } + } + if auditJSONLPath == "" { + candidate := filepath.Join(dir, "audit.jsonl") + if _, err := os.Stat(candidate); err == nil { + auditJSONLPath = candidate + firewallPolicyLog.Printf("Found audit JSONL in %s: %s", label, auditJSONLPath) + } + } + return manifestPath != "" && auditJSONLPath != "" } - if _, err := os.Stat(auditCandidate); err == nil { - auditJSONLPath = auditCandidate - firewallPolicyLog.Printf("Found audit JSONL: %s", auditJSONLPath) + // 1. Primary path: sandbox/firewall/audit/ after flattenUnifiedArtifact + if checkDir(filepath.Join(runDir, "sandbox", "firewall", "audit"), "sandbox/firewall/audit") { + return } - // Also check for legacy firewall-audit-logs directory (backward compat for older runs) - if manifestPath == "" || auditJSONLPath == "" { - entries, err := os.ReadDir(runDir) - if err != nil { - return - } - for _, entry := range entries { - if !entry.IsDir() { - continue + // 2 & 3. Non-flattened unified agent artifact (before flattenUnifiedArtifact is called, + // e.g., when the audit command is run on a directory populated via `gh run download`). + // Handles "agent", "agent-artifacts", and workflow_call prefixed names (e.g. "hash-agent"). + if agentDir := findArtifactDir(runDir, "agent", "agent-artifacts"); agentDir != "" { + // Guard: findArtifactDir checks existence but not type; skip if it resolved to a file. + if info, err := os.Stat(agentDir); err != nil || !info.IsDir() { + firewallPolicyLog.Printf("Skipping agent artifact path (not a directory): %s", agentDir) + } else { + agentBase := filepath.Base(agentDir) + // New artifact structure (actions/upload-artifact v4+, /tmp/gh-aw/ prefix stripped): + // /sandbox/firewall/audit/ + if !checkDir(filepath.Join(agentDir, "sandbox", "firewall", "audit"), agentBase+"/sandbox/firewall/audit") { + // Old artifact structure (/tmp/gh-aw/ prefix preserved inside the artifact): + // /tmp/gh-aw/sandbox/firewall/audit/ + checkDir(filepath.Join(agentDir, "tmp", "gh-aw", "sandbox", "firewall", "audit"), agentBase+"/tmp/gh-aw/sandbox/firewall/audit") } - name := entry.Name() - if strings.HasPrefix(name, "firewall-audit") { - dir := filepath.Join(runDir, name) - if manifestPath == "" { - candidate := filepath.Join(dir, "policy-manifest.json") - if _, err := os.Stat(candidate); err == nil { - manifestPath = candidate - firewallPolicyLog.Printf("Found policy manifest in %s: %s", name, manifestPath) - } - } - if auditJSONLPath == "" { - candidate := filepath.Join(dir, "audit.jsonl") - if _, err := os.Stat(candidate); err == nil { - auditJSONLPath = candidate - firewallPolicyLog.Printf("Found audit JSONL in %s: %s", name, auditJSONLPath) - } - } + if manifestPath != "" && auditJSONLPath != "" { + return } } } + // 4. Legacy separate firewall-audit-logs artifact (backward compat for older runs that + // uploaded the audit directory as a standalone artifact named firewall-audit-logs). + entries, err := os.ReadDir(runDir) + if err != nil { + return + } + for _, entry := range entries { + if !entry.IsDir() { + continue + } + name := entry.Name() + if strings.HasPrefix(name, "firewall-audit") { + checkDir(filepath.Join(runDir, name), name) + } + } + return manifestPath, auditJSONLPath } diff --git a/pkg/cli/firewall_policy_test.go b/pkg/cli/firewall_policy_test.go index d321e365753..347288dda37 100644 --- a/pkg/cli/firewall_policy_test.go +++ b/pkg/cli/firewall_policy_test.go @@ -566,6 +566,74 @@ func TestDetectFirewallAuditArtifacts(t *testing.T) { assert.Equal(t, auditPath, foundAudit, "Should find audit JSONL") }) + t.Run("agent artifact new structure (not yet flattened)", func(t *testing.T) { + // Simulates a directory populated by `gh run download` before flattenUnifiedArtifact + // is called. actions/upload-artifact v4+ strips the /tmp/gh-aw/ common prefix, so + // files land at agent/sandbox/firewall/audit/ inside the downloaded artifact dir. + dir := t.TempDir() + auditDir := filepath.Join(dir, "agent", "sandbox", "firewall", "audit") + require.NoError(t, os.MkdirAll(auditDir, 0755)) + + manifestPath := filepath.Join(auditDir, "policy-manifest.json") + auditPath := filepath.Join(auditDir, "audit.jsonl") + require.NoError(t, os.WriteFile(manifestPath, []byte("{}"), 0644)) + require.NoError(t, os.WriteFile(auditPath, []byte(""), 0644)) + + foundManifest, foundAudit := detectFirewallAuditArtifacts(dir) + assert.Equal(t, manifestPath, foundManifest, "Should find policy manifest in agent/sandbox/firewall/audit") + assert.Equal(t, auditPath, foundAudit, "Should find audit JSONL in agent/sandbox/firewall/audit") + }) + + t.Run("agent artifact old structure with tmp/gh-aw prefix (not yet flattened)", func(t *testing.T) { + // Simulates older artifact structure where the full /tmp/gh-aw/ path was preserved + // inside the agent artifact directory before the v4+ prefix-stripping behavior. + dir := t.TempDir() + auditDir := filepath.Join(dir, "agent", "tmp", "gh-aw", "sandbox", "firewall", "audit") + require.NoError(t, os.MkdirAll(auditDir, 0755)) + + manifestPath := filepath.Join(auditDir, "policy-manifest.json") + auditPath := filepath.Join(auditDir, "audit.jsonl") + require.NoError(t, os.WriteFile(manifestPath, []byte("{}"), 0644)) + require.NoError(t, os.WriteFile(auditPath, []byte(""), 0644)) + + foundManifest, foundAudit := detectFirewallAuditArtifacts(dir) + assert.Equal(t, manifestPath, foundManifest, "Should find policy manifest in agent/tmp/gh-aw/sandbox/firewall/audit") + assert.Equal(t, auditPath, foundAudit, "Should find audit JSONL in agent/tmp/gh-aw/sandbox/firewall/audit") + }) + + t.Run("agent-artifacts legacy artifact name (not yet flattened)", func(t *testing.T) { + // Simulates the legacy "agent-artifacts" artifact name used before the rename to "agent". + dir := t.TempDir() + auditDir := filepath.Join(dir, "agent-artifacts", "sandbox", "firewall", "audit") + require.NoError(t, os.MkdirAll(auditDir, 0755)) + + manifestPath := filepath.Join(auditDir, "policy-manifest.json") + auditPath := filepath.Join(auditDir, "audit.jsonl") + require.NoError(t, os.WriteFile(manifestPath, []byte("{}"), 0644)) + require.NoError(t, os.WriteFile(auditPath, []byte(""), 0644)) + + foundManifest, foundAudit := detectFirewallAuditArtifacts(dir) + assert.Equal(t, manifestPath, foundManifest, "Should find policy manifest in agent-artifacts/sandbox/firewall/audit") + assert.Equal(t, auditPath, foundAudit, "Should find audit JSONL in agent-artifacts/sandbox/firewall/audit") + }) + + t.Run("workflow_call prefixed agent artifact (not yet flattened)", func(t *testing.T) { + // Simulates the workflow_call artifact naming where a hash prefix is added: + // e.g., "abc123-agent" instead of "agent". + dir := t.TempDir() + auditDir := filepath.Join(dir, "abc123-agent", "sandbox", "firewall", "audit") + require.NoError(t, os.MkdirAll(auditDir, 0755)) + + manifestPath := filepath.Join(auditDir, "policy-manifest.json") + auditPath := filepath.Join(auditDir, "audit.jsonl") + require.NoError(t, os.WriteFile(manifestPath, []byte("{}"), 0644)) + require.NoError(t, os.WriteFile(auditPath, []byte(""), 0644)) + + foundManifest, foundAudit := detectFirewallAuditArtifacts(dir) + assert.Equal(t, manifestPath, foundManifest, "Should find policy manifest in prefixed-agent/sandbox/firewall/audit") + assert.Equal(t, auditPath, foundAudit, "Should find audit JSONL in prefixed-agent/sandbox/firewall/audit") + }) + t.Run("firewall-audit-logs directory", func(t *testing.T) { dir := t.TempDir() auditDir := filepath.Join(dir, "firewall-audit-logs") @@ -587,6 +655,17 @@ func TestDetectFirewallAuditArtifacts(t *testing.T) { assert.Empty(t, foundManifest, "Should not find manifest") assert.Empty(t, foundAudit, "Should not find audit JSONL") }) + + t.Run("file named 'agent' does not panic or falsely match", func(t *testing.T) { + // If a plain file happens to be named "agent" in the run directory (e.g., a flattened + // single-file artifact from a different upload), the lookup must skip it gracefully. + dir := t.TempDir() + require.NoError(t, os.WriteFile(filepath.Join(dir, "agent"), []byte("not a directory"), 0644)) + + foundManifest, foundAudit := detectFirewallAuditArtifacts(dir) + assert.Empty(t, foundManifest, "Should not find manifest when 'agent' is a file") + assert.Empty(t, foundAudit, "Should not find audit JSONL when 'agent' is a file") + }) } func TestAnalyzeFirewallPolicy(t *testing.T) {