Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
73 changes: 73 additions & 0 deletions docs/adr/29360-multi-path-artifact-discovery-for-firewall-audit.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,73 @@
# ADR-29360: Multi-Path Artifact Discovery Strategy for Firewall Audit Command

**Date**: 2026-04-30
**Status**: Draft
**Deciders**: pelikhan

---

## Part 1 — Narrative (Human-Friendly)

### Context

The `gh aw audit` command relies on `detectFirewallAuditArtifacts` to locate `policy-manifest.json` and `audit.jsonl` before it can display firewall audit results. Artifact directories may arrive in several layouts depending on how they were obtained: some users run `gh run download` directly without passing the directory through `flattenUnifiedArtifact`, and the internal path structure of the downloaded artifact varies further by the version of `actions/upload-artifact` used (v4+ strips the `/tmp/gh-aw/` common prefix; older releases preserve it). On top of that, the artifact container itself may be named `agent`, `agent-artifacts`, or `<hash>-agent` depending on when the workflow was created. The original implementation only covered the post-`flattenUnifiedArtifact` path and the legacy `firewall-audit-logs` directory, leaving the non-flattened cases completely unsupported.

### Decision

We will extend `detectFirewallAuditArtifacts` to probe four candidate locations in a fixed priority order: (1) the post-flatten primary path, (2) the non-flattened new artifact structure, (3) the non-flattened old artifact structure, and (4) the legacy standalone `firewall-audit*` directory. The function stops as soon as both files are found and uses a shared `checkDir` closure to eliminate the repetitive stat-and-assign pattern across all search steps. This makes `gh aw audit` work correctly regardless of whether `flattenUnifiedArtifact` has been called and regardless of the `actions/upload-artifact` version in use.

### Alternatives Considered

#### Alternative 1: Require `flattenUnifiedArtifact` Before Audit

Reject non-flattened directories and document that `gh aw audit` requires a pre-processed directory. This would keep the function simple but breaks the manual download workflow — users who run `gh run download` and point the command at the result directory would get a silent failure (no audit output) rather than a useful error. That silent failure is worse than added complexity.

#### Alternative 2: Normalise on Download

Run the path-normalisation step (`flattenUnifiedArtifact`) automatically inside `gh aw audit` before probing for files. This would work but couples the audit command to the artifact download pipeline, is harder to test in isolation, and could mutate the caller's directory unexpectedly.

### Consequences

#### Positive
- `gh aw audit` works for both the automated pipeline path and the manual `gh run download` path without requiring any user-side workaround.
- Fully backward-compatible: all existing search paths (post-flatten primary and legacy `firewall-audit*`) are preserved at the same priority they had before.
- The `checkDir` closure removes duplicated stat+assign logic, making each search step a single readable call.

#### Negative
- `detectFirewallAuditArtifacts` now encodes knowledge of four distinct directory layouts; adding a fifth layout in the future requires understanding this ordering contract.
- The function implicitly depends on the naming conventions of `findArtifactDir` (suffixes `agent`, `agent-artifacts`, and `*-agent`); a rename of those artifacts without a corresponding update here would silently break step 2 and 3.

#### Neutral
- Four new table-driven sub-tests are added to `TestDetectFirewallAuditArtifacts` to cover each newly supported layout; the test file grows by ~68 lines.
- The search order is documented in a block comment on the function signature, establishing a reference point for future readers.

---

## Part 2 — Normative Specification (RFC 2119)

> The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHALL**, **SHALL NOT**, **SHOULD**, **SHOULD NOT**, **RECOMMENDED**, **MAY**, and **OPTIONAL** in this section are to be interpreted as described in [RFC 2119](https://www.rfc-editor.org/rfc/rfc2119).

### Artifact Discovery Search Order

1. Implementations **MUST** probe candidate directories in the following fixed priority order and stop as soon as both `policy-manifest.json` and `audit.jsonl` are found:
1. `<runDir>/sandbox/firewall/audit/` — post-`flattenUnifiedArtifact` primary path
2. `<agentDir>/sandbox/firewall/audit/` — non-flattened artifact, `actions/upload-artifact` v4+ (prefix stripped)
3. `<agentDir>/tmp/gh-aw/sandbox/firewall/audit/` — non-flattened artifact, older upload action (prefix preserved)
4. `<runDir>/firewall-audit*/` — legacy standalone audit artifact
2. Implementations **MUST** resolve `<agentDir>` by searching `<runDir>` for a subdirectory whose name is `agent`, `agent-artifacts`, or matches the pattern `*-agent` (workflow_call hash prefix).
3. Implementations **MUST NOT** mutate or reorder the caller-supplied `runDir` as a side-effect of artifact discovery.
4. Implementations **SHOULD** log the path of each file found and the search label where it was located to aid future debugging.

### Shared Discovery Helper

1. Implementations **MUST** use a single shared helper (closure or function) that checks one directory for both `policy-manifest.json` and `audit.jsonl`, so that the check logic is not duplicated across search steps.
2. The shared helper **MUST** only populate a result slot (manifest path or audit path) if that slot has not already been filled by an earlier search step.
3. The shared helper **MUST** return a boolean indicating whether both files have been found, so callers can short-circuit early.

### Conformance

An implementation is considered conformant with this ADR if it satisfies all **MUST** and **MUST NOT** requirements above. Failure to meet any **MUST** or **MUST NOT** requirement constitutes non-conformance.

---

*This is a DRAFT ADR generated by the [Design Decision Gate](https://github.com/github/gh-aw/actions/runs/25181134127) workflow. The PR author must review, complete, and finalize this document before the PR can merge.*
97 changes: 60 additions & 37 deletions pkg/cli/firewall_policy.go
Original file line number Diff line number Diff line change
Expand Up @@ -405,55 +405,78 @@ func enrichWithPolicyRules(entries []AuditLogEntry, manifest *PolicyManifest) *P

// detectFirewallAuditArtifacts looks for policy-manifest.json and audit.jsonl in the run directory.
// Returns the paths if found, or empty strings if not present.
//
// Search order:
// 1. sandbox/firewall/audit/ — primary path after flattenUnifiedArtifact strips the /tmp/gh-aw/ prefix
// 2. agent/sandbox/firewall/audit/ — non-flattened unified agent artifact (new structure)
// 3. agent/tmp/gh-aw/sandbox/firewall/audit/ — non-flattened unified agent artifact (old structure)
// 4. firewall-audit*/ — legacy separate firewall-audit-logs artifact (backward compat)
func detectFirewallAuditArtifacts(runDir string) (manifestPath, auditJSONLPath string) {
firewallPolicyLog.Printf("Detecting firewall audit artifacts in: %s", runDir)

// Check for artifacts in sandbox/firewall/audit/ (primary path after artifact download)
auditDir := filepath.Join(runDir, "sandbox", "firewall", "audit")
manifestCandidate := filepath.Join(auditDir, "policy-manifest.json")
auditCandidate := filepath.Join(auditDir, "audit.jsonl")

if _, err := os.Stat(manifestCandidate); err == nil {
manifestPath = manifestCandidate
firewallPolicyLog.Printf("Found policy manifest: %s", manifestPath)
// checkDir probes dir for policy-manifest.json and audit.jsonl, populating the
// return variables for any files not yet found. Returns true when both are found.
checkDir := func(dir, label string) bool {
if manifestPath == "" {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The checkDir closure is a clean approach that reduces duplication nicely. One suggestion: consider logging the resolved agentDir path once (before probing steps 2 & 3) so debugging is easier when the audit files aren't found — the label strings are hardcoded as agent/... but the actual dir may be agent-artifacts or abc123-agent.

candidate := filepath.Join(dir, "policy-manifest.json")
if _, err := os.Stat(candidate); err == nil {
manifestPath = candidate
firewallPolicyLog.Printf("Found policy manifest in %s: %s", label, manifestPath)
}
}
if auditJSONLPath == "" {
candidate := filepath.Join(dir, "audit.jsonl")
if _, err := os.Stat(candidate); err == nil {
auditJSONLPath = candidate
firewallPolicyLog.Printf("Found audit JSONL in %s: %s", label, auditJSONLPath)
}
}
return manifestPath != "" && auditJSONLPath != ""
}

if _, err := os.Stat(auditCandidate); err == nil {
auditJSONLPath = auditCandidate
firewallPolicyLog.Printf("Found audit JSONL: %s", auditJSONLPath)
// 1. Primary path: sandbox/firewall/audit/ after flattenUnifiedArtifact
if checkDir(filepath.Join(runDir, "sandbox", "firewall", "audit"), "sandbox/firewall/audit") {
return
}

// Also check for legacy firewall-audit-logs directory (backward compat for older runs)
if manifestPath == "" || auditJSONLPath == "" {
entries, err := os.ReadDir(runDir)
if err != nil {
return
}
for _, entry := range entries {
if !entry.IsDir() {
continue
// 2 & 3. Non-flattened unified agent artifact (before flattenUnifiedArtifact is called,
// e.g., when the audit command is run on a directory populated via `gh run download`).
// Handles "agent", "agent-artifacts", and workflow_call prefixed names (e.g. "hash-agent").
if agentDir := findArtifactDir(runDir, "agent", "agent-artifacts"); agentDir != "" {
// Guard: findArtifactDir checks existence but not type; skip if it resolved to a file.
if info, err := os.Stat(agentDir); err != nil || !info.IsDir() {
firewallPolicyLog.Printf("Skipping agent artifact path (not a directory): %s", agentDir)
} else {
agentBase := filepath.Base(agentDir)
// New artifact structure (actions/upload-artifact v4+, /tmp/gh-aw/ prefix stripped):
// <agentDir>/sandbox/firewall/audit/
if !checkDir(filepath.Join(agentDir, "sandbox", "firewall", "audit"), agentBase+"/sandbox/firewall/audit") {
// Old artifact structure (/tmp/gh-aw/ prefix preserved inside the artifact):
// <agentDir>/tmp/gh-aw/sandbox/firewall/audit/
checkDir(filepath.Join(agentDir, "tmp", "gh-aw", "sandbox", "firewall", "audit"), agentBase+"/tmp/gh-aw/sandbox/firewall/audit")
}
name := entry.Name()
if strings.HasPrefix(name, "firewall-audit") {
dir := filepath.Join(runDir, name)
if manifestPath == "" {
candidate := filepath.Join(dir, "policy-manifest.json")
if _, err := os.Stat(candidate); err == nil {
manifestPath = candidate
firewallPolicyLog.Printf("Found policy manifest in %s: %s", name, manifestPath)
}
}
if auditJSONLPath == "" {
candidate := filepath.Join(dir, "audit.jsonl")
if _, err := os.Stat(candidate); err == nil {
auditJSONLPath = candidate
firewallPolicyLog.Printf("Found audit JSONL in %s: %s", name, auditJSONLPath)
}
}
if manifestPath != "" && auditJSONLPath != "" {
return
}
}
}

// 4. Legacy separate firewall-audit-logs artifact (backward compat for older runs that
// uploaded the audit directory as a standalone artifact named firewall-audit-logs).
entries, err := os.ReadDir(runDir)
if err != nil {
return
}
for _, entry := range entries {
if !entry.IsDir() {
continue
}
name := entry.Name()
if strings.HasPrefix(name, "firewall-audit") {
checkDir(filepath.Join(runDir, name), name)
}
}

return manifestPath, auditJSONLPath
}

Expand Down
79 changes: 79 additions & 0 deletions pkg/cli/firewall_policy_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -566,6 +566,74 @@ func TestDetectFirewallAuditArtifacts(t *testing.T) {
assert.Equal(t, auditPath, foundAudit, "Should find audit JSONL")
})

t.Run("agent artifact new structure (not yet flattened)", func(t *testing.T) {
// Simulates a directory populated by `gh run download` before flattenUnifiedArtifact
// is called. actions/upload-artifact v4+ strips the /tmp/gh-aw/ common prefix, so
// files land at agent/sandbox/firewall/audit/ inside the downloaded artifact dir.
dir := t.TempDir()
auditDir := filepath.Join(dir, "agent", "sandbox", "firewall", "audit")
require.NoError(t, os.MkdirAll(auditDir, 0755))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good table-driven test coverage for the new artifact layouts. Consider adding a test case where both the primary path AND one of the non-flattened paths exist simultaneously — verifying that the primary path takes priority (step 1 wins) would give extra confidence in the search order contract.


manifestPath := filepath.Join(auditDir, "policy-manifest.json")
auditPath := filepath.Join(auditDir, "audit.jsonl")
require.NoError(t, os.WriteFile(manifestPath, []byte("{}"), 0644))
require.NoError(t, os.WriteFile(auditPath, []byte(""), 0644))

foundManifest, foundAudit := detectFirewallAuditArtifacts(dir)
assert.Equal(t, manifestPath, foundManifest, "Should find policy manifest in agent/sandbox/firewall/audit")
assert.Equal(t, auditPath, foundAudit, "Should find audit JSONL in agent/sandbox/firewall/audit")
})

t.Run("agent artifact old structure with tmp/gh-aw prefix (not yet flattened)", func(t *testing.T) {
// Simulates older artifact structure where the full /tmp/gh-aw/ path was preserved
// inside the agent artifact directory before the v4+ prefix-stripping behavior.
dir := t.TempDir()
auditDir := filepath.Join(dir, "agent", "tmp", "gh-aw", "sandbox", "firewall", "audit")
require.NoError(t, os.MkdirAll(auditDir, 0755))

manifestPath := filepath.Join(auditDir, "policy-manifest.json")
auditPath := filepath.Join(auditDir, "audit.jsonl")
require.NoError(t, os.WriteFile(manifestPath, []byte("{}"), 0644))
require.NoError(t, os.WriteFile(auditPath, []byte(""), 0644))

foundManifest, foundAudit := detectFirewallAuditArtifacts(dir)
assert.Equal(t, manifestPath, foundManifest, "Should find policy manifest in agent/tmp/gh-aw/sandbox/firewall/audit")
assert.Equal(t, auditPath, foundAudit, "Should find audit JSONL in agent/tmp/gh-aw/sandbox/firewall/audit")
})

t.Run("agent-artifacts legacy artifact name (not yet flattened)", func(t *testing.T) {
// Simulates the legacy "agent-artifacts" artifact name used before the rename to "agent".
dir := t.TempDir()
auditDir := filepath.Join(dir, "agent-artifacts", "sandbox", "firewall", "audit")
require.NoError(t, os.MkdirAll(auditDir, 0755))

manifestPath := filepath.Join(auditDir, "policy-manifest.json")
auditPath := filepath.Join(auditDir, "audit.jsonl")
require.NoError(t, os.WriteFile(manifestPath, []byte("{}"), 0644))
require.NoError(t, os.WriteFile(auditPath, []byte(""), 0644))

foundManifest, foundAudit := detectFirewallAuditArtifacts(dir)
assert.Equal(t, manifestPath, foundManifest, "Should find policy manifest in agent-artifacts/sandbox/firewall/audit")
assert.Equal(t, auditPath, foundAudit, "Should find audit JSONL in agent-artifacts/sandbox/firewall/audit")
})

t.Run("workflow_call prefixed agent artifact (not yet flattened)", func(t *testing.T) {
// Simulates the workflow_call artifact naming where a hash prefix is added:
// e.g., "abc123-agent" instead of "agent".
dir := t.TempDir()
auditDir := filepath.Join(dir, "abc123-agent", "sandbox", "firewall", "audit")
require.NoError(t, os.MkdirAll(auditDir, 0755))

manifestPath := filepath.Join(auditDir, "policy-manifest.json")
auditPath := filepath.Join(auditDir, "audit.jsonl")
require.NoError(t, os.WriteFile(manifestPath, []byte("{}"), 0644))
require.NoError(t, os.WriteFile(auditPath, []byte(""), 0644))

foundManifest, foundAudit := detectFirewallAuditArtifacts(dir)
assert.Equal(t, manifestPath, foundManifest, "Should find policy manifest in prefixed-agent/sandbox/firewall/audit")
assert.Equal(t, auditPath, foundAudit, "Should find audit JSONL in prefixed-agent/sandbox/firewall/audit")
})

t.Run("firewall-audit-logs directory", func(t *testing.T) {
dir := t.TempDir()
auditDir := filepath.Join(dir, "firewall-audit-logs")
Expand All @@ -587,6 +655,17 @@ func TestDetectFirewallAuditArtifacts(t *testing.T) {
assert.Empty(t, foundManifest, "Should not find manifest")
assert.Empty(t, foundAudit, "Should not find audit JSONL")
})

t.Run("file named 'agent' does not panic or falsely match", func(t *testing.T) {
// If a plain file happens to be named "agent" in the run directory (e.g., a flattened
// single-file artifact from a different upload), the lookup must skip it gracefully.
dir := t.TempDir()
require.NoError(t, os.WriteFile(filepath.Join(dir, "agent"), []byte("not a directory"), 0644))

foundManifest, foundAudit := detectFirewallAuditArtifacts(dir)
assert.Empty(t, foundManifest, "Should not find manifest when 'agent' is a file")
assert.Empty(t, foundAudit, "Should not find audit JSONL when 'agent' is a file")
})
}

func TestAnalyzeFirewallPolicy(t *testing.T) {
Expand Down
Loading