173 changes: 173 additions & 0 deletions docs/src/content/docs/reference/artifacts.md
@@ -0,0 +1,173 @@
---
title: Artifacts
description: Complete reference for artifact names, directory structures, and download patterns used by GitHub Agentic Workflows.
sidebar:
  order: 298
---

GitHub Agentic Workflows upload several artifacts during workflow execution. This reference documents every artifact name, its contents, and how to access the data — especially for downstream workflows that use `gh run download` directly instead of `gh aw logs`.

## Quick Reference

| Artifact Name | Constant | Type | Description |
|---------------|----------|------|-------------|
| `agent` | `constants.AgentArtifactName` | Multi-file | Unified agent job outputs (logs, safe outputs, token usage summary) |
| `activation` | `constants.ActivationArtifactName` | Multi-file | Activation job output (`aw_info.json`, `prompt.txt`, rate limits) |
| `firewall-audit-logs` | `constants.FirewallAuditArtifactName` | Multi-file | AWF firewall audit/observability logs (token usage, network policy, audit trail) |
| `detection` | `constants.DetectionArtifactName` | Single-file | Threat detection log (`detection.log`) |
| `safe-output` | `constants.SafeOutputArtifactName` | Legacy/back-compat | Historical standalone safe output artifact (`safe_output.jsonl`); in current compiled workflows this content is included in the unified `agent` artifact instead |
| `agent-output` | `constants.AgentOutputArtifactName` | Legacy/back-compat | Historical standalone agent output artifact (`agent_output.json`); in current compiled workflows this content is included in the unified `agent` artifact instead |
| `aw-info` | — | Single-file | Engine configuration (`aw_info.json`) |
| `prompt` | — | Single-file | Generated prompt (`prompt.txt`) |
| `safe-outputs-items` | `constants.SafeOutputItemsArtifactName` | Single-file | Safe output items manifest |
| `code-scanning-sarif` | `constants.SarifArtifactName` | Single-file | SARIF file for code scanning results |

## Artifact Sets

The `gh aw logs` and `gh aw audit` commands support `--artifacts` to download only specific artifact groups:

| Set Name | Artifacts Downloaded | Use Case |
|----------|---------------------|----------|
| `all` | Everything | Full analysis (default) |
| `agent` | `agent` | Agent logs and outputs |
| `activation` | `activation` | Activation data (`aw_info.json`, `prompt.txt`) |
| `firewall` | `firewall-audit-logs` | Network policy and firewall audit data |
| `mcp` | `firewall-audit-logs` | MCP gateway traffic logs |
| `detection` | `detection` | Threat detection output |
| `github-api` | `activation`, `agent` | GitHub API rate limit logs |

```bash
# Download only firewall artifacts
gh aw logs <run-id> --artifacts firewall

# Download agent and firewall artifacts
gh aw logs <run-id> --artifacts agent --artifacts firewall

# Download everything (default)
gh aw logs <run-id>
```
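
For scripts that call `gh run download` directly, the set-to-artifact mapping in the table above can be mirrored in a small lookup. This is a minimal sketch: `artifacts_for_set` is a hypothetical helper name, it copies the table rather than the CLI's implementation, and the `*` returned for `all` is only a placeholder for "Everything".

```bash
# Hypothetical helper: map an --artifacts set name to the artifact names
# listed in the Artifact Sets table. Not part of gh or gh aw.
artifacts_for_set() {
  case "$1" in
    all)        echo "*" ;;                    # placeholder for "Everything"
    agent)      echo "agent" ;;
    activation) echo "activation" ;;
    firewall)   echo "firewall-audit-logs" ;;
    mcp)        echo "firewall-audit-logs" ;;  # same artifact as the firewall set
    detection)  echo "detection" ;;
    github-api) echo "activation agent" ;;
    *)          echo "unknown artifact set: $1" >&2; return 1 ;;
  esac
}
```

Calling `artifacts_for_set github-api` prints both artifact names separated by a space, which a caller could loop over to build repeated `-n` flags.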

## `firewall-audit-logs`

The `firewall-audit-logs` artifact is uploaded by **all firewall-enabled workflows**. It contains AWF (Agent Workflow Firewall) structured audit and observability logs.

> **⚠️ Important:** This artifact is **separate** from the `agent` artifact. Token usage data (`token-usage.jsonl`) lives here, not in the `agent` artifact.

### Directory Structure

```
firewall-audit-logs/
├── api-proxy-logs/
│   └── token-usage.jsonl    ← Token usage data (input/output/cache tokens per API request)
├── squid-logs/
│   └── access.log           ← Network policy log (domain allow/deny decisions)
├── audit.jsonl              ← Firewall audit trail (policy matches, rule evaluations)
└── policy-manifest.json     ← Policy configuration snapshot
```

### Accessing Token Usage Data

**Recommended: Use `gh aw logs`**

```bash
# Download and analyze firewall data
gh aw logs <run-id> --artifacts firewall

# Output as JSON for scripting
gh aw logs <run-id> --artifacts firewall --json
```

**Direct download with `gh run download`:**

```bash
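# Download the firewall-audit-logs artifact into its own directory.
# When a single artifact is named, gh extracts into the current directory
# unless --dir is given.
gh run download <run-id> -n firewall-audit-logs --dir firewall-audit-logs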

# Token usage data is at:
cat firewall-audit-logs/api-proxy-logs/token-usage.jsonl

# Network access log is at:
cat firewall-audit-logs/squid-logs/access.log

# Audit trail is at:
cat firewall-audit-logs/audit.jsonl

# Policy manifest is at:
cat firewall-audit-logs/policy-manifest.json
```
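
After a direct download it can be worth confirming that the files from the directory structure above actually landed where the script expects them. The sketch below assumes the four relative paths shown in that structure; `check_firewall_layout` is a hypothetical helper, not part of `gh` or `gh aw`.

```bash
# Hypothetical helper: print each expected firewall-audit-logs file that is
# missing under the given directory. Returns non-zero if any file is absent.
check_firewall_layout() {
  local root="$1" rel missing=0
  for rel in \
    api-proxy-logs/token-usage.jsonl \
    squid-logs/access.log \
    audit.jsonl \
    policy-manifest.json
  do
    if [ ! -f "$root/$rel" ]; then
      echo "missing: $rel"
      missing=1
    fi
  done
  return "$missing"
}
```

A downstream step could run `check_firewall_layout firewall-audit-logs` right after the download and stop early when anything is reported.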

### Common Mistake

Downstream workflows sometimes download `agent-artifacts` or `agent` expecting to find `token-usage.jsonl`. Neither contains that file, so later steps end up with no token data and often no obvious error. Per-request token usage is only in the `firewall-audit-logs` artifact.

```bash
# ❌ WRONG — token-usage.jsonl is NOT in the agent artifact
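# --dir keeps the files under agent/ (a single -n otherwise extracts into the current directory)
gh run download <run-id> -n agent --dir agent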
cat agent/token-usage.jsonl # File not found!

# ✅ CORRECT — download from firewall-audit-logs
gh run download <run-id> -n firewall-audit-logs --dir firewall-audit-logs
cat firewall-audit-logs/api-proxy-logs/token-usage.jsonl
```
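
One way to turn this silent failure into a loud one is to locate the file by its path suffix and abort when it is absent. This is a sketch under the assumption that the file always sits at `api-proxy-logs/token-usage.jsonl` inside the artifact; the function names are hypothetical.

```bash
# Hypothetical helpers: find token-usage.jsonl anywhere under a download
# directory, whatever the enclosing artifact directory is called.
find_token_usage() {
  find "${1:-.}" -type f -path '*/api-proxy-logs/token-usage.jsonl' | sort | head -n 1
}

# Print the path, or fail with a hint when the wrong artifact was downloaded.
require_token_usage() {
  local path
  path="$(find_token_usage "$1")"
  if [ -z "$path" ]; then
    echo "token-usage.jsonl not found under $1; download firewall-audit-logs, not agent" >&2
    return 1
  fi
  echo "$path"
}
```

Because the lookup matches on the path suffix, it also works when the artifact directory carries a prefix or when the artifact was extracted directly into the current directory.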

## `agent`

The unified `agent` artifact contains all agent job outputs.

### Contents

- Agent execution logs
- Safe output data (`agent_output.json`)
- GitHub API rate limit logs (`github_rate_limits.jsonl`)
- Token usage summary (`agent_usage.json`) — aggregated totals only; per-request data is in `firewall-audit-logs`
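
When only aggregated totals are needed, `agent_usage.json` can be read without touching the firewall artifact. The sketch below assumes the flat shape described in the compilation-process reference (`input_tokens`, `output_tokens`, `cache_read_tokens`, `cache_write_tokens` as integers) and avoids a `jq` dependency. The helper names are hypothetical, and the file's location inside the downloaded artifact is left to the caller.

```bash
# Hypothetical helper: extract one integer field from a flat JSON object.
# Assumes the key appears once and its value is a plain integer.
usage_field() {
  local file="$1" key="$2"
  grep -oE "\"${key}\"[[:space:]]*:[[:space:]]*[0-9]+" "$file" \
    | grep -oE '[0-9]+$' \
    | head -n 1
}

# Sum input and output tokens; a missing field counts as 0.
total_tokens() {
  local file="$1" input output
  input="$(usage_field "$file" input_tokens)"
  output="$(usage_field "$file" output_tokens)"
  echo $(( ${input:-0} + ${output:-0} ))
}
```

A regular-expression read like this is only reasonable because the object is flat; anything nested would call for a real JSON parser.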

## `activation`

The `activation` artifact contains activation job outputs.

### Contents

- `aw_info.json` — Engine configuration and workflow metadata
- `prompt.txt` — The generated prompt sent to the AI agent
- `github_rate_limits.jsonl` — Rate limit data from the activation job

## `detection`

The `detection` artifact contains threat detection output.

### Contents

- `detection.log` — Threat detection analysis results

Legacy name: `threat-detection.log` (still supported for backward compatibility).

## Naming Compatibility

Artifact names changed between upload-artifact v4 and v5. The `gh aw logs` and `gh aw audit` commands handle both naming schemes transparently:

| Old Name (pre-v5) | New Name (v5+) | File Inside |
|--------------------|----------------|-------------|
| `aw_info.json` | `aw-info` | `aw_info.json` |
| `safe_output.jsonl` | `safe-output` | `safe_output.jsonl` |
| `agent_output.json` | `agent-output` | `agent_output.json` |
| `prompt.txt` | `prompt` | `prompt.txt` |
| `threat-detection.log` | `detection` | `detection.log` |

Single-file artifacts are automatically flattened to root level regardless of their artifact directory name. Multi-file artifacts (`firewall-audit-logs`, `agent`, `activation`) retain their directory structure.
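
The flattening rule can be pictured with a short shell sketch. It is an illustration of the behavior described above, not the CLI's implementation: `flatten_single_file_artifacts` is a hypothetical name, and the rule it applies (a directory holding exactly one regular file and nothing else) is an assumption about what "single-file" means.

```bash
# Hypothetical sketch: move the sole file of each single-file artifact
# directory up to the download root and remove the emptied directory.
flatten_single_file_artifacts() {
  local root="$1" dir count file tmp
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue
    dir="${dir%/}"
    # Count every entry directly inside the artifact directory.
    count="$(find "$dir" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' ')"
    file="$(find "$dir" -mindepth 1 -maxdepth 1 -type f | head -n 1)"
    if [ "$count" -eq 1 ] && [ -n "$file" ]; then
      # Go through a temporary name so old-style artifacts, where the
      # directory and the file share a name, do not collide.
      tmp="$root/.flatten.$$"
      mv "$file" "$tmp"
      rmdir "$dir"
      mv "$tmp" "$root/$(basename "$file")"
    fi
  done
}
```

Directories with more than one entry, such as `firewall-audit-logs`, are left untouched, which matches the table's distinction between single-file and multi-file artifacts.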

## Workflow Call Prefixes

When workflows are invoked via `workflow_call`, GitHub Actions prepends a short hash to artifact names (e.g., `abc123-firewall-audit-logs`). The CLI handles this automatically by matching artifact names that end with `-{base-name}`.

```bash
# Both of these are recognized as the firewall artifact:
# - firewall-audit-logs (direct invocation)
# - abc123-firewall-audit-logs (workflow_call invocation)
```
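
A script that bypasses the CLI can apply the same suffix rule itself. The sketch below treats a name as a match when it equals the base name or ends with `-` followed by the base name; `matches_artifact` is a hypothetical helper, and the CLI's actual matching may differ in detail.

```bash
# Hypothetical helper: succeed when an artifact name is the base name itself
# or a prefixed form ending in "-<base-name>".
matches_artifact() {
  local name="$1" base="$2"
  case "$name" in
    "$base"|*-"$base") return 0 ;;
    *)                 return 1 ;;
  esac
}
```

For example, `matches_artifact abc123-firewall-audit-logs firewall-audit-logs` succeeds, while `matches_artifact agent-output agent` fails because the base name is not at the end.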

## Related Documentation

- [Audit Commands](/gh-aw/reference/audit/) — Download and analyze workflow run artifacts
- [Cost Management](/gh-aw/reference/cost-management/) — Track token usage and inference spend
- [Network](/gh-aw/reference/network/) — Firewall and domain allow/deny configuration
- [Compilation Process](/gh-aw/reference/compilation-process/) — How workflows are compiled including artifact upload steps
19 changes: 18 additions & 1 deletion docs/src/content/docs/reference/compilation-process.md
@@ -252,10 +252,27 @@ Workflows generate several artifacts during execution:
| **agent_output.json** | `/tmp/gh-aw/safeoutputs/` | AI agent output with structured safe output data (create_issue, add_comment, etc.) | Uploaded by agent job, downloaded by safe output jobs, auto-deleted after 90 days |
| **agent_usage.json** | `/tmp/gh-aw/` | Aggregated token counts: `{"input_tokens":…,"output_tokens":…,"cache_read_tokens":…,"cache_write_tokens":…}` | Bundled in the unified agent artifact when the firewall is enabled; accessible to third-party tools without parsing step summaries |
| **prompt.txt** | `/tmp/gh-aw/aw-prompts/` | Generated prompt sent to AI agent (includes markdown instructions, imports, context variables) | Retained for debugging and reproduction |
| **firewall-logs/** | `/tmp/gh-aw/firewall-logs/` | Network access logs in Squid format (when `network.firewall:` enabled) | Analyzed by `gh aw logs` command |
| **firewall-audit-logs** | See structure below | Dedicated artifact for AWF audit/observability logs (token usage, network policy, audit trail) | Uploaded by all firewall-enabled workflows; analyzed by `gh aw logs --artifacts firewall` |
| **firewall-logs/** | `/tmp/gh-aw/sandbox/firewall/logs/` | Network access logs in Squid format (when `network.firewall:` enabled) | Analyzed by `gh aw logs` command |
| **cache-memory/** | `/tmp/gh-aw/cache-memory/` | Persistent agent memory across runs (when `tools.cache-memory:` configured) | Restored at start, saved at end via GitHub Actions cache |
| **patches/**, **sarif/**, **metadata/** | Various | Safe output data (git patches, SARIF files, metadata JSON) | Temporary, cleaned after processing |

### `firewall-audit-logs` Artifact Structure

The `firewall-audit-logs` artifact is a dedicated multi-file artifact uploaded by all firewall-enabled workflows. It is **separate** from the unified `agent` artifact. Downstream workflows that need token usage data or firewall audit logs must download this artifact specifically.

```
firewall-audit-logs/
├── api-proxy-logs/
│   └── token-usage.jsonl    ← Token usage data per request
├── squid-logs/
│   └── access.log           ← Network policy log (allow/deny)
├── audit.jsonl              ← Firewall audit trail
└── policy-manifest.json     ← Policy configuration snapshot
```

> **Tip:** Use `gh aw logs <run-id> --artifacts firewall` to download and analyze firewall data instead of `gh run download` directly. The CLI handles artifact naming and backward compatibility automatically. See the [Artifacts reference](/gh-aw/reference/artifacts/) for the complete artifact naming guide.

## MCP Server Integration

Model Context Protocol (MCP) servers provide tools to AI agents. Compilation generates `mcp-config.json` from workflow configuration.
59 changes: 59 additions & 0 deletions scratchpad/artifact-naming-compatibility.md
@@ -31,12 +31,71 @@ The `gh aw logs` and `gh aw audit` commands maintain full backward and forward c

## Compatibility Matrix

### Single-File Artifacts

These artifacts contain exactly one file and are flattened to the root directory by `flattenSingleFileArtifacts()`:

| Artifact Name (Old) | Artifact Name (New) | File in Artifact | After Flattening | CLI Expects |
|---------------------|---------------------|------------------|------------------|-------------|
| `aw_info.json` | `aw-info` | `aw_info.json` | `aw_info.json` | ✅ |
| `safe_output.jsonl` | `safe-output` | `safe_output.jsonl` | `safe_output.jsonl` | ✅ |
| `agent_output.json` | `agent-output` | `agent_output.json` | `agent_output.json` | ✅ |
| `prompt.txt` | `prompt` | `prompt.txt` | `prompt.txt` | ✅ |
| `threat-detection.log` | `detection` | `detection.log` | `detection.log` | ✅ |

### Multi-File Artifacts

`gh run download` extracts these artifacts as directory trees that retain their internal structure. They are not flattened the way single-file artifacts are, but `gh aw logs` and `gh aw audit` may post-process some of them (notably `agent` and `activation`) to move expected files into the final layout used by the CLI.

| Artifact Name | Constant | Contents | Notes |
|---------------|----------|----------|-------|
| `firewall-audit-logs` | `constants.FirewallAuditArtifactName` | AWF structured audit/observability logs | Uploaded by all firewall-enabled workflows; retains directory structure after download |
| `agent` | `constants.AgentArtifactName` | Unified agent job outputs (logs, safe outputs, token usage) | Downloaded as a directory tree, then post-processed by CLI flattening/reorganization helpers |
| `activation` | `constants.ActivationArtifactName` | Activation job output (`aw_info.json`, `prompt.txt`) | Downloaded as a directory tree, then post-processed by CLI flattening helpers for downstream use |

#### `firewall-audit-logs` Directory Structure

The `firewall-audit-logs` artifact (constant: `constants.FirewallAuditArtifactName`) is uploaded by all firewall-enabled agentic workflows. It is **separate** from the `agent` artifact and must be downloaded independently.

```
firewall-audit-logs/
├── api-proxy-logs/
│   └── token-usage.jsonl    ← Token usage data (input/output/cache tokens per request)
├── squid-logs/
│   └── access.log           ← Network policy log (domain allow/deny decisions)
├── audit.jsonl              ← Firewall audit trail (policy matches, rule evaluations)
└── policy-manifest.json     ← Policy configuration snapshot
```

**Downloading firewall audit logs with `gh run download`:**

```bash
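# Download only the firewall-audit-logs artifact, into its own directory.
# When a single artifact is named, gh extracts into the current directory
# unless --dir is given.
gh run download <run-id> -n firewall-audit-logs --dir firewall-audit-logs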

# The data is then at:
# firewall-audit-logs/api-proxy-logs/token-usage.jsonl
# firewall-audit-logs/squid-logs/access.log
# firewall-audit-logs/audit.jsonl
# firewall-audit-logs/policy-manifest.json
```

**Recommended: Use `gh aw logs` instead of `gh run download`:**

The `gh aw logs` command knows the correct artifact names and handles backward compatibility automatically:

```bash
# Download and analyze all logs (including firewall data)
gh aw logs <run-id>

# Download only firewall artifacts
gh aw logs <run-id> --artifacts firewall

# Output as JSON for programmatic use
gh aw logs <run-id> --artifacts firewall --json
```

> **⚠️ Common mistake:** Downloading `agent-artifacts` or `agent` and expecting to find `token-usage.jsonl` there. Token usage data lives in the `firewall-audit-logs` artifact, not in the agent artifact.

## Testing
