⚡ Claude Token Optimization 2026-04-19 — security-guard

## Target Workflow: `security-guard`

**Source:** (redacted) Workflow run analysis from `/tmp/gh-aw/token-audit/claude-logs.json` (last 7 days, 4 runs)
**Estimated cost per run:** $0.35
**Total tokens per run:** ~429K
**Cache read rate:** ~99.997% (≈ all input served from cache after Turn 1 writes)
**Cache write rate:** ~48K tokens written per run (Turn 1)
**LLM turns:** avg 9.5 (range: 8–13)
**Model:** claude-sonnet-4-6

## Current Configuration

| Setting | Value |
|---------|-------|
| Tools loaded | `github` with `toolsets: [pull_requests, repos]` |
| Network groups | `github` only |
| Pre-agent steps | Yes — diff fetch + security-relevance file count |
| Prompt size | 6,768 bytes (~1,700 tokens) |
| Max turns | 10 |
| AGENTS.md (system context) | 26,704 bytes (~6,676 tokens) |

## Cost Breakdown Per Run

| Cost Driver | Tokens | Cost | % of Total |
|-------------|-------:|-----:|-----------:|
| Cache writes (Turn 1) | ~48,196 | $0.181 | **51%** |
| Cache reads (Turns 2–N) | ~377,373 | $0.113 | **32%** |
| Output tokens | ~3,835 | $0.058 | **16%** |
| Net new input | ~11 | <$0.001 | ~0% |
| **Total** | **~429K** | **$0.352** | 100% |
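
The breakdown above can be reproduced from the averaged token counts. The following is a minimal sketch, assuming the per-million-token rates quoted in the recommendations ($3.75 cache write, $0.30 cache read, $15 output, $3 uncached input); the names `RATES`, `AVG_TOKENS`, and `cost_breakdown` are illustrative and not part of the workflow.

```python
# USD per million tokens. Assumed from the rates quoted in this report.
RATES = {
    "cache_write": 3.75,
    "cache_read": 0.30,
    "output": 15.00,
    "input": 3.00,
}

# Average token counts per run, copied from the cost breakdown table.
AVG_TOKENS = {
    "cache_write": 48_196,
    "cache_read": 377_373,
    "output": 3_835,
    "input": 11,
}


def cost_breakdown(tokens, rates):
    """Return {cost_driver: cost_in_usd} for a single run."""
    return {driver: tokens[driver] * rates[driver] / 1_000_000 for driver in tokens}


costs = cost_breakdown(AVG_TOKENS, RATES)
total = sum(costs.values())

for driver, cost in costs.items():
    print(f"{driver:12s} ${cost:.3f}  {cost / total:5.1%}")
print(f"{'total':12s} ${total:.3f}")
```

The unrounded total comes out slightly below the table's $0.352, which is the sum of the individually rounded rows.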

## Recommendations

### 1. Add Job-Level Skip for Non-Security PRs

**Estimated savings:** ~100% cost on non-security PRs (~80% of PRs touch no security-critical files)

The workflow already pre-computes `steps.security-relevance.outputs.security_files_changed`, but the agent still starts and spends 8–13 turns before calling `noop`. Add a dedicated job that computes relevance, then gate the agent job with `if: needs.check-relevance.outputs.security_files_changed != '0'`:

```yaml
# In the compiled .lock.yml, add a gating job:
jobs:
  check-relevance:
    runs-on: ubuntu-latest
    outputs:
      security_files_changed: ${{ steps.check.outputs.count }}
    steps:
      - id: check
        run: |
          COUNT=$(gh api "repos/${GITHUB_REPOSITORY}/pulls/${{ github.event.pull_request.number }}/files" \
            --paginate --jq '.[].filename' \
            | grep -cE "host-iptables|setup-iptables|squid-config|docker-manager|seccomp-profile|domain-patterns|entrypoint\.sh|Dockerfile|containers/" || true)
          echo "count=$COUNT" >> "$GITHUB_OUTPUT"
        env:
          GH_TOKEN: ${{ github.token }}

  security-guard:
    needs: check-relevance
    if: needs.check-relevance.outputs.security_files_changed != '0'
    ...
```

Alternatively, express the check as a step-level `if:` condition whose output causes the framework to skip the agent entirely. The current approach leaves the decision to the LLM, which costs ~$0.35 even when the result is a `noop`.
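
Before wiring the pattern into a job, it can help to check locally which paths it would count. The sketch below mirrors the `grep -cE` alternation from the gating job, assuming Python's `re` treats this pattern the same way (it uses only literals, alternation, and an escaped dot). The file lists and the name `count_security_files` are hypothetical.

```python
import re

# Same alternation as the grep -E pattern in the gating job.
SECURITY_PATTERN = re.compile(
    r"host-iptables|setup-iptables|squid-config|docker-manager|seccomp-profile"
    r"|domain-patterns|entrypoint\.sh|Dockerfile|containers/"
)


def count_security_files(filenames):
    """Count changed files whose path matches the security-critical pattern.

    Like `grep -c`, each filename counts at most once, however many
    alternatives it matches.
    """
    return sum(1 for name in filenames if SECURITY_PATTERN.search(name))


# Hypothetical changed-file lists, for illustration only.
docs_only_pr = ["README.md", "docs/usage.md"]
firewall_pr = ["src/host-iptables.ts", "containers/agent/entrypoint.sh", "README.md"]

print(count_security_files(docs_only_pr))  # agent job would be skipped
print(count_security_files(firewall_pr))   # agent job would run
```

Because the pattern is an unanchored substring match, any path that merely contains one of these names is counted, which errs on the side of running the review.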

### 2. Reduce `max-turns` from 10 to 6

**Estimated savings:** ~$0.063/run (~18%) on average, per the derivation below

Average turns across the 4 observed runs is 9.5 (range 8–13). One run logged 13 turns even though the framework cap is 10, because the logged count includes tool calls made within a single agent turn. A proper security review of a focused diff should complete in 4–5 turns. Capping at 6 prevents runaway loops.

```yaml
engine:
  id: claude
  max-turns: 6   # was: 10
```

**Estimated savings per run:** ~3.5 fewer average turns
- Cache reads saved: ~377K × (3.5/9.5) = ~139K tokens × $0.30/M = **~$0.042**
- Output saved: ~3,835 × (3.5/9.5) = ~1,414 tokens × $15/M = **~$0.021**
- Total: **~$0.063/run (~18%)**
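
The arithmetic above can be checked with a short sketch. It assumes, as the estimate does, that cache reads and output scale linearly with the number of turns; in practice later turns read a larger context, so this is an approximation. The function name `turn_cap_savings` is illustrative.

```python
def turn_cap_savings(avg_turns, capped_turns, cache_read_tokens, output_tokens,
                     read_rate=0.30, output_rate=15.00):
    """Estimate per-run savings (USD) from lowering the turn cap.

    Rates are USD per million tokens. Token usage is assumed to be
    proportional to the number of turns.
    """
    fraction_saved = (avg_turns - capped_turns) / avg_turns
    read_saved = cache_read_tokens * fraction_saved * read_rate / 1_000_000
    output_saved = output_tokens * fraction_saved * output_rate / 1_000_000
    return read_saved, output_saved


read_saved, output_saved = turn_cap_savings(
    avg_turns=9.5, capped_turns=6,
    cache_read_tokens=377_373, output_tokens=3_835,
)
total_saved = read_saved + output_saved

print(f"cache reads saved: ${read_saved:.3f}")
print(f"output saved:      ${output_saved:.3f}")
print(f"total:             ${total_saved:.3f} ({total_saved / 0.352:.0%} of $0.352)")
```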

### 3. Reduce Prompt Verbosity

**Estimated savings:** ~$0.007/run (~2%), mostly from a smaller Turn 1 cache write

The "Security Checks" section in `security-guard.md` (~1,500 tokens) is exhaustive, but many of its bullet points restate general LLM security knowledge. Condense it to a reference checklist:

**Current** (verbose):
```markdown
### iptables and Network Filtering
- Changes that add new ACCEPT rules without proper justification
- Removal or weakening of DROP/REJECT rules
- Changes to the firewall chain structure (FW_WRAPPER, DOCKER-USER)
- DNS exfiltration prevention bypasses (allowing arbitrary DNS servers)
- IPv6 filtering gaps that could allow bypasses
...
```

**Proposed** (condensed):
```markdown
## Security Checks
Check for: new ACCEPT rules, weakened DROP/REJECT, DNS bypass, IPv6 gaps,
ACL reordering in Squid, non-standard ports, capability additions (SYS_ADMIN/NET_RAW),
seccomp relaxations, resource limit removal, wildcard pattern abuse, command injection,
hardcoded secrets, disabled security env vars.
```

Removing the 5 detailed subsections (~1,200 tokens) saves ~1,200 × $3.75/M cache write = **~$0.0045/run**, plus ~1,200 × $0.30/M ≈ $0.0004 per subsequent turn in cache reads (≈ $0.003 across ~8 reads), for a total of roughly **$0.007/run**.

### 4. Trim Irrelevant AGENTS.md Sections from System Context

**Estimated savings:** ~$0.021/run (~6%)

The AGENTS.md (26,704 bytes, ~6,676 tokens) is included in every run as the repository's system context. For a security reviewer, large sections are irrelevant: "Log Streaming and Persistence", "Cleanup Lifecycle", "Log Analysis Commands", "Development Commands", "Local Installation".

These sections together represent ~3,000–4,000 tokens. Removing or trimming them would reduce the Turn 1 cache write by that amount:
- Savings on cache write: ~3,500 × $3.75/M = **~$0.013/run**
- Savings on cache reads (reused 8x): ~3,500 × 8 × $0.30/M = **~$0.008/run**
- Total: **~$0.021/run (~6%)**

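However, AGENTS.md is shared across all workflows, so any cut affects every agent that loads it. Consider moving security-critical context into a dedicated section and trimming the development docs only after confirming no other workflow relies on them.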
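
The savings from trimming cached context follow the same pattern for any token count: one cheaper write on Turn 1, then a cheaper read on each later turn. The sketch below assumes the rates quoted above and ~8 reads per run; `trim_savings` is an illustrative helper, and it prices the 3,000–4,000 token range as well as the 3,500 midpoint.

```python
def trim_savings(tokens_removed, reads_per_run, write_rate=3.75, read_rate=0.30):
    """Estimate per-run savings (USD) from removing tokens from cached context.

    Rates are USD per million tokens. The removed tokens are assumed to be
    written once (Turn 1) and then read once per subsequent turn.
    """
    write_saved = tokens_removed * write_rate / 1_000_000
    read_saved = tokens_removed * reads_per_run * read_rate / 1_000_000
    return write_saved, read_saved


for tokens in (3_000, 3_500, 4_000):
    write_saved, read_saved = trim_savings(tokens, reads_per_run=8)
    print(f"{tokens:>5} tokens: write ${write_saved:.4f} + reads ${read_saved:.4f}"
          f" = ${write_saved + read_saved:.4f}/run")

mid_write, mid_read = trim_savings(3_500, reads_per_run=8)
```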

## Cache Analysis (Anthropic)

| Run ID | Turns | Cache Write | Cache Read | Output | Est. Cost |
|--------|------:|------------:|-----------:|-------:|----------:|
| 24615750573 | 8 | 53,970 | 274,933 | 2,258 | $0.319 |
| 24615841386 | 13 | 42,570 | 560,203 | 4,826 | $0.400 |
| 24615910447 | 8 | 53,669 | 315,159 | 1,758 | $0.322 |
| 24615927594 | 9 | 42,576 | 359,196 | 6,496 | $0.365 |
| **Avg** | **9.5** | **48,196** | **377,373** | **3,835** | **$0.352** |

**Cache write amortization:** Turn 1 writes ~48K tokens, which are then read ~8× across subsequent turns. This is excellent cache reuse (8:1 read:write ratio) and the caching strategy is working well *within* each run. The problem is that every new run re-writes the cache (Anthropic's 5-min TTL means cross-run reuse is unlikely for a scheduled or PR-triggered workflow).
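
The per-run costs and the read:write ratio can be recomputed from the table rows. This is a sketch under the same rate assumptions ($3.75 write, $0.30 read, $15 output per million tokens); it ignores the ~11 net-new input tokens per run, which are negligible. `RUNS` and `run_cost` are illustrative names.

```python
# (run_id, turns, cache_write_tokens, cache_read_tokens, output_tokens)
RUNS = [
    (24615750573, 8, 53_970, 274_933, 2_258),
    (24615841386, 13, 42_570, 560_203, 4_826),
    (24615910447, 8, 53_669, 315_159, 1_758),
    (24615927594, 9, 42_576, 359_196, 6_496),
]


def run_cost(cache_write, cache_read, output):
    """Cost of one run in USD at the assumed per-million-token rates."""
    return (cache_write * 3.75 + cache_read * 0.30 + output * 15.00) / 1_000_000


for run_id, turns, write, read, out in RUNS:
    print(f"{run_id}  turns={turns:2d}  cost=${run_cost(write, read, out):.3f}"
          f"  read:write={read / write:.1f}")

total_write = sum(run[2] for run in RUNS)
total_read = sum(run[3] for run in RUNS)
overall_ratio = total_read / total_write
print(f"overall read:write ratio = {overall_ratio:.1f}")
```

The ratio varies with turn count: the 13-turn run reads its cached prefix far more often than the 8-turn runs, which is why it costs more despite a smaller Turn 1 write.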

**Cache write cost vs benefit analysis:** Cache writes cost $0.181/run, a premium of about $0.036 over sending the same ~48K tokens as uncached input (48,196 × ($3.75 - $3.00)/1M). Cache reads cost $0.113/run; the same ~377K tokens at the uncached $3/1M input rate would cost about $1.132, so the read discount is roughly $1.02/run. Net caching benefit within a run is therefore positive (+$1.02 read discount - $0.036 write premium ≈ **+$0.98/run**). The cost that caching cannot remove is the Turn 1 write itself: each run is a fresh cold start, so the full ~48K tokens are re-written every time, and that write is the largest single cost driver (51%). The only way to shrink it is to reduce the content being cached, hence recommendations 3 and 4.

## Expected Impact

| Metric | Current | Projected | Savings |
|--------|---------|-----------|---------|
| Cost/run (security PRs) | $0.352 | $0.270 | -$0.082 (-23%) |
| Cost/run (non-security PRs) | $0.352 | ~$0.001 | -$0.351 (-99%) |
| LLM turns (security PRs) | 9.5 avg | 5–6 avg | -4 turns |
| Cache write tokens | ~48K | ~44K | -4K (-8%) |
| Monthly cost (est. 20 runs) | ~$7.04 | ~$1.50–$2.50 | -64–79% |

*Monthly estimate assumes ~20% of PRs touch security-critical files after gating job is added.*

## Implementation Checklist

- [ ] Add a gating job (or `if:` condition) in `security-guard.md` that prevents agent execution when `security_files_changed == 0`
- [ ] Reduce `max-turns: 10` → `max-turns: 6` in `security-guard.md`
- [ ] Condense the "Security Checks" section in `security-guard.md` to a compact checklist (~1,200 tokens removed)
- [ ] Trim "Log Streaming", "Cleanup Lifecycle", "Development Commands", and "Local Installation" sections from `AGENTS.md` (verify no other workflow needs them in-context)
- [ ] Recompile: `gh aw compile .github/workflows/security-guard.md`
- [ ] Post-process (if applicable): `npx tsx scripts/ci/postprocess-smoke-workflows.ts`
- [ ] Verify CI passes on a test PR that touches security-critical files
- [ ] Verify gating job skips the agent on a non-security PR
- [ ] Compare token usage on new run vs this baseline




> Generated by [Daily Claude Token Optimization Advisor](https://github.com/github/gh-aw-firewall/actions/runs/24638122342/agentic_workflow) · ● 478.5K · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw-firewall+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw-firewall%2Fclaude-token-optimizer%22&type=issues)
