Target Workflow: security-guard
Source (redacted): workflow run analysis from /tmp/gh-aw/token-audit/claude-logs.json (last 7 days, 4 runs)
Estimated cost per run: $0.35
Total tokens per run: ~429K
Cache read rate: ~99.997% (≈ all input served from cache after Turn 1 writes)
Cache write rate: ~48K tokens written per run (Turn 1)
LLM turns: avg 9.5 (range: 8–13)
Model: claude-sonnet-4-6
Current Configuration
| Setting | Value |
|---|---|
| Tools loaded | github with toolsets: [pull_requests, repos] |
| Network groups | github only |
| Pre-agent steps | Yes: diff fetch + security-relevance file count |
| Prompt size | 6,768 bytes (~1,700 tokens) |
| Max turns | 10 |
| AGENTS.md (system context) | 26,704 bytes (~6,676 tokens) |
Cost Breakdown Per Run
| Cost Driver | Tokens | Cost | % of Total |
|---|---|---|---|
| Cache writes (Turn 1) | ~48,196 | $0.181 | 51% |
| Cache reads (Turns 2–N) | ~377,373 | $0.113 | 32% |
| Output tokens | ~3,835 | $0.058 | 16% |
| Net new input | ~11 | <$0.001 | ~0% |
| Total | ~429K | $0.352 | 100% |
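The breakdown above can be reproduced from the average token counts. The following is a minimal sketch, assuming the per-million-token prices that appear in this report's own arithmetic ($3.75 cache write, $0.30 cache read, $15 output, $3 input); the names `USD_PER_MILLION` and `cost_breakdown` are illustrative and not part of any tooling.

```python
# Per-million-token prices as used in this report's arithmetic (assumed to match
# the billing applied to these runs; not quoted from an official price list).
USD_PER_MILLION = {
    "cache_write": 3.75,
    "cache_read": 0.30,
    "output": 15.00,
    "input": 3.00,
}


def cost_breakdown(tokens):
    """Return {driver: usd} for a dict of {driver: token_count}."""
    return {k: n * USD_PER_MILLION[k] / 1_000_000 for k, n in tokens.items()}


# Average token counts per run, taken from the table above.
avg_run = {"cache_write": 48_196, "cache_read": 377_373, "output": 3_835, "input": 11}
costs = cost_breakdown(avg_run)
total = sum(costs.values())

for driver, usd in costs.items():
    print(f"{driver:12s} ${usd:.3f}  {usd / total:5.1%}")
print(f"{'total':12s} ${total:.3f}")
```

Under these assumed rates the total comes to about $0.352, with the Turn 1 cache write as the largest single driver.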
Recommendations
1. Add Job-Level Skip for Non-Security PRs
Estimated savings: ~100% cost on non-security PRs (~80% of PRs touch no security-critical files)
The workflow already pre-computes steps.security-relevance.outputs.security_files_changed, but the agent still starts and uses 8–13 turns before calling noop. Add a dedicated job to compute relevance, then gate the agent job with if: needs.check-relevance.outputs.security_files_changed != '0':
# In the compiled .lock.yml, add a gating job:
jobs:
  check-relevance:
    runs-on: ubuntu-latest
    outputs:
      security_files_changed: ${{ steps.check.outputs.count }}
    steps:
      - id: check
        run: |
          # Count changed files whose path matches a security-critical pattern.
          # grep -c exits non-zero on zero matches, hence the trailing "|| true".
          COUNT=$(gh api "repos/${GITHUB_REPOSITORY}/pulls/${{ github.event.pull_request.number }}/files" \
            --paginate --jq '.[].filename' \
            | grep -cE "host-iptables|setup-iptables|squid-config|docker-manager|seccomp-profile|domain-patterns|entrypoint\.sh|Dockerfile|containers/" || true)
          echo "count=$COUNT" >> "$GITHUB_OUTPUT"
        env:
          GH_TOKEN: ${{ github.token }}
  security-guard:
    needs: check-relevance
    if: needs.check-relevance.outputs.security_files_changed != '0'
    ...
Alternatively, keep the check as a pre-agent step and use its output in an if: condition so that the framework skips the agent entirely. The current approach leaves the decision to the LLM, which costs ~$0.35 even for a noop.
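For testing the filter locally before wiring it into the workflow, the gating decision can be mirrored outside of Actions. This is a minimal sketch that reuses the same alternation as the grep -cE pattern in the gating job above; the function names and the example file paths are hypothetical.

```python
import re

# Same alternation as the grep -cE pattern used in the gating job.
SECURITY_PATTERN = re.compile(
    r"host-iptables|setup-iptables|squid-config|docker-manager|seccomp-profile|"
    r"domain-patterns|entrypoint\.sh|Dockerfile|containers/"
)


def count_security_files(filenames):
    """Count changed files whose path matches a security-critical pattern."""
    return sum(1 for name in filenames if SECURITY_PATTERN.search(name))


def should_run_agent(filenames):
    """Mirror of the job-level condition: run only when the count is non-zero."""
    return count_security_files(filenames) > 0


# Hypothetical changed-file lists for two pull requests.
docs_only_pr = ["README.md", "docs/usage.md"]
firewall_pr = ["src/host-iptables.ts", "containers/agent/Dockerfile", "README.md"]

print(should_run_agent(docs_only_pr))      # False
print(count_security_files(firewall_pr))   # 2
```

Like grep, the pattern matches anywhere in the path, so any file under a containers/ directory counts as security-relevant.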
2. Reduce max-turns from 10 to 6
Estimated savings: ~$0.06/run (~18%) on runs that currently use the full turn budget
Average turns across the 4 observed runs is 9.5 (range 8–13). One run reported 13 turns even though the framework maximum is 10, because the reported turn count includes tool calls made within a single agent turn. A proper security review of a focused diff should complete in 4–5 turns. Capping at 6 prevents runaway loops.
engine:
  id: claude
  max-turns: 6 # was: 10
Estimated savings per run, assuming ~3.5 fewer turns on average:
- Cache reads saved: ~377K × (3.5/9.5) ≈ 139K tokens × $0.30/M = **$0.042**
- Output saved: ~3,835 × (3.5/9.5) ≈ 1,414 tokens × $15/M = **$0.021**
- Total: ~$0.063/run (~18%)
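The estimate above scales cache reads and output linearly with the number of turns. The following is a minimal sketch of that calculation, assuming the same linear scaling and the $0.30/M and $15/M rates used in this report; `turn_cap_savings` is an illustrative name.

```python
def turn_cap_savings(avg_turns, capped_turns, cache_read_tokens, output_tokens,
                     read_rate=0.30, output_rate=15.00):
    """Estimate per-run savings (USD) from lowering the average turn count.

    Assumes cache reads and output tokens shrink in proportion to turns,
    which is the simplification the report itself makes.
    Rates are USD per million tokens.
    """
    fraction_saved = (avg_turns - capped_turns) / avg_turns
    reads = cache_read_tokens * fraction_saved * read_rate / 1_000_000
    output = output_tokens * fraction_saved * output_rate / 1_000_000
    return reads, output, reads + output


reads, output, total = turn_cap_savings(9.5, 6, 377_373, 3_835)
print(f"reads ${reads:.3f} + output ${output:.3f} = ${total:.3f} per run")
```

Linear scaling is likely pessimistic for reads, since later turns re-read a longer context than earlier ones, so cutting the final turns may save somewhat more than this.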
3. Reduce Prompt Verbosity
Estimated savings: ~$0.007/run (~2%), mainly from a smaller Turn 1 cache write
The "Security Checks" section in security-guard.md (~1,500 tokens) is exhaustive, but many bullet points are redundant with general LLM security knowledge. Condense to a reference checklist:
Current (verbose):
### iptables and Network Filtering
- Changes that add new ACCEPT rules without proper justification
- Removal or weakening of DROP/REJECT rules
- Changes to the firewall chain structure (FW_WRAPPER, DOCKER-USER)
- DNS exfiltration prevention bypasses (allowing arbitrary DNS servers)
- IPv6 filtering gaps that could allow bypasses
...
Proposed (condensed):
## Security Checks
Check for: new ACCEPT rules, weakened DROP/REJECT, DNS bypass, IPv6 gaps,
ACL reordering in Squid, non-standard ports, capability additions (SYS_ADMIN/NET_RAW),
seccomp relaxations, resource limit removal, wildcard pattern abuse, command injection,
hardcoded secrets, disabled security env vars.
Removing 5 detailed subsections (~1,200 tokens) saves 1,200 tokens × $3.75/M cache write = **$0.0045/run** on Turn 1, plus ~$0.0004 per subsequent turn in cache reads (1,200 × $0.30/M), or about $0.003 over ~8 turns.
4. Trim Irrelevant AGENTS.md Sections from System Context
Estimated savings: ~$0.021/run (~6%)
The AGENTS.md (26,704 bytes, ~6,676 tokens) is included in every run as the repository's system context. For a security reviewer, large sections are irrelevant: "Log Streaming and Persistence", "Cleanup Lifecycle", "Log Analysis Commands", "Development Commands", "Local Installation".
These sections together represent ~3,000–4,000 tokens. Removing or trimming them would reduce the Turn 1 cache write by that amount:
- Savings on cache write: 3,500 × $3.75/M = **$0.013/run**
- Savings on cache reads (reused 8x): 3,500 × 8 × $0.30/M = **$0.008/run**
- Total: ~$0.021/run (~6%)
However, AGENTS.md is shared across all workflows. Consider splitting security-critical context into a dedicated section and trimming development docs.
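The savings arithmetic for trimming system context can be sketched as a small helper, which makes it easy to try other token counts. This assumes the removed tokens are written once on Turn 1 and then re-read on roughly 8 later turns, as in the list above; `trim_savings` is an illustrative name.

```python
def trim_savings(tokens_removed, read_turns=8, write_rate=3.75, read_rate=0.30):
    """Per-run savings (USD) from removing tokens from the cached prefix.

    The removed tokens would otherwise be written to cache once (Turn 1)
    and read back on each of `read_turns` later turns.
    Rates are USD per million tokens.
    """
    write = tokens_removed * write_rate / 1_000_000
    reads = tokens_removed * read_turns * read_rate / 1_000_000
    return write, reads, write + reads


# Midpoint of the ~3,000-4,000 token estimate for the irrelevant sections.
write, reads, total = trim_savings(3_500)
print(f"write ${write:.4f} + reads ${reads:.4f} = ${total:.4f} per run")
```

Because the write rate is 12.5 times the read rate here, the Turn 1 write dominates until the context is re-read more than about 12 times.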
Cache Analysis (Anthropic)
| Run ID | Turns | Cache Write | Cache Read | Output | Est. Cost |
|---|---|---|---|---|---|
| 24615750573 | 8 | 53,970 | 274,933 | 2,258 | $0.319 |
| 24615841386 | 13 | 42,570 | 560,203 | 4,826 | $0.400 |
| 24615910447 | 8 | 53,669 | 315,159 | 1,758 | $0.322 |
| 24615927594 | 9 | 42,576 | 359,196 | 6,496 | $0.365 |
| Avg | 9.5 | 48,196 | 377,373 | 3,835 | $0.352 |
Cache write amortization: Turn 1 writes ~48K tokens, which are then read ~8× across subsequent turns. This is excellent cache reuse (8:1 read:write ratio) and the caching strategy is working well within each run. The problem is that every new run re-writes the cache (Anthropic's 5-min TTL means cross-run reuse is unlikely for a scheduled or PR-triggered workflow).
Cache write cost vs benefit analysis: Cache writes cost $0.181/run, a premium of about $0.036 over sending the same ~48K tokens as regular input (48K × ($3.75 − $3.00)/M). Cache reads save approximately 48K × 8 turns × ($3.00 − $0.30)/M ≈ $1.04/run compared with resending that context uncached on every turn. Net caching benefit within a run is therefore positive (about $1.04 saved − $0.036 write premium ≈ +$1.00/run). What caching cannot do is carry over between runs: each run is a fresh cold start and pays the full Turn 1 write again. The remaining lever is to reduce the content being cached, hence recommendations 3 and 4.
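One way to sanity-check the cost-versus-benefit comparison is to price the same average token traffic with and without caching. This is a minimal sketch, assuming that without caching every token currently written to or read from cache would instead be billed as regular input at $3/M; `caching_net_benefit` is an illustrative name.

```python
def caching_net_benefit(write_tokens, read_tokens,
                        input_rate=3.00, write_rate=3.75, read_rate=0.30):
    """Compare per-run input cost (USD) with and without prompt caching.

    Assumption: with caching disabled, the tokens now served as cache
    writes and cache reads would all be billed at the regular input rate.
    Rates are USD per million tokens.
    """
    uncached = (write_tokens + read_tokens) * input_rate / 1_000_000
    cached = (write_tokens * write_rate + read_tokens * read_rate) / 1_000_000
    return uncached, cached, uncached - cached


# Average cache write and cache read tokens per run from the table above.
uncached, cached, net = caching_net_benefit(48_196, 377_373)
print(f"uncached ${uncached:.3f}, cached ${cached:.3f}, net benefit ${net:.3f}")
# Under these assumptions the net benefit is roughly $0.98 per run.
```

The write premium is paid once per run, so the comparison turns positive as soon as the cached prefix is re-read on even one later turn.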
Expected Impact
| Metric | Current | Projected | Savings |
|---|---|---|---|
| Cost/run (security PRs) | $0.352 | $0.270 | -$0.082 (-23%) |
| Cost/run (non-security PRs) | $0.352 | ~$0.001 | -$0.351 (-99%) |
| LLM turns (security PRs) | 9.5 avg | 5–6 avg | -4 turns |
| Cache write tokens | ~48K | ~44K | -4K (-8%) |
| Monthly cost (est. 20 runs) | ~$7.04 | ~$1.50–$2.50 | -64–79% |
Monthly estimate assumes ~20% of PRs touch security-critical files after gating job is added.
Implementation Checklist
- Add a gating job (or if: condition) in security-guard.md that prevents agent execution when security_files_changed == 0
- Change max-turns: 10 → max-turns: 6 in security-guard.md
- Condense the "Security Checks" section of security-guard.md to a compact checklist (~1,200 tokens removed)
- Trim irrelevant sections from AGENTS.md (verify no other workflow needs them in-context)
- Recompile: gh aw compile .github/workflows/security-guard.md
- Run npx tsx scripts/ci/postprocess-smoke-workflows.ts
Generated by Daily Claude Token Optimization Advisor · ● 478.5K