⚡ Claude Token Optimization 2026-04-19 - security-guard #2100

@github-actions

Target Workflow: security-guard

Source: (redacted) Workflow run analysis from /tmp/gh-aw/token-audit/claude-logs.json (last 7 days, 4 runs)
Estimated cost per run: $0.35
Total tokens per run: ~429K
Cache read rate: ~99.997% (≈ all input served from cache after Turn 1 writes)
Cache write rate: ~48K tokens written per run (Turn 1)
LLM turns: avg 9.5 (range: 8–13)
Model: claude-sonnet-4-6

Current Configuration

| Setting | Value |
| --- | --- |
| Tools loaded | github with toolsets: [pull_requests, repos] |
| Network groups | github only |
| Pre-agent steps | Yes (diff fetch + security-relevance file count) |
| Prompt size | 6,768 bytes (~1,700 tokens) |
| Max turns | 10 |
| AGENTS.md (system context) | 26,704 bytes (~6,676 tokens) |

Cost Breakdown Per Run

| Cost Driver | Tokens | Cost | % of Total |
| --- | --- | --- | --- |
| Cache writes (Turn 1) | ~48,196 | $0.181 | 51% |
| Cache reads (Turns 2–N) | ~377,373 | $0.113 | 32% |
| Output tokens | ~3,835 | $0.058 | 16% |
| Net new input | ~11 | <$0.001 | ~0% |
| Total | ~429K | $0.352 | 100% |
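The breakdown above can be reproduced with a short calculation. This is a minimal sketch, assuming the per-million-token prices quoted elsewhere in this report ($3.75 cache write, $0.30 cache read, $15 output, $3 input); the names `PRICE_PER_MTOK` and `cost_breakdown` are illustrative and not part of any tool used by the workflow.

```python
# USD per million tokens, as quoted in this report (assumed, not fetched from an API).
PRICE_PER_MTOK = {
    "cache_write": 3.75,
    "cache_read": 0.30,
    "output": 15.00,
    "input": 3.00,
}


def cost_breakdown(tokens: dict) -> dict:
    """Return the USD cost of each token category for one run."""
    return {
        kind: count * PRICE_PER_MTOK[kind] / 1_000_000
        for kind, count in tokens.items()
    }


# Average token counts per run, taken from the table above.
avg_run = {"cache_write": 48_196, "cache_read": 377_373, "output": 3_835, "input": 11}

costs = cost_breakdown(avg_run)
total = sum(costs.values())
for kind, usd in costs.items():
    print(f"{kind:12s} ${usd:.3f}  {usd / total:6.1%}")
print(f"{'total':12s} ${total:.3f}")
```

Rounded to three decimals, the per-category costs and the total should match the table, which makes it easy to re-run the breakdown against a new baseline.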

Recommendations

1. Add Job-Level Skip for Non-Security PRs

Estimated savings: ~100% cost on non-security PRs (~80% of PRs touch no security-critical files)

The workflow already pre-computes steps.security-relevance.outputs.security_files_changed, but the agent still starts and uses 8–13 turns before calling noop. Add a dedicated job to compute relevance, then gate the agent job with if: needs.check-relevance.outputs.security_files_changed != '0':

# In the compiled .lock.yml, add a gating job:
jobs:
  check-relevance:
    runs-on: ubuntu-latest
    outputs:
      security_files_changed: ${{ steps.check.outputs.count }}
    steps:
      - id: check
        run: |
          COUNT=$(gh api "repos/${GITHUB_REPOSITORY}/pulls/${{ github.event.pull_request.number }}/files" \
            --paginate --jq '.[].filename' \
            | grep -cE "host-iptables|setup-iptables|squid-config|docker-manager|seccomp-profile|domain-patterns|entrypoint\.sh|Dockerfile|containers/" || true)
          echo "count=$COUNT" >> "$GITHUB_OUTPUT"
        env:
          GH_TOKEN: ${{ github.token }}

  security-guard:
    needs: check-relevance
    if: needs.check-relevance.outputs.security_files_changed != '0'
    ...

Alternatively, move this into a steps: if: condition that sets an output that causes the framework to skip the agent entirely. The current approach leaves the decision to the LLM, which costs ~$0.35 even for a noop.
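To sanity-check the gating pattern before wiring it into the workflow, the matching logic can be exercised locally. The sketch below mirrors the alternation used in the grep -cE call of the gating job; the function name and the example file lists are hypothetical and only illustrate the expected counts.

```python
import re

# Same alternation as the grep -cE pattern in the gating job.
SECURITY_PATTERN = re.compile(
    r"host-iptables|setup-iptables|squid-config|docker-manager|seccomp-profile"
    r"|domain-patterns|entrypoint\.sh|Dockerfile|containers/"
)


def count_security_files(filenames):
    """Count changed paths that match any security-critical pattern."""
    return sum(1 for name in filenames if SECURITY_PATTERN.search(name))


# Hypothetical changed-file lists, for illustration only.
docs_only_pr = ["README.md", "docs/usage.md"]
firewall_pr = ["src/host-iptables.ts", "containers/agent/entrypoint.sh", "README.md"]

print(count_security_files(docs_only_pr))  # 0, so the agent job would be skipped
print(count_security_files(firewall_pr))   # 2, so the agent job would run
```

Like grep -c, this counts matching paths rather than individual pattern hits, so a path that matches two alternatives is still counted once.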

2. Reduce max-turns from 10 to 6

Estimated savings: ~$0.06/run (~18%) on average, assuming runs drop from 9.5 to 6 turns

Average turns across the 4 observed runs is 9.5 (range 8–13). One run logged 13 turns even though the framework's limit is 10, because the logged count includes tool calls made within a single agent turn. A proper security review of a focused diff should complete in 4–5 turns. Capping at 6 prevents runaway loops.

engine:
  id: claude
  max-turns: 6   # was: 10

Estimated savings per run: ~3.5 fewer average turns

  • Cache reads saved: ~377K × (3.5/9.5) = 139K tokens × $0.30/M = **$0.042**
  • Output saved: ~3,835 × (3.5/9.5) = 1,414 tokens × $15/M = **$0.021**
  • Total: ~$0.063/run (~18%)
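The turn-cap estimate above can be written out as a small function. This is a sketch under the same simplifying assumption the bullets make, namely that cache reads and output tokens scale linearly with the number of turns; in practice later turns carry more context, so actual savings may differ. The function name and defaults are illustrative.

```python
def turn_cap_savings(avg_turns, capped_turns, cache_read_tokens, output_tokens,
                     read_price=0.30, output_price=15.00):
    """Estimate USD saved per run, assuming tokens scale linearly with turns.

    Prices are USD per million tokens, as quoted in this report.
    """
    fraction_saved = (avg_turns - capped_turns) / avg_turns
    read_saved = cache_read_tokens * fraction_saved * read_price / 1_000_000
    output_saved = output_tokens * fraction_saved * output_price / 1_000_000
    return read_saved, output_saved


# Averages from the 4 observed runs: 9.5 turns, 377,373 cache reads, 3,835 output tokens.
read_saved, output_saved = turn_cap_savings(9.5, 6, 377_373, 3_835)
print(f"reads ${read_saved:.3f}, output ${output_saved:.3f}, "
      f"total ${read_saved + output_saved:.3f}")
```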

3. Reduce Prompt Verbosity

Estimated savings: ~$0.007/run (~2%), mostly from a smaller Turn 1 cache write

The "Security Checks" section in security-guard.md (~1,500 tokens) is exhaustive but many bullet points are redundant with general LLM security knowledge. Condense to a reference checklist:

Current (verbose):

### iptables and Network Filtering
- Changes that add new ACCEPT rules without proper justification
- Removal or weakening of DROP/REJECT rules
- Changes to the firewall chain structure (FW_WRAPPER, DOCKER-USER)
- DNS exfiltration prevention bypasses (allowing arbitrary DNS servers)
- IPv6 filtering gaps that could allow bypasses
...

Proposed (condensed):

## Security Checks
Check for: new ACCEPT rules, weakened DROP/REJECT, DNS bypass, IPv6 gaps,
ACL reordering in Squid, non-standard ports, capability additions (SYS_ADMIN/NET_RAW),
seccomp relaxations, resource limit removal, wildcard pattern abuse, command injection,
hardcoded secrets, disabled security env vars.

Removing 5 detailed subsections (~1,200 tokens) saves 1,200 tokens in Turn 1 × $3.75/M cache write = **$0.0045/run**, plus ~$0.0004/turn in reads (about $0.003 over 8 reads).

4. Trim Irrelevant AGENTS.md Sections from System Context

Estimated savings: ~$0.021/run (~6%)

The AGENTS.md (26,704 bytes, ~6,676 tokens) is included in every run as the repository's system context. For a security reviewer, large sections are irrelevant: "Log Streaming and Persistence", "Cleanup Lifecycle", "Log Analysis Commands", "Development Commands", "Local Installation".

These sections together represent ~3,000–4,000 tokens. Removing or trimming them would reduce the Turn 1 cache write by that amount:

  • Savings on cache write: 3,500 × $3.75/M = **$0.013/run**
  • Savings on cache reads (reused 8x): 3,500 × 8 × $0.30/M = **$0.008/run**
  • Total: ~$0.021/run (~6%)
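The same write-plus-reads arithmetic applies to any content trimmed from the cached prefix. A minimal sketch, assuming the trimmed tokens are written once in Turn 1 and then read on each of the remaining turns; the function name is illustrative.

```python
def trim_savings(tokens_removed, reads_per_run, write_price=3.75, read_price=0.30):
    """USD saved per run by removing tokens from the cached prefix.

    Prices are USD per million tokens, as quoted in this report.
    """
    write_saved = tokens_removed * write_price / 1_000_000
    read_saved = tokens_removed * reads_per_run * read_price / 1_000_000
    return write_saved + read_saved


# Midpoint of the ~3,000-4,000 token range, reused ~8 times per run.
print(f"${trim_savings(3_500, 8):.4f}")

# Bounds of the same range, to show how sensitive the estimate is.
print(f"${trim_savings(3_000, 8):.4f} to ${trim_savings(4_000, 8):.4f}")
```

Because the read term grows with the number of turns, trimming pays off slightly more on long runs than on short ones.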

However, AGENTS.md is shared across all workflows. Consider splitting security-critical context into a dedicated section and trimming development docs.

Cache Analysis (Anthropic)

| Run ID | Turns | Cache Write | Cache Read | Output | Est. Cost |
| --- | --- | --- | --- | --- | --- |
| 24615750573 | 8 | 53,970 | 274,933 | 2,258 | $0.319 |
| 24615841386 | 13 | 42,570 | 560,203 | 4,826 | $0.400 |
| 24615910447 | 8 | 53,669 | 315,159 | 1,758 | $0.322 |
| 24615927594 | 9 | 42,576 | 359,196 | 6,496 | $0.365 |
| Avg | 9.5 | 48,196 | 377,373 | 3,835 | $0.352 |

Cache write amortization: Turn 1 writes ~48K tokens, which are then read ~8× across subsequent turns. This is excellent cache reuse (8:1 read:write ratio) and the caching strategy is working well within each run. The problem is that every new run re-writes the cache (Anthropic's 5-min TTL means cross-run reuse is unlikely for a scheduled or PR-triggered workflow).

Cache write cost vs benefit analysis: Cache writes cost $0.181/run, which is a premium of only 48K × ($3.75 - $3.00)/1M ≈ $0.036 over sending the same tokens as uncached input. Cache reads save approximately 48K × 8 turns × ($3.00 - $0.30)/1M ≈ $1.04/run in read discount. Net caching benefit within a run is therefore positive (about +$1.04 saved - $0.036 write premium ≈ +$1.00/run). Caching cannot offset the cold start, though: each run pays the Turn 1 write again, and the only way to shrink that cost is to reduce the content being cached, hence recommendations 3 and 4.
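One way to check the caching economics is to price the same run with and without prompt caching. The sketch below assumes that an uncached run would send the same token volume at the plain $3/M input price and uses the average token counts from the table; `run_cost` is an illustrative helper, not part of the audit tooling.

```python
def run_cost(write_tokens, read_tokens, cached=True,
             input_price=3.00, write_price=3.75, read_price=0.30):
    """Input-side USD cost of one run, with or without prompt caching.

    Prices are USD per million tokens, as quoted in this report.
    """
    if cached:
        return (write_tokens * write_price + read_tokens * read_price) / 1_000_000
    # Without caching, every token is billed at the plain input price.
    return (write_tokens + read_tokens) * input_price / 1_000_000


with_cache = run_cost(48_196, 377_373, cached=True)
without_cache = run_cost(48_196, 377_373, cached=False)
print(f"cached ${with_cache:.3f}, uncached ${without_cache:.3f}, "
      f"difference ${without_cache - with_cache:.3f}")
```

Output tokens are left out because they cost the same in both cases.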

Expected Impact

| Metric | Current | Projected | Savings |
| --- | --- | --- | --- |
| Cost/run (security PRs) | $0.352 | $0.270 | -$0.082 (-23%) |
| Cost/run (non-security PRs) | $0.352 | ~$0.001 | -$0.351 (-99%) |
| LLM turns (security PRs) | 9.5 avg | 5–6 avg | -4 turns |
| Cache write tokens | ~48K | ~44K | -4K (-8%) |
| Monthly cost (est. 20 runs) | ~$7.04 | ~$1.50–$2.50 | -64–79% |

Monthly estimate assumes ~20% of PRs touch security-critical files once the gating job is added.

Implementation Checklist

  • Add a gating job (or if: condition) in security-guard.md that prevents agent execution when security_files_changed == 0
  • Reduce max-turns: 10 → max-turns: 6 in security-guard.md
  • Condense the "Security Checks" section in security-guard.md to a compact checklist (~1,200 tokens removed)
  • Trim "Log Streaming", "Cleanup Lifecycle", "Development Commands", and "Local Installation" sections from AGENTS.md (verify no other workflow needs them in-context)
  • Recompile: gh aw compile .github/workflows/security-guard.md
  • Post-process (if applicable): npx tsx scripts/ci/postprocess-smoke-workflows.ts
  • Verify CI passes on a test PR that touches security-critical files
  • Verify gating job skips the agent on a non-security PR
  • Compare token usage on new run vs this baseline

Generated by Daily Claude Token Optimization Advisor · ● 478.5K ·
