Target Workflow: secret-digger-claude.md
Source report: #1967
Estimated cost per run: $0.51
Total tokens per run: ~126K (77K raw; token_usage field includes overhead multiplier)
Cache read rate: ~49% of context (within-run only; no cross-run reuse)
Cache write rate: ~50% of context
LLM turns: 3 (1 main agent + 2 threat detection)
Current Configuration
| Setting | Value |
|---|---|
| Tools loaded | bash, cache-memory (minimal; already optimal) |
| Network groups | defaults only |
| Pre-agent steps | None |
| Prompt size | ~5,850 chars (user content) + ~145K chars framework injections → ~38K token system prompt |
| Models | Main agent: claude-haiku-4-5-20251001 (1 turn) · Threat detection: claude-sonnet-4-6 (2 turns) |
Turn Architecture
This workflow runs two separate claude subprocess invocations:
- Main agent (claude --model claude-haiku-4-5-20251001 --max-turns 4): Uses Haiku ✅. Runs the security investigation (bash commands, file reads). Makes 1 turn.
- Threat detection (claude, with no --model flag): Defaults to claude-sonnet-4-6 ❌. Runs post-hoc validation. Makes 2 turns.
The GH_AW_MODEL_AGENT_CLAUDE: "claude-haiku-4-5-20251001" env var in engine.env correctly controls the main agent. The threat detection step instead reads its model from ${{ vars.GH_AW_MODEL_DETECTION_CLAUDE || '' }}, which is a repository variable, not the frontmatter env. Since that variable is unset, the expression evaluates to an empty string, no --model flag is passed, and the detection step falls back to Sonnet.
Token Analysis: The Cache Write Problem
Every run pays a cache write penalty of ~38K tokens for the full system prompt. Because runs are spaced ~1 hour apart and the Anthropic cache TTL is ~5 minutes, zero cross-run cache reuse occurs — each run is a cold start.
Within a single run, however, the cache works efficiently: the detection step's first turn (T2 in the table below) writes ~38K tokens, and its second turn (T3) reads them back (≈1.0× within-run efficiency).
Per-Turn Cache Breakdown (avg over 5 runs)
| Turn | Model | Input | Output | Cache Read | Cache Write | Naive Cost |
|---|---|---|---|---|---|---|
| T1 (main agent) | Haiku | 123 | ~87 | 0 | 0 | ~$0.001 |
| T2 (detection) | Sonnet | ~2 | ~450 | 0 | ~38,990 | ~$0.153 |
| T3 (detection) | Sonnet | ~2 | ~400 | ~37,927 | ~0 | ~$0.017 |
| Total | | ~127 | ~937 | ~37,927 | ~38,990 | ~$0.171 |
Note: Reported cost ($0.51) is ~3× the naive calculation — consistent with a cost estimator applying higher effective rates for the context overhead in token_usage (125,840) vs the raw summary sum (77,728). The 3× ratio is stable across all 5 runs (σ < 1%), so savings estimates below apply proportionally.
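The naive costs in the table can be reproduced with a short calculation. The sketch below is illustrative only: the cache write, cache read, and output rates are the ones quoted in this report, while the uncached input rates ($3.00/M for Sonnet, $0.80/M for Haiku) are assumptions, and the function and variable names are made up for this example.

```python
# Per-million-token rates in USD. Cache and output rates are quoted in this
# report; the "input" rates are ASSUMED (they barely matter at ~2-123 tokens).
RATES = {
    "sonnet": {"input": 3.00, "output": 15.00, "cache_read": 0.30, "cache_write": 3.75},
    "haiku": {"input": 0.80, "output": 4.00, "cache_read": 0.08, "cache_write": 1.00},
}

# (label, model, input, output, cache_read, cache_write), averaged over 5 runs.
TURNS = [
    ("T1 (main agent)", "haiku", 123, 87, 0, 0),
    ("T2 (detection)", "sonnet", 2, 450, 0, 38_990),
    ("T3 (detection)", "sonnet", 2, 400, 37_927, 0),
]


def naive_cost(model, n_in, n_out, n_read, n_write):
    """Cost of one turn in USD at list rates, with no overhead multiplier."""
    r = RATES[model]
    tokens_times_rate = (
        n_in * r["input"]
        + n_out * r["output"]
        + n_read * r["cache_read"]
        + n_write * r["cache_write"]
    )
    return tokens_times_rate / 1_000_000


costs = {label: naive_cost(model, *counts) for label, model, *counts in TURNS}
total = sum(costs.values())

for label, cost in costs.items():
    print(f"{label}: ${cost:.4f}")
print(f"Total: ${total:.4f}")
print(f"Reported / naive: {0.51 / total:.2f}x")
```

Under these assumptions the two detection turns account for essentially all of the naive total, and the ratio of the reported $0.51 to that total comes out close to 3.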
Cache write amortization: Turn 2's 38K cache write is read once by Turn 3. That's a 1:1 write-to-read ratio within the run. Since no cross-run reads occur, each run's cache write is effectively single-use.
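Cache cost vs benefit (Sonnet): Writing 38K tokens at $3.75/M costs ~$0.145; reading them back at $0.30/M costs ~$0.011. Against an uncached baseline, assuming Sonnet's standard input rate of $3.00/M, the write carries a 25% premium (~$0.03) and each read saves ~$0.10, so caching breaks even after roughly 0.3 reads per write and the single intra-run read already puts it ahead by ~$0.07 per run. The problem is not that caching loses money but that the ~$0.145 write, which makes up most of the ~$0.171 naive cost, is paid in full on every run and never amortized beyond that one read. Moving to Haiku reduces the cache write cost by 3.75×.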
Recommendations
1. Set GH_AW_MODEL_DETECTION_CLAUDE repository variable
Estimated savings: ~$0.37/run (~73%)
The threat detection step (2 of 3 API calls) uses Sonnet by default because vars.GH_AW_MODEL_DETECTION_CLAUDE is unset in the repository. The task (running bash commands and analyzing sandbox observations) does not require frontier reasoning, so Haiku should be adequate for it.
No code change required. Set a repository variable:
Repository Settings → Secrets and variables → Actions → Variables
Name: GH_AW_MODEL_DETECTION_CLAUDE
Value: claude-haiku-4-5-20251001
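At the rates used in this report, each token rate scales by the same factor of 1/3.75 when switching from Sonnet to Haiku:
- Cache write: $3.75/M → $1.00/M
- Cache read: $0.30/M → $0.08/M
- Output: $15.00/M → $4.00/M
Because the detection turns account for essentially all of the cost (the main agent is already on Haiku at ~$0.001 naive), the whole per-run figure scales by roughly the same factor.
Projected cost per run: $0.51 × (1/3.75) ≈ $0.14/run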
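The projection is a single scaling step, sketched below so the figures can be re-run if the rates change. It assumes, as this report does, that the whole reported cost scales with the detection model's rates; the variable names are illustrative.

```python
REPORTED_COST_PER_RUN = 0.51   # USD, from this report
SONNET_CACHE_WRITE = 3.75      # USD per million tokens
HAIKU_CACHE_WRITE = 1.00       # USD per million tokens

# Every rate listed above shrinks by the same factor, so any one pair gives it.
scale = HAIKU_CACHE_WRITE / SONNET_CACHE_WRITE

projected = REPORTED_COST_PER_RUN * scale
savings = REPORTED_COST_PER_RUN - projected
savings_pct = 100 * savings / REPORTED_COST_PER_RUN

print(f"Projected cost per run: ${projected:.2f}")
print(f"Savings per run: ${savings:.2f} ({savings_pct:.0f}%)")
```

If the actual Haiku rates differ from the ones assumed here, only the two rate constants need to change.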
2. Remove Task tool from both claude invocations
Estimated savings: Risk mitigation (prevents Sonnet sub-agent spawns)
The compiled lock file's --allowed-tools includes Task for both the main agent and threat detection steps. The Task tool allows Claude to spawn sub-agents — which may use the default model (Sonnet), bypassing the --model flag. The Secret Digger doesn't need sub-agent spawning; it's a linear investigation workflow.
Fix: In .github/workflows/shared/secret-audit.md, the current tools block is:
tools:
  cache-memory: true
  bash: true
The Task tool is included by gh-aw as a default Claude tool, not via the tools: block, so it cannot be removed by editing that block. To exclude it explicitly, check whether gh-aw supports an opt-out such as tools.task: false (this option name is speculative and may not exist yet). Until then, monitor whether Task is ever invoked in runs. It has not appeared in the tool usage logs for the 5 analyzed runs, which suggests Claude isn't using it.
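While no opt-out exists, one way to keep an eye on this is to check the compiled claude command lines for Task in --allowed-tools. The sketch below is a rough illustration, not gh-aw tooling: it assumes the flag takes a single comma-separated value (optionally quoted), and the function names and sample command lines are hypothetical.

```python
import re

# ASSUMED format: --allowed-tools followed by one comma-separated value,
# either bare or wrapped in single/double quotes.
_ALLOWED_TOOLS = re.compile(r'''--allowed-tools[ =]+("[^"]*"|'[^']*'|\S+)''')


def allowed_tools(command_line):
    """Return the tool names passed to --allowed-tools, or [] if absent."""
    match = _ALLOWED_TOOLS.search(command_line)
    if not match:
        return []
    raw = match.group(1).strip("\"'")
    return [name.strip() for name in raw.split(",") if name.strip()]


def allows_task(command_line):
    """True if the bare tool name Task is in the allow-list."""
    # Compare only the part before "(" so scoped entries such as
    # "Bash(git log:*)" or longer names such as "TaskOutput" do not match.
    return any(name.split("(", 1)[0] == "Task" for name in allowed_tools(command_line))


# Hypothetical command lines, for illustration only.
with_task = "claude --model claude-haiku-4-5-20251001 --max-turns 4 --allowed-tools Bash,Read,Task,Write"
without_task = 'claude --allowed-tools "Bash(git log:*),Read"'

print(allows_task(with_task))
print(allows_task(without_task))
```

A check like this only shows that Task is permitted; whether it was actually invoked still has to come from the run's tool usage logs.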
3. Condense secret-audit.md prompt
Estimated savings: ~$0.002/run (<1%) — low priority
The 10-item investigation areas list (5,366 chars) is verbose. The Emergency Exit Rule section (turn budget guidance) is particularly long relative to its value. Trimming to ~2,500 chars saves ~720 tokens.
At Haiku cache write rate ($1.00/M): 720 tokens × $1.00/M × 3 (multiplier) = ~$0.002/run. Negligible.
Only worthwhile if the prompt is causing model confusion (e.g., if the agent is ignoring the "ONE deep area" focus instruction and spreading across all 10 areas — which poor_agentic_control flags have suggested in other Claude workflows).
4. Remove version-reporting.md import
Estimated savings: ~$0.0004/run — trivial
488 chars / ~120 tokens. The version information (cli_version from the lock file) doesn't add investigation value for a security scan. Remove this import to shrink the system prompt slightly.
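The savings estimates for recommendations 3 and 4 follow the same arithmetic, sketched below. The 4-characters-per-token ratio is an assumed rule of thumb, not a tokenizer measurement, and the helper names are illustrative; the $1.00/M rate and the 3× multiplier are the ones used in this report.

```python
CHARS_PER_TOKEN = 4          # ASSUMED rule of thumb, not a tokenizer measurement
HAIKU_CACHE_WRITE = 1.00     # USD per million tokens, as used in this report
OVERHEAD_MULTIPLIER = 3      # reported cost is ~3x the naive cost


def tokens_from_chars(chars):
    """Rough token estimate for a given number of characters."""
    return chars / CHARS_PER_TOKEN


def saving_per_run(tokens_removed):
    """USD saved per run by dropping tokens from the cached system prompt."""
    return tokens_removed * HAIKU_CACHE_WRITE / 1_000_000 * OVERHEAD_MULTIPLIER


trimmed = tokens_from_chars(5_366 - 2_500)   # recommendation 3
version_import = tokens_from_chars(488)      # recommendation 4

print(f"Prompt trim: ~{trimmed:.0f} tokens, ${saving_per_run(trimmed):.4f}/run")
print(f"Import removal: ~{version_import:.0f} tokens, ${saving_per_run(version_import):.4f}/run")
```

Both results land well under a cent per run, which is why these two recommendations are ranked as low priority.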
Expected Impact
| Metric | Current | Projected | Savings |
|---|---|---|---|
| Cost/run | $0.51 | ~$0.14 | −73% |
| Cost per 5-run session | $2.55 | ~$0.69 | −$1.86 |
| Monthly cost (est. 20 runs) | ~$10.20 | ~$2.76 | −$7.44 |
| Cache write cost/run | ~$0.145 | ~$0.039 | −73% |
| Model quality | Sonnet detection | Haiku detection | ✅ Adequate for bash-based investigation |
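Implementation Checklist
- [ ] Set GH_AW_MODEL_DETECTION_CLAUDE = claude-haiku-4-5-20251001 in repository settings (no code change needed)
- [ ] Edit secret-audit.md to remove the verbose 10-item area list and collapse it into 5 focus areas
- [ ] Remove the version-reporting.md import from secret-digger-claude.md
- [ ] Recompile with gh aw compile .github/workflows/secret-digger-claude.md and post-process with npx tsx scripts/ci/postprocess-smoke-workflows.ts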
Technical Notes
The GH_AW_MODEL_AGENT_CLAUDE: "claude-haiku-4-5-20251001" env var in the frontmatter already correctly overrides the main agent model. The pattern ${GH_AW_MODEL_AGENT_CLAUDE:+ --model "$GH_AW_MODEL_AGENT_CLAUDE"} in the compiled lock file appends --model claude-haiku-4-5-20251001 to the main claude invocation. The threat detection invocation uses the analogous ${GH_AW_MODEL_DETECTION_CLAUDE:+ --model "$GH_AW_MODEL_DETECTION_CLAUDE"} pattern, but reads its value from ${{ vars.GH_AW_MODEL_DETECTION_CLAUDE || '' }} (a GitHub Actions variable, not the frontmatter env), which evaluates to an empty string when the variable is unset.
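The two-stage fallback described above can be mimicked in a few lines. This is only a Python emulation of the behaviour, not code from gh-aw: the Actions expression turns an unset variable into an empty string, and the shell's ${VAR:+word} form expands to word only when VAR is set and non-empty. The function names are hypothetical.

```python
from typing import Dict, List, Optional


def detection_model(repo_vars: Dict[str, str]) -> str:
    """Mimic ${{ vars.GH_AW_MODEL_DETECTION_CLAUDE || '' }}: unset becomes ''."""
    return repo_vars.get("GH_AW_MODEL_DETECTION_CLAUDE") or ""


def claude_argv(model: Optional[str]) -> List[str]:
    """Mimic ${VAR:+ --model "$VAR"}: add the flag only if VAR is non-empty."""
    argv = ["claude"]
    if model:
        argv += ["--model", model]
    return argv


# Variable unset: no --model flag, so the CLI falls back to its default model.
print(claude_argv(detection_model({})))

# Variable set as recommended: the flag is appended.
print(claude_argv(detection_model(
    {"GH_AW_MODEL_DETECTION_CLAUDE": "claude-haiku-4-5-20251001"}
)))
```

The first call yields a bare claude invocation, which is the situation the detection step is in today; the second shows the effect of setting the repository variable.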
Generated by Daily Claude Token Optimization Advisor · ● 1.3M · ◷