Target Workflow: secret-digger-claude.md
Source report: #1951
Estimated cost per run: $0.51 avg (range $0.48–$0.54)
Total cost this period: $10.71 (21 runs on 2026-04-05)
Total tokens per run: ~126K (38K cache write + 38K cache read + ~1K output + ~127 input)
Cache read rate: 49% (0.97× read/write ratio; effectively no cross-run reuse)
Cache write rate: 100% of context per run
LLM turns: 3.0 avg (1 Haiku triage + 2 Sonnet main)
Model: claude-sonnet-4-6 (main) + claude-haiku-4-5-20251001 (triage)
Share of total period cost: 63% ($10.71 of $16.86)
Current Configuration
| Setting | Value |
|---|---|
| Tools loaded | 19 (Bash, BashOutput, Edit, Edit(/cache-memory/*), ExitPlanMode, Glob, Grep, KillBash, LS, MultiEdit, MultiEdit(/cache-memory/*), NotebookEdit, NotebookRead, Read, Read(/cache-memory/*), Task, TodoWrite, Write, Write(/cache-memory/*)) |
| Tools actually used | Unknown (no tool_usage data); likely: Bash, Read, Write(/cache-memory/*), Read(/cache-memory/*) |
| Network groups | defaults (inherited from shared/secret-audit.md) |
| Pre-agent steps | No |
| Prompt size | ~5,900 bytes user content (secret-audit.md + version-reporting.md); ~38K tokens total context (dominated by framework system prompts + tool schemas) |
| max-turns | 8 (actual usage: always 3) |
Root Cause Analysis
The cost profile is dominated by Anthropic cache write charges on every run:
| Turn | Model | Input | Output | Cache Read | Cache Write |
|---|---|---|---|---|---|
| 1 | Haiku (triage) | 123 | ~91 | 0 | 0 |
| 2 | Sonnet 4.6 | ~4 | ~430 | 0 | ~38,731 |
| 3 | Sonnet 4.6 | ~0 | ~430 | 37,927 | ~500 |
Turn 2 writes the full 38K-token context (framework system prompts + tool schemas + user prompt) to Anthropic's prompt cache. Turn 3 reads it back. This within-run reuse is the only caching that occurs.
Why no cross-run cache reuse? Anthropic's cache TTL is ~5 minutes. The Secret Digger schedule runs hourly — every run finds a cold cache and pays the full cache write cost again. The 0.97× cache reuse ratio in the report confirms this: each run writes ~39K and reads only ~38K (from the same run's Turn 2), not from prior runs.
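The TTL-versus-schedule argument can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the ~5 minute TTL and hourly schedule quoted above and a simplified expiry model in which nothing refreshes the cache between runs; the function names are illustrative and not part of gh-aw or any Anthropic SDK.

```python
# Sketch only: the TTL and schedule values are the ones quoted in the text above.
CACHE_TTL_SECONDS = 5 * 60       # ~5 minute prompt cache TTL
RUN_INTERVAL_SECONDS = 60 * 60   # hourly schedule

# Per-run token counts from the turn table above.
TURN2_CACHE_WRITE = 38_731
TURN3_CACHE_WRITE = 500
TURN3_CACHE_READ = 37_927


def finds_warm_cache(run_interval_s: float, ttl_s: float) -> bool:
    """True if the previous run's cache entry could still be alive.

    Assumption: the entry expires ttl_s seconds after the previous run
    last touched it, and nothing refreshes it between runs.
    """
    return run_interval_s < ttl_s


def reuse_ratio(cache_read: int, cache_write: int) -> float:
    """Cache tokens read per cache token written within one run."""
    return cache_read / cache_write


warm = finds_warm_cache(RUN_INTERVAL_SECONDS, CACHE_TTL_SECONDS)
ratio = reuse_ratio(TURN3_CACHE_READ, TURN2_CACHE_WRITE + TURN3_CACHE_WRITE)
print(f"warm cache at next run: {warm}")
print(f"read/write ratio: {ratio:.2f}")
```

Under these assumptions the hourly run always misses the cache, and the within-run ratio comes out at about 0.97, matching the report.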
Implied cache write cost for claude-sonnet-4-6: Attributing ≈$0.43 of the observed $0.51/run to the 38.7K cache write tokens implies **$11.12/M tokens**, approximately 3× the claude-3.5-sonnet rate of $3.75/M. This is the primary cost driver.
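As a rough cross-check of the implied rate, the sketch below recomputes the per-run cache write cost from the figures in this report. All rates are the report's own numbers (the $11.12/M value is implied, not a published price), and `cost_usd` is a helper name introduced here for illustration.

```python
# Figures taken from this report; none are verified against a price list.
CACHE_WRITE_TOKENS = 38_731      # Turn 2 cache write tokens
IMPLIED_RATE_PER_M = 11.12       # USD per million tokens (implied)
REFERENCE_RATE_PER_M = 3.75      # claude-3.5-sonnet cache write rate cited above
OBSERVED_COST_PER_RUN = 0.51     # USD, average per run


def cost_usd(tokens: int, rate_per_million: float) -> float:
    """Cost in USD for a token count at a per-million-token rate."""
    return tokens * rate_per_million / 1_000_000


write_cost = cost_usd(CACHE_WRITE_TOKENS, IMPLIED_RATE_PER_M)
rate_multiple = IMPLIED_RATE_PER_M / REFERENCE_RATE_PER_M
write_share = write_cost / OBSERVED_COST_PER_RUN

print(f"cache write cost per run: ${write_cost:.2f}")
print(f"multiple of reference rate: {rate_multiple:.1f}x")
print(f"share of per-run cost: {write_share:.0%}")
```

With these inputs the cache write accounts for roughly 84% of the per-run cost, which is why the remaining recommendations focus on it.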
Recommendations
1. Switch main agent to Haiku (96% cost reduction)
Estimated savings: $0.49/run (~$10.33 per 21-run period)
The Secret Digger task is bash-based security exploration: running env, ps aux, find, inspecting /proc, reading files. This is read-only shell forensics — it does not require Sonnet-level reasoning. Haiku already handles the triage turn; it can handle the full investigation too.
Haiku cache write pricing is ~37× cheaper than the implied Sonnet 4.6 rate:
- Sonnet 4.6 cache write: ~$11.12/M (implied from data)
- Haiku cache write: ~$0.30/M
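The price comparison above can be reproduced as follows. This is a sketch using only the two rates listed in this report; both are estimates from the report and should be confirmed against current pricing before relying on the projection.

```python
# Rates as listed above (USD per million cache write tokens).
SONNET_IMPLIED_RATE = 11.12   # implied from observed cost
HAIKU_ASSUMED_RATE = 0.30     # as stated in this report; verify before use
CACHE_WRITE_TOKENS = 38_731   # Turn 2 cache write tokens per run

price_ratio = SONNET_IMPLIED_RATE / HAIKU_ASSUMED_RATE
haiku_write_cost = CACHE_WRITE_TOKENS * HAIKU_ASSUMED_RATE / 1_000_000

print(f"Sonnet/Haiku cache write price ratio: {price_ratio:.0f}x")
print(f"projected Haiku cache write cost per run: ${haiku_write_cost:.3f}")
```

If the Haiku rate turns out to be higher than assumed, the projected saving shrinks proportionally, so the ratio is the number to re-derive first.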
Implementation — Option A: workflow-level override (preferred, scoped to this workflow only):
Edit .github/workflows/secret-digger-claude.md:
engine:
  id: claude
  max-turns: 4  # also reduce from 8 (see rec #2)
  env:
    BASH_DEFAULT_TIMEOUT_MS: "1800000"
    BASH_MAX_TIMEOUT_MS: "1800000"
    GH_AW_MODEL_AGENT_CLAUDE: "claude-haiku-4-5-20251001"
Note: GH_AW_MODEL_AGENT_CLAUDE is read by the lock file as ${GH_AW_MODEL_AGENT_CLAUDE:+ --model "$GH_AW_MODEL_AGENT_CLAUDE"}; setting it via engine.env injects it into the agent environment. Verify that this takes precedence over the vars.GH_AW_MODEL_AGENT_CLAUDE repo variable in the compiled lock.
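The `${VAR:+word}` shell expansion produces `word` only when `VAR` is set and non-empty. The sketch below mirrors that behaviour in Python to make the effect of the override concrete; `model_args` is a hypothetical helper written for this illustration and is not how the lock file is actually implemented.

```python
import os
from typing import Mapping, Optional


def model_args(env: Optional[Mapping[str, str]] = None) -> list:
    """Mimic ${GH_AW_MODEL_AGENT_CLAUDE:+ --model "$GH_AW_MODEL_AGENT_CLAUDE"}.

    Returns the extra CLI arguments only when the variable is set and
    non-empty, and an empty list otherwise.
    """
    env = os.environ if env is None else env
    model = env.get("GH_AW_MODEL_AGENT_CLAUDE", "")
    return ["--model", model] if model else []


# Unset or empty: no flag is added, so the engine default model applies.
print(model_args({}))
print(model_args({"GH_AW_MODEL_AGENT_CLAUDE": ""}))

# Set: the model flag is appended.
print(model_args({"GH_AW_MODEL_AGENT_CLAUDE": "claude-haiku-4-5-20251001"}))
```

This only illustrates the expansion rule; which source wins when both engine.env and the repo variable are set still has to be checked in the compiled lock file.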
Implementation — Option B: repo variable (affects all Claude workflows):
Set the GitHub Actions repository variable GH_AW_MODEL_AGENT_CLAUDE to claude-haiku-4-5-20251001. This is simpler but affects Smoke Claude and Security Guard too. Use Option A if you want this scoped.
Quality consideration: The workflow prompt in shared/secret-audit.md already instructs 6–8 focused tool calls per run and uses cache-memory to maintain state across runs. Haiku is well-suited to this structured, bash-heavy task. The final report is written to a GitHub Issue — review a few Haiku-generated issues to confirm quality meets the bar before committing.
2. Lower max-turns from 8 to 4
Estimated savings: $0 token savings (turns are bounded by task completion, not the limit)
Risk reduction: Prevents runaway turn escalation (all 21 runs completed in exactly 3 turns)
In .github/workflows/secret-digger-claude.md:
engine:
  id: claude
  max-turns: 4  # was 8; actual usage is always 3
3. Remove unused tools: NotebookEdit, NotebookRead, Task
Estimated savings: ~1,500–2,100 tokens/run from smaller tool schema (~4–5% of cache write)
Security scanning never needs Jupyter notebook editing or sub-agent spawning. These tools are included by the bash: true tool config but add schema tokens to every prompt.
Check whether gh-aw supports a tool exclusion syntax (not currently documented in secret-digger-claude.md). If supported, add to the workflow:
tools:
  cache-memory: true
  bash: true
  bash-exclude:  # hypothetical; verify syntax in gh-aw docs
    - NotebookEdit
    - NotebookRead
    - Task
    - TodoWrite
    - ExitPlanMode
  github: false
If exclusion isn't supported, this is a framework feature request. Each removed tool saves ~500–700 schema tokens per prompt; for the three tools named above, up to 2,100 tokens × 2 turns × 21 runs ≈ 88K tokens/period.
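The schema-token estimate can be reproduced with the sketch below. The per-tool range of 500 to 700 tokens is the report's estimate, not a measured value, and the variable names are illustrative.

```python
# Estimates from this report; per-tool schema size is not measured.
TOOLS_REMOVED = 3                 # NotebookEdit, NotebookRead, Task
TOKENS_PER_TOOL = (500, 700)      # low and high estimate per tool schema
SONNET_TURNS_PER_RUN = 2          # one cache write turn + one cache read turn
RUNS_PER_PERIOD = 21

per_run = tuple(TOOLS_REMOVED * t for t in TOKENS_PER_TOOL)
per_period = tuple(r * SONNET_TURNS_PER_RUN * RUNS_PER_PERIOD for r in per_run)

print(f"schema tokens removed per prompt: {per_run[0]:,} to {per_run[1]:,}")
print(f"tokens avoided per period: {per_period[0]:,} to {per_period[1]:,}")
```

Removing TodoWrite and ExitPlanMode as well would raise `TOOLS_REMOVED` to 5, assuming their schemas fall in the same size range.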
4. Trim shared/secret-audit.md (minor)
Estimated savings: ~600 tokens/run (~1.5% of cache write)
The investigation prompt is 5,366 bytes (~1,340 tokens) with 10 numbered investigation areas plus emergency exit rules, workflow steps, and reporting instructions. Consolidate redundant sections:
- Merge steps 1–3 of "Investigation Workflow" into one sentence (they restate the prompt body)
- Abbreviate the 10 investigation areas into a compact list (agent already knows these techniques)
- Remove the "Emergency Exit Rule" section; the max-turns config enforces this structurally
Estimated 40–50% size reduction of secret-audit.md: ~540–670 tokens removed from the context, or up to ~670 × 2 turns = ~1,340 fewer tokens processed per run.
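The trim estimate follows from the prompt size quoted above. This is a minimal sketch, assuming the report's ~1,340 token figure for the prompt and that the trimmed text is paid for once per Sonnet turn (written on one turn, read on the next).

```python
# Inputs from this report.
PROMPT_TOKENS = 1_340             # shared/secret-audit.md, approx.
REDUCTION_RANGE = (0.40, 0.50)    # estimated size reduction
SONNET_TURNS_PER_RUN = 2          # context is processed on both Sonnet turns

saved_per_prompt = tuple(round(PROMPT_TOKENS * r) for r in REDUCTION_RANGE)
saved_per_run = tuple(s * SONNET_TURNS_PER_RUN for s in saved_per_prompt)

print(f"tokens removed from context: {saved_per_prompt[0]} to {saved_per_prompt[1]}")
print(f"tokens avoided per run: {saved_per_run[0]:,} to {saved_per_run[1]:,}")
```

The headline "~600 tokens/run" figure sits in the middle of the per-prompt range; the per-run range is larger because it counts both turns.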
Cache Analysis (Anthropic-Specific)
| Turn | Model | Input | Output | Cache Read | Cache Write | Net New |
|---|---|---|---|---|---|---|
| 1 | Haiku | 123 | ~91 | 0 | 0 | 214 |
| 2 | Sonnet 4.6 | ~4 | ~430 | 0 | 38,731 | 38,735 |
| 3 | Sonnet 4.6 | ~0 | ~430 | 37,927 | ~500 | ~500 |
Cache write amortization: Turn 2's 38,731 cache write tokens are read back ONCE by Turn 3. Cost/benefit per run: write cost ≈ $0.43 (at implied $11.12/M), read savings ≈ $0.034 (at $0.88/M implied read price). The cache costs 13× more per run than it saves — it is purely a within-run mechanism and the TTL-vs-schedule mismatch makes it structurally inefficient at this cadence.
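The amortization ratio can be recomputed from the table. The sketch below uses the report's implied write and read prices, so it inherits whatever error those implied rates carry; the result is rounded, and small differences from the figures quoted above are rounding effects.

```python
# Token counts from the table above; prices are the implied rates in this report.
CACHE_WRITE_TOKENS = 38_731
CACHE_READ_TOKENS = 37_927
WRITE_RATE_PER_M = 11.12   # USD per million tokens (implied)
READ_RATE_PER_M = 0.88     # USD per million tokens (implied)

write_cost = CACHE_WRITE_TOKENS * WRITE_RATE_PER_M / 1_000_000
read_value = CACHE_READ_TOKENS * READ_RATE_PER_M / 1_000_000
cost_to_benefit = write_cost / read_value

print(f"write cost per run: ${write_cost:.3f}")
print(f"read-side value per run: ${read_value:.3f}")
print(f"cost/benefit ratio: {cost_to_benefit:.1f}x")
```

A ratio well above 1 means the single within-run read cannot pay back the write; more reads per write (more turns, or runs inside the TTL) would be needed to change that.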
Cache write grows across runs: cache_write ranges from 38,526 to 40,088 across the 21 runs (min→max), suggesting cache-memory state accumulation adds ~1–2K tokens to context over time. This will continue growing slowly as investigation history accumulates.
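A simple check along these lines could flag when context growth warrants pruning. This is a sketch under stated assumptions: only the minimum and maximum `cache_write` values are known from this report, the 45K threshold is the level mentioned in the implementation checklist, and `needs_pruning` is a hypothetical helper rather than an existing gh-aw feature.

```python
from typing import Sequence

PRUNE_THRESHOLD_TOKENS = 45_000  # level mentioned in the checklist


def growth(samples: Sequence[int]) -> int:
    """Spread between the largest and smallest observed cache_write."""
    return max(samples) - min(samples)


def needs_pruning(samples: Sequence[int],
                  threshold: int = PRUNE_THRESHOLD_TOKENS) -> bool:
    """True when the most recent cache_write exceeds the threshold."""
    return samples[-1] > threshold


# Only the min and max of the 21 runs are reported, so use those two points.
observed = [38_526, 40_088]

print(f"growth over the period: {growth(observed):,} tokens")
print(f"pruning needed: {needs_pruning(observed)}")
```

Feeding the full per-run series from the token usage report, in chronological order, would make the check meaningful; with two points it only confirms the ~1.5K spread.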
Expected Impact
| Metric | Current | Projected (Haiku + max-turns 4) | Savings |
|---|---|---|---|
| Cost/run | $0.51 | ~$0.018 | -96% |
| Cache write cost/run | ~$0.43 | ~$0.012 | -97% |
| Cost for 21-run period | $10.71 | ~$0.38 | ~$10.33 |
| LLM turns | 3 | 3 (unchanged) | 0 |
| Max runaway turns | 8 | 4 | -50% |
| Token volume | 126K/run | ~39K effective | -69% |
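The savings column can be re-derived from the current and projected figures. The sketch below uses only numbers from the table; the projected values depend on the assumed Haiku pricing, so treat the output as a consistency check rather than a forecast.

```python
# Values from the Expected Impact table.
RUNS_PER_PERIOD = 21
CURRENT_COST_PER_RUN = 0.51
PROJECTED_COST_PER_RUN = 0.018
CURRENT_WRITE_COST = 0.43
PROJECTED_WRITE_COST = 0.012
CURRENT_PERIOD_COST = 10.71


def reduction(current: float, projected: float) -> float:
    """Fractional reduction from current to projected."""
    return 1 - projected / current


cost_reduction = reduction(CURRENT_COST_PER_RUN, PROJECTED_COST_PER_RUN)
write_reduction = reduction(CURRENT_WRITE_COST, PROJECTED_WRITE_COST)
projected_period_cost = RUNS_PER_PERIOD * PROJECTED_COST_PER_RUN
period_savings = CURRENT_PERIOD_COST - projected_period_cost

print(f"cost/run reduction: {cost_reduction:.0%}")
print(f"cache write reduction: {write_reduction:.0%}")
print(f"projected period cost: ${projected_period_cost:.2f}")
print(f"period savings: ${period_savings:.2f}")
```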
Implementation Checklist
- Choose env override (Option A) vs repo variable (Option B)
- Edit .github/workflows/secret-digger-claude.md: set Haiku model, lower max-turns to 4
- Run gh aw compile .github/workflows/secret-digger-claude.md
- Run npx tsx scripts/ci/postprocess-smoke-workflows.ts
- Trigger a run (workflow_dispatch) and verify investigation quality in created issues
- Compare estimated_cost in next token usage report vs $0.51 baseline
- Trim shared/secret-audit.md (recommendation 4)
- Monitor cache_write trend; consider pruning old cache-memory entries if context grows beyond ~45K tokens
Generated by Daily Claude Token Optimization Advisor · ● 856.9K