⚡ Claude Token Optimization 2026-04-12 - Secret Digger (Claude) #1953

## Target Workflow: `secret-digger-claude.md`

**Source report:** #1951
**Estimated cost per run:** $0.51 avg (range $0.48–$0.54)
**Total cost this period:** $10.71 (21 runs on 2026-04-05)
**Total tokens per run:** ~126K as reported; the itemized components sum to ~77K (38K cache write + 38K cache read + ~1K output + ~127 input)
**Cache read rate:** 49% (0.97× read/write ratio; effectively no cross-run reuse)
**Cache write rate:** 100% of context per run
**LLM turns:** 3.0 avg (1 Haiku triage + 2 Sonnet main)
**Model:** `claude-sonnet-4-6` (main) + `claude-haiku-4-5-20251001` (triage)
**Share of total period cost:** 63% ($10.71 of $16.86)

---

## Current Configuration

| Setting | Value |
|---------|-------|
| Tools loaded | 19 (`Bash`, `BashOutput`, `Edit`, `Edit(/cache-memory/*)`, `ExitPlanMode`, `Glob`, `Grep`, `KillBash`, `LS`, `MultiEdit`, `MultiEdit(/cache-memory/*)`, `NotebookEdit`, `NotebookRead`, `Read`, `Read(/cache-memory/*)`, `Task`, `TodoWrite`, `Write`, `Write(/cache-memory/*)`) |
| Tools actually used | Unknown (no `tool_usage` data) — likely: `Bash`, `Read`, `Write(/cache-memory/*)`, `Read(/cache-memory/*)` |
| Network groups | `defaults` (inherited from `shared/secret-audit.md`) |
| Pre-agent steps | No |
| Prompt size | ~5,900 bytes user content (secret-audit.md + version-reporting.md); ~38K tokens total context (dominated by framework system prompts + tool schemas) |
| max-turns | 8 (actual usage: always 3) |

---

## Root Cause Analysis

The cost profile is dominated by Anthropic **cache write charges on every run**:

| Turn | Model | Input | Output | Cache Read | Cache Write |
|------|-------|------:|-------:|-----------:|------------:|
| 1 | Haiku (triage) | 123 | ~91 | 0 | 0 |
| 2 | Sonnet 4.6 | ~4 | ~430 | 0 | **~38,731** |
| 3 | Sonnet 4.6 | ~0 | ~430 | **37,927** | ~500 |

Turn 2 writes the full 38K-token context (framework system prompts + tool schemas + user prompt) to Anthropic's prompt cache. Turn 3 reads it back. **This within-run reuse is the only caching that occurs.**

**Why no cross-run cache reuse?** Anthropic's cache TTL is ~5 minutes. The Secret Digger schedule runs hourly — every run finds a cold cache and pays the full cache write cost again. The 0.97× cache reuse ratio in the report confirms this: each run writes ~39K and reads only ~38K (from the _same_ run's Turn 2), not from prior runs.
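
As a rough illustration of the TTL-versus-schedule mismatch, the sketch below checks whether a follow-up run could land inside the cache window and recomputes the reuse ratio from the per-turn figures in the table above. The function names are hypothetical, the 5-minute TTL is the approximate value quoted in this report, and the check deliberately ignores run duration and any TTL refresh on cache hits.

```python
from datetime import timedelta

# Assumption: ~5-minute prompt-cache TTL, as quoted in this report.
CACHE_TTL = timedelta(minutes=5)


def cross_run_hit_possible(schedule_interval: timedelta,
                           cache_ttl: timedelta = CACHE_TTL) -> bool:
    """True if the next scheduled run could start before the cache entry expires."""
    return schedule_interval < cache_ttl


def reuse_ratio(cache_read_tokens: int, cache_write_tokens: int) -> float:
    """Cache tokens read per cache token written within one run."""
    return cache_read_tokens / cache_write_tokens


# Hourly schedule, as used by Secret Digger.
print(cross_run_hit_possible(timedelta(hours=1)))
# Turn 3 read vs. Turn 2 + Turn 3 writes, from the table above.
print(round(reuse_ratio(37_927, 38_731 + 500), 2))
```

A ratio just under 1.0 is what a single within-run read looks like; cross-run reuse would push it well above 1.0.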

**Implied cache write cost for `claude-sonnet-4-6`:** Of the observed $0.51/run, ~$0.43 is attributable to the ~38.7K cache write tokens, which implies **~$11.12/M tokens**, approximately 3× the claude-3.5-sonnet rate of $3.75/M. This is the primary cost driver.
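
The implied rate can be backed out with a few lines of arithmetic. This is a minimal sketch, assuming the ~$0.43 cache write share quoted in the Cache Analysis section; the helper name is hypothetical. With the rounded $0.43 input it lands at roughly $11.1/M, close to the ~$11.12/M reported here.

```python
def implied_rate_per_million(cost_usd: float, tokens: int) -> float:
    """Back out a per-million-token rate from an observed cost and token count."""
    return cost_usd / tokens * 1_000_000


CACHE_WRITE_COST_USD = 0.43  # cache write share of the $0.51/run (rounded, from this report)
CACHE_WRITE_TOKENS = 38_731  # Turn 2 cache write
REFERENCE_RATE = 3.75        # $/M, the claude-3.5-sonnet cache write rate quoted above

rate = implied_rate_per_million(CACHE_WRITE_COST_USD, CACHE_WRITE_TOKENS)
multiple = rate / REFERENCE_RATE
print(f"implied rate: ${rate:.2f}/M ({multiple:.1f}x the reference rate)")
```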

---

## Recommendations

### 1. Switch main agent to Haiku (96% cost reduction)

**Estimated savings:** ~$0.49/run (~$10.33 per 21-run period)

The Secret Digger task is bash-based security exploration: running `env`, `ps aux`, `find`, inspecting `/proc`, reading files. This is **read-only shell forensics** — it does not require Sonnet-level reasoning. Haiku already handles the triage turn; it can handle the full investigation too.

Haiku cache write pricing is ~37× cheaper than the implied Sonnet 4.6 rate:
- Sonnet 4.6 cache write: ~$11.12/M (implied from data)
- Haiku cache write: ~$0.30/M (assumed here; confirm against current pricing for `claude-haiku-4-5-20251001`)
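
To make the comparison concrete, the sketch below prices the same Turn 2 cache write at both rates. Both rates are the figures used in this report (the Sonnet one is implied, the Haiku one is assumed), not values taken from a price list, and the function name is hypothetical.

```python
SONNET_WRITE_RATE = 11.12  # $/M, implied from the data in this report
HAIKU_WRITE_RATE = 0.30    # $/M, as assumed in this report


def cache_write_cost(tokens: int, rate_per_million: float) -> float:
    """Cost in USD of writing `tokens` to the prompt cache at the given rate."""
    return tokens * rate_per_million / 1_000_000


tokens = 38_731  # Turn 2 cache write
sonnet_cost = cache_write_cost(tokens, SONNET_WRITE_RATE)
haiku_cost = cache_write_cost(tokens, HAIKU_WRITE_RATE)
price_ratio = SONNET_WRITE_RATE / HAIKU_WRITE_RATE

print(f"Sonnet: ${sonnet_cost:.3f}  Haiku: ${haiku_cost:.3f}  ratio: {price_ratio:.0f}x")
```

If the actual Haiku rate differs from the assumed one, only `HAIKU_WRITE_RATE` needs to change to re-derive the projected savings.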

**Implementation — Option A: workflow-level override** (preferred, scoped to this workflow only):

Edit `.github/workflows/secret-digger-claude.md`:

```yaml
engine:
  id: claude
  max-turns: 4        # also reduce from 8 (see rec #2)
  env:
    BASH_DEFAULT_TIMEOUT_MS: "1800000"
    BASH_MAX_TIMEOUT_MS: "1800000"
    GH_AW_MODEL_AGENT_CLAUDE: "claude-haiku-4-5-20251001"
```

> **Note:** The lock file reads `GH_AW_MODEL_AGENT_CLAUDE` as `${GH_AW_MODEL_AGENT_CLAUDE:+ --model "$GH_AW_MODEL_AGENT_CLAUDE"}`; setting it via `engine.env` injects it into the agent environment. Verify that this takes precedence over the `vars.GH_AW_MODEL_AGENT_CLAUDE` repo variable in the compiled lock.

**Implementation — Option B: repo variable** (affects all Claude workflows):

Set the GitHub Actions repository variable `GH_AW_MODEL_AGENT_CLAUDE` to `claude-haiku-4-5-20251001`. This is simpler but affects Smoke Claude and Security Guard too. Use Option A if you want this scoped.

**Quality consideration:** The workflow prompt in `shared/secret-audit.md` already instructs 6–8 focused tool calls per run and uses cache-memory to maintain state across runs. Haiku is well-suited to this structured, bash-heavy task. The final report is written to a GitHub Issue — review a few Haiku-generated issues to confirm quality meets the bar before committing.

---

### 2. Lower `max-turns` from 8 to 4

**Estimated savings:** $0 token savings (turns are bounded by task completion, not the limit)
**Risk reduction:** Prevents runaway turn escalation (all 21 runs completed in exactly 3 turns)

In `.github/workflows/secret-digger-claude.md`:

```yaml
engine:
  id: claude
  max-turns: 4    # was 8; actual usage is always 3
```

---

### 3. Remove unused tools: `NotebookEdit`, `NotebookRead`, `Task`

**Estimated savings:** ~1,500–2,100 tokens/run from smaller tool schema (~3–5% of cache write)

Security scanning never needs Jupyter notebook editing or sub-agent spawning. These tools are included by the `bash: true` tool config but add schema tokens to every prompt.

Check whether gh-aw supports a tool exclusion syntax (not currently documented in `secret-digger-claude.md`). If supported, add to the workflow:

```yaml
tools:
  cache-memory: true
  bash: true
  bash-exclude:           # hypothetical — verify syntax in gh-aw docs
    - NotebookEdit
    - NotebookRead
    - Task
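    - TodoWrite           # beyond the three tools in the estimate above; likely unused as well
    - ExitPlanMode        # beyond the three tools in the estimate above; likely unused as well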
  github: false
```

If exclusion isn't supported, this is a framework feature request. Each removed tool saves ~500–700 schema tokens per turn; removing all three saves up to 3 × 700 × 2 turns × 21 runs ≈ 88K tokens/period.
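
The per-run and per-period savings follow from the per-tool schema estimate. A minimal sketch of that arithmetic, assuming three tools removed and the 500–700 token range above (the helper name is hypothetical):

```python
def schema_tokens_saved(tools_removed: int, tokens_per_tool: int,
                        turns_per_run: int, runs: int) -> int:
    """Total schema tokens no longer sent across all turns and runs."""
    return tools_removed * tokens_per_tool * turns_per_run * runs


# Context shrinkage per run (counted once): 3 tools at 500-700 tokens each.
per_run_low = schema_tokens_saved(3, 500, turns_per_run=1, runs=1)
per_run_high = schema_tokens_saved(3, 700, turns_per_run=1, runs=1)

# Processed tokens per period: 2 Sonnet turns, 21 runs.
period_low = schema_tokens_saved(3, 500, turns_per_run=2, runs=21)
period_high = schema_tokens_saved(3, 700, turns_per_run=2, runs=21)

print(per_run_low, per_run_high, period_low, period_high)
```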

---

### 4. Trim `shared/secret-audit.md` (minor)

**Estimated savings:** ~600 tokens/run (~1.5% of cache write)

The investigation prompt is 5,366 bytes (~1,340 tokens) with 10 numbered investigation areas plus emergency exit rules, workflow steps, and reporting instructions. Consolidate redundant sections:

- Merge steps 1–3 of "Investigation Workflow" into one sentence (they restate the prompt body)
- Abbreviate the 10 investigation areas into a compact list (agent already knows these techniques)
- Remove the "Emergency Exit Rule" section — the `max-turns` config enforces this structurally

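A 40–50% size reduction of `secret-audit.md` removes ~540–670 tokens from the context. Because that context is processed on both Sonnet turns (written to cache on Turn 2, read back on Turn 3), this amounts to up to ~1,340 fewer processed tokens/run.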
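
The token estimate can be sanity-checked from the byte count. This sketch assumes a rough 4-bytes-per-token heuristic, which is consistent with the 5,366 bytes ≈ 1,340 tokens figure above but is not a tokenizer measurement; the helper names are hypothetical.

```python
BYTES_PER_TOKEN = 4  # rough heuristic (assumption), not an actual tokenizer count


def estimate_tokens(n_bytes: int) -> int:
    """Approximate token count for a prompt of the given size in bytes."""
    return n_bytes // BYTES_PER_TOKEN


def tokens_saved(prompt_tokens: int, reduction: float) -> float:
    """Tokens removed from the context for a fractional size reduction."""
    return prompt_tokens * reduction


prompt_tokens = estimate_tokens(5_366)         # secret-audit.md
saved_low = tokens_saved(prompt_tokens, 0.40)  # 40% trim
saved_high = tokens_saved(prompt_tokens, 0.50) # 50% trim

print(prompt_tokens, round(saved_low), round(saved_high))
```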

---

## Cache Analysis (Anthropic-Specific)

| Turn | Model | Input | Output | Cache Read | Cache Write | Net New (input + cache write) |
|------|-------|------:|-------:|-----------:|------------:|--------:|
| 1 | Haiku | 123 | ~91 | 0 | 0 | 123 |
| 2 | Sonnet 4.6 | ~4 | ~430 | 0 | 38,731 | 38,735 |
| 3 | Sonnet 4.6 | ~0 | ~430 | 37,927 | ~500 | ~500 |

**Cache write amortization:** Turn 2's 38,731 cache write tokens are read back ONCE by Turn 3. Cost/benefit per run: write cost ≈ $0.43 (at implied $11.12/M), read savings ≈ $0.034 (at $0.88/M implied read price). **The cache costs 13× more per run than it saves** — it is purely a within-run mechanism and the TTL-vs-schedule mismatch makes it structurally inefficient at this cadence.
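
The write-versus-read comparison can be reproduced from the per-turn token counts. A minimal sketch, assuming the implied rates quoted in this report ($11.12/M write, $0.88/M read); the helper name is hypothetical.

```python
WRITE_RATE = 11.12  # $/M, implied cache write rate from this report
READ_RATE = 0.88    # $/M, implied cache read rate from this report


def token_cost(tokens: int, rate_per_million: float) -> float:
    """Cost in USD for `tokens` at a per-million-token rate."""
    return tokens * rate_per_million / 1_000_000


write_cost = token_cost(38_731, WRITE_RATE)  # Turn 2 cache write
read_cost = token_cost(37_927, READ_RATE)    # Turn 3 cache read
write_to_read = write_cost / read_cost

print(f"write ${write_cost:.3f}, read ${read_cost:.3f}, ratio {write_to_read:.1f}x")
```

Each additional turn that reads the same cached prefix would add another read against the same single write, so the ratio improves only if a run takes more turns or runs land within the TTL.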

**Cache write grows across runs:** `cache_write` ranges from 38,526 to 40,088 across the 21 runs (min→max), suggesting cache-memory state accumulation adds ~1–2K tokens to context over time. This will continue growing slowly as investigation history accumulates.
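
One way to keep an eye on this trend is a simple threshold check against the ~45K-token pruning trigger suggested in the implementation checklist. The sketch below is illustrative only: the function names are hypothetical and the inputs are the min/max `cache_write` values quoted above.

```python
PRUNE_THRESHOLD = 45_000  # tokens, the pruning trigger suggested in the checklist


def context_headroom(current_tokens: int, threshold: int = PRUNE_THRESHOLD) -> int:
    """Tokens remaining before the pruning threshold is reached."""
    return threshold - current_tokens


def needs_pruning(current_tokens: int, threshold: int = PRUNE_THRESHOLD) -> bool:
    """True once the cached context has grown past the threshold."""
    return current_tokens > threshold


observed_min, observed_max = 38_526, 40_088  # cache_write range over the 21 runs
growth = observed_max - observed_min
headroom = context_headroom(observed_max)

print(growth, headroom, needs_pruning(observed_max))
```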

---

## Expected Impact

| Metric | Current | Projected (Haiku + max-turns 4) | Savings |
|--------|---------|--------------------------------|---------|
| Cost/run | $0.51 | ~$0.018 | -96% |
| Cache write cost/run | ~$0.43 | ~$0.012 | -97% |
| Cost for 21-run period | $10.71 | ~$0.38 | ~$10.33 |
| LLM turns | 3 | 3 (unchanged) | 0 |
| Max runaway turns | 8 | 4 | -50% |
| Token volume | 126K/run | ~39K effective | -69% |
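
The percentages and period totals in the table above can be re-derived from its Current and Projected columns. A minimal sketch of that check (the helper name is hypothetical; all inputs are figures from this report):

```python
def pct_reduction(current: float, projected: float) -> float:
    """Percentage reduction going from `current` to `projected`."""
    return (current - projected) / current * 100


RUNS = 21

cost_cut = pct_reduction(0.51, 0.018)    # cost per run
write_cut = pct_reduction(0.43, 0.012)   # cache write cost per run
volume_cut = pct_reduction(126, 39)      # token volume, in thousands

period_current = RUNS * 0.51
period_projected = RUNS * 0.018
period_savings = period_current - period_projected

print(round(cost_cut), round(write_cut), round(volume_cut))
print(round(period_current, 2), round(period_projected, 2), round(period_savings, 2))
```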

---

## Implementation Checklist

- [ ] **Decide on model approach**: workflow-level `env` override (Option A) vs repo variable (Option B)
- [ ] Edit `.github/workflows/secret-digger-claude.md`: set Haiku model, lower `max-turns` to 4
- [ ] Recompile: `gh aw compile .github/workflows/secret-digger-claude.md`
- [ ] Post-process: `npx tsx scripts/ci/postprocess-smoke-workflows.ts`
- [ ] Trigger 2–3 manual runs (`workflow_dispatch`) and verify investigation quality in created issues
- [ ] Compare `estimated_cost` in next token usage report vs $0.51 baseline
- [ ] (Optional) Investigate tool exclusion syntax with gh-aw team; trim `shared/secret-audit.md`
- [ ] (Optional) Investigate growing `cache_write` trend — consider pruning old cache-memory entries if context grows beyond ~45K tokens




> Generated by [Daily Claude Token Optimization Advisor](https://github.com/github/gh-aw-firewall/actions/runs/24315540348/agentic_workflow) · ● 856.9K · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw-firewall+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw-firewall%2Fclaude-token-optimizer%22&type=issues)
