
⚡ Claude Token Optimization 2026-04-13 — Secret Digger (Claude) #1968

@github-actions

Description


Target Workflow: secret-digger-claude.md

Source report: #1967
Estimated cost per run: $0.51
Total tokens per run: ~126K (77K raw; token_usage field includes overhead multiplier)
Cache read rate: ~49% of context (within-run only; no cross-run reuse)
Cache write rate: ~50% of context
LLM turns: 3 (1 main agent + 2 threat detection)


Current Configuration

| Setting | Value |
| --- | --- |
| Tools loaded | bash, cache-memory (minimal — already optimal) |
| Network groups | defaults only |
| Pre-agent steps | None |
| Prompt size | ~5,850 chars (user content) + ~145K chars framework injections → ~38K token system prompt |
| Models | Main agent: claude-haiku-4-5-20251001 (1 turn) · Threat detection: claude-sonnet-4-6 (2 turns) |

Turn Architecture

This workflow runs two separate claude subprocess invocations:

  1. Main agent (claude --model claude-haiku-4-5-20251001 --max-turns 4): Uses Haiku ✅ — runs the security investigation (bash commands, file reads). Makes 1 turn.
  2. Threat detection (claude — no --model flag): Defaults to claude-sonnet-4-6 ❌ — runs post-hoc validation. Makes 2 turns.

The GH_AW_MODEL_AGENT_CLAUDE: "claude-haiku-4-5-20251001" env var in engine.env correctly controls the main agent. The threat detection step reads its model from ${{ vars.GH_AW_MODEL_DETECTION_CLAUDE || '' }} — a repository variable, not the frontmatter env. Since that variable is unset, the detection step falls back to Sonnet.
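The `:+` parameter expansion behind this fallback can be sketched in plain shell — a minimal demonstration of the mechanism, not the actual lock-file code (the `MODEL` variable here is a stand-in):

```shell
# ${VAR:+word} expands to word only when VAR is set and non-empty;
# otherwise it expands to nothing, so no --model flag is emitted.
MODEL="claude-haiku-4-5-20251001"
echo "claude${MODEL:+ --model $MODEL}"   # flag appended when the variable is set

MODEL=""
echo "claude${MODEL:+ --model $MODEL}"   # bare invocation → engine default model
```

This is why an unset repository variable silently yields the Sonnet default: the expansion produces an empty string rather than an error.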


Token Analysis: The Cache Write Problem

Every run pays a cache write penalty of ~38K tokens for the full system prompt. Because runs are spaced ~1 hour apart and the Anthropic cache TTL is ~5 minutes, zero cross-run cache reuse occurs — each run is a cold start.

Within a single run, however, the cache works efficiently: the detection step's Turn 1 writes ~38K tokens, and Turn 2 reads them back (≈1.0× within-run efficiency).

Per-Turn Cache Breakdown (avg over 5 runs)

| Turn | Model | Input | Output | Cache Read | Cache Write | Naive Cost |
| --- | --- | --- | --- | --- | --- | --- |
| T1 (main agent) | Haiku | 123 | ~87 | 0 | 0 | ~$0.001 |
| T2 (detection) | Sonnet | ~2 | ~450 | 0 | ~38,990 | ~$0.153 |
| T3 (detection) | Sonnet | ~2 | ~400 | ~37,927 | ~0 | ~$0.017 |
| Total | | ~127 | ~937 | ~37,927 | ~38,990 | ~$0.171 |

Note: Reported cost ($0.51) is ~3× the naive calculation — consistent with a cost estimator applying higher effective rates for the context overhead in token_usage (125,840) vs the raw summary sum (77,728). The 3× ratio is stable across all 5 runs (σ < 1%), so savings estimates below apply proportionally.

Cache write amortization: Turn 2's 38K cache write is read once by Turn 3. That's a 1:1 write-to-read ratio within the run. Since no cross-run reads occur, each run's cache write is effectively single-use.

Cache cost vs benefit (Sonnet): Writing 38K tokens at $3.75/M = ~$0.145. Reading them at $0.30/M = ~$0.011. Break-even requires ~12.5× reads per write. With only 1 intra-run read, caching saves ~$0.011 but costs ~$0.145 — a net loss of ~$0.134 vs uncached if the output could be regenerated more cheaply. The saving grace: moving to Haiku reduces the cache write cost by 3.75×.
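The break-even arithmetic above can be reproduced with a quick awk sketch; the rates are the Sonnet prices assumed in this report, not authoritative figures:

```shell
awk 'BEGIN {
  write_rate = 3.75 / 1e6   # $/token, Sonnet cache write (assumed)
  read_rate  = 0.30 / 1e6   # $/token, Sonnet cache read (assumed)
  tokens     = 38000        # approx. system-prompt size cached per run
  printf "write cost: $%.3f\n", tokens * write_rate
  printf "read cost:  $%.3f\n", tokens * read_rate
  printf "break-even: %.1f reads per write\n", write_rate / read_rate
}'
```

With one intra-run read against a 12.5-read break-even, the write is priced like an investment that never pays off across runs.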


Recommendations

1. Set GH_AW_MODEL_DETECTION_CLAUDE repository variable

Estimated savings: ~$0.37/run (~73%)

The threat detection step (2 of 3 API calls) uses Sonnet by default because vars.GH_AW_MODEL_DETECTION_CLAUDE is unset in the repository. The task — running bash commands and analyzing sandbox observations — does not require frontier reasoning; Haiku handles it equally well.

No code change required. Set a repository variable:

```
Repository Settings → Secrets and variables → Actions → Variables
Name:  GH_AW_MODEL_DETECTION_CLAUDE
Value: claude-haiku-4-5-20251001
```
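If the GitHub CLI is available, the same variable can be set from a terminal — a sketch assuming `gh` ≥ 2.31, admin access, and `OWNER/REPO` as a placeholder for the actual repository:

```shell
# Set the Actions repository variable that the detection step reads.
gh variable set GH_AW_MODEL_DETECTION_CLAUDE \
  --body "claude-haiku-4-5-20251001" \
  --repo OWNER/REPO

# Confirm it took effect.
gh variable list --repo OWNER/REPO | grep GH_AW_MODEL_DETECTION_CLAUDE
```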

All Anthropic token rates scale by exactly 1/3.75 when switching from Sonnet to Haiku:

  • Cache write: $3.75/M → $1.00/M
  • Cache read: $0.30/M → $0.08/M
  • Output: $15.00/M → $4.00/M

Projected cost per run: $0.51 × (1/3.75) ≈ $0.14/run
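Under the uniform 1/3.75 scaling assumption, the projected figure checks out:

```shell
awk 'BEGIN {
  current = 0.51      # reported $/run with Sonnet detection
  scale   = 1 / 3.75  # Haiku rates relative to Sonnet (assumed uniform)
  printf "projected: $%.2f/run\n", current * scale
}'
```

This is an upper bound on savings only if the detection step dominates cost, which the per-turn breakdown above supports (T2+T3 account for nearly all naive cost).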

2. Remove Task tool from both claude invocations

Estimated savings: Risk mitigation (prevents Sonnet sub-agent spawns)

The compiled lock file's --allowed-tools includes Task for both the main agent and threat detection steps. The Task tool allows Claude to spawn sub-agents — which may use the default model (Sonnet), bypassing the --model flag. The Secret Digger doesn't need sub-agent spawning; it's a linear investigation workflow.

Fix: In .github/workflows/shared/secret-audit.md, the current tools block is:

```yaml
tools:
  cache-memory: true
  bash: true
```

The Task tool is included by gh-aw as a default Claude tool, not via the tools: block, so it cannot currently be excluded there. Until gh-aw exposes an option to disable it (e.g. a tools.task: false override, if one is added), monitor whether Task is ever invoked in runs — it has not appeared in the tool-usage logs for the 5 analyzed runs, suggesting Claude isn't using it.

3. Condense secret-audit.md prompt

Estimated savings: ~$0.002/run (<1%) — low priority

The 10-item investigation areas list (5,366 chars) is verbose. The Emergency Exit Rule section (turn budget guidance) is particularly long relative to its value. Trimming to ~2,500 chars saves ~720 tokens.

At Haiku cache write rate ($1.00/M): 720 tokens × $1.00/M × 3 (multiplier) = ~$0.002/run. Negligible.
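The same back-of-envelope, for reference — token savings priced at the assumed Haiku cache-write rate with the observed ~3× reported-vs-naive cost multiplier:

```shell
awk 'BEGIN {
  tokens     = 720        # tokens saved by trimming the prompt
  rate       = 1.00 / 1e6 # $/token, Haiku cache write (assumed)
  multiplier = 3          # observed reported-vs-naive cost ratio
  printf "savings: $%.4f/run\n", tokens * rate * multiplier
}'
```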

Only worthwhile if the prompt is causing model confusion (e.g., if the agent is ignoring the "ONE deep area" focus instruction and spreading across all 10 areas — which poor_agentic_control flags have suggested in other Claude workflows).

4. Remove version-reporting.md import

Estimated savings: ~$0.0004/run — trivial

488 chars / ~120 tokens. The version information (cli_version from the lock file) doesn't add investigation value for a security scan. Remove this import to shrink the system prompt slightly.


Expected Impact

| Metric | Current | Projected | Savings |
| --- | --- | --- | --- |
| Cost/run | $0.51 | ~$0.14 | −73% |
| Cost per 5-run session | $2.55 | ~$0.69 | −$1.86 |
| Monthly cost (est. 20 runs) | ~$10.20 | ~$2.76 | −$7.44 |
| Cache write cost/run | ~$0.145 | ~$0.039 | −73% |
| Model quality | Sonnet detection | Haiku detection | ✅ Adequate for bash-based investigation |

Implementation Checklist

  • Primary fix: Set repo variable GH_AW_MODEL_DETECTION_CLAUDE = claude-haiku-4-5-20251001 in repository settings (no code change needed)
  • Verify subsequent run uses Haiku for all 3 turns in token report
  • Optional: Trim secret-audit.md to remove the verbose 10-item area list and collapse into 5 focus areas
  • Optional: Remove version-reporting.md import from secret-digger-claude.md
  • If prompt changes are made: recompile with gh aw compile .github/workflows/secret-digger-claude.md and post-process with npx tsx scripts/ci/postprocess-smoke-workflows.ts
  • Compare token usage on next run vs baseline ($0.51/run)

Technical Notes

The GH_AW_MODEL_AGENT_CLAUDE: "claude-haiku-4-5-20251001" env var in the frontmatter already correctly overrides the main agent model. The pattern ${GH_AW_MODEL_AGENT_CLAUDE:+ --model "$GH_AW_MODEL_AGENT_CLAUDE"} in the compiled lock file appends --model claude-haiku-4-5-20251001 to the main claude invocation. The threat detection invocation uses the analogous ${GH_AW_MODEL_DETECTION_CLAUDE:+ --model "$GH_AW_MODEL_DETECTION_CLAUDE"} pattern — but reads its value from ${{ vars.GH_AW_MODEL_DETECTION_CLAUDE || '' }} (a GitHub Actions variable, not the frontmatter env), which evaluates to the empty string when the variable is unset.

Generated by Daily Claude Token Optimization Advisor
