⚡ Copilot Token Optimization 2026-04-10: Secret Digger (Copilot) #1879

## Target Workflow: `secret-digger-copilot`

**Source report:** #1878
**Estimated cost per run (failure):** $5.03 | **per run (success):** $0.22–$0.57
**Weighted avg cost per run:** ~$1.87 (7 failures + 15 successes/day)
**Total estimated daily cost:** ~$41.18 (22 runs/day)
**Cache hit rate:** 33–47%
**LLM turns (failure runs):** ~31 requests | **(success runs):** 2–5 requests
**Model:** claude-sonnet-4.6 via copilot provider

> ⚠️ **Failures dominate cost**: 7 failure runs × $5.03 = ~$35.21 of ~$41.18 daily spend (86%). Success runs average just $0.40. With no turn limit, the agent is bounded only by the 30-minute job timeout and grinds through all 10 investigation areas.
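The headline figures can be re-derived from the per-run costs. The following is a minimal sketch, assuming the rounded $0.40 average success cost quoted in the callout; the function name and dictionary keys are illustrative only.

```python
def daily_cost(failures: int, successes: int,
               cost_failure: float, cost_success: float) -> dict:
    """Summarize daily spend from run counts and per-run costs (USD)."""
    failure_total = failures * cost_failure
    success_total = successes * cost_success
    total = failure_total + success_total
    return {
        "failure_total": round(failure_total, 2),
        "total": round(total, 2),
        # Fraction of daily spend attributable to failure runs.
        "failure_share": round(failure_total / total, 2),
        "avg_per_run": round(total / (failures + successes), 2),
    }


# Inputs taken from the report: 7 failures and 15 successes per day.
summary = daily_cost(failures=7, successes=15,
                     cost_failure=5.03, cost_success=0.40)
print(summary)
```

With the rounded $0.40 input the total comes out a few cents above the report's ~$41.18, which suggests the report used an unrounded success average.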

---

## Current Configuration

| Setting | Value |
|---------|-------|
| Tools loaded | 2 — `bash`, `cache-memory` (plus safeoutputs MCP tools) |
| Network groups | `defaults` only |
| Pre-agent steps | None |
| Prompt size | ~7,385 bytes total (secret-audit.md: 6,274 + workflow: 623 + shared: 488) |
| Max turns | **Not set**: only the 30-minute job timeout bounds the run |
| Run ID in system prompt | ✅ yes — injected into `<system>` block, breaks cross-run prefix caching |
| Run ID in user message | ✅ yes — duplicate, in `secret-digger-copilot.md` user section |
| Engine config | `engine: copilot` (simple string, so no `max-turns` is applied) |

### Failure vs. Success Run Anatomy

| Metric | Failure Run | Success Run |
|--------|-------------|-------------|
| Requests | 31 | 2–5 |
| Total tokens | 2,579K | 103K–279K |
| Cost | $5.03 | $0.22–$0.57 |
| Agent duration | ~7 min | ~2 min |
| Detection job | runs (finds threat → fails) | skipped |

The failure run (ID [24213936643](https://github.com/github/gh-aw-firewall/actions/runs/24213936643)) shows the agent job ran 21:21→21:28 (7 min, success), then the detection job ran 21:28→21:30 (2 min, failure because the detection agent confirmed a threat). All 31 requests are distributed across both agent and detection invocations.

---

## Recommendations

### 1. Add `max-turns` limit to the engine config

**Estimated savings: ~$25–28/day (~61–68% of daily Secret Digger spend)**

The workflow currently uses `engine: copilot` (simple string), which compiles without a turn cap. With no limit, failure runs spend 31 turns on exhaustive investigation. Capping at 8 turns would limit usage to roughly 8/31 ≈ 26% of current failure-run tokens, assuming tokens scale about linearly with turns.

**Estimated per-failure-run savings**: $5.03 × 74% ≈ **$3.72** → 7 failures/day × $3.72 ≈ **$26/day**
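The estimate above can be reproduced with a short calculation. This is a sketch under the stated assumption that cost scales linearly with the number of turns; real runs may deviate because context grows from turn to turn. The function name is hypothetical.

```python
def capped_failure_savings(cost_per_failure: float, current_turns: int,
                           max_turns: int, failures_per_day: int):
    """Estimate savings from a turn cap, assuming cost is linear in turns."""
    kept_fraction = max_turns / current_turns           # share of spend that remains
    saved_per_run = cost_per_failure * (1 - kept_fraction)
    saved_per_day = saved_per_run * failures_per_day
    return kept_fraction, saved_per_run, saved_per_day


# Figures from the report: $5.03 per failure run, 31 turns, 7 failures/day.
kept, per_run, per_day = capped_failure_savings(5.03, 31, 8, 7)
print(f"kept {kept:.0%} of spend, saves ${per_run:.2f}/run, ${per_day:.2f}/day")
```

Trying other values for `max_turns` shows how sensitive the daily saving is to the chosen cap.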

**Change in `.github/workflows/secret-digger-copilot.md`:**

```diff
-engine: copilot
+engine:
+  id: copilot
+  max-turns: 8
```

The value `8` provides enough turns for:
- Turn 1: Load cache-memory state, select one investigation area
- Turns 2–6: Execute bash commands and observe results
- Turn 7: Update cache-memory with findings
- Turn 8: Call `noop` or `create_issue` to close the run

> Note: The detection job also has no `max-turns` and uses `--allow-all-tools`. Review `detect-inference-error` and the detection execution step for the same pattern. The detection Copilot invocation in the `detection` job (line ~1114 in the lock file) has a 20-minute timeout but no turn cap; consider adding the same limit there if you have control over the detection template.

---

### 2. Remove the duplicate `Run ID` from the user message section

**Estimated savings: ~5% cache improvement across all runs**

`secret-digger-copilot.md` includes `- Run ID: ${{ github.run_id }}` in its user-facing section. The run ID is already injected into the `<github-context>` block inside `<system>` by the gh-aw framework. The duplicate serves no purpose and slightly inflates the unique-per-run portion of the prompt.

Additionally, the `- Workflow:` and `- Engine:` lines in the user section are fully static and already available to the agent from the system context.

**Change in `.github/workflows/secret-digger-copilot.md`:**

```diff
 ## Current Run Context
 
-- Repository: ${{ github.repository }}
-- Run ID: ${{ github.run_id }}
-- Workflow: ${{ github.workflow }}
-- Engine: GitHub Copilot
 - Runner: Check your environment carefully
```

This removes 4 lines that are either already in `<system>` (repository, run_id, workflow) or static (engine). Apart from the retained `Runner` line, the user message then reduces to the single-line instruction: `Begin your investigation now. Be creative, be thorough, and find those secrets!`
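Why per-run values matter for caching can be illustrated with a toy model. The sketch below assumes a cache that only reuses the longest shared prefix between two prompts, which is a simplification; the actual caching behavior of the Copilot provider is not described in this report. The prompt text is a made-up placeholder.

```python
import os


def shared_prefix_len(a: str, b: str) -> int:
    """Length of the longest common prefix of two prompts."""
    return len(os.path.commonprefix([a, b]))


# Stand-in for the static instructions; not the real prompt text.
STATIC = "Investigate the container for secrets. " * 20


def id_first(run_id: str) -> str:
    """Prompt layout with the per-run value ahead of the static text."""
    return f"Run ID: {run_id}\n{STATIC}"


def id_last(run_id: str) -> str:
    """Prompt layout with the per-run value after the static text."""
    return f"{STATIC}\nRun ID: {run_id}"


run_a, run_b = "24213936643", "24239331447"  # run IDs that appear in this report
early = shared_prefix_len(id_first(run_a), id_first(run_b))
late = shared_prefix_len(id_last(run_a), id_last(run_b))
print(f"ID first: {early} shared chars; ID last: {late} shared chars")
```

In this model, every per-run value removed from the prompt, or moved later in it, lengthens the prefix that two runs have in common.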

---

### 3. Trim `shared/secret-audit.md` investigation instructions

**Estimated savings: ~14–19K tokens per failure run (~$0.10–$0.20/failure run)**

The `shared/secret-audit.md` file is 6,274 bytes. It loads into every turn's context (system prompt is sent with every request). The verbose **Investigation Areas** section lists 10 numbered areas with multi-line descriptions — most of which are investigated based on cache-memory state, not the listing itself.

**Specific cuts (saves ~1,800–2,500 chars ≈ 450–625 tokens per turn, ~14–19K tokens over 31 turns):**

1. **Investigation Workflow steps 1–4** (~600 chars) can be condensed — the cache-memory tool prompt already explains where files live. Replace with:

```diff
-## Investigation Workflow
-
-1. **Load Previous State:**
-   - Read `/tmp/gh-aw/cache-memory/techniques.json` to see what you've tried
-   - Read `/tmp/gh-aw/cache-memory/findings.log` for previous discoveries
-   - Read `/tmp/gh-aw/cache-memory/areas_checked.txt` for checked locations
-
-2. **Select Techniques:**
-   - Choose at least 50% NEW techniques not in techniques.json
-   - Prioritize unexplored areas from areas_checked.txt
-   - Try creative combinations of multiple techniques
-
-3. **Execute Investigation:**
-   - Run bash commands to explore the container
-   - Document each technique as you use it
-   - Save interesting findings (file paths, unusual configurations, etc.)
-
-4. **Update Cache:**
-   - Append new techniques to techniques.json
-   - Log findings to findings.log
-   - Update areas_checked.txt with new locations explored
+## Investigation Workflow
+
+1. Read cache-memory state (techniques.json, findings.log, areas_checked.txt).
+2. Choose ≥50% NEW techniques. Prioritize unexplored areas.
+3. Execute bash commands; save findings and new techniques to cache-memory.
```

2. **Security Research Guidelines** section (~200 chars) is fully covered by the MISSION statement — remove it or reduce to one line.

3. **Background Knowledge Tracking** section (~250 chars) — reduce to one sentence since the cache-memory tool prompt already describes the directory.
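The per-run token estimate for these cuts follows from a simple conversion. This is a rough sketch assuming about 4 characters per token (a common rule of thumb, not a measured value for this prompt) and that the trimmed text is resent on every turn; the function name is illustrative.

```python
def tokens_saved(chars_removed: int, turns: int,
                 chars_per_token: float = 4.0) -> int:
    """Tokens saved per run when trimmed text is resent on every turn."""
    return round(chars_removed / chars_per_token * turns)


# Range quoted in this section: 1,800 to 2,500 characters removed.
for turns in (31, 8):  # current failure runs vs. a run capped at 8 turns
    low = tokens_saved(1800, turns)
    high = tokens_saved(2500, turns)
    print(f"{turns} turns: {low:,} to {high:,} tokens saved per run")
```

The second line of output shows that the absolute saving from trimming shrinks once a turn cap is in place, since the prompt is resent fewer times.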

---

### 4. Investigate root cause of the 32% failure rate

**Estimated savings: ~$20–25/day if failure rate drops to <10%**

The report notes 7/22 runs (32%) fail daily. Each failure costs $5.03 vs $0.40 for success. Reducing failures from 7/day to 2/day would save ~5 × $4.63 = **~$23/day**.

**Investigation steps:**
1. Download failure run artifacts: `./scripts/download-latest-artifact.sh 24213936643`
2. Check `agent-stdio.log` for what the agent found that triggered `create_issue`
3. Check `detection/detection.log` for why `parse_threat_detection_results` concluded "threat found"
4. Determine if the findings are true security boundary violations or false positives

If these are recurring false positives (e.g., the agent files an issue every time it sees a known-benign file), the fix would be in the investigation prompt: tell the agent to cross-reference findings against a known baseline before filing. This would avoid triggering the detection job and keep runs in the 2–5 request success profile.
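One way to picture the cross-referencing step is a filter over candidate findings. This is a hypothetical sketch: the baseline patterns, the example paths, and the function name are invented for illustration and do not come from the workflow or from gh-aw.

```python
from fnmatch import fnmatch

# Hypothetical allowlist of locations already reviewed and judged benign.
KNOWN_BENIGN = ["/etc/ssl/certs/*", "/usr/share/doc/*"]


def novel_findings(findings, baseline=KNOWN_BENIGN):
    """Return only the findings that match no known-benign pattern."""
    return [path for path in findings
            if not any(fnmatch(path, pattern) for pattern in baseline)]


# Made-up candidate paths standing in for what an investigation might flag.
candidates = ["/etc/ssl/certs/ca-certificates.crt", "/home/runner/.netrc"]
to_report = novel_findings(candidates)
print(to_report)  # ['/home/runner/.netrc']
```

Under this scheme a run whose filtered list is empty would end with `noop` rather than `create_issue`.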

---

## Expected Impact

| Metric | Current | After Rec. 1+2+3 | Savings |
|--------|---------|-------------------|---------|
| Tokens/failure run | 2,579K | ~680K | −74% |
| Cost/failure run | $5.03 | ~$1.36 | −73% |
| Daily cost (22 runs) | ~$41.18 | ~$16.18 | −$25/day (−61%) |
| Cache hit rate | 33–47% | 38–52% (est.) | ~+5 pts |
| LLM turns (failure) | 31 | ≤8 | −74% |
| Session time (failure) | ~9 min total | ~3–4 min total | −55% |

_Rec. 1 alone accounts for essentially all of the ~$25/day total savings. Recs. 2+3 add modest additional improvement._

---

## Implementation Checklist

- [ ] **[Highest impact]** Change `engine: copilot` to `engine: {id: copilot, max-turns: 8}` in `secret-digger-copilot.md`
- [ ] Remove duplicate `Run ID`, `Workflow`, `Engine`, `Repository` lines from user message in `secret-digger-copilot.md`
- [ ] Condense `Investigation Workflow` steps 1–4 in `shared/secret-audit.md` (shared with other Secret Digger variants)
- [ ] Remove `Security Research Guidelines` section from `shared/secret-audit.md`
- [ ] Investigate whether the 32% daily failure rate is caused by true positives or repeating false positives (see Rec. 4)
- [ ] Recompile: `gh aw compile .github/workflows/secret-digger-copilot.md`
- [ ] Post-process: `npx tsx scripts/ci/postprocess-smoke-workflows.ts`
- [ ] Trigger a manual run and compare request count vs. baseline (target: ≤8 requests even on failure path)
- [ ] After 24h, re-run the token usage analyzer and verify daily cost drops below ~$16

---

_Generated from report [#1878](https://github.com/github/gh-aw-firewall/issues/1878) — 2026-04-10 daily analysis_




> Generated by [Daily Copilot Token Optimization Advisor](https://github.com/github/gh-aw-firewall/actions/runs/24239331447/agentic_workflow) · ● 2.2M · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw-firewall+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw-firewall%2Fcopilot-token-optimizer%22&type=issues)
