Skip to content

implementer-agent: raise timeout to 20min + add npm/pip cache guidance to save ~2-3min per run #48

@verkyyi

Description

@verkyyi

Summary

The first real dispatch of the newly-installed implementer-agent on verkyyi/agentfolio hit the 15-minute cap with the implementation fully written and tests passing, but timed out 9 seconds before the draft PR could be created. Dogfood finding — two small changes to the catalog source would have let the run succeed and would save ~2-3min on every subsequent run.

Run: verkyyi/agentfolio run 24680708086 — job duration 15m44s, status failure (timeout).

Timeline from the failed run

Parsed from the Claude Code CLI event stream:

Window Duration What the agent was doing
17:29:27 – 17:30:36 69s 19 tool calls hunting for inputs.issue_number (dispatch inputs weren't propagated — separate user-side issue, fixed)
17:30:36 – 17:38:47 491s Silent — single thinking block over ChatPanel.tsx (329 LOC) + test file (~500 LOC), planning 5 edits
17:38:47 – 17:39:28 41s 5 Edit calls + kicked off npm install + tests
17:39:28 – 17:43:00 211s npm install --prefer-offline + first vitest run on cold node_modules
17:43:00 – 17:43:58 58s Analyzed test failure, imported act, re-ran, 108/108 green
17:43:58 – 17:44:08 10s git add + commit + branch check + ToolSearch for PR-create tool
17:44:08 – 17:44:17 9s TIMEOUT before PR create executed

Two leverage points

1. Bump timeout-minutes: 15 → 20

The agent completed the implementation, fixed the test bug, got all 108 tests green, and started staging the commit. It missed the PR creation call by ~10 seconds. A 5-minute budget bump is the cheapest possible fix for this class of failure and directly handles the 8-minute silent-thinking case (big model on large files) without requiring a prompt or model change.

Diff: catalog/agent-team/implementer-agent.md

-timeout-minutes: 15
+timeout-minutes: 20

2. Add actions/setup-node cache guidance to the implementer's default steps (or prompt)

npm install on a cold runner burned 3.5 minutes. actions/setup-node@v4 with cache: 'npm' brings this to near-zero when package-lock.json hasn't changed. The same pattern would help pip/cargo/go mod/maven projects.

Two ways to ship this, in order of preference:

(a) Bake cache setup into the checkout/runner config block. If gh-aw exposes a hook to run setup steps before the Claude Code CLI, add a language-detection step that runs the appropriate setup-* action with caching. The implementer-agent.md frontmatter already has network: allowed: [defaults, node, python, rust, dotnet, java] — leverage that signal to conditionally enable caches.

(b) Tell the agent about it in the prompt body. Cheaper. Add a section to the implementer prompt like:

## Runner performance

Before running tests, check for cached dependency directories:
- Node: if `~/.npm` / `node_modules` exist, prefer `npm ci --prefer-offline` over `npm install`.
- Python: if `~/.cache/pip` exists, prefer `pip install -r requirements.txt --no-index --find-links`.

If no cache exists, runner setup cost is unavoidable. Don't spend more than 4 minutes on test setup — if tests are still building after 4 min, fall back to running only the tests you added (`npx vitest run <path-to-new-test>`), commit, and let CI run the full suite.

The 4-minute fallback is the important bit: the agent today doesn't know there's a wall clock, so it happily waits 3.5 min for a slow install. Telling it to bail to a narrower test run preserves the discipline (TDD green before commit) without eating the whole budget.

Why not reduce the 8-min silent-thinking block

That's genuine model-time for claude-sonnet-4-6 reasoning over the files. Workarounds (smaller model, forced edit-decomposition) trade depth for speed and are probably not worth it for most tasks. The timeout bump covers this case without behavior change.

Suggested acceptance

  • catalog/agent-team/implementer-agent.md frontmatter: timeout-minutes: 20.
  • Prompt body gets the "Runner performance" note (or cache setup gets baked in if feasible).
  • README or catalog/agent-team/README.md documents the budget: spec ~4min, planner ~7min, implementer ~20min, reviewer ~10min — so users know what to expect.

Tangential

gh-aw#27407 covers the unrelated SHA-pinning bug that caused the first dispatch to fail for a different reason. That issue is independent of this one, but both surfaced in the same dogfood session.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingenhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions