Overview
Implement the HarnessRunner that orchestrates provider execution with retry logic, schema output handling (4-layer recovery), temp file lifecycle, cost tracking, and JSONL event logging.
Branch: feat/harness-v2
Design doc: docs/design/harness-v2-design.md (Sections 3.2, 5.4, 9)
Scope
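Python (sdk/python/agentfield/harness/)
- _runner.py: HarnessRunner class; async run(prompt, schema, options) orchestrates full execution
TypeScript (sdk/typescript/src/harness/)
- runner.ts: HarnessRunner class (mirrors Python)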
Key Design Decisions
- Transient errors: rate limits, timeouts, 5xx, connection errors → retry
- Non-transient errors: auth, billing, invalid request → fail immediately
- Schema follow-up uses same session (cheap, agent has context)
- No separate schema budget: uses the overall max_budget_usd
- Always clean up temp files, even on error paths
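The classification rule and the cleanup guarantee above could be sketched roughly as follows. This is only an illustration, not the design doc's implementation: `ProviderError`, its `status` and `code` fields, the code strings, `classify`, and `temp_file` are all hypothetical names chosen for the example, and treating unknown errors as non-transient is an assumption.

```python
import asyncio
import contextlib
import os
import tempfile
from enum import Enum
from typing import Iterator, Optional


class ErrorKind(Enum):
    TRANSIENT = "transient"          # eligible for retry
    NON_TRANSIENT = "non_transient"  # fail immediately


class ProviderError(Exception):
    """Hypothetical provider error; real providers may expose other fields."""

    def __init__(self, message: str, status: Optional[int] = None,
                 code: Optional[str] = None) -> None:
        super().__init__(message)
        self.status = status
        self.code = code


# Assumed code strings, mirroring the categories in the list above.
NON_TRANSIENT_CODES = {"auth", "billing", "invalid_request"}
TRANSIENT_CODES = {"rate_limit", "timeout"}


def classify(exc: BaseException) -> ErrorKind:
    """Map an exception to a retry decision."""
    # Timeouts and connection errors are retried.
    if isinstance(exc, (TimeoutError, asyncio.TimeoutError, ConnectionError)):
        return ErrorKind.TRANSIENT
    if isinstance(exc, ProviderError):
        # Explicit non-transient codes win over any status code.
        if exc.code in NON_TRANSIENT_CODES:
            return ErrorKind.NON_TRANSIENT
        if exc.code in TRANSIENT_CODES or exc.status == 429:
            return ErrorKind.TRANSIENT
        if exc.status is not None and 500 <= exc.status < 600:
            return ErrorKind.TRANSIENT
    # Assumption: anything unrecognized is not retried.
    return ErrorKind.NON_TRANSIENT


@contextlib.contextmanager
def temp_file(suffix: str = ".json") -> Iterator[str]:
    """Yield a temp file path and remove the file on every exit path."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)
    try:
        yield path
    finally:
        # Runs on success and when the body raises.
        with contextlib.suppress(FileNotFoundError):
            os.remove(path)
```

Wrapping the provider call in `with temp_file() as path:` keeps the cleanup in one place, so the retry and schema-recovery paths do not each need their own removal logic.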
Acceptance Criteria
- Retry correctly backs off with jitter
- All 4 schema recovery layers work in sequence
- Follow-up prompt capped at 2 attempts
- Full retry capped at 1 attempt
- Transient vs non-transient error classification correct
- Temp files always cleaned up
- All tests pass
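For the backoff criterion, one common approach is exponential backoff with full jitter. The sketch below assumes that strategy; the design doc may specify a different one. `backoff_delay`, `run_with_retry`, and the `base`, `cap`, and `max_retries` defaults are hypothetical and not taken from the design doc. The injectable `rng` and `sleep` parameters are there so tests can run deterministically and without real delays.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt)]."""
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


async def run_with_retry(
    call: Callable[[], Awaitable[T]],
    is_transient: Callable[[BaseException], bool],
    max_retries: int = 3,  # assumed default, not from the design doc
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Retry `call` on transient errors; re-raise everything else at once."""
    attempt = 0
    while True:
        try:
            return await call()
        except Exception as exc:
            # Non-transient errors and exhausted retries propagate unchanged.
            if not is_transient(exc) or attempt >= max_retries:
                raise
            await sleep(backoff_delay(attempt))
            attempt += 1
```

A test for this criterion can pass `rng=lambda: 1.0` to check the delay ceiling for each attempt, and a no-op `sleep` to count retries quickly.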
Dependencies