diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..8651bfa
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,59 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project
+
+OpenCode CLI Enforcer — an OpenCode plugin that orchestrates Claude, Gemini, and Codex CLIs with resilience (circuit breakers, retry with backoff, automatic fallback). Published to npm as `opencode-cli-enforcer`.
+
+## Commands
+
+```bash
+bun install # Install dependencies
+bun test # Run all tests
+bun test --watch # Run tests in watch mode
+bun test tests/retry.test.ts # Run a single test file
+bun run typecheck # Type-check without emitting
+bun run build # Build (no bundle)
+```
+
+Runtime is **Bun** (>=1.3.5), not Node. Tests use Bun's built-in test runner (`bun:test`). There is no linter or formatter configured.
+
+## Architecture
+
+The plugin exports four tools (`cli_exec`, `cli_status`, `cli_list`, `cli_route`) via the OpenCode plugin interface.
+
+**Request flow through `cli_exec`:**
+
+```
+index.ts (plugin entry, tool definitions, hooks)
+ → resilience.ts (global time budget, retry + circuit breaker + fallback)
+ → circuit-breaker.ts (per-CLI isolation: 3 failures OR 5 timeouts → open)
+ → retry.ts (exponential backoff with jitter, abort-aware sleep)
+ → executor.ts (execa wrapper: structured results, Windows .cmd handling, PATH augmentation)
+ → cli-defs.ts (arg builders + dynamic --max-turns for Claude)
+ → detection.ts (CLI availability via which/where, 5-min cache)
+ → safe-env.ts (allowlisted env vars only, no API keys)
+ → redact.ts (strips API keys from output)
+ → error-classifier.ts (transient/rate_limit/permanent/crash)
+```
+
+**Key state in `index.ts`:** three `Map`s — `breakers` (circuit breaker per CLI), `cliAvailability` (detection results), `usageStats` (call counts/timing). CLI detection runs non-blocking at startup via `Promise.allSettled`.
+
+**Global time budget** (`resilience.ts`): a single timeout budget shared across all retries AND fallbacks, preventing timeout multiplication. Process timeouts skip retries and go straight to fallback.
+
+**Circuit breaker** has separate thresholds: opens after 3 failures OR 5 timeouts (slow ≠ broken), cooldown 60s. **Retry**: max 2 retries, 1s base delay, 10s max, 0.3 jitter factor.
+
+**Role-based routing** (`cli-defs.ts`): 6 agent roles (manager, coordinator, developer, researcher, reviewer, architect) map to optimal CLI providers via `cli_route`.
+
+## Cross-Platform
+
+- `platform.ts` exports `PLATFORM` and `IS_WINDOWS`
+- Binary detection uses `which` (Unix) / `where` (Windows) with 5-minute cache
+- Windows: `.cmd/.bat` shim handling via `cmd /c`, PATH augmentation (npm, scoop, cargo, pnpm)
+- Large prompts (>30KB) delivered via stdin to avoid OS arg-length limits
+- CI runs on ubuntu, windows, and macos
+
+## Release
+
+The release workflow (`.github/workflows/release.yml`) requires a production environment approval gate, publishes to npm with provenance attestation using Node 22, and creates a GitHub release with a git tag.
diff --git a/README.md b/README.md
index 1e47ef0..4238e6e 100644
--- a/README.md
+++ b/README.md
@@ -1,21 +1,92 @@
+ Resilient multi-LLM CLI orchestration for OpenCode
+ Execute Claude, Gemini & Codex with circuit breakers, smart retry, automatic fallback, and role-based routing.
-
+
+
+
+
+
+---
+
+## Why opencode-cli-enforcer?
+
+Running AI CLIs in production is fragile. Processes timeout, rate limits hit, binaries disappear. Calling three different CLIs means three different failure modes, arg formats, and platform quirks.
+
+**opencode-cli-enforcer** wraps all of that into a single, resilient plugin:
+
+
+
+
+
+**Without this plugin**
+- Manual subprocess management
+- No retry on transient failures
+- One CLI down = entire workflow blocked
+- OS-specific arg handling per CLI
+- Secrets leak into error logs
+- No visibility into CLI health
+
+
+
+
+**With this plugin**
+- 4 tools, zero boilerplate
+- Exponential backoff + jitter retry
+- Automatic fallback chain across providers
+- Cross-platform (Windows `.cmd` shims, PATH augmentation)
+- Secret redaction on all output
+- Real-time health dashboard
+
+
+
+
+
+---
+
+## Architecture
+
+
+
+```
+cli_exec(prompt)
+ |
+ v
++----------------------------------------------------------+
+| Resilience Engine |
+| |
+| Global Time Budget (shared across ALL attempts) |
+| +---------+ +---------+ +---------+ |
+| | Claude | --> | Gemini | --> | Codex | fallback |
+| +---------+ +---------+ +---------+ chain |
+| | | | |
+| v v v |
+| [Circuit Breaker] -----> [Retry w/ Backoff] --> [execa] |
+| 3 failures = open max 2 retries 10MB |
+| 5 timeouts = open 1s-10s + jitter buffer |
+| 60s cooldown abort-aware sleep |
++----------------------------------------------------------+
+```
+
---
-Execute Claude, Gemini, and Codex CLIs with automatic OS detection, circuit breaker pattern, retry with exponential backoff, and provider fallback. Cross-platform (Windows/macOS/Linux).
+## Quick Start
-## Install
+### 1. Install as OpenCode plugin (recommended)
-### OpenCode plugin (recommended)
+Add to your OpenCode configuration:
```json
{
@@ -23,77 +94,478 @@ Execute Claude, Gemini, and Codex CLIs with automatic OS detection, circuit brea
}
```
-### npm
+### 2. Install via npm / bun
```bash
bun add opencode-cli-enforcer
+# or
+npm install opencode-cli-enforcer
```
-## Tools
+### 3. Prerequisites
+
+You need at least **one** CLI installed and authenticated:
+
+| CLI | Install | Auth |
+|-----|---------|------|
+| [Claude Code](https://docs.anthropic.com/en/docs/claude-code) | `npm i -g @anthropic-ai/claude-code` | `claude login` |
+| [Gemini CLI](https://github.com/google-gemini/gemini-cli) | `npm i -g @anthropic-ai/gemini-cli` | `gcloud auth login` |
+| [Codex CLI](https://github.com/openai/codex) | `npm i -g @openai/codex` | `codex auth` |
+
+---
+
+## Tools Reference
+
+### `cli_exec` — Execute with full resilience
+
+The primary tool. Sends a prompt to a CLI with automatic retry, circuit breaker protection, and fallback.
-### `cli_exec` — Execute a CLI with full resilience
+```typescript
+cli_exec({
+ cli: "claude",
+ prompt: "Explain the observer pattern with a TypeScript example",
+ mode: "generate", // "generate" | "analyze"
+ timeout_seconds: 300, // Global budget: 10-1800s
+ allow_fallback: true // Try gemini/codex on failure
+})
+```
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
-| `cli` | `"claude" \| "gemini" \| "codex"` | required | Primary CLI |
-| `prompt` | `string` | required | Prompt to send |
-| `mode` | `"generate" \| "analyze"` | `"generate"` | `analyze` enables file reads (Claude) |
-| `timeout_seconds` | `number` | `720` | Max seconds (10-1800) |
-| `allow_fallback` | `boolean` | `true` | Try alternatives on failure |
+| `cli` | `"claude" \| "gemini" \| "codex"` | *required* | Primary CLI provider |
+| `prompt` | `string` | *required* | Prompt to send (max 100KB) |
+| `mode` | `"generate" \| "analyze"` | `"generate"` | `analyze` enables file reads (Claude only) |
+| `timeout_seconds` | `number` | `720` | Global timeout budget in seconds |
+| `allow_fallback` | `boolean` | `true` | Auto-fallback to alternative providers |
+
+**Response:**
+
+```jsonc
+{
+ "success": true,
+ "cli": "claude",
+ "stdout": "The Observer pattern is a behavioral design pattern...",
+ "stderr": "",
+ "duration_ms": 4523,
+ "timed_out": false,
+ "used_fallback": false,
+ "fallback_chain": ["claude"],
+ "error": null,
+ "error_class": null, // "transient" | "rate_limit" | "permanent" | "crash"
+ "circuit_state": "closed", // "closed" | "open" | "half-open"
+ "attempt": 1,
+ "max_attempts": 3
+}
+```
+
+---
+
+### `cli_status` — Health dashboard
+
+Returns real-time health for all providers: installation status, circuit breaker state, and usage statistics.
-### `cli_status` — Health check dashboard
+```typescript
+cli_status({})
+```
+
+**Response:**
+
+```jsonc
+{
+ "platform": "windows",
+ "detection_complete": true,
+ "retry_config": { "max_retries": 2, "base_delay_ms": 1000, "max_delay_ms": 10000 },
+ "breaker_config": { "failure_threshold": 3, "timeout_threshold": 5, "cooldown_seconds": 60 },
+ "providers": [
+ {
+ "name": "claude",
+ "installed": true,
+ "path": "/usr/local/bin/claude",
+ "version": "1.0.16",
+ "circuit_breaker": {
+ "state": "closed",
+ "consecutive_failures": 0,
+ "consecutive_timeouts": 0,
+ "total_executions": 12,
+ "total_failures": 1,
+ "total_timeouts": 0
+ },
+ "usage": {
+ "total_calls": 12,
+ "success_rate": "92%",
+ "avg_duration_ms": 3400
+ },
+ "fallback_order": ["gemini", "codex"]
+ }
+ // ... gemini, codex
+ ]
+}
+```
+
+---
+
+### `cli_list` — List installed providers
+
+Quick check of which CLIs are available on the system.
+
+```typescript
+cli_list({})
+```
+
+**Response:**
+
+```jsonc
+{
+ "installed_count": 2,
+ "providers": [
+ { "provider": "claude", "path": "/usr/local/bin/claude", "version": "1.0.16", "strengths": ["reasoning", "code-analysis", "debugging", "architecture", "planning"] },
+ { "provider": "gemini", "path": "/usr/local/bin/gemini", "version": "0.1.8", "strengths": ["research", "trends", "knowledge", "large-context", "web-search"] }
+ ]
+}
+```
+
+---
+
+### `cli_route` — Role-based routing
-Returns platform info, detection status, circuit breaker states, and usage stats for all providers.
+Recommends the best CLI for a task based on agent role. Considers both provider strengths and real-time availability.
+
+```typescript
+cli_route({
+ role: "developer",
+ task_description: "Refactor the auth module to use JWT"
+})
+```
+
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| `role` | `"manager" \| "coordinator" \| "developer" \| "researcher" \| "reviewer" \| "architect"` | Agent role |
+| `task_description` | `string?` | Optional context |
+
+**Routing table:**
+
+
+
+### Global Time Budget
+
+Unlike per-attempt timeouts, the **global time budget** is shared across ALL retries and ALL fallback providers. This prevents timeout multiplication:
+
```
-Request --> Circuit Breaker --> Retry (3x, exp backoff) --> Execute (execa)
- | | |
- v v v
- If open: If exhausted: On failure:
- skip to try next CLI record + retry
- fallback in chain
+Traditional: 3 providers x 3 attempts x 300s timeout = 2700s worst case
+This plugin: 300s total budget across everything = 300s worst case
```
-**Circuit Breaker States:**
+Each attempt receives the **remaining** seconds, not the full budget. When the budget runs out, execution stops immediately.
-| State | Behavior |
-|-------|----------|
-| closed | Normal — requests pass through |
-| open | Blocked — 3+ failures, 60s cooldown |
-| half-open | Probe — 1 request to test recovery |
+### Circuit Breaker
-**Fallback Order:** `claude --> gemini --> codex`
+Per-CLI failure isolation with **separate thresholds** for failures and timeouts (because slow ≠ broken):
-## Supported CLIs
+
+
+
+
+| State | Behavior | Transition |
+|-------|----------|------------|
+| **Closed** | Normal operation, requests pass through | 3 failures OR 5 timeouts → Open |
+| **Open** | All requests blocked, provider is skipped | After 60s cooldown → Half-Open |
+| **Half-Open** | One probe request allowed | Success → Closed / Failure → Open |
+
+### Retry with Exponential Backoff
+
+```
+Attempt 0: immediate
+Attempt 1: ~1s + jitter (+-30%)
+Attempt 2: ~2s + jitter (+-30%)
+ capped at 10s max
+```
+
+- **Transient errors** (network, socket): standard retry
+- **Rate limits** (429, quota): retry with 3x longer delay
+- **Process timeouts**: skip retries entirely, move to next provider
+- **Permanent errors** (auth, 401/403): skip retries, move to fallback
+- **Crash** (SIGKILL, ENOENT): skip retries, move to fallback
+
+### Error Classification
+
+```
+Error arrives
+ |
+ +-- exitCode 137 / SIGKILL / ENOENT ---------> CRASH (no retry)
+ +-- 429 / "rate limit" / "quota" -------------> RATE_LIMIT (retry, 3x delay)
+ +-- 401 / 403 / "auth" / "not found" --------> PERMANENT (no retry)
+ +-- everything else --------------------------> TRANSIENT (retry)
+```
+
+### Fallback Chain
+
+When a provider fails, the next one in the chain takes over automatically:
+
+```
+Claude ---[fail]---> Gemini ---[fail]---> Codex
+Gemini ---[fail]---> Claude ---[fail]---> Codex
+Codex ---[fail]---> Claude ---[fail]---> Gemini
+```
+
+---
+
+## Cross-Platform Support
+
+
+
+
Feature
+
Windows
+
macOS / Linux
+
+
+
Binary detection
+
where
+
which
+
+
+
.cmd/.bat shims
+
Auto-wrapped with cmd /c
+
N/A
+
+
+
PATH augmentation
+
npm, scoop, cargo, pnpm dirs
+
Standard PATH
+
+
+
Large prompts (>30KB)
+
Delivered via stdin to avoid OS arg-length limits
+
+
+
Environment
+
Allowlisted vars only (no secrets leak to subprocesses)
+
+
+
+### Detection Caching
+
+CLI availability is cached for **5 minutes** to avoid repeated filesystem lookups. The cache covers:
+- Binary path resolution
+- Version detection
+- Both positive and negative results
+
+---
+
+## Security
+
+| Protection | Description |
+|-----------|-------------|
+| **Secret redaction** | API keys (`sk-*`, `key-*`, `AIza*`, `ant-api*`) and Bearer tokens stripped from all output |
+| **Environment filtering** | Only system essentials + proxy vars passed to subprocesses. No API keys — CLIs handle their own auth. |
+| **Input isolation** | Large prompts (>30KB) delivered via stdin, not shell args |
+| **No shell interpolation** | All CLI execution via `execa` (no `shell: true`) |
+
+---
+
+## Examples
+
+### Basic: Ask Claude to review code
+
+```typescript
+const result = await cli_exec({
+ cli: "claude",
+ prompt: "Review this function for bugs:\n\nfunction add(a, b) { return a - b }",
+ mode: "analyze",
+ timeout_seconds: 120
+})
+// result.stdout → "Bug found: the function is named `add` but performs subtraction..."
+```
+
+### Fallback: Primary CLI is down
+
+```typescript
+// Claude's circuit breaker is open (too many recent failures)
+const result = await cli_exec({
+ cli: "claude",
+ prompt: "Generate a REST API for user management",
+ allow_fallback: true
+})
+// result.cli → "gemini" (automatic fallback)
+// result.used_fallback → true
+```
+
+### Role routing: Pick the right tool for the job
+
+```typescript
+// For a developer task, route to Codex (best at code generation)
+const recommendation = await cli_route({
+ role: "developer",
+ task_description: "Implement pagination for the /users endpoint"
+})
+// recommendation.recommended_cli → "codex"
+
+// Then execute with the recommended CLI
+const result = await cli_exec({
+ cli: recommendation.recommended_cli,
+ prompt: "Implement pagination for the /users endpoint using cursor-based pagination"
+})
+```
-| CLI | Best For |
-|-----|----------|
-| Claude | Reasoning, code analysis, debugging, architecture |
-| Gemini | Research, broad knowledge, large context |
-| Codex | Code generation, edits, refactoring |
+### Monitor health across providers
-## Prerequisites
+```typescript
+const status = await cli_status({})
-- [Bun](https://bun.sh) runtime
-- At least one CLI: [Claude Code](https://docs.anthropic.com/en/docs/claude-code), [Gemini CLI](https://github.com/google-gemini/gemini-cli), or [Codex CLI](https://github.com/openai/codex)
+for (const provider of status.providers) {
+ console.log(`${provider.name}: ${provider.circuit_breaker.state} | ${provider.usage.success_rate}`)
+}
+// claude: closed | 95%
+// gemini: closed | 88%
+// codex: open | 60% <-- circuit breaker tripped
+```
+
+### Large prompt via stdin
+
+```typescript
+const largeCodebase = readFileSync("src/index.ts", "utf-8") // 45KB file
+const result = await cli_exec({
+ cli: "claude",
+ prompt: `Analyze this codebase for security vulnerabilities:\n\n${largeCodebase}`,
+ mode: "analyze",
+ timeout_seconds: 600
+})
+// Prompt >30KB → automatically delivered via stdin (no OS arg-length issues)
+```
+
+---
+
+## Hooks
+
+The plugin injects two hooks into the OpenCode lifecycle:
+
+### `experimental.chat.system.transform`
+
+Automatically injects CLI availability into the system prompt of every agent (except `orchestrator` and `task_decomposer`), so the LLM knows which tools are available and their current health.
+
+### `tool.execute.after`
+
+Tracks when agents invoke CLIs directly via `bash` instead of using `cli_exec`, incrementing usage counters for observability.
+
+---
+
+## Provider Strengths
+
+