feat: port MCP resilience features, new tools, professional README #4
lleontor705 merged 1 commit into master
…fessional README

Port best features from cli-orchestrator-mcp:
- Global time budget shared across retries + fallbacks (prevents timeout multiplication)
- Separate timeout/failure thresholds in circuit breaker (5 timeouts vs 3 failures)
- Process timeout skips retries and moves directly to fallback
- CLI detection cache with 5-min TTL
- Dynamic --max-turns for Claude based on remaining seconds
- Windows .cmd/.bat shim handling and PATH augmentation (npm, scoop, cargo, pnpm)
- Abort-aware sleep for retry backoff cancellation
- Improved safe-env: more Windows vars, no API keys (CLIs auth inline)

New tools:
- cli_list: list installed CLI providers with paths and strengths
- cli_route: role-based routing for 6 agent roles (manager, coordinator, developer, researcher, reviewer, architect)

Documentation:
- Professional README with SVG diagrams (architecture, circuit breaker, resilience pipeline, routing, providers)
- CLAUDE.md for Claude Code context
- Replaced cockatiel dependency with explicit zod

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Ports resilience and observability improvements into opencode-cli-enforcer, adds two new OpenCode tools for provider introspection/routing, and refreshes documentation to match the expanded feature set.
Changes:
- Implement global time budget across retries + fallbacks, plus abort-aware backoff and structured executor results.
- Extend circuit breaker to track timeouts separately from failures, and expose richer status metrics.
- Add new tools (`cli_list`, `cli_route`), role-based routing definitions, detection caching, and a significantly expanded README + diagrams.
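
To make the first bullet concrete, here is a minimal sketch of a time budget shared across retries and fallbacks. It is not taken from `src/resilience.ts`: the names `TimeBudget`, `Attempt`, and `runWithBudget` are hypothetical, and the injectable clock exists only to keep the sketch easy to test.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not the repo's code.
class TimeBudget {
  private readonly deadline: number
  private readonly now: () => number

  constructor(totalMs: number, now: () => number = Date.now) {
    this.now = now
    this.deadline = now() + totalMs
  }

  /** Milliseconds left before the shared deadline (never negative). */
  remainingMs(): number {
    return Math.max(0, this.deadline - this.now())
  }

  expired(): boolean {
    return this.remainingMs() === 0
  }
}

// One provider call; it receives only the time that is still available.
type Attempt = (timeoutMs: number) => Promise<{ ok: boolean; timedOut: boolean }>

/**
 * Try each provider in order. Because every attempt is capped by the same
 * deadline, total wall-clock time is bounded by the budget instead of
 * growing with providers x retries x per-call timeout.
 */
async function runWithBudget(
  providers: Attempt[],
  maxRetries: number,
  budget: TimeBudget,
): Promise<boolean> {
  for (const attempt of providers) {
    for (let i = 0; i <= maxRetries; i++) {
      if (budget.expired()) return false
      const result = await attempt(budget.remainingMs())
      if (result.ok) return true
      if (result.timedOut) break // a timeout skips retries and moves to the fallback
    }
  }
  return false
}
```

The design point is that the deadline is computed once, so a slow first provider shrinks what every later retry or fallback is allowed to spend.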
Reviewed changes
Copilot reviewed 14 out of 21 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/safe-env.test.ts | Updates SAFE_ENV expectations (no API keys; adds Windows + lowercase proxy vars). |
| tests/cli-defs.test.ts | Adds coverage for dynamic timeout args + role routing; updates Claude arg expectations. |
| tests/circuit-breaker.test.ts | Adds tests for timeout tracking and total counters. |
| src/safe-env.ts | Removes API keys from allowlist; adds Windows/system + lowercase proxy vars. |
| src/retry.ts | Makes backoff sleep abort-aware. |
| src/resilience.ts | Adds global time budget + timeout skipping retries; integrates timeout recording. |
| src/index.ts | Adds cli_list + cli_route; enriches cli_status with timeout/total metrics. |
| src/executor.ts | Returns structured results (no throw); Windows shim handling + PATH augmentation; adds timeout hint args. |
| src/detection.ts | Adds 5-minute TTL cache for detection results and a cache accessor. |
| src/cli-defs.ts | Updates provider strengths; adds buildTimeoutArgs; introduces role routing table. |
| src/circuit-breaker.ts | Adds timeout threshold + timeout counters + total counters; adds recordTimeout(). |
| README.md | Major documentation expansion with diagrams, examples, and security/config references. |
| package.json | Removes cockatiel; adds zod dependency. |
| docs/assets/architecture.svg | Adds architecture diagram asset. |
| docs/assets/circuit-breaker.svg | Adds circuit breaker diagram asset. |
| docs/assets/resilience-pipeline.svg | Adds resilience pipeline diagram asset. |
| docs/assets/routing.svg | Adds role routing diagram asset. |
| docs/assets/providers.svg | Adds providers/strengths diagram asset. |
| docs/assets/logo.svg | Adds project logo asset. |
| CLAUDE.md | Adds Claude Code-specific repo guidance and commands. |
| bun.lock | Lockfile updates for dependency changes. |
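
The `src/circuit-breaker.ts` row describes separate timeout and failure tracking. The sketch below shows one way that could look, using the thresholds from the commit message (3 failures, 5 timeouts). The class name, the reset-on-success behavior, and the shape of `status()` are assumptions, and the half-open/cooldown logic is omitted for brevity.

```typescript
// Hypothetical sketch; the real class in src/circuit-breaker.ts may differ.
class SketchCircuitBreaker {
  private consecutiveFailures = 0
  private consecutiveTimeouts = 0
  private totalFailures = 0
  private totalTimeouts = 0
  private open = false
  private readonly failureThreshold: number
  private readonly timeoutThreshold: number

  constructor(failureThreshold = 3, timeoutThreshold = 5) {
    this.failureThreshold = failureThreshold
    this.timeoutThreshold = timeoutThreshold
  }

  /** Assumption: a success resets both consecutive counters. */
  recordSuccess(): void {
    this.consecutiveFailures = 0
    this.consecutiveTimeouts = 0
  }

  /** Hard failures trip the breaker at the lower threshold. */
  recordFailure(): void {
    this.totalFailures++
    if (++this.consecutiveFailures >= this.failureThreshold) this.open = true
  }

  /** Timeouts are counted separately and tolerated for longer. */
  recordTimeout(): void {
    this.totalTimeouts++
    if (++this.consecutiveTimeouts >= this.timeoutThreshold) this.open = true
  }

  status() {
    return {
      state: this.open ? "open" : "closed",
      failures: this.consecutiveFailures,
      timeouts: this.consecutiveTimeouts,
      totalFailures: this.totalFailures,
      totalTimeouts: this.totalTimeouts,
    }
  }
}
```

Keeping the two counters apart means a provider that is merely slow is not taken out of rotation as quickly as one that is actually erroring.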
| "cockatiel": "^3.2.1", | ||
| "execa": "^9.6.1" | ||
| "execa": "^9.6.1", | ||
| "zod": "^3.23.0" |
cockatiel was removed from dependencies, but src/policies.ts still imports it. Because tsconfig.json includes src/**/*.ts, this will fail typecheck/build with “Cannot find module 'cockatiel'”. Either delete/stop compiling src/policies.ts, migrate it off cockatiel, or keep cockatiel as a dependency.
| "zod": "^3.23.0" | |
| "zod": "^3.23.0", | |
| "cockatiel": "^3.0.0" |
| const timer = setTimeout(resolve, ms) | ||
| signal?.addEventListener( | ||
| "abort", | ||
| () => { | ||
| clearTimeout(timer) | ||
| reject(new Error("aborted")) | ||
| }, | ||
| { once: true }, | ||
| ) |
sleep() adds an abort event listener but never removes it when the timer completes normally. With repeated retries this can accumulate listeners on the same AbortSignal and leak memory. Consider registering the handler as a named function and removing it when the timer fires (or run the cleanup in a finally block).
| const timer = setTimeout(resolve, ms) | |
| signal?.addEventListener( | |
| "abort", | |
| () => { | |
| clearTimeout(timer) | |
| reject(new Error("aborted")) | |
| }, | |
| { once: true }, | |
| ) | |
| let timer: ReturnType<typeof setTimeout> | undefined | |
| const cleanup = () => { | |
| if (timer !== undefined) { | |
| clearTimeout(timer) | |
| timer = undefined | |
| } | |
| if (signal && onAbort) { | |
| signal.removeEventListener("abort", onAbort) | |
| } | |
| } | |
| const onAbort = () => { | |
| cleanup() | |
| reject(new Error("aborted")) | |
| } | |
| timer = setTimeout(() => { | |
| cleanup() | |
| resolve() | |
| }, ms) | |
| if (signal) { | |
| signal.addEventListener("abort", onAbort) | |
| } |
| /** Merge caller signal with budget signal (compatible with all runtimes) */ | ||
| function mergeAbortSignals(a?: AbortSignal, b?: AbortSignal): AbortSignal | undefined { | ||
| if (!a) return b | ||
| if (!b) return a | ||
| const controller = new AbortController() | ||
| const onAbort = () => controller.abort() | ||
| a.addEventListener("abort", onAbort, { once: true }) | ||
| b.addEventListener("abort", onAbort, { once: true }) | ||
| return controller.signal | ||
| } |
mergeAbortSignals() attaches abort listeners to the caller’s signal and the budget signal, but those listeners are never removed if the call completes without aborting. If the same caller AbortSignal is reused across many calls, this can leak listeners over time. Consider using AbortSignal.any() when available, or returning { signal, dispose } so the listeners can be removed in finally.
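
A minimal sketch of the `{ signal, dispose }` variant mentioned above, assuming the caller wraps the CLI call in `try`/`finally`. The function and interface names are hypothetical, and the handling of already-aborted inputs is an added assumption.

```typescript
// Hypothetical sketch of a disposable signal merge.
interface MergedSignal {
  signal: AbortSignal | undefined
  dispose: () => void
}

function mergeAbortSignalsDisposable(a?: AbortSignal, b?: AbortSignal): MergedSignal {
  // With zero or one input there is nothing to merge and nothing to clean up.
  if (!a || !b) return { signal: a ?? b, dispose: () => {} }

  const controller = new AbortController()

  // Assumption: if either input is already aborted, the merged signal is too.
  if (a.aborted || b.aborted) {
    controller.abort()
    return { signal: controller.signal, dispose: () => {} }
  }

  const onAbort = () => controller.abort()
  a.addEventListener("abort", onAbort, { once: true })
  b.addEventListener("abort", onAbort, { once: true })

  return {
    signal: controller.signal,
    // Call from `finally` so listeners do not pile up on a long-lived signal.
    dispose: () => {
      a.removeEventListener("abort", onAbort)
      b.removeEventListener("abort", onAbort)
    },
  }
}
```

Calling `dispose()` after the attempt settles detaches both listeners, which matters most when the caller reuses one `AbortSignal` across many invocations.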
| max_attempts: ctx.retryConfig.maxRetries + 1, | ||
| const result = await executeCliOnce(currentDef, prompt, mode, remainingSeconds, mergedSignal) | ||
|
|
||
| if (result.exitCode === 0 && result.stdout) { |
Success is currently gated on result.exitCode === 0 && result.stdout. If a CLI exits 0 but produces an empty stdout (or writes its response to stderr), this will be treated as a failure and may trigger retries/fallback incorrectly. Consider treating exitCode === 0 as success regardless of stdout content (and return empty stdout if needed).
| if (result.exitCode === 0 && result.stdout) { | |
| if (result.exitCode === 0) { |
| it("does NOT include API key vars (CLIs handle their own auth)", () => { | ||
| expect(SAFE_ENV_VARS).not.toContain("ANTHROPIC_API_KEY") | ||
| expect(SAFE_ENV_VARS).not.toContain("GOOGLE_API_KEY") | ||
| expect(SAFE_ENV_VARS).not.toContain("OPENAI_API_KEY") |
The test says “does NOT include API key vars”, but it only checks three keys. Since SAFE_ENV_VARS also removed GEMINI_API_KEY and CODEX_API_KEY, it’d be good to assert those are excluded too to prevent regressions.
| expect(SAFE_ENV_VARS).not.toContain("OPENAI_API_KEY") | |
| expect(SAFE_ENV_VARS).not.toContain("OPENAI_API_KEY") | |
| expect(SAFE_ENV_VARS).not.toContain("GEMINI_API_KEY") | |
| expect(SAFE_ENV_VARS).not.toContain("CODEX_API_KEY") |
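
Beyond listing individual keys, a pattern-based guard could catch credential-like names added later. The sketch below is one possible approach, not part of the PR: `findSecretLikeVars`, the sample allowlist, and the suffix pattern are all assumptions, and the pattern may need tuning to avoid false positives.

```typescript
// Hypothetical stand-in for the allowlist exported from src/safe-env.ts.
const SAFE_ENV_VARS_SAMPLE: readonly string[] = ["PATH", "HOME", "HTTPS_PROXY", "https_proxy"]

/** Return every allowlisted name whose suffix looks like a credential. */
function findSecretLikeVars(names: readonly string[]): string[] {
  // Assumed naming convention: secrets end in API_KEY, TOKEN, or SECRET.
  const secretPattern = /(API_KEY|TOKEN|SECRET)$/i
  return names.filter((name) => secretPattern.test(name))
}

// Flags credential-like names, leaves ordinary variables alone.
const flagged = findSecretLikeVars(["PATH", "GEMINI_API_KEY", "CODEX_API_KEY"])
console.log(flagged)
```

In the test file this could back a single assertion that the result for the real `SAFE_ENV_VARS` is an empty array, alongside the explicit per-key checks.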
Summary
- … `--max-turns` for Claude, Windows `.cmd`/`.bat` handling + PATH augmentation, abort-aware sleep, improved safe-env (no API keys; CLIs auth inline)
- … `cli_list` (list installed providers) and `cli_route` (role-based routing for 6 agent roles)

Changes
| File | Changes |
|---|---|
| `resilience.ts` | … |
| `circuit-breaker.ts` | … `recordTimeout()`, total counters |
| `executor.ts` | … `cmd /c` wrapping, PATH augmentation |
| `cli-defs.ts` | `buildTimeoutArgs()`, role routing table (6 roles), updated strengths |
| `detection.ts` | … |
| `retry.ts` | … |
| `safe-env.ts` | … |
| `index.ts` | … `cli_list` and `cli_route` tools, updated `cli_status` with timeout metrics |
| `package.json` | Replaced `cockatiel` with `zod` |
| `docs/assets/` | … |

Test plan
- `bun test`: 85 tests pass, 0 failures

🤖 Generated with Claude Code