feat: cockatiel resilience, security hardening, and error classification#3
feat: cockatiel resilience, security hardening, and error classification#3lleontor705 merged 2 commits intomasterfrom
Conversation
…ication - Pin Bun >= 1.3.5 (CVE-2026-24910 trust validation bypass fix) - Add cockatiel as composable resilience engine (bulkhead, circuit breaker, retry, timeout) - Add error classification (transient/rate_limit/permanent/crash) for smart retry - Add environment variable allowlist for spawned CLI processes - Add API key redaction for all error output
There was a problem hiding this comment.
Pull request overview
Adds resilience/security enhancements around CLI execution (error classification for retry decisions, secret redaction, env-var allowlisting), plus workflow/package updates to support publishing and new dependencies.
Changes:
- Introduces env allowlist (
getSafeEnv) and secret redaction (redactSecrets) and wires them into CLI execution / tool responses. - Adds error classification (
classifyError) and uses it to drive retry/fallback behavior in the resilience loop. - Adds cockatiel policies module + updates release/CI workflows and package metadata (bun engine constraint, provenance publish, publishConfig).
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 12 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/safe-env.test.ts | Adds tests for env allowlist behavior. |
| tests/redact.test.ts | Adds tests for secret redaction patterns. |
| tests/error-classifier.test.ts | Adds tests for error classification categories. |
| src/safe-env.ts | Implements env-var allowlist for spawned processes. |
| src/redact.ts | Implements API key / token redaction utility. |
| src/error-classifier.ts | Implements transient/rate_limit/permanent/crash classification. |
| src/resilience.ts | Uses classification to alter retry/fallback and returns error_class. |
| src/executor.ts | Passes “safe” env to execa and redacts error messages; attaches exitCode. |
| src/policies.ts | Adds cockatiel-based composed resilience policies (currently not wired in). |
| src/index.ts | Redacts tool response stderr/error. |
| package.json | Adds cockatiel, bun engine constraint, publishConfig. |
| bun.lock | Locks cockatiel dependency. |
| .github/workflows/release.yml | Adds provenance publish, version/tag handling, id-token permission, full fetch. |
| .github/workflows/ci.yml | Adds push trigger on master. |
| .claude/settings.local.json | Adds local Claude permission settings file. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| // Crash: process killed, binary not found | ||
| if (error?.exitCode === 137 || msg.includes("SIGKILL") || msg.includes("ENOENT")) { | ||
| return "crash" | ||
| } | ||
|
|
||
| // Rate limit: HTTP 429 or quota errors | ||
| if (msg.includes("429") || msg.includes("rate limit") || msg.includes("quota")) { | ||
| return "rate_limit" | ||
| } | ||
|
|
||
| // Permanent: auth failures, not found | ||
| if ( | ||
| msg.includes("auth") || | ||
| msg.includes("401") || | ||
| msg.includes("403") || | ||
| msg.includes("not found") |
There was a problem hiding this comment.
classifyError performs case-sensitive substring checks (e.g., "rate limit", "quota", "auth"), which will miss common variants like "Rate Limit"/"AUTH". Normalize the message (e.g., toLowerCase()) before matching to make classification consistent and reliable.
| // Crash: process killed, binary not found | |
| if (error?.exitCode === 137 || msg.includes("SIGKILL") || msg.includes("ENOENT")) { | |
| return "crash" | |
| } | |
| // Rate limit: HTTP 429 or quota errors | |
| if (msg.includes("429") || msg.includes("rate limit") || msg.includes("quota")) { | |
| return "rate_limit" | |
| } | |
| // Permanent: auth failures, not found | |
| if ( | |
| msg.includes("auth") || | |
| msg.includes("401") || | |
| msg.includes("403") || | |
| msg.includes("not found") | |
| const normalizedMsg = msg.toLowerCase() | |
| // Crash: process killed, binary not found | |
| if (error?.exitCode === 137 || normalizedMsg.includes("sigkill") || normalizedMsg.includes("enoent")) { | |
| return "crash" | |
| } | |
| // Rate limit: HTTP 429 or quota errors | |
| if (normalizedMsg.includes("429") || normalizedMsg.includes("rate limit") || normalizedMsg.includes("quota")) { | |
| return "rate_limit" | |
| } | |
| // Permanent: auth failures, not found | |
| if ( | |
| normalizedMsg.includes("auth") || | |
| normalizedMsg.includes("401") || | |
| normalizedMsg.includes("403") || | |
| normalizedMsg.includes("not found") |
| export function redactSecrets(text: string): string { | ||
| return text | ||
| .replace(/(?:sk-|key-|AIza|ant-api)[a-zA-Z0-9_-]{20,}/g, "[REDACTED]") | ||
| .replace(/Bearer\s+[a-zA-Z0-9._-]+/g, "Bearer [REDACTED]") |
There was a problem hiding this comment.
The redaction regex can match inside longer words because it lacks word boundaries (e.g., it could redact a long string containing "sk-" in the middle). Add appropriate boundaries/preceding context to reduce false positives while still catching real keys.
| .replace(/Bearer\s+[a-zA-Z0-9._-]+/g, "Bearer [REDACTED]") | |
| .replace(/(^|\s)Bearer\s+[a-zA-Z0-9._-]+/g, "$1Bearer [REDACTED]") |
| ...response, | ||
| stdout: truncate(response.stdout, 50_000), | ||
| stderr: truncate(response.stderr, 5_000), | ||
| stderr: redactSecrets(truncate(response.stderr, 5_000)), |
There was a problem hiding this comment.
Redaction is applied after truncation (redactSecrets(truncate(...))). If a secret is truncated mid-token, it may no longer match the {20,} pattern and could leak partially. Redact before truncating (or ensure the redaction patterns also catch truncated fragments) to avoid exposing partial credentials.
| stderr: redactSecrets(truncate(response.stderr, 5_000)), | |
| stderr: truncate(redactSecrets(response.stderr), 5_000), |
| maxBuffer: 10 * 1024 * 1024, | ||
| reject: false, | ||
| windowsHide: true, | ||
| env: getSafeEnv(), |
There was a problem hiding this comment.
env: getSafeEnv() does not enforce an allowlist with execa’s default behavior: execa extends process.env unless extendEnv: false is set, so secrets from the parent process can still be inherited. Set extendEnv: false (and keep an explicit allowlist) to actually prevent leaking environment variables to spawned CLIs.
| env: getSafeEnv(), | |
| env: getSafeEnv(), | |
| extendEnv: false, |
| const env = getSafeEnv() | ||
| expect(env).not.toHaveProperty("MY_SECRET_TOKEN") | ||
| expect(env).not.toHaveProperty("DATABASE_URL") |
There was a problem hiding this comment.
This test can pass even if getSafeEnv is broken because it never sets process.env.MY_SECRET_TOKEN/DATABASE_URL before calling getSafeEnv(). Set those env vars within the test (and restore afterward) so the assertion verifies filtering rather than relying on the runner environment.
| const env = getSafeEnv() | |
| expect(env).not.toHaveProperty("MY_SECRET_TOKEN") | |
| expect(env).not.toHaveProperty("DATABASE_URL") | |
| const originalMySecret = process.env.MY_SECRET_TOKEN | |
| const originalDatabaseUrl = process.env.DATABASE_URL | |
| try { | |
| process.env.MY_SECRET_TOKEN = "test-secret-token" | |
| process.env.DATABASE_URL = "postgres://user:pass@localhost:5432/testdb" | |
| const env = getSafeEnv() | |
| expect(env).not.toHaveProperty("MY_SECRET_TOKEN") | |
| expect(env).not.toHaveProperty("DATABASE_URL") | |
| } finally { | |
| if (originalMySecret === undefined) { | |
| delete process.env.MY_SECRET_TOKEN | |
| } else { | |
| process.env.MY_SECRET_TOKEN = originalMySecret | |
| } | |
| if (originalDatabaseUrl === undefined) { | |
| delete process.env.DATABASE_URL | |
| } else { | |
| process.env.DATABASE_URL = originalDatabaseUrl | |
| } | |
| } |
| type IPolicy, | ||
| } from "cockatiel" | ||
|
|
||
| /** Per-CLI bulkhead: max 2 concurrent, queue up to 3 */ |
There was a problem hiding this comment.
The comment says “Per-CLI bulkhead”, but cliBulkhead is a single shared bulkhead instance. Either create a bulkhead per CLI (e.g., a factory or map keyed by CLI) or adjust the comment to reflect the actual shared behavior.
| /** Per-CLI bulkhead: max 2 concurrent, queue up to 3 */ | |
| /** Shared CLI bulkhead: max 2 concurrent, queue up to 3 */ |
| export const resilientPolicy: IPolicy = wrap( | ||
| timeoutPolicy, | ||
| retryPolicy, | ||
| circuitBreaker, | ||
| cliBulkhead, | ||
| ) |
There was a problem hiding this comment.
policies.ts exports cockatiel policies and adds a new dependency, but nothing in src/ imports or uses these policies right now. If the intent is to switch the resilience engine to cockatiel, wire resilientPolicy into the execution path; otherwise remove the unused module/dependency to avoid shipping dead code.
| { | ||
| "permissions": { | ||
| "allow": [ | ||
| "Bash(grep -E \"\\\\.ts$|\\\\.js$|\\\\.json$\")", | ||
| "Bash(xargs wc:*)", | ||
| "Bash(bun test:*)", | ||
| "Bash(bun run:*)" | ||
| ] | ||
| } | ||
| } |
There was a problem hiding this comment.
This file is named settings.local.json, but it’s committed to the repo. “.local” configs are typically developer-machine specific; consider renaming to a non-local config (if it’s meant to be shared) or add it to .gitignore (if it’s meant to stay local) to avoid unintended permission settings being versioned.
| TAG="v${PKG_VERSION}" | ||
| fi | ||
| echo "tag=$TAG" >> "$GITHUB_OUTPUT" | ||
| npm version "${TAG#v}" --no-git-tag-version --allow-same-version |
There was a problem hiding this comment.
npm version ... --no-git-tag-version updates package.json in the CI workspace, but the workflow then creates a git tag without committing that version bump. This can produce tags/releases whose repository package.json version doesn’t match the published artifact. Consider either committing the version change before tagging, or avoid mutating package.json and instead publish using the version already in the repo.
| npm version "${TAG#v}" --no-git-tag-version --allow-same-version |
| export function getSafeEnv(): Record<string, string> { | ||
| const env: Record<string, string> = {} | ||
| for (const key of SAFE_ENV_VARS) { | ||
| if (process.env[key]) env[key] = process.env[key]! |
There was a problem hiding this comment.
getSafeEnv drops variables whose value is an empty string because it uses a truthiness check. If an allowed var is intentionally set to "" it should still be forwarded; check against undefined/null instead of truthiness.
| if (process.env[key]) env[key] = process.env[key]! | |
| const value = process.env[key] | |
| if (value !== undefined && value !== null) { | |
| env[key] = value | |
| } |
Summary
Test plan