feat(cascade-tools): spec 014 — agent ergonomics (truthful prompts, structured envelope, --comment alias) by zbigniewsobiecki · Pull Request #1190 · mongrel-intelligence/cascade

zbigniewsobiecki · 2026-04-24T20:06:51Z

Summary

Root cause fix for prod run 5d993b04-6e05-4ae1-b7de-8c274cf3496b where a review agent wasted ~2½ min of a 7m42s run fighting cascade-tools scm create-pr-review and ultimately dropped an inline PR comment. The agent-facing system prompt literally told it to use --comment <string> (repeatable) when reality is --comments '<json array>'; error messages and --help gave nothing to self-correct from.
Shared-infra fix (plan 1): prompt renderer stops s-stripping array names and stops claiming "<string> (repeatable)" for every array; array-of-object params now render --<flag> '<json>' with aliases via | and a one-line runnable example inlined from the tool definition. Every CLI failure — flag parse, JSON parse, missing-required, enum-mismatch, unknown-flag, auth, runtime — emits a single structured envelope on stdout ({success:false, error:{type, flag?, message, got?, expected?, hint?, example?}}) plus a readable prose summary on stderr. Mistyped flags get a "did you mean" suggestion via fastest-levenshtein. --help renders def.examples as copy-pasteable invocations under an EXAMPLES section.
Per-gadget opt-in (plan 2): createPRReviewDef gains cliAliases: ['comment'] + fileInputAlternatives for --comments-file <path> (and - for stdin). Two declarative fields. Zero edits to shared infrastructure — proves the single-entrypoint invariant that a new gadget should never need to touch shared machinery.
Authoring guide lives at src/gadgets/README.md (new). Full spec at docs/specs/014-cascade-tools-agent-ergonomics.md.done.

Highlights

Before: agent system prompt declared [--comment <string> (repeatable)]. Agent sent --comment (singular). Got Nonexistent flag: comment. Tried --comments + single-quoted-keys JSON. Got --comments must be valid JSON. Reverse-engineered source. Gave up. Posted body-only review. 2½ min lost.

After: agent prompt declares [--comments|--comment '<json>'] with an example line showing --comments '[{"path":"src/x.ts","line":1,"body":"nit"}]'. If the agent still mistypes, the envelope hint says did you mean --comments?. If JSON is malformed, the envelope surfaces got: "[{'path'...", expected: "[{\"path\":...}]", hint: "Use double-quoted JSON keys... or pass --comments-file <path>". Self-correction is one retry away.

Test plan

npm run build — clean
npm run lint — clean (biome)
npm run typecheck — clean
npm test — 8440 passing, 0 failing, 23 skipped (458 files)
57 new unit tests across errorEnvelope.test.ts, shared-nativeToolPrompts.test.ts, factories.test.ts, definitions.test.ts
7 legacy pre-014 test assertions updated in-place (prose this.error() / flat envelope / s-stripped array names)
[manual] 6 binary-level scenarios run against ./bin/cascade-tools.js after npm run build:
1. canonical --comments '[...]' → exit 1, type:"runtime" (parse succeeded)
2. --comment '[...]' alias → exit 1, type:"runtime" (alias resolved)
3. --comment "[{'path':..}]" (single-quoted JSON) → exit 1, type:"json-parse", flag:"comments", hint mentions --comments-file
4. --comments-file /tmp/x.json → exit 1, type:"runtime" (file parsed)
5. --comments-file - (stdin) → exit 1, type:"runtime" (stdin parsed)
6. --help → exit 0, contains EXAMPLES + "path":"src/utils.ts"
[manual] stderr prose readable: runtime — No GitHub client in scope... (123 chars, single line, no ANSI, no Node stack dump)

Notes

vitest fork-pool workers deterministically fail to capture stdout/stderr from spawned binaries that do top-level await import() — reproduced with spawn/execa/sh -c/execSync, exit 2 with zero output. Unit tests cover every code path at the Command class level; the binary-level scenarios are documented as a manual-verification protocol in plan 2. Chasing the vitest/tinypool root cause is a separate concern and out of scope here.
This PR closes out spec 014. Both plans .done; spec .done. CLAUDE.md untouched (audit clean).

🤖 Generated with Claude Code

Adds docs/specs/014-cascade-tools-agent-ergonomics.md plus two plans covering shared-infra and create-pr-review adoption. Prompted by prod run 5d993b04-6e05-4ae1-b7de-8c274cf3496b. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…elope Ships the root-cause fix for prod run 5d993b04-6e05-4ae1-b7de-8c274cf3496b plus the shared infrastructure every future gadget inherits: - System-prompt renderer (src/backends/shared/nativeToolPrompts.ts) stops stripping trailing 's' from array param names and claiming '<string> (repeatable)' for every array. Array-of-object params now render as `--<flag> '<json>'` with aliases appended via `|` and a one-line runnable example from the tool definition. - Factory (src/gadgets/shared/cliCommandFactory.ts) gains oclif flag aliases, JSON parsing for array-of-object flags, file-input JSON parsing, `examples` wired into oclif `--help`, and Levenshtein-based 'did you mean' suggestions for mistyped flags (via fastest-levenshtein). - New shared error envelope (src/gadgets/shared/errorEnvelope.ts) — every CLI failure emits `{"success":false,"error":{type,flag?,message,got?, expected?,hint?,example?}}` on stdout plus a one-line prose summary on stderr. All prior `this.error()` / flat `{success:false,error:"<string>"}` call sites migrated. - Contracts widened: ParameterDefinition gains `cliAliases`, FileInput- Alternative gains `parseAs`, ToolManifest parameters carry `items`, `aliases`, `example`. - Manifest generator threads the new fields through. - bin/cascade-tools.js wraps `run()` to swallow oclif ExitError cleanly so the envelope isn't obscured by Node's default stack dump. Plan-1 ACs #1–#17 all delivered. 8438/8438 unit tests passing. Test surface delta: 57 new unit tests across errorEnvelope.test.ts, shared-nativeToolPrompts.test.ts, and factories.test.ts. Seven legacy assertions encoding the pre-014 error surface updated in cli/cli-command- factory, cli/file-input-flags, cli/scm/create-pr-sidecar, cli/scm/create- pr-review-sidecar, backends/claude-code. Plan 2 adopts the pattern on createPRReviewDef — zero shared-file edits — proving the declarative-metadata invariant. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Applies the spec-014 declarative-metadata pattern to createPRReviewDef: - --comment alias for --comments (the exact muscle-memory mistake from prod run 5d993b04-6e05-4ae1-b7de-8c274cf3496b). - --comments-file <path> (and - for stdin) JSON-parsed escape hatch for long payloads that don't survive shell quoting. - Two declarative fields on createPRReviewDef.parameters.comments.cliAliases + createPRReviewDef.cli.fileInputAlternatives. Zero edits to shared infrastructure (cliCommandFactory, manifestGenerator, nativeToolPrompts, errorEnvelope) — proves spec 014's single-entrypoint invariant. Per-plan ACs #1, #2, #3, #5, #6, #7, #8, #9, #11, #12 auto-verified (unit tests + build + lint + typecheck). AC #4 (binary-level smoke) tagged [manual] because vitest fork-pool workers fail to capture stdout/stderr from spawned binaries that do top-level await import(); the six scenarios were verified manually against the built binary and the trace is recorded in the plan. AC #10 n/a — integration test path abandoned for the same reason. All plans done. Spec 014 marked .done (docs/specs/014-*.md → .done). CHANGELOG Unreleased updated with a per-plan entry. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codecov · 2026-04-24T20:11:39Z

Codecov Report

❌ Patch coverage is 87.46269% with 42 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/gadgets/shared/cliCommandFactory.ts	87.05%	29 Missing and 4 partials ⚠️
src/gadgets/shared/errorEnvelope.ts	81.48%	5 Missing ⚠️
src/backends/shared/nativeToolPrompts.ts	77.77%	4 Missing ⚠️

📢 Thoughts on this report? Let us know!

zbigniewsobiecki and others added 5 commits April 24, 2026 21:26

chore(plan-014): lock plan 1 (shared-infra)

a5c9586

chore(plan-014): lock plan 2 (createprreview-adopt)

f479d19

zbigniewsobiecki merged commit 3fe7398 into dev Apr 25, 2026
9 checks passed

zbigniewsobiecki deleted the feature/014-cascade-tools-agent-ergonomics branch April 25, 2026 07:00

zbigniewsobiecki mentioned this pull request Apr 25, 2026

chore: promote dev → main (worker exit diagnostics + agent docs dedupe + JIRA / Linear / PM-wizard fixes) #1194

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(cascade-tools): spec 014 — agent ergonomics (truthful prompts, structured envelope, --comment alias)#1190

feat(cascade-tools): spec 014 — agent ergonomics (truthful prompts, structured envelope, --comment alias)#1190
zbigniewsobiecki merged 5 commits intodevfrom
feature/014-cascade-tools-agent-ergonomics

zbigniewsobiecki commented Apr 24, 2026

Uh oh!

codecov Bot commented Apr 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

zbigniewsobiecki commented Apr 24, 2026

Summary

Highlights

Test plan

Notes

Uh oh!

codecov Bot commented Apr 24, 2026

Codecov Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant