
fix: improve PostgreSQL startup reliability and agent documentation #9

Merged
zbigniewsobiecki merged 1 commit into dev from fix/postgresql-startup-reliability
Jan 1, 2026

Conversation

@zbigniewsobiecki

Summary

This PR improves PostgreSQL handling in the agent execution environment:

  • Fail-fast startup: Agent sessions will not start if PostgreSQL fails to start, preventing wasted iterations troubleshooting database connectivity
  • Proper authentication: PostgreSQL configured with password authentication (postgres:postgres)
  • Agent documentation: Environment prompt now includes complete PostgreSQL management commands

Changes

src/agents/utils/setup.ts

  • Added robust error handling to startPostgres():
    • Checks if PostgreSQL is already running before starting
    • Uses -w flag for synchronous startup (waits until ready)
    • Throws descriptive error if startup fails
    • Verifies PostgreSQL is actually running after start
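
A minimal sketch of the fail-fast flow described in the bullets above, not the actual code in setup.ts. The `RunCommand` callback and the `DATA_DIR` path are assumptions made for illustration; only the `pg_ctl` modes and the `-w` and `-D` flags are real.

```typescript
// Hypothetical command runner: returns the exit code of a command.
// In a real setup this could wrap something like child_process.spawnSync.
type RunCommand = (command: string, args: string[]) => number;

// Assumed data directory; the PR does not state the real path.
const DATA_DIR = "/var/lib/postgresql/data";

function isPostgresRunning(run: RunCommand): boolean {
  // `pg_ctl status` exits with 0 when the server is running.
  return run("pg_ctl", ["status", "-D", DATA_DIR]) === 0;
}

function startPostgres(run: RunCommand): void {
  // 1. Nothing to do if the server is already up.
  if (isPostgresRunning(run)) return;

  // 2. `-w` makes pg_ctl wait until startup has completed.
  const exitCode = run("pg_ctl", ["start", "-w", "-D", DATA_DIR]);
  if (exitCode !== 0) {
    throw new Error(`PostgreSQL failed to start (pg_ctl exited with code ${exitCode})`);
  }

  // 3. Double-check that the server is really running after the start.
  if (!isPostgresRunning(run)) {
    throw new Error("PostgreSQL reported a successful start but is not running");
  }
}
```

Injecting the runner keeps the sketch testable without a real PostgreSQL install; the thrown error is what would stop the agent session from starting.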

Dockerfile

  • Configured pg_hba.conf for proper authentication:
    • Local socket connections use trust (for admin tasks)
    • TCP connections require md5 password authentication
    • Set password postgres for user postgres
  • Connection string: postgresql://postgres:postgres@localhost:5432/postgres
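
For illustration only, the authentication rules above could be written as typed data and rendered into pg_hba.conf lines. The loopback addresses and the `all` database/user columns are assumptions; the PR only states that local sockets use trust and TCP connections use md5.

```typescript
interface HbaRule {
  type: "local" | "host";
  database: string;
  user: string;
  address?: string; // only present on "host" (TCP) rules
  method: "trust" | "md5";
}

// Assumed rule set: socket connections trusted, loopback TCP requires a password.
const hbaRules: HbaRule[] = [
  { type: "local", database: "all", user: "all", method: "trust" },
  { type: "host", database: "all", user: "all", address: "127.0.0.1/32", method: "md5" },
  { type: "host", database: "all", user: "all", address: "::1/128", method: "md5" },
];

// Render one pg_hba.conf line per rule, skipping the address column for local rules.
function renderPgHba(entries: HbaRule[]): string {
  return entries
    .map((r) =>
      [r.type, r.database, r.user, r.address, r.method]
        .filter((field) => field !== undefined)
        .join(" "),
    )
    .join("\n");
}
```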

src/agents/prompts/templates/partials/environment.eta

  • Added complete PostgreSQL documentation for agents:
    • Connection string
    • CLI access command
    • Start/stop/status commands using pg_ctl
    • Database creation example

Context

Analysis of a failed agent session revealed the agent wasted 11+ iterations trying to start PostgreSQL using incorrect commands (systemctl, brew services) that don't work in the container environment. This PR ensures:

  1. Agents know the correct commands to manage PostgreSQL
  2. Sessions fail immediately if PostgreSQL can't start

Test Plan

  • Local tests pass
  • CI tests pass
  • Docker build succeeds with new PostgreSQL configuration

🤖 Generated with Claude Code

- Add robust error handling to startPostgres() that fails fast if
  PostgreSQL cannot start, preventing agent sessions from proceeding
  with a broken database
- Configure PostgreSQL in Dockerfile with password authentication:
  - User: postgres, Password: postgres
  - Connection string: postgresql://postgres:postgres@localhost:5432/postgres
  - Local socket uses trust, TCP connections require md5 password
- Update agent environment prompt with complete PostgreSQL documentation:
  - Connection string and CLI access
  - Start/stop/status commands using pg_ctl
  - Database creation example

This addresses issues where agents would waste iterations trying to
troubleshoot PostgreSQL connectivity using incorrect commands (systemctl,
brew services) that don't work in the container environment.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@zbigniewsobiecki zbigniewsobiecki merged commit d3415ea into dev Jan 1, 2026
2 checks passed
@zbigniewsobiecki zbigniewsobiecki deleted the fix/postgresql-startup-reliability branch January 1, 2026 20:20
zbigniewsobiecki added a commit that referenced this pull request Apr 15, 2026
Plan 003/1 (status-parity).

Linear PM wizard Field Mapping step now exposes all 8 CASCADE stages
(backlog, splitting, planning, todo, inProgress, inReview, done,
merged) in lifecycle order instead of only 4. Linear operators can
now map workflow states to splitting/planning/todo/merged and have
the corresponding agents dispatch on issue transitions.

JIRA's silent-drop bug in resolveLifecycleConfig fixed: splitting/
planning/todo mappings the JIRA wizard already accepted now surface
through to PMLifecycleManager and GitHub PR triggers. No operator
action required.

Canonical ProjectPMConfig.statuses widens to declare the full 9-stage
vocabulary (including debug, reserved for future trigger), so
providers can no longer silently drift from the trigger layer.
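
A rough sketch of what a single canonical stage vocabulary might look like; the stage names come from this message, while the type names and the `unmappedStages` helper are hypothetical and not taken from the codebase.

```typescript
// The 8 CASCADE stages in lifecycle order, as listed above.
const CASCADE_STAGES = [
  "backlog",
  "splitting",
  "planning",
  "todo",
  "inProgress",
  "inReview",
  "done",
  "merged",
] as const;

// The 9th stage, reserved for a future trigger.
const RESERVED_STAGES = ["debug"] as const;

type CascadeStage =
  | (typeof CASCADE_STAGES)[number]
  | (typeof RESERVED_STAGES)[number];

// Every provider maps stages to its own status names through the same keys,
// so a provider cannot accept a stage the trigger layer does not know about.
type StatusMapping = Partial<Record<CascadeStage, string>>;

// Hypothetical helper: which lifecycle stages would render as 'not set'.
function unmappedStages(mapping: StatusMapping): string[] {
  return CASCADE_STAGES.filter((stage) => mapping[stage] === undefined);
}
```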

Existing Linear integrations upgrade in place: new slots render as
'not set' on next wizard visit. No migration.

Tests: 9 new unit tests (type shape + Linear + JIRA integration + SSR
wizard). Integration coverage for spec ACs #8/#9 provided by existing
linear-status-changed and jira-status-changed trigger handler tests —
discovered during Phase 4 that handlers read provider-specific config
directly (not resolveLifecycleConfig), so dispatch was never blocked
for handlers; the drop was in downstream PMLifecycleManager callers.

Totals: 7597 unit + 522 integration all green. Lint + typecheck clean.

Closes spec 003.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
zbigniewsobiecki added a commit that referenced this pull request Apr 15, 2026
* docs(plans): add spec 003 + plan; lock plan 003/1

Spec 003 introduces PM status mapping parity across Linear and JIRA.
Plan 1 (status-parity) locked for execution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs(plans): plan 003/1 frontmatter status -> wip

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(pm): status mapping parity across Linear and JIRA

Plan 003/1 (status-parity).

Linear PM wizard Field Mapping step now exposes all 8 CASCADE stages
(backlog, splitting, planning, todo, inProgress, inReview, done,
merged) in lifecycle order instead of only 4. Linear operators can
now map workflow states to splitting/planning/todo/merged and have
the corresponding agents dispatch on issue transitions.

JIRA's silent-drop bug in resolveLifecycleConfig fixed: splitting/
planning/todo mappings the JIRA wizard already accepted now surface
through to PMLifecycleManager and GitHub PR triggers. No operator
action required.

Canonical ProjectPMConfig.statuses widens to declare the full 9-stage
vocabulary (including debug, reserved for future trigger), so
providers can no longer silently drift from the trigger layer.

Existing Linear integrations upgrade in place: new slots render as
'not set' on next wizard visit. No migration.

Tests: 9 new unit tests (type shape + Linear + JIRA integration + SSR
wizard). Integration coverage for spec ACs #8/#9 provided by existing
linear-status-changed and jira-status-changed trigger handler tests —
discovered during Phase 4 that handlers read provider-specific config
directly (not resolveLifecycleConfig), so dispatch was never blocked
for handlers; the drop was in downstream PMLifecycleManager callers.

Totals: 7597 unit + 522 integration all green. Lint + typecheck clean.

Closes spec 003.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs(specs): spec 003 (linear-status-mapping-parity) done — all plans complete

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
zbigniewsobiecki added a commit that referenced this pull request Apr 18, 2026
First real consumer of the shared wizard components. Trello's legacy
per-provider step file (pm-wizard-trello-steps.tsx) has no live importers
outside itself; deletion deferred to plan 011/5.

- TrelloOAuthStep: new custom step at pm-providers/trello/oauth-step.tsx.
  Lifts the window.open popup + manual-token fallback verbatim from the
  legacy TrelloCredentialsStep. Registered as kind:'custom' with
  component:'TrelloOAuthStep' in trelloManifest.wizardSpec.
- trelloManifest.wizardSpec.steps: now [custom(TrelloOAuthStep),
  container-pick, status-mapping, label-mapping, custom-field-mapping,
  webhook-url-display] — 6 steps, one of them custom.
- trelloProviderWizard.steps: rewritten to consume shared components via
  thin per-step adapters. useProviderHooks returns the flat shape each
  adapter slices (boardOptions, providerStates, providerLabels,
  providerCustomFields, onCreate* callbacks). Adapters call shared
  components directly with Trello-specific props.
- Adapters file (trello/adapters.tsx): deleted — orphaned after the
  wizard rewrite.
- useTrelloCustomFieldCreation: now accepts { name: string } argument
  (was hard-coded "Cost"). Enables the shared Create-form UX.

Forward-edit to plan 011/1 (additive, existing tests unchanged):
- label-mapping widened with optional labelDefaults?: Record<slot,
  {name, color?}> — pre-populates Create input, threads color to
  onCreateLabel. Trello uses it for cascade-ready/processing/etc.
- custom-field-mapping widened with optional fieldDefaults?: Record<slot,
  {name}> — pre-populates Create input. Trello uses it for cost field.
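
As a sketch of the labelDefaults widening (assumed shapes, not the shared component's real props): the slot keys and color values below are hypothetical, and `initialCreateInput` is an illustrative helper.

```typescript
interface LabelDefault {
  name: string;
  color?: string;
}

type LabelDefaults = Record<string, LabelDefault>;

// Hypothetical defaults; slot keys and colors are placeholders.
const exampleLabelDefaults: LabelDefaults = {
  ready: { name: "cascade-ready", color: "green" },
  processing: { name: "cascade-processing" },
};

// Pre-populate the Create input for a slot; fall back to an empty name
// when no defaults are supplied (the prop is optional).
function initialCreateInput(slot: string, defaults?: LabelDefaults): LabelDefault {
  const preset = defaults?.[slot];
  return preset ? { ...preset } : { name: "" };
}
```

Because the prop is optional and the fallback is an empty input, callers that pass no defaults would see the same behavior as before the widening.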

Normalize-upward UX changes (user-approved in behavior inventory):
- Dropped retry button on board-picker error (shared component shows
  error text; operator refreshes page).
- Dropped "Create All Missing Labels" batch button (per-slot Create
  covers the same ground, one click at a time).

AC #9 (no operator regression) marked **deferred** — browser smoke test
pending reviewer verification. Unit tests + conformance harness cover
wire-level invariants; no runtime behavior change in adapters
(discovery / label-creation / custom-field-creation hooks reused
unchanged).

19 Trello tests: 5 wizardSpec + 7 oauth-step + 7 wizard-generator. Plus
2 forward-edit tests on the widened shared steps. Full suite 8169/8169,
lint + typecheck + build all green.

Closes plan 011/2 of spec 011.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
zbigniewsobiecki added a commit that referenced this pull request Apr 18, 2026
…migration (specs 010 + 011/1-2) (#1148)

* docs(010): add spec + plans for PM integration hardening followups

* chore(010/1): lock plan 1 as .wip

* feat(010/1): manifest createCustomField hook + pm.discovery mutations

* chore(010/1): mutations complete, plan done

* chore(010/2): lock plan 2 as .wip

* feat(010/2): currentUser discovery capability + provider implementations

* chore(010/2): read cleanup done, currentUser UX restored

* chore(010/3): lock plan 3 with narrowed scope (option B)

* feat(010/3): wizard-components done — shared step components + generator dispatch

Upgrades the wizard generator from spec-010/1 placeholders to real shared
React components for every StandardStepKind. Six new components at
web/src/components/projects/pm-providers/steps/*.tsx: credentials,
container-pick, status-mapping, label-mapping, webhook-url-display,
project-scope. Generator exports STANDARD_STEP_COMPONENTS registry and
dispatches through it; unknown kinds still warn-once and render a
placeholder.

Trello/JIRA/Linear wizards keep their per-provider step adapters from
the spec-006 era — a future plan migrates them. The shared path is
live for new providers today.

new-provider-surface snapshot is tightened to pin the six new files;
wizard-generator + per-provider manifest-wizardSpec tests now assert
element.type identity against the registry instead of placeholder DOM
shapes. 55 new/updated tests, all green.

Docs updated: src/integrations/README.md (post-spec-010 additions),
root CLAUDE.md (PM-integration summary), spec 009 forward-references
spec 010, CHANGELOG entries for specs 009 + 010.

Closes plan 010/3 of spec 010.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* chore(010): spec done — all plans complete

All three plans of spec 010 (PM integration hardening follow-ups)
shipped. Mutations (010/1) added generic pm.discovery.createLabel /
createCustomField endpoints. Read cleanup (010/2) added currentUser
discovery capability. Wizard components (010/3) landed real shared
React components for every StandardStepKind. Spec marked .done.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* docs(011): spec + plan decomposition for wizard shared migration

Spec 011 (PM Wizard Shared Migration) and its 5-plan decomposition.
Migrates Trello/JIRA/Linear wizards onto the shared StandardStepKind
components landed by spec 010 — closes the "zero per-provider step
code" promise across all three production providers, not just new
providers.

Plans: 1-shared-components (widen container-pick/project-scope with
searchable mode + widen webhook-url-display with optional signing-
secret + add 7th StandardStepKind: custom-field-mapping), 2-trello
(first consumer; OAuth stays kind:'custom'), 3-jira (issue-type stays
kind:'custom'; free-text label mode), 4-linear (retire LinearWebhook-
InfoPanel in favor of widened shared component), 5-cleanup (delete
pm-wizard-{trello,jira,linear}-steps.tsx + final docs rewrite).

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* chore(011/1): lock plan 1 (shared-components)

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* feat(011/1): shared-components done — widen 3 steps + add custom-field-mapping kind

Foundation plan for the wizard migration. Three additive widenings +
one new StandardStepKind. All changes dormant until plan 2 activates
them; the 31 spec-010 step tests pass unchanged as the backward-compat
proof.

- container-pick gains optional searchable?: boolean → dispatches to
  the existing shared Combobox (cmdk + radix) when true.
- project-scope gains the same searchable? prop; empty value still
  means "no scope" in both render paths.
- webhook-url-display gains optional secretFieldRole / secretLabel /
  secretValue / onSecretChange → renders an inline <input type="password">
  below the URL when both role + callback are supplied. Defensive: omits
  the input if role is set but callback is not (avoids uncontrolled
  secret inputs silently dropping user input).
- 7th StandardStepKind: 'custom-field-mapping'. New shared component at
  web/src/components/projects/pm-providers/steps/custom-field-mapping.tsx
  renders one row per CASCADE slot with a dropdown of discovered provider
  custom fields + optional inline "Create…" affordance wired to
  manifest.createCustomField (spec 010/1). Visual idiom matches
  status-mapping.
- STANDARD_STEP_COMPONENTS registers the new kind; generator dispatch
  falls through the existing switch path.
- new-provider-surface snapshot pins the 7th file.
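
The defensive rule on webhook-url-display can be sketched as a small predicate. The prop names follow the bullet above; the `url` prop and the `shouldRenderSecretInput` helper are assumptions for illustration.

```typescript
interface WebhookUrlDisplayProps {
  url: string; // assumed prop, not named in the commit message
  secretFieldRole?: string;
  secretLabel?: string;
  secretValue?: string;
  onSecretChange?: (value: string) => void;
}

// Render the password input only when both the role and the change callback
// are supplied; a role without a callback would silently drop user input.
function shouldRenderSecretInput(props: WebhookUrlDisplayProps): boolean {
  return props.secretFieldRole !== undefined && props.onSecretChange !== undefined;
}
```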

Tests use element-tree identity checks where SSR would hit the React
instance mismatch (radix lives in web/node_modules and pulls its own
React). 17 new/updated test assertions across 4 files. Full suite
8153/8153, lint 0/0, typecheck + build green.

Closes plan 011/1 of spec 011.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* chore(011/2): lock plan 2 (trello)

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* feat(011/2): trello done — wizard migrated to shared step components

First real consumer of the shared wizard components. Trello's legacy
per-provider step file (pm-wizard-trello-steps.tsx) has no live importers
outside itself; deletion deferred to plan 011/5.

- TrelloOAuthStep: new custom step at pm-providers/trello/oauth-step.tsx.
  Lifts the window.open popup + manual-token fallback verbatim from the
  legacy TrelloCredentialsStep. Registered as kind:'custom' with
  component:'TrelloOAuthStep' in trelloManifest.wizardSpec.
- trelloManifest.wizardSpec.steps: now [custom(TrelloOAuthStep),
  container-pick, status-mapping, label-mapping, custom-field-mapping,
  webhook-url-display] — 6 steps, one of them custom.
- trelloProviderWizard.steps: rewritten to consume shared components via
  thin per-step adapters. useProviderHooks returns the flat shape each
  adapter slices (boardOptions, providerStates, providerLabels,
  providerCustomFields, onCreate* callbacks). Adapters call shared
  components directly with Trello-specific props.
- Adapters file (trello/adapters.tsx): deleted — orphaned after the
  wizard rewrite.
- useTrelloCustomFieldCreation: now accepts { name: string } argument
  (was hard-coded "Cost"). Enables the shared Create-form UX.

Forward-edit to plan 011/1 (additive, existing tests unchanged):
- label-mapping widened with optional labelDefaults?: Record<slot,
  {name, color?}> — pre-populates Create input, threads color to
  onCreateLabel. Trello uses it for cascade-ready/processing/etc.
- custom-field-mapping widened with optional fieldDefaults?: Record<slot,
  {name}> — pre-populates Create input. Trello uses it for cost field.

Normalize-upward UX changes (user-approved in behavior inventory):
- Dropped retry button on board-picker error (shared component shows
  error text; operator refreshes page).
- Dropped "Create All Missing Labels" batch button (per-slot Create
  covers the same ground, one click at a time).

AC #9 (no operator regression) marked **deferred** — browser smoke test
pending reviewer verification. Unit tests + conformance harness cover
wire-level invariants; no runtime behavior change in adapters
(discovery / label-creation / custom-field-creation hooks reused
unchanged).

19 Trello tests: 5 wizardSpec + 7 oauth-step + 7 wizard-generator. Plus
2 forward-edit tests on the widened shared steps. Full suite 8169/8169,
lint + typecheck + build all green.

Closes plan 011/2 of spec 011.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* chore(011/3): lock plan 3 (jira)

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* feat(011/3): jira done — wizard migrated to shared step components

Second real consumer of the shared wizard components. JIRA's legacy
per-provider step file (pm-wizard-jira-steps.tsx) has no live importers
outside itself; deletion deferred to plan 011/5.

- IssueTypeMappingStep: new JIRA-specific custom step at
  pm-providers/jira/issue-type-step.tsx. Maps CASCADE task/subtask
  roles to JIRA issue types (filtered by the `subtask` flag). Stays
  kind:'custom' rather than becoming an 8th StandardStepKind because
  JIRA is the sole consumer today — speculative abstraction avoided.
- jiraManifest.wizardSpec.steps: now [credentials, container-pick,
  status-mapping, label-mapping, custom-field-mapping,
  custom(IssueTypeMappingStep), webhook-url-display] — 7 steps, one
  custom.
- jiraProviderWizard.steps: rewritten to consume shared components via
  thin per-step adapters. Credentials step uses the shared
  `CredentialsStep` with a synthetic `base_url` role alongside email +
  api_token — no OAuth popup needed for JIRA (unlike Trello). Label
  mapping passes providerLabels: [] so the shared step renders in
  free-text mode (JIRA labels are free-form).
- Adapters file (jira/adapters.tsx): deleted — orphaned after rewrite.
- useJiraCustomFieldCreation: now accepts { name: string } argument
  (was hard-coded "Cost") so the shared Create affordance works.

Task 1 behavior inventory found the same 4 gap classes Trello
surfaced; all four were already closed by plan 011/2's forward-edit to
plan 011/1 (labelDefaults + fieldDefaults additive widenings). No
additional shared-component changes were required for JIRA.

AC #10 (no operator regression) marked **deferred** — browser smoke
test pending reviewer verification on the deployed branch. Unit tests
+ conformance harness cover wire-level invariants; legacy discovery +
custom-field hooks reused unchanged (only the name-arg tweak on the
custom-field mutation).

17 JIRA tests: 6 manifest + 7 issue-type + 4 wizard-generator. JIRA
had zero dedicated wizard-step tests before this plan — this is the
first JIRA wizard coverage landing. Full suite 8185/8185, lint +
typecheck + build all green.

Closes plan 011/3 of spec 011.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* chore(011/4): lock plan 4 (linear)

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* feat(011/4): linear done — + parent-wizard fix for plans 2+3 regression

Third real consumer of the shared wizard components. Plan 011/4 also
ships a critical fix for a regression plans 011/2 and 011/3 introduced:
pm-wizard.tsx hardcoded 3 manifest step slots (stepIndex 0/1/2) from
the spec-006 era. Trello/JIRA wizardSpecs grew to 6+ steps; only the
first 3 rendered on the deploy — label-mapping, custom-field-mapping,
and issue-type-mapping steps were INVISIBLE in production.

Fix: pm-wizard.tsx now iterates over `manifestDef.steps`, rendering
one WizardStep slot per entry. Webhook steps (id ends with `-webhook`)
are filtered out — the legacy WebhookStep still owns programmatic
webhook registration (Trello/JIRA API calls) and Linear's signing-secret
UX. The shared `webhook-url-display` component (widened in plan 011/1)
remains dormant for the three existing providers until a follow-up plan
migrates webhook-creation UX into the manifest path.
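
A simplified sketch of the fix, assuming a step shape with just an `id` and a `title` (the real `manifestDef.steps` entries are not shown here): iterate every step instead of three fixed slots, and leave webhook steps to the legacy WebhookStep.

```typescript
interface ManifestStep {
  id: string;
  title: string;
}

// One wizard slot per manifest step, minus steps whose id ends with "-webhook".
function renderableSteps(steps: ManifestStep[]): ManifestStep[] {
  return steps.filter((step) => !step.id.endsWith("-webhook"));
}
```

With a fixed three-slot layout, anything past index 2 never renders; deriving the slots from the list keeps the wizard in step with the manifest as it grows.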

Linear wizard migration:

- linearProviderWizard.steps: rewritten to consume shared components
  via 6 thin per-step adapters. No kind:'custom' steps — Linear has no
  OAuth popup (like Trello) and no issue-type mapping (like JIRA).
- LinearWebhookDisplayAdapter: Fragment composing shared
  WebhookUrlDisplayStep + ProjectSecretField (LINEAR_WEBHOOK_SECRET).
  Currently dormant; activates after legacy WebhookStep migration.
- project-scope step (spec 005): uses the shared ProjectScopeStep with
  `searchable: true`.
- label-mapping: uses shared component with LINEAR_LABEL_DEFAULTS
  (plan 011/1 forward-edit) pre-populating the Create input with
  cascade-ready/processing/etc. and threading hex colors.
- Adapters file (linear/adapters.tsx): deleted — orphaned after
  rewrite.
- Legacy step tests deleted: linear-field-mapping-step.test.ts,
  linear-team-step.test.ts, linear-webhook-info-panel.test.ts
  (-450 lines). Replaced by 8-test linear-wizard-generator.test.ts
  covering the wizard wiring + manifest↔definition parity.

AC #3 (inline webhook secret), #6 (LinearWebhookInfoPanel retired),
and #11 (no operator regression) marked **partial/deferred** — see
Progress section for details. All other ACs green.

Full suite 8167/8167, lint + typecheck + build all green.

Closes plan 011/4 of spec 011.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* chore(011/5): lock plan 5 (cleanup)

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* feat(011/5): cleanup done — deleted 3 legacy step files + docs rewrite

Closes spec 011 per user-approved option (a) — tight scope: deletions
+ docs. Scope-clipped items (full LinearWebhookInfoPanel retirement,
Linear inline-secret via shared component) carry over as follow-up
work; rationale is captured in the plan's Progress section.

Deletions:
- web/src/components/projects/pm-wizard-trello-steps.tsx (retired
  since plan 011/2; no live importers)
- web/src/components/projects/pm-wizard-jira-steps.tsx (since 011/3)
- web/src/components/projects/pm-wizard-linear-steps.tsx (since 011/4)
- pm-wizard.tsx dead comments about transitive imports of the above

Audits:
- pm-wizard-common-steps.tsx — all three remaining exports
  (LinearWebhookInfoPanel, WebhookStep, SaveStep) still have live
  consumers via pm-wizard.tsx. File retained.
- Dead-code grep: only doc-comment references to the deleted files
  remain; no live imports.

Docs:
- src/integrations/README.md — four-specs preamble (006/009/010/011);
  "seven kinds" in "Adding a new PM provider" step 3;
  Post-spec-011 additions table alongside the Post-spec-010 one.
- CLAUDE.md (project root) — PM-integration summary references
  spec 011.
- CHANGELOG.md — Internal entry for spec 011 alongside 009/010.
- docs/specs/010-pm-integration-hardening-followups.md.done —
  forward-reference blockquote to spec 011.

Verification:
- npm test: 8167 passed, 23 skipped
- npm run lint: clean
- npm run typecheck: green
- npm run build: green
- Conformance harness: all three providers pass
- new-provider-surface guard: 7 step files pinned

Closes plan 011/5 of spec 011.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* chore(011): spec done — all plans complete

Five plans of spec 011 (PM Wizard Shared Migration) shipped. Shared
components widened (plan 1), Trello migrated (plan 2), JIRA migrated
(plan 3), Linear migrated + pm-wizard.tsx parent refactor (plan 4),
legacy per-provider step files deleted + docs closed (plan 5). Spec
marked .done.

Deferred to follow-up spec:
- Full migration of webhook-creation UX (Trello/JIRA programmatic
  webhook registration + Linear signing-secret persistence) into the
  manifest path. Legacy WebhookStep + LinearWebhookInfoPanel still
  render for this.

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
zbigniewsobiecki added a commit that referenced this pull request Apr 25, 2026
…tructured envelope, --comment alias) (#1190)

* docs(014): spec + plans for cascade-tools agent ergonomics

Adds docs/specs/014-cascade-tools-agent-ergonomics.md plus two plans
covering shared-infra and create-pr-review adoption. Prompted by prod
run 5d993b04-6e05-4ae1-b7de-8c274cf3496b.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(plan-014): lock plan 1 (shared-infra)

* feat(cascade-tools): plan 014/1 shared-infra — truthful prompts + envelope

Ships the root-cause fix for prod run 5d993b04-6e05-4ae1-b7de-8c274cf3496b
plus the shared infrastructure every future gadget inherits:

- System-prompt renderer (src/backends/shared/nativeToolPrompts.ts) stops
  stripping trailing 's' from array param names and claiming '<string>
  (repeatable)' for every array. Array-of-object params now render as
  `--<flag> '<json>'` with aliases appended via `|` and a one-line runnable
  example from the tool definition.
- Factory (src/gadgets/shared/cliCommandFactory.ts) gains oclif flag aliases,
  JSON parsing for array-of-object flags, file-input JSON parsing, `examples`
  wired into oclif `--help`, and Levenshtein-based 'did you mean' suggestions
  for mistyped flags (via fastest-levenshtein).
- New shared error envelope (src/gadgets/shared/errorEnvelope.ts) — every
  CLI failure emits `{"success":false,"error":{type,flag?,message,got?,
  expected?,hint?,example?}}` on stdout plus a one-line prose summary on
  stderr. All prior `this.error()` / flat `{success:false,error:"<string>"}`
  call sites migrated.
- Contracts widened: ParameterDefinition gains `cliAliases`, FileInput-
  Alternative gains `parseAs`, ToolManifest parameters carry `items`,
  `aliases`, `example`.
- Manifest generator threads the new fields through.
- bin/cascade-tools.js wraps `run()` to swallow oclif ExitError cleanly so
  the envelope isn't obscured by Node's default stack dump.
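
To make the envelope and the 'did you mean' behavior concrete, here is a hedged sketch. The field names follow the envelope shape quoted above; the `unknown_flag` type string, the distance threshold of 2, and the helper names are assumptions. The real factory uses fastest-levenshtein; a plain inline edit-distance function stands in for it here so the example has no dependencies.

```typescript
interface CliErrorDetail {
  type: string;
  message: string;
  flag?: string;
  got?: string;
  expected?: string;
  hint?: string;
  example?: string;
}

interface CliErrorEnvelope {
  success: false;
  error: CliErrorDetail;
}

// Classic dynamic-programming edit distance (stand-in for fastest-levenshtein).
function levenshtein(a: string, b: string): number {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Closest known flag within maxDistance edits, if any.
function suggestFlag(input: string, known: string[], maxDistance = 2): string | undefined {
  let best: string | undefined;
  let bestDistance = maxDistance + 1;
  for (const candidate of known) {
    const distance = levenshtein(input, candidate);
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best;
}

// Build the structured envelope for a mistyped flag; the hint is optional.
function unknownFlagEnvelope(flag: string, known: string[]): CliErrorEnvelope {
  const suggestion = suggestFlag(flag, known);
  return {
    success: false,
    error: {
      type: "unknown_flag", // hypothetical type value
      flag,
      message: `Unknown flag ${flag}`,
      ...(suggestion ? { hint: `Did you mean ${suggestion}?` } : {}),
    },
  };
}
```

A caller would serialize the envelope with `JSON.stringify` for stdout and print `error.message` as the one-line summary on stderr.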

Plan-1 ACs #1 through #17 all delivered. 8438/8438 unit tests passing.

Test surface delta: 57 new unit tests across errorEnvelope.test.ts,
shared-nativeToolPrompts.test.ts, and factories.test.ts. Seven legacy
assertions encoding the pre-014 error surface updated in cli/cli-command-
factory, cli/file-input-flags, cli/scm/create-pr-sidecar, cli/scm/create-
pr-review-sidecar, backends/claude-code.

Plan 2 adopts the pattern on createPRReviewDef — zero shared-file edits —
proving the declarative-metadata invariant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(plan-014): lock plan 2 (createprreview-adopt)

* feat(cascade-tools): plan 014/2 createprreview-adopt + spec done

Applies the spec-014 declarative-metadata pattern to createPRReviewDef:

- --comment alias for --comments (the exact muscle-memory mistake from
  prod run 5d993b04-6e05-4ae1-b7de-8c274cf3496b).
- --comments-file <path> (and - for stdin) JSON-parsed escape hatch for
  long payloads that don't survive shell quoting.
- Two declarative fields on createPRReviewDef.parameters.comments.cliAliases
  + createPRReviewDef.cli.fileInputAlternatives. Zero edits to shared
  infrastructure (cliCommandFactory, manifestGenerator, nativeToolPrompts,
  errorEnvelope) — proves spec 014's single-entrypoint invariant.
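
The declarative-metadata idea can be sketched roughly as follows. The `ParameterDefinition` shape is reduced to the one field that matters here, the `body` parameter is a made-up neighbor, and `resolveFlag` is a hypothetical stand-in for whatever the factory does internally.

```typescript
interface ParameterDefinition {
  type: "string" | "array";
  cliAliases?: string[];
}

// Tool definition side: the alias is declared as data, not coded into the factory.
const reviewParameters: Record<string, ParameterDefinition> = {
  comments: { type: "array", cliAliases: ["comment"] },
  body: { type: "string" }, // hypothetical second parameter
};

// Shared side: map a CLI flag (canonical name or alias) to its parameter name.
function resolveFlag(
  flag: string,
  params: Record<string, ParameterDefinition>,
): string | undefined {
  const flagName = flag.replace(/^--/, "");
  for (const [canonical, definition] of Object.entries(params)) {
    if (canonical === flagName || definition.cliAliases?.includes(flagName)) {
      return canonical;
    }
  }
  return undefined;
}
```

Since the alias lives in the definition, adding one to another tool would mean editing only that tool's metadata.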

Per-plan ACs #1, #2, #3, #5, #6, #7, #8, #9, #11, #12 auto-verified
(unit tests + build + lint + typecheck). AC #4 (binary-level smoke)
tagged [manual] because vitest fork-pool workers fail to capture
stdout/stderr from spawned binaries that do top-level await import();
the six scenarios were verified manually against the built binary and
the trace is recorded in the plan. AC #10 n/a — integration test path
abandoned for the same reason.

All plans done. Spec 014 marked .done (docs/specs/014-*.md → .done).
CHANGELOG Unreleased updated with a per-plan entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
zbigniewsobiecki added a commit that referenced this pull request Apr 26, 2026
* docs(spec/plans): add spec 015 + plans for router dispatch failure recovery

Spec captures the silent black-hole bug class verified live on 2026-04-26
(ucho/MNG-350): a transient capacity miss or Docker error during worker
spawn turns a webhook-driven job into a permanently failed BullMQ entry
while stranding the work-item / agent-type locks for up to 30 minutes,
silently rejecting subsequent webhooks for the same work item.

Decomposed into two plans with safety-net-first sequencing: plan 1 hooks
the BullMQ failed event to release locks on every dispatch failure path;
plan 2 replaces the throw-on-capacity with a wait-for-slot semaphore,
adds bounded retry with exponential backoff, and a transient/terminal
error classifier.
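
To illustrate the plan 2 direction only: a bounded retry with exponential backoff plus a transient/terminal split might look like the sketch below. The base delay, cap, attempt budget, and function names are all assumptions; the spec text here does not give concrete values.

```typescript
type ErrorKind = "transient" | "terminal";

// Delay before the given attempt's retry: base, 2x base, 4x base, ... up to a cap.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 30000): number {
  return Math.min(capMs, baseMs * 2 ** (attempt - 1));
}

// Retry only transient failures, and only while the attempt budget lasts.
function shouldRetry(kind: ErrorKind, attempt: number, maxAttempts = 3): boolean {
  return kind === "transient" && attempt < maxAttempts;
}
```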

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(plan): lock 015/1 failed-event-lock-compensation

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(router): plan 015/1 done — release locks on dispatch failure

Closes the stranded-lock half of spec 015's bug class verified live in
prod on 2026-04-26 (ucho/MNG-350). When a webhook-driven job's dispatch
fails — capacity throw, Docker spawn error, or any future throw site —
the work-item lock, agent-type concurrency counter, and recently-dispatched
dedup mark established by the webhook → enqueue path are now released by
a compensator hooked to BullMQ's `worker.on('failed')` event.
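
A simplified, synchronous sketch of the compensation idea (the real helpers presumably talk to Redis and are async). The `Release` and `report` callbacks are hypothetical stand-ins for the clear* functions and the Sentry capture, and attempting each release independently is an assumption.

```typescript
type Release = () => void;

// Try every release; report failures instead of throwing so that one broken
// release cannot block the others or poison the worker's failed handler.
function releaseLocksForFailedJob(
  releases: Release[],
  report: (error: unknown) => void,
): number {
  let released = 0;
  for (const release of releases) {
    try {
      release();
      released++;
    } catch (error) {
      report(error);
    }
  }
  return released;
}
```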

What landed:

- `src/router/dispatch-compensator.ts` (new) — `releaseLocksForFailedJob`
  wraps `extractProjectIdFromJob` / `extractWorkItemId` / `extractAgentType`
  and calls into `clearWorkItemEnqueued` / `clearAgentTypeEnqueued` /
  `clearRecentlyDispatched`. Never propagates errors; captures to Sentry
  with `tags: { source: 'dispatch_compensator' }`.
- `src/router/agent-type-lock.ts` — exports new `clearRecentlyDispatched`
  for the compensator. The existing `markRecentlyDispatched` semantics
  are unchanged (60s TTL, NOT cleared on completion); this helper exists
  solely so a permanently-failed dispatch doesn't keep deduping a fresh
  webhook for ~60s while the user retries.
- `src/router/bullmq-workers.ts` — extends the existing
  `worker.on('failed')` handler to invoke `releaseLocksForFailedJob`
  alongside the existing logger + Sentry calls. Wraps the call in a
  defensive `.catch` so a future regression in the compensator can't
  poison the worker.
- `src/router/lock-state-classifier.ts` (new) — `classifyLockState` returns
  `'awaiting-slot'` when an active worker or queued/waiting job matches
  the trio, `'wedged'` when neither correlation matches. Defaults to
  `'awaiting-slot'` on classifier error so a Redis blip doesn't mis-emit
  the wedged canary.
- `src/router/active-workers.ts` — `getActiveWorkers()` now exposes
  `(projectId, workItemId, agentType)` so the classifier can correlate.
  Backwards-compatible (existing callers work unchanged; new fields are
  additive optional).
- `src/router/webhook-processor.ts` — Step 8 (work-item lock check) now
  splits the decision-reason vocabulary into three states:
    * `Job queued: ...` (success path)
    * `Awaiting worker slot: ...` (lock held + dispatch in flight; healthy)
    * `Work item locked (no active dispatch): ...` (wedged-lock canary)
  The wedged branch additionally fires `captureException` with
  `tags: { source: 'wedged_lock_canary' }` so any regression in
  compensation is loud in production.
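
As a rough illustration of the compensation step listed above, the sketch below releases all three marks for a failed job and never lets an error escape. The function names come from this commit; the `FailedJob` shape, the `LockStore` interface, the argument lists, and the injected `report` callback are hypothetical. It is synchronous for brevity, whereas the real module presumably awaits its lock store and reports to Sentry.

```typescript
// Hypothetical job shape: the real code derives these ids via
// extractProjectIdFromJob / extractWorkItemId / extractAgentType.
interface FailedJob {
  data: { projectId?: string; workItemId?: string; agentType?: string };
}

// Hypothetical lock store; the argument lists are assumptions for illustration.
interface LockStore {
  clearWorkItemEnqueued(projectId: string, workItemId: string): void;
  clearAgentTypeEnqueued(projectId: string, agentType: string): void;
  clearRecentlyDispatched(projectId: string, workItemId: string, agentType: string): void;
}

function releaseLocksForFailedJob(
  job: FailedJob,
  locks: LockStore,
  report: (err: unknown) => void,
): void {
  const { projectId, workItemId, agentType } = job.data;
  // Without the full trio there is nothing to correlate, so do nothing.
  if (!projectId || !workItemId || !agentType) return;

  const steps: Array<() => void> = [
    () => locks.clearWorkItemEnqueued(projectId, workItemId),
    () => locks.clearAgentTypeEnqueued(projectId, agentType),
    () => locks.clearRecentlyDispatched(projectId, workItemId, agentType),
  ];

  // Each release is attempted independently: one failing step is reported
  // and must not prevent the remaining marks from being cleared.
  for (const step of steps) {
    try {
      step();
    } catch (err) {
      report(err);
    }
  }
}
```

Running every step in its own `try` block is one way to get the "never propagates errors" behavior while still releasing as many marks as possible.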

What this does NOT change (intentional, all in plan 015/2):
- `guardedSpawn` still throws on capacity (BullMQ marks the job failed,
  the compensator now releases the locks, but the job itself is still
  lost). Plan 2 replaces the throw with a wait-for-slot semaphore.
- Both queues still default to `attempts: 1`. Plan 2 raises this with
  exponential backoff and adds a transient/terminal error classifier.
- CLAUDE.md is intentionally not updated by this plan — the unified
  passage describing both halves of the new contract lands in plan 015/2.

Tests:
- 5 new unit tests in `dispatch-compensator.test.ts`
- 3 new unit tests in `agent-type-lock.test.ts` for `clearRecentlyDispatched`
- 4 new unit tests in `bullmq-workers.test.ts` for the failed-event seam
- 5 new unit tests in `lock-state-classifier.test.ts`
- 2 new unit tests in `active-workers.test.ts` for the extended shape
- 4 new unit tests in `webhook-processor.test.ts` for the three-way taxonomy
- 3 new module-integration tests in
  `tests/integration/router/dispatch-failure-compensation.test.ts` exercise
  the real lock modules + real bullmq-workers.ts failed-event handler +
  real compensator end-to-end (only BullMQ's Worker constructor + the
  worker-env extractors are mocked).

Full suite: 8515 passed / 23 skipped / 0 failed. Lint + typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(plan): lock 015/2 wait-for-slot-and-retry-classifier

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(plan): mark 015/2 status: wip

* feat(router): plan 015/2 done — wait-for-slot + retry budget + classifier

Closes the lost-job half of spec 015's bug class. Combined with plan 015/1,
the silent black-hole failure mode verified live in prod on 2026-04-26
(ucho/MNG-350) is now fully closed.

What landed:

- `src/router/slot-waiter.ts` (new) — semaphore-style primitive:
  `acquireSlot({ timeoutMs })` resolves immediately when capacity is
  below `routerConfig.maxWorkers`, otherwise queues a FIFO waiter with
  a bounded timeout that rejects with `code: 'SLOT_WAIT_TIMEOUT'`.
  `slotReleased()` pops the head waiter; `clearAllWaiters()` rejects
  every pending waiter with `code: 'SHUTDOWN'` on router stop.
- `src/router/dispatch-error-classifier.ts` (new) — classifies thrown
  errors into `'transient'` (Docker socket Node codes, HTTP 429/409,
  SLOT_WAIT_TIMEOUT, anything unknown — default-to-retry) vs
  `'terminal'` (TypeError, ZodError, image-not-found-after-fallback).
- `src/router/worker-manager.ts` — `guardedSpawn` rewritten:
  `await acquireSlot(...)` replaces the synchronous capacity throw;
  on spawn error, terminal errors are wrapped in BullMQ's
  `UnrecoverableError` so retries skip; transient errors propagate
  unchanged so BullMQ retries via attempts/backoff.
- `src/router/active-workers.ts` — `cleanupWorker` now calls
  `slotReleased()` exactly once per cleanup, including on the crash
  path. The existing `if (worker)` guard ensures idempotence.
- `src/router/config.ts` — new `slotWaitTimeoutMs` field (default
  5min, configurable via `SLOT_WAIT_TIMEOUT_MS`).
- `src/router/queue.ts` and `src/queue/client.ts` — both queues now
  default to `attempts: 4` with `backoff: { type: 'exponential', delay: 5000 }`
  (~75s total before exhaustion). Terminal errors bypass via
  `UnrecoverableError`.
- `src/router/container-manager.ts` — exports the existing
  `isImageNotFoundError` predicate so the classifier can reuse it.
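
The wait-for-slot primitive described above might look roughly like the following. The method names and the `SLOT_WAIT_TIMEOUT` / `SHUTDOWN` codes come from this commit; the `SlotWaiter` class, its internal `active` counter, and the `pending` getter are assumptions made so the sketch is self-contained (the real module is described as checking capacity against `routerConfig.maxWorkers`).

```typescript
type SlotError = Error & { code: 'SLOT_WAIT_TIMEOUT' | 'SHUTDOWN' };

interface Waiter {
  resolve: () => void;
  reject: (err: SlotError) => void;
  timer: ReturnType<typeof setTimeout>;
}

function slotError(code: SlotError['code']): SlotError {
  return Object.assign(new Error(code), { code });
}

// Hypothetical class wrapper; the counter stands in for the router's worker count.
class SlotWaiter {
  private active = 0;
  private readonly waiters: Waiter[] = [];
  private readonly maxWorkers: number;

  constructor(maxWorkers: number) {
    this.maxWorkers = maxWorkers;
  }

  acquireSlot({ timeoutMs }: { timeoutMs: number }): Promise<void> {
    // Below capacity: take the slot immediately.
    if (this.active < this.maxWorkers) {
      this.active++;
      return Promise.resolve();
    }
    // At capacity: queue in FIFO order with a bounded wait.
    return new Promise<void>((resolve, reject) => {
      const waiter: Waiter = {
        resolve,
        reject,
        timer: setTimeout(() => {
          const i = this.waiters.indexOf(waiter);
          if (i !== -1) this.waiters.splice(i, 1);
          reject(slotError('SLOT_WAIT_TIMEOUT'));
        }, timeoutMs),
      };
      this.waiters.push(waiter);
    });
  }

  slotReleased(): void {
    const head = this.waiters.shift();
    if (head) {
      // Hand the freed slot straight to the head waiter; active count is unchanged.
      clearTimeout(head.timer);
      head.resolve();
    } else if (this.active > 0) {
      this.active--;
    }
  }

  clearAllWaiters(): void {
    // Reject everything still queued, e.g. on router stop.
    for (const w of this.waiters.splice(0)) {
      clearTimeout(w.timer);
      w.reject(slotError('SHUTDOWN'));
    }
  }

  get pending(): number {
    return this.waiters.length;
  }
}
```

Handing the slot directly to the head waiter, rather than decrementing and letting waiters race, is what keeps the ordering FIFO in this sketch.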

Test contract change (spec AC #9):

The previous `tests/unit/router/worker-manager.test.ts:179` assertion
`'processFn throws when at capacity'` is REPLACED (not deleted) with
`'processFn awaits a slot when at capacity, then dispatches when one
frees'`. The throw-on-capacity contract is gone forever.

Tests:

- 7 new unit tests in `slot-waiter.test.ts` (FIFO, timeout, no-op,
  shutdown rejection)
- 11 new unit tests in `dispatch-error-classifier.test.ts` covering
  every transient/terminal class
- 4 new unit tests in `worker-manager.test.ts` (replaced original
  capacity-throw test + 3 for retry classification)
- 3 new unit tests in `active-workers.test.ts` for slotReleased
  integration
- 5 new module-integration tests in `dispatch-retry.test.ts` exercise
  REAL guardedSpawn + REAL slot-waiter + REAL dispatch-error-classifier
  against both queues, mocking only spawnWorker + BullMQ Worker
  constructor.

Plan 1's 3 module-integration tests continue to pass alongside plan 2's 5.
Full unit suite: 8539 passed / 23 skipped / 0 failed.

CLAUDE.md updated with a new "Dispatch failure semantics" section
documenting the unified contract (capacity wait, retry budget,
classifier, three-way decision-reason taxonomy from plan 1, wedged-lock
canary). File now 182 lines, under the 200-line cap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(spec): 015 done — router job dispatch failure recovery, all plans complete

Closes the silent black-hole bug class verified live on 2026-04-26
(ucho/MNG-350). Plan 1 added failed-event lock compensation +
three-way decision-reason taxonomy; plan 2 replaced the throw-on-capacity
with wait-for-slot, added bounded retry with exponential backoff, and
introduced a transient/terminal error classifier.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
zbigniewsobiecki added a commit that referenced this pull request Apr 26, 2026
* fix(triggers): audit & fix PM feedback inconsistencies across respond-to-* agents (#1201)

* fix(triggers): audit & fix PM feedback inconsistencies across respond-to-* agents

* fix(triggers): use case-insensitive JIRA status comparison in isInPlanningStatus

Match the established pattern from status-changed.ts and label-added.ts which
both use .toLowerCase() for JIRA status comparisons, since status names are
user-configurable and the API does not guarantee consistent casing.
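
A minimal sketch of the comparison, assuming a simplified signature (the real `isInPlanningStatus` parameters are not shown here, so both arguments are hypothetical):

```typescript
// Hypothetical signature: compares the issue's current status name against
// the configured planning status, ignoring case on both sides.
function isInPlanningStatus(
  currentStatus: string | undefined,
  planningStatus: string,
): boolean {
  if (!currentStatus) return false;
  return currentStatus.toLowerCase() === planningStatus.toLowerCase();
}
```

With this shape, a status reported as `PLANNING` still matches a configured `Planning`.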

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(linear): populate inlineMedia from descriptions/comments and add downloadAttachment (#1202)

Co-authored-by: Cascade Bot <bot@cascade.dev>

* fix(claude-code): pin pathToClaudeCodeExecutable so SDK skips broken native-binary probe (#1206)

The agent-harness SDK bump in #1197 (claude-agent-sdk 0.2.91 → 0.2.119)
broke every review run on cascade-prod with:

  ReferenceError: Claude Code native binary not found at
  /app/node_modules/@anthropic-ai/claude-agent-sdk-linux-x64-musl/claude

The new SDK probes its own platform-specific optional-dependency
subpackages for a bundled `claude` binary. Two failure modes hit at once:

  1. Cascade installs `@anthropic-ai/claude-code@2.1.119` globally at
     /usr/local/bin/claude — the SDK never looks there.
  2. The SDK probes the `-musl` variant first regardless of host libc and
     errors on ENOENT instead of falling through to the glibc variant.

Pass an explicit `pathToClaudeCodeExecutable` to short-circuit the probe.
The resolver checks (in order):

  - $CLAUDE_CODE_EXECUTABLE_PATH env override (local-dev escape hatch)
  - `which claude` in $PATH
  - /usr/local/bin/claude (Docker default from Dockerfile.worker)
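
The resolution order above can be sketched as follows. The function names `resolveClaudeExecutable` and `whichClaude` and the injectable parameters are hypothetical; only the environment variable, the `which claude` lookup, and the fallback path come from this commit message.

```typescript
import { execSync } from 'node:child_process';

// Docker default named in the commit message.
const DOCKER_DEFAULT = '/usr/local/bin/claude';

// Looks up `claude` on $PATH; returns undefined when `which` fails or prints nothing.
function whichClaude(): string | undefined {
  try {
    const out = execSync('which claude', {
      encoding: 'utf8',
      stdio: ['ignore', 'pipe', 'ignore'],
    }).trim();
    return out || undefined;
  } catch {
    return undefined;
  }
}

// Hypothetical resolver: env override first, then $PATH, then the Docker default.
// Both parameters are injectable so the order can be tested without a real binary.
function resolveClaudeExecutable(
  env: Record<string, string | undefined> = process.env,
  which: () => string | undefined = whichClaude,
): string {
  return env['CLAUDE_CODE_EXECUTABLE_PATH'] || which() || DOCKER_DEFAULT;
}
```

The returned string would then be supplied as the `pathToClaudeCodeExecutable` option so the SDK does not run its own probe.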

Two TDD tests pin the option onto query() and prove the env override
wins. No Dockerfile change needed; the existing global install at
/usr/local/bin/claude becomes the resolver's runtime target.

Confirmed broken on ucho PR #72 (cascade-prod review agent crash).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* spec 015: router job dispatch failure recovery (#1203)

* docs(spec/plans): add spec 015 + plans for router dispatch failure recovery

Spec captures the silent black-hole bug class verified live on 2026-04-26
(ucho/MNG-350): a transient capacity miss or Docker error during worker
spawn turns a webhook-driven job into a permanently failed BullMQ entry
while stranding the work-item / agent-type locks for up to 30 minutes,
silently rejecting subsequent webhooks for the same work item.

Decomposed into two plans with safety-net-first sequencing: plan 1 hooks
the BullMQ failed event to release locks on every dispatch failure path;
plan 2 replaces the throw-on-capacity with a wait-for-slot semaphore,
adds bounded retry with exponential backoff, and a transient/terminal
error classifier.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(plan): lock 015/1 failed-event-lock-compensation

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(router): plan 015/1 done — release locks on dispatch failure

Closes the stranded-lock half of spec 015's bug class verified live in
prod on 2026-04-26 (ucho/MNG-350). When a webhook-driven job's dispatch
fails — capacity throw, Docker spawn error, or any future throw site —
the work-item lock, agent-type concurrency counter, and recently-dispatched
dedup mark established by the webhook → enqueue path are now released by
a compensator hooked to BullMQ's `worker.on('failed')` event.

What landed:

- `src/router/dispatch-compensator.ts` (new) — `releaseLocksForFailedJob`
  wraps `extractProjectIdFromJob` / `extractWorkItemId` / `extractAgentType`
  and calls into `clearWorkItemEnqueued` / `clearAgentTypeEnqueued` /
  `clearRecentlyDispatched`. Never propagates errors; captures to Sentry
  with `tags: { source: 'dispatch_compensator' }`.
- `src/router/agent-type-lock.ts` — exports new `clearRecentlyDispatched`
  for the compensator. The existing `markRecentlyDispatched` semantics
  are unchanged (60s TTL, NOT cleared on completion); this helper exists
  solely so a permanently-failed dispatch doesn't keep deduping a fresh
  webhook for ~60s while the user retries.
- `src/router/bullmq-workers.ts` — extends the existing
  `worker.on('failed')` handler to invoke `releaseLocksForFailedJob`
  alongside the existing logger + Sentry calls. Wraps the call in a
  defensive `.catch` so a future regression in the compensator can't
  poison the worker.
- `src/router/lock-state-classifier.ts` (new) — `classifyLockState` returns
  `'awaiting-slot'` when an active worker or queued/waiting job matches
  the trio, `'wedged'` when neither correlation matches. Defaults to
  `'awaiting-slot'` on classifier error so a Redis blip doesn't mis-emit
  the wedged canary.
- `src/router/active-workers.ts` — `getActiveWorkers()` now exposes
  `(projectId, workItemId, agentType)` so the classifier can correlate.
  Backwards-compatible (existing callers work unchanged; new fields are
  additive optional).
- `src/router/webhook-processor.ts` — Step 8 (work-item lock check) now
  splits the decision-reason vocabulary into three states:
    * `Job queued: ...` (success path)
    * `Awaiting worker slot: ...` (lock held + dispatch in flight; healthy)
    * `Work item locked (no active dispatch): ...` (wedged-lock canary)
  The wedged branch additionally fires `captureException` with
  `tags: { source: 'wedged_lock_canary' }` so any regression in
  compensation is loud in production.
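
The two lock states can be illustrated with a small classifier. This is a sketch under assumptions: `classifyLockState` and the two state names come from this commit, while the `Trio` and `DispatchView` types and the synchronous lookups are hypothetical stand-ins for the real active-worker and queue queries.

```typescript
type LockState = 'awaiting-slot' | 'wedged';

interface Trio {
  projectId: string;
  workItemId: string;
  agentType: string;
}

// Hypothetical view over dispatch state; the real classifier reads
// getActiveWorkers() and the BullMQ queue instead.
interface DispatchView {
  activeWorkers(): Trio[];
  queuedJobs(): Trio[];
}

const sameTrio = (a: Trio, b: Trio): boolean =>
  a.projectId === b.projectId &&
  a.workItemId === b.workItemId &&
  a.agentType === b.agentType;

function classifyLockState(target: Trio, view: DispatchView): LockState {
  try {
    const inFlight =
      view.activeWorkers().some((w) => sameTrio(w, target)) ||
      view.queuedJobs().some((j) => sameTrio(j, target));
    // A held lock with no matching worker or job has nothing left to release it.
    return inFlight ? 'awaiting-slot' : 'wedged';
  } catch {
    // A lookup failure must not raise the wedged canary, so assume the healthy state.
    return 'awaiting-slot';
  }
}
```

Defaulting to `'awaiting-slot'` on error trades a possibly missed canary for the absence of false alarms during a transient lookup failure.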

What this does NOT change (intentional, all in plan 015/2):
- `guardedSpawn` still throws on capacity (BullMQ marks the job failed,
  the compensator now releases the locks, but the job itself is still
  lost). Plan 2 replaces the throw with a wait-for-slot semaphore.
- Both queues still default to `attempts: 1`. Plan 2 raises this with
  exponential backoff and adds a transient/terminal error classifier.
- CLAUDE.md is intentionally not updated by this plan — the unified
  passage describing both halves of the new contract lands in plan 015/2.

Tests:
- 5 new unit tests in `dispatch-compensator.test.ts`
- 3 new unit tests in `agent-type-lock.test.ts` for `clearRecentlyDispatched`
- 4 new unit tests in `bullmq-workers.test.ts` for the failed-event seam
- 5 new unit tests in `lock-state-classifier.test.ts`
- 2 new unit tests in `active-workers.test.ts` for the extended shape
- 4 new unit tests in `webhook-processor.test.ts` for the three-way taxonomy
- 3 new module-integration tests in
  `tests/integration/router/dispatch-failure-compensation.test.ts` exercise
  the real lock modules + real bullmq-workers.ts failed-event handler +
  real compensator end-to-end (only BullMQ's Worker constructor + the
  worker-env extractors are mocked).

Full suite: 8515 passed / 23 skipped / 0 failed. Lint + typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(plan): lock 015/2 wait-for-slot-and-retry-classifier

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(plan): mark 015/2 status: wip

* feat(router): plan 015/2 done — wait-for-slot + retry budget + classifier

Closes the lost-job half of spec 015's bug class. Combined with plan 015/1,
the silent black-hole failure mode verified live in prod on 2026-04-26
(ucho/MNG-350) is now fully closed.

What landed:

- `src/router/slot-waiter.ts` (new) — semaphore-style primitive:
  `acquireSlot({ timeoutMs })` resolves immediately when capacity is
  below `routerConfig.maxWorkers`, otherwise queues a FIFO waiter with
  a bounded timeout that rejects with `code: 'SLOT_WAIT_TIMEOUT'`.
  `slotReleased()` pops the head waiter; `clearAllWaiters()` rejects
  every pending waiter with `code: 'SHUTDOWN'` on router stop.
- `src/router/dispatch-error-classifier.ts` (new) — classifies thrown
  errors into `'transient'` (Docker socket Node codes, HTTP 429/409,
  SLOT_WAIT_TIMEOUT, anything unknown — default-to-retry) vs
  `'terminal'` (TypeError, ZodError, image-not-found-after-fallback).
- `src/router/worker-manager.ts` — `guardedSpawn` rewritten:
  `await acquireSlot(...)` replaces the synchronous capacity throw;
  on spawn error, terminal errors are wrapped in BullMQ's
  `UnrecoverableError` so retries skip; transient errors propagate
  unchanged so BullMQ retries via attempts/backoff.
- `src/router/active-workers.ts` — `cleanupWorker` now calls
  `slotReleased()` exactly once per cleanup, including on the crash
  path. The existing `if (worker)` guard ensures idempotence.
- `src/router/config.ts` — new `slotWaitTimeoutMs` field (default
  5min, configurable via `SLOT_WAIT_TIMEOUT_MS`).
- `src/router/queue.ts` and `src/queue/client.ts` — both queues now
  default to `attempts: 4` with `backoff: { type: 'exponential', delay: 5000 }`
  (~75s total before exhaustion). Terminal errors bypass via
  `UnrecoverableError`.
- `src/router/container-manager.ts` — exports the existing
  `isImageNotFoundError` predicate so the classifier can reuse it.
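
Because unknown errors default to retry, a classifier of this kind only needs to recognize the terminal classes explicitly. The sketch below assumes a hypothetical `classifyDispatchError` name and an injected `isImageNotFound` predicate standing in for the exported `isImageNotFoundError`; matching `ZodError` by its `name` property is also an assumption, made to avoid importing zod.

```typescript
type DispatchErrorClass = 'transient' | 'terminal';

// Hypothetical classifier: terminal classes are listed explicitly,
// everything else (socket errors, HTTP 429/409, slot timeouts, unknowns)
// falls through to 'transient' so the queue retries it.
function classifyDispatchError(
  err: unknown,
  isImageNotFound: (e: unknown) => boolean = () => false,
): DispatchErrorClass {
  // Programming errors will not fix themselves on retry.
  if (err instanceof TypeError) return 'terminal';
  // Validation failures: detected by name here (assumption, see lead-in).
  if (err instanceof Error && err.name === 'ZodError') return 'terminal';
  // Image missing even after fallback: retrying cannot help.
  if (isImageNotFound(err)) return 'terminal';
  return 'transient';
}
```

A caller would wrap errors classified as `'terminal'` in BullMQ's `UnrecoverableError` and rethrow the rest unchanged, as the `guardedSpawn` item above describes.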

Test contract change (spec AC #9):

The previous `tests/unit/router/worker-manager.test.ts:179` assertion
`'processFn throws when at capacity'` is REPLACED (not deleted) with
`'processFn awaits a slot when at capacity, then dispatches when one
frees'`. The throw-on-capacity contract is gone forever.

Tests:

- 7 new unit tests in `slot-waiter.test.ts` (FIFO, timeout, no-op,
  shutdown rejection)
- 11 new unit tests in `dispatch-error-classifier.test.ts` covering
  every transient/terminal class
- 4 new unit tests in `worker-manager.test.ts` (replaced original
  capacity-throw test + 3 for retry classification)
- 3 new unit tests in `active-workers.test.ts` for slotReleased
  integration
- 5 new module-integration tests in `dispatch-retry.test.ts` exercise
  REAL guardedSpawn + REAL slot-waiter + REAL dispatch-error-classifier
  against both queues, mocking only spawnWorker + BullMQ Worker
  constructor.

Plan 1's 3 module-integration tests continue to pass alongside plan 2's 5.
Full unit suite: 8539 passed / 23 skipped / 0 failed.

CLAUDE.md updated with a new "Dispatch failure semantics" section
documenting the unified contract (capacity wait, retry budget,
classifier, three-way decision-reason taxonomy from plan 1, wedged-lock
canary). File now 182 lines, under the 200-line cap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(spec): 015 done — router job dispatch failure recovery, all plans complete

Closes the silent black-hole bug class verified live on 2026-04-26
(ucho/MNG-350). Plan 1 added failed-event lock compensation +
three-way decision-reason taxonomy; plan 2 replaced the throw-on-capacity
with wait-for-slot, added bounded retry with exponential backoff, and
introduced a transient/terminal error classifier.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(deps): bump postcss from 8.5.8 to 8.5.12 (#1204)

Bumps [postcss](https://github.com/postcss/postcss) from 8.5.8 to 8.5.12.
- [Release notes](https://github.com/postcss/postcss/releases)
- [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md)
- [Commits](postcss/postcss@8.5.8...8.5.12)

---
updated-dependencies:
- dependency-name: postcss
  dependency-version: 8.5.12
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps): bump uuid, bullmq and dockerode (#1192)

Removes [uuid](https://github.com/uuidjs/uuid). It's no longer used after updating ancestor dependencies [uuid](https://github.com/uuidjs/uuid), [bullmq](https://github.com/taskforcesh/bullmq) and [dockerode](https://github.com/apocas/dockerode). These dependencies need to be updated together.


Removes `uuid`

Updates `bullmq` from 5.72.0 to 5.76.2
- [Release notes](https://github.com/taskforcesh/bullmq/releases)
- [Commits](taskforcesh/bullmq@v5.72.0...v5.76.2)

Updates `dockerode` from 4.0.10 to 5.0.0
- [Release notes](https://github.com/apocas/dockerode/releases)
- [Commits](apocas/dockerode@v4.0.10...v5.0.0)

---
updated-dependencies:
- dependency-name: uuid
  dependency-version: 
  dependency-type: indirect
- dependency-name: bullmq
  dependency-version: 5.76.2
  dependency-type: direct:production
- dependency-name: dockerode
  dependency-version: 5.0.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: aaight <aaight42@gmail.com>
Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
zbigniewsobiecki added a commit that referenced this pull request Apr 26, 2026
* spec 016: PM image delivery reliability (#1209)

* docs(spec/plans): add spec 016 + plans for PM image delivery reliability

Closes the silent screenshot-drop bug class verified live on 2026-04-26
(ucho/MNG-357): Linear's user-pasted-image URLs (uploads.linear.app/<uuid>
with no file extension) were dropped at the pre-download MIME filter
because mimeTypeFromUrl returned 'application/octet-stream' and
filterImageMedia excluded them. This affected all engines on the
disk-write path, regardless of PR #948's Claude-Code SDK delivery fix.

Three plans, safety-net-first sequencing matching spec 015:
- Plan 1 (boot-path-mime-fix-and-diagnostic-log): defers MIME authority
  to download response Content-Type via image/* wildcard sentinel; adds
  the grep-stable diagnostic log line at extract time. Independently
  fixes MNG-357.
- Plan 2 (runtime-gadget-image-delivery): makes the runtime
  cascade-tools pm read-work-item gadget actually download + write
  images to disk with file paths returned in text. Closes the mid-run
  pickup gap. Depends on Plan 1's shared download-and-prepare helper.
- Plan 3 (linear-fixture-and-extraction-coverage): captures a Linear
  GraphQL Issue payload fixture for an issue with a pasted screenshot;
  pins extraction with a regression test that fails loudly if Linear
  ever changes the payload shape. Mostly tests + docs.

9 ACs, 0 manual-only. CLAUDE.md not updated (already covered by spec 015's
silent-failure → diagnostic-line pattern).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(plan): lock 016/1 boot-path-mime-fix-and-diagnostic-log

* feat(pm): plan 016/1 done — boot-path image MIME fix + diagnostic log

Closes the silent screenshot-drop bug class verified live on 2026-04-26
(ucho/MNG-357). Linear's extension-less pasted-image URLs
(uploads.linear.app/<uuid>) now survive the pre-download MIME filter via
an image/* wildcard sentinel. The download response's Content-Type
header is the authoritative MIME — wildcard is resolved before bytes
are written.
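
A minimal sketch of how the wildcard could be resolved from the response header, for illustration only. `resolveImageMime` and its null-on-failure convention are hypothetical names and choices, not the code that landed:

```typescript
// Hypothetical helper: resolve the 'image/*' sentinel against the download
// response's Content-Type header. Returns null when the header does not
// name a concrete image type, so the caller can decide how to handle it.
function resolveImageMime(
  declared: string,
  contentType: string | null,
): string | null {
  // A concrete MIME derived from the URL extension is kept as-is.
  if (declared !== "image/*") return declared;

  // Drop parameters such as "; charset=binary" and normalise case.
  const [head = ""] = (contentType ?? "").split(";");
  const fromHeader = head.trim().toLowerCase();

  const isConcreteImage =
    fromHeader.startsWith("image/") && fromHeader !== "image/*";
  return isConcreteImage ? fromHeader : null;
}
```

In this sketch the header only matters when the URL itself carried no usable extension, which matches the "wildcard is resolved before bytes are written" ordering described above.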

What landed:

- src/pm/media.ts — new IMAGE_HOST_ALLOWLIST (currently 'uploads.linear.app');
  mimeTypeFromUrl returns 'image/*' for extension-less URLs from allowlisted
  hosts; isImageMimeType accepts the wildcard.
- src/pm/download-and-prepare.ts (new) — shared helper for the per-provider
  download dispatch loop (jira/linear/trello). Returns { images, failures }.
  Spec 016/2's runtime gadget will import this.
- src/agents/definitions/contextSteps.ts — fetchWorkItemStep refactored to
  use the shared helper; emits the new grep-stable diagnostic line
  '[image-pipeline] work-item-fetch summary' with stable fields:
  { provider, workItemId, urlsDetected, urlsAfterFilter, urlsDownloaded,
    urlsFailed, urlsByMimeType }.
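
The allowlist and sentinel behaviour listed above can be sketched roughly as follows. The function and constant names come from this commit message, but the bodies and the extension table are assumptions; the real src/pm/media.ts may differ:

```typescript
// Assumed shape of the allowlist: hosts whose extension-less URLs are
// treated as images until the download response says otherwise.
const IMAGE_HOST_ALLOWLIST = new Set<string>(["uploads.linear.app"]);

// Illustrative extension table (not taken from the source).
const EXTENSION_MIME: Record<string, string> = {
  png: "image/png",
  jpg: "image/jpeg",
  jpeg: "image/jpeg",
  gif: "image/gif",
  webp: "image/webp",
  svg: "image/svg+xml",
};

const FALLBACK_MIME = "application/octet-stream";

function mimeTypeFromUrl(url: string): string {
  let parsed: URL;
  try {
    parsed = new URL(url);
  } catch {
    return FALLBACK_MIME; // not a parseable absolute URL
  }

  const lastSegment = parsed.pathname.split("/").pop() ?? "";
  const dot = lastSegment.lastIndexOf(".");

  if (dot === -1) {
    // Extension-less: only allowlisted hosts get the wildcard sentinel.
    return IMAGE_HOST_ALLOWLIST.has(parsed.hostname) ? "image/*" : FALLBACK_MIME;
  }

  const ext = lastSegment.slice(dot + 1).toLowerCase();
  return EXTENSION_MIME[ext] ?? FALLBACK_MIME;
}

// Accepts concrete image types and the 'image/*' sentinel alike.
function isImageMimeType(mime: string): boolean {
  return mime.startsWith("image/");
}
```

Under these assumptions an extension-less `uploads.linear.app` URL passes the pre-download filter, while an extension-less URL from any other host is still excluded.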

Tests:
- 6 new unit tests in tests/unit/pm/media.test.ts (wildcard sentinel,
  Linear extension-less, regression for extensioned + non-PM URLs)
- 7 new unit tests in tests/unit/pm/download-and-prepare.test.ts
- 3 new diagnostic-log tests in contextSteps.test.ts; existing log
  message expectations updated to the new helper-prefix
- 3 module-integration tests in tests/integration/pm/image-pipeline.test.ts
  pinning the MNG-357 reproduction end-to-end with real
  mimeTypeFromUrl + filterImageMedia + extractMarkdownImages

PR #948's Claude-Code initial-input ImageBlockParam path is unchanged;
existing regression test (claude-code.test.ts:939
'logs image injection and strips images before buildTaskPrompt') confirms.

Docs:
- CHANGELOG.md entry under Unreleased.
- src/integrations/README.md gains a new 'Image delivery contract' section
  documenting the shared resolution path, allowlist semantics, diagnostic
  log line schema, and the rule that providers shouldn't write their own
  MIME-detection.

Full unit suite: 8521 passed / 23 skipped / 0 failed. Lint + typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(plan): lock 016/2 runtime-gadget-image-delivery

* feat(pm): plan 016/2 done — runtime gadget delivers images on disk

Closes the mid-run image pickup gap from spec 016. The runtime gadget
`cascade-tools pm read-work-item` now downloads any image media and
writes it to .cascade/context/images/work-item-<id>-img-<index>.<ext>,
returning text whose new "Local Image Files" section lists actual file
paths the agent's file-read tool can consume.

What landed:

- src/gadgets/pm/core/writeRuntimeImages.ts (new) — writes ContextImage
  arrays to .cascade/context/images/ with stable naming convention
  (work-item-<id>-img-<i>.<ext>); extension derived from resolved MIME;
  falls back to .bin + warn log for unresolved image/* sentinel.
- src/gadgets/pm/core/readWorkItem.ts — readWorkItem now calls
  downloadAndPrepareImages (Plan 1's helper) + writeRuntimeImages
  (this plan), then mutates the returned text to include the local
  file paths via formatRuntimeImagePaths. Same diagnostic log line
  '[image-pipeline] work-item-fetch summary' as the boot path.
  Failed downloads surface in a "Failed Image Downloads" subsection.
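
As an illustration of the naming convention above, a path builder might look like the sketch below. `runtimeImagePath` and the MIME-to-extension table are hypothetical, the index base is left to the caller, and the warn log that accompanies the .bin fallback is omitted:

```typescript
// Illustrative MIME-to-extension table (not taken from the source).
const MIME_EXTENSION: Record<string, string> = {
  "image/png": "png",
  "image/jpeg": "jpg",
  "image/gif": "gif",
  "image/webp": "webp",
  "image/svg+xml": "svg",
};

// Hypothetical helper: builds work-item-<id>-img-<index>.<ext> under
// .cascade/context/images/. Any MIME without a known extension, including
// an unresolved 'image/*' sentinel, falls back to .bin in this sketch.
function runtimeImagePath(
  workItemId: string,
  index: number,
  mimeType: string,
): string {
  const ext = MIME_EXTENSION[mimeType.toLowerCase()] ?? "bin";
  return `.cascade/context/images/work-item-${workItemId}-img-${index}.${ext}`;
}
```

For example, `runtimeImagePath("MNG-357", 0, "image/png")` yields `.cascade/context/images/work-item-MNG-357-img-0.png`.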

Tests:
- 8 new unit tests in tests/unit/gadgets/pm/core/writeRuntimeImages.test.ts
- 5 new unit tests in tests/unit/gadgets/pm/core/readWorkItem.test.ts
  (spec 016/2 sub-describe)
- 4 new module-integration tests in
  tests/integration/gadgets/runtime-image-delivery.test.ts pinning
  the mid-run pickup contract end-to-end.

CHANGELOG.md entry added.

Full unit suite (single-fork): 8534 passed / 23 skipped / 0 failed.
Lint + typecheck clean. Three PM manifest test suites occasionally
time out under parallel load on this machine — verified to pass in
isolation; not a code regression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(plan): lock 016/3 linear-fixture-and-extraction-coverage

* test(pm): plan 016/3 done — Linear fixture + extraction-coverage regression

Closes spec 016 with the regression net for the contract Plans 1+2
established. If Linear ever changes its Issue payload shape in a way
that loses inline images, the extraction-coverage test fails loudly
with a specific URL-missing message.

What landed:

- tests/fixtures/linear-issue-with-screenshot.json (new) — reconstructed
  Linear GraphQL Issue payload covering: extension-less uploads.linear.app
  URL in description, extensioned Linear URL with alt text, external URL
  with image/svg+xml MIME, non-image markdown link (must NOT be picked
  up), one comment with a pasted screenshot, one comment without, and
  three formal Attachment records (Slack/GitHub/Sentry link previews).
- tests/unit/pm/linear/extraction-coverage.test.ts (new) — 9 tests:
  description coverage with explicit expected-URL list, image/* sentinel
  for extension-less, concrete MIME for extensioned, image/svg+xml for
  external SVG, non-image link exclusion, comment coverage, comment
  source field, attachment-NOT-leaked rule, meta-test of regression net.
- src/integrations/README.md — new "Linear: GraphQL surface for inline
  images" subsection documenting the conclusion: Issue.description
  markdown is canonical for inline-pasted screenshots; Issue.attachments
  is for formal Attachment records (link previews) and is the wrong
  surface for inline images. Links to the fixture and the test.

No production code change — Plan 1's mimeTypeFromUrl + extractMarkdownImages
already cover the cases. This plan ships the regression armor.

CHANGELOG.md entry added.

Lint + typecheck clean. 9/9 unit tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(spec): 016 done — pm-image-delivery-reliability, all plans complete

Closes the silent screenshot-drop bug class verified live on 2026-04-26
(ucho/MNG-357). Plan 1 added the Linear-extension-less MIME wildcard
sentinel + diagnostic log line; plan 2 made the runtime
cascade-tools pm read-work-item gadget actually deliver images on
disk; plan 3 captured a Linear GraphQL fixture and pinned extraction
coverage with a regression test.

CLAUDE.md untouched by this spec — already covered by spec 015's
broader silent-failure → diagnostic-line pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address code review concerns

* test(image-pipeline): supply urlsDetected in readWorkItemWithMedia mocks

The diagnostic-line assertion expected urlsDetected on the log payload, but
the mocked readWorkItemWithMedia return values omitted it, so the field
arrived as undefined and the toHaveBeenCalledWith match failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Cascade Bot <bot@cascade.dev>

* fix(linear): drop comment-mention planning-state gate that prod payload never satisfies (#1210)

PR #1201 added a `currentStateId !== planningStateId` gate to the Linear
comment @mention trigger that read `data.issue.stateId` from the webhook
payload. Linear's Comment webhook does not ship `stateId` on the nested
issue (verified across four prod payloads on 2026-04-26 — 8cd0108a,
b93e4925, 6548cd14, 3d95b210). With `stateId` undefined, the comparison
always evaluated to true, so the gate silently dropped every legitimate
bot @mention, including the one on MNG-346 that motivated this fix.
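
The failure mode can be reproduced in a few lines. The types and function below are a hypothetical reduction of the removed gate, not the original trigger code:

```typescript
// Hypothetical reduction: the old type declared stateId, but the real
// Comment webhook payload never carries it on the nested issue.
interface CommentIssuePayload {
  id: string;
  stateId?: string;
}

// Mirrors the removed `currentStateId !== planningStateId` check:
// returns true when the mention would be dropped.
function gateDropsMention(
  issue: CommentIssuePayload,
  planningStateId: string,
): boolean {
  const currentStateId = issue.stateId; // undefined for real payloads
  return currentStateId !== planningStateId;
}

// A prod-shaped payload has no stateId, so the mention is always dropped,
// even when the issue really is in the planning state.
const prodShaped: CommentIssuePayload = { id: "issue-1" };
const dropped = gateDropsMention(prodShaped, "planning-state-id");
```

Here `dropped` is `true` for any value of `planningStateId`, because `undefined` is never strictly equal to a string.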

The agent (respond-to-planning-comment) is now responsible for any
planning-only behavior; the trigger no longer gates on state and avoids
an extra Linear GraphQL round-trip per comment.

Also corrects `LinearWebhookCommentTriggerData.issue` to match what
Linear actually ships (six keys, no `stateId`, optional `team`) — the
old type lied and PR #1201 trusted it.

Tests pin a real prod-shape Comment payload as a regression. JIRA's
equivalent gate is unaffected (its `comment_created` payload does ship
`issue.fields.status.name`).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: aaight <aaight42@gmail.com>
Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>