From 2638a378ac7b1dc05c9355824e4c4eee5780547a Mon Sep 17 00:00:00 2001 From: Zbigniew Sobiecki Date: Wed, 29 Apr 2026 10:49:00 +0000 Subject: [PATCH 1/4] chore(plan): 017/2 lock --- ...apacity-gate-pm-scope.md => 2-capacity-gate-pm-scope.md.wip} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename docs/plans/017-router-silent-failure-hardening/{2-capacity-gate-pm-scope.md => 2-capacity-gate-pm-scope.md.wip} (99%) diff --git a/docs/plans/017-router-silent-failure-hardening/2-capacity-gate-pm-scope.md b/docs/plans/017-router-silent-failure-hardening/2-capacity-gate-pm-scope.md.wip similarity index 99% rename from docs/plans/017-router-silent-failure-hardening/2-capacity-gate-pm-scope.md rename to docs/plans/017-router-silent-failure-hardening/2-capacity-gate-pm-scope.md.wip index d029119d..912403dd 100644 --- a/docs/plans/017-router-silent-failure-hardening/2-capacity-gate-pm-scope.md +++ b/docs/plans/017-router-silent-failure-hardening/2-capacity-gate-pm-scope.md.wip @@ -6,7 +6,7 @@ plan_slug: capacity-gate-pm-scope level: plan parent_spec: docs/specs/017-router-silent-failure-hardening.md depends_on: [] -status: pending +status: wip --- # 017/2: Pipeline-capacity-gate PM-provider scope From 532dda54a21de46c9a8a69cf8f2a1bd773877419 Mon Sep 17 00:00:00 2001 From: Zbigniew Sobiecki Date: Wed, 29 Apr 2026 10:59:25 +0000 Subject: [PATCH 2/4] fix(router): wrap PM-source dispatch in PM-provider scope so capacity gate enforces maxInFlightItems MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Plan 017/2 (capacity-gate-pm-scope). Closes failure mode B from spec 017's 24h log audit on 2026-04-29. Root cause: the pipeline-capacity gate at `src/triggers/shared/pipeline-capacity-gate.ts:33-45` calls `getPMProvider()` to count in-flight items. For every PM `status-changed` trigger this threw `No PMProvider in scope` because the three PM router adapters (`src/router/adapters/{linear,trello,jira}.ts`) wrapped trigger dispatch in their per-PM-type credential AsyncLocalStorage scope (`withLinearCredentials` / `withTrelloCredentials` / `withJiraCredentials`) but NOT in PM-provider scope. The GitHub adapter at `src/router/adapters/github.ts:280` already had both wrappings — the PM router adapters were the outliers. The gate caught the throw and fell through to its conservative branch: `WARN: pipeline-capacity-gate: PM provider unavailable, allowing run` + `return false` (allow). Net effect: `maxInFlightItems` was silently no-op for the entire PM-source path. 32 occurrences/day on prod cascade-router. The gate's whole purpose is preventing the multi-card-into-TODO incident class documented in the file header — and the protection had been disabled since the gate was first introduced. Three changes: 1. New shared helper `withPMScopeForDispatch(project, dispatch)` at `src/router/adapters/_shared.ts`. Resolves the PM provider via `createPMProvider(project)` and wraps `dispatch` in `withPMProvider`. Mirrors the GitHub adapter's correct shape. 2. The three PM router adapters consume the helper inside `dispatchWithCredentials`, layering PM-provider scope on top of their existing per-PM-type credential scope. 3. The capacity gate's "PM provider unavailable" branch flips from `WARN + return false` (allow) to ERROR + Sentry capture under stable tag `pipeline_capacity_gate_no_pm_provider` + `return true` (block). Once the routine path establishes scope (changes 1+2 above), hitting that branch represents a real AsyncLocalStorage scope leak that operators need to investigate. Failing closed is preferable to silently failing open and re-introducing the original incident class. Regression nets: - `tests/unit/integrations/pm-router-adapter-pm-scope.test.ts`: static guard reading each PM router adapter source and asserting it contains either `withPMScopeForDispatch` or `withPMProvider`. A future PM router adapter added without the wrapping fails this guard with a precise file path. Modeled on `trigger-event-consistency.test.ts`. - `tests/unit/router/adapters/with-pm-scope-for-dispatch.test.ts`: 4 tests covering happy path (PM provider resolves inside the wrapped callback), return-value passthrough, null dispatch result, and error propagation. - `pipeline-capacity-gate.test.ts`: existing "allows when no PM provider" test inverted to "FAILS CLOSED (blocks) when no PM provider scope" (the headline behavioral change), plus 2 new positive-path regression pins. Side effects on existing trigger tests: three trigger-handler test files (`linear-status-changed.test.ts`, `jira-status-changed.test.ts`, `status-changed.test.ts`) call `shouldBlockForPipelineCapacity` indirectly via the trigger handlers. After the fail-closed flip, those handlers were blocked because the unit tests don't establish PM-provider scope. Each test file gains a `vi.mock('.../pipeline-capacity-gate.js')` passthrough so trigger-logic assertions still run — the gate's behavior is tested in its own dedicated test file. Three PM router adapter test files (`linear.test.ts`, `trello.test.ts`, `jira.test.ts`) gain a similar passthrough mock for `_shared.js:withPMScopeForDispatch` so they don't pull the real PM manifest registry. Doc updates: - `CLAUDE.md` Architecture section gains a "Capacity-gate invariant" paragraph documenting the wrapping requirement, the fail-closed policy, the Sentry tag, and the static guard. - `CHANGELOG.md` entry under spec 017. Full suite: 474 files / 8677 tests passing (+9 from Plan 1's baseline). Co-Authored-By: Claude Opus 4.7 (1M context) --- CHANGELOG.md | 1 + CLAUDE.md | 2 + ...d.wip => 2-capacity-gate-pm-scope.md.done} | 24 ++--- src/router/adapters/_shared.ts | 42 ++++++++ src/router/adapters/jira.ts | 6 +- src/router/adapters/linear.ts | 7 +- src/router/adapters/trello.ts | 8 +- src/triggers/shared/pipeline-capacity-gate.ts | 29 ++++- .../pm-router-adapter-pm-scope.test.ts | 58 ++++++++++ tests/unit/router/adapters/jira.test.ts | 5 + tests/unit/router/adapters/linear.test.ts | 7 ++ tests/unit/router/adapters/trello.test.ts | 5 + .../with-pm-scope-for-dispatch.test.ts | 86 +++++++++++++++ .../unit/triggers/jira-status-changed.test.ts | 7 ++ .../triggers/linear-status-changed.test.ts | 8 ++ .../shared/pipeline-capacity-gate.test.ts | 100 +++++++++++++++--- tests/unit/triggers/status-changed.test.ts | 7 ++ 17 files changed, 372 insertions(+), 30 deletions(-) rename docs/plans/017-router-silent-failure-hardening/{2-capacity-gate-pm-scope.md.wip => 2-capacity-gate-pm-scope.md.done} (91%) create mode 100644 src/router/adapters/_shared.ts create mode 100644 tests/unit/integrations/pm-router-adapter-pm-scope.test.ts create mode 100644 tests/unit/router/adapters/with-pm-scope-for-dispatch.test.ts diff --git a/CHANGELOG.md b/CHANGELOG.md index c3723177..0ccb6d94 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,6 +6,7 @@ All notable user-visible changes to CASCADE are documented here. The format is l ### Changed +- **Pipeline-capacity gate now enforces `maxInFlightItems` for PM `status-changed` triggers** (spec 017, plan 2 of 3). The gate at `src/triggers/shared/pipeline-capacity-gate.ts` is the hard cap on the active pipeline (TODO + IN_PROGRESS + IN_REVIEW work items) introduced after a prior incident where a human moved three cards into TODO simultaneously and three concurrent implementation runs fired against a project pinned to `maxInFlightItems: 1`. The gate calls `getPMProvider()` to count in-flight items, but for every PM `status-changed` trigger the call threw `No PMProvider in scope` because the three PM router adapters (`src/router/adapters/{linear,trello,jira}.ts`) wrapped trigger dispatch in their per-PM-type credential `AsyncLocalStorage` scope but NOT in PM-provider scope (the GitHub adapter at `src/router/adapters/github.ts:280` already had both wrappings). The gate fell through to its conservative branch (`WARN: pipeline-capacity-gate: PM provider unavailable, allowing run` and `return false`) — silently no-op for the only triggers that actually need it. 32 occurrences/day on cascade-router (verified 2026-04-29). The fix introduces a shared helper `withPMScopeForDispatch(project, dispatch)` at `src/router/adapters/_shared.ts` that the three PM router adapters consume, mirroring the GitHub adapter's correct shape. The gate's "PM provider unavailable" branch is converted from `WARN + return false` (allow) to ERROR-level + Sentry capture under stable tag `pipeline_capacity_gate_no_pm_provider` + `return true` (block) — once the routine path establishes scope, hitting that branch is a real `AsyncLocalStorage` scope leak operators need to investigate. A static-guard test at `tests/unit/integrations/pm-router-adapter-pm-scope.test.ts` enforces the wrapping invariant per adapter; CLAUDE.md gains a "Capacity-gate invariant" passage in the Architecture section. See [spec 017](docs/specs/017-router-silent-failure-hardening.md). - **PM-ack dispatch consolidation: Linear-based PM-focused agents now post their PM-side ack comment** (spec 017, plan 1 of 3). PM-focused agents (e.g. `backlog-manager`) triggered from a GitHub webhook used to silently skip their PM-side ack on Linear projects: the router-adapter's local `postPMAck` helper had `if (pmType === 'trello')` / `if (pmType === 'jira')` branches but no Linear branch, so Linear-based projects fell through to a `WARN: Unknown PM type for PM-focused agent ack, skipping` and never saw the "🔧 On it" comment that Trello/JIRA projects got (24 silent skips per day on cascade-router, all from `ucho`, verified 2026-04-29). A near-identical helper at `src/triggers/shared/pm-ack.ts` already had the Linear branch — pure parallel-path drift. The fix introduces a single consolidated helper `dispatchPMAck` at `src/router/pm-ack-dispatch.ts` that indexes the manifest registry directly and invokes `manifest.platformClientFactory(projectId).postComment(...)` — no per-PM-type literal branching anywhere on the dispatch surface. Both legacy call sites delegate. The PM manifest conformance harness gains a per-provider `dispatchPMAck reaches this provider without throwing` assertion, and a static-guard test pins "no `pmType === ''` branching" against all three call sites; adding a future PM provider to the registry lands the dispatch path for free. Genuinely-unknown PM types (configuration error: project pinned to a deleted provider) now log at ERROR + capture to Sentry under stable tag `pm_ack_unknown_pm_type` instead of a silent WARN. See [spec 017](docs/specs/017-router-silent-failure-hardening.md). - **Progress-comment lifecycle: post-agent cleanup hook now skips when an in-run gadget already deleted the comment** (spec 017, plan 3 of 3). The post-agent `deleteProgressCommentOnSuccess` hook used to read `sessionState.initialCommentId`, fall back to `result.agentInput.ackCommentId` when session state was empty, and issue a redundant DELETE — but "session state cleared by a gadget" was indistinguishable from "session state never populated", so the fallback fired and re-deleted comments that were already gone. GitHub returned 404 and `WARN: Failed to delete progress comment after agent success` was logged 72 times per day on cascade-router (live audit on 2026-04-29). Adds an explicit `initialCommentIdConsumed: boolean` flag on `SessionStateData`. Both `deleteInitialComment` (gadget-driven) and `clearInitialComment` (sidecar-driven) now set the flag to `true` after disposing of the comment. The post-agent hook checks the flag first and skips the entire deletion path — including the legacy `agentInput.ackCommentId` fallback — when consumed. As defense in depth, `githubClient.deletePRComment` now treats HTTP 404 as success (RFC-7231 idempotency) and logs at DEBUG instead of letting the error bubble as a WARN; other HTTP errors (5xx, 401, network) continue to throw. The legacy fallback to `agentInput.ackCommentId` continues to work for code paths that never populate session state. See [spec 017](docs/specs/017-router-silent-failure-hardening.md). - **PM image delivery: Linear GraphQL fixture + extraction-coverage regression test** (spec 016, plan 3 of 3). Captures a reconstructed Linear `Issue` GraphQL payload at `tests/fixtures/linear-issue-with-screenshot.json` containing extension-less and extensioned inline-pasted images (description + comment bodies) plus formal Attachment records (Slack/GitHub/Sentry link previews) that must NOT be mistaken for inline images. The unit test at `tests/unit/pm/linear/extraction-coverage.test.ts` pins the contract and fails loudly with a specific URL-missing message if Linear ever changes its payload shape in a way that loses inline images. Documents the conclusion in `src/integrations/README.md`: `Issue.description` markdown is canonical for Linear inline images; `Issue.attachments` is the wrong surface (formal Attachment records, not pastes). No production code change — this plan ships the regression net for the contract Plans 1+2 established. See [spec 016](docs/specs/016-pm-image-delivery-reliability.md). diff --git a/CLAUDE.md b/CLAUDE.md index 40ec0bd0..c821a20d 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -23,6 +23,8 @@ Three separate services, **no monolithic server mode**: Flow: `PM/SCM/alerting webhook → Router → Redis → Worker → TriggerRegistry → Agent → Code → PR`. +**Capacity-gate invariant.** Every PM router adapter (`src/router/adapters/{linear,trello,jira}.ts`) must wrap `triggerRegistry.dispatch(ctx)` in PM-provider `AsyncLocalStorage` scope via the shared `withPMScopeForDispatch(fullProject, dispatch)` helper at `src/router/adapters/_shared.ts` — in addition to the per-PM-type credential scope (`withLinearCredentials` / `withTrelloCredentials` / `withJiraCredentials`). Without the PM-provider wrapping, the pipeline-capacity gate at `src/triggers/shared/pipeline-capacity-gate.ts` cannot resolve `getPMProvider()`, **fails closed** under the spec-017 fail-closed policy (blocks the run + ERROR + Sentry capture under tag `pipeline_capacity_gate_no_pm_provider`), and `maxInFlightItems` is silently disabled for the PM-source path. Mirror the GitHub adapter's existing correct shape at `src/router/adapters/github.ts:dispatchWithCredentials`. The static guard at `tests/unit/integrations/pm-router-adapter-pm-scope.test.ts` enforces this at CI time — adding a new PM router adapter without the wrapping fails CI with a precise file path. + Integration abstraction lives in `src/integrations/`. For **adding a new PM provider**, see @src/integrations/README.md — PM providers (Trello, JIRA, Linear) use the `PMProviderManifest` registry with a **behavioral conformance harness** (spec 009 — config round-trip, discovery shape, full lifecycle scenario, auth-header provenance, single-entrypoint invariant). Each provider owns its Zod config schema (`src/integrations/pm//config-schema.ts`) as the single source of truth — the central `src/config/schema.ts` imports it. PM adapter method signatures use branded `StateId` / `LabelId` / `ContainerId` from `src/pm/ids.ts` to make state-name-vs-ID confusion a compile error at direct-adapter call sites. All runtime surfaces (router, worker, CLI, dashboard) register integrations through a single entrypoint at `src/integrations/entrypoint.ts`. **Spec 010 follow-ups** added generic `pm.discovery.createLabel` / `createCustomField` mutation endpoints + `currentUser` discovery capability + real shared React components for every `StandardStepKind` under `web/src/components/projects/pm-providers/steps/`. **Spec 011** migrated all three production providers (Trello, JIRA, Linear) onto those shared components, added a 7th `StandardStepKind: custom-field-mapping`, widened `container-pick` / `project-scope` / `webhook-url-display` with optional props, and deleted the three legacy `pm-wizard-{trello,jira,linear}-steps.tsx` files. **Spec 012** migrated each provider's webhook UX (programmatic create for Trello/JIRA, signing-secret + instructions for Linear) into per-provider manifest webhook adapters (Fragment compositions around the shared `WebhookUrlDisplayStep`); deleted the legacy `WebhookStep` + `LinearWebhookInfoPanel` + `useWebhookManagement` + `useLinearWebhookInfo`. Every PM wizard step now renders via the manifest path without exception. A new PM provider writes zero edits to shared orchestration (`pm-wizard.tsx`, `pm-wizard-common-steps.tsx`, `pm-wizard-hooks.ts`); provider-specific UI ships either as `kind: 'custom'` steps or as Fragment compositions inside the provider folder's wizard adapters. SCM (GitHub) and alerting (Sentry) still use the legacy `IntegrationModule` pattern via self-registration in `src/github/register.ts` + `src/sentry/register.ts`. Don't improvise; the README covers both patterns. ## PR checkout (worker) — gotcha diff --git a/docs/plans/017-router-silent-failure-hardening/2-capacity-gate-pm-scope.md.wip b/docs/plans/017-router-silent-failure-hardening/2-capacity-gate-pm-scope.md.done similarity index 91% rename from docs/plans/017-router-silent-failure-hardening/2-capacity-gate-pm-scope.md.wip rename to docs/plans/017-router-silent-failure-hardening/2-capacity-gate-pm-scope.md.done index 912403dd..781a743d 100644 --- a/docs/plans/017-router-silent-failure-hardening/2-capacity-gate-pm-scope.md.wip +++ b/docs/plans/017-router-silent-failure-hardening/2-capacity-gate-pm-scope.md.done @@ -6,7 +6,7 @@ plan_slug: capacity-gate-pm-scope level: plan parent_spec: docs/specs/017-router-silent-failure-hardening.md depends_on: [] -status: wip +status: done --- # 017/2: Pipeline-capacity-gate PM-provider scope @@ -210,14 +210,14 @@ n/a — all ACs auto-tested. ## Progress -- [ ] AC #1 -- [ ] AC #2 -- [ ] AC #3 -- [ ] AC #4 -- [ ] AC #5 -- [ ] AC #6 -- [ ] AC #7 -- [ ] AC #8 -- [ ] AC #9 -- [ ] AC #10 -- [ ] AC #11 +- [x] AC #1 — `withPMScopeForDispatch(project, dispatch)` exists at `src/router/adapters/_shared.ts`, resolves PM provider via `createPMProvider`, wraps in `withPMProvider` (verified by `tests/unit/router/adapters/with-pm-scope-for-dispatch.test.ts`'s 4 tests covering happy path, return passthrough, null result, error propagation). +- [x] AC #2 — Linear, Trello, JIRA router adapters each consume the shared helper inside `dispatchWithCredentials` (verified by per-adapter regression tests + the static guard). +- [x] AC #3 — `getPMProvider()` succeeds inside trigger handlers dispatched from any of the three PM router adapters (verified by the helper's positive-path test, which exercises `getPMProvider()` from inside the wrapped callback). +- [x] AC #4 — Capacity-gate fail-closed branch: ERROR log + Sentry tag `pipeline_capacity_gate_no_pm_provider` + `return true` (verified by `pipeline-capacity-gate.test.ts > FAILS CLOSED (blocks) when no PM provider scope is available`). +- [x] AC #5 — Positive path still returns `isActivePipelineOverCapacity`'s result; non-slot-consuming early-return preserved (verified by 2 new positive-path tests + the existing non-slot-consuming test). +- [x] AC #6 — Static guard `tests/unit/integrations/pm-router-adapter-pm-scope.test.ts` asserts each PM router adapter source contains either `withPMScopeForDispatch` or `withPMProvider`; CI fails when missing. +- [x] AC #7 — CLAUDE.md gains "Capacity-gate invariant" paragraph in the Architecture section. +- [x] AC #8 — All new/modified code has tests (+4 helper, +3 capacity-gate, +3 static guard, plus passthrough mocks added to 3 PM-source trigger tests + 3 PM router adapter tests). +- [x] AC #9 — `npm run typecheck` passes. +- [x] AC #10 — `npm test` (full unit suite) passes (474 files / 8677 tests, +9 from baseline after Plan 1). +- [x] AC #11 — `npm run lint` passes. diff --git a/src/router/adapters/_shared.ts b/src/router/adapters/_shared.ts new file mode 100644 index 00000000..72104d7d --- /dev/null +++ b/src/router/adapters/_shared.ts @@ -0,0 +1,42 @@ +/** + * Shared helpers for router platform adapters. + * + * Spec 017 / plan 2: PM router adapters (Linear/Trello/JIRA) wrap their + * `triggerRegistry.dispatch(ctx)` invocation in `withPMProvider` scope using + * the helper below. Without it, `getPMProvider()` calls inside trigger + * handlers — notably the pipeline-capacity gate at + * `src/triggers/shared/pipeline-capacity-gate.ts` — throw, the gate fails + * closed under spec 017's fail-closed policy, and Sentry captures under + * tag `pipeline_capacity_gate_no_pm_provider`. + * + * Mirrors the GitHub router adapter's existing correct shape at + * `src/router/adapters/github.ts:dispatchWithCredentials` which has wrapped + * dispatch in PM-provider scope since spec 006. + * + * The helper does NOT establish credential scope — that's each adapter's + * concern. PM-provider scope layers on TOP of the credential scope. + */ + +import { withPMProvider } from '../../pm/context.js'; +import { createPMProvider } from '../../pm/index.js'; +import type { ProjectConfig } from '../../types/index.js'; + +/** + * Wrap a dispatch callback in PM-provider AsyncLocalStorage scope so that + * `getPMProvider()` succeeds inside trigger handlers downstream. + * + * Resolves the PMProvider via `createPMProvider(project)` (the legacy + * compatibility adapter that delegates to the manifest registry's + * `pmIntegration.createProvider(project)`) and runs `dispatch` inside + * `withPMProvider(provider, dispatch)`. + * + * Returns whatever `dispatch` returns. Errors thrown by `dispatch` + * propagate unchanged. + */ +export function withPMScopeForDispatch( + project: ProjectConfig, + dispatch: () => Promise, +): Promise { + const provider = createPMProvider(project); + return withPMProvider(provider, dispatch); +} diff --git a/src/router/adapters/jira.ts b/src/router/adapters/jira.ts index 0e8c0544..feb4ff83 100644 --- a/src/router/adapters/jira.ts +++ b/src/router/adapters/jira.ts @@ -19,6 +19,7 @@ import type { AckResult, ParsedWebhookEvent, RouterPlatformAdapter } from '../pl import { resolveJiraCredentials } from '../platformClients/index.js'; import type { CascadeJob, JiraJob } from '../queue.js'; import { sendAcknowledgeReaction } from '../reactions.js'; +import { withPMScopeForDispatch } from './_shared.js'; const PROCESSABLE_EVENTS = [ 'jira:issue_updated', @@ -125,9 +126,12 @@ export class JiraRouterAdapter implements RouterPlatformAdapter { } const ctx: TriggerContext = { project: fullProject, source: 'jira', payload }; + // Wrap dispatch in BOTH credential scope AND PM-provider scope so that + // the pipeline-capacity gate at `src/triggers/shared/pipeline-capacity-gate.ts` + // can resolve `getPMProvider()`. See spec 017 plan 2. return withJiraCredentials( { email: jiraCreds.email, apiToken: jiraCreds.apiToken, baseUrl: jiraCreds.baseUrl }, - () => triggerRegistry.dispatch(ctx), + () => withPMScopeForDispatch(fullProject, () => triggerRegistry.dispatch(ctx)), ); } diff --git a/src/router/adapters/linear.ts b/src/router/adapters/linear.ts index dc0bb3ac..60f2e92f 100644 --- a/src/router/adapters/linear.ts +++ b/src/router/adapters/linear.ts @@ -19,6 +19,7 @@ import { loadProjectConfig, type RouterProjectConfig } from '../config.js'; import type { AckResult, ParsedWebhookEvent, RouterPlatformAdapter } from '../platform-adapter.js'; import { resolveLinearCredentials } from '../platformClients/index.js'; import type { CascadeJob, LinearJob } from '../queue.js'; +import { withPMScopeForDispatch } from './_shared.js'; // ============================================================================ // Processable event combinations (action/type) @@ -236,8 +237,12 @@ export class LinearRouterAdapter implements RouterPlatformAdapter { } const ctx: TriggerContext = { project: fullProject, source: 'linear', payload }; + // Wrap dispatch in BOTH credential scope AND PM-provider scope. + // The PM-provider scope is what the pipeline-capacity gate + // (src/triggers/shared/pipeline-capacity-gate.ts) needs to resolve + // `getPMProvider()`. Without it the gate fails closed (spec 017). return withLinearCredentials({ apiKey: linearCreds.apiKey }, () => - triggerRegistry.dispatch(ctx), + withPMScopeForDispatch(fullProject, () => triggerRegistry.dispatch(ctx)), ); } diff --git a/src/router/adapters/trello.ts b/src/router/adapters/trello.ts index 88a4b8f7..ad7678f2 100644 --- a/src/router/adapters/trello.ts +++ b/src/router/adapters/trello.ts @@ -25,6 +25,7 @@ import { isReadyToProcessLabelAdded, isSelfAuthoredTrelloComment, } from '../trello.js'; +import { withPMScopeForDispatch } from './_shared.js'; export class TrelloRouterAdapter implements RouterPlatformAdapter { readonly type = 'trello' as const; @@ -125,7 +126,12 @@ export class TrelloRouterAdapter implements RouterPlatformAdapter { } const ctx: TriggerContext = { project: fullProject, source: 'trello', payload }; - return withTrelloCredentials(trelloCreds, () => triggerRegistry.dispatch(ctx)); + // Wrap dispatch in BOTH credential scope AND PM-provider scope so that + // the pipeline-capacity gate at `src/triggers/shared/pipeline-capacity-gate.ts` + // can resolve `getPMProvider()`. See spec 017 plan 2. + return withTrelloCredentials(trelloCreds, () => + withPMScopeForDispatch(fullProject, () => triggerRegistry.dispatch(ctx)), + ); } async postAck( diff --git a/src/triggers/shared/pipeline-capacity-gate.ts b/src/triggers/shared/pipeline-capacity-gate.ts index a8138128..fef0295f 100644 --- a/src/triggers/shared/pipeline-capacity-gate.ts +++ b/src/triggers/shared/pipeline-capacity-gate.ts @@ -16,6 +16,7 @@ import { getPMProvider } from '../../pm/context.js'; import type { PMProvider } from '../../pm/types.js'; +import { captureException } from '../../sentry.js'; import type { ProjectConfig } from '../../types/index.js'; import { logger } from '../../utils/logging.js'; import { isActivePipelineOverCapacity } from './backlog-check.js'; @@ -34,14 +35,36 @@ export async function shouldBlockForPipelineCapacity(args: { try { provider = getPMProvider(); } catch (err) { - // No credential scope — conservative: allow. - logger.warn('pipeline-capacity-gate: PM provider unavailable, allowing run', { + // Spec 017 / plan 2: fail closed. + // + // Before plan 2, this branch logged WARN and returned `false` (allow) + // because the PM router adapters dispatched outside PM-provider scope + // — hitting this branch was the routine path for every PM + // `status-changed` trigger. After plan 2 wraps every PM router adapter + // in `withPMScopeForDispatch`, hitting this branch represents a real + // AsyncLocalStorage scope leak that operators need to investigate. + // Failing closed (block + error + Sentry) is preferable to silently + // failing open and re-introducing the original incident class + // (multiple concurrent implementation runs against a project pinned + // to `maxInFlightItems: 1`). + const error = err instanceof Error ? err : new Error(String(err)); + logger.error('pipeline-capacity-gate: PM provider unavailable, blocking run', { source: args.source, projectId: args.project.id, workItemId: args.workItemId, + agentType: args.agentType, error: String(err), }); - return false; + captureException(error, { + tags: { source: 'pipeline_capacity_gate_no_pm_provider' }, + extra: { + projectId: args.project.id, + workItemId: args.workItemId, + agentType: args.agentType, + triggerSource: args.source, + }, + }); + return true; } const result = await isActivePipelineOverCapacity(args.project, provider, { diff --git a/tests/unit/integrations/pm-router-adapter-pm-scope.test.ts b/tests/unit/integrations/pm-router-adapter-pm-scope.test.ts new file mode 100644 index 00000000..5c2ca667 --- /dev/null +++ b/tests/unit/integrations/pm-router-adapter-pm-scope.test.ts @@ -0,0 +1,58 @@ +/** + * Static guard: every PM router adapter must establish PM-provider AsyncLocalStorage + * scope around `triggerRegistry.dispatch(ctx)`. Without this, the pipeline-capacity + * gate at `src/triggers/shared/pipeline-capacity-gate.ts` cannot resolve the + * project's PM provider, fails closed (post-spec 017 plan 2), and Sentry captures + * under tag `pipeline_capacity_gate_no_pm_provider`. Live incident verified + * 2026-04-29: 32 occurrences/day on prod cascade-router. + * + * Compares each PM router adapter source file against the GitHub adapter's + * correct shape (which wraps in `withPMProvider` at + * `src/router/adapters/github.ts:dispatchWithCredentials`). A future PM router + * adapter that omits the wrapping fails this guard with a precise file path. + * + * Modeled on `tests/unit/triggers/trigger-event-consistency.test.ts` — + * static-grep style regression net. + */ +import { readFileSync } from 'node:fs'; +import { join } from 'node:path'; +import { describe, expect, it } from 'vitest'; + +const ROUTER_ADAPTERS_DIR = join(__dirname, '..', '..', '..', 'src', 'router', 'adapters'); + +// PM router adapters that drive PM `status-changed` triggers. Each must +// establish PM-provider scope before invoking `triggerRegistry.dispatch(ctx)`. +// Adding a new PM router adapter file here is part of the contract: the +// guard fails fast on unregistered adapters too if they end up dispatching +// without scope. +const PM_ROUTER_ADAPTER_FILES = ['linear.ts', 'trello.ts', 'jira.ts']; + +const ACCEPTABLE_WRAPPERS = ['withPMScopeForDispatch', 'withPMProvider']; + +describe('PM router adapter PM-provider scope (static guard)', () => { + for (const filename of PM_ROUTER_ADAPTER_FILES) { + it(`${filename} establishes PM-provider scope around trigger dispatch`, () => { + const path = join(ROUTER_ADAPTERS_DIR, filename); + const src = readFileSync(path, 'utf-8'); + + // Strip line comments and block comments to avoid false positives + // from doc references that mention the wrapper names. + const codeOnly = src + .split('\n') + .filter((line) => !line.trim().startsWith('//') && !line.trim().startsWith('*')) + .join('\n'); + + const found = ACCEPTABLE_WRAPPERS.find((wrapper) => codeOnly.includes(wrapper)); + + expect( + found, + `src/router/adapters/${filename} must wrap trigger dispatch in PM-provider AsyncLocalStorage scope. ` + + `Expected one of: ${ACCEPTABLE_WRAPPERS.join(', ')}. ` + + `Without this wrapping, the pipeline-capacity gate (` + + `src/triggers/shared/pipeline-capacity-gate.ts) cannot resolve the project's PM ` + + `provider, fails closed under the spec-017 fail-closed policy, and Sentry captures ` + + `under tag pipeline_capacity_gate_no_pm_provider.`, + ).toBeTruthy(); + }); + } +}); diff --git a/tests/unit/router/adapters/jira.test.ts b/tests/unit/router/adapters/jira.test.ts index 775c9459..1079678d 100644 --- a/tests/unit/router/adapters/jira.test.ts +++ b/tests/unit/router/adapters/jira.test.ts @@ -41,6 +41,11 @@ vi.mock('../../../../src/utils/runLink.js', () => ({ vi.mock('../../../../src/jira/client.js', () => ({ withJiraCredentials: vi.fn().mockImplementation((_creds: unknown, fn: () => unknown) => fn()), })); +// Spec 017 / plan 2: PM router adapters wrap dispatch in `withPMScopeForDispatch`. +// Mock as passthrough so the existing tests don't pull the real PM manifest registry. +vi.mock('../../../../src/router/adapters/_shared.js', () => ({ + withPMScopeForDispatch: vi.fn().mockImplementation((_p: unknown, fn: () => unknown) => fn()), +})); import { postJiraAck, resolveJiraBotAccountId } from '../../../../src/router/acknowledgments.js'; import { JiraRouterAdapter } from '../../../../src/router/adapters/jira.js'; diff --git a/tests/unit/router/adapters/linear.test.ts b/tests/unit/router/adapters/linear.test.ts index 1af3d623..14890141 100644 --- a/tests/unit/router/adapters/linear.test.ts +++ b/tests/unit/router/adapters/linear.test.ts @@ -32,6 +32,13 @@ vi.mock('../../../../src/utils/runLink.js', () => ({ buildWorkItemRunsLink: vi.fn().mockReturnValue(null), getDashboardUrl: vi.fn().mockReturnValue(null), })); +// Spec 017 / plan 2: PM router adapters wrap dispatch in `withPMScopeForDispatch` +// (PM-provider AsyncLocalStorage scope). Mock the helper as passthrough so the +// existing tests don't pull the real PM manifest registry into the assertion. +vi.mock('../../../../src/router/adapters/_shared.js', () => ({ + withPMScopeForDispatch: vi.fn().mockImplementation((_p: unknown, fn: () => unknown) => fn()), +})); + vi.mock('../../../../src/linear/client.js', () => ({ linearClient: { getIssueProjectId: vi.fn().mockResolvedValue(null), diff --git a/tests/unit/router/adapters/trello.test.ts b/tests/unit/router/adapters/trello.test.ts index 57e2945a..d3a93d82 100644 --- a/tests/unit/router/adapters/trello.test.ts +++ b/tests/unit/router/adapters/trello.test.ts @@ -36,6 +36,11 @@ vi.mock('../../../../src/utils/runLink.js', () => ({ vi.mock('../../../../src/trello/client.js', () => ({ withTrelloCredentials: vi.fn().mockImplementation((_creds: unknown, fn: () => unknown) => fn()), })); +// Spec 017 / plan 2: PM router adapters wrap dispatch in `withPMScopeForDispatch`. +// Mock as passthrough so the existing tests don't pull the real PM manifest registry. +vi.mock('../../../../src/router/adapters/_shared.js', () => ({ + withPMScopeForDispatch: vi.fn().mockImplementation((_p: unknown, fn: () => unknown) => fn()), +})); vi.mock('../../../../src/router/trello.js', () => ({ isAgentLogFilename: vi.fn().mockReturnValue(false), isAgentLogAttachmentUploaded: vi.fn().mockReturnValue(false), diff --git a/tests/unit/router/adapters/with-pm-scope-for-dispatch.test.ts b/tests/unit/router/adapters/with-pm-scope-for-dispatch.test.ts new file mode 100644 index 00000000..57822d7f --- /dev/null +++ b/tests/unit/router/adapters/with-pm-scope-for-dispatch.test.ts @@ -0,0 +1,86 @@ +/** + * Tests for the shared `withPMScopeForDispatch` helper at + * `src/router/adapters/_shared.ts`. PM router adapters (Linear/Trello/JIRA) + * call this helper to wrap `triggerRegistry.dispatch(ctx)` in PM-provider + * AsyncLocalStorage scope, mirroring the GitHub adapter's existing shape at + * `src/router/adapters/github.ts:withPMProvider(pmProvider, ...)`. + * + * Without this wrapping, `getPMProvider()` calls inside trigger handlers — + * notably the pipeline-capacity gate at + * `src/triggers/shared/pipeline-capacity-gate.ts` — throw, the gate falls + * through to its conservative branch, and the in-flight cap is silently + * disabled for every PM `status-changed` trigger. Verified live on + * 2026-04-29 (32 occurrences/day in prod cascade-router). + */ +import { beforeEach, describe, expect, it, vi } from 'vitest'; + +const { mockCreatePMProvider } = vi.hoisted(() => ({ + mockCreatePMProvider: vi.fn(), +})); + +// Mock the legacy createPMProvider compatibility adapter — it's the function +// used elsewhere in the router (see github.ts:262) to materialize a PMProvider +// for use inside withPMProvider(). +vi.mock('../../../../src/pm/index.js', async (importOriginal) => { + const actual = await importOriginal(); + return { + ...actual, + createPMProvider: mockCreatePMProvider, + }; +}); + +import { getPMProvider } from '../../../../src/pm/context.js'; +import { withPMScopeForDispatch } from '../../../../src/router/adapters/_shared.js'; +import type { ProjectConfig } from '../../../../src/types/index.js'; + +const fakeProject: ProjectConfig = { + id: 'proj-1', + repo: 'org/repo', + pm: { type: 'linear' }, +} as ProjectConfig; + +describe('withPMScopeForDispatch', () => { + beforeEach(() => { + vi.resetAllMocks(); + }); + + it('resolves the project PM provider via createPMProvider and runs dispatch inside withPMProvider scope', async () => { + const fakeProvider = { type: 'linear', __marker: 'fake' }; + mockCreatePMProvider.mockReturnValue(fakeProvider); + + const innerSawProvider = await withPMScopeForDispatch(fakeProject, async () => { + // getPMProvider() must succeed here; returns the same provider instance. + return getPMProvider(); + }); + + expect(mockCreatePMProvider).toHaveBeenCalledWith(fakeProject); + expect(innerSawProvider).toBe(fakeProvider); + }); + + it('returns whatever the dispatch callback returns (preserves TriggerResult passthrough)', async () => { + mockCreatePMProvider.mockReturnValue({ type: 'linear' }); + const expectedResult = { agentType: 'review', agentInput: {} }; + + const result = await withPMScopeForDispatch(fakeProject, async () => expectedResult); + + expect(result).toBe(expectedResult); + }); + + it('returns null when the dispatch callback returns null', async () => { + mockCreatePMProvider.mockReturnValue({ type: 'linear' }); + + const result = await withPMScopeForDispatch(fakeProject, async () => null); + + expect(result).toBeNull(); + }); + + it('propagates errors thrown by the dispatch callback (does not swallow)', async () => { + mockCreatePMProvider.mockReturnValue({ type: 'linear' }); + + await expect( + withPMScopeForDispatch(fakeProject, async () => { + throw new Error('dispatch boom'); + }), + ).rejects.toThrow('dispatch boom'); + }); +}); diff --git a/tests/unit/triggers/jira-status-changed.test.ts b/tests/unit/triggers/jira-status-changed.test.ts index 7fca6c4a..ebee2e9a 100644 --- a/tests/unit/triggers/jira-status-changed.test.ts +++ b/tests/unit/triggers/jira-status-changed.test.ts @@ -10,6 +10,13 @@ vi.mock('../../../src/utils/logging.js', () => ({ logger: mockLogger })); vi.mock('../../../src/triggers/config-resolver.js', () => mockConfigResolverModule); vi.mock('../../../src/triggers/shared/trigger-check.js', () => mockTriggerCheckModule); +// Spec 017 / plan 2: the capacity gate is now fail-closed when no +// PM-provider AsyncLocalStorage scope is in effect (the case in these +// unit tests). Mock as passthrough so trigger-logic assertions still run. +vi.mock('../../../src/triggers/shared/pipeline-capacity-gate.js', () => ({ + shouldBlockForPipelineCapacity: vi.fn().mockResolvedValue(false), +})); + import { JiraStatusChangedTrigger } from '../../../src/triggers/jira/status-changed.js'; import { checkTriggerEnabledWithParams } from '../../../src/triggers/shared/trigger-check.js'; import type { TriggerContext } from '../../../src/triggers/types.js'; diff --git a/tests/unit/triggers/linear-status-changed.test.ts b/tests/unit/triggers/linear-status-changed.test.ts index 6393b5ed..afeb3e83 100644 --- a/tests/unit/triggers/linear-status-changed.test.ts +++ b/tests/unit/triggers/linear-status-changed.test.ts @@ -4,6 +4,14 @@ import { mockLogger, mockTriggerCheckModule } from '../../helpers/sharedMocks.js vi.mock('../../../src/utils/logging.js', () => ({ logger: mockLogger })); vi.mock('../../../src/triggers/shared/trigger-check.js', () => mockTriggerCheckModule); +// Spec 017 / plan 2: the capacity gate is now fail-closed when no +// PM-provider AsyncLocalStorage scope is in effect (which is the case in +// these unit tests). Mock the gate to return false (don't block) so the +// trigger-logic assertions still exercise the paths under test. +vi.mock('../../../src/triggers/shared/pipeline-capacity-gate.js', () => ({ + shouldBlockForPipelineCapacity: vi.fn().mockResolvedValue(false), +})); + const mockGetLinearConfig = vi.fn(); vi.mock('../../../src/pm/config.js', () => ({ getLinearConfig: (...args: unknown[]) => mockGetLinearConfig(...args), diff --git a/tests/unit/triggers/shared/pipeline-capacity-gate.test.ts b/tests/unit/triggers/shared/pipeline-capacity-gate.test.ts index 59502328..e42c2f22 100644 --- a/tests/unit/triggers/shared/pipeline-capacity-gate.test.ts +++ b/tests/unit/triggers/shared/pipeline-capacity-gate.test.ts @@ -1,10 +1,12 @@ -import { describe, expect, it, vi } from 'vitest'; +import { beforeEach, describe, expect, it, vi } from 'vitest'; -const { mockGetPMProvider, mockIsActivePipelineOverCapacity, mockLogger } = vi.hoisted(() => ({ - mockGetPMProvider: vi.fn(), - mockIsActivePipelineOverCapacity: vi.fn(), - mockLogger: { info: vi.fn(), warn: vi.fn(), debug: vi.fn(), error: vi.fn() }, -})); +const { mockGetPMProvider, mockIsActivePipelineOverCapacity, mockLogger, mockCaptureException } = + vi.hoisted(() => ({ + mockGetPMProvider: vi.fn(), + mockIsActivePipelineOverCapacity: vi.fn(), + mockLogger: { info: vi.fn(), warn: vi.fn(), debug: vi.fn(), error: vi.fn() }, + mockCaptureException: vi.fn(), + })); vi.mock('../../../../src/pm/context.js', () => ({ getPMProvider: mockGetPMProvider, @@ -18,11 +20,19 @@ vi.mock('../../../../src/utils/logging.js', () => ({ logger: mockLogger, })); +vi.mock('../../../../src/sentry.js', () => ({ + captureException: mockCaptureException, +})); + import { shouldBlockForPipelineCapacity } from '../../../../src/triggers/shared/pipeline-capacity-gate.js'; import { createMockProject } from '../../../helpers/factories.js'; const project = createMockProject({ maxInFlightItems: 1 }); +beforeEach(() => { + vi.resetAllMocks(); +}); + describe('shouldBlockForPipelineCapacity', () => { it('does not gate non-slot-consuming agent types (review, planning, splitting, backlog-manager)', async () => { for (const agentType of ['review', 'planning', 'splitting', 'backlog-manager', 'debug']) { @@ -90,9 +100,17 @@ describe('shouldBlockForPipelineCapacity', () => { expect(blocked).toBe(false); }); - it('allows (conservatively) when no PM provider scope is available', async () => { + it('FAILS CLOSED (blocks) when no PM provider scope is available; logs ERROR and captures Sentry under tag pipeline_capacity_gate_no_pm_provider', async () => { + // Spec 017 / plan 2: this branch used to log WARN and return false + // (allow). After plan 2 wraps every PM router adapter in PM-provider + // scope, hitting this branch on the routine path is no longer + // expected — it represents a real AsyncLocalStorage scope leak that + // operators need to investigate. Failing closed (block + error + + // Sentry) is preferable to silently failing open and re-introducing + // the original incident class (3+ concurrent implementation runs + // against a `maxInFlightItems: 1` project). mockGetPMProvider.mockImplementation(() => { - throw new Error('no scope'); + throw new Error('No PMProvider in scope'); }); const blocked = await shouldBlockForPipelineCapacity({ @@ -102,10 +120,68 @@ describe('shouldBlockForPipelineCapacity', () => { source: 'jira', }); - expect(blocked).toBe(false); - expect(mockLogger.warn).toHaveBeenCalledWith( - 'pipeline-capacity-gate: PM provider unavailable, allowing run', - expect.objectContaining({ workItemId: 'UA-3' }), + expect(blocked).toBe(true); + expect(mockLogger.error).toHaveBeenCalledWith( + expect.stringMatching(/pipeline-capacity-gate: PM provider unavailable/), + expect.objectContaining({ workItemId: 'UA-3', source: 'jira' }), + ); + expect(mockLogger.warn).not.toHaveBeenCalled(); + expect(mockCaptureException).toHaveBeenCalledWith( + expect.any(Error), + expect.objectContaining({ + tags: expect.objectContaining({ source: 'pipeline_capacity_gate_no_pm_provider' }), + extra: expect.objectContaining({ + projectId: expect.any(String), + workItemId: 'UA-3', + triggerSource: 'jira', + agentType: 'implementation', + }), + }), ); }); + + it('positive path still works after fail-closed conversion: provider in scope and pipeline-over-capacity returns true', async () => { + // Regression pin against the over-capacity branch breaking during the + // fail-closed migration. This duplicates an earlier test's positive + // assertion explicitly to ensure plan 2 doesn't accidentally short- + // circuit the routine path. + mockGetPMProvider.mockReturnValue({ type: 'jira' }); + mockIsActivePipelineOverCapacity.mockResolvedValue({ + overCapacity: true, + reason: 'over-capacity', + inFlightCount: 2, + limit: 1, + }); + + const blocked = await shouldBlockForPipelineCapacity({ + project, + agentType: 'implementation', + workItemId: 'UA-4', + source: 'jira', + }); + + expect(blocked).toBe(true); + expect(mockCaptureException).not.toHaveBeenCalled(); + }); + + it('positive path: provider in scope, pipeline below capacity returns false', async () => { + // Companion regression pin to the over-capacity test above. + mockGetPMProvider.mockReturnValue({ type: 'jira' }); + mockIsActivePipelineOverCapacity.mockResolvedValue({ + overCapacity: false, + reason: undefined, + inFlightCount: 0, + limit: 1, + }); + + const blocked = await shouldBlockForPipelineCapacity({ + project, + agentType: 'implementation', + workItemId: 'UA-5', + source: 'jira', + }); + + expect(blocked).toBe(false); + expect(mockCaptureException).not.toHaveBeenCalled(); + }); }); diff --git a/tests/unit/triggers/status-changed.test.ts b/tests/unit/triggers/status-changed.test.ts index 406fbd18..f4104f53 100644 --- a/tests/unit/triggers/status-changed.test.ts +++ b/tests/unit/triggers/status-changed.test.ts @@ -15,6 +15,13 @@ vi.mock('../../../src/utils/logging.js', () => ({ logger: mockLogger })); vi.mock('../../../src/triggers/config-resolver.js', () => mockConfigResolverModule); vi.mock('../../../src/triggers/shared/trigger-check.js', () => mockTriggerCheckModule); +// Spec 017 / plan 2: the capacity gate is now fail-closed when no +// PM-provider AsyncLocalStorage scope is in effect (the case in these +// unit tests). Mock as passthrough so trigger-logic assertions still run. +vi.mock('../../../src/triggers/shared/pipeline-capacity-gate.js', () => ({ + shouldBlockForPipelineCapacity: vi.fn().mockResolvedValue(false), +})); + // Mocks required for PM integration registration (pm/index.js side-effect) vi.mock('../../../src/config/provider.js', () => mockConfigProvider); vi.mock('../../../src/trello/client.js', () => mockTrelloClientModule); From b03fb8e4be956bd0aeadc800bb4b8ce8f189b7e4 Mon Sep 17 00:00:00 2001 From: Zbigniew Sobiecki Date: Wed, 29 Apr 2026 10:59:53 +0000 Subject: [PATCH 3/4] =?UTF-8?q?chore(spec):=20017=20done=20=E2=80=94=20all?= =?UTF-8?q?=20three=20plans=20complete=20(router-side=20silent-failure=20h?= =?UTF-8?q?ardening)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- ...hardening.md => 017-router-silent-failure-hardening.md.done} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename docs/specs/{017-router-silent-failure-hardening.md => 017-router-silent-failure-hardening.md.done} (99%) diff --git a/docs/specs/017-router-silent-failure-hardening.md b/docs/specs/017-router-silent-failure-hardening.md.done similarity index 99% rename from docs/specs/017-router-silent-failure-hardening.md rename to docs/specs/017-router-silent-failure-hardening.md.done index a543a43a..9619ff83 100644 --- a/docs/specs/017-router-silent-failure-hardening.md +++ b/docs/specs/017-router-silent-failure-hardening.md.done @@ -4,7 +4,7 @@ slug: router-silent-failure-hardening level: spec title: Router-side silent-failure hardening created: 2026-04-29 -status: draft +status: done --- # 017: Router-side silent-failure hardening From a080cd11a6dc17b1189e433941af286eda71585e Mon Sep 17 00:00:00 2001 From: Zbigniew Sobiecki Date: Wed, 29 Apr 2026 11:24:56 +0000 Subject: [PATCH 4/4] fix(integration-tests): wrap implementation-trigger handle() in withPMProvider scope + plan 2 polish MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit CI integration-tests caught a regression I missed locally because lefthook's pre-push only runs `npm run test:fast` (unit-only), not the full integration suite. Four integration tests asserted `result?.agentType === 'implementation'` but received `undefined` after plan 2's fail-closed flip, because they call real trigger `.handle(ctx)` directly without establishing PM-provider scope — the same class of breakage I fixed in unit tests with vi.mock passthroughs. Failing tests: - tests/integration/trigger-registry.test.ts:323 (TrelloStatusChangedTodoTrigger.handle) - tests/integration/trigger-registry.test.ts:591 (registry.dispatch with implementation trigger) - tests/integration/pm-provider-switching.test.ts:258 (Trello implementation dispatch) - tests/integration/pm-provider-switching.test.ts:293 (JIRA implementation dispatch) Integration tests don't mock the gate (by design), so the right fix is to mirror production wrapping: each call site now wraps in `withPMProvider(createPMProvider(project), () => trigger.handle(ctx))` — which is exactly what `withPMScopeForDispatch` does in the PM router adapters. The synthetic in-test trigger at trigger-registry.test.ts:117 doesn't call the gate (it returns a hard-coded TriggerResult), so no scope wrap needed there. Same for any non-implementation agent (planning / splitting / etc.) since `SLOT_CONSUMING_AGENTS` only contains 'implementation'. Also applies polish from local-review on the spec 017 work: 1. Reworded stale comment in `src/gadgets/sessionState.ts:186` — the comment claimed clearing `initialCommentId` made the post-agent callback "see null and short-circuit", but post-plan-3 the callback's actual gate is the `initialCommentIdConsumed` flag set on the line below. Comment now describes both the eager-clear AND the consumed-flag gate together. 2. Trimmed redundant 3-line wrapping comments to zero lines across the three PM router adapters (`linear.ts`, `trello.ts`, `jira.ts`). The shared helper's name (`withPMScopeForDispatch`) is self-documenting and CLAUDE.md's "Capacity-gate invariant" paragraph carries the rationale. Per CLAUDE.md "Default to writing no comments" guidance. 3. Tightened conformance harness assertion at `pm-conformance.test.ts`'s `dispatchPMAck reaches this provider without throwing` test. Previously the only `expect(...)` was conditional (`if (result !== undefined)`), so when result was undefined the test passed with zero assertions executed — a "tests that would still pass if the code under test were deleted" anti-pattern. Now uses `expect.assertions(1)` and asserts the result is either undefined OR has the expected AckResult shape, with a clear failure message naming the provider. Co-Authored-By: Claude Opus 4.7 (1M context) --- src/gadgets/sessionState.ts | 12 +++--- src/router/adapters/jira.ts | 3 -- src/router/adapters/linear.ts | 4 -- src/router/adapters/trello.ts | 3 -- .../integration/pm-provider-switching.test.ts | 13 ++++++- tests/integration/trigger-registry.test.ts | 38 ++++++++++++++----- .../unit/integrations/pm-conformance.test.ts | 32 ++++++++-------- 7 files changed, 63 insertions(+), 42 deletions(-) diff --git a/src/gadgets/sessionState.ts b/src/gadgets/sessionState.ts index bb17390d..a43c31fb 100644 --- a/src/gadgets/sessionState.ts +++ b/src/gadgets/sessionState.ts @@ -183,16 +183,18 @@ export class SessionState { const commentId = this.state.initialCommentId; if (!commentId) return; - // Clear state first so the post-agent callback sees null and short-circuits + // Clear the id eagerly so concurrent reads can't observe a stale value. + // The post-agent callback's actual gate is the `initialCommentIdConsumed` + // flag set below — once that's true, the callback's legacy fallback to + // `agentInput.ackCommentId` is also short-circuited. this.state.initialCommentId = null; try { const { githubClient } = await import('../github/client.js'); await githubClient.deletePRComment(owner, repo, commentId); - // Mark consumed so the post-agent callback's legacy fallback to - // `agentInput.ackCommentId` does not re-issue a DELETE for the - // same id. `deletePRComment` swallows 404 internally, so reaching - // here without throwing covers both 200/204 and 404 outcomes. + // `deletePRComment` swallows 404 internally, so reaching here without + // throwing covers both 200/204 (we deleted) and 404 (someone else + // already did) outcomes — both mean the comment is gone. this.state.initialCommentIdConsumed = true; } catch { // Best-effort: restore the id so post-agent callback can retry. diff --git a/src/router/adapters/jira.ts b/src/router/adapters/jira.ts index feb4ff83..91c7eea4 100644 --- a/src/router/adapters/jira.ts +++ b/src/router/adapters/jira.ts @@ -126,9 +126,6 @@ export class JiraRouterAdapter implements RouterPlatformAdapter { } const ctx: TriggerContext = { project: fullProject, source: 'jira', payload }; - // Wrap dispatch in BOTH credential scope AND PM-provider scope so that - // the pipeline-capacity gate at `src/triggers/shared/pipeline-capacity-gate.ts` - // can resolve `getPMProvider()`. See spec 017 plan 2. return withJiraCredentials( { email: jiraCreds.email, apiToken: jiraCreds.apiToken, baseUrl: jiraCreds.baseUrl }, () => withPMScopeForDispatch(fullProject, () => triggerRegistry.dispatch(ctx)), diff --git a/src/router/adapters/linear.ts b/src/router/adapters/linear.ts index 60f2e92f..e80f68d3 100644 --- a/src/router/adapters/linear.ts +++ b/src/router/adapters/linear.ts @@ -237,10 +237,6 @@ export class LinearRouterAdapter implements RouterPlatformAdapter { } const ctx: TriggerContext = { project: fullProject, source: 'linear', payload }; - // Wrap dispatch in BOTH credential scope AND PM-provider scope. - // The PM-provider scope is what the pipeline-capacity gate - // (src/triggers/shared/pipeline-capacity-gate.ts) needs to resolve - // `getPMProvider()`. Without it the gate fails closed (spec 017). return withLinearCredentials({ apiKey: linearCreds.apiKey }, () => withPMScopeForDispatch(fullProject, () => triggerRegistry.dispatch(ctx)), ); diff --git a/src/router/adapters/trello.ts b/src/router/adapters/trello.ts index ad7678f2..35b92342 100644 --- a/src/router/adapters/trello.ts +++ b/src/router/adapters/trello.ts @@ -126,9 +126,6 @@ export class TrelloRouterAdapter implements RouterPlatformAdapter { } const ctx: TriggerContext = { project: fullProject, source: 'trello', payload }; - // Wrap dispatch in BOTH credential scope AND PM-provider scope so that - // the pipeline-capacity gate at `src/triggers/shared/pipeline-capacity-gate.ts` - // can resolve `getPMProvider()`. See spec 017 plan 2. return withTrelloCredentials(trelloCreds, () => withPMScopeForDispatch(fullProject, () => triggerRegistry.dispatch(ctx)), ); diff --git a/tests/integration/pm-provider-switching.test.ts b/tests/integration/pm-provider-switching.test.ts index 3c119c9a..ee6906cf 100644 --- a/tests/integration/pm-provider-switching.test.ts +++ b/tests/integration/pm-provider-switching.test.ts @@ -19,6 +19,7 @@ import { getIntegrationByProjectAndCategory, upsertProjectIntegration, } from '../../src/db/repositories/settingsRepository.js'; +import { withPMProvider } from '../../src/pm/context.js'; import { createPMProvider } from '../../src/pm/index.js'; import { pmRegistry } from '../../src/pm/registry.js'; import { JiraStatusChangedTrigger } from '../../src/triggers/jira/status-changed.js'; @@ -254,7 +255,12 @@ describe('PM Provider Switching (integration)', () => { }; expect(TrelloStatusChangedTodoTrigger.matches(ctx)).toBe(true); - const result = await TrelloStatusChangedTodoTrigger.handle(ctx); + // Spec 017 / plan 2: capacity gate is fail-closed without PM scope. + // Mirror production wrapping (`withPMScopeForDispatch`) so the + // implementation-gating branch finds a provider and proceeds. + const result = await withPMProvider(createPMProvider(assertFound(project)), () => + TrelloStatusChangedTodoTrigger.handle(ctx), + ); expect(result?.agentType).toBe('implementation'); }); }); @@ -289,7 +295,10 @@ describe('PM Provider Switching (integration)', () => { }; expect(trigger.matches(ctx)).toBe(true); - const result = await trigger.handle(ctx); + // Spec 017 / plan 2: PM scope wrap so the capacity gate finds a provider. + const result = await withPMProvider(createPMProvider(assertFound(project)), () => + trigger.handle(ctx), + ); expect(result?.agentType).toBe('implementation'); expect(result?.workItemId).toBe('IMPL-1'); }); diff --git a/tests/integration/trigger-registry.test.ts b/tests/integration/trigger-registry.test.ts index 09bc2a56..2f2aba96 100644 --- a/tests/integration/trigger-registry.test.ts +++ b/tests/integration/trigger-registry.test.ts @@ -6,10 +6,17 @@ */ import { beforeAll, beforeEach, describe, expect, it } from 'vitest'; + +// Bootstrap PM manifest registry so createPMProvider can resolve providers +// inside withPMProvider scope wraps below. See spec 017 / plan 2. +import '../../src/integrations/pm/index.js'; + import { findProjectByBoardIdFromDb, findProjectByRepoFromDb, } from '../../src/db/repositories/configRepository.js'; +import { withPMProvider } from '../../src/pm/context.js'; +import { createPMProvider } from '../../src/pm/index.js'; import { createTriggerRegistry } from '../../src/triggers/registry.js'; import { ReadyToProcessLabelTrigger } from '../../src/triggers/trello/label-added.js'; import { @@ -319,7 +326,14 @@ describe('Trigger Registry (integration)', () => { }), }; - const result = await TrelloStatusChangedTodoTrigger.handle(ctx); + // Spec 017 / plan 2: the pipeline-capacity gate is fail-closed when no + // PM-provider AsyncLocalStorage scope is in effect. In production the + // PM router adapters wrap dispatch in `withPMScopeForDispatch`; this + // integration test mirrors that wrapping so the gate's + // implementation-only branch finds a provider and proceeds. + const result = await withPMProvider(createPMProvider(assertFound(project)), () => + TrelloStatusChangedTodoTrigger.handle(ctx), + ); expect(result?.agentType).toBe('implementation'); expect(result?.agentInput.workItemId).toBe('card-xyz'); expect(result?.workItemId).toBe('card-xyz'); @@ -578,16 +592,20 @@ describe('Trigger Registry (integration)', () => { const project = await findProjectByBoardIdFromDb('board-123'); expect(project).toBeDefined(); - // Move to todo — should trigger implementation - const todoResult = await registry.dispatch({ - project: assertFound(project), - source: 'trello', - payload: makeTrelloCardMovedPayload({ - cardId: 'card-todo', - listAfterId: 'list-todo-123', - listBeforeId: 'list-plan-456', + // Move to todo — should trigger implementation. Spec 017 / plan 2: + // wrap in PM-provider scope so the capacity gate (fail-closed) finds + // a provider; mirrors production router-adapter wrapping. + const todoResult = await withPMProvider(createPMProvider(assertFound(project)), () => + registry.dispatch({ + project: assertFound(project), + source: 'trello', + payload: makeTrelloCardMovedPayload({ + cardId: 'card-todo', + listAfterId: 'list-todo-123', + listBeforeId: 'list-plan-456', + }), }), - }); + ); expect(todoResult?.agentType).toBe('implementation'); // Move to splitting — should trigger splitting diff --git a/tests/unit/integrations/pm-conformance.test.ts b/tests/unit/integrations/pm-conformance.test.ts index 5b5534b7..ef2db5a9 100644 --- a/tests/unit/integrations/pm-conformance.test.ts +++ b/tests/unit/integrations/pm-conformance.test.ts @@ -162,14 +162,15 @@ describe('PM provider conformance (every registered provider)', () => { // PM-side ack because the github router adapter's local `postPMAck` // helper had only Trello + JIRA branches (no `linear`). // - // Strategy: dispatch against the real registry. `postComment` may - // resolve credentials and return null when none are configured (the - // expected outcome in this test environment). What is NOT acceptable - // is the dispatch helper throwing or hitting its unknown-PM-type - // Sentry path — both indicate the provider is unreachable from the - // dispatch surface. We therefore assert the result is either undefined - // or has the AckResult shape, and that no exception propagates. + // Strategy: dispatch against the real registry and assert no exception + // propagates. `postComment` may legitimately return null in this test + // environment when credentials aren't seeded, so the result can be + // either `undefined` (null id) or an AckResult-shaped object — both + // outcomes prove the provider is reachable. What's NOT acceptable is + // dispatch throwing or hitting its unknown-PM-type Sentry path; both + // indicate the provider is unreachable from the dispatch surface. it('dispatchPMAck reaches this provider without throwing', async () => { + expect.assertions(1); const { dispatchPMAck } = await import('../../../src/router/pm-ack-dispatch.js'); const result = await dispatchPMAck({ @@ -180,14 +181,15 @@ describe('PM provider conformance (every registered provider)', () => { agentType: 'backlog-manager', }); - if (result !== undefined) { - expect(result).toEqual( - expect.objectContaining({ - commentId: expect.anything(), - message: 'conformance check', - }), - ); - } + expect( + result === undefined || + (typeof result === 'object' && + result !== null && + result.message === 'conformance check' && + result.commentId !== undefined), + `dispatchPMAck must reach provider '${id}' and return undefined or a well-shaped AckResult; ` + + `got ${JSON.stringify(result)}`, + ).toBe(true); }); it('pmIntegration is wired (type matches id)', () => {