Deploy: include Trello webhook error body #929
Merged
zbigniewsobiecki merged 117 commits into main, Mar 17, 2026
…oads Read raw text first via c.req.text() then parse URLSearchParams, so the HMAC signature can be computed over the exact bytes GitHub sent. Also update PR description to accurately reflect this PR adds the plumbing infrastructure (rawBody threading, verifySignature callback) rather than the actual signature verification wiring. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
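A minimal sketch of the "read raw text first, then parse" idea described above. The helper name `parseWebhookBody` and the `ParsedWebhook` shape are hypothetical, not taken from the PR; in a handler the raw text would come from `c.req.text()` as the commit message states.

```typescript
// Hypothetical result shape: the untouched raw text plus the decoded payload.
interface ParsedWebhook {
  rawBody: string; // exact text received, kept for later HMAC verification
  payload: unknown; // decoded JSON payload
}

// Hypothetical helper: parse the body without ever re-serializing it, so the
// signature can later be computed over what was actually sent.
function parseWebhookBody(rawBody: string, contentType: string): ParsedWebhook {
  if (contentType.indexOf("application/x-www-form-urlencoded") !== -1) {
    // Form delivery wraps the JSON document in a single `payload` field.
    const encoded = new URLSearchParams(rawBody).get("payload");
    if (encoded === null) {
      throw new Error("form-encoded webhook is missing the payload field");
    }
    return { rawBody, payload: JSON.parse(encoded) };
  }
  // JSON delivery: the raw text is the JSON document itself.
  return { rawBody, payload: JSON.parse(rawBody) };
}
```

The point of returning `rawBody` alongside `payload` is that downstream verification never has to reconstruct the body from the parsed object, which would not be guaranteed to match byte for byte.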
… selection guide Expand the LLM API keys section to cover both authentication paths for each supported engine (API key vs subscription auth), and add a new "Choose Agent Engine" section so users know how to switch between claude-code, codex, opencode, and llmist backends. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…dex-auth docs(getting-started): add Codex engine auth and agent engine selection
…ners (#828) Worker containers run setup.sh which already installs local PostgreSQL and creates cascade_test, but never wrote TEST_DATABASE_URL to .cascade/env. resolveTestDbUrl() already reads that file as its second fallback, so integration tests would fall through to the Docker Compose path (which fails with 'docker: not found' in containers). Also fixes sed -i portability: macOS requires 'sed -i ''' while Linux uses 'sed -i'. Replaces the file-existence guard with a 'touch' to ensure the file exists before the sed call. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
#825) * feat(tests): add shared mock factories and migrate heaviest test files * fix(tests): align shared github mock contract --------- Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Cascade Bot <bot@cascade.dev>
Three changes to prevent agent confusion when running tests inside worker containers:

1. **CLAUDE.md**: Document correct test commands (npm test, test:unit, test:integration, test:all) and add a warning against `npm test -- --project integration`, which adds rather than replaces the unit project flags.
2. **beforeAll(truncateAll)**: Add file-level truncation to all 6 top-level integration test files so each file starts from a known-clean state regardless of what previous test files left in the DB.
3. **withTestTransaction helper**: Implement a correct transaction-rollback pattern for integration tests. Adds `_setTestDb` hook to getDb() so the active transaction can be injected, and exports `withTestTransaction` from tests/integration/helpers/db.ts for future use without re-inventing it.

New unit tests cover _setTestDb/getDb override and withTestTransaction lifecycle (rollback-on-success, error propagation, _setTestDb cleanup in finally).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
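The transaction-rollback pattern from item 3 can be sketched roughly as follows. This is an illustration under assumptions, not the helper in tests/integration/helpers/db.ts: the `Db` interface is a stand-in for the project's database client, `getDb` takes an explicit fallback for simplicity, and the rollback is forced by throwing a sentinel inside the transaction.

```typescript
// Stand-in for the real database client (assumption for this sketch).
interface Db {
  transaction<T>(fn: (tx: Db) => Promise<T>): Promise<T>;
}

let testDb: Db | null = null;

// Test-only hook: inject (or clear) the active transaction.
function _setTestDb(db: Db | null): void {
  testDb = db;
}

// Returns the injected transaction when one is active, else the fallback.
function getDb(fallback: Db): Db {
  return testDb !== null ? testDb : fallback;
}

// Sentinel thrown on purpose so the transaction always rolls back.
const ROLLBACK = new Error("rollback test transaction");

async function withTestTransaction<T>(db: Db, body: (tx: Db) => Promise<T>): Promise<T> {
  let result: T | undefined;
  try {
    await db.transaction(async (tx) => {
      _setTestDb(tx);
      result = await body(tx);
      throw ROLLBACK; // roll back even though the body succeeded
    });
  } catch (err) {
    if (err !== ROLLBACK) throw err; // real failures propagate to the test
  } finally {
    _setTestDb(null); // never leak the transaction into the next test
  }
  return result as T;
}
```

Throwing a sentinel is one common way to get rollback-on-success from a client whose `transaction()` only rolls back on error; the `finally` block corresponds to the cleanup behaviour the new unit tests are said to cover.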
Keep the PR's more complete JSDoc comment that documents rawBody preservation for both JSON and form-urlencoded content types. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…-env fix(tests): harden agent worker test environment
When GitHub delivers a webhook with application/x-www-form-urlencoded, rawBody is `payload=<url-encoded JSON>`. The previous implementation called JSON.parse(rawBody) directly, which threw, leaving repoFullName undefined and causing the verifier to return null — skipping signature verification entirely and accepting any missing/invalid signature with HTTP 200. Fix: after a JSON.parse failure, fall back to URLSearchParams to extract the payload field and parse its JSON to resolve repoFullName. The HMAC is still computed over the full raw form-encoded body (as GitHub does), so verification is correct for both delivery modes. Three new unit tests cover the form-urlencoded path (valid sig, wrong sig, missing header). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
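The fallback and the verification step described above might look roughly like this. The function names are hypothetical and the sketch assumes GitHub's `sha256=<hex>` signature header format; it is not the PR's actual verifier.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: resolve the repository name from either delivery mode.
function extractRepoFullName(rawBody: string): string | undefined {
  let payload: unknown;
  try {
    payload = JSON.parse(rawBody); // application/json delivery
  } catch {
    // Form delivery: rawBody is `payload=<url-encoded JSON>`.
    const encoded = new URLSearchParams(rawBody).get("payload");
    if (encoded === null) return undefined;
    try {
      payload = JSON.parse(encoded);
    } catch {
      return undefined;
    }
  }
  const repo = (payload as { repository?: { full_name?: unknown } } | null | undefined)?.repository;
  const name = repo?.full_name;
  return typeof name === "string" ? name : undefined;
}

// The HMAC covers the full raw body in both modes, never the decoded JSON.
function verifyGitHubSignature(rawBody: string, header: string | undefined, secret: string): boolean {
  if (!header) return false; // a missing signature is a failure, not a skip
  const expected = "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
  const a = Buffer.from(header);
  const b = Buffer.from(expected);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

Note that the two concerns stay separate: the fallback only affects how the repository (and hence the secret) is looked up, while the digest input is always the unmodified body.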
…re-verification feat(router): wire up HMAC signature verification for GitHub and Trello webhooks
…/envFilter.ts (#832) Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Cascade Bot <bot@cascade.dev>
… module (#834) Co-authored-by: Cascade Bot <bot@cascade.dev>
* feat(backends): add resolveModel() to AgentEngine interface * docs(backends): comment double model resolution for backward compat Add explanatory comments to the resolve*Model() calls in each engine's execute() method clarifying that the redundancy is intentional. These calls remain for backward compatibility when execute() is invoked directly without going through the adapter's pre-resolution step. All three resolve functions are idempotent so the double call is safe. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Cascade Bot <bot@cascade.dev> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…ntEngine interface (#836) Co-authored-by: Cascade Bot <bot@cascade.dev>
…837) * test(backends): add engine contract test for all registered engines * fix(tests): make "registers exactly the expected engines" test meaningful Add toHaveLength assertion so the test verifies no unexpected engines are registered, not just that the expected ones are present. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Cascade Bot <bot@cascade.dev> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Cascade Bot <bot@cascade.dev>
…l/iterations controls (#839) Co-authored-by: Cascade Bot <bot@cascade.dev>
…ations > Source Control (#840) Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Cascade Bot <bot@cascade.dev>
…pandable sidebar tree (#842)

* feat(dashboard): refactor project tabs into URL-backed routes with expandable sidebar tree

* fix(sidebar): address review feedback on navigation helpers and back-links

- Fix prefix-matching bug in ProjectNavItem by importing and using isProjectActive/isSectionActive helpers from project-sections.ts instead of inline startsWith checks (prevents false matches like /projects/proj matching /projects/project-2/general)
- Update back-links in prs/$projectId.$prNumber.tsx and work-items/$projectId.$workItemId.tsx to point directly to /projects/$projectId/general, avoiding an unnecessary client-side redirect on every click
- Remove dead re-exports from $projectId.tsx (ProjectSection, DEFAULT_PROJECT_SECTION, PROJECT_SECTIONS) which had zero consumers
- Restore overflow-y-auto max-h-48 on the projects container to prevent expandable tree items from pushing settings/global nav off-screen

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(build): use typed route paths for project section links in sidebar

Add `route` property to PROJECT_SECTIONS with typed TanStack Router paths (`/projects/$projectId/<section>`) and update the sidebar Link component to use `to={section.route}` with `params={{ projectId: project.id }}` instead of a template literal that violated TanStack Router's type-safe route requirements, fixing the frontend build failure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(sidebar): sync expansion state with URL navigation and fix back-link destinations

- Use useEffect to keep sidebar project expansion in sync when activeProject changes due to URL navigation (not just initial mount)
- Update back-navigation links in PR runs and work item runs pages to point to /work section instead of /general, since users reach these pages from Work

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
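To illustrate the prefix-matching bug mentioned in this commit, here is a sketch of segment-aware matching. The signatures are assumptions; the real `isProjectActive`/`isSectionActive` helpers in project-sections.ts may take different arguments.

```typescript
// Assumed signature: true when `pathname` is the project root or below it.
function isProjectActive(pathname: string, projectId: string): boolean {
  const base = `/projects/${projectId}`;
  // Require an exact match or a following "/" so "proj" cannot match "project-2".
  return pathname === base || pathname.indexOf(`${base}/`) === 0;
}

// Assumed signature: same idea, one path segment deeper.
function isSectionActive(pathname: string, projectId: string, section: string): boolean {
  const base = `/projects/${projectId}/${section}`;
  return pathname === base || pathname.indexOf(`${base}/`) === 0;
}

// The naive check the commit replaced, shown only for comparison.
function naiveIsProjectActive(pathname: string, projectId: string): boolean {
  return pathname.indexOf(`/projects/${projectId}`) === 0;
}
```

With the naive version, `/projects/project-2/general` is reported as active for project `proj`; the segment-aware version rejects it.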
…ssion fix(tests): mock runLink to prevent env var leakage in unit tests
* feat(db): add project_credentials table and migration 0040 * fix(db): align migration timestamp type with Drizzle schema Change TIMESTAMPTZ to TIMESTAMP in 0040 migration so the SQL column type matches the Drizzle schema's timestamp() (without timezone), consistent with the credentials and integrations table patterns. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Cascade Bot <bot@cascade.dev> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…#846) Co-authored-by: Cascade Bot <bot@cascade.dev>
…to Agents (#847) Co-authored-by: Cascade Bot <bot@cascade.dev>
…#900) Same bug as router (#899) and dashboard (#896): worker-entry.ts calls loadConfig() at line 253 which runs EngineSettingsSchema validation. The dynamic ENGINE_SETTINGS_SCHEMAS registry was empty because registerBuiltInEngines() had not yet been called — it only runs later when createAgentRegistry() initializes. This caused every worker container to crash immediately with: ZodError: Unsupported engine settings for "claude-code" ...for any project with claude-code/codex/opencode engineSettings, making all Trello/JIRA/GitHub agent runs fail at job start. Fix: call registerBuiltInEngines() once before loadConfig(). Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
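The ordering problem can be reproduced with a toy registry. This sketch uses plain functions as stand-ins for the Zod schemas the project actually uses, and `validateEngineSettings` is a hypothetical name for the check that `loadConfig()` performs.

```typescript
// Stand-in for a schema: returns true when the settings are acceptable.
type SettingsValidator = (settings: unknown) => boolean;

// Empty until registerBuiltInEngines() runs, as in the bug report.
const ENGINE_SETTINGS_SCHEMAS = new Map<string, SettingsValidator>();

function registerBuiltInEngines(): void {
  for (const id of ["claude-code", "codex", "opencode"]) {
    ENGINE_SETTINGS_SCHEMAS.set(id, (s) => typeof s === "object" && s !== null);
  }
}

// Hypothetical validation step: fails for any engine not yet registered.
function validateEngineSettings(engine: string, settings: unknown): void {
  const validate = ENGINE_SETTINGS_SCHEMAS.get(engine);
  if (!validate || !validate(settings)) {
    throw new Error(`Unsupported engine settings for "${engine}"`);
  }
}
```

Calling `validateEngineSettings("claude-code", {})` before `registerBuiltInEngines()` throws, and the same call afterwards passes, which is why the fix moves registration ahead of config loading.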
…ct (#897)

* feat(agents): require explicit agent config to enable agent per project

* fix(tests): update integration tests for agent opt-in enforcement

The new `isAgentEnabledForProject()` guard requires an explicit `agent_configs` row before any trigger can fire. Integration tests that called `handle()` and expected a non-null result were failing because no agent config was seeded.

- Seed `agent_configs` + `agent_trigger_configs` rows in tests that expect triggers to fire (trigger-registry, pm-provider-switching, github-personas)
- Export `clearAgentEnabledCache()` from agentConfigsRepository for test isolation (the 5s TTL cache was causing false negatives when tests ran within the TTL window of a prior test that had no config)
- Call `clearAgentEnabledCache()` in `truncateAll()` so every `beforeEach` starts with a clean cache alongside a clean DB

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Zbigniew Sobiecki <zbigniew@sobiecki.name>
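The false negative caused by the 5s TTL cache can be shown with a small synchronous sketch. The real guard queries the database asynchronously; here the lookup is injected and `isAgentEnabledCached` is a hypothetical name.

```typescript
const AGENT_ENABLED_TTL_MS = 5000; // the 5s TTL mentioned in the commit message

const agentEnabledCache = new Map<string, { value: boolean; expiresAt: number }>();

// Hypothetical cached guard; `lookup` stands in for the DB query.
function isAgentEnabledCached(key: string, lookup: () => boolean, now: number = Date.now()): boolean {
  const hit = agentEnabledCache.get(key);
  if (hit && hit.expiresAt > now) return hit.value; // may be stale within the TTL
  const value = lookup();
  agentEnabledCache.set(key, { value, expiresAt: now + AGENT_ENABLED_TTL_MS });
  return value;
}

// Exported for tests in the PR; clears every cached answer.
function clearAgentEnabledCache(): void {
  agentEnabledCache.clear();
}
```

If one test caches "disabled" and the next test seeds a config within five seconds, the cached answer wins until the cache is cleared, which is the situation `truncateAll()` now avoids.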
…cked chart, cost formatting (#901) Co-authored-by: Cascade Bot <bot@cascade.dev>
#902) Co-authored-by: Cascade Bot <bot@cascade.dev>
…s all 3 locations (#903) Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Cascade Bot <bot@cascade.dev>
…curl webhooks (#908) Co-authored-by: Cascade Bot <bot@cascade.dev>
* feat(agents): expose YAML default task prompt in getPrompts API

- Add `getDefaultTaskPrompt(agentType)` to `src/agents/prompts/index.ts`. Reads the factory-default task prompt directly from the YAML definition without requiring `initPrompts()`. Returns null for unknown agent types.
- Wire it into the `agentConfigs.getPrompts` tRPC endpoint as a fourth prompt layer (`defaultTaskPrompt`), completing the inheritance chain: project override → global override → default system (disk template) → default task (YAML definition).
- Update `agent-prompt-overrides.tsx` to use `defaultTaskPrompt` as the final fallback when initialising the task prompt editor and as the target of the "Load default" button.
- Add `startWatchdog(project.watchdogTimeoutMs)` to `triggerManualRun` so manual runs respect the per-project timeout the same way webhook-triggered runs do.
- Fix unit tests: add `getDefaultTaskPrompt` to the `prompts/index.js` mock in agentConfigs.test.ts; mock `lifecycle.js` in manual-runner.test.ts to prevent the watchdog timer from calling process.exit during tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(router): unified agent run timeout system

Replaces two independent, uncoordinated timeout mechanisms with a single coherent flow where `watchdogTimeoutMs` is the source of truth.

## Problem

Two timeouts existed with no knowledge of each other:

1. In-container watchdog (`startWatchdog(project.watchdogTimeoutMs)`): per-project, updates DB to `timed_out` then exits.
2. Router-level kill (`setTimeout → killWorker`): global env var, killed the Docker container with no DB update.

This caused three bugs:

- If `WORKER_TIMEOUT_MS` < `watchdogTimeoutMs` the router killed the container before the watchdog could set the correct DB status.
- GitHub-triggered runs (no `workItemId`) were never marked in the DB after a router kill; they stayed `running` forever.
- Orphaned containers (after router restart) were stopped but their DB runs were never updated.

## Solution

**Per-project timeout in `spawnWorker`**: router now reads `watchdogTimeoutMs` from project config and uses it + 2-minute buffer (`ROUTER_KILL_BUFFER_MS`) for the container kill timer, so the watchdog always fires first and the router is purely a backstop.

**DB update on router kill (`killWorker`)**: after stopping the container, marks the run `timed_out` via `failOrphanedRun` (workItemId path) or `failOrphanedRunFallback` (GitHub PR runs without workItemId). The call to `cleanupWorker` no longer passes an exit code so it skips its own DB write, eliminating the race that could set the wrong status (`failed` instead of `timed_out`).

**Fallback for GitHub PR runs (`failOrphanedRunFallback`)**: new repository function that finds the most recent running run by `projectId + agentType + startedAt ≥ containerStart` and marks it, guarded by an optimistic `WHERE status='running'` check so it is always safe to call even if the watchdog already acted.

**DB update in `cleanupWorker`**: extended to also handle the workItemId-absent case via `failOrphanedRunFallback`, covering crashes of GitHub PR runs that the watchdog didn't catch.

**`cascade.agent.type` container label**: added at spawn time so orphan cleanup can pass `agentType` to `failOrphanedRunFallback`, avoiding matching the wrong run when multiple agent types run concurrently.

**`durationMs` on orphaned runs**: all three fail paths now compute and persist the elapsed duration so dashboard users see actual run time instead of null.

**Fixed BullMQ `lockDuration`**: replaced `workerTimeoutMs + 60s` with a fixed 8-hour constant (`BULLMQ_LOCK_DURATION_MS`). `guardedSpawn` resolves immediately after container start so the lock is held for seconds, and tying it to `workerTimeoutMs` risked lock expiry for long-running project configs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
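The two central ideas of the timeout rework, the buffered kill timer and the optimistic `status = 'running'` guard, can be sketched over an in-memory list of runs. The `Run` shape and the function signatures are assumptions for illustration; the real `failOrphanedRunFallback` is a repository function that issues a guarded SQL update.

```typescript
const ROUTER_KILL_BUFFER_MS = 2 * 60 * 1000; // 2-minute buffer from the commit message

// The router's backstop fires only after the in-container watchdog had its chance.
function routerKillTimeoutMs(watchdogTimeoutMs: number): number {
  return watchdogTimeoutMs + ROUTER_KILL_BUFFER_MS;
}

type RunStatus = "running" | "timed_out" | "failed" | "completed";

// Assumed in-memory stand-in for a row in the runs table.
interface Run {
  id: string;
  projectId: string;
  agentType: string;
  status: RunStatus;
  startedAt: number; // epoch ms
  durationMs: number | null;
}

// Marks the most recent matching run as timed out, but only if it is still
// running, so calling it after the watchdog already acted is a no-op.
function failOrphanedRunFallback(
  runs: Run[],
  projectId: string,
  agentType: string,
  containerStart: number,
  now: number,
): Run | undefined {
  const candidates = runs
    .filter(
      (r) =>
        r.projectId === projectId &&
        r.agentType === agentType &&
        r.status === "running" && // optimistic guard
        r.startedAt >= containerStart,
    )
    .sort((a, b) => b.startedAt - a.startedAt);
  if (candidates.length === 0) return undefined;
  const run = candidates[0];
  run.status = "timed_out";
  run.durationMs = now - run.startedAt; // persist elapsed time instead of null
  return run;
}
```

Filtering on `agentType` mirrors the purpose of the `cascade.agent.type` label: without it, a concurrent run of a different agent type in the same project could be marked by mistake.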
…911) Co-authored-by: Cascade Bot <bot@cascade.dev>
…ent cache stampede (#913) Add an in-flight promise deduplicator (_pendingConfigFetch) to loadProjectConfig() so that concurrent webhook requests arriving while the cache is cold all share a single DB fetch instead of each firing their own loadConfig() call. Fixes N+1 query / cache stampede reported in Sentry issues 98589064, 98586219, 103501033, 98901392. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
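A minimal sketch of the in-flight promise deduplication described above. The `ProjectConfig` shape and the injected `loadConfig` parameter are assumptions made to keep the example self-contained; only the `_pendingConfigFetch` name comes from the commit message.

```typescript
// Assumed config shape for this sketch.
type ProjectConfig = { projects: string[] };

let cachedConfig: ProjectConfig | null = null;
let _pendingConfigFetch: Promise<ProjectConfig> | null = null;

// Concurrent callers that arrive while the cache is cold share one fetch.
function loadProjectConfig(loadConfig: () => Promise<ProjectConfig>): Promise<ProjectConfig> {
  if (cachedConfig !== null) return Promise.resolve(cachedConfig);
  if (_pendingConfigFetch !== null) return _pendingConfigFetch;
  const pending = loadConfig().then(
    (config) => {
      cachedConfig = config;
      _pendingConfigFetch = null;
      return config;
    },
    (err) => {
      _pendingConfigFetch = null; // allow a retry after a failed fetch
      throw err;
    },
  );
  _pendingConfigFetch = pending;
  return pending;
}
```

The function is deliberately not `async`: returning the stored promise itself means every concurrent caller awaits the same object, and the pending slot is cleared on both success and failure so an error does not get cached forever.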
…ionsRepository (#912) * test(integrations): add dedicated integration test suite for integrationsRepository * fix(tests): remove unused deleteProjectCredential import Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Cascade Bot <bot@cascade.dev> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…nding (#914) Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Cascade Bot <bot@cascade.dev>
…916) * docs: audit and fix README, getting-started, and CLAUDE.md accuracy * fix(docs): correct runs/ command count from 8 to 9 in CLI directory tree The parenthetical already listed 9 commands (list, show, logs, llm-calls, llm-call, debug, trigger, retry, cancel) but the count said 8. Updated to match the actual count and the 9 .ts files in src/cli/dashboard/runs/. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Cascade Bot <bot@cascade.dev> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…#917)

* feat(dashboard): add OpenRouter model autocomplete to model selection

* refactor(openrouter): address review feedback on model autocomplete

- Key cache by API key (or '__public__') to prevent cross-project cache pollution if OpenRouter ever returns key-specific model lists
- Extract prefix/formatting/grouping utilities to web/src/lib/openrouter-utils.ts so tests import production implementations directly instead of local copies
- Remove unused stripPrefix from openrouter-model-combobox.tsx
- Simplify dead handleChange conditional to a direct onChange pass-through

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…ommand (#918) Co-authored-by: Cascade Bot <bot@cascade.dev>
)

* test(config): add integration tests for config provider round-trip

* refactor(tests): fix config provider tests per review feedback

- Rewrite config-provider.test.ts to focus on the actual provider layer (cached findProjectByBoardId, findProjectByRepo, findProjectByJiraProjectKey, loadConfig from src/config/provider.ts) rather than duplicating direct DB repository tests
- Fix cache invalidation test to use cached provider functions, mutate the DB between calls, and verify invalidateConfigCache() causes a fresh read
- Move genuinely new test coverage into configRepository.test.ts: WithConfig variants for repo/id/jira, CascadeConfigSchema.safeParse validation, cross-provider isolation, Trello+JIRA coexistence, and JIRA config loading

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Cascade Bot <bot@cascade.dev>
…d, and re-encryption (#922) Co-authored-by: Cascade Bot <bot@cascade.dev>
…utils, and media (#923) Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Cascade Bot <bot@cascade.dev>
…essages (#925) Co-authored-by: Cascade Bot <bot@cascade.dev>
* feat(cli): add actionable error messages with --verbose flag

* fix(cli): address review feedback on actionable error messages

- Use formatted actionable message in non-TRPC re-throw path so raw ECONNREFUSED errors show "Cannot reach server" instead of the raw Node.js error text
- Remove unused _serverUrl parameter from mapTRPCError() since mapError() handles network errors before delegating to mapTRPCError()
- Add message content assertions to NOT_FOUND, FORBIDDEN, and BAD_REQUEST test cases so they actually verify the actionable messages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
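The division of labour between `mapError()` and `mapTRPCError()` might be sketched like this. The `CliError` shape and all message wording other than "Cannot reach server" are illustrative assumptions, not the CLI's actual strings.

```typescript
// Hypothetical error shape; real tRPC client errors carry more fields.
interface CliError {
  code?: string;
  message: string;
}

// Handles tRPC-level codes only; it needs no server URL because network
// failures are dealt with before delegation.
function mapTRPCError(err: CliError): string {
  switch (err.code) {
    case "NOT_FOUND":
      return "Resource not found. Check the ID and try again.";
    case "FORBIDDEN":
      return "Access denied. Check that your account can access this project.";
    case "BAD_REQUEST":
      return `Invalid request: ${err.message}`;
    default:
      return err.message;
  }
}

// Entry point: network errors first, then delegate to the tRPC mapping.
function mapError(err: CliError, serverUrl: string): string {
  if (err.code === "ECONNREFUSED") {
    return `Cannot reach server at ${serverUrl}. Is it running?`;
  }
  return mapTRPCError(err);
}
```

Because `mapError()` owns the network case, dropping the server URL parameter from `mapTRPCError()` loses nothing, which matches the review feedback in this commit.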
…d withSpinner() (#927) Co-authored-by: Cascade Bot <bot@cascade.dev>
…rrors (#928) The 400 error from Trello was opaque — now includes the response body for diagnosis. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
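One way to surface the response body in such an error is sketched below. `HttpResponseLike` is a minimal assumed interface that a `fetch` `Response` satisfies structurally, and `describeFailedResponse` is a hypothetical helper, not the function changed in the PR.

```typescript
// Minimal shape needed here; a fetch Response satisfies it structurally.
interface HttpResponseLike {
  status: number;
  text(): Promise<string>;
}

// Builds an error message that includes the body the remote API returned,
// so a bare "400" is no longer opaque.
async function describeFailedResponse(res: HttpResponseLike, action: string): Promise<string> {
  let body = "";
  try {
    body = (await res.text()).trim();
  } catch {
    // Body could not be read; fall back to the status alone.
  }
  const detail = body.length > 0 ? `: ${body}` : "";
  return `${action} failed with HTTP ${res.status}${detail}`;
}
```

Reading the body inside a `try` keeps the diagnostic path from throwing a second, less useful error when the response stream is unavailable.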
Summary
Test plan
🤖 Generated with Claude Code