Conversation
…ent_runs - Add src/queue/cancel.ts module with publishCancelCommand() and subscribeToCancelCommands() for Dashboard→Router kill command communication via Redis pub/sub - Add job_id TEXT column to agent_runs table via migration 0035 - Update Drizzle schema to include jobId field - Add repository functions updateRunJobId() and getRunJobId() - Include jobId in enrichedRunSelect for API visibility - Add comprehensive unit tests for pub/sub module (9 tests) and repository functions (7 tests) - All tests passing, lint and typecheck clean Fixes: #302
nhopeatall
left a comment
There was a problem hiding this comment.
Summary
Clean foundational infrastructure for the cancel command feature. The code is correct, follows existing patterns, and is well-tested. Ready to merge.
Notes
-
No consumers yet —
publishCancelCommand,subscribeToCancelCommands,updateRunJobId, andgetRunJobIdare all defined but not wired into production code paths. This is expected per the PR description (foundational work for the cancellation epic). -
Payload validation on subscribe (minor, non-blocking):
subscribeToCancelCommandsusesJSON.parse(message) as CancelCommandPayload— a type assertion without runtime validation. If a malformed-but-valid-JSON message lands on the channel (e.g.,{"foo": 1}), the handler receives an object withundefinedforrunId/reason. In a controlled internal system this is low risk, but a lightweight Zod parse (or even arunId && reasonguard) in a follow-up PR would make the contract more robust. (Seesrc/queue/cancel.ts:84) -
enrichedRunSelectnow includesjobId: This meansgetRunById,getRunsByWorkItem, andgetRunsForPRreturnjobIdto API consumers. ThelistRunsdashboard query intentionally omits it. For an internal platform tool this is fine — just noting it's now part of the API surface. -
Redis connection pattern: Uses
new Redis(redisUrl)directly (correct for ioredis), separate from theparseRedisUrlhelper used by BullMQ consumers. Both patterns are appropriate for their respective use cases.
Everything else checks out: migration is correct for zero-downtime deploys, journal entry is properly sequenced, Drizzle schema matches, repository functions follow existing conventions, and test coverage is thorough.
Summary
This PR establishes the foundational cancel command infrastructure for the agent run cancellation feature epic:
Redis pub/sub cancel channel (
src/queue/cancel.ts): Provides two core functions:publishCancelCommand(runId, reason)- Dashboard publishes cancel requests tocascade:cancelchannelsubscribeToCancelCommands(handler)- Router subscribes and handles cancel commands with error resilienceDatabase column for job tracking (
0035_add_job_id_to_runs.sql):job_id TEXTcolumn toagent_runstable for zero-downtime deploy compatibilityRepository helpers (
runsRepository.ts):updateRunJobId(runId, jobId)- Store job ID after job enqueuegetRunJobId(runId)- Retrieve job ID for cancel operationsTechnical Details
Connection pattern: Uses lazy singleton initialization matching
src/queue/client.tsError handling: JSON parse failures and handler exceptions are logged but don't crash the subscription
Testing:
PR Contents
Acceptance Criteria
✅ New file src/queue/cancel.ts exports publishCancelCommand(runId, reason)
✅ New file src/queue/cancel.ts exports subscribeToCancelCommands(handler)
✅ Redis connections use lazy initialization matching src/queue/client.ts pattern
✅ New migration 0035_add_job_id_to_runs.sql adds nullable job_id TEXT column
✅ Migration journal entry added with idx 35 and unique when timestamp
✅ Drizzle schema includes jobId field on agentRuns table
✅ Repository functions updateRunJobId() and getRunJobId() implemented
✅ enrichedRunSelect includes jobId field for API visibility
✅ Unit tests cover pub/sub module with mocked Redis
✅ Unit tests cover repository functions with mocked DB
✅ All tests pass, typecheck and lint are clean
Testing
All 4533 tests pass. No linting or type errors.
Card: https://trello.com/c/MTfNVgY3/302-as-a-developer-i-want-a-redis-pub-sub-cancel-channel-and-jobid-column-on-agentruns-so-that-cancel-commands-can-reach-the-router