Problem
Refineries and polecats are getting stuck. Two related root causes were found:
1. MR bead failures bypass updateBeadStatus()
In processReviewQueue() (Town.do.ts), three early-exit failure paths call reviewQueue.completeReview(sql, entry.id, 'failed') directly (raw SQL UPDATE). This bypasses beadOps.updateBeadStatus(), which means:
- No status_changed bead_event is logged
- Convoy progress is never updated — the convoy counter never increments, so the convoy never lands even when all source beads are closed
- The MR bead goes to failed silently and the convoy stalls
The three affected paths are:
- No rig_id on the MR bead
- No rig config found for the rig
- Refinery container fails to start (calls completeReview then returns)
These should all call beadOps.updateBeadStatus() instead. The convoy-bead-failure-reasons convoy is making the same change for bead events, but the convoy-progress side effect is the critical fix here.
2. Rework re-dispatch path may also be involved
When a refinery or reviewer signals rework (agentCompleted in review-queue.ts and completeReviewWithResult in Town.do.ts), a new polecat is hooked and dispatched fire-and-forget. If anything in that path fails silently, the source bead can be left in_progress with no agent dispatched to it. Worth auditing this path for missing error handling or guard conditions.
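As a starting point for that audit, here is a minimal sketch of a guarded fire-and-forget dispatch. The names dispatchReworkGuarded, SourceBead, and Dispatch are hypothetical, and resetting the bead to open on failure is an assumed recovery policy, not what Town.do.ts currently does.

```typescript
type BeadStatus = "open" | "in_progress" | "closed" | "failed";

// Hypothetical shape; the real source bead has more fields.
interface SourceBead {
  id: string;
  status: BeadStatus;
  assignedAgent: string | null;
}

// Hypothetical dispatcher signature standing in for the real polecat dispatch.
type Dispatch = (beadId: string, agentId: string) => Promise<void>;

// Hooks the agent, starts the dispatch, and attaches a catch handler so a
// rejected dispatch cannot leave the bead in_progress with nobody working on it.
function dispatchReworkGuarded(
  bead: SourceBead,
  agentId: string,
  dispatch: Dispatch,
  log: (msg: string) => void,
): Promise<void> {
  bead.status = "in_progress";
  bead.assignedAgent = agentId;
  return dispatch(bead.id, agentId).catch((err: unknown) => {
    log("rework dispatch failed for bead " + bead.id + ": " + String(err));
    // ASSUMPTION: returning the bead to "open" lets a later pass retry it.
    bead.status = "open";
    bead.assignedAgent = null;
  });
}
```

The caller can still ignore the returned promise, so the dispatch stays fire-and-forget. The difference is that a failure is logged and the bead is unhooked rather than stranded.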
Investigation notes
- completeReview() in review-queue.ts is a raw SQL UPDATE; it does NOT call updateBeadStatus() or trigger updateConvoyProgress()
- updateBeadStatus() → updateConvoyProgress() is the only path that increments convoy closed_bead counters and triggers auto-land
- The processReviewQueue() refinery dispatch also passes kilocodeToken: rigConfig.kilocodeToken directly (line ~3629) rather than the resolved kilocodeToken used in all other dispatch sites. Worth checking whether this could be undefined and cause silent failures
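One way to rule out the token concern is to resolve the token up front and fail loudly when nothing is found. This is only a sketch: resolveKilocodeToken and the town-level fallback are hypothetical, since how the other dispatch sites resolve the token is not shown here.

```typescript
interface RigConfig {
  kilocodeToken?: string;
}

// Hypothetical fallback source; the real resolution order may differ.
interface TownConfig {
  kilocodeToken?: string;
}

// Returns a usable token or throws, so an undefined or empty token is never
// passed silently into the refinery dispatch.
function resolveKilocodeToken(rig: RigConfig, town: TownConfig): string {
  const token = rig.kilocodeToken || town.kilocodeToken;
  if (!token) {
    throw new Error("no kilocodeToken resolved for refinery dispatch");
  }
  return token;
}
```

Throwing here would turn a possible silent failure into one that the surrounding failure path has to handle explicitly.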
Fix
Replace the reviewQueue.completeReview(sql, entry.id, 'failed') calls in processReviewQueue() with beadOps.updateBeadStatus() calls so convoy progress is properly updated on MR bead failure.
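To make the difference concrete, here is a small in-memory model of the two paths. It is a sketch, not the real code: the actual functions take sql and run against the database, and everything below (Store, completeReviewRaw, makeStore, and the rule that closed and failed beads both count toward convoy progress) is an assumption made for illustration.

```typescript
type BeadStatus = "open" | "in_progress" | "closed" | "failed";

interface Bead { id: string; status: BeadStatus; convoyId: string | null }
interface Convoy { id: string; totalBeads: number; closedBeads: number; landed: boolean }
interface BeadEvent { beadId: string; type: "status_changed"; from: BeadStatus; to: BeadStatus }
interface Store { beads: Bead[]; convoys: Convoy[]; events: BeadEvent[] }

// ASSUMPTION: both "closed" and "failed" count as finished for convoy progress.
function isTerminal(status: BeadStatus): boolean {
  return status === "closed" || status === "failed";
}

// Stand-in for reviewQueue.completeReview(): a bare status write, no side effects.
function completeReviewRaw(store: Store, beadId: string, status: BeadStatus): void {
  const bead = store.beads.filter((b) => b.id === beadId)[0];
  if (bead) bead.status = status;
}

// Stand-in for updateConvoyProgress(): recount finished beads, land when all are done.
function updateConvoyProgress(store: Store, convoyId: string): void {
  const convoy = store.convoys.filter((c) => c.id === convoyId)[0];
  if (!convoy) return;
  convoy.closedBeads = store.beads.filter(
    (b) => b.convoyId === convoyId && isTerminal(b.status),
  ).length;
  convoy.landed = convoy.closedBeads >= convoy.totalBeads;
}

// Stand-in for beadOps.updateBeadStatus(): status write, event, then convoy update.
function updateBeadStatus(store: Store, beadId: string, status: BeadStatus): void {
  const bead = store.beads.filter((b) => b.id === beadId)[0];
  if (!bead || bead.status === status) return;
  const from = bead.status;
  bead.status = status;
  store.events.push({ beadId, type: "status_changed", from, to: status });
  if (bead.convoyId !== null) updateConvoyProgress(store, bead.convoyId);
}

// Hypothetical fixture: one convoy, a closed source bead, and an MR bead in review.
function makeStore(): Store {
  return {
    beads: [
      { id: "src-1", status: "closed", convoyId: "cv-1" },
      { id: "mr-1", status: "in_progress", convoyId: "cv-1" },
    ],
    convoys: [{ id: "cv-1", totalBeads: 2, closedBeads: 1, landed: false }],
    events: [],
  };
}
```

In this model, calling completeReviewRaw(store, "mr-1", "failed") leaves events empty and the convoy unlanded, while updateBeadStatus(store, "mr-1", "failed") records one status_changed event and recomputes the convoy counter.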