Skip to content

fix(gastown): address PR #544 review hardening items#689

Merged
jrf0110 merged 7 commits intomainfrom
686-pr544-review-hardening
Mar 1, 2026
Merged

fix(gastown): address PR #544 review hardening items#689
jrf0110 merged 7 commits intomainfrom
686-pr544-review-hardening

Conversation

@jrf0110
Copy link
Copy Markdown
Contributor

@jrf0110 jrf0110 commented Feb 28, 2026

Summary

Addresses all 8 hardening items identified in the PR #544 code review.

Closes #686

Changes

1. Fire-and-forget dispatchAgent error logging

  • Added .catch() with console.error on the void this.dispatchAgent() call in slingBead so failures are visible in logs.

2. Zod validation for SDK session creation response

  • Replaced fragile sessionResult.data ?? sessionResult + 'id' in session duck-typing with a Zod schema (z.object({ id: z.string().min(1) }).passthrough()).
  • Now throws a clear error if the SDK response shape changes unexpectedly.

3. registerAgent stores rig_id — verified + tested

  • Verified the code path: handler passes { ...parsed.data, rig_id: params.rigId }agents.registerAgent stores input.rig_id ?? null in the beads table rig_id column.
  • Added two integration tests: one verifying rig_id is stored when provided, one verifying it's null when omitted.

4. Thread-safe currentTownConfig via Hono context

  • Added Zod validation (z.record(z.string(), z.unknown())) for the X-Town-Config header JSON.
  • Typed the Hono app with ContainerEnv variables and store config per-request via c.set('townConfig', ...).
  • Renamed module-level variable to lastKnownTownConfig to clarify it's a cache, not the authoritative source.

5. Emit bead event on failed escalation notification

  • Added notification_failed to the BeadEventType enum (both bead-events.table.ts and rig-bead-events.table.ts).
  • Both routeEscalation and the re-escalation alarm now emit a notification_failed bead event with metadata (target, reason, severity) when mayor notification fails.
  • Fixed the re-escalation path that was silently swallowing errors with .catch(() => {}).

6. Resilient rig creation + git credential write

  • Wrapped the updateTownConfig git credential write in a try/catch so a failed credential write doesn't propagate as a rig creation failure.
  • Clear error logging with rig ID and integration ID for debugging.
  • Container-side resolveGitCredentialsIfMissing and refreshTownGitCredentials serve as recovery mechanisms.

7. /api/gastown/git-credentials route — verified

  • Confirmed the route exists at src/app/api/gastown/git-credentials/route.ts with proper auth, ownership verification, and Zod validation. No changes needed.

8. Return 403 for cross-town access attempts

  • Introduced resolveTownId() in auth middleware returning a discriminated union: { townId } | { error: 'missing', status: 400 } | { error: 'forbidden', status: 403 }.
  • Updated all 31 handler call sites across 8 handler files to use resolveTownId() and return the correct HTTP status code.
  • Kept getTownId() as a backward-compatible convenience wrapper.

Files changed (16)

File Change
cloudflare-gastown/src/dos/Town.do.ts Items 1, 5
cloudflare-gastown/container/src/process-manager.ts Item 2
cloudflare-gastown/container/src/control-server.ts Item 4
cloudflare-gastown/src/db/tables/bead-events.table.ts Item 5
cloudflare-gastown/src/db/tables/rig-bead-events.table.ts Item 5
cloudflare-gastown/src/middleware/auth.middleware.ts Item 8
cloudflare-gastown/src/handlers/rig-*.handler.ts (8 files) Item 8
cloudflare-gastown/test/integration/rig-do.test.ts Item 3
src/routers/gastown-router.ts Item 6

- Add .catch() error logging on fire-and-forget dispatchAgent in slingBead
- Validate SDK session.create response with Zod schema instead of duck-typing
- Add tests verifying registerAgent stores rig_id correctly
- Store town config per-request via Hono context with Zod validation
- Emit notification_failed bead events when escalation mayor notification fails
- Wrap git credential write in try/catch during rig creation for resilience
- Verify /api/gastown/git-credentials route exists (confirmed)
- Return 403 (not 400) for cross-town access attempts via resolveTownId
Comment thread cloudflare-gastown/container/src/control-server.ts Outdated
Comment thread cloudflare-gastown/container/src/control-server.ts Outdated
Comment thread cloudflare-gastown/container/src/process-manager.ts Outdated
Comment thread cloudflare-gastown/src/dos/Town.do.ts Outdated
@kilo-code-bot
Copy link
Copy Markdown
Contributor

kilo-code-bot Bot commented Feb 28, 2026

Code Review Summary

Status: No New Issues Found | Recommendation: Merge (after addressing prior review comments)

Overview

This PR is a solid hardening pass that addresses multiple issues from the prior review round. All 19 previously identified issues have been resolved or acknowledged. No new issues were found in this review pass.

Key Changes Reviewed

  1. Zod validation for X-Town-Config header (control-server.ts) — Properly validates the header as a JSON object using z.record(), with clear separation of JSON parse errors vs. schema validation failures. The lastKnownTownConfig rename improves clarity.

  2. Zod validation for SDK session response (process-manager.ts) — SessionResponse schema hoisted to module level, safeParse with clear error logging and a thrown error on failure. Good fail-fast pattern.

  3. townIdMiddleware extraction (auth.middleware.ts, gastown.worker.ts) — Clean separation of townId extraction (runs always, even in dev) from JWT auth (skipped in dev). All rig handlers now use c.get('townId') consistently, and the middleware guarantees it's set before handlers run.

  4. Cross-town guard in authMiddleware — JWT's townId is now validated against the route param, preventing cross-town access via URL manipulation.

  5. Fire-and-forget error handling (Town.do.ts) — void this.dispatchAgent(...) replaced with .catch(err => console.error(...)), and logBeadEvent calls in .catch() handlers are wrapped in try/catch to prevent double-throw.

  6. notification_failed event type — Added to both bead-events.table.ts and rig-bead-events.table.ts to support the new notification failure logging.

  7. Best-effort git credential write (gastown-router.ts) — Wrapped in try/catch so a credential failure doesn't prevent rig creation. Well-documented recovery mechanisms.

  8. New tests (rig-do.test.ts) — Tests for rig_id storage on agent registration.

Files Reviewed (17 files)
  • cloudflare-gastown/container/src/control-server.ts - 0 new issues
  • cloudflare-gastown/container/src/process-manager.ts - 0 new issues
  • cloudflare-gastown/src/db/tables/bead-events.table.ts - 0 new issues
  • cloudflare-gastown/src/db/tables/rig-bead-events.table.ts - 0 new issues
  • cloudflare-gastown/src/dos/Town.do.ts - 0 new issues
  • cloudflare-gastown/src/gastown.worker.ts - 0 new issues
  • cloudflare-gastown/src/handlers/rig-agent-events.handler.ts - 0 new issues
  • cloudflare-gastown/src/handlers/rig-agents.handler.ts - 0 new issues
  • cloudflare-gastown/src/handlers/rig-bead-events.handler.ts - 0 new issues
  • cloudflare-gastown/src/handlers/rig-beads.handler.ts - 0 new issues
  • cloudflare-gastown/src/handlers/rig-escalations.handler.ts - 0 new issues
  • cloudflare-gastown/src/handlers/rig-mail.handler.ts - 0 new issues
  • cloudflare-gastown/src/handlers/rig-molecules.handler.ts - 0 new issues
  • cloudflare-gastown/src/handlers/rig-review-queue.handler.ts - 0 new issues
  • cloudflare-gastown/src/middleware/auth.middleware.ts - 0 new issues
  • cloudflare-gastown/test/integration/rig-do.test.ts - 0 new issues
  • src/routers/gastown-router.ts - 0 new issues

- Fix misleading .passthrough() comment to accurately describe z.record()
- Make townConfig type optional in ContainerEnv since header may be absent
- Hoist SessionResponse Zod schema to module level to avoid repeated allocation
- Wrap logBeadEvent calls in try/catch inside .catch() handlers to prevent
  unhandled rejections if the event log write fails
Comment thread cloudflare-gastown/src/handlers/rig-agent-events.handler.ts Outdated
Comment thread cloudflare-gastown/src/middleware/auth.middleware.ts Outdated
Comment thread cloudflare-gastown/container/src/control-server.ts Outdated
Eliminates the repeated 3-line resolveTownId/error-check/extract pattern
from all 31 handler call sites across 8 files. The auth middleware now
resolves and validates townId (returning 400 or 403 as appropriate) and
sets it on the Hono context. Handlers simply call c.get('townId').
- Remove c.set('townConfig', ...) which was set but never read by any handler
- Remove ContainerEnv type that only existed for the unused variable
- getTownId and resolveTownId were already cleaned up in previous commit
Comment thread cloudflare-gastown/src/middleware/auth.middleware.ts Outdated
townId resolution was inside authMiddleware which is skipped in dev mode,
causing c.get('townId') to return undefined for all handlers. Split into
a separate townIdMiddleware that runs unconditionally before the auth gate.
Comment thread cloudflare-gastown/src/middleware/auth.middleware.ts Outdated
…able

townIdMiddleware now only extracts the route param (no JWT dependency).
The cross-town check (JWT townId vs route townId) is done in authMiddleware
after the JWT is parsed, alongside the existing rigId check.
Comment thread cloudflare-gastown/src/middleware/auth.middleware.ts Outdated
@jrf0110 jrf0110 merged commit 3ec917b into main Mar 1, 2026
12 checks passed
@jrf0110 jrf0110 deleted the 686-pr544-review-hardening branch March 1, 2026 14:09
jrf0110 added a commit that referenced this pull request Mar 1, 2026
## Summary

Fixes 4 compounding bugs that caused the refinery agent to be dispatched
into the wrong rig's repository in multi-rig towns, fixes 3 additional
issues with MR bead assignment and the review lifecycle, and adds a
Related Beads DAG section to the bead drawer UI.

Closes #657

> **Note:** This branch is based on `686-pr544-review-hardening` (PR
#689) which is pending review. Once #689 merges to main, this PR will be
clean against main.

Also adds UI to show relationship between related beads (like a task
bead to review bead)

<img width="1567" height="1237" alt="image"
src="https://github.com/user-attachments/assets/0801695b-a51a-4692-832f-07cb8d9cf2ab"
/>


## The bug chain (before this fix)

1. Polecat on **rig B** finishes work -> `submitToReviewQueue`
2. MR bead created with **`rig_id = null`** and **assignee = polecat**
(Bugs 3, 5)
3. Alarm fires -> `processReviewQueue` pops the entry
4. Rig resolved via **`rigList[0]`** -> gets **rig A** (Bug 2)
5. `getOrCreateAgent('refinery')` returns the **existing refinery for
rig A** (Bug 1)
6. Refinery hooked to the **source bead** (not the MR bead), overwriting
its assignee (Bug 6)
7. Source bead left `in_progress` instead of closed (Bug 7)

## Fixes

### Bug 1: Refinery was a town-wide singleton
- **`agents.ts`**: Removed `refinery` from `singletonRoles` (now only
`witness` and `mayor`)
- Refinery now uses the same per-rig idle agent reuse logic as polecats,
scoped by `rig_id`

### Bug 2: `processReviewQueue` hardcoded `rigList[0]`
- **`Town.do.ts`**: Now reads `entry.rig_id` from the merge_request bead

### Bug 3: merge_request beads created with `rig_id = null`
- **`review-queue.ts`**: `submitToReviewQueue` now writes `input.rig_id`
to the bead

### Bug 4: `ReviewQueueInput` type lacked `rig_id`
- **`types.ts`**: Added `rig_id: string` to both `ReviewQueueInput` and
`ReviewQueueEntry`

### Bug 5: MR bead assignee set to polecat instead of left for refinery
- **`review-queue.ts`**: MR bead `assignee_agent_bead_id` now null on
creation (refinery claims it via `hookBead`). Source agent stored in
`metadata.source_agent_id`.

### Bug 6: Refinery hooked to source bead, not MR bead
- **`Town.do.ts`**: `processReviewQueue` now hooks the refinery to
`entry.id` (MR bead), not `entry.bead_id` (source bead). This preserves
the source bead's polecat assignee.
- **`review-queue.ts`**: Refinery `agentDone` path updated —
`completeReviewFromMRBead` reads the source bead from the MR's metadata
instead of the old `completeReviewForSourceBead` lookup.

### Bug 7: Source bead not closed after polecat submits to review
- **`review-queue.ts`**: `agentDone` (polecat path) now calls
`closeBead` on the source bead after submitting to the review queue,
matching upstream `gt done` behavior.

### Feature: Related Beads DAG in bead drawer
- **`BeadPanel.tsx`**: New "Related Beads" section shows the bead's DAG
neighborhood:
- **Child beads** — beads whose `parent_bead_id` matches the current
bead
- **Source Work** — for MR beads, the original issue/work bead from
`metadata.source_bead_id`
- **Review** — for non-MR beads, any `merge_request` beads that track
this bead
- Each entry is clickable, pushing a new bead drawer onto the navigation
stack
- Computed client-side from existing `listBeads` data — no new API
endpoints
- Also adds a `bead_dependencies` 'tracks' row when creating MR beads to
formally link the DAG

## Files changed

| File | Change |
|------|--------|
| `cloudflare-gastown/src/types.ts` | Added `rig_id` to
`ReviewQueueInput` and `ReviewQueueEntry` |
| `cloudflare-gastown/src/dos/town/agents.ts` | Refinery per-rig,
generalized idle agent lookup |
| `cloudflare-gastown/src/dos/town/review-queue.ts` | MR bead lifecycle
fixes, bead_dependencies tracking |
| `cloudflare-gastown/src/dos/Town.do.ts` | Hook refinery to MR bead,
resolve rig from bead |
| `cloudflare-gastown/src/handlers/rig-review-queue.handler.ts` | Pass
`params.rigId` to `submitToReviewQueue` |
| `src/components/gastown/drawer-panels/BeadPanel.tsx` | Related Beads
DAG section |
| `cloudflare-gastown/test/integration/rig-do.test.ts` | Add `rig_id` to
test data |
| `cloudflare-gastown/test/integration/rig-alarm.test.ts` | Add `rig_id`
to test data |
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Gastown: Address PR #544 review hardening items

2 participants