Disclaimer: This is completely experimental. We are currently exploring multiple angles for integration.
Overview
Cloud-first rewrite of Gastown's core tenets as a Kilo platform feature (Proposal D — Revised). All orchestration state lives in Durable Objects (SQLite). Agents interact with Gastown via tool calls backed by DO RPCs — no filesystem coordination, no gt/bd binaries. Each town gets a Cloudflare Container that runs all agent processes (Kilo CLI instances) — one container per town, not one per agent.
Product vision: An absurdly simple product. Create a town, connect repos via existing integrations, talk to the Mayor in a chat interface. The full Gastown machine operates behind the scenes — agents spawn, communicate, merge code — and the UI shows everything happening in real time. Every object on screen is clickable, every connection is traceable.
See plans/gastown-cloud-proposal-d.md for the full architecture proposal, UI vision, and architecture assessment.
Workflow
For each sub-issue:
Branch: Create a branch off main using the naming scheme {issue_number}-{short-description}
Implement: Work through the issue's acceptance criteria. Write tests alongside the implementation.
Validate: Run the test suite and type checks (pnpm typecheck) to confirm everything passes.
Commit: Stage all changes and create a commit with a descriptive message referencing the sub-issue number.
Push & PR: Push the branch and create a pull request targeting main. Link the sub-issue in the PR body with Closes #NNN.
Phase 1: Core Loop + Mayor (COMPLETED ✅)
Foundation: Rig DO state machine, HTTP API, tool plugin, container with kilo serve, alarm scheduler, tRPC routes, basic dashboard, MayorDO.
Phase 1.5: Close the Loop + Usable UI (NEW — Priority Sprint)
The core loop is built but doesn't close: the Mayor can't delegate, agents can't be watched, merges 404, and the dashboard is a skeleton. These issues unblock a product that actually works end-to-end.
Mayor agent completes without using tools — missing GASTOWN_SESSION_TOKEN in local dev #349 — Mayor Tool Execution Failures ⚡ Mayor agent starts and has tools registered but completes without using them. Need to diagnose container-side errors when mayor tools (gt_sling, gt_list_rigs, etc.) execute — likely auth or network issue with GASTOWN_API_URL from inside the container.
Simplify the DO topology, replace subprocess-based agent management with the SDK, and establish a clean WebSocket event pipeline. See plans/gastown-town-centric-refactor.md for the full plan.
Make Cloud Gastown beads-centric: unify all object types into the beads primitive #441 — Beads-Centric Data Model — Unify all object types (mail, molecules, convoys, escalations, merge requests, agents) into the universal beads primitive with satellite metadata tables. Drop rig_mail, rig_molecules, rig_review_queue, town_convoys, town_convoy_beads, town_escalations, rig_agents. Add parent_bead_id, bead_dependencies table, entity-prefixed column naming.
Fix: Polecat agents cannot push — git credentials never populated in town config #443 — Fix: Polecat git push credentials — Git credentials are never populated in town config. The tRPC createRig doesn't set git_auth.github_token, so GIT_TOKEN is never passed to the container. Agents clone public repos fine but can't push. Fix: populate git credentials from integration tokens on rig creation, add token refresh, fix merge dispatch for self-hosted GitLab.
Embed agent terminal UI via xterm.js + container PTY proxy #450 — Agent Terminal UI (xterm.js) — Embed live terminal view of agent sessions via xterm.js connected to the container's existing PTY infrastructure. Proxy PTY WebSocket through control-server → TownContainerDO → worker → browser. Enables direct visibility into agent work and interactive agent sessions from the dashboard.
Gate Gastown UI to Kilo admins only #537 — Gate Gastown UI to Kilo Admins — Restrict all Gastown pages, tRPC procedures, and nav items to is_admin users only. Temporary gate until Gastown is ready for beta/GA.
Bug: Refinery agent dispatched to wrong rig in multi-rig towns #657 — Bug: Refinery dispatched to wrong rig — Four compounding bugs cause the refinery to clone the wrong rig's repo in multi-rig towns: singleton reuse ignores rig_id, processReviewQueue hardcodes rigList[0], merge_request beads have null rig_id, ReviewQueueInput has no rig_id field.
Wire convoys end-to-end: batch sling, Mayor tools, and dead code cleanup #540 — Wire Convoys End-to-End — Add gt_sling_batch tool for atomic multi-bead + convoy creation, gt_list_convoys and gt_convoy_status visibility tools, Mayor prompt updates, and dead code cleanup. Backend convoy infrastructure (createConvoy, onBeadClosed, landing detection) already works — this connects it to the Mayor.
Gastown Feature Flags & Progressive Rollout #901 — Feature Flags & Progressive Rollout — Replace the binary is_admin gate (Gate Gastown UI to Kilo admins only #537) with progressive rollout: admin-only → allowlist → percentage → GA. Kill-switch for instant revert. Sub-feature flags for convoys, PR merge strategy, multi-rig. Technology-agnostic — defines integration points, not vendor choice.
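The staged gate described in #901 could be sketched roughly as follows. The `RolloutConfig` shape, the `hasGastownAccess` name, and the FNV-1a bucketing are all illustrative assumptions rather than anything taken from the codebase, and the kill-switch is assumed to revert to admin-only access.

```typescript
// Hypothetical config shape; every field name here is an assumption.
interface RolloutConfig {
  killSwitch: boolean; // assumed to revert instantly to admin-only
  stage: "admin" | "allowlist" | "percentage" | "ga";
  allowlist: string[]; // user IDs admitted from the allowlist stage onward
  percentage: number; // 0-100, used at the percentage stage
}

interface RolloutUser {
  id: string;
  isAdmin: boolean;
}

// FNV-1a 32-bit hash, used only to map a user ID to a stable bucket in 0..99.
function bucketOf(userId: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < userId.length; i++) {
    hash ^= userId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash % 100;
}

function hasGastownAccess(user: RolloutUser, cfg: RolloutConfig): boolean {
  if (user.isAdmin) return true; // admins keep access at every stage
  if (cfg.killSwitch || cfg.stage === "admin") return false;
  if (cfg.stage === "ga") return true;
  if (cfg.allowlist.includes(user.id)) return true;
  // Allowlist stage stops here; percentage stage also admits hashed buckets.
  return cfg.stage === "percentage" && bucketOf(user.id) < cfg.percentage;
}
```

Hashing the user ID keeps each user in the same bucket as the percentage grows, so raising it only ever adds users.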
Witness & Deacon: Alarm-driven orchestration with on-demand LLM triage #442 — Witness & Deacon: Alarm-Driven Orchestration — Run all mechanical witness/deacon patrol behaviors (protocol mail flow, zombie detection, GUPP violations, convoy completion, stale hooks, gate evaluation) as deterministic code in the TownDO alarm. Spawn short-lived LLM triage agents only for ambiguous situations (dirty polecat triage, stuck agent inspection, help requests). Replaces persistent AI patrol loops with alarm + on-demand reasoning.
Bug: Agent JWT expires after 8h with no refresh — Mayor tool calls 401 #923 — Bug: Agent JWT expires after 8h with no refresh — GASTOWN_SESSION_TOKEN is minted once per startAgentInContainer() with 8h expiry. No refresh mechanism exists. Persistent agents (Mayor) that run >8h get 401s on all tool calls, heartbeats, and event persistence. Affects production now.
Bug: Triage agent feedback loop — crash loop detection triggers on its own failures #965 — Bug: Triage agent feedback loop — detectCrashLoops() counts triage agent's own bead failures as crash loops, creating new triage requests that fail and feed the loop. Compounds with: no dispatch cooldown after failure, unnecessary git clone for triage role, no cap on triage request creation. Actively happening in production.
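One way to break the feedback loop in #965 is to leave the triage agent's own failures out of the count. The sketch below is not the real `detectCrashLoops()`; the `BeadFailure` shape, the `findCrashLoopingBeads` name, and the window/threshold parameters are assumptions made for illustration.

```typescript
type AgentRole = "mayor" | "polecat" | "refinery" | "triage";

// Hypothetical failure record; field names are assumptions.
interface BeadFailure {
  beadId: string;
  agentRole: AgentRole;
  atMs: number; // failure time, epoch milliseconds
}

// Returns the IDs of beads that failed at least `threshold` times within the window,
// ignoring failures produced by triage agents so they cannot re-trigger triage.
function findCrashLoopingBeads(
  failures: BeadFailure[],
  nowMs: number,
  windowMs: number,
  threshold: number,
): string[] {
  const counts = new Map<string, number>();
  for (const f of failures) {
    if (f.agentRole === "triage") continue; // do not count the detector's own fallout
    if (nowMs - f.atMs > windowMs) continue; // too old to matter
    counts.set(f.beadId, (counts.get(f.beadId) ?? 0) + 1);
  }
  const looping: string[] = [];
  counts.forEach((n, beadId) => {
    if (n >= threshold) looping.push(beadId);
  });
  return looping;
}
```

A dispatch cooldown and a cap on open triage requests, both mentioned in the issue, would sit on top of this filter and are not shown.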
Bug: Convoy landing merge never triggers — updateConvoyProgress not called after MR closes #962 — Bug: Convoy landing merge never triggers — updateConvoyProgress() is never called after MR beads close because completeReview() uses raw SQL (bypasses updateBeadStatus) and closeBead() on the already-closed source bead is a no-op. The ready_to_land flag is never set, so processConvoyLandings() has nothing to process. Fix: explicitly re-trigger updateConvoyProgress in completeReviewWithResult().
Persist agent conversation across container restarts via AgentDO event reconstruction #1236 — Persist Agent Conversation Across Restarts — Reconstruct conversation history from AgentDO streaming events (message.created/completed/part.updated) and inject as context on re-dispatch. Fix sendMayorMessage passing checkpoint: null. Mayor maintains conversational continuity across container restarts. Transcript truncated/summarized to fit context window.
Fix terminal stability — WebSocket reconnection, resize debounce, control frame filtering #1195 — Fix Terminal Stability — Three compounding bugs: (1) PTY WebSockets have no reconnection logic — container sleep/restart = permanent freeze. (2) Undebounced ResizeObserver fires resize storms during CSS transitions — PTY/xterm dimension mismatch causes blacked-out cells. (3) Missing 0x00 control frame filter — SDK cursor metadata written as raw JSON to xterm. Reference fix exists in the native Kilo app.
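Two of the three fixes in #1195 reduce to small pure functions. The sketch below assumes control frames are identified by a leading `0x00` byte, as the issue text suggests, and uses a capped exponential backoff for reconnection; the function names and default delays are hypothetical.

```typescript
// True when a PTY WebSocket frame starts with 0x00 and should be kept out of xterm.
// The leading-byte convention is inferred from the issue text, not from the SDK source.
function isControlFrame(frame: string | Uint8Array): boolean {
  if (typeof frame === "string") return frame.charCodeAt(0) === 0;
  return frame.length > 0 && frame[0] === 0;
}

// Capped exponential backoff for reconnect attempts (attempt counts from 0).
function reconnectDelayMs(attempt: number, baseMs = 500, capMs = 15000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```

A client would drop frames for which `isControlFrame` returns true before writing to the terminal, and wait `reconnectDelayMs(n)` before the n-th reconnect after a container sleep or restart.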
Bug: Deleting a town doesn't clean up TownDO, TownContainerDO, or AgentDOs — resources leak indefinitely #1182 — Bug: Town deletion leaks DOs and containers — tRPC deleteTown only removes rows from GastownUserDO, never calls TownDO.destroy() or stops the container. TownDO alarm re-arms unconditionally (no exit condition), keeps pinging the container (preventing sleep), and can resurrect after destroy() via constructor re-initialization. 6 compounding root causes. Actively burning Cloudflare resources in production.
Phase 4: Hardening
Gastown User-Scoped Admin Panel — Inspect & Intervene via GastownUserDO → TownDO #897 — User-Scoped Admin Panel — User-scoped admin panel for debugging and intervening in Gastown towns. Entry point on existing admin user detail page: GastownUserDO lookup → town list → Town Inspector with bead/agent/container/config drill-down and admin interventions (force-reset, force-close, credential refresh, container restart). Immutable audit log for all actions.
In-Review Bead State — Beads Stay Open Until Refinery Actions #895 — In-Review Bead State — Add in_review state between in_progress and closed. Polecats mark beads in_review on completion; only the Refinery transitions to closed (on merge) or back to in_progress (on rework). Bead board gets an "In Review" column.
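The transition rules in #895 can be written as a small table keyed by the current state. This is a sketch only: the `canTransition` helper and the `Role` type are assumptions, and only the three states named in the issue are modeled.

```typescript
type BeadStatus = "in_progress" | "in_review" | "closed";
type Role = "polecat" | "refinery";

// For each current state, the allowed next states and who may trigger them.
const ALLOWED: Record<BeadStatus, Partial<Record<BeadStatus, Role[]>>> = {
  in_progress: { in_review: ["polecat"] }, // polecat finishes its work
  in_review: {
    closed: ["refinery"], // merged
    in_progress: ["refinery"], // sent back for rework
  },
  closed: {},
};

function canTransition(from: BeadStatus, to: BeadStatus, role: Role): boolean {
  const roles = ALLOWED[from][to];
  return roles !== undefined && roles.includes(role);
}
```

Keeping the rules in data makes it easy to reject a polecat's attempt to close a bead directly.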
Staged convoys — plan, review, and iterate before executing #1006 — Staged Convoys — gt_sling_batch with staged=true creates convoy + beads without hooking agents or dispatching. Users review the plan, iterate on beads (add/remove/edit), then gt_convoy_start to kick off execution.
Mayor edit capabilities + manual bead/agent/convoy editing #996 — Mayor Edit Capabilities + Manual Editing — Give the Mayor surgical edit tools: gt_bead_update (status, title, body, priority), gt_bead_reassign, gt_agent_reset, gt_convoy_close, gt_bead_delete, gt_escalation_acknowledge. Add PATCH endpoints for bead fields and agent status with user auth. Add edit controls to the dashboard UI (status dropdown, editable fields, unhook/reset buttons).
Polecats: Run quality gates before gt_done to self-validate #958 — Polecat Pre-Submission Gates — Inject the town's quality gates (townConfig.refinery.gates) into the polecat system prompt so polecats self-validate before calling gt_done. Catches lint/test/build failures in-session instead of wasting a refinery round-trip.
Cloud-native nudge system — reliable real-time message delivery to agents #1032 — Cloud-Native Nudge System — Reliable real-time message delivery to agents. Non-mayor agents enter “idle-but-available” state on session.idle instead of exiting, check for pending nudges, and stay alive to process them. Nudge queue in TownDO with wait-idle/immediate/queue delivery modes. gt_nudge tool for agents. Patrol functions use nudge instead of mail for time-sensitive messages.
Add gastown feature attribution for LLM usage tracking #1154 — Gastown Feature Attribution — Add gastown to FEATURE_VALUES and have the container send X-KILOCODE-FEATURE: gastown on all LLM requests. Currently all Gastown agent usage is unattributed (NULL) in microdollar_usage_metadata.
Bug: Shift+Enter doesn't insert newline in xterm.js terminal — Kitty protocol not supported #963 — Bug: Shift+Enter doesn't insert newline in xterm.js terminal — xterm.js 6.0.0 doesn't support the Kitty keyboard protocol. Shift+Enter sends \r (same as Enter) instead of ^[[13;2u. Quick fix: attachCustomKeyEventHandler to inject the CSI u sequence. Proper fix: upgrade xterm.js to 6.1+ or switch to ghostty-web.
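The quick fix in #963 might look like the handler below. `attachCustomKeyEventHandler` is the xterm.js hook named in the issue; the `makeShiftEnterHandler` factory, the `KeyLike` type, and the `send` callback are hypothetical names introduced so the logic can run without a browser.

```typescript
// Minimal structural type covering the KeyboardEvent fields used here.
interface KeyLike {
  type: string;
  key: string;
  shiftKey: boolean;
}

// CSI u encoding of Shift+Enter: ESC [ 13 ; 2 u
const SHIFT_ENTER_CSI_U = "\x1b[13;2u";

// Builds a handler suitable for term.attachCustomKeyEventHandler(...).
// `send` is whatever writes bytes to the PTY (an assumption in this sketch).
function makeShiftEnterHandler(send: (data: string) => void): (ev: KeyLike) => boolean {
  return (ev) => {
    if (ev.key === "Enter" && ev.shiftKey) {
      // Emit the sequence once, on keydown, and swallow the other event phases.
      if (ev.type === "keydown") send(SHIFT_ENTER_CSI_U);
      return false; // returning false tells xterm.js to skip its own handling
    }
    return true; // every other key goes through xterm.js as usual
  };
}
```

With a real terminal this would be wired up as `term.attachCustomKeyEventHandler(makeShiftEnterHandler((d) => socket.send(d)))`, where `term` and `socket` stand in for the dashboard's terminal instance and PTY WebSocket.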
Sort agents by status — active agents at the top #1145 — Sort agents by status — Agent cards on overview and rig pages sorted with working agents at top, then starting, idle, and exited/failed at bottom. Secondary sort by last_activity_at descending.
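The ordering in #1145 maps naturally onto a rank table plus a two-key comparator. In this sketch the `AgentCard` shape is an assumption, `last_activity_at` is assumed to be an epoch-millisecond number, and exited and failed agents are assumed to share the lowest rank.

```typescript
type AgentStatus = "working" | "starting" | "idle" | "exited" | "failed";

// Hypothetical card shape; only the fields needed for sorting.
interface AgentCard {
  name: string;
  status: AgentStatus;
  last_activity_at: number; // assumed epoch milliseconds
}

// Lower rank sorts first; exited and failed share the bottom tier.
const STATUS_RANK: Record<AgentStatus, number> = {
  working: 0,
  starting: 1,
  idle: 2,
  exited: 3,
  failed: 3,
};

function sortAgents(agents: AgentCard[]): AgentCard[] {
  // Copy first so the caller's array is not mutated.
  return [...agents].sort(
    (a, b) =>
      STATUS_RANK[a.status] - STATUS_RANK[b.status] ||
      b.last_activity_at - a.last_activity_at, // most recent activity first
  );
}
```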
Detect misdirected messages to polecats/refinery — redirect to Mayor #995 — Detect misdirected messages to polecats/refinery — Prompt-level heuristic that detects when a user message looks like it was meant for the Mayor (town status, work delegation, questions about other agents) and responds with a redirect. Prevents polecats from misinterpreting coordination messages as work instructions.
Track commits in bead history with SCM links #1003 — Track Commits in Bead History — Capture commits produced by agents per bead (on gt_done), store SHA/message/author/timestamp with SCM URLs. Bead detail panel shows commit history with clickable links to GitHub/GitLab.
Automatic merge conflict resolution on open PRs #1004 — Automatic Merge Conflict Resolution — Detect merge conflicts on open PRs via GitHub/GitLab API, dispatch agent to resolve conflicts, run gates, and push. Escalate complex conflicts.
Bead archival — clean up old beads without deleting #1005 — Bead Archival — Add archived_at timestamp to beads. Archived beads hidden from default views but preserved for history. Archive/unarchive via API, Mayor tool, and dashboard UI. Bulk archive for closed beads and completed convoys.
Parallel Refinery: Per-Convoy Refinery Agents #1073 — Parallel Refinery: Per-Convoy Refinery Agents — One refinery per active convoy for intermediate merges (polecat branch → feature branch), running in parallel across convoys. Rig singleton refinery stays for standalone beads and final landing merges to main. processReviewQueue() pops multiple items per tick and routes by convoy vs standalone. Configurable refinery.maxConvoyConcurrency.
Integrating with the Wasteland — Dolt sync + DoltHub onboarding #1040 — Integrating with the Wasteland — Dolt binary in the container, local MySQL server for fast queries, outbound sync (TownDO → Dolt → DoltHub) and inbound sync (Wasteland commons → Mayor). DoltHub OAuth onboarding, fork creation, PR submission to commons on bead completion. Full sync for existing towns via GastownDoltSyncDO. Incremental sync on cadence + push before eviction.
Parse agent status from H1 headers in output — replace gt_status tool #1307 — Parse Agent Status from H1 Headers — Replace the gt_status tool (which agents often forget to call) with automatic H1 parsing from the agent’s text output. Parse message.part.updated events in broadcastEvent(), extract first H1 line, post to existing updateAgentStatusMessage API. Status bubbles work automatically. Remove gt_status tool and prompt references.
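The H1 extraction step in #1307 could be as small as the function below. It is a sketch: `extractFirstH1` is a hypothetical name, only ATX-style headings (`# Title`) are recognized, and a `#` comment inside a fenced code block would be picked up as a heading, which a real implementation may want to guard against.

```typescript
// Returns the text of the first level-1 Markdown heading, or null if there is none.
function extractFirstH1(text: string): string | null {
  for (const line of text.split(/\r?\n/)) {
    // Exactly one '#', then whitespace, then the heading text ("##" does not match).
    const match = /^#\s+(.*\S)\s*$/.exec(line);
    if (match) return match[1] ?? null;
  }
  return null;
}
```

The result would then be posted as the agent's status message whenever a text part updates.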
Real-time throughput gauges and timeseries — tokens/sec, cost/sec, gas tank #1297 — Real-Time Throughput Gauges & Timeseries — Live gauges on town overview: tokens/sec, cost/sec, active agents, and a gas tank fuel gauge (balance → empty/full with time-remaining estimate). Timeseries charts on observability tab: token throughput, cost over time, agent utilization, bead velocity, review queue depth. Zoomable 1h/6h/24h/7d windows.
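The gauge arithmetic in #1297 is simple enough to sketch directly: sum usage over a sliding window to get a rate, then divide the balance by the burn rate for the gas-tank estimate. The `UsageSample` shape and both function names are assumptions.

```typescript
// Hypothetical usage record emitted per LLM request.
interface UsageSample {
  atMs: number; // epoch milliseconds
  tokens: number;
  costUsd: number;
}

// Average tokens/sec and cost/sec over the window (nowMs - windowMs, nowMs].
function throughput(samples: UsageSample[], nowMs: number, windowMs: number) {
  let tokens = 0;
  let costUsd = 0;
  for (const s of samples) {
    if (s.atMs > nowMs - windowMs && s.atMs <= nowMs) {
      tokens += s.tokens;
      costUsd += s.costUsd;
    }
  }
  const seconds = windowMs / 1000;
  return { tokensPerSec: tokens / seconds, costPerSec: costUsd / seconds };
}

// Time-remaining estimate for the gas tank; Infinity when nothing is being spent.
function secondsUntilEmpty(balanceUsd: number, costPerSec: number): number {
  if (costPerSec <= 0) return Infinity;
  return Math.max(0, balanceUsd) / costPerSec;
}
```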
Lazy agent assignment — scheduler assigns agents when beads are ready, not at creation #1249 — Lazy Agent Assignment — Move agent assignment from bead creation time to schedule time. Currently slingBead/slingConvoy eagerly create and hook agents to ALL beads including blocked ones. The scheduler should assign agents only when beads are unblocked and ready. Reduces wasted agent slots, preserves the name pool, enables serial convoy reuse, and makes staged/non-staged convoys consistent.
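The readiness check behind #1249 can be sketched as a filter the scheduler runs each tick: a bead qualifies only when it has no agent yet and every bead it depends on is closed. The `SchedulableBead` shape, its field names, and the `open` status are assumptions for illustration.

```typescript
// Hypothetical bead view used by the scheduler.
interface SchedulableBead {
  id: string;
  status: "open" | "in_progress" | "closed";
  assignee: string | null; // agent ID once hooked
  dependsOn: string[]; // IDs of beads that must close first
}

// Beads that should receive an agent on this scheduler tick.
function readyForAssignment(beads: SchedulableBead[]): SchedulableBead[] {
  const byId = new Map<string, SchedulableBead>();
  for (const b of beads) byId.set(b.id, b);
  return beads.filter(
    (b) =>
      b.status === "open" &&
      b.assignee === null &&
      // An unknown dependency counts as unresolved, so the bead stays blocked.
      b.dependsOn.every((dep) => byId.get(dep)?.status === "closed"),
  );
}
```

Because blocked beads never receive an agent, a serial convoy can reuse one agent slot as each dependency closes.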
Gastown bug reporting — issue template, UI link, and Mayor gt_report_bug tool #1248 — Gastown Bug Reporting — GitHub issue template for structured bug reports, "Report a Bug" link in the dashboard UI, and a gt_report_bug Mayor tool that searches existing issues for duplicates and files new ones with structured context (town ID, rig, recent errors). UI link is the fallback when the Mayor/container is unavailable.
Configurable terminal orientation (top/bottom/left/right) with drag-to-resize #1237 — Configurable Terminal Orientation + Resize — Terminal bar can be positioned at bottom/top/left/right with drag-to-resize. Page content adjusts dynamically (replacing static pb-[340px]). Collapsed state only reserves the tab bar strip. Position and size persisted to localStorage. Vertical orientation adapts tab bar, status pane, and DrawerStack.
Redesign Merge Queue page — split into Needs Attention + Refinery Activity Log #1173 — Redesign Merge Queue Page — Split into two sections: “Needs Your Attention” (open PRs, failed reviews, merge conflicts blocking on humans) and “Refinery Activity Log” (human-readable timeline of what bots did). Convoy reviews grouped under convoy headers. Every bead reference links to stackable drawer. New tRPC procedure returning full merge_request bead data.
Upstream branch cleanup — delete merged polecat and convoy branches #1174 — Upstream Branch Cleanup — Delete merged polecat and convoy branches from the remote. Refinery deletes source branch after successful merge (deterministic, via git manager). Convoy feature branches deleted after landing or close. Alarm-based sweep catches orphaned branches from crashed agents.
Bead failure reasons — explain why beads failed, not just that they failed #1172 — Bead Failure Reasons — 8 of 16 failure paths record no reason for why a bead failed. Add structured failure_reason (code, message, details, source) to every failure event. Fix completeReview() raw SQL paths to use updateBeadStatus(). Persist container’s reason param for polecat crashes (currently discarded). Render failure reasons in UI with red banner + expandable details. Stretch: LLM-written plain-language summaries for complex failures.
Phase 1: Core Loop + Mayor (COMPLETED ✅)
Foundation: Rig DO state machine, HTTP API, tool plugin, container with kilo serve, alarm scheduler, tRPC routes, basic dashboard, MayorDO.
kilo serve for Agent Management #305 — PR 5.5: Container — Adopt kilo serve
Phase 1.5: Close the Loop + Usable UI (NEW — Priority Sprint)
P0 — Must complete before product is usable:
/merge endpoint — the call 404s. The polecat→done→merge→closed loop is broken. Beads in the review queue sit forever.
P1 — High impact, required for decent UI:
Phase 2: Multi-Agent Orchestration (COMPLETED ✅)
All Phase 2 issues implemented in PR #389.
Known issues from Phase 2 testing:
Phase 2.5: Town-Centric Refactor
createOpencode() replaces Bun.spawn), WebSocket streaming (eliminate SSE + polling + tickets), per-agent AgentDO for event storage, config-on-request (eliminate stale container injection), proactive container/mayor startup. PR refactor(gastown): Town-centric refactor — Phase A: Consolidate control plane (#419) #421.
Phase 3: Multi-Rig + Scaling
direct (Refinery pushes to main) and pr (Refinery creates a GitHub/GitLab PR for human review) merge strategies. Configurable at town level with per-rig overrides. PR strategy includes quality gate results, agent attribution, and webhook-based status tracking. PR feat(gastown): configurable merge strategy (direct/PR) #701.
gt_rig_browse tool — Let the Mayor inspect rig code on demand. New container endpoint clones-on-demand or fetches-if-stale, returns file tree or file contents. Intermediate solution for post-restart empty workspaces (full recovery in [Gastown] PR 18: Container Resilience — Checkpoint/Restore #269).
Phase 4: Hardening
bead_dependencies rows. Full vertical slice needed: core functions with cycle detection, TownDO RPC methods, HTTP endpoints, gt_bead_add_dependency/gt_bead_remove_dependency tools, and depends_on param on gt_sling. Missed in feat(gastown): add mayor edit tools and manual editing UI #1076.
Git commit co-authorship — human as Author, agent as Committer #1001 — Git Commit Co-Authorship — Superseded by feat(gastown): GitHub OAuth Device Flow for user-scoped git identity and attribution #1350. Human as GIT_AUTHOR, agent as GIT_COMMITTER. User name/email resolved from Kilo user record and passed through dispatch. Co-authored-by and Bead-ID trailers in commit messages. GitHub contribution graphs credit the human.
Future Ideas
Closed/Superseded Issues
[Gastown] PR 10: Witness Alarm #217 — Witness Alarm — Folded into [Gastown] PR 5: Rig DO Alarm — Work Scheduler #212 (Rig DO Alarm). The witness patrol is part of the alarm handler.
[Gastown] PR 11: Mayor Agent #222 — Mayor Agent (old design) — Superseded by [Gastown] PR 8: MayorDO — Town-Level Conversational Agent #338 (MayorDO).