Overview
A collection of capabilities that are either impossible in local Gastown or dramatically better in the cloud. These are not prioritized for any specific phase — they're ideas to inform architecture decisions and inspire future work.
Parent: #204
Organization-Level Towns
Local Gastown is single-player — one human, one machine, one Mayor. The cloud makes towns collaborative. A Kilo org or team owns a town, and any developer in that organization can interact with it — talking to the Mayor, slinging work, watching agents on the dashboard.
Ownership model
Towns are owned by a Kilo organization, not an individual user. The org admin creates the town, connects repos via the org's GitHub/GitLab integrations, and configures settings (gates, merge strategy, agent concurrency). Any member of the org can:
Talk to the Mayor through their own chat interface
Sling beads and create convoys
Watch agent streams and bead progress on the dashboard
View the activity feed across all org members' work
The Mayor has context about what everyone is doing. "Don't touch the auth module, Sarah's convoy is refactoring it right now" becomes something the Mayor can say because it has the full picture. Each user's messages to the Mayor are attributed — the Mayor knows who is asking and can factor in their role/permissions.
Cost attribution: always to the human, never to a bot
When a developer kicks off work (slings a bead, asks the Mayor to do something), all resulting LLM usage (polecat token consumption, refinery reviews, triage agent cycles) is attributed to that user, not to the org generically or to a bot account. The Kilo gateway already routes all LLM calls — each call carries the userId of the human who initiated the work chain.
This means:
Org admins see per-user cost breakdowns: "John's beads cost $47 this week across 3 rigs"
Individual developers see their own usage in their Kilo account
Billing rolls up to the org, but attribution is per-human
No "bot user" abstraction that obscures who is actually consuming resources
The userId should propagate from the Mayor message or gt_sling call through to every downstream agent dispatch and LLM call in that work chain.
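A minimal TypeScript sketch of that propagation, assuming a hypothetical WorkChainContext shape, a startPolecat helper, and a gateway header name rather than the actual Gastown or Kilo gateway APIs:

// Hypothetical shape: every piece of work carries the initiating human's id.
interface WorkChainContext {
  userId: string;     // the human who kicked off the chain (never a bot account)
  beadId: string;
  convoyId?: string;
}

declare function startPolecat(args: { task: string; attribution: WorkChainContext }): Promise<void>;

// Agent dispatch forwards the context unchanged...
async function dispatchAgent(ctx: WorkChainContext, task: string): Promise<void> {
  await startPolecat({ task, attribution: ctx });
}

// ...and every LLM call sends it to the gateway so cost lands on the human.
async function callModel(ctx: WorkChainContext, prompt: string): Promise<string> {
  const res = await fetch("https://gateway.kilo.example/v1/chat", {   // illustrative URL
    method: "POST",
    headers: { "content-type": "application/json", "x-kilo-user-id": ctx.userId },
    body: JSON.stringify({ prompt, metadata: { beadId: ctx.beadId } }),
  });
  return res.text();
}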
Git commit dual attribution
Git commits made by agents should carry two authors: the human who initiated the work and the agent that wrote the code. This applies to both org-scoped and user-scoped Gastown.
Git supports this natively via Author and Committer fields, or via Co-authored-by trailers:
Author: John Smith <john@company.com>
Committer: Toast (Polecat) <toast@gastown.kilo.ai>
feat: add token refresh to auth middleware
Co-authored-by: Toast (Polecat) <toast@gastown.kilo.ai>
Requested-by: John Smith <john@company.com>
Bead-ID: gt-abc-123
The exact format is a design decision (Author/Committer split vs. trailers), but the requirements are:
Human attribution: GitHub/GitLab contribution graphs credit the human. PRs show the human as the author. The human's identity is the Author field.
Agent attribution: The agent identity is visible in the commit metadata. Which polecat, which model, which bead — all traceable.
Bead linkage: Each commit references the bead ID it was produced for, enabling dashboard → commit → code traceability.
This dual attribution is important for compliance (who authorized this code?), for contribution tracking (humans get credit for work they directed), and for debugging (which agent produced this commit?).
Fleet management
Extends to org-level fleet views — a VP of Engineering sees aggregate metrics across all towns (cost, velocity, quality), and can spot patterns like "the payments team's polecats have a 40% rework rate on the billing repo, but the platform team's are at 8%."
See plans/gastown-org-level-architecture.md for the full org-level spec.
Agent Personas and Smarter Convoy Dispatch
Two related ideas about rethinking how agents are assigned to work.
Serial convoy agent reuse
Currently, each bead in a convoy gets a distinct polecat — even when beads run in series (B depends on A, C depends on B). This means 3 agents for sequential work: each one clones the repo, rebuilds context, and tries to understand what the previous agent did. A single agent carrying context from A → B → C would be faster, cheaper, and produce more coherent work.
The fix is a dispatch strategy change: when the alarm dispatches the next bead in a dependency chain, prefer re-hooking the agent that just completed the predecessor bead (if it's idle) rather than grabbing a random idle polecat. The agent keeps its worktree, its git state, and its LLM context. For parallel beads within a convoy, separate agents are still needed since the work is simultaneous.
This doesn't require rethinking agent identity — it's purely about dispatch preference. The Mayor could also influence this: "run this convoy with a single agent in series" vs. "fan out to separate agents."
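A rough sketch of that preference in TypeScript; the Bead and Agent shapes and the idle-agent fallback are assumptions, not the actual TownDO dispatch code:

// Hypothetical dispatch step: prefer the agent that just finished the
// predecessor bead so it keeps its worktree, git state, and LLM context.
interface Bead { id: string; dependsOn?: string; completedBy?: string }
interface Agent { id: string; status: "idle" | "working" }

function pickAgentForBead(
  bead: Bead,
  beads: Map<string, Bead>,
  agents: Map<string, Agent>,
): Agent | undefined {
  const predecessor = bead.dependsOn ? beads.get(bead.dependsOn) : undefined;
  const previous = predecessor?.completedBy ? agents.get(predecessor.completedBy) : undefined;
  if (previous?.status === "idle") return previous;            // serial reuse
  return [...agents.values()].find((a) => a.status === "idle"); // fallback: any idle polecat
}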
Agent personas
All polecats today are identical — same system prompt, same tools, same capabilities. The 20-name pool (Toast, Shadow, Maple, etc.) is cosmetic. There's no capability differentiation between agents.
Personas add a capability dimension. A persona is a prompt template + optional configuration (tool restrictions, quality gate priorities) applied at dispatch time:
Agent = Polecat role + Persona (optional) + Identity (name, work history)
Personas are defined in town config (custom per town) or from built-in defaults
Each persona has: a name, a system prompt overlay, optional tool restrictions, optional quality gate overrides
When the Mayor slings a bead, it can specify a persona: "sling this security audit to a Pentester"
The dispatch system finds an idle polecat, applies the persona's prompt overlay, and dispatches
The same polecat can wear different personas across different beads — "Toast" is a Frontend Master on bead A, then a Backend Architect on bead B
Personas don't restrict the agent pool. You can have 5 agents all running with the "Frontend Master" persona simultaneously
The Mayor becomes the casting director — deciding which persona fits each bead based on the work description
This also composes with serial convoy reuse: a convoy for "refactor the auth module" could be one agent, one persona ("Backend Architect"), carrying context through the whole chain.
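As a sketch of what a persona definition in town config could look like (the field names are illustrative, not an existing schema):

// A persona is just config applied at dispatch time; the polecat identity stays the same.
interface Persona {
  name: string;                                    // e.g. "Backend Architect", "Pentester"
  promptOverlay: string;                           // appended to the base polecat system prompt
  allowedTools?: string[];                         // optional tool restrictions
  qualityGateOverrides?: Record<string, unknown>;  // optional per-persona gate tuning
}

function buildSystemPrompt(basePolecatPrompt: string, persona?: Persona): string {
  return persona ? `${basePolecatPrompt}\n\n${persona.promptOverlay}` : basePolecatPrompt;
}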
Agent Marketplace and Shared Formulas
Local Gastown's Mol Mall is a concept in the docs but barely implemented. The cloud can make this real — a hosted formula registry where users publish and install workflow templates. "Here's a formula for migrating a React codebase from class components to hooks" or "Here's a 12-step molecule for adding comprehensive test coverage to an untested module." Formulas become a product surface, not just an internal tool.
Beyond formulas, agent configurations become shareable — system prompts, quality gate configs, model selections that produce good results for specific kinds of work. "This polecat config with Claude Opus and this system prompt produces 95% first-pass merge rate on TypeScript refactors" becomes community knowledge.
Cross-Session Intelligence at Scale
Local Gastown's seance system lets an agent query one predecessor. The cloud has every agent event stored in AgentDOs. Build cross-agent, cross-session search: "Which agent last touched the payment processing module, and what did they change?" The cloud can answer this by querying across all agent event histories in a town. This is institutional memory that scales — every agent's work is indexed and searchable.
When a polecat is assigned work on a file, the system can automatically surface relevant context from previous agents who touched that file. "Toast worked on this module 3 days ago and left a note about a tricky race condition in the checkout flow." The polecat gets this in its prime context without anyone asking for it. Local Gastown can't do this because session histories aren't centrally indexed.
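A sketch of that lookup against a hypothetical cross-agent event index (none of these functions exist today):

// Find the most recent agent event touching a file, across every AgentDO in the town,
// so it can be surfaced in a new polecat's prime context.
interface AgentEvent { agentName: string; beadId: string; file: string; note?: string; at: number }

declare function queryAgentEvents(filter: { file: string }): Promise<AgentEvent[]>;

async function lastAgentForFile(file: string): Promise<AgentEvent | undefined> {
  const events = await queryAgentEvents({ file });
  return events.sort((a, b) => b.at - a.at)[0];    // newest first
}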
Real-Time Cost Attribution and Budget Controls
The cloud routes all LLM calls through the Kilo gateway, which means per-bead cost tracking in real time. "This convoy cost $47 across 12 beads and 3 rigs." Local Gastown tracks costs retroactively via gt costs record on session stop, but the cloud can do it live — show a cost ticker on the dashboard that updates as agents stream tokens.
This enables budget guardrails: "This rig has a $200/day budget. When polecats approach the limit, stop slinging new work and notify the Mayor." Or per-convoy budgets: "This refactoring convoy is capped at $500. If we're at $450 and 3 beads are still open, escalate to the human to decide if it's worth continuing." Local Gastown has no concept of cost limits — agents run until the work is done or the human intervenes.
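A sketch of the guardrail check before slinging new work; getSpend and notifyMayor are stand-ins for whatever the gateway and TownDO end up exposing:

interface Budget { limitUsd: number }

declare function getSpend(scopeId: string): Promise<number>;       // live total from the gateway
declare function notifyMayor(scopeId: string, message: string): Promise<void>;

// Hypothetical pre-dispatch check: pause new work near the limit and escalate
// instead of silently running past the budget.
async function canSlingMoreWork(scopeId: string, budget: Budget): Promise<boolean> {
  const spentUsd = await getSpend(scopeId);
  if (spentUsd < budget.limitUsd * 0.9) return true;
  await notifyMayor(scopeId, `Spend is $${spentUsd} of a $${budget.limitUsd} budget; pausing new slings.`);
  return false;
}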
Speculative Execution and A/B Testing
With the container model, sling the same bead to multiple polecats with different configurations — different models, different system prompts, different temperatures — and let the Refinery pick the best result. "Give this task to three polecats: one with Opus, one with Sonnet, one with Gemini. Merge whichever passes quality gates first, discard the rest."
This turns agent orchestration into an experimentation platform. Over time, the system accumulates data: "For TypeScript refactoring tasks, Sonnet 4.6 produces first-pass merges 82% of the time at $0.40/bead. Opus produces 94% at $2.10/bead. For this task complexity, Sonnet is the better value." The cloud can make this recommendation automatically because it has the data. Local Gastown can compare models via the gt-model-eval framework, but it's manual and offline.
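A sketch of the fan-out, with slingBead, awaitFirstPassingGates, and discardAttempts standing in for real dispatch and Refinery hooks:

// Hypothetical speculative dispatch: same bead, several configurations,
// keep whichever result clears quality gates first.
const candidateConfigs = [
  { model: "anthropic/claude-opus" },
  { model: "anthropic/claude-sonnet" },
  { model: "google/gemini" },
];

declare function slingBead(beadId: string, cfg: { model: string }): Promise<string>;   // resolves to a branch name
declare function awaitFirstPassingGates(branches: Promise<string>[]): Promise<string>;
declare function discardAttempts(branches: Promise<string>[], winner: string): Promise<void>;

async function speculativeSling(beadId: string): Promise<string> {
  const attempts = candidateConfigs.map((cfg) => slingBead(beadId, cfg));
  const winner = await awaitFirstPassingGates(attempts);
  await discardAttempts(attempts, winner);
  return winner;
}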
Webhook-Driven Beads
The cloud sits behind a web server with existing GitHub/GitLab webhook infrastructure. GitHub issues can automatically become Gastown beads. A new PR from an external contributor triggers a review bead. A failing CI run creates an escalation. A Slack message creates a bead. The Mayor sees it all in its inbox and can decide to act.
Local Gastown is pull-only — agents check for work on their hooks. The cloud can be push-driven, reacting to external events in real time. A GitHub issue labeled gastown auto-creates a bead, the Mayor assesses it, slings it if appropriate, and the human who filed the issue sees a comment: "A polecat is working on this. Track progress at [dashboard link]."
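A sketch of the push-driven path as a Worker handler; the payload fields follow GitHub's issues webhook, while createBead and the label name are assumptions:

// Hypothetical webhook route: a GitHub issue labeled "gastown" becomes a bead
// the Mayor can assess on the next tick.
declare function createBead(args: { title: string; source: string }): Promise<void>;

export default {
  async fetch(req: Request): Promise<Response> {
    if (req.method !== "POST") return new Response("method not allowed", { status: 405 });
    const event = (await req.json()) as {
      action: string;
      label?: { name: string };
      issue?: { title: string; html_url: string };
    };
    if (event.action === "labeled" && event.label?.name === "gastown" && event.issue) {
      await createBead({ title: event.issue.title, source: event.issue.html_url });
    }
    return new Response("ok");
  },
};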
Persistent Agent Reputation
Local Gastown tracks agent CVs (work history per polecat identity), but it's per-machine and per-town. The cloud can build cross-town agent reputation at the platform level. Not just "Toast completed 47 beads" but "Across all Kilo Cloud Gastown towns, Claude Opus polecats with the standard polecat prompt have a 91% first-pass merge rate on Python repos, dropping to 73% on Rust repos." This becomes a data product — model performance benchmarks derived from real production work across the entire user base.
Always-On with Zero Maintenance
The most obvious but most impactful: the cloud town is always running. No laptop that needs to stay open, no tmux sessions to babysit, no gt mayor attach to check on things. Tell the Mayor to refactor a module at 5 PM, close your laptop, and check the dashboard from your phone at dinner. The convoy landed, the PR is merged, the tests pass. The entire watchdog chain (alarm-driven health checks, triage agents for ambiguous situations) runs without any human presence.
Local CLI Bridge (Stretch Goal)
Connect your local Kilo CLI to your cloud Gastown instance. kilo gastown connect <town-url> authenticates and registers your local instance as a crew-like agent in the cloud town. You get all the coordination benefits (beads, mail, identity, attribution, convoy tracking) while running locally with full filesystem access and your own dev environment. Your local session appears in the cloud dashboard as an active agent.
This bridges "fully cloud-hosted" and "I want to work locally but with cloud coordination." The tool plugin's HTTP API surface is the same whether the agent runs in a Cloudflare Container or on someone's laptop.
SolidJS Dashboard: Native Integration with Kilo's Web UI
The Gastown dashboard currently lives in the Next.js (React) app. The Kilo desktop/web app (packages/app/) is built with SolidJS — including all the session UI, message rendering, tool execution cards, diff viewer, file tabs, terminal panel, and prompt dock. These are mature, production-quality components that represent significant engineering investment.
A future direction is to migrate the Gastown dashboard to SolidJS and serve it directly from the gastown Cloudflare Worker, making the UI a first-class part of the gastown service rather than a feature embedded in the main Next.js app. This enables:
Native reuse of Kilo's UI components. The session viewer, message timeline, tool execution cards, diff viewer, and terminal panel from @opencode-ai/ui and packages/app/ can be used directly — no framework bridging, no React ports, no iframe isolation. The same <SessionTurn>, <BasicTool>, <Code>, and <Diff> components that render the desktop app render the cloud agent streams.
Two interaction modes for agents. The xterm.js terminal (Phase 2.5) provides raw PTY access to agent sessions. The SolidJS session viewer provides the rich, structured view — markdown message bubbles, expandable tool call cards with input/output, inline diffs, file tabs. Both show the same agent session. Users pick the view that suits their workflow. The terminal is power-user/debugging; the session viewer is the polished product experience.
Independent deployment. The gastown dashboard deploys with the gastown worker on its own release cycle, decoupled from the main Next.js app. UI changes to the gastown experience don't require rebuilding and deploying the entire Kilo web app.
Server connection is already compatible. The Kilo app connects to kilo serve via HTTP REST + SSE (via @kilocode/sdk). The cloud container runs kilo serve instances. The SDK client is framework-agnostic. The SolidJS dashboard would use the same createOpencodeClient() to connect to agent sessions through the gastown worker proxy — the same code path, just routed through a different network layer.
What this looks like architecturally:
Browser
│
├── Main Kilo App (Next.js/React) — account, billing, integrations, cloud agent
│   └── /gastown/* routes → redirect to gastown UI
│
└── Gastown UI (SolidJS, served from gastown worker)
    ├── Town dashboard, rig workbench, bead board, convoy tracker
    ├── Mayor chat (using Kilo app's session components)
    ├── Agent stream viewer (using Kilo app's message timeline)
    ├── Agent terminal (xterm.js — kept from Phase 2.5)
    └── All backed by gastown worker API + TownDO
The main Next.js app handles everything outside of Gastown (account management, billing, integrations, standalone cloud agent sessions). Gastown owns its own UI surface, served from its own worker, using the same component library as the rest of Kilo.
This is a significant architectural shift — not a near-term task. It depends on the Kilo app's component library being extractable as a standalone SolidJS package, and on the gastown worker being able to serve static assets (or using Cloudflare Pages alongside the worker). But it's the natural end state: the gastown UI is a SolidJS app that shares components with the Kilo desktop/web app, giving both the same quality of agent interaction rendering.
UI-Aware Mayor: Dashboard Context Injection
The Mayor should know what the user is looking at. When the user is staring at a failed bead, the Mayor shouldn't need to be told "bead gt-abc failed" — it should already know, because the dashboard is feeding the user's navigation context into the Mayor's session.
How it works
The dashboard client maintains a user activity stream — a rolling log of what the user has done and what they're currently viewing. On every Mayor message submission, this context is injected as a system block alongside the user's message:
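For illustration, the injected block might look something like this (the field names are assumptions, not a finalized schema):

// Illustrative context payload attached to a Mayor message.
const uiContext = {
  currentView: "rig-detail",
  viewing: { type: "bead", id: "gt-abc", status: "failed" },
  recentActivity: [
    { action: "opened bead gt-abc", secondsAgo: 40 },
    { action: "viewed rig frontend", secondsAgo: 95 },
  ],
};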
The Mayor sees this context and can respond intelligently without the user having to explain: "I see you're looking at bead gt-abc which failed because of a test failure in auth.test.ts. The agent Toast that was working on it is dead. Would you like me to re-sling this to a new polecat, or should I look at the test failure first?"
What gets tracked
Current page/view: from the router location (e.g. rig-detail, town-home, convoy-detail)
Currently viewed object: from the active slide-over / detail panel (a bead, agent, convoy, or escalation)
Recent navigation: from the page view history (last 10-15 actions, 30 min window)
Implementation
The frontend maintains a bounded circular buffer of UserActivityEvent objects; a sketch of the event shape is shown below. When sending a Mayor message, the frontend includes the current context and recent activity in a context field on the message payload. The TownDO resolves the referenced objects (fetches current bead status, agent state, etc.) and injects the resolved context into the Mayor's prompt as a system block.
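A possible shape for those events, with illustrative field names:

interface UserActivityEvent {
  at: number;                                                   // epoch ms
  kind: "page_view" | "open_detail" | "action";
  view: string;                                                 // e.g. "rig-detail", "convoy-detail"
  objectRef?: { type: "bead" | "agent" | "convoy" | "escalation"; id: string };
}

// Rolling window: keep roughly the last 10-15 events within a ~30 minute window.
const MAX_EVENTS = 15;
const MAX_AGE_MS = 30 * 60 * 1000;

function pruneActivity(events: UserActivityEvent[], now: number): UserActivityEvent[] {
  return events.filter((e) => now - e.at < MAX_AGE_MS).slice(-MAX_EVENTS);
}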
Progressive richness
The context injection can start simple and get richer over time:
Phase 1: Current page + currently viewed object (just the bead/agent/rig the user has open)
Phase 3: Inferred intent — "the user has been looking at failed beads for the last 5 minutes, they're probably trying to diagnose a problem"
Phase 4: Proactive Mayor suggestions — the Mayor notices the user's pattern and offers help before they ask. "I see you've been looking at 3 failed beads in the frontend rig. They all failed on the same test. Want me to investigate the common root cause?"
Why this matters
Without UI context, the Mayor is a blank-slate chatbot that requires the user to explain everything. With UI context, the Mayor becomes a copilot that shares the user's visual field. The conversation goes from:
User: "The auth fix failed, can you look at bead gt-abc and re-sling it?"
to:
User: "Can you fix this?" Mayor: "The bead you're viewing (gt-abc: Fix token refresh) failed because auth.test.ts:42 expected a 401 but got a 500. Toast, the agent that worked on it, is dead. I'll create a new bead with the test failure context and sling it to a fresh polecat. The new agent will know exactly what went wrong."
The user's dashboard becomes a shared workspace where the Mayor can see what the user sees.
Agent Status Bubbles: Live Progress Updates
Agents periodically emit short, plain-language status updates — one or two sentences describing what they're currently doing. "Installing packages," "Setting up Vite," "Creating the main UI components," "Running tests — 3 failures to fix." These aren't debug logs or SDK events — they're human-readable summaries of intent, written by the agent for the user.
Where they show up
Agent cards: A chat-bubble-style tooltip above the agent card on the rig view. Shows the most recent status. Fades/rotates as new updates arrive.
Activity feed: Appears as a timeline entry with the agent's name and avatar: "Toast: Connecting the frontend UI to the backend for the new todo app." Interleaved with bead events, convoy progress, and Mayor messages.
Bead detail panel: Status history for the assigned agent, giving a narrative of how the work progressed.
How it works
A new tool gt_status (or an implicit periodic prompt injection) lets agents emit a status string. The status is lightweight — just a string + timestamp, stored as a bead event or a dedicated agent_status field on the agent metadata. The container streams it to the dashboard via the existing WebSocket event sink.
The agent's system prompt includes guidance: "Periodically call gt_status with a brief, user-friendly description of what you're doing. Write it as if you're telling a teammate — not a log line, not a stack trace. One sentence. Do this when you start a new phase of work, not on every tool call."
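A sketch of the container-side handler; saveAgentStatus and emitToEventSink are placeholders for the real persistence and WebSocket sink calls:

// Hypothetical gt_status handler: persist the latest status and stream it
// to the dashboard over the existing event sink.
interface StatusUpdate { agentId: string; text: string; at: number }

declare function saveAgentStatus(update: StatusUpdate): Promise<void>;   // bead event / agent_status field
declare function emitToEventSink(update: StatusUpdate): Promise<void>;   // WebSocket push to the dashboard

async function gtStatus(agentId: string, text: string): Promise<void> {
  const update: StatusUpdate = { agentId, text, at: Date.now() };
  await saveAgentStatus(update);
  await emitToEventSink(update);
}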
Why this matters
Right now, watching agents work means either opening the raw terminal stream (noisy, technical) or waiting for the bead to close (no intermediate visibility). Status bubbles give users the feeling of a team working — you glance at the dashboard and see "Maple is writing tests for the API endpoints" and "Shadow is fixing the lint errors from the previous commit." It makes the town feel alive without requiring users to dig into agent streams.
Code Diffs Linked to Beads
Every bead that results in code changes should have its diff viewable directly from the dashboard. When a polecat works on a bead and pushes a branch, the diff between the branch and its base (main) is the artifact of that work. It should be a first-class, clickable object on the bead detail panel — not something you have to go to GitHub to find.
How it works
When a polecat calls gt_done and pushes its branch, the TownDO records the branch name and base commit on the bead (or its review_metadata satellite). The dashboard can then fetch and render the diff on demand.
Two sources for the diff data:
Container git: The container's git manager can run git diff main...<branch> in the rig's repo and return the result via a new API endpoint (GET /agents/:agentId/diff or GET /git/diff?rig=X&branch=Y). This works while the container is running and the branch exists locally.
GitHub/GitLab API: For diffs that outlive the container session (or when the container has slept), fetch the diff via the platform API using the org's integration token. GET /repos/{owner}/{repo}/compare/{base}...{head} on GitHub returns the full diff. This is the durable path — it works as long as the branch exists on the remote.
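A sketch of the durable path against GitHub's compare endpoint (token plumbing and response typing simplified):

// Fetch the diff for a bead's branch after the container is gone.
// GET /repos/{owner}/{repo}/compare/{base}...{head} returns per-file patches.
async function fetchBeadDiff(owner: string, repo: string, base: string, head: string, token: string) {
  const res = await fetch(
    `https://api.github.com/repos/${owner}/${repo}/compare/${base}...${head}`,
    { headers: { authorization: `Bearer ${token}`, accept: "application/vnd.github+json" } },
  );
  if (!res.ok) throw new Error(`GitHub compare failed: ${res.status}`);
  const data = (await res.json()) as {
    files?: { filename: string; additions: number; deletions: number; patch?: string }[];
  };
  return data.files ?? [];
}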
What the user sees
On the bead detail panel, a "Diff" tab appears when the bead has associated code changes:
Expandable per-file unified diff with syntax highlighting
Click a file to see the full diff for that file
For beads that went through the Refinery and merged, the diff shows what landed on main (the squash-merge commit). For beads still in progress, the diff shows the current state of the polecat's branch vs main.
What the Mayor sees
When the Mayor checks on a bead or convoy, diff summaries can be included in the context: "Toast's bead changed 4 files (+120 -45): src/auth.ts, src/middleware.ts, test/auth.test.ts, package.json." This gives the Mayor a sense of the scope and nature of the work without reading the full diff.
Convoy-level diffs
At the convoy level, aggregate the diffs across all tracked beads: total files changed, total lines added/removed, and a merged file list. "The auth refactor convoy touched 12 files across 2 rigs, +340 -180." This is the high-level summary for the user who wants to know the blast radius of a batch of work.
Automated PR Review Response: Agents Address Human Feedback
When using the pr merge strategy, the TownDO already polls open PRs for merge/close status (#473). Extend this to detect new review comments and dispatch agents to address them — fix the code, push updates, reply to threads, and resolve conversations.
The loop
TownDO alarm polls open PRs (already exists in pollPendingPRs()). Extend to also fetch review comments via the GitHub/GitLab API.
New comments detected — for each unaddressed review comment (not authored by Gastown, not already resolved), the TownDO creates a sub-bead of the original work bead with the comment context: file path, line range, comment body, reviewer name.
Agent dispatched — the sub-bead is slung to a polecat (or the Refinery itself, since it already has the review context and worktree). The agent's prompt includes the full comment, the file context, and instructions to either fix the issue or respond explaining why no change is needed.
Agent works — reads the comment, examines the code, makes the fix (or determines no fix is needed), commits and pushes to the same PR branch.
Agent replies — via the GitHub/GitLab API, posts a reply to the review thread explaining what was done: "Fixed: moved the null check before the destructuring as suggested" or "This is intentional — the fallback handles the null case on line 47. Marking as resolved."
Agent resolves the thread — calls the resolve conversation API (GitHub: POST /graphql mutation resolveReviewThread, GitLab: PUT /discussions/:id/resolve).
Cycle repeats — if the reviewer posts more comments, the next alarm tick picks them up and the loop continues.
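A sketch of the detection step layered onto the poll loop; listReviewComments, hasSubBeadForComment, and createSubBead stand in for the real platform-pr.util.ts and TownDO calls:

interface ReviewComment { id: number; author: string; path: string; line?: number; body: string }

declare function listReviewComments(prNumber: number): Promise<ReviewComment[]>;
declare function hasSubBeadForComment(parentBeadId: string, commentId: number): Promise<boolean>;
declare function createSubBead(args: { parentBeadId: string; title: string; metadata: Record<string, unknown> }): Promise<void>;

// Hypothetical extension of the PR poll: each unaddressed human review comment
// becomes a sub-bead carrying the thread context for the dispatched agent.
async function detectReviewWork(prNumber: number, parentBeadId: string, botLogin: string): Promise<void> {
  for (const c of await listReviewComments(prNumber)) {
    if (c.author === botLogin) continue;                              // skip our own replies
    if (await hasSubBeadForComment(parentBeadId, c.id)) continue;     // already tracked
    await createSubBead({
      parentBeadId,
      title: `Address review comment on ${c.path}`,
      metadata: { commentId: c.id, path: c.path, line: c.line, body: c.body, reviewer: c.author },
    });
  }
}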
Comment classification
Not every comment needs a code fix. The agent should classify each comment:
Code fix request ("this should use ?. instead of !"): fix the code, push, reply with what changed, resolve.
Question ("why did you use X instead of Y?"): reply with explanation, resolve.
Nitpick / style ("prefer const over let here"): fix it (low effort), push, resolve.
Architectural concern ("this should be in a separate service"): respond acknowledging the feedback. If the change is within scope, fix it. If it's a larger concern, escalate to the Mayor with context.
Approval / positive comment ("LGTM", "nice"): no action needed, skip.
What the agent sees in its prompt
You are addressing review feedback on PR #42 for bead "Fix token refresh".
Review comment by @sarah (on src/auth.ts, lines 15-20):
> This function doesn't handle the case where the token is expired
> but the refresh endpoint is down. We should return a 503 here
> instead of silently failing.
The relevant code:
[file context with surrounding lines]
Your task:
1. Fix the issue described in the comment
2. Commit and push to the PR branch
3. Reply to the review thread explaining your fix
4. Resolve the conversation
If the comment is a question or doesn't require a code change, reply
with an explanation and resolve the thread. If the request requires
changes beyond the scope of this bead, escalate to the Mayor.
Tracking
Each review comment creates a sub-bead linked to the MR bead via parent_bead_id. The sub-bead's metadata stores the comment ID, thread ID, file path, and line range. When the agent resolves the thread, the sub-bead closes. The MR bead's progress reflects how many review comments have been addressed.
The dashboard shows review comment sub-beads in the MR bead's detail drawer — each comment with its status (open/addressed/resolved) and the agent's reply.
Platform API calls needed
GitHub:
GET /repos/{owner}/{repo}/pulls/{number}/reviews — list reviews
GET /repos/{owner}/{repo}/pulls/{number}/comments — list review comments
POST /repos/{owner}/{repo}/pulls/{number}/comments/{id}/replies — reply to thread
resolveReviewThread(input: { threadId }) GraphQL mutation — resolve conversation
GitLab:
GET /projects/{id}/merge_requests/{iid}/discussions — list discussions
POST /projects/{id}/merge_requests/{iid}/discussions/{discussion_id}/notes — reply
PUT /projects/{id}/merge_requests/{iid}/discussions/{discussion_id}?resolved=true — resolve
Building on existing infrastructure
The platform-pr.util.ts from #473 already has parseGitUrl(), GitHub/GitLab auth patterns, and Zod response parsing. The comment fetching and reply posting would be natural extensions of the same module. The pollPendingPRs() method in TownDO already iterates open PRs on every alarm tick — adding comment detection is an additional API call per open PR in the same loop.
Wasteland Onboarding + Dolt Integration
Make Gastown by Kilo towns visible to the Wasteland federation by syncing town state to a user-provided DoltHub database. The Wasteland expects Dolt databases — our internal storage is DO SQLite. Rather than replacing or mirroring the entire DO, we sync at the boundary: emit Wasteland-compatible records (beads, completions, agent identity) to DoltHub so the town is a queryable participant in the federation.
Design: Event-sourced sync, not database replication
The TownDO remains the source of truth for all real-time orchestration. Dolt is the town's "public face" — the subset of state that the Wasteland cares about, published on a cadence.
What gets synced to Dolt:
Bead created/updated/closed → wanted posters and completions
What does NOT get synced:
Triage requests, mail delivery state, escalation re-escalation counters
Sync mechanisms
Incremental sync (steady-state): The TownDO alarm loop (or a slower-cadence task, e.g., every 5 minutes) batches recent bead_events since the last sync cursor, translates them into Dolt-schema inserts, and pushes a commit to DoltHub via their SQL or HTTP API. This is append-mostly — new beads, status transitions, completions. Dolt's versioning gives history for free.
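A sketch of one incremental pass; the cursor key, batch shape, and Dolt push helper are assumptions:

// Hypothetical incremental sync: translate bead events since the last cursor into
// Wasteland-schema rows and advance the cursor only after the Dolt push succeeds.
declare function loadBeadEventsSince(seq: number): Promise<{ seq: number; payload: Record<string, unknown> }[]>;
declare function toWastelandRow(event: { seq: number; payload: Record<string, unknown> }): Record<string, unknown>;
declare function pushDoltCommit(rows: Record<string, unknown>[]): Promise<void>;

async function incrementalDoltSync(storage: DurableObjectStorage): Promise<void> {
  const cursor = (await storage.get<number>("dolt_sync_cursor")) ?? 0;
  const events = await loadBeadEventsSince(cursor);
  if (events.length === 0) return;
  await pushDoltCommit(events.map(toWastelandRow));
  await storage.put("dolt_sync_cursor", events[events.length - 1].seq);
}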
Full sync (Wasteland onboarding for existing towns): When a user connects an existing town to the Wasteland for the first time, they have historical beads, convoys, and agent records that predate the Dolt integration. A full sync publishes this backlog.
Full syncs are handled by a dedicated GastownDoltSyncDO (Durable Object, keyed by {townId}:{syncId}). The flow:
User completes Wasteland onboarding in the UI — provides DoltHub org/repo and API token
TownDO creates a GastownDoltSyncDO instance for the initial full sync
The sync DO queries the TownDO for all relevant historical data (closed beads, landed convoys, agent identities) in paginated batches
An alarm loop on the sync DO processes batches over time — pushing commits to DoltHub in chunks, tracking progress via a cursor, re-arming until complete
On completion, the sync DO marks itself done and the TownDO switches to incremental sync mode
This avoids blocking the TownDO alarm with a potentially large historical export. The sync DO is a one-shot worker that runs to completion and then goes dormant. Future full syncs (e.g., if the user resets their Dolt database) create a new sync DO instance.
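A sketch of the sync DO's alarm loop; the batch size, cursor key, and historical query are assumptions about the eventual implementation:

declare function fetchHistoricalBatch(cursor: string, limit: number): Promise<{ rows: Record<string, unknown>[]; nextCursor?: string }>;
declare function pushDoltCommit(rows: Record<string, unknown>[]): Promise<void>;

// Hypothetical GastownDoltSyncDO: each alarm tick exports one batch, records
// progress via a cursor, and re-arms until the backlog is fully published.
export class GastownDoltSyncDO {
  constructor(private readonly state: DurableObjectState) {}

  async alarm(): Promise<void> {
    const cursor = (await this.state.storage.get<string>("cursor")) ?? "";
    const batch = await fetchHistoricalBatch(cursor, 500);
    if (batch.rows.length > 0) await pushDoltCommit(batch.rows);
    if (batch.nextCursor) {
      await this.state.storage.put("cursor", batch.nextCursor);
      await this.state.storage.setAlarm(Date.now() + 30_000);   // keep processing
    } else {
      await this.state.storage.put("done", true);               // switch TownDO to incremental mode
    }
  }
}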
User setup flow
User navigates to town settings → Wasteland section
User creates a DoltHub database (or provides an existing one) — org, repo name, API token
We initialize the Dolt schema (beads table, agents table, completions table — matching the Wasteland's expected structure)
If the town has existing data, a full sync is triggered via GastownDoltSyncDO
Once the full sync completes, incremental sync begins automatically
The town is now visible to the Wasteland federation
Inbound Wasteland work (stretch)
For the town to consume work from the Wasteland (not just publish), a poll-or-webhook mechanism reads wanted posters from the Wasteland's coordination database and creates beads in the TownDO. The Mayor could get a gt_wasteland_check tool that surfaces available federation work. This is a separate effort from the outbound sync described above.
Key architectural points
DO SQLite is the source of truth. Dolt is a downstream projection, not a primary store. If Dolt is unavailable, the town operates normally — sync catches up when connectivity returns.
GastownDoltSyncDO isolates sync work. Batching, retries, and progress tracking don't pollute the TownDO's alarm loop. Each sync is a self-contained DO with its own alarm cadence.
Dolt schema matches Wasteland expectations. We don't invent our own Dolt schema — we implement whatever table structure the Wasteland protocol expects, translating from our internal bead model on the way out.
Sync is idempotent. Re-running a full sync or replaying incremental events produces the same Dolt state. Bead IDs are stable UUIDs.
Kiloclaw Integration
Sub-issues
Kiloclaw as Meta-Mayor (#1290) — a Kiloclaw instance as a fleet coordinator across multiple Gastown towns. Connects to each town via a Gastown MCP server, can query bead boards, create convoys, send messages to individual town Mayors, track cross-town dependencies, and report on fleet-wide health and spend. The VP of Engineering AI.
Other Kiloclaw integration opportunities (not yet scoped)
Kiloclaw as Mayor — Replace a single town's built-in Mayor with a Kiloclaw instance. Gets persistent memory (solves the container restart context loss problem), custom MCP tools (Slack, Jira, Linear), and user-customizable behavior via custom instructions. Integration: route Mayor messages to Kiloclaw instead of the container; Kiloclaw uses the Gastown MCP server to call gt_sling, gt_list_beads, etc.
Kiloclaw as Polecat — A Kiloclaw instance doing coding work. Benefits: persistent codebase knowledge across sessions, custom MCP servers for test infrastructure. Challenge: polecats need direct filesystem access (git worktrees, shell) — Kiloclaw would need MCP access to the container's filesystem or use the GitHub/GitLab API for code changes.
Kiloclaw as Refinery — Persistent knowledge of codebase quality standards (learned across reviews), MCP tools for external quality checks (SonarQube, security scanners), org-specific review standards from custom instructions.
Kiloclaw as Triage Agent — Persistent memory of past triage decisions. "Last time this agent OOM'd, I increased the memory limit and it worked." Institutional memory for operational decisions.
Shared prerequisite: Gastown MCP Server
All Kiloclaw integrations depend on a Gastown MCP server package that exposes town operations as MCP tools. This is independently useful — any MCP-compatible agent (not just Kiloclaw) could use it to interact with Gastown towns.
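A sketch of how such a server might expose one town operation, using the TypeScript MCP SDK; the tool name mirrors gt_sling, but the Gastown HTTP endpoint and env vars are assumptions:

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

// Hypothetical Gastown MCP server: town operations become MCP tools that
// Kiloclaw (or any MCP client) can call.
const server = new McpServer({ name: "gastown", version: "0.1.0" });

server.tool(
  "gt_sling",
  { rig: z.string(), title: z.string(), description: z.string() },
  async ({ rig, title, description }) => {
    const res = await fetch(`${process.env.GASTOWN_API_URL}/beads`, {   // assumed endpoint
      method: "POST",
      headers: {
        "content-type": "application/json",
        authorization: `Bearer ${process.env.GASTOWN_TOKEN}`,           // assumed auth
      },
      body: JSON.stringify({ rig, title, description }),
    });
    return { content: [{ type: "text" as const, text: await res.text() }] };
  },
);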