perf(gastown): Reduce alarm loop latency for active towns #1428

## Parent

Part of #204 (Phase 4: Hardening)

## Problem

The Town DO alarm loop introduces noticeable latency in the UI. Mayor-slung beads take seconds to appear, agent hooks are slow to transition beads to `in_progress`, and PR merges take multiple alarm cycles to propagate. The root causes:

1. **`armAlarmIfNeeded()` doesn't bump a pending idle alarm.** When a town is idle (alarm set 60s in the future) and new work arrives, the method checks `if (!current || current < Date.now())` — if an alarm is pending, it does nothing. Work waits up to 60s for the existing idle alarm to fire. Affects `slingConvoy()`, `agentDone()`, `agentCompleted()`, `submitToReviewQueue()`, and all other callers.

2. **Active interval (5s) runs every phase on every tick.** Bead assignment, event drain, PR polling, patrol, GC, mail delivery, and status broadcast all run every 5s. Most of this work is unnecessary most ticks, but the bundling means the fast stuff (event drain + reconcile + dispatch) can't run faster without also running the slow stuff more often.

3. **PR polling has no rate limiting.** Every in-progress MR bead with a `pr_url` hits the GitHub API every 5s. Assuming one API call per poll, a town with 5 open PRs makes ~60 API calls/minute against GitHub's 5000/hour (~83/min) authenticated rate limit. That is roughly 72% of the budget, leaving little headroom for other consumers sharing the token.

4. **Event-driven work routes through the poll loop.** `agentDone` inserts a `town_event` → waits for alarm → Phase 0 drains → Phase 1 reconciles → Phase 2 side effects. That's 5-10s of latency for an operation that already has full context at call time. Similarly, `slingConvoy` creates beads but relies on the reconciler to assign agents on the next tick.
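
The API budget arithmetic in item 3 can be checked with a quick sketch. The helper name is hypothetical, and it assumes one GitHub API call per PR per poll, which the description implies but does not state.

```typescript
// Hypothetical helper: estimated GitHub API calls per minute caused by PR polling.
// Assumption: each poll of one PR costs exactly one API call.
function pollCallsPerMinute(openPrs: number, pollIntervalMs: number): number {
  return openPrs * (60_000 / pollIntervalMs);
}

const GITHUB_HOURLY_LIMIT = 5000; // authenticated limit cited in item 3
const budgetPerMinute = GITHUB_HOURLY_LIMIT / 60; // about 83 calls/min

// Today: 5 open PRs polled every 5s.
const callsAtFiveSeconds = pollCallsPerMinute(5, 5_000);
// With a 30s minimum poll interval per PR.
const callsAtThirtySeconds = pollCallsPerMinute(5, 30_000);
// Share of the per-minute budget consumed today.
const budgetShareToday = callsAtFiveSeconds / budgetPerMinute;
```

Under these assumptions, moving from a 5s to a 30s poll interval cuts the polling load by a factor of six.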

## Proposed Changes

### A. Fix `armAlarmIfNeeded()` — bump idle alarms to active interval

**Effort: Low | Impact: High | Risk: Very low**

When new work arrives during an idle period, reschedule the alarm to fire within the active interval instead of waiting for the idle alarm:

```ts
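private async armAlarmIfNeeded(): Promise<void> {
  // No stored town id: nothing to schedule.
  const storedId = await this.ctx.storage.get<string>('town:id');
  if (!storedId) return;
  const current = await this.ctx.storage.getAlarm();
  const activeDeadline = Date.now() + ACTIVE_ALARM_INTERVAL_MS;
  // Arm when no alarm is pending, or pull a far-off idle alarm forward.
  // An alarm already due before the active deadline is left alone.
  if (!current || current > activeDeadline) {
    await this.ctx.storage.setAlarm(activeDeadline);
  }
}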
```

This removes the idle-to-active transition penalty of up to 60s: new work waits at most one active interval.
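
To make the three cases explicit, the decision can be pulled out as a pure function. This is only a sketch: `alarmToSet` is a hypothetical helper, and the 5s constant value is assumed from the active interval described in the Problem section.

```typescript
// Assumed value, matching the 5s active interval described in the Problem section.
const ACTIVE_ALARM_INTERVAL_MS = 5_000;

// Hypothetical pure version of the decision inside armAlarmIfNeeded().
// Returns the timestamp to pass to setAlarm(), or null to leave the alarm untouched.
function alarmToSet(currentAlarm: number | null, now: number): number | null {
  const activeDeadline = now + ACTIVE_ALARM_INTERVAL_MS;
  if (currentAlarm === null || currentAlarm > activeDeadline) {
    return activeDeadline; // no alarm pending, or idle alarm is too far out
  }
  return null; // an alarm is already due within the active interval
}

const noAlarmCase = alarmToSet(null, 1_000);     // arms a fresh alarm
const idleAlarmCase = alarmToSet(61_000, 1_000); // pulls the idle alarm forward
const soonAlarmCase = alarmToSet(3_000, 1_000);  // leaves the earlier alarm alone
```

Keeping the earlier alarm in the third case means repeated calls never push a pending alarm later.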

### B. Split the alarm into fast and slow phases

**Effort: Medium | Impact: High | Risk: Medium**

Not all work needs to run every tick. Separate into:

- **Fast path (1-2s):** Event drain → reconcile beads/agents → dispatch → status broadcast
- **Slow path (30-60s):** PR polling, patrol/GUPP, GC, mail delivery, escalation checks, container health

Use a timestamp to throttle slow phases:

```ts
const FAST_ALARM_INTERVAL_MS = 2_000;
const SLOW_PHASE_INTERVAL_MS = 30_000;

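// In alarm():
const now = Date.now();
// Run the slow phases on the first tick, then at most once per SLOW_PHASE_INTERVAL_MS.
const runSlowPhase = !this.lastSlowPhaseAt || (now - this.lastSlowPhaseAt) >= SLOW_PHASE_INTERVAL_MS;
// Record the run time up front so the next ticks skip the slow phases.
if (runSlowPhase) this.lastSlowPhaseAt = now;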
```

This lets bead assignment and agent hooking happen in ~2s while keeping expensive operations throttled.
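
A minimal sketch of how the gate might select phases per tick. The class and the phase names are placeholders for illustration, not the actual Town DO methods.

```typescript
const FAST_ALARM_INTERVAL_MS = 2_000;
const SLOW_PHASE_INTERVAL_MS = 30_000;

// Hypothetical stand-in for the phase selection inside alarm().
class AlarmPhaseGate {
  private lastSlowPhaseAt: number | null = null;

  // Returns the phases that should run on a tick at time `now` (epoch ms).
  phasesFor(now: number): string[] {
    // Fast path: runs on every tick.
    const phases = ['drainEvents', 'reconcile', 'dispatch', 'broadcastStatus'];
    const runSlowPhase =
      this.lastSlowPhaseAt === null || now - this.lastSlowPhaseAt >= SLOW_PHASE_INTERVAL_MS;
    if (runSlowPhase) {
      // Slow path: throttled to once per SLOW_PHASE_INTERVAL_MS.
      this.lastSlowPhaseAt = now;
      phases.push('pollPrs', 'patrol', 'gc', 'deliverMail');
    }
    return phases;
  }
}

const gate = new AlarmPhaseGate();
const firstTick = gate.phasesFor(0);                        // fast + slow
const secondTick = gate.phasesFor(FAST_ALARM_INTERVAL_MS);  // fast only
const laterTick = gate.phasesFor(SLOW_PHASE_INTERVAL_MS);   // fast + slow again
```

Because the timestamp lives in memory, a Durable Object that is evicted and restarted would run the slow phases on its first tick after waking. If that turns out to matter, the timestamp could be persisted instead.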

### C. Add PR polling rate limiting

**Effort: Low | Impact: Medium | Risk: Low**

- Track `last_polled_at` on MR beads, skip if polled within the last 30s
- Consider GitHub webhooks for immediate PR status updates (eliminates polling for GitHub repos entirely)
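
The `last_polled_at` check could look roughly like the following. The `MrBead` shape and the function name are assumptions; only `pr_url` and `last_polled_at` come from this issue.

```typescript
// Hypothetical bead shape for illustration.
interface MrBead {
  id: string;
  pr_url: string | null;
  last_polled_at: number | null; // epoch ms, null if never polled
}

const PR_POLL_MIN_INTERVAL_MS = 30_000;

// Returns the MR beads whose PR should be polled at time `now`.
function beadsDueForPoll(beads: MrBead[], now: number): MrBead[] {
  return beads.filter(
    (bead) =>
      bead.pr_url !== null &&
      (bead.last_polled_at === null || now - bead.last_polled_at >= PR_POLL_MIN_INTERVAL_MS),
  );
}

const sampleBeads: MrBead[] = [
  { id: 'never-polled', pr_url: 'https://example.invalid/pr/1', last_polled_at: null },
  { id: 'polled-10s-ago', pr_url: 'https://example.invalid/pr/2', last_polled_at: 90_000 },
  { id: 'polled-40s-ago', pr_url: 'https://example.invalid/pr/3', last_polled_at: 60_000 },
  { id: 'no-pr', pr_url: null, last_polled_at: null },
];
const dueIds = beadsDueForPoll(sampleBeads, 100_000).map((bead) => bead.id);
```

The caller would update `last_polled_at` after each poll attempt, so a failing PR lookup is also throttled rather than retried every tick.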

### D. Process events inline where possible

**Effort: Medium | Impact: Medium | Risk: Medium**

For operations that already have full context (like `agentDone`), apply the state transition immediately in the RPC handler instead of inserting an event and waiting for the alarm to drain it. The event can still be inserted for audit purposes. This eliminates the 5-10s round-trip through the alarm loop for common operations.
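
One way this could look, as a sketch under stated assumptions: the class, the event shape, and the `in_review` target status are hypothetical, and the real transition applied by `agentDone` may differ.

```typescript
// Hypothetical status names; only `in_progress` appears in this issue.
type BeadStatus = 'open' | 'in_progress' | 'in_review';

interface TownEventRow {
  kind: string;
  beadId: string;
  processedAt: number | null; // null means the alarm drain still has to apply it
}

class InlineTownSketch {
  beads = new Map<string, BeadStatus>();
  events: TownEventRow[] = [];

  // RPC handler: apply the transition now, then record the event for audit.
  agentDone(beadId: string, now: number): void {
    if (this.beads.get(beadId) === 'in_progress') {
      this.beads.set(beadId, 'in_review');
    }
    // Stored as already processed so the alarm drain does not apply it twice.
    this.events.push({ kind: 'agent_done', beadId, processedAt: now });
  }

  // Alarm-side drain input: only events that were not handled inline.
  pendingEvents(): TownEventRow[] {
    return this.events.filter((row) => row.processedAt === null);
  }
}
```

The main design point is that an inline-applied event must be distinguishable from a pending one; otherwise the next alarm drain would replay the transition.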

### E. Immediate dispatch for `slingConvoy`

**Effort: Low | Impact: Medium | Risk: Low**

After creating convoy beads, run a targeted mini-reconcile that assigns agents to the initially unblocked beads, the same way `slingBead()` already does fire-and-forget dispatch. This removes the wait of up to 5s for the first batch of convoy beads.
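
A rough sketch of such a mini-reconcile, assuming a simple dependency list per bead. The `ConvoyBead` shape and `dispatchUnblocked` are illustrative names, not the existing scheduling API.

```typescript
// Hypothetical bead shape: `blockedBy` lists ids of beads that must finish first.
interface ConvoyBead {
  id: string;
  blockedBy: string[];
  assignee: string | null;
}

// Assigns idle agents to beads that are unassigned and have no blockers.
// Returns the beads that were assigned; the rest wait for the regular reconciler.
function dispatchUnblocked(beads: ConvoyBead[], idleAgents: string[]): ConvoyBead[] {
  const available = [...idleAgents]; // copy so the caller's list is not mutated
  const assigned: ConvoyBead[] = [];
  for (const bead of beads) {
    if (available.length === 0) break;
    if (bead.assignee !== null || bead.blockedBy.length > 0) continue;
    bead.assignee = available.shift()!;
    assigned.push(bead);
  }
  return assigned;
}

const convoy: ConvoyBead[] = [
  { id: 'setup', blockedBy: [], assignee: null },
  { id: 'build', blockedBy: ['setup'], assignee: null },
  { id: 'docs', blockedBy: [], assignee: null },
];
const assignedNow = dispatchUnblocked(convoy, ['agent-1', 'agent-2', 'agent-3']);
```

Blocked beads are deliberately skipped here, so the regular reconciler remains responsible for them once their dependencies complete.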

## Priority Order

| Change | Effort | Impact | Risk |
|--------|--------|--------|------|
| A. Fix armAlarmIfNeeded | Low | High (eliminates 60s idle penalty) | Very low |
| B. Split fast/slow phases | Medium | High (2s bead assignment) | Medium |
| C. PR polling rate limit | Low | Medium (prevents rate limit issues) | Low |
| D. Inline event processing | Medium | Medium (eliminates 5-10s for agent done) | Medium |
| E. Immediate convoy dispatch | Low | Medium (faster convoy starts) | Low |

## Key Files

- `cloudflare-gastown/src/dos/Town.do.ts` — alarm handler (line 2830), `armAlarmIfNeeded()` (line 3442), interval constants (line 121)
- `cloudflare-gastown/src/dos/town/scheduling.ts` — dispatch logic, `hasActiveWork()`
- `cloudflare-gastown/src/dos/town/reconciler.ts` — all reconciliation rules, PR polling actions
- `cloudflare-gastown/src/dos/town/actions.ts` — action application and deferred side effects
- `cloudflare-gastown/src/dos/town/patrol.ts` — GUPP thresholds and patrol checks

## Notes

- Change A can be shipped independently and immediately: it is a small, self-contained fix with very low risk
- Change B is the big architectural improvement but needs careful testing to ensure slow-phase operations still run reliably
- Change C is important for correctness regardless of latency goals — without it, towns with many open PRs can hit GitHub rate limits
- Changes D and E are incremental optimizations that can be done independently
