diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md
index 1e73e1f..ef1beee 100644
--- a/.claude/CLAUDE.md
+++ b/.claude/CLAUDE.md
@@ -66,32 +66,27 @@ Every session has three phases: start, work, end.
 
 <!-- BEGIN:REPO:current-state -->
 ## Current State
-Branch: `main` — clean. PR #182 open (`feature/architecture-refactor-plan`, docs only).
+Branch: `feature/agent-message-handler-stateful` — PR #193 open (step 4b), auto-merge set.
 
 Active development is in **`apps/claude-sdk-cli/`** — a TUI terminal app built on `@shellicar/claude-sdk`.
 
-**Architecture refactor planned** — see `.claude/plans/architecture-refactor.md` for the full
-step-by-step plan with estimates and risk ratings. Next step: **1a** (split `Conversation`
-from `ConversationStore`). Each substep ships independently; the CLI works at every commit.
-
-**Completed:**
-- Full cursor-aware multi-line editor (`AppLayout.ts`)
-- Clipboard text/file attachments via command mode
-- `ConversationHistory.push(msg, {id?})` + `remove(id)` for tagged message pruning
-- `IAnthropicAgent.injectContext/removeContext` public API
-- `RunAgentQuery.thinking` + `pauseAfterCompact` + `systemPrompts` options
-- `BetaMessageParam` used directly in public interface
-- Ref tool + RefStore for large output ref-swapping
-- Tool approval flow (auto-approve/deny/prompt)
-- Compaction display with context high-water mark
-- OAuth2 authentication (`TokenRefreshingAnthropic` subclass)
-- System prompts hardcoded in `apps/claude-sdk-cli/src/systemPrompts.ts`
-- `SdkQuerySummary` — query context block shown before each API call (ℹ️ query)
-
-**In-progress / next:**
-- Architecture refactor step 1a — split `Conversation` from `ConversationStore`
-- History replay in TUI (step 1b) — show prior turns on startup
-- Vitest setup (prerequisite for unit tests, do alongside 1a)
+**Architecture refactor in progress** — see `.claude/plans/architecture-refactor.md`.
+Follows a State / Renderer / ScreenCoordinator (MVVM) pattern. Each substep ships independently.
+
+**Completed refactor steps:**
+- **1a** `Conversation` (pure data) split from `ConversationStore` (I/O) — PR #183
+- **1b** History replay into TUI on startup — PR #186
+- **2** `RequestBuilder` pure function extracted from `AgentRun` — PR #187
+- **3a** `EditorState` extracted from `AppLayout` (fields + `reset`) — PR #189
+- **3b** `EditorState.handleKey` — all editor key transitions moved out of `AppLayout` — PR #190
+- **3c** `renderEditor(state, cols): string[]` pure renderer extracted — PR #191
+- **4a** `AgentMessageHandler` stateless cases extracted from `runAgent.ts` — PR #192
+- **4b** `AgentMessageHandler` stateful cases moved in (`message_usage`, `tool_approval_request`, `tool_error`) — PR #193 (pending merge)
+
+**Next: step 5a** — extract `StatusState` + `renderStatus` from `AppLayout`
+- Move the 5 token/cost accumulators to `StatusState`
+- Move status line render logic to `renderStatus(state, cols): string`
+- `AppLayout` holds `this.#statusState`, calls `renderStatus` in its render pass
 <!-- END:REPO:current-state -->
 
 <!-- BEGIN:REPO:vision -->
@@ -138,13 +133,19 @@ Full detail: `.claude/five-banana-pillars.md`
 | `AttachmentStore.ts` | `TextAttachment \| FileAttachment` union; SHA-256 dedup; 10 KB text cap; `addFile(path, kind, size?)` |
 | `clipboard.ts` | `readClipboardText()`; three-stage `readClipboardPath()` (pbpaste → VS Code code/file-list JXA → osascript furl); `looksLikePath`; `sanitiseFurlResult` |
 | `runAgent.ts` | Wires agent to layout: sets up tools, beta flags, event handlers |
+| File | Role |
+|------|------|
+| `entry/main.ts` | Entry point: creates agent, layout, starts readline loop |
+| `AppLayout.ts` | TUI: full cursor editor, streaming display, compaction blocks, tool approval, command mode, attachment chips |
+| `AttachmentStore.ts` | `TextAttachment \| FileAttachment` union; SHA-256 dedup; 10 KB text cap; `addFile(path, kind, size?)` |
+| `clipboard.ts` | `readClipboardText()`; three-stage `readClipboardPath()` (pbpaste → VS Code code/file-list JXA → osascript furl); `looksLikePath`; `sanitiseFurlResult` |
+| `EditorState.ts` | Pure editor state + `handleKey(key): boolean` transitions. No rendering, no I/O. |
+| `renderEditor.ts` | Pure `renderEditor(state: EditorState, cols: number): string[]` renderer. |
+| `AgentMessageHandler.ts` | Maps all `SdkMessage` events → layout calls / state mutations. Extracted from `runAgent.ts`. |
+| `runAgent.ts` | Wires agent to layout: sets up tools, beta flags, constructs handler, wires `port.on` |
 | `permissions.ts` | Tool auto-approve/deny rules |
 | `redact.ts` | Strips sensitive values from tool inputs before display |
 | `logger.ts` | Winston file logger (`claude-sdk-cli.log`) |
-
-### Key files in `packages/claude-sdk/src/`
-
-| File | Role |
 |------|------|
 | `public/interfaces.ts` | `IAnthropicAgent` abstract class (public contract) |
 | `public/types.ts` | `RunAgentQuery`, `SdkMessage` union, tool types |
@@ -230,22 +231,16 @@ Opt-in via `shellicarMcp: true` config. Registers an in-process MCP server (`she
 
 9. **No atomic session file writes** — `writeFileSync` is not atomic. Crash during write corrupts `.claude/cli-session`.
 <!-- END:REPO:known-debt -->
-
 <!-- BEGIN:REPO:recent-decisions -->
+- **MVVM architecture refactor** (2026-04-06): Three-layer model — State (pure data + transitions), Renderer (pure `(state, cols) → string[]`), ScreenCoordinator (owns screen, routes events, calls renderers). Pull-based: coordinator decides when to render. Plan in `.claude/plans/architecture-refactor.md`. Enables unit testing of state and render logic without terminal knowledge.
+- **`previousPatchId` chaining for multi-patch edits** (2026-04-06): When making sequential edits to the same file, chain `PreviewEdit` calls using `previousPatchId`, then apply once with `EditFile`. The failure mode is previewing one patch, not applying it, and moving on to a second: only the second patch gets applied and the first is silently dropped. Neither esbuild (which doesn't type-check) nor vitest (whose tests don't import `runAgent.ts` directly) caught this in step 4a; it only surfaced at runtime.
 - **f command clipboard system** (2026-04-05): Three-stage `readClipboardPath()` — (1) pbpaste filtered by `looksLikePath`, (2) VS Code `code/file-list` JXA probe (file:// URI → POSIX path), (3) osascript `furl` filtered by `sanitiseFurlResult`. Injectable `readClipboardPathCore` for tests. `looksLikePath` is permissive (accepts bare-relative like `apps/foo/bar.ts`); `isLikelyPath` in AppLayout is strict (explicit prefixes only) and only used for the missing-chip case. `sanitiseFurlResult` rejects paths containing `:` (HFS artifacts). `f` handler is stat-first: if the file exists attach it directly; only apply `isLikelyPath` if stat fails.
 - **Clipboard text attachments** (2026-04-06): `ctrl+/` enters command mode; `t` reads clipboard via `pbpaste` and adds a `<document>` block attachment; `d` removes selected chip; `← →` select chips. On `ctrl+enter` submit, attachments are folded into the prompt as `<document>` XML blocks and cleared.
 - **ConversationHistory ID tagging** (2026-04-06): `push(msg, { id? })` tags messages for later removal. `remove(id)` splices the last item with matching ID. IDs are session-scoped (not persisted). Used by `IAnthropicAgent.injectContext/removeContext` for skills context management.
 - **IAnthropicAgent uses BetaMessageParam** (2026-04-06): `getHistory/loadHistory/injectContext` now use `BetaMessageParam` directly instead of `JsonObject` casts. `JsonObject`, `JsonValue`, `ContextMessage` types removed. `BetaMessageParam` re-exported from package index.
 - **thinking/pauseAfterCompact as RunAgentQuery options** (2026-04-06): Both default off. `thinking: true` adds `{ type: 'adaptive' }` to the API body. `pauseAfterCompact: true` wires into `compact_20260112.pause_after_compaction`. When `pauseAfterCompact: true` and compaction fires, the agent sends `done` with `stopReason: 'pause_turn'` — user sees the summary and resumes manually (intentional UX).
 - **Skills timing design issue** (2026-04-06): Documented in `docs/skills-design.md`. Calling `agent.injectContext()` from inside a tool handler merges the injected user message with the pending tool-results user message (consecutive merge policy). Resolution options documented; implementation deferred.
-## Recent Decisions
-
-- **Structured command execution via in-process MCP** (#99) — replaced freeform Bash with a structured Exec tool served by an in-process MCP server. Glob-based auto-approve (`execAutoApprove`) with custom zero-dep glob matcher (no minimatch dependency).
-- **Exec tool extracted to `@shellicar/mcp-exec`** — schema, executor, pipeline, validation rules, and ANSI stripping moved to a published package. CLI retains only `autoApprove.ts` (CLI-specific config concern).
-- **ZWJ sanitisation in layout pipeline**: `sanitiseZwj` strips U+200D before `wrapLine` measures width. Terminals render ZWJ sequences as individual emojis; `string-width` assumes composed form. Stripping at the layout boundary removes the mismatch.
-- **Monorepo workspace conversion**: CLI source moved to `packages/claude-cli/`. Root package is private workspace with turbo, syncpack, biome, lefthook. Turbo orchestrates build/test/type-check. syncpack enforces version consistency. `.packagename` file at root holds the active package name for scripts and pre-push hooks.
-- **SDK bidirectional channel** (`packages/claude-sdk/`): New package wrapping the Anthropic API. Uses `MessagePort` for bidirectional consumer/SDK communication. Tool validation (existence + input schema) happens before approval requests are sent. Approval requests are sent in bulk; tools execute in approval-arrival order.
-- **Screen utilities extracted to `claude-core`**: `sanitise`, `reflow` (wrapLine/rewrapFromSegments/computeLineSegments), `screen` (Screen interface + StdoutScreen), `status-line` (StatusLineBuilder), `viewport` (Viewport), `renderer` (Renderer) all moved from `claude-cli` to `claude-core`. `claude-cli` now imports from `@shellicar/claude-core/*`. `tsconfig.json` in claude-core requires `"types": ["node"]` for process globals with moduleResolution bundler.
+<!-- END:REPO:recent-decisions -->
 <!-- END:REPO:recent-decisions -->
 
 <!-- BEGIN:REPO:extra -->
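
The three-layer split recorded in the MVVM decision above can be sketched roughly as follows. This is a minimal illustration, not code from the repo: `LineState`, `renderLine`, and `MiniCoordinator` are hypothetical names, and the hard-wrap renderer is a stand-in for the real `wrapLine`-based rendering.

```typescript
// Hypothetical illustration of the State / Renderer / ScreenCoordinator split.
// None of these names exist in the repo; they only mirror the shape described above.

// State: pure data plus transitions. No rendering, no I/O.
class LineState {
  public text = '';
  public cursor = 0;

  // Returns true when the key changed the state, so the caller knows a render is due.
  public handleKey(key: string): boolean {
    if (key === 'backspace') {
      if (this.cursor === 0) {
        return false;
      }
      this.text = this.text.slice(0, this.cursor - 1) + this.text.slice(this.cursor);
      this.cursor -= 1;
      return true;
    }
    if (key.length === 1) {
      this.text = this.text.slice(0, this.cursor) + key + this.text.slice(this.cursor);
      this.cursor += 1;
      return true;
    }
    return false; // unknown keys are ignored in this sketch
  }
}

// Renderer: pure (state, cols) -> string[]. Hard-wraps at cols and marks the cursor with '|'.
function renderLine(state: LineState, cols: number): string[] {
  const width = Math.max(1, cols); // guard against a zero-width terminal
  const marked = state.text.slice(0, state.cursor) + '|' + state.text.slice(state.cursor);
  const lines: string[] = [];
  for (let i = 0; i < marked.length; i += width) {
    lines.push(marked.slice(i, i + width));
  }
  return lines;
}

// Coordinator: routes events to state and decides when to render (pull-based).
// Frames are collected in memory here instead of being written to a real screen.
class MiniCoordinator {
  public readonly frames: string[][] = [];
  private readonly state = new LineState();

  public constructor(private readonly cols: number) {}

  public onKey(key: string): void {
    if (this.state.handleKey(key)) {
      this.frames.push(renderLine(this.state, this.cols));
    }
  }
}
```

Because the state and renderer never touch a terminal, both can be asserted on directly in a unit test, which is the benefit the decision entry names.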
diff --git a/.claude/sessions/2026-04-06.md b/.claude/sessions/2026-04-06.md
index ea54212..04b8466 100644
--- a/.claude/sessions/2026-04-06.md
+++ b/.claude/sessions/2026-04-06.md
@@ -314,3 +314,105 @@ if (key.type !== 'ctrl+enter') return;
   - The editor rendering block in `AppLayout.render()` pushes: a divider, a blank line, then each editor line with prefix and cursor highlighting via `INVERSE_ON/OFF`
   - Dependencies to resolve: `buildDivider`, `BLOCK_PLAIN.prompt`, `EDITOR_PROMPT`, `CONTENT_INDENT` — check whether these are AppLayout-private or shareable
   - Result is a pure `string[]`, testable without ANSI stripping (just check the non-cursor lines contain the text, cursor line contains `INVERSE_ON`)
+
+
+---
+
+## Session continuation 5 (same day, later still)
+
+### Step 3c — `renderEditor` (PR #191, merged)
+
+New file `apps/claude-sdk-cli/src/renderEditor.ts`:
+- Pure function `renderEditor(state: EditorState, cols: number): string[]`
+- Returns only the editor text lines — no divider, no trailing blank line. `AppLayout` owns that chrome for all block types, so the renderer shouldn't add it.
+- Handles prefix (`💬 ` for line 0, `   ` for subsequent lines), cursor highlighting via `INVERSE_ON`/`INVERSE_OFF`, end-of-line cursor space, and `wrapLine` for long lines.
+- `EDITOR_PROMPT` constant removed from `AppLayout` — only consumer was the rendering loop, which moved here.
+
+`apps/claude-sdk-cli/test/renderEditor.spec.ts` — 11 tests covering: empty state cursor, prefix on line 0 only, cursor highlights mid-line character, cursor at EOL appends space inside inverse, multi-line indent prefix, line wrapping.
+
+Bananabot review: "Clean extraction, textbook renderer. YAGNI on shared constants." 🍌
+
+---
+
+### Step 4a — `AgentMessageHandler` stateless cases (PR #192, merged)
+
+New file `apps/claude-sdk-cli/src/AgentMessageHandler.ts`:
+- `constructor(layout: AppLayout, log: typeof logger)`
+- `public handle(msg: SdkMessage): void`
+- Handles: `query_summary`, `message_thinking`, `message_text`, `message_compaction_start`, `message_compaction`, `done`, `error`
+- These are all the cases that require no accumulated state — just route to a layout call.
+
+`runAgent.ts` changes:
+- `import { AgentMessageHandler }` added
+- `const handler = new AgentMessageHandler(layout, logger)` instantiated at function top
+- The switch retains explicit cases for `tool_approval_request`, `tool_error`, `message_usage` (all stateful — need `usageBeforeTools`, `lastUsage`, async approval flow)
+- `default: handler.handle(msg)` catches everything else
+
+**Known temporary regression:** `message_compaction` no longer shows the `[compacted at X/Y (Z%)]` annotation. That annotation reads `lastUsage`, which is only set by `message_usage` — a 4b case. The omission is documented in a comment in `AgentMessageHandler.ts`. Compaction only fires at 150k tokens so the regression isn't visible in normal use.
+
+`apps/claude-sdk-cli/test/AgentMessageHandler.spec.ts` — 16 tests using `vi.fn()` mocks cast to `AppLayout`.
+
+#### Bug encountered and lesson learned
+
+During the 4a edits I previewed two separate patches to `runAgent.ts` (one adding the import and declaration, one removing the old cases and adding `default: handler.handle(msg)`) but applied only the second. The result compiled fine (esbuild doesn't type-check) and tests didn't catch it (vitest doesn't import `runAgent.ts` directly), so the `ReferenceError: handler is not defined` only surfaced at runtime.
+
+Fix: a second commit adding the missing import + `const handler` declaration.
+
+**Lesson:** When making sequential edits to the same file, either chain them with `previousPatchId` so they compose into a single `EditFile`, or apply each patch immediately before previewing the next. Previewing a patch and moving on without applying it is the failure mode.
+
+---
+
+### State at end of session
+
+- Branch: `main`, clean, up to date (PR #192 squash-merged)
+- **Next: step 4b** — move stateful message handling into `AgentMessageHandler`
+  - Add fields: `usageBeforeTools: SdkMessageUsage | null`, `lastUsage: SdkMessageUsage | null`
+  - Move `message_usage` handling in: delta token calculation, marginal cost via `calculateCost`, `appendToLastSealed('tools', ...)` annotation
+  - Move `tool_approval_request` in: snapshot `usageBeforeTools`, call async `toolApprovalRequest`
+  - Move `tool_error` in
+  - Constructor will need additional dependencies: `model`, `cacheTtl`, `cwd`, `store`, and a `respond` callback for posting `tool_approval_response`
+  - The reset timing for `usageBeforeTools = null` after a tool batch closes is the risky part — test that sequence explicitly
+
+
+---
+
+## Session continuation 6 (same day, even later)
+
+### Step 4b — `AgentMessageHandler` stateful cases (PR #193)
+
+Completes the `AgentMessageHandler` extraction. Everything that was still in `runAgent.ts`'s message switch moved in.
+
+**What moved into `AgentMessageHandler.ts`:**
+- Helper functions `fmtBytes`, `primaryArg`, `formatRefSummary`, `formatToolSummary` (were module-level in `runAgent.ts`)
+- Private fields `#lastUsage: SdkMessageUsage | null` and `#usageBeforeTools: SdkMessageUsage | null`
+- `message_usage` case: computes delta tokens and marginal cost, annotates the sealed tools block via `appendToLastSealed`, resets `usageBeforeTools`
+- `tool_approval_request` case: snapshots `usageBeforeTools` on first tool of batch, fires `void this.#toolApprovalRequest(msg)`
+- `tool_error` case: formats the error block
+- `#toolApprovalRequest` private async method: permission check, approval prompt, `respond` callback, `addPendingTool`/`removePendingTool`
+
+**Constructor change:** adds `opts: AgentMessageHandlerOptions` — `model`, `cacheTtl`, `cwd`, `store`, `tools`, `respond`. The `respond` callback (`(requestId, approved) => void`) abstracts `port.postMessage` so the handler doesn't hold a reference to `port`.
+
+**Restores** the `message_compaction` `[compacted at X/Y (Z%)]` annotation that was temporarily dropped in 4a.
+
+**`runAgent.ts` after this:** 89 lines. Sets up tools, calls `agent.runAgent()`, constructs `respond` + handler, wires `port.on('message', (msg) => handler.handle(msg))`. No message routing logic lives there anymore.
+
+**Tests:** 27 total (16 unchanged + 11 new). New tests:
+- `tool_error`: transitions to tools block, streams tool name and error message
+- `message_usage` without tool batch: calls `updateUsage`, no annotation
+- `message_usage` delta annotation: asserts `+500` in annotation string after a tool batch
+- Reset timing sequence: two independent batches annotate separately with correct deltas
+- No double-snapshot: second tool in same batch doesn't reset the snapshot
+- `message_compaction` annotation: present when `lastUsage` known, absent when not
+
+All 167 tests pass. Manual testing confirmed that the tool approval flow, the delta annotation, the single annotation for a two-tool batch, and the `tool_error` format all work.
+
+---
+
+### State at end of session
+
+- Branch: `feature/agent-message-handler-stateful`, PR #193 open, auto-merge set
+- **Next: step 5a** — extract `StatusState` + `renderStatus` from `AppLayout`
+  - Move the 5 token/cost accumulators to `StatusState`
+  - Move status line render logic to `renderStatus(state, cols): string`
+  - `AppLayout` holds `this.#statusState`, calls `renderStatus` in its render pass
+  - Tests: given a usage sequence, assert state totals and render output
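
The snapshot and reset sequence that the step 4b notes call out as the risky part can be reduced to a small state machine. The sketch below is an assumption-laden simplification: `ToolBatchTracker`, `UsageSnapshot`, and `contextTokens` are hypothetical names, cost calculation is left out, and the delta is returned as a plain string rather than appended to a layout block.

```typescript
// Hypothetical reduction of the usageBeforeTools / lastUsage bookkeeping.
// Only the token fields needed for the context delta are modelled.
interface UsageSnapshot {
  inputTokens: number;
  cacheCreationTokens: number;
  cacheReadTokens: number;
}

function contextTokens(u: UsageSnapshot): number {
  return u.inputTokens + u.cacheCreationTokens + u.cacheReadTokens;
}

class ToolBatchTracker {
  private lastUsage: UsageSnapshot | null = null;
  private usageBeforeTools: UsageSnapshot | null = null;

  // Called for every tool approval request.
  // Only the first request of a batch takes the snapshot; later ones leave it alone.
  public onToolRequest(): void {
    if (this.usageBeforeTools === null) {
      this.usageBeforeTools = this.lastUsage;
    }
  }

  // Called for every usage message.
  // Returns the signed token delta when a tool batch has just closed, otherwise null.
  public onUsage(usage: UsageSnapshot): string | null {
    let annotation: string | null = null;
    if (this.usageBeforeTools !== null) {
      const delta = contextTokens(usage) - contextTokens(this.usageBeforeTools);
      annotation = `${delta >= 0 ? '+' : ''}${delta}`;
      this.usageBeforeTools = null; // reset so the next batch takes a fresh snapshot
    }
    this.lastUsage = usage;
    return annotation;
  }
}
```

Resetting the snapshot inside `onUsage` is what lets two independent batches annotate separately, and the null check in `onToolRequest` is what prevents a second tool in the same batch from overwriting the snapshot.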
diff --git a/apps/claude-sdk-cli/src/AgentMessageHandler.ts b/apps/claude-sdk-cli/src/AgentMessageHandler.ts
index afddfd1..29fff60 100644
--- a/apps/claude-sdk-cli/src/AgentMessageHandler.ts
+++ b/apps/claude-sdk-cli/src/AgentMessageHandler.ts
@@ -1,26 +1,115 @@
-import type { SdkMessage } from '@shellicar/claude-sdk';
-import type { AppLayout } from './AppLayout.js';
+import { relative } from 'node:path';
+import { type AnyToolDefinition, type CacheTtl, calculateCost, type SdkMessage, type SdkMessageUsage, type SdkToolApprovalRequest } from '@shellicar/claude-sdk';
+import type { RefStore } from '@shellicar/claude-sdk-tools/RefStore';
+import type { AppLayout, PendingTool } from './AppLayout.js';
 import type { logger } from './logger.js';
+import { getPermission, PermissionAction } from './permissions.js';
+
+// ---- helpers (moved from runAgent.ts) ------------------------------------
+
+function fmtBytes(n: number): string {
+  if (n >= 1024 * 1024) {
+    return `${(n / 1024 / 1024).toFixed(1)}mb`;
+  }
+  if (n >= 1024) {
+    return `${(n / 1024).toFixed(1)}kb`;
+  }
+  return `${n}b`;
+}
+
+function primaryArg(input: Record<string, unknown>, cwd: string): string | null {
+  for (const key of ['path', 'file']) {
+    if (typeof input[key] === 'string') {
+      return relative(cwd, input[key] as string) || (input[key] as string);
+    }
+  }
+  if (typeof input.pattern === 'string') {
+    return input.pattern;
+  }
+  if (typeof input.description === 'string') {
+    return input.description;
+  }
+  return null;
+}
+
+function formatRefSummary(input: Record<string, unknown>, store: RefStore): string {
+  const id = typeof input.id === 'string' ? input.id : '';
+  if (!id) {
+    return 'Ref(?)';
+  }
+  const hint = store.getHint(id) ?? id.slice(0, 8);
+  const content = store.get(id);
+  if (content === undefined) {
+    return `Ref(${id.slice(0, 8)}\u2026)`;
+  }
+  const sizeStr = fmtBytes(content.length);
+  const start = typeof input.start === 'number' ? input.start : 0;
+  const limit = typeof input.limit === 'number' ? input.limit : 1000;
+  const end = Math.min(start + limit, content.length);
+  return `Ref \u2190 ${hint} [${start}\u2013${end} / ${sizeStr}]`;
+}
+
+function formatToolSummary(name: string, input: Record<string, unknown>, cwd: string, store: RefStore): string {
+  if (name === 'Ref') {
+    return formatRefSummary(input, store);
+  }
+  if (name === 'Pipe' && Array.isArray(input.steps)) {
+    const steps = (input.steps as Array<{ tool?: unknown; input?: unknown }>)
+      .map((s) => {
+        const tool = typeof s.tool === 'string' ? s.tool : '?';
+        const stepInput = s.input != null && typeof s.input === 'object' ? (s.input as Record<string, unknown>) : {};
+        const arg = primaryArg(stepInput, cwd);
+        return arg ? `${tool}(${arg})` : tool;
+      })
+      .join(' | ');
+    return steps;
+  }
+  const arg = primaryArg(input, cwd);
+  return arg ? `${name}(${arg})` : name;
+}
+
+// ---- types ---------------------------------------------------------------
+
+export interface AgentMessageHandlerOptions {
+  model: string;
+  cacheTtl: CacheTtl;
+  cwd: string;
+  store: RefStore;
+  tools: AnyToolDefinition[];
+  respond: (requestId: string, approved: boolean) => void;
+}
+
+// ---- class ---------------------------------------------------------------
 
 /**
- * Handles the stateless SdkMessage cases: routes each message to the
- * appropriate layout call. No accumulated state here.
+ * Handles all SdkMessage cases: routes each message to the appropriate
+ * layout call or state mutation.
  *
- * Stateful cases (tool_approval_request, tool_error, message_usage) stay
- * in runAgent.ts until step 4b, when usageBeforeTools tracking and the
- * async tool approval flow move here too.
- *
- * NOTE: message_compaction currently omits the "compacted at X/Y (Z%)"
- * context-usage annotation. That annotation reads lastUsage, which is
- * set by message_usage — a 4b case. The annotation is restored in 4b.
+ * Stateless cases (query_summary, message_thinking, etc.) just delegate to
+ * layout. Stateful cases maintain usageBeforeTools / lastUsage to produce
+ * the per-tool-batch token-delta annotation on the sealed tools block.
  */
 export class AgentMessageHandler {
   #layout: AppLayout;
   #logger: typeof logger;
+  #model: string;
+  #cacheTtl: CacheTtl;
+  #cwd: string;
+  #store: RefStore;
+  #tools: AnyToolDefinition[];
+  #respond: (requestId: string, approved: boolean) => void;
+  #lastUsage: SdkMessageUsage | null = null;
+  #usageBeforeTools: SdkMessageUsage | null = null;
 
-  public constructor(layout: AppLayout, log: typeof logger) {
+  public constructor(layout: AppLayout, log: typeof logger, opts: AgentMessageHandlerOptions) {
     this.#layout = layout;
     this.#logger = log;
+    this.#model = opts.model;
+    this.#cacheTtl = opts.cacheTtl;
+    this.#cwd = opts.cwd;
+    this.#store = opts.store;
+    this.#tools = opts.tools;
+    this.#respond = opts.respond;
   }
 
   public handle(msg: SdkMessage): void {
@@ -28,7 +117,7 @@ export class AgentMessageHandler {
       case 'query_summary': {
         const parts = [`${msg.systemPrompts} system`, `${msg.userMessages} user`, `${msg.assistantMessages} assistant`, ...(msg.thinkingBlocks > 0 ? [`${msg.thinkingBlocks} thinking`] : [])];
         this.#layout.transitionBlock('meta');
-        this.#layout.appendStreaming(parts.join(' · '));
+        this.#layout.appendStreaming(parts.join(' \u00b7 '));
         break;
       }
       case 'message_thinking':
@@ -42,10 +131,56 @@ export class AgentMessageHandler {
       case 'message_compaction_start':
         this.#layout.transitionBlock('compaction');
         break;
-      case 'message_compaction':
+      case 'message_compaction': {
         this.#layout.transitionBlock('compaction');
         this.#layout.appendStreaming(msg.summary);
+        if (this.#lastUsage) {
+          const used = this.#lastUsage.inputTokens + this.#lastUsage.cacheCreationTokens + this.#lastUsage.cacheReadTokens;
+          const pct = ((used / this.#lastUsage.contextWindow) * 100).toFixed(1);
+          const fmt = (n: number) => (n >= 1000 ? `${(n / 1000).toFixed(1)}k` : String(n));
+          this.#layout.appendStreaming(`\n\n[compacted at ${fmt(used)} / ${fmt(this.#lastUsage.contextWindow)} (${pct}%)]`);
+        }
+        break;
+      }
+      case 'tool_approval_request':
+        this.#layout.transitionBlock('tools');
+        this.#layout.appendStreaming(`${formatToolSummary(msg.name, msg.input, this.#cwd, this.#store)}\n`);
+        if (!this.#usageBeforeTools) {
+          this.#usageBeforeTools = this.#lastUsage;
+        }
+        void this.#toolApprovalRequest(msg);
+        break;
+      case 'tool_error':
+        this.#layout.transitionBlock('tools');
+        this.#layout.appendStreaming(`${msg.name} error\n\`\`\`json\n${JSON.stringify(msg.input, null, 2)}\n\`\`\`\n\n${msg.error}\n`);
         break;
+      case 'message_usage': {
+        this.#logger.debug('message_usage', { hasUsageBeforeTools: this.#usageBeforeTools !== null });
+        if (this.#usageBeforeTools !== null) {
+          const prev = this.#usageBeforeTools;
+          const prevCtx = prev.inputTokens + prev.cacheCreationTokens + prev.cacheReadTokens;
+          const currCtx = msg.inputTokens + msg.cacheCreationTokens + msg.cacheReadTokens;
+          const delta = currCtx - prevCtx;
+          const sign = delta >= 0 ? '+' : '';
+          const marginalCost = calculateCost(
+            {
+              inputTokens: Math.max(0, msg.inputTokens - prev.inputTokens),
+              cacheCreationTokens: Math.max(0, msg.cacheCreationTokens - prev.cacheCreationTokens),
+              cacheReadTokens: Math.max(0, msg.cacheReadTokens - prev.cacheReadTokens),
+              outputTokens: msg.outputTokens,
+            },
+            this.#model,
+            this.#cacheTtl,
+          );
+          const costStr = `$${marginalCost.toFixed(4)}`;
+          this.#logger.debug('tool_batch_tokens', { prevCtx, currCtx, delta, marginalCost });
+          this.#layout.appendToLastSealed('tools', `[\u2191 ${sign}${delta.toLocaleString()} tokens \u00b7 ${costStr}]\n`);
+          this.#usageBeforeTools = null;
+        }
+        this.#lastUsage = msg;
+        this.#layout.updateUsage(msg);
+        break;
+      }
       case 'done':
         this.#logger.info('done', { stopReason: msg.stopReason });
         if (msg.stopReason !== 'end_turn') {
@@ -59,4 +194,29 @@ export class AgentMessageHandler {
         break;
     }
   }
+
+  async #toolApprovalRequest(msg: SdkToolApprovalRequest): Promise<void> {
+    try {
+      this.#logger.info('tool_approval_request', { name: msg.name, input: msg.input });
+      const pendingTool: PendingTool = { requestId: msg.requestId, name: msg.name, input: msg.input };
+      this.#layout.addPendingTool(pendingTool);
+      const perm = getPermission({ name: msg.name, input: msg.input }, this.#tools, this.#cwd);
+      let approved: boolean;
+      if (perm === PermissionAction.Approve) {
+        this.#logger.info('Auto approving', { name: msg.name });
+        approved = true;
+      } else if (perm === PermissionAction.Deny) {
+        this.#logger.info('Auto denying', { name: msg.name });
+        approved = false;
+      } else {
+        approved = await this.#layout.requestApproval();
+      }
+      this.#respond(msg.requestId, approved);
+      this.#layout.removePendingTool(msg.requestId);
+    } catch (err) {
+      this.#logger.error('Error', err);
+      this.#respond(msg.requestId, false);
+      this.#layout.removePendingTool(msg.requestId);
+    }
+  }
 }
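
For reference, the restored `message_compaction` annotation can be expressed as a standalone pure function, which makes the formatting easy to check in isolation. This is only a sketch: `fmtTokens`, `ContextUsage`, and `compactionAnnotation` are hypothetical names, and the arithmetic simply mirrors the inline code in the diff above.

```typescript
// Hypothetical standalone version of the compaction annotation formatting.
// Values of 1000 or more are shown in thousands with one decimal place.
function fmtTokens(n: number): string {
  return n >= 1000 ? `${(n / 1000).toFixed(1)}k` : String(n);
}

// Only the fields the annotation reads; the real usage message carries more.
interface ContextUsage {
  inputTokens: number;
  cacheCreationTokens: number;
  cacheReadTokens: number;
  contextWindow: number;
}

function compactionAnnotation(usage: ContextUsage): string {
  // Context in use is the sum of fresh input, cache writes, and cache reads.
  const used = usage.inputTokens + usage.cacheCreationTokens + usage.cacheReadTokens;
  const pct = ((used / usage.contextWindow) * 100).toFixed(1);
  return `[compacted at ${fmtTokens(used)} / ${fmtTokens(usage.contextWindow)} (${pct}%)]`;
}
```

With 150,000 tokens in use against a 200,000-token window, the function yields `[compacted at 150.0k / 200.0k (75.0%)]`.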
diff --git a/apps/claude-sdk-cli/src/runAgent.ts b/apps/claude-sdk-cli/src/runAgent.ts
index 78723a2..f123a76 100644
--- a/apps/claude-sdk-cli/src/runAgent.ts
+++ b/apps/claude-sdk-cli/src/runAgent.ts
@@ -1,5 +1,4 @@
-import { relative } from 'node:path';
-import { AnthropicBeta, type AnyToolDefinition, type CacheTtl, calculateCost, type IAnthropicAgent, type SdkMessage, type SdkMessageUsage, type SdkToolApprovalRequest } from '@shellicar/claude-sdk';
+import { AnthropicBeta, type AnyToolDefinition, type IAnthropicAgent, type SdkMessage } from '@shellicar/claude-sdk';
 import { CreateFile } from '@shellicar/claude-sdk-tools/CreateFile';
 import { DeleteDirectory } from '@shellicar/claude-sdk-tools/DeleteDirectory';
 import { DeleteFile } from '@shellicar/claude-sdk-tools/DeleteFile';
@@ -17,73 +16,10 @@ import type { RefStore } from '@shellicar/claude-sdk-tools/RefStore';
 import { SearchFiles } from '@shellicar/claude-sdk-tools/SearchFiles';
 import { Tail } from '@shellicar/claude-sdk-tools/Tail';
 import { AgentMessageHandler } from './AgentMessageHandler.js';
-import type { AppLayout, PendingTool } from './AppLayout.js';
+import type { AppLayout } from './AppLayout.js';
 import { logger } from './logger.js';
-import { getPermission, PermissionAction } from './permissions.js';
 import { systemPrompts } from './systemPrompts.js';
 
-function fmtBytes(n: number): string {
-  if (n >= 1024 * 1024) {
-    return `${(n / 1024 / 1024).toFixed(1)}mb`;
-  }
-  if (n >= 1024) {
-    return `${(n / 1024).toFixed(1)}kb`;
-  }
-  return `${n}b`;
-}
-
-function primaryArg(input: Record<string, unknown>, cwd: string): string | null {
-  for (const key of ['path', 'file']) {
-    if (typeof input[key] === 'string') {
-      return relative(cwd, input[key] as string) || (input[key] as string);
-    }
-  }
-  if (typeof input.pattern === 'string') {
-    return input.pattern;
-  }
-  if (typeof input.description === 'string') {
-    return input.description;
-  }
-  return null;
-}
-
-function formatRefSummary(input: Record<string, unknown>, store: RefStore): string {
-  const id = typeof input.id === 'string' ? input.id : '';
-  if (!id) {
-    return 'Ref(?)';
-  }
-  const hint = store.getHint(id) ?? id.slice(0, 8);
-  const content = store.get(id);
-  if (content === undefined) {
-    return `Ref(${id.slice(0, 8)}…)`;
-  }
-  const sizeStr = fmtBytes(content.length);
-  // start and limit always have defaults now (0 and 1000) so always show the range
-  const start = typeof input.start === 'number' ? input.start : 0;
-  const limit = typeof input.limit === 'number' ? input.limit : 1000;
-  const end = Math.min(start + limit, content.length);
-  return `Ref ← ${hint} [${start}–${end} / ${sizeStr}]`;
-}
-
-function formatToolSummary(name: string, input: Record<string, unknown>, cwd: string, store: RefStore): string {
-  if (name === 'Ref') {
-    return formatRefSummary(input, store);
-  }
-  if (name === 'Pipe' && Array.isArray(input.steps)) {
-    const steps = (input.steps as Array<{ tool?: unknown; input?: unknown }>)
-      .map((s) => {
-        const tool = typeof s.tool === 'string' ? s.tool : '?';
-        const stepInput = s.input != null && typeof s.input === 'object' ? (s.input as Record<string, unknown>) : {};
-        const arg = primaryArg(stepInput, cwd);
-        return arg ? `${tool}(${arg})` : tool;
-      })
-      .join(' | ');
-    return steps;
-  }
-  const arg = primaryArg(input, cwd);
-  return arg ? `${name}(${arg})` : name;
-}
-
 export async function runAgent(agent: IAnthropicAgent, prompt: string, layout: AppLayout, store: RefStore): Promise<void> {
   const pipeSource = [Find, ReadFile, Grep, Head, Tail, Range, SearchFiles];
   const { tool: Ref, transformToolResult: refTransform } = createRef(store, 2_000);
@@ -92,10 +28,8 @@ export async function runAgent(agent: IAnthropicAgent, prompt: string, layout: A
   const tools: AnyToolDefinition[] = [pipe, ...pipeSource, ...otherTools];
 
   const cwd = process.cwd();
-  let lastUsage: SdkMessageUsage | null = null;
-  /** Snapshot of usage at the start of the current tool batch; used to compute the token delta
-   * when the next message_usage arrives. Non-null while a batch is in-flight. */
-  let usageBeforeTools: SdkMessageUsage | null = null;
+  const model = 'claude-sonnet-4-6';
+  const cacheTtl = '5m' as const;
 
   const transformToolResult = (toolName: string, output: unknown): unknown => {
     const result = refTransform(toolName, output);
@@ -106,13 +40,8 @@ export async function runAgent(agent: IAnthropicAgent, prompt: string, layout: A
     return result;
   };
 
-  const handler = new AgentMessageHandler(layout, logger);
-
   layout.startStreaming(prompt);
 
-  const model = 'claude-sonnet-4-6';
-  const cacheTtl: CacheTtl = '5m';
-
   const { port, done } = agent.runAgent({
     model,
     maxTokens: 32768,
@@ -136,101 +65,13 @@ export async function runAgent(agent: IAnthropicAgent, prompt: string, layout: A
     },
   });
 
-  const toolApprovalRequest = async (msg: SdkToolApprovalRequest) => {
-    try {
-      logger.info('tool_approval_request', { name: msg.name, input: msg.input });
-
-      const pendingTool: PendingTool = { requestId: msg.requestId, name: msg.name, input: msg.input };
-      layout.addPendingTool(pendingTool);
-
-      const perm = getPermission({ name: msg.name, input: msg.input }, tools, cwd);
-      let approved: boolean;
-      if (perm === PermissionAction.Approve) {
-        logger.info('Auto approving', { name: msg.name });
-        approved = true;
-      } else if (perm === PermissionAction.Deny) {
-        logger.info('Auto denying', { name: msg.name });
-        approved = false;
-      } else {
-        approved = await layout.requestApproval();
-      }
-
-      port.postMessage({ type: 'tool_approval_response', requestId: msg.requestId, approved });
-      layout.removePendingTool(msg.requestId);
-    } catch (err) {
-      logger.error('Error', err);
-      port.postMessage({ type: 'tool_approval_response', requestId: msg.requestId, approved: false });
-      layout.removePendingTool(msg.requestId);
-    }
+  const respond = (requestId: string, approved: boolean) => {
+    port.postMessage({ type: 'tool_approval_response', requestId, approved });
   };
 
-  port.on('message', (msg: SdkMessage) => {
-    switch (msg.type) {
-      case 'query_summary': {
-        const parts = [`${msg.systemPrompts} system`, `${msg.userMessages} user`, `${msg.assistantMessages} assistant`, ...(msg.thinkingBlocks > 0 ? [`${msg.thinkingBlocks} thinking`] : [])];
-        layout.transitionBlock('meta');
-        layout.appendStreaming(parts.join(' · '));
-        break;
-      }
+  const handler = new AgentMessageHandler(layout, logger, { model, cacheTtl, cwd, store, tools, respond });
 
-      case 'message_thinking':
-        layout.transitionBlock('thinking');
-        layout.appendStreaming(msg.text);
-        break;
-      case 'message_text':
-        layout.transitionBlock('response');
-        layout.appendStreaming(msg.text);
-        break;
-      case 'tool_approval_request':
-        layout.transitionBlock('tools');
-        layout.appendStreaming(`${formatToolSummary(msg.name, msg.input, cwd, store)}\n`);
-        // Snapshot usage at the start of the first tool in this batch so we can
-        // compute the per-batch turn cost when the next message_usage arrives.
-        if (!usageBeforeTools) {
-          usageBeforeTools = lastUsage;
-        }
-        toolApprovalRequest(msg);
-        break;
-      case 'tool_error':
-        layout.transitionBlock('tools');
-        layout.appendStreaming(`${msg.name} error\n\`\`\`json\n${JSON.stringify(msg.input, null, 2)}\n\`\`\`\n\n${msg.error}\n`);
-        break;
-      case 'message_usage': {
-        // Annotate the (now-sealed) tools block with how many tokens this batch added to the
-        // context window: delta = (input+cacheCreate+cacheRead at N+1) - (same at N).
-        // This captures tool-result tokens + the assistant tool-call tokens that moved into
-        // the cache between turns. The running cost total is in the status bar.
-        logger.debug('message_usage', { hasUsageBeforeTools: usageBeforeTools !== null });
-        if (usageBeforeTools !== null) {
-          const prevCtx = usageBeforeTools.inputTokens + usageBeforeTools.cacheCreationTokens + usageBeforeTools.cacheReadTokens;
-          const currCtx = msg.inputTokens + msg.cacheCreationTokens + msg.cacheReadTokens;
-          const delta = currCtx - prevCtx;
-          const sign = delta >= 0 ? '+' : '';
-          // Marginal cost: price only the net-new tokens this batch added (delta per category)
-          // plus the output tokens Claude generated in response to those results.
-          const marginalCost = calculateCost(
-            {
-              inputTokens: Math.max(0, msg.inputTokens - usageBeforeTools.inputTokens),
-              cacheCreationTokens: Math.max(0, msg.cacheCreationTokens - usageBeforeTools.cacheCreationTokens),
-              cacheReadTokens: Math.max(0, msg.cacheReadTokens - usageBeforeTools.cacheReadTokens),
-              outputTokens: msg.outputTokens,
-            },
-            model,
-            cacheTtl,
-          );
-          const costStr = `$${marginalCost.toFixed(4)}`;
-          logger.debug('tool_batch_tokens', { prevCtx, currCtx, delta, marginalCost });
-          layout.appendToLastSealed('tools', `[\u2191 ${sign}${delta.toLocaleString()} tokens \u00b7 ${costStr}]\n`);
-          usageBeforeTools = null;
-        }
-        lastUsage = msg;
-        layout.updateUsage(msg);
-        break;
-      }
-      default:
-        handler.handle(msg);
-    }
-  });
+  port.on('message', (msg: SdkMessage) => handler.handle(msg));
 
   layout.setCancelFn(() => port.postMessage({ type: 'cancel' }));
 
diff --git a/apps/claude-sdk-cli/test/AgentMessageHandler.spec.ts b/apps/claude-sdk-cli/test/AgentMessageHandler.spec.ts
index 13860e1..5018ac7 100644
--- a/apps/claude-sdk-cli/test/AgentMessageHandler.spec.ts
+++ b/apps/claude-sdk-cli/test/AgentMessageHandler.spec.ts
@@ -1,5 +1,5 @@
 import { describe, expect, it, vi } from 'vitest';
-import { AgentMessageHandler } from '../src/AgentMessageHandler.js';
+import { AgentMessageHandler, type AgentMessageHandlerOptions } from '../src/AgentMessageHandler.js';
 import type { AppLayout } from '../src/AppLayout.js';
 import { logger } from '../src/logger.js';
 
@@ -11,11 +11,28 @@ function makeLayout() {
   return {
     transitionBlock: vi.fn(),
     appendStreaming: vi.fn(),
+    appendToLastSealed: vi.fn(),
+    updateUsage: vi.fn(),
+    addPendingTool: vi.fn(),
+    removePendingTool: vi.fn(),
+    requestApproval: vi.fn().mockResolvedValue(true),
   } as unknown as AppLayout;
 }
 
-function makeHandler(layout: AppLayout) {
-  return new AgentMessageHandler(layout, logger);
+function makeOpts(overrides: Partial<AgentMessageHandlerOptions> = {}): AgentMessageHandlerOptions {
+  return {
+    model: 'claude-test',
+    cacheTtl: '5m',
+    cwd: '/test',
+    store: { get: vi.fn(), getHint: vi.fn() } as unknown as AgentMessageHandlerOptions['store'],
+    tools: [],
+    respond: vi.fn(),
+    ...overrides,
+  };
+}
+
+function makeHandler(layout: AppLayout, opts?: Partial<AgentMessageHandlerOptions>) {
+  return new AgentMessageHandler(layout, logger, makeOpts(opts));
 }
 
 // ---------------------------------------------------------------------------
@@ -187,3 +204,145 @@ describe('AgentMessageHandler — error', () => {
     expect(actual).toBe(expected);
   });
 });
+
+// ---------------------------------------------------------------------------
+// tool_error
+// ---------------------------------------------------------------------------
+
+describe('AgentMessageHandler — tool_error', () => {
+  it('transitions to tools block', () => {
+    const layout = makeLayout();
+    makeHandler(layout).handle({ type: 'tool_error', name: 'EditFile', input: { file: 'x.ts' }, error: 'oops' });
+    const expected = 'tools';
+    const actual = vi.mocked(layout.transitionBlock).mock.calls[0]?.[0];
+    expect(actual).toBe(expected);
+  });
+
+  it('streams tool name in the error line', () => {
+    const layout = makeLayout();
+    makeHandler(layout).handle({ type: 'tool_error', name: 'EditFile', input: { file: 'x.ts' }, error: 'boom' });
+    const expected = true;
+    const actual = (vi.mocked(layout.appendStreaming).mock.calls[0]?.[0] ?? '').includes('EditFile error');
+    expect(actual).toBe(expected);
+  });
+
+  it('includes the error message in the output', () => {
+    const layout = makeLayout();
+    makeHandler(layout).handle({ type: 'tool_error', name: 'EditFile', input: { file: 'x.ts' }, error: 'bad things' });
+    const expected = true;
+    const actual = (vi.mocked(layout.appendStreaming).mock.calls[0]?.[0] ?? '').includes('bad things');
+    expect(actual).toBe(expected);
+  });
+});
+
+// ---------------------------------------------------------------------------
+// message_usage
+// ---------------------------------------------------------------------------
+
+function makeUsage(inputTokens: number): { type: 'message_usage'; inputTokens: number; cacheCreationTokens: number; cacheReadTokens: number; outputTokens: number; costUsd: number; contextWindow: number } {
+  return { type: 'message_usage', inputTokens, cacheCreationTokens: 0, cacheReadTokens: 0, outputTokens: 100, costUsd: 0.001, contextWindow: 200_000 };
+}
+
+describe('AgentMessageHandler — message_usage without prior tools', () => {
+  it('calls updateUsage', () => {
+    const layout = makeLayout();
+    makeHandler(layout).handle(makeUsage(1000));
+    const expected = 1;
+    const actual = vi.mocked(layout.updateUsage).mock.calls.length;
+    expect(actual).toBe(expected);
+  });
+
+  it('does not annotate when no tool batch is open', () => {
+    const layout = makeLayout();
+    makeHandler(layout).handle(makeUsage(1000));
+    const expected = 0;
+    const actual = vi.mocked(layout.appendToLastSealed).mock.calls.length;
+    expect(actual).toBe(expected);
+  });
+});
+
+describe('AgentMessageHandler — message_usage delta annotation', () => {
+  it('annotates the tools block with token delta after a tool batch', () => {
+    const layout = makeLayout();
+    const handler = makeHandler(layout);
+    // Establish a baseline usage, then fire a tool that snapshots it
+    handler.handle(makeUsage(1000));
+    handler.handle({ type: 'tool_approval_request', requestId: 'r1', name: 'Find', input: { path: '.' } });
+    // Next usage shows 500 more input tokens
+    handler.handle(makeUsage(1500));
+    const expected = true;
+    const annotation = vi.mocked(layout.appendToLastSealed).mock.calls[0]?.[1] ?? '';
+    const actual = annotation.includes('+500');
+    expect(actual).toBe(expected);
+  });
+
+  it('resets usageBeforeTools after the annotation so second batch computes independently', () => {
+    const layout = makeLayout();
+    const handler = makeHandler(layout);
+    handler.handle(makeUsage(1000));
+    handler.handle({ type: 'tool_approval_request', requestId: 'r1', name: 'Find', input: { path: '.' } });
+    handler.handle(makeUsage(1500));
+    // Second tool batch
+    handler.handle({ type: 'tool_approval_request', requestId: 'r2', name: 'Find', input: { path: '.' } });
+    handler.handle(makeUsage(1700));
+    // appendToLastSealed should have been called twice; the second annotation covers the +200 delta
+    const expected = 2;
+    const actual = vi.mocked(layout.appendToLastSealed).mock.calls.length;
+    expect(actual).toBe(expected);
+  });
+
+  it('second batch delta is computed from the post-first-batch usage, not the original', () => {
+    const layout = makeLayout();
+    const handler = makeHandler(layout);
+    handler.handle(makeUsage(1000));
+    handler.handle({ type: 'tool_approval_request', requestId: 'r1', name: 'Find', input: { path: '.' } });
+    handler.handle(makeUsage(1500));
+    handler.handle({ type: 'tool_approval_request', requestId: 'r2', name: 'Find', input: { path: '.' } });
+    handler.handle(makeUsage(1700));
+    const expected = true;
+    const secondAnnotation = vi.mocked(layout.appendToLastSealed).mock.calls[1]?.[1] ?? '';
+    const actual = secondAnnotation.includes('+200');
+    expect(actual).toBe(expected);
+  });
+
+  it('does not snapshot usageBeforeTools again if batch already open', () => {
+    const layout = makeLayout();
+    const handler = makeHandler(layout);
+    handler.handle(makeUsage(1000));
+    // Two tools in the same batch before usage arrives
+    handler.handle({ type: 'tool_approval_request', requestId: 'r1', name: 'Find', input: { path: '.' } });
+    handler.handle({ type: 'tool_approval_request', requestId: 'r2', name: 'Find', input: { path: '.' } });
+    handler.handle(makeUsage(1800));
+    // Delta should be relative to the 1000 baseline; the second tool must not re-snapshot it
+    const expected = true;
+    const annotation = vi.mocked(layout.appendToLastSealed).mock.calls[0]?.[1] ?? '';
+    const actual = annotation.includes('+800');
+    expect(actual).toBe(expected);
+  });
+});
+
+// ---------------------------------------------------------------------------
+// message_compaction with lastUsage
+// ---------------------------------------------------------------------------
+
+describe('AgentMessageHandler — message_compaction annotation', () => {
+  it('appends context-usage annotation when lastUsage is known', () => {
+    const layout = makeLayout();
+    const handler = makeHandler(layout);
+    handler.handle(makeUsage(150_000));
+    handler.handle({ type: 'message_compaction', summary: 'trimmed' });
+    const calls = vi.mocked(layout.appendStreaming).mock.calls;
+    const expected = true;
+    const actual = calls.some((c) => (c[0] ?? '').includes('compacted at'));
+    expect(actual).toBe(expected);
+  });
+
+  it('omits annotation when lastUsage is not yet known', () => {
+    const layout = makeLayout();
+    makeHandler(layout).handle({ type: 'message_compaction', summary: 'trimmed' });
+    const calls = vi.mocked(layout.appendStreaming).mock.calls;
+    const expected = false;
+    const actual = calls.some((c) => (c[0] ?? '').includes('compacted at'));
+    expect(actual).toBe(expected);
+  });
+});