Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
67 changes: 31 additions & 36 deletions .claude/CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -66,32 +66,27 @@ Every session has three phases: start, work, end.

<!-- BEGIN:REPO:current-state -->
## Current State
Branch: `main` — clean. PR #182 open (`feature/architecture-refactor-plan`, docs only).
Branch: `feature/agent-message-handler-stateful` — PR #193 open (step 4b), auto-merge set.

Active development is in **`apps/claude-sdk-cli/`** — a TUI terminal app built on `@shellicar/claude-sdk`.

**Architecture refactor planned** — see `.claude/plans/architecture-refactor.md` for the full
step-by-step plan with estimates and risk ratings. Next step: **1a** (split `Conversation`
from `ConversationStore`). Each substep ships independently; the CLI works at every commit.

**Completed:**
- Full cursor-aware multi-line editor (`AppLayout.ts`)
- Clipboard text/file attachments via command mode
- `ConversationHistory.push(msg, {id?})` + `remove(id)` for tagged message pruning
- `IAnthropicAgent.injectContext/removeContext` public API
- `RunAgentQuery.thinking` + `pauseAfterCompact` + `systemPrompts` options
- `BetaMessageParam` used directly in public interface
- Ref tool + RefStore for large output ref-swapping
- Tool approval flow (auto-approve/deny/prompt)
- Compaction display with context high-water mark
- OAuth2 authentication (`TokenRefreshingAnthropic` subclass)
- System prompts hardcoded in `apps/claude-sdk-cli/src/systemPrompts.ts`
- `SdkQuerySummary` — query context block shown before each API call (ℹ️ query)

**In-progress / next:**
- Architecture refactor step 1a — split `Conversation` from `ConversationStore`
- History replay in TUI (step 1b) — show prior turns on startup
- Vitest setup (prerequisite for unit tests, do alongside 1a)
**Architecture refactor in progress** — see `.claude/plans/architecture-refactor.md`.
Follows a State / Renderer / ScreenCoordinator (MVVM) pattern. Each substep ships independently.

**Completed refactor steps:**
- **1a** `Conversation` (pure data) split from `ConversationStore` (I/O) — PR #183
- **1b** History replay into TUI on startup — PR #186
- **2** `RequestBuilder` pure function extracted from `AgentRun` — PR #187
- **3a** `EditorState` extracted from `AppLayout` (fields + `reset`) — PR #189
- **3b** `EditorState.handleKey` — all editor key transitions moved out of `AppLayout` — PR #190
- **3c** `renderEditor(state, cols): string[]` pure renderer extracted — PR #191
- **4a** `AgentMessageHandler` stateless cases extracted from `runAgent.ts` — PR #192
- **4b** `AgentMessageHandler` stateful cases moved in (`message_usage`, `tool_approval_request`, `tool_error`) — PR #193 (pending merge)

**Next: step 5a** — extract `StatusState` + `renderStatus` from `AppLayout`
- Move the 5 token/cost accumulators to `StatusState`
- Move status line render logic to `renderStatus(state, cols): string`
- `AppLayout` holds `this.#statusState`, calls `renderStatus` in its render pass
<!-- END:REPO:current-state -->

<!-- BEGIN:REPO:vision -->
Expand Down Expand Up @@ -138,13 +133,19 @@ Full detail: `.claude/five-banana-pillars.md`
| `AttachmentStore.ts` | `TextAttachment \| FileAttachment` union; SHA-256 dedup; 10 KB text cap; `addFile(path, kind, size?)` |
| `clipboard.ts` | `readClipboardText()`; three-stage `readClipboardPath()` (pbpaste → VS Code code/file-list JXA → osascript furl); `looksLikePath`; `sanitiseFurlResult` |
| `runAgent.ts` | Wires agent to layout: sets up tools, beta flags, event handlers |
| File | Role |
|------|------|
| `entry/main.ts` | Entry point: creates agent, layout, starts readline loop |
| `AppLayout.ts` | TUI: full cursor editor, streaming display, compaction blocks, tool approval, command mode, attachment chips |
| `AttachmentStore.ts` | `TextAttachment \| FileAttachment` union; SHA-256 dedup; 10 KB text cap; `addFile(path, kind, size?)` |
| `clipboard.ts` | `readClipboardText()`; three-stage `readClipboardPath()` (pbpaste → VS Code code/file-list JXA → osascript furl); `looksLikePath`; `sanitiseFurlResult` |
| `EditorState.ts` | Pure editor state + `handleKey(key): boolean` transitions. No rendering, no I/O. |
| `renderEditor.ts` | Pure `renderEditor(state: EditorState, cols: number): string[]` renderer. |
| `AgentMessageHandler.ts` | Maps all `SdkMessage` events → layout calls / state mutations. Extracted from `runAgent.ts`. |
| `runAgent.ts` | Wires agent to layout: sets up tools, beta flags, constructs handler, wires `port.on` |
| `permissions.ts` | Tool auto-approve/deny rules |
| `redact.ts` | Strips sensitive values from tool inputs before display |
| `logger.ts` | Winston file logger (`claude-sdk-cli.log`) |

### Key files in `packages/claude-sdk/src/`

| File | Role |
|------|------|
| `public/interfaces.ts` | `IAnthropicAgent` abstract class (public contract) |
| `public/types.ts` | `RunAgentQuery`, `SdkMessage` union, tool types |
Expand Down Expand Up @@ -230,22 +231,16 @@ Opt-in via `shellicarMcp: true` config. Registers an in-process MCP server (`she

9. **No atomic session file writes** — `writeFileSync` is not atomic. Crash during write corrupts `.claude/cli-session`.
<!-- END:REPO:known-debt -->

<!-- BEGIN:REPO:recent-decisions -->
- **MVVM architecture refactor** (2026-04-06): Three-layer model — State (pure data + transitions), Renderer (pure `(state, cols) → string[]`), ScreenCoordinator (owns screen, routes events, calls renderers). Pull-based: coordinator decides when to render. Plan in `.claude/plans/architecture-refactor.md`. Enables unit testing of state and render logic without terminal knowledge.
- **`previousPatchId` chaining for multi-patch edits** (2026-04-06): When making sequential edits to the same file, chain `PreviewEdit` calls using `previousPatchId`, then apply once with `EditFile`. Previewing without applying then moving to a second patch is the failure mode — only the second patch gets applied, first is silently dropped. esbuild and vitest don't catch this; it only surfaces at runtime.
- **f command clipboard system** (2026-04-05): Three-stage `readClipboardPath()` — (1) pbpaste filtered by `looksLikePath`, (2) VS Code `code/file-list` JXA probe (file:// URI → POSIX path), (3) osascript `furl` filtered by `sanitiseFurlResult`. Injectable `readClipboardPathCore` for tests. `looksLikePath` is permissive (accepts bare-relative like `apps/foo/bar.ts`); `isLikelyPath` in AppLayout is strict (explicit prefixes only) and only used for the missing-chip case. `sanitiseFurlResult` rejects paths containing `:` (HFS artifacts). `f` handler is stat-first: if the file exists attach it directly; only apply `isLikelyPath` if stat fails.
- **Clipboard text attachments** (2026-04-06): `ctrl+/` enters command mode; `t` reads clipboard via `pbpaste` and adds a `<document>` block attachment; `d` removes selected chip; `← →` select chips. On `ctrl+enter` submit, attachments are folded into the prompt as `<document>` XML blocks and cleared.
- **ConversationHistory ID tagging** (2026-04-06): `push(msg, { id? })` tags messages for later removal. `remove(id)` splices the last item with matching ID. IDs are session-scoped (not persisted). Used by `IAnthropicAgent.injectContext/removeContext` for skills context management.
- **IAnthropicAgent uses BetaMessageParam** (2026-04-06): `getHistory/loadHistory/injectContext` now use `BetaMessageParam` directly instead of `JsonObject` casts. `JsonObject`, `JsonValue`, `ContextMessage` types removed. `BetaMessageParam` re-exported from package index.
- **thinking/pauseAfterCompact as RunAgentQuery options** (2026-04-06): Both default off. `thinking: true` adds `{ type: 'adaptive' }` to the API body. `pauseAfterCompact: true` wires into `compact_20260112.pause_after_compaction`. When `pauseAfterCompact: true` and compaction fires, the agent sends `done` with `stopReason: 'pause_turn'` — user sees the summary and resumes manually (intentional UX).
- **Skills timing design issue** (2026-04-06): Documented in `docs/skills-design.md`. Calling `agent.injectContext()` from inside a tool handler merges the injected user message with the pending tool-results user message (consecutive merge policy). Resolution options documented; implementation deferred.
## Recent Decisions

- **Structured command execution via in-process MCP** (#99) — replaced freeform Bash with a structured Exec tool served by an in-process MCP server. Glob-based auto-approve (`execAutoApprove`) with custom zero-dep glob matcher (no minimatch dependency).
- **Exec tool extracted to `@shellicar/mcp-exec`** — schema, executor, pipeline, validation rules, and ANSI stripping moved to a published package. CLI retains only `autoApprove.ts` (CLI-specific config concern).
- **ZWJ sanitisation in layout pipeline**: `sanitiseZwj` strips U+200D before `wrapLine` measures width. Terminals render ZWJ sequences as individual emojis; `string-width` assumes composed form. Stripping at the layout boundary removes the mismatch.
- **Monorepo workspace conversion**: CLI source moved to `packages/claude-cli/`. Root package is private workspace with turbo, syncpack, biome, lefthook. Turbo orchestrates build/test/type-check. syncpack enforces version consistency. `.packagename` file at root holds the active package name for scripts and pre-push hooks.
- **SDK bidirectional channel** (`packages/claude-sdk/`): New package wrapping the Anthropic API. Uses `MessagePort` for bidirectional consumer/SDK communication. Tool validation (existence + input schema) happens before approval requests are sent. Approval requests are sent in bulk; tools execute in approval-arrival order.
- **Screen utilities extracted to `claude-core`**: `sanitise`, `reflow` (wrapLine/rewrapFromSegments/computeLineSegments), `screen` (Screen interface + StdoutScreen), `status-line` (StatusLineBuilder), `viewport` (Viewport), `renderer` (Renderer) all moved from `claude-cli` to `claude-core`. `claude-cli` now imports from `@shellicar/claude-core/*`. `tsconfig.json` in claude-core requires `"types": ["node"]` for process globals with moduleResolution bundler.
<!-- END:REPO:recent-decisions -->
<!-- END:REPO:recent-decisions -->

<!-- BEGIN:REPO:extra -->
Expand Down
102 changes: 102 additions & 0 deletions .claude/sessions/2026-04-06.md
Original file line number Diff line number Diff line change
Expand Up @@ -314,3 +314,105 @@ if (key.type !== 'ctrl+enter') return;
- The editor rendering block in `AppLayout.render()` pushes: a divider, a blank line, then each editor line with prefix and cursor highlighting via `INVERSE_ON/OFF`
- Dependencies to resolve: `buildDivider`, `BLOCK_PLAIN.prompt`, `EDITOR_PROMPT`, `CONTENT_INDENT` — check whether these are AppLayout-private or shareable
- Result is a pure `string[]`, testable without ANSI stripping (just check the non-cursor lines contain the text, cursor line contains `INVERSE_ON`)


---

## Session continuation 5 (same day, later still)

### Step 3c — `renderEditor` (PR #191, merged)

New file `apps/claude-sdk-cli/src/renderEditor.ts`:
- Pure function `renderEditor(state: EditorState, cols: number): string[]`
- Returns only the editor text lines — no divider, no trailing blank line. `AppLayout` owns that chrome for all block types, so the renderer shouldn't add it.
- Handles prefix (`💬 ` for line 0, ` ` for subsequent lines), cursor highlighting via `INVERSE_ON`/`INVERSE_OFF`, end-of-line cursor space, and `wrapLine` for long lines.
- `EDITOR_PROMPT` constant removed from `AppLayout` — only consumer was the rendering loop, which moved here.

`apps/claude-sdk-cli/test/renderEditor.spec.ts` — 11 tests covering: empty state cursor, prefix on line 0 only, cursor highlights mid-line character, cursor at EOL appends space inside inverse, multi-line indent prefix, line wrapping.

Bananabot review: "Clean extraction, textbook renderer. YAGNI on shared constants." 🍌

---

### Step 4a — `AgentMessageHandler` stateless cases (PR #192, merged)

New file `apps/claude-sdk-cli/src/AgentMessageHandler.ts`:
- `constructor(layout: AppLayout, log: typeof logger)`
- `public handle(msg: SdkMessage): void`
- Handles: `query_summary`, `message_thinking`, `message_text`, `message_compaction_start`, `message_compaction`, `done`, `error`
- These are all the cases that require no accumulated state — just route to a layout call.

`runAgent.ts` changes:
- `import { AgentMessageHandler }` added
- `const handler = new AgentMessageHandler(layout, logger)` instantiated at function top
- The switch retains explicit cases for `tool_approval_request`, `tool_error`, `message_usage` (all stateful — need `usageBeforeTools`, `lastUsage`, async approval flow)
- `default: handler.handle(msg)` catches everything else

**Known temporary regression:** `message_compaction` no longer shows the `[compacted at X/Y (Z%)]` annotation. That annotation reads `lastUsage`, which is only set by `message_usage` — a 4b case. The omission is documented in a comment in `AgentMessageHandler.ts`. Compaction only fires at 150k tokens so the regression isn't visible in normal use.

`apps/claude-sdk-cli/test/AgentMessageHandler.spec.ts` — 16 tests using `vi.fn()` mocks cast to `AppLayout`.

#### Bug encountered and lesson learned

During the 4a edits I previewed two separate patches to `runAgent.ts` — one adding the import+declaration, one removing the old cases and adding `default: handler.handle(msg)` — but applied only the second without applying the first. The result compiled fine (esbuild doesn't type-check) and tests didn't catch it (vitest doesn't import `runAgent.ts` directly), so the `ReferenceError: handler is not defined` only surfaced at runtime.

Fix: a second commit adding the missing import + `const handler` declaration.

**Lesson:** When making sequential edits to the same file, either chain them with `previousPatchId` so they compose into a single `EditFile`, or apply each patch immediately before previewing the next. Preview-without-apply then move on is the failure mode.

---

### State at end of session

- Branch: `main`, clean, up to date (PR #192 squash-merged)
- **Next: step 4b** — move stateful message handling into `AgentMessageHandler`
- Add fields: `usageBeforeTools: SdkMessageUsage | null`, `lastUsage: SdkMessageUsage | null`
- Move `message_usage` handling in: delta token calculation, marginal cost via `calculateCost`, `appendToLastSealed('tools', ...)` annotation
- Move `tool_approval_request` in: snapshot `usageBeforeTools`, call async `toolApprovalRequest`
- Move `tool_error` in
- Constructor will need additional dependencies: `model`, `cacheTtl`, `cwd`, `store`, and a `respond` callback for posting `tool_approval_response`
- The reset timing for `usageBeforeTools = null` after a tool batch closes is the risky part — test that sequence explicitly


---

## Session continuation 6 (same day, even later)

### Step 4b — `AgentMessageHandler` stateful cases (PR #193)

Completes the `AgentMessageHandler` extraction. Everything that was still in `runAgent.ts`'s message switch moved in.

**What moved into `AgentMessageHandler.ts`:**
- Helper functions `fmtBytes`, `primaryArg`, `formatRefSummary`, `formatToolSummary` (were module-level in `runAgent.ts`)
- Private fields `#lastUsage: SdkMessageUsage | null` and `#usageBeforeTools: SdkMessageUsage | null`
- `message_usage` case: computes delta tokens and marginal cost, annotates the sealed tools block via `appendToLastSealed`, resets `usageBeforeTools`
- `tool_approval_request` case: snapshots `usageBeforeTools` on first tool of batch, fires `void this.#toolApprovalRequest(msg)`
- `tool_error` case: formats the error block
- `#toolApprovalRequest` private async method: permission check, approval prompt, `respond` callback, `addPendingTool`/`removePendingTool`

**Constructor change:** adds `opts: AgentMessageHandlerOptions` — `model`, `cacheTtl`, `cwd`, `store`, `tools`, `respond`. The `respond` callback (`(requestId, approved) => void`) abstracts `port.postMessage` so the handler doesn't hold a reference to `port`.

**Restores** the `message_compaction` `[compacted at X/Y (Z%)]` annotation that was temporarily dropped in 4a.

**`runAgent.ts` after this:** 89 lines. Sets up tools, calls `agent.runAgent()`, constructs `respond` + handler, wires `port.on('message', (msg) => handler.handle(msg))`. No message routing logic lives there anymore.

**Tests:** 27 total (16 unchanged + 11 new). New tests:
- `tool_error`: transitions to tools block, streams tool name and error message
- `message_usage` without tool batch: calls `updateUsage`, no annotation
- `message_usage` delta annotation: asserts `+500` in annotation string after a tool batch
- Reset timing sequence: two independent batches annotate separately with correct deltas
- No double-snapshot: second tool in same batch doesn't reset the snapshot
- `message_compaction` annotation: present when `lastUsage` known, absent when not

All 167 tests pass. Manual testing confirmed: tool approval flow, delta annotation, two-tool batch annotates once, `tool_error` format all working.

---

### State at end of session

- Branch: `feature/agent-message-handler-stateful`, PR #193 open, auto-merge set
- **Next: step 5a** — extract `StatusState` + `renderStatus` from `AppLayout`
- Move the 5 token/cost accumulators to `StatusState`
- Move status line render logic to `renderStatus(state, cols): string`
- `AppLayout` holds `this.#statusState`, calls `renderStatus` in its render pass
- Tests: given a usage sequence, assert state totals and render output
Loading
Loading