Target: macOS 26+ native client. First supported agent: Claude Code. Architecture opens path to ACP / Codex / other agents post-MVP.
1. Problem & motivation
Today every agent session in Relay is a raw PTY rendered via SwiftTerm. For shell sessions this is correct. For AI coding agents it is a mismatch: the agent produces structured output (messages, tool calls, thinking, plugin invocations, diffs), and rendering it as undifferentiated ANSI text discards meaning, hurts scannability, and prevents UX polish (markdown, syntax-highlighted code, collapsible tool output, approval sheets, live progress).
Dialogue introduces a second presentation surface for agent sessions — a structured chat UI that parses the agent's stream and renders messages, tool calls, and thinking as first-class UI primitives. The raw terminal stays available for sessions that need it (interactive TUI prompts, /login, shell-like flows).
The surface is selected at session creation and is not a live toggle, because Claude Code's interactive TTY and -p --output-format stream-json are different CLI entrypoints with incompatible stdout contracts. To meet the "switch mode" expectation, an Open as Terminal / Open as Dialogue action creates a sibling session via claude --resume <session_id> (context preserved).
2. User stories
As a user running a Claude Code session on a task, I want the agent's answer rendered as formatted markdown with tool invocations shown as cards, so I can scan the work done without re-reading raw terminal output.
As a user, I want to see live progress while the agent is working — which tool is currently running, what thinking is happening, when a message is being streamed — so I feel the agent is actively working for me.
As a user, I want to switch a session from Dialogue to raw Terminal without losing the conversation context, so that when the agent drops into a TUI flow I can still finish it.
As a user with accessibility needs, I want every interactive element to be keyboard-reachable and VoiceOver-labelled, every animation to respect Reduce Motion, and every status signal to be pair-coded with an icon (not color alone).
3. Naming
Feature name: Dialogue. In code: SessionSurface.agentDialogue(.claudeCode), DialogueFeature (TCA reducer), DialogueView (SwiftUI), DialogueMarkdownView (markdown renderer wrapper). User-facing strings: "Dialogue", "New Dialogue Session", "Open as Dialogue".
Considered alternatives (rejected): Parley, Atelier, Converse, Scribe, Studio. Dialogue wins on symmetry with Terminal, cross-lingual clarity, and the absence of any conflict with Claude / Anthropic brand terms.
4. Architecture
4.1 Packages
| Package | Layer | Content |
| --- | --- | --- |
| AgentChat (new) | Domain + TCA | AgentMessage, ToolCall, AgentStreamEvent, AgentChatSession protocol, ClaudeStreamJSONParser, AgentChatFeature reducer. No SwiftUI, no gRPC, no SwiftTerm. Pure Swift + Foundation. |
ACP (post-MVP): unifies N agents, first-class permissions + diff blocks; no Swift SDK yet (~3000–4500 LOC to write); Claude works via Node-adapter @zed-industries/claude-agent-acp which reduces CLI feature parity. Trigger for adoption: second chat-agent in roadmap OR ACP 1.0 OR Anthropic officially accepts ACP.
The AgentStreamEvent internal model is designed close to ACP's ToolCall/ContentBlock shape so that future migration is a parser swap, not a reducer/UI rewrite.
Scoped via ifCaseLet. TerminalFeature.State stays pure-terminal; Dialogue state lives in a sibling reducer.
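As a minimal sketch of the surface model, the enum below uses the three cases this spec names for SessionSurface (.shell, .agentTerminal, .agentDialogue). The `isDialogue` and `sibling` helpers and the shape of `Tab` are illustrative assumptions, not part of the spec.

```swift
import Foundation

/// Agents that can back a session. Only Claude Code is in scope for the MVP.
enum AgentKind: String, Equatable, Sendable {
    case claudeCode
}

/// How a session is presented. Chosen at creation and never mutated afterwards.
enum SessionSurface: Equatable, Sendable {
    case shell
    case agentTerminal(AgentKind)
    case agentDialogue(AgentKind)

    /// True when the structured chat UI should render this session (hypothetical helper).
    var isDialogue: Bool {
        if case .agentDialogue = self { return true }
        return false
    }

    /// The surface an "Open as Terminal" / "Open as Dialogue" action would target
    /// (hypothetical helper). Shell sessions have no sibling surface.
    var sibling: SessionSurface? {
        switch self {
        case .shell: return nil
        case .agentTerminal(let kind): return .agentDialogue(kind)
        case .agentDialogue(let kind): return .agentTerminal(kind)
        }
    }
}

/// Storing the surface as a `let` lets the compiler enforce immutability.
struct Tab {
    let id: UUID
    let surface: SessionSurface
}
```

Because the surface is a stored constant, "switching mode" can only be expressed as creating a new tab with the sibling surface, which matches the process-level separation described in this spec.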
Mode selection UX:
New Session creation sheet shows explicit picker — Terminal / Dialogue — when creating an agent session.
preferredClaudeSurface in SettingsStore sets the default (initial value: .agentTerminal(.claudeCode); flipped to .agentDialogue in a later release after positive feedback).
Keyboard shortcuts: ⌘⇧N — New Dialogue Session; ⌘⌥N — New Terminal Session.
Cross-surface resume: a menu item on a running session, Open as Terminal / Open as Dialogue (⌘⇧T / ⌘⇧D), creates a sibling tab with the same session_id via claude --resume <session_id>. Context is preserved; under the hood the process is killed and respawned. The user's expectation of "switch mode" is met without breaking the process-level separation.
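The following sketch shows how the surface choice and a cross-surface resume could translate into CLI arguments. The flag list is the one this spec gives for the Dialogue entrypoint in the packages table, and --resume is the flag named above. The function name `claudeArguments(for:resuming:)` is hypothetical, and the flags are copied from the spec rather than verified against the CLI here.

```swift
import Foundation

enum AgentKind { case claudeCode }

enum SessionSurface {
    case shell
    case agentTerminal(AgentKind)
    case agentDialogue(AgentKind)
}

/// Returns the arguments passed to the `claude` executable, or nil for plain shells.
/// Hypothetical helper; the real ClaudeCommandBuilder API is not shown in this spec.
func claudeArguments(for surface: SessionSurface,
                     resuming sessionID: String? = nil) -> [String]? {
    var args: [String] = []
    switch surface {
    case .shell:
        return nil
    case .agentTerminal:
        break // interactive TTY entrypoint: no extra flags
    case .agentDialogue:
        // Flag list as stated in the packages table of this spec.
        args += ["-p", "--output-format", "stream-json", "--verbose",
                 "--include-partial-messages", "--input-format", "stream-json",
                 "--bare", "--permission-mode", "bypassPermissions"]
    }
    // A sibling session reuses the conversation of the session it was opened from.
    if let sessionID {
        args += ["--resume", sessionID]
    }
    return args
}
```

Under these assumptions, "Open as Dialogue" on a running terminal session amounts to `claudeArguments(for: .agentDialogue(.claudeCode), resuming: currentSessionID)` followed by spawning a new process in a sibling tab.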
6. UI concept — live feel
Multi-layer live signals (all behind accessibilityReduceMotion guards):
Token level — ▊ mini-caret at end of streaming text; text_delta throttled to 30fps (~33ms batching of deltas per runloop); optional character-by-character smoothing when tokens arrive in bursts.
Message level — soft fade-in (DS.Motion.fast, 120ms); auto-scroll anchored to bottom only when user is already at bottom, else floating ↓ New message chip; subtle glow pulse (1-2% opacity) on active streaming bubble.
Tool-call level — card appears instantly with pending status (shimmer skeleton); pending → in_progress pill opacity pulse; completed checkmark spring (scale 0.8 → 1.0); failed shake (±4pt, 2 cycles); live duration tick every 1s.
Session level — top status strip above input: idle / Thinking… / Responding… / Running: Bash (npm test) / error banner with retry info. Tab-icon ambient pulse during streaming.
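The token-level throttling above (text_delta batched to roughly 30fps) can be sketched as a small coalescing buffer. This is an illustration under stated assumptions: the type name `DeltaBatcher` is hypothetical, and timestamps are passed in explicitly so the behaviour is deterministic and easy to test.

```swift
import Foundation

/// Coalesces streamed text deltas so the UI is updated at most once per interval.
struct DeltaBatcher {
    let interval: TimeInterval
    private var pending = ""
    private var lastFlush: TimeInterval?

    init(interval: TimeInterval = 1.0 / 30.0) {
        self.interval = interval
    }

    /// Adds a delta received at `now` (in seconds). Returns the batched text when a
    /// flush is due, otherwise nil and the delta stays buffered.
    mutating func append(_ delta: String, at now: TimeInterval) -> String? {
        pending += delta
        if let last = lastFlush, now - last < interval {
            return nil
        }
        return flush(at: now)
    }

    /// Forces out whatever is buffered, for example when the content block ends.
    mutating func flush(at now: TimeInterval) -> String? {
        guard !pending.isEmpty else { return nil }
        let out = pending
        pending = ""
        lastFlush = now
        return out
    }
}
```

A real integration would also need a timer or an end-of-block `flush(at:)` call so that a final delta is not left sitting in the buffer when no further deltas arrive.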
Specialized tool widgets (full catalog in events spec §3):
| Tool | Widget pattern |
| --- | --- |
| Read / Grep / Glob / LS | Path + count preview; expanded = content / match list |
| Edit / Write / MultiEdit | Inline DiffView (green +, red −) with syntax accents |
Dedup: stream_event by (session_id, event.index, event.type, uuid); assistant by message.id; user by tool_use_id; system by uuid. assistant snapshot is authoritative over partial state.
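The dedup rules above map naturally onto a hashable key with one case per top-level event type. The sketch below assumes the listed fields are available as plain strings and integers; `EventKey` and `EventDeduplicator` are hypothetical names.

```swift
import Foundation

/// Identity of an incoming event, one case per top-level type.
enum EventKey: Hashable {
    case streamEvent(sessionID: String, index: Int, type: String, uuid: String)
    case assistant(messageID: String)
    case user(toolUseID: String)
    case system(uuid: String)
}

/// Remembers which keys were already applied so replayed events are dropped.
struct EventDeduplicator {
    private var seen = Set<EventKey>()

    /// Returns true the first time a key is seen and false for duplicates.
    mutating func shouldApply(_ key: EventKey) -> Bool {
        seen.insert(key).inserted
    }
}
```

This only covers duplicate detection. Letting an assistant snapshot overwrite partial state built from stream events is a reducer concern and is not shown here.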
Edge cases covered: CLI crash mid-turn, reconnect + VT-snapshot duplicates, unknown event types (never crash), partial JSON parse, long thinking content, large tool_result (>1 MiB truncation), cost thresholds, user interrupt (⌘. via streaming-input ControlRequest → SIGINT fallback).
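For the partial-JSON edge case, one common approach is to buffer raw bytes and only hand complete newline-terminated lines to the JSON parser. The sketch below assumes the stream is JSONL with one event per line; `JSONLLineBuffer` is a hypothetical name and not the spec's ClaudeStreamJSONParser.

```swift
import Foundation

/// Splits an arbitrarily chunked byte stream into complete newline-terminated lines.
struct JSONLLineBuffer {
    private var buffer = Data()

    /// Bytes still waiting for their terminating newline.
    var pendingByteCount: Int { buffer.count }

    /// Appends a chunk and returns every complete line now available.
    /// A trailing partial line stays buffered until its newline arrives.
    mutating func append(_ chunk: Data) -> [Data] {
        buffer.append(chunk)
        var lines: [Data] = []
        while let newline = buffer.firstIndex(of: 0x0A) {
            let line = buffer[buffer.startIndex..<newline]
            if !line.isEmpty {
                lines.append(Data(line)) // copy so indices start at zero
            }
            buffer.removeSubrange(buffer.startIndex...newline)
        }
        return lines
    }
}
```

Chunk boundaries from a PTY or gRPC stream can fall anywhere, including in the middle of a key, so the parser never sees a line until it is complete. Lines that were cut by a reconnect replay would still need the dedup and tolerant-decoding steps described in this section.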
8. External dependencies
New SPM dependencies (all require explicit approval before adoption):
apple/swift-markdown (Apache-2.0) — AST parser for markdown rendering. Used as AST-provider; inline rendering delegated to Apple's AttributedString(markdown:, options: .inlineOnlyPreservingWhitespace); block rendering is a custom MarkupVisitor. Bus factor 0 (Apple), streaming control is ours, full render of chat-relevant markdown.
(Optional, v1.1+) JohnSundell/Splash or equivalent for syntax-highlighted code fences. Not in MVP — MVP renders code fences as monochrome monospaced blocks.
Explicitly rejected for this feature:
LiYanan2004/MarkdownView: useful as a reference for code patterns (swift-markdown + Highlightr), but full adoption is rejected because of bus factor 1 and because Highlightr pulls in WebKit.
ACP Swift SDK — doesn't exist; deferred to post-MVP.
9. MVP acceptance criteria
Before Dialogue MVP is considered done:
A .agentDialogue(.claudeCode) session can be created from the UI; the CLI launches with the expected stream-json flags and JSONL is parsed into AgentChatFeature.State.transcript.
Progressive markdown renders assistant messages with 30fps throttling — no visible flicker on a typical 2000-token answer.
Tool calls appear as cards with lifecycle state (pending → in_progress → completed / failed) and specialized widgets for Read / Edit / Bash / Grep / WebFetch / WebSearch / mcp__*. Output over 2KB truncates with "Show full".
Parser errors do not crash the UI — unknown events log + surface in debug panel, session continues.
Graceful terminate — process exit closes AsyncStreams; no zombie processes.
Pre-flight check — missing auth / /login interactive flow shows explicit error screen "Needs interactive setup. Open as Terminal".
Open as Terminal action (⌘⇧T) creates a sibling tab via claude --resume <session_id> with preserved context. The reverse action, Open as Dialogue (⌘⇧D), works the same way.
A11y — all 9 DS checklist points satisfied for every Dialogue view; Stream Inspector is VoiceOver-navigable; all animations guarded by accessibilityReduceMotion.
Stream Inspector (⌘⌥R) reveals raw JSONL stream for debugging parser issues.
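The truncation criterion for tool output (2KB in the criteria above, with 40 lines also mentioned under risks) could be implemented along these lines. This is a sketch: the function and type names are hypothetical, and cutting on whole characters rather than raw bytes is an assumption made here to avoid splitting a UTF-8 sequence.

```swift
import Foundation

struct TruncatedOutput: Equatable {
    let preview: String
    let isTruncated: Bool
}

/// Limits the text shown on a tool card; the full text stays available for "Show full".
func truncateForCard(_ output: String,
                     maxBytes: Int = 2048,
                     maxLines: Int = 40) -> TruncatedOutput {
    // Apply the line limit first.
    let lines = output.split(separator: "\n", omittingEmptySubsequences: false)
    var preview = lines.prefix(maxLines).joined(separator: "\n")
    var truncated = lines.count > maxLines

    // Then apply the byte limit, keeping only whole characters.
    if preview.utf8.count > maxBytes {
        truncated = true
        var bytes = 0
        var end = preview.startIndex
        for index in preview.indices {
            let size = String(preview[index]).utf8.count
            if bytes + size > maxBytes { break }
            bytes += size
            end = preview.index(after: index)
        }
        preview = String(preview[..<end])
    }
    return TruncatedOutput(preview: preview, isTruncated: truncated)
}
```

The card would render `preview` and show the "Show full" affordance only when `isTruncated` is true.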
11. Risks (top)
| Risk | Severity | Mitigation |
| --- | --- | --- |
| Reconnect via VT-snapshot replay breaks JSONL mid-line | Critical | Parser robust to partial lines; dedup by Claude message_id. Post-MVP — server-side topology solves it first-class. |
| Agent drops to interactive TUI (/login, OAuth) that Dialogue can't show | Major | Pre-flight claude doctor / auth check; missing → explicit "Open as Terminal" screen. |
| Claude stream-json schema drift | Major | Tolerant Codable (decodeIfPresent), unknown event → log + debug panel + continue. CI test against live CLI. |
| User confusion from missing toggle ("I said switch mode") | Major | Explicit picker at creation + "Open as X" actions + release notes. First release — .agentTerminal is default; .agentDialogue opt-in. |
| Large tool_output hangs card UI | Major | Truncate 2KB / 40 lines + "Show full" modal. Very large → binary summary. |
| apple/swift-markdown types not Sendable under strict concurrency | Major | Parse on @MainActor, pass only RenderedBlock / AttributedString across actor boundaries. |
| SwiftUI rerender performance on long transcript | Major | LRU [hash(blockSource): RenderedBlock] cache; LazyVStack; block-boundary stable/tail split. Profile post-MVP; escalate to performance-expert if red. |
| AttributedString(markdown:) inline parser differs from cmark-gfm on edge cases | Minor | Fallback = write own inline visitor (~1.5 days). |
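The schema-drift mitigation (tolerant Codable with decodeIfPresent, unknown events kept rather than thrown) might look like the sketch below. Only the `type`, `subtype` and `uuid` fields are taken from this spec; the type names and the decision to model just the system event are assumptions for illustration.

```swift
import Foundation

/// A deliberately small slice of a `system` event. Every field is optional so
/// that added, removed or renamed fields do not make decoding throw.
struct SystemEvent: Decodable, Equatable {
    let subtype: String?
    let uuid: String?

    private enum CodingKeys: String, CodingKey { case subtype, uuid }

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        subtype = try c.decodeIfPresent(String.self, forKey: .subtype)
        uuid = try c.decodeIfPresent(String.self, forKey: .uuid)
    }
}

/// Top-level envelope: known types get a payload, everything else is preserved
/// as `.unknown` so it can be logged and shown in a debug panel.
enum DecodedEvent: Decodable, Equatable {
    case system(SystemEvent)
    case unknown(type: String)

    private enum CodingKeys: String, CodingKey { case type }

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        let type = try c.decodeIfPresent(String.self, forKey: .type) ?? ""
        switch type {
        case "system":
            self = .system(try SystemEvent(from: decoder))
        default:
            self = .unknown(type: type)
        }
    }
}
```

Extra keys in the payload are ignored by the keyed container, so a newer CLI that adds fields does not break decoding; a line that is not a JSON object still throws and would be handled by the caller.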
12. Wave breakdown
Tasks decomposed across six waves (0 = prerequisites, 5 = polish). Sub-issue list at the bottom of this epic is populated when child tasks are created. See each child for full acceptance criteria.
| Wave | Focus | Unblocks |
| --- | --- | --- |
| 0 | DesignSystem primitives (ds-api) | 3, 4 |
| 1 | rawStdout hook, SessionSurface, docs commit | 2 |
| 2 | AgentChat domain + parser + reducer (no UI) | 3, 4 |
| 3 | AgentChatUI views, markdown renderer, widgets | 4 |
| 4 | Integration — routing, command builder, settings, "Open as X" | 5 |
| 5 | Polish — a11y sweep, tests, profiling, docs | |
Parallelism within a wave is encouraged; cross-wave is mostly strict (see Blocked by / Blocks fields on each child).
13. References
Research report (frozen): swarm-report/chat-mode-ui-research.md (in research worktree; summary committed to docs/architecture/dialogue.md as part of wave 1).
Events spec: swarm-report/chat-mode-spec-events.md → committed to docs/architecture/dialogue-events.md in wave 1.
Full breakdown across 6 waves. Sub-issue links via GitHub parent/child relationship; blockers documented per-issue via Blocked by references. Status on the Relay project board.
Dialogue — structured chat UI for agent sessions
Target: macOS 26+ native client. First supported agent: Claude Code. Architecture opens path to ACP / Codex / other agents post-MVP.
1. Problem & motivation
Today every agent session in Relay is a raw PTY rendered via SwiftTerm. For shell sessions this is correct. For AI coding agents it is a mismatch: the agent produces structured output (messages, tool calls, thinking, plugin invocations, diffs), and rendering it as undifferentiated ANSI text discards meaning, hurts scannability, and prevents UX polish (markdown, syntax-highlighted code, collapsible tool output, approval sheets, live progress).
Dialogue introduces a second presentation surface for agent sessions: a structured chat UI that parses the agent's stream and renders messages, tool calls, and thinking as first-class UI primitives. The raw terminal stays available for sessions that need it (interactive TUI prompts, /login, shell-like flows).
The surface is selected at session creation and is not a live toggle, because Claude Code's interactive TTY and -p --output-format stream-json are different CLI entrypoints with incompatible stdout contracts. To meet the "switch mode" expectation, an Open as Terminal / Open as Dialogue action creates a sibling session via claude --resume <session_id> (context preserved).
2. User stories
3. Naming
Feature name: Dialogue. In code: SessionSurface.agentDialogue(.claudeCode), DialogueFeature (TCA reducer), DialogueView (SwiftUI), DialogueMarkdownView (markdown renderer wrapper). User-facing strings: "Dialogue", "New Dialogue Session", "Open as Dialogue".
Considered alternatives (rejected): Parley, Atelier, Converse, Scribe, Studio. Dialogue wins on symmetry with Terminal, cross-lingual clarity, no conflict with Claude / Anthropic brand terms.
4. Architecture
4.1 Packages
| Package | Content |
| --- | --- |
| AgentChat (new) | AgentMessage, ToolCall, AgentStreamEvent, AgentChatSession protocol, ClaudeStreamJSONParser, AgentChatFeature reducer. No SwiftUI, no gRPC, no SwiftTerm. Pure Swift + Foundation. |
| AgentChatUI (new) | DialogueView, UserMessageView, AssistantMessageView, ToolCallCardView, specialized tool widgets, ThinkingIndicatorView, ApprovalPromptView, DialogueMarkdownView, StreamInspectorView. |
| TerminalAbstraction | TerminalSession protocol gains rawStdout: AsyncStream<Data> — a side-channel of raw bytes before VT parsing. Default impl provided. |
| TerminalSwiftTerm / RemoteTerminal | rawStdout. Fork PTY-master bytes (local) or ServerMessage.stdout_data (gRPC) into the stream. |
| AgentOrchestrator / ClaudeCommandBuilder | surface: SessionSurface in build; for .agentDialogue(.claudeCode) append -p --output-format stream-json --verbose --include-partial-messages --input-format stream-json --bare --permission-mode bypassPermissions. |
| SharedModels | SessionSurface enum: .shell / .agentTerminal(AgentKind) / .agentDialogue(AgentKind). Immutable after session creation. |
| PaneManager | Tab.surface: SessionSurface. Routing. New session actions with surface choice. |
| DesignSystem | Card, StatusIndicator, BlockCodeContainer. No markdown (ISP). |
| Relay (app target) | MainFeature view routes by tab.surface. Settings UI for preferredClaudeSurface. Menu actions. |
4.2 Dependency graph (no cycles)
4.3 Data flow (a Claude Dialogue turn)
DialogueView input.
AgentChatFeature sends bytes to TerminalSession.write(_:) as stream-json ClientMessage.
ServerMessage.stdout_data → TerminalSession.rawStdout.
ClaudeCodeJSONLSession subscriber reads rawStdout, feeds ClaudeStreamJSONParser.
AgentStreamEvent values — messageStarted, textDelta, toolCallStarted, toolCallCompleted, thinkingDelta, sessionEnd, etc.
AgentChatFeature reducer updates transcript: IdentifiedArrayOf<AgentMessage> with dedup by message_id.
DialogueView re-renders — bubble grows with live caret, tool cards flip through lifecycle states, status strip reflects current activity.
4.4 Why client-side parsing in MVP (vs server-side vs ACP)
Evaluated three topologies:
ACP (post-MVP): unifies N agents, first-class permissions + diff blocks; no Swift SDK yet (~3000–4500 LOC to write); Claude works via Node-adapter @zed-industries/claude-agent-acp which reduces CLI feature parity. Trigger for adoption: second chat-agent in roadmap OR ACP 1.0 OR Anthropic officially accepts ACP.
The AgentStreamEvent internal model is designed close to ACP's ToolCall/ContentBlock shape so that future migration is a parser swap, not a reducer/UI rewrite.
5. Session surface model
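New enum in SharedModels:
Immutable after session creation.
AgentSessionFeature.State.presentation: Presentation enum:
Scoped via ifCaseLet. TerminalFeature.State stays pure-terminal; Dialogue state lives in a sibling reducer.
Mode selection UX:
New Session creation sheet shows explicit picker — Terminal / Dialogue — when creating an agent session.
preferredClaudeSurface in SettingsStore sets the default (initial value: .agentTerminal(.claudeCode); flipped to .agentDialogue in a later release after positive feedback).
Keyboard shortcuts: ⌘⇧N — New Dialogue Session; ⌘⌥N — New Terminal Session.
Cross-surface resume: menu item on a running session — Open as Terminal / Open as Dialogue (⌘⇧T / ⌘⇧D) — creates a sibling tab with the same session_id via claude --resume <session_id>. Context preserved, physically kill + respawn. User's expectation of "switch mode" is met without breaking the process-level separation.
6. UI concept — live feel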
Multi-layer live signals (all behind accessibilityReduceMotion guards):
Token level — ▊ mini-caret at end of streaming text; text_delta throttled to 30fps (~33ms batching of deltas per runloop); optional character-by-character smoothing when tokens arrive in bursts.
Message level — soft fade-in (DS.Motion.fast, 120ms); auto-scroll anchored to bottom only when user is already at bottom, else floating ↓ New message chip; subtle glow pulse (1-2% opacity) on active streaming bubble.
Tool-call level — card appears instantly with pending status (shimmer skeleton); pending → in_progress pill opacity pulse; completed checkmark spring (scale 0.8 → 1.0); failed shake (±4pt, 2 cycles); live duration tick every 1s.
Session level — top status strip above input: idle / Thinking… / Responding… / Running: Bash (npm test) / error banner with retry info. Tab-icon ambient pulse during streaming.
Specialized tool widgets (full catalog in events spec §3):
| Tool | Widget pattern |
| --- | --- |
| Read / Grep / Glob / LS | Path + count preview; expanded = content / match list |
| Edit / Write / MultiEdit | Inline DiffView (green +, red −) with syntax accents |
| … | … DialogueView inside the card (routed via parent_tool_use_id) |
| mcp__<server>__<tool> | … |
7. Events — source of truth
Full event catalog with wire-level detail — docs/architecture/dialogue-events.md (created in this epic). Summary:
Top-level types: system (subtypes: init, api_retry, plugin_install, compact_boundary, rate_limit, unknown), assistant (complete message), user (tool_result OR echo), result (exactly one, final), stream_event (raw Anthropic SSE wrapped when --include-partial-messages).
stream_event.event.type: message_start, content_block_start, content_block_delta (delta.type: text_delta | input_json_delta | thinking_delta | signature_delta | citations_delta), content_block_stop, message_delta, message_stop, ping.
Dedup: stream_event by (session_id, event.index, event.type, uuid); assistant by message.id; user by tool_use_id; system by uuid. assistant snapshot is authoritative over partial state.
Edge cases covered: CLI crash mid-turn, reconnect + VT-snapshot duplicates, unknown event types (never crash), partial JSON parse, long thinking content, large tool_result (>1 MiB truncation), cost thresholds, user interrupt (⌘. via streaming-input ControlRequest → SIGINT fallback).
8. External dependencies
New SPM dependencies (all require explicit approval before adoption):
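apple/swift-markdown (Apache-2.0) — AST parser for markdown rendering. Used as AST-provider; inline rendering delegated to Apple's AttributedString(markdown:, options: .inlineOnlyPreservingWhitespace); block rendering is a custom MarkupVisitor. Bus factor 0 (Apple), streaming control is ours, full render of chat-relevant markdown.
(Optional, v1.1+) JohnSundell/Splash or equivalent for syntax-highlighted code fences. Not in MVP — MVP renders code fences as monochrome monospaced blocks.
Explicitly rejected for this feature:
LiYanan2004/MarkdownView — reference for code patterns (swift-markdown + Highlightr), but full adoption is rejected: bus factor 1 + Highlightr tugs WebKit.
ACP Swift SDK — doesn't exist; deferred to post-MVP.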
9. MVP acceptance criteria
Before Dialogue MVP is considered done:
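A .agentDialogue(.claudeCode) session can be created from the UI; the CLI launches with the expected stream-json flags and JSONL is parsed into AgentChatFeature.State.transcript.
Progressive markdown renders assistant messages with 30fps throttling — no visible flicker on a typical 2000-token answer.
Tool calls appear as cards with lifecycle state (pending → in_progress → completed / failed) and specialized widgets for Read / Edit / Bash / Grep / WebFetch / WebSearch / mcp__*. Output over 2KB truncates with "Show full".
Parser errors do not crash the UI — unknown events log + surface in debug panel, session continues.
Graceful terminate — process exit closes AsyncStreams; no zombie processes.
Pre-flight check — missing auth / /login interactive flow shows explicit error screen "Needs interactive setup. Open as Terminal".
Open as Terminal action (⌘⇧T) creates a sibling tab via claude --resume <session_id> with preserved context. Reverse Open as Dialogue (⌘⇧D) works the same way.
A11y — all 9 DS checklist points satisfied for every Dialogue view; Stream Inspector is VoiceOver-navigable; all animations guarded by accessibilityReduceMotion.
Stream Inspector (⌘⌥R) reveals raw JSONL stream for debugging parser issues.
10. Out of MVP (deliberately deferred)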
--include-hook-events — undocumented schema, silently skipped.
AskUserQuestion tool — requires streaming-input + ControlResponse; MVP uses bypassPermissions.
… --permission-mode.
… (--json-schema is final-only in result.structured_output).
… max_thinking_tokens (disables stream_events).
… (system/task_*, CronCreate).
11. Risks (top)
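| Risk | Severity | Mitigation |
| --- | --- | --- |
| Reconnect via VT-snapshot replay breaks JSONL mid-line | Critical | Parser robust to partial lines; dedup by Claude message_id. Post-MVP — server-side topology solves it first-class. |
| Agent drops to interactive TUI (/login, OAuth) that Dialogue can't show | Major | Pre-flight claude doctor / auth check; missing → explicit "Open as Terminal" screen. |
| Claude stream-json schema drift | Major | Tolerant Codable (decodeIfPresent), unknown event → log + debug panel + continue. CI test against live CLI. |
| User confusion from missing toggle ("I said switch mode") | Major | Explicit picker at creation + "Open as X" actions + release notes. First release — .agentTerminal is default; .agentDialogue opt-in. |
| Large tool_output hangs card UI | Major | Truncate 2KB / 40 lines + "Show full" modal. Very large → binary summary. |
| apple/swift-markdown types not Sendable under strict concurrency | Major | Parse on @MainActor, pass only RenderedBlock / AttributedString across actor boundaries. |
| SwiftUI rerender performance on long transcript | Major | LRU [hash(blockSource): RenderedBlock] cache; LazyVStack; block-boundary stable/tail split. Profile post-MVP; escalate to performance-expert if red. |
| AttributedString(markdown:) inline parser differs from cmark-gfm on edge cases | Minor | Fallback = write own inline visitor (~1.5 days). |
12. Wave breakdown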
Tasks decomposed across six waves (0 = prerequisites, 5 = polish). Sub-issue list at the bottom of this epic is populated when child tasks are created. See each child for full acceptance criteria.
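| Wave | Focus | Unblocks |
| --- | --- | --- |
| 0 | DesignSystem primitives (ds-api) | 3, 4 |
| 1 | rawStdout hook, SessionSurface, docs commit | 2 |
| 2 | AgentChat domain + parser + reducer (no UI) | 3, 4 |
| 3 | AgentChatUI views, markdown renderer, widgets | 4 |
| 4 | Integration — routing, command builder, settings, "Open as X" | 5 |
| 5 | Polish — a11y sweep, tests, profiling, docs | |
Parallelism within a wave is encouraged; cross-wave is mostly strict (see Blocked by / Blocks fields on each child).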
13. References
Research report (frozen): swarm-report/chat-mode-ui-research.md (in research worktree; summary committed to docs/architecture/dialogue.md as part of wave 1).
Events spec: swarm-report/chat-mode-spec-events.md → committed to docs/architecture/dialogue-events.md in wave 1.
Sub-issues
Full breakdown across 6 waves. Sub-issue links via GitHub parent/child relationship; blockers documented per-issue via Blocked by references. Status on the Relay project board.
Wave 0 — DesignSystem primitives (ds-api)
Wave 1 — Foundation
rawStdout: AsyncStream<Data> hook
Wave 2 — Domain & Parser
Wave 3 — AgentChatUI
Wave 4 — Integration
Wave 5 — Polish
Dependency graph (critical path)
Parallelism hints
Within each wave:
Typical single-developer calendar: 4–6 weeks end-to-end.