Add agentic peer API design spec #8
Merged
Proposes a JSON-RPC server (`willow-agent` binary) that exposes the full ClientHandle API to external agents over a local Unix socket or TCP. Agents are real peers with their own Ed25519 identity, subject to the same permission model. Includes event streaming via WebSocket, bearer token auth with scoped permissions, and a phased implementation plan. https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
…pi-access-Z8Xie # Conflicts: # docs/specs/2026-03-29-agentic-peer-api-design.md
MCP (Model Context Protocol) is JSON-RPC 2.0 with conventions built for AI agent integration. Key changes:
- Tools replace raw JSON-RPC methods for mutations (send_message, etc.)
- Resources expose read-only state (channels, members, messages) with subscription support for change notifications
- Three transports: stdio (AI clients spawn directly), SSE (bots/scripts), Streamable HTTP (stateless calls)
- AI clients (Claude Code, Claude Desktop, Cursor) get zero-config discovery via tools/list and resources/list
- Non-AI consumers still work identically via plain JSON-RPC
- Token scopes now filter tool/resource visibility in MCP listings
- Updated examples for Python MCP SDK, Rust SDK, and Claude Code config
https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
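To make "tools replace raw JSON-RPC methods" concrete, here is a minimal sketch of how a tool invocation is framed on the wire: a JSON-RPC 2.0 request whose method is `tools/call` and whose params carry the tool name and its arguments. The helper names `json_escape` and `tools_call_request` are hypothetical, and the `channel` and `body` argument names used for `send_message` in the usage note are assumptions, not taken from the spec.

```rust
/// Escape a string so it can sit inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a JSON-RPC 2.0 `tools/call` request for a tool whose arguments
/// are all strings (a simplification made for this sketch).
fn tools_call_request(id: u64, tool: &str, args: &[(&str, &str)]) -> String {
    let fields: Vec<String> = args
        .iter()
        .map(|(k, v)| format!("\"{}\":\"{}\"", json_escape(k), json_escape(v)))
        .collect();
    format!(
        "{{\"jsonrpc\":\"2.0\",\"id\":{},\"method\":\"tools/call\",\"params\":{{\"name\":\"{}\",\"arguments\":{{{}}}}}}}",
        id,
        json_escape(tool),
        fields.join(",")
    )
}
```

Calling `tools_call_request(1, "send_message", &[("channel", "general"), ("body", "hi")])` yields a single-line request object. A real client would normally use an MCP SDK or a JSON library rather than hand-built strings.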
Reflects the recent client refactor:
- ClientHandle<N> is now generic over Network trait, actor-based
- EndpointId (Ed25519 public key, 64-char hex) replaces string peer IDs
- All ClientHandle methods are async (actor message passing)
- Startup flow updated: actor system init, connect(network), listener tasks
- Added missing tools: share_file_inline, voice (join/leave/mute/deafen), authorize_workers
- Added missing resources: voice status, voice participants
- Added missing notifications: ServerDescriptionChanged, VoiceJoined, VoiceLeft, JoinLinkResponse, JoinLinkDenied
- Message resource now includes edited, reply_to, reactions fields
- Added willow-network and willow-actor to dependency list
- Documented permission enum values for tool parameter reference
https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
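Since peer IDs are now 64-character hex strings, a tool that accepts one has to validate and decode it before use. The following is a rough sketch of that step using only the standard library. `EndpointIdBytes`, `ParseError`, and `parse_endpoint_id` are hypothetical names for illustration, not the project's actual `EndpointId` API. The sketch checks only length and hex digits, not that the bytes form a valid Ed25519 public key.

```rust
/// Hypothetical stand-in for an endpoint identifier: 32 raw key bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct EndpointIdBytes([u8; 32]);

#[derive(Debug, PartialEq, Eq)]
enum ParseError {
    /// Input was not exactly 64 bytes long.
    BadLength(usize),
    /// Input contained a character that is not a hex digit.
    BadHexDigit(char),
}

/// Decode one ASCII hex digit into its 4-bit value.
fn hex_val(b: u8) -> Result<u8, ParseError> {
    (b as char)
        .to_digit(16)
        .map(|d| d as u8)
        .ok_or(ParseError::BadHexDigit(b as char))
}

/// Parse a 64-character hex string into 32 bytes.
fn parse_endpoint_id(s: &str) -> Result<EndpointIdBytes, ParseError> {
    let bytes = s.as_bytes();
    if bytes.len() != 64 {
        return Err(ParseError::BadLength(bytes.len()));
    }
    let mut out = [0u8; 32];
    // Each output byte comes from two consecutive hex digits.
    for (slot, pair) in out.iter_mut().zip(bytes.chunks_exact(2)) {
        *slot = (hex_val(pair[0])? << 4) | hex_val(pair[1])?;
    }
    Ok(EndpointIdBytes(out))
}
```

Rejecting malformed IDs at the tool boundary lets the caller receive a clear parameter error rather than a failure deeper in the client.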
The agent API enables UI-free end-to-end testing of multi-peer behavior. Tests spawn willow-agent processes connected over real iroh networking and drive them via typed MCP tool calls and resource reads. Adds:
- AgentTestHarness design for managing N agent peers + relay
- Three concrete test examples: message delivery, permission enforcement, state convergence
- Comparison table showing which scenarios are hard via UI but easy via MCP
- Integration with existing test tiers (state < client < MCP E2E < Playwright)
- justfile commands: test-agent, test-agent-e2e
- Updated Phase 2/3 to include harness and E2E test porting
https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
Reflects the latest refactor replacing the monolithic ClientStateActor with a layered reactive system:
Architecture section:
- 6 domain-specific StateActor<S> instances (EventState, ServerRegistry, ChatMeta, ProfileState, NetworkMeta, VoiceState)
- DerivedActor layer computing reactive views (MessagesView, ChannelsView, MembersView, UnreadView, RolesView, ConnectionView)
- ClientViewHandle with StateRef<T> at every granularity for reads
- ClientMutations<N> typed interface for writes
- Broker<ClientEvent> for pub/sub event distribution
- PersistenceActor for fire-and-forget I/O
Resource subscription mapping:
- Each MCP resource backed by a specific StateRef<T> from the view system
- StateRef::subscribe() drives notifications/resources/updated
- PartialEq at every layer prevents spurious updates
- No polling needed: changes push from state actors through derived views to MCP transport
E2E testing:
- In-process harness using ClientHandle<MemNetwork> (~5-50ms/test) exercises full actor stack without processes or real network
- Process-spawning harness for MCP protocol + real iroh validation
- Updated examples to use views/mutations APIs directly
- New test tier table with in-process E2E layer
Also added missing notification types: FileAnnounced, Listening, VoiceSignal.
https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
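The "PartialEq at every layer prevents spurious updates" point can be illustrated with a small, synchronous sketch: a cell that notifies subscribers only when a write actually changes the value. `Observable` is a hypothetical type written for this example. It is not the project's `StateRef<T>`, which is actor-based and asynchronous.

```rust
/// Hypothetical observable cell (not the real StateRef<T>).
struct Observable<T: PartialEq> {
    value: T,
    subscribers: Vec<Box<dyn FnMut(&T)>>,
}

impl<T: PartialEq> Observable<T> {
    fn new(value: T) -> Self {
        Self { value, subscribers: Vec::new() }
    }

    /// Register a callback that runs on every real change.
    fn subscribe(&mut self, f: impl FnMut(&T) + 'static) {
        self.subscribers.push(Box::new(f));
    }

    /// Store `next`; notify only if it differs from the current value.
    /// Returns true when subscribers were notified.
    fn set(&mut self, next: T) -> bool {
        if self.value == next {
            return false; // equal value: no spurious update
        }
        self.value = next;
        for sub in self.subscribers.iter_mut() {
            sub(&self.value);
        }
        true
    }
}
```

In this sketch, writing the same unread count twice triggers one notification at most. Applied at each derived layer, the same check would keep an unchanged view from producing a `notifications/resources/updated` message.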
Audited every public method on ClientHandle, ClientMutations, accessors,
views, servers, voice, and joining modules against the spec. Fixes:
Architecture:
- State actor table now shows exact field names and types from code
- Derived view table shows actual source dependencies from compute fns
- ClientViewHandle fields listed at all 3 layers (terminal, L2, L1)
- Distinguish ClientHandle methods (multi-actor coordination) from
ClientMutations (event-sourced operations)
Tools:
- Corrected description: tools wrap ClientHandle methods, which
delegate to mutations or directly coordinate domain actors
Resources:
- Every resource now has explicit Backed By column showing exact
StateRef<T> or accessor
- Added willow://channel/{name}/typing (from typing_in() accessor)
- Added typing_peers to willow://connection (from ConnectionView)
- Added display_name to willow://server/current
- Corrected willow://server/roles field name from role_id to id
(matches RoleEntry struct)
- Noted which resources support reactive subscriptions vs polling
https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
4-phase plan covering crate skeleton, tool/resource implementations, E2E test harness with 24 multi-peer test scenarios, token scoping, and SSE transport. Heavy emphasis on in-process E2E testing via AgentTestHarness + MemNetwork as the primary way to test complex multi-peer behavior without a browser. https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
Fixes found during audit:
- ClientEvent variant count: 25 → 27 (was missing JoinLinkResponse, JoinLinkDenied in count)
- token_scope comment: "Phase 3" → "Phase 4" (scopes are Phase 4)
- Remove "add crates/agent to workspace" from modified files list (workspace uses crates/* glob, no edit needed)
- Fix Phase 1d contradiction: tool list was described as "empty" but 1e defines schema stubs in same phase
- Fix Phase 2h test sequencing: test_client() is pub(crate) in willow-client, so agent crate creates its own local helper for Phase 2 (Phase 3d introduces proper test-utils feature)
- Fix check-wasm: no update needed (already lists crates explicitly)
- Add EndpointId parsing for authorize_workers and generate_invite tool call descriptions
https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
"10+ more E2E tests" was inaccurate — Phase 4 has 6 E2E tests (16-21) plus 3 scope unit tests (22-24). Updated to match actual inventory. https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
Add the willow-agent crate: an MCP server binary that exposes Willow's ClientHandle to AI agents via tools, resources, and notifications.
- 37 MCP tools covering messaging, channels, permissions, server mgmt, identity, invites, voice, and state verification
- 15 MCP resources for reading identity, connection, server state, channels, members, roles, messages, voice status, etc.
- 27 ClientEvent → JSON notification serializers
- CLI with clap (relay, identity, transport, token management)
- Bearer token auth with 256-bit random hex tokens
- lib.rs + main.rs split for integration test access
- 18 unit tests + 15 E2E integration tests (all passing)
- Workspace clippy clean, all workspace tests passing
https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
- Add TokenScope enum (Full, ReadOnly, Messaging, Admin, Custom) with tool/resource filtering in scopes.rs
- Wire scope enforcement into WillowMcpServer: list_tools and call_tool filter by scope, list_resources filters by scope
- Add WillowMcpServer::with_scope() constructor
- Add Streamable HTTP transport via rmcp's transport-streamable-http-server feature (serve_http function with axum router)
- Support --transport http in CLI with bearer token auth
- Add 9 new E2E tests: kick_member, server_rename, display_name_updates, voice_join_leave, send_reply, create_and_delete_channel, plus 3 scope enforcement tests (readonly, messaging, custom allowlist)
- Total: 24 E2E tests + 24 unit tests, all passing
- Workspace clippy clean, all workspace tests passing
https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
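As an illustration of scope-based filtering, the sketch below shows one way a `TokenScope` could gate both tool listing and tool calls. The variant names come from the commit message. Everything else is an assumption made for the example: the tool groupings in `MESSAGING_TOOLS` and `ADMIN_TOOLS`, the choice that `Admin` also includes messaging tools, and the choice that `ReadOnly` exposes no tools. The actual `scopes.rs` may differ on all of these.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq, Eq)]
enum TokenScope {
    Full,
    ReadOnly,
    Messaging,
    Admin,
    /// Explicit allowlist of tool names.
    Custom(HashSet<String>),
}

// Hypothetical groupings; the real assignment of tools to scopes is not
// given in the source.
const MESSAGING_TOOLS: &[&str] = &["send_message", "share_file_inline"];
const ADMIN_TOOLS: &[&str] = &["authorize_workers", "generate_invite"];

impl TokenScope {
    /// Whether a token with this scope may call `tool`.
    fn allows_tool(&self, tool: &str) -> bool {
        match self {
            TokenScope::Full => true,
            // Assumption: read-only tokens see resources but no tools.
            TokenScope::ReadOnly => false,
            TokenScope::Messaging => MESSAGING_TOOLS.contains(&tool),
            // Assumption: admin is a superset of messaging.
            TokenScope::Admin => {
                MESSAGING_TOOLS.contains(&tool) || ADMIN_TOOLS.contains(&tool)
            }
            TokenScope::Custom(allowed) => allowed.contains(tool),
        }
    }

    /// Filter a full tool list down to what this scope may see, preserving order.
    fn visible_tools<'a>(&self, all: &[&'a str]) -> Vec<&'a str> {
        all.iter().copied().filter(|t| self.allows_tool(t)).collect()
    }
}
```

Using the same `allows_tool` check for both listing and calling means a tool hidden from the listing is also rejected when called directly.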
Resource fixes to match spec:
- willow://connection: add typing_peers field with peer_id + channel
- willow://server/current: add description and display_name fields
- willow://channel/{name}/messages: add reactions field (HashMap)
- willow://server/join-links: rename link_id→id, used→uses per spec
Client accessor additions:
- Add server_description() accessor to ClientHandle
- Add typing_peers() accessor returning Vec<(peer_id, channel)>
Other fixes:
- Listening notification: field renamed topic→address to match spec
- CLI transport help text now mentions http option
- justfile test-all now includes test-agent-e2e
Doc alignment:
- Spec: update transports (SSE+HTTP → Streamable HTTP), crate structure,
dependencies, notification field names
- Plan: update file list, transport section, E2E test inventory to
match actual 24 implemented tests
https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
…r tests
- Notification bridge: subscribe to Broker<ClientEvent> and forward events as custom MCP notifications (willow/event) via rmcp Peer handle. Wired for both stdio (from RunningService) and HTTP (from RequestContext).
- Bearer token auth: axum middleware validates Authorization header on all HTTP transport requests. Token passed from CLI to serve_http().
- VoiceSignal notification: include signal payload (Offer/Answer/IceCandidate) instead of dropping it with `..`.
- Multi-peer test infrastructure: add test_client_on_hub() to willow-client for creating connected clients on a shared MemHub.
- 5 new E2E tests (29 total): notification serialization, JSON roundtrip, hub-connected client, two-client different IDs, separate server state.
- 1 new unit test: voice_signal_includes_payload, token_file_written_and_readable.
- Document resource subscriptions as deferred (rmcp 1.3 limitation).
https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
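The check at the heart of bearer-token middleware is small enough to sketch without any HTTP framework: extract the token from the `Authorization` header value and compare it to the expected token. The function names below are hypothetical, and this is not the project's axum middleware. The sketch matches the `Bearer ` prefix case-sensitively as a simplification. The comparison avoids an early exit on the first mismatching byte, which is a best-effort measure against timing leaks rather than a guarantee.

```rust
/// Extract the token from an `Authorization: Bearer <token>` header value.
fn bearer_token(header_value: &str) -> Option<&str> {
    let token = header_value.strip_prefix("Bearer ")?.trim();
    if token.is_empty() {
        None
    } else {
        Some(token)
    }
}

/// Compare two byte strings without stopping at the first difference.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    // OR together the XOR of every byte pair; zero means all bytes matched.
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}

/// Decide whether a request carrying `header_value` is authorized.
fn is_authorized(header_value: Option<&str>, expected: &str) -> bool {
    match header_value.and_then(bearer_token) {
        Some(token) => constant_time_eq(token.as_bytes(), expected.as_bytes()),
        None => false,
    }
}
```

A middleware layer would call something like `is_authorized` with the header from each incoming request and respond with 401 when it returns false.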
willow-agent is a native-only binary (tokio, axum, rmcp) that cannot compile for wasm32-unknown-unknown. Add --exclude willow-agent to the WASM check step, matching the existing exclusions for relay/worker/etc. https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
intendednull added a commit that referenced this pull request Apr 26, 2026
lifecycle, fix IrohBlobStore spec drift, track 4 new follow-ups
Round 2 review (two fresh agents) verified all 15 round-1 fixes land cleanly with no regressions, then surfaced 8 new findings (0 critical, 3 medium, 5 low) by widening scope to cross-component interactions, perf, and API surface.
Fixed inline (trivial doc / spec):
- Add an "Actor coordination signal" row to the spec decision tree + CLAUDE.md table covering tokio::sync::watch / oneshot / broadcast / Notify, with the explicit rule that tokio::sync::Mutex is forbidden for business state on the same terms as std/parking_lot Mutex. Closes the spec gap that left contributors without guidance on async channels. (round-2 #3)
- Reconcile spec § 184 with the corrected IrohBlobStore comment (round-1 fixed the code, missed the spec). The blob store is not an iroh-callback boundary; it's an interim stub. The relay-status timestamp Mutex stays in the iroh boundary list. (round-2 #4)
- Document the web `_event_loop` drop pattern in `crates/web/src/app.rs` so future readers see explicitly that the actor System is process-scoped on web (page reload tears everything down) and that any actor needing pre-close cleanup must route via `beforeunload`, not Drop. (round-2 #8)
Tracked as new follow-ups in spec § Follow-up work:
- F5. SearchActor head-of-line + rebuild-storm fix. Rebuild blocks Query in FIFO order; the rebuild Effect has no debounce. Fix is chunked-Rebuild + Debounce<Rebuild> wrap. (round-2 #1, #2; Med)
- F6. Browser-tier coverage for SearchIndexHandle consumers. The spawn_local + Effect path has no wasm-pack test. (round-2 #6)
- F7. Sealed ClientSpawner to narrow the system() API surface, rather than exposing the full SystemHandle. (round-2 #7)
- F8. Search-query debouncing-flicker fix via generation tag or Leptos Resource migration. (round-2 #5)
Each follow-up has a "Trigger:" line naming the dedicated PR title.
`just check` green: clippy zero warnings, 1003+ tests pass, WASM compile clean.
Loop terminates here per the user's two-round cap; no Critical issues remain.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>