Add agentic peer API design spec #8

Merged
intendednull merged 16 commits into main from claude/agentic-peer-api-access-Z8Xie
Apr 1, 2026

@intendednull
Owner

Proposes a JSON-RPC server (willow-agent binary) that exposes the full
ClientHandle API to external agents over local Unix socket or TCP. Agents
are real peers with their own Ed25519 identity, subject to the same
permission model. Includes event streaming via WebSocket, bearer token
auth with scoped permissions, and a phased implementation plan.

https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr

claude added 16 commits March 29, 2026 11:18
Proposes a JSON-RPC server (`willow-agent` binary) that exposes the full
ClientHandle API to external agents over local Unix socket or TCP. Agents
are real peers with their own Ed25519 identity, subject to the same
permission model. Includes event streaming via WebSocket, bearer token
auth with scoped permissions, and a phased implementation plan.

https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
…pi-access-Z8Xie

# Conflicts:
#	docs/specs/2026-03-29-agentic-peer-api-design.md
MCP (Model Context Protocol) is JSON-RPC 2.0 with conventions built for
AI agent integration. Key changes:

- Tools replace raw JSON-RPC methods for mutations (send_message, etc.)
- Resources expose read-only state (channels, members, messages) with
  subscription support for change notifications
- Three transports: stdio (AI clients spawn directly), SSE (bots/scripts),
  Streamable HTTP (stateless calls)
- AI clients (Claude Code, Claude Desktop, Cursor) get zero-config
  discovery via tools/list and resources/list
- Non-AI consumers still work identically via plain JSON-RPC
- Token scopes now filter tool/resource visibility in MCP listings
- Updated examples for Python MCP SDK, Rust SDK, and Claude Code config

https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
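
To make the tools-over-JSON-RPC point concrete, here is a minimal sketch of what a `tools/call` request for `send_message` might look like on the wire. The `tools/call` method and the `name`/`arguments` envelope follow MCP; the argument names `channel` and `body` and both helper functions are assumptions for illustration, not taken from the spec in this PR.

```rust
/// Escape a string for embedding in a JSON string literal
/// (quotes, backslashes, and control characters only).
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a JSON-RPC 2.0 `tools/call` request for the `send_message` tool.
/// The `channel` and `body` argument names are hypothetical.
fn send_message_request(id: u64, channel: &str, body: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"tools/call","params":{{"name":"send_message","arguments":{{"channel":"{}","body":"{}"}}}}}}"#,
        id,
        json_escape(channel),
        json_escape(body)
    )
}
```

A non-AI consumer would send the same payload over plain JSON-RPC; an MCP client would typically discover the tool via `tools/list` first.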
Reflects the recent client refactor:
- ClientHandle<N> is now generic over Network trait, actor-based
- EndpointId (Ed25519 public key, 64-char hex) replaces string peer IDs
- All ClientHandle methods are async (actor message passing)
- Startup flow updated: actor system init, connect(network), listener tasks
- Added missing tools: share_file_inline, voice (join/leave/mute/deafen),
  authorize_workers
- Added missing resources: voice status, voice participants
- Added missing notifications: ServerDescriptionChanged, VoiceJoined,
  VoiceLeft, JoinLinkResponse, JoinLinkDenied
- Message resource now includes edited, reply_to, reactions fields
- Added willow-network and willow-actor to dependency list
- Documented permission enum values for tool parameter reference

https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
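
Since tool parameters carry an EndpointId as a 64-character hex string, a tool handler has to decode it into the 32 raw key bytes before use. The sketch below shows one way that decoding could look. `parse_endpoint_id` and `ParseIdError` are hypothetical names, and it only decodes the bytes; it does not check that they form a valid Ed25519 point.

```rust
/// Errors from parsing a hex-encoded endpoint ID (hypothetical type).
#[derive(Debug, PartialEq)]
enum ParseIdError {
    /// Input did not contain exactly 64 characters.
    BadLength(usize),
    /// Input contained a character that is not a hex digit.
    BadChar(char),
}

/// Decode a 64-character hex string into 32 raw public-key bytes.
fn parse_endpoint_id(s: &str) -> Result<[u8; 32], ParseIdError> {
    let chars: Vec<char> = s.chars().collect();
    if chars.len() != 64 {
        return Err(ParseIdError::BadLength(chars.len()));
    }
    let mut out = [0u8; 32];
    for (i, pair) in chars.chunks(2).enumerate() {
        // `to_digit(16)` accepts both upper- and lower-case hex digits.
        let hi = pair[0].to_digit(16).ok_or(ParseIdError::BadChar(pair[0]))?;
        let lo = pair[1].to_digit(16).ok_or(ParseIdError::BadChar(pair[1]))?;
        out[i] = (hi * 16 + lo) as u8;
    }
    Ok(out)
}
```

Rejecting malformed input at the tool boundary keeps the error close to the caller instead of surfacing later inside the client.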
The agent API enables UI-free end-to-end testing of multi-peer behavior.
Tests spawn willow-agent processes connected over real iroh networking
and drive them via typed MCP tool calls and resource reads.

Adds:
- AgentTestHarness design for managing N agent peers + relay
- Three concrete test examples: message delivery, permission enforcement,
  state convergence
- Comparison table showing which scenarios are hard via UI but easy via MCP
- Integration with existing test tiers (state < client < MCP E2E < Playwright)
- justfile commands: test-agent, test-agent-e2e
- Updated Phase 2/3 to include harness and E2E test porting

https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
Reflects the latest refactor replacing the monolithic ClientStateActor
with a layered reactive system:

Architecture section:
- 6 domain-specific StateActor<S> instances (EventState, ServerRegistry,
  ChatMeta, ProfileState, NetworkMeta, VoiceState)
- DerivedActor layer computing reactive views (MessagesView, ChannelsView,
  MembersView, UnreadView, RolesView, ConnectionView)
- ClientViewHandle with StateRef<T> at every granularity for reads
- ClientMutations<N> typed interface for writes
- Broker<ClientEvent> for pub/sub event distribution
- PersistenceActor for fire-and-forget I/O

Resource subscription mapping:
- Each MCP resource backed by a specific StateRef<T> from the view system
- StateRef::subscribe() drives notifications/resources/updated
- PartialEq at every layer prevents spurious updates
- No polling needed — changes push from state actors through derived
  views to MCP transport
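
The "PartialEq prevents spurious updates" idea can be sketched with a tiny synchronous cell. This is not the real `StateRef<T>` (which is actor-based and async); `StateCell` and its methods are hypothetical stand-ins that only illustrate the equality gate in front of subscriber notification.

```rust
/// A value cell that notifies subscribers only when the value actually changes.
/// Hypothetical stand-in for the reactive view layer, not the real StateRef<T>.
struct StateCell<T: PartialEq> {
    value: T,
    subscribers: Vec<Box<dyn FnMut(&T)>>,
}

impl<T: PartialEq> StateCell<T> {
    fn new(value: T) -> Self {
        Self { value, subscribers: Vec::new() }
    }

    /// Register a callback that runs on every real change.
    fn subscribe(&mut self, f: impl FnMut(&T) + 'static) {
        self.subscribers.push(Box::new(f));
    }

    /// Store `next`; returns true if subscribers were notified.
    fn set(&mut self, next: T) -> bool {
        // The equality check is what suppresses spurious updates.
        if self.value == next {
            return false;
        }
        self.value = next;
        for sub in self.subscribers.iter_mut() {
            sub(&self.value);
        }
        true
    }
}
```

In the design described here, the subscriber at the end of the chain would be the code that emits `notifications/resources/updated`.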

E2E testing:
- In-process harness using ClientHandle<MemNetwork> (~5-50ms/test)
  exercises full actor stack without processes or real network
- Process-spawning harness for MCP protocol + real iroh validation
- Updated examples to use views/mutations APIs directly
- New test tier table with in-process E2E layer
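
As a rough illustration of why an in-process harness is fast, the sketch below models a shared in-memory hub that delivers a broadcast to every joined peer except the sender. `TestHub` is a hypothetical toy, not the actual `MemNetwork` or harness API; the real harness exercises the full actor stack.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory hub: each peer has an inbox of (sender, body) pairs.
#[derive(Default)]
struct TestHub {
    inboxes: HashMap<String, Vec<(String, String)>>,
}

impl TestHub {
    /// Register a peer with an empty inbox (no-op if already joined).
    fn join(&mut self, peer: &str) {
        self.inboxes.entry(peer.to_string()).or_default();
    }

    /// Deliver `body` to every peer except `from`; returns the delivery count.
    fn broadcast(&mut self, from: &str, body: &str) -> usize {
        let mut delivered = 0;
        for (peer, inbox) in self.inboxes.iter_mut() {
            if peer.as_str() != from {
                inbox.push((from.to_string(), body.to_string()));
                delivered += 1;
            }
        }
        delivered
    }

    /// Messages received by `peer` so far (empty for unknown peers).
    fn inbox(&self, peer: &str) -> &[(String, String)] {
        self.inboxes.get(peer).map(Vec::as_slice).unwrap_or(&[])
    }
}
```

A message-delivery test against such a hub needs no sockets or child processes, which is consistent with the millisecond-scale timings quoted above.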

Also added missing notification types: FileAnnounced, Listening,
VoiceSignal.

https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
Audited every public method on ClientHandle, ClientMutations, accessors,
views, servers, voice, and joining modules against the spec. Fixes:

Architecture:
- State actor table now shows exact field names and types from code
- Derived view table shows actual source dependencies from compute fns
- ClientViewHandle fields listed at all 3 layers (terminal, L2, L1)
- Distinguish ClientHandle methods (multi-actor coordination) from
  ClientMutations (event-sourced operations)

Tools:
- Corrected description: tools wrap ClientHandle methods, which
  delegate to mutations or directly coordinate domain actors

Resources:
- Every resource now has explicit Backed By column showing exact
  StateRef<T> or accessor
- Added willow://channel/{name}/typing (from typing_in() accessor)
- Added typing_peers to willow://connection (from ConnectionView)
- Added display_name to willow://server/current
- Corrected willow://server/roles field name from role_id to id
  (matches RoleEntry struct)
- Noted which resources support reactive subscriptions vs polling

https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
4-phase plan covering crate skeleton, tool/resource implementations,
E2E test harness with 24 multi-peer test scenarios, token scoping,
and SSE transport. Heavy emphasis on in-process E2E testing via
AgentTestHarness + MemNetwork as the primary way to test complex
multi-peer behavior without a browser.

https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
Fixes found during audit:
- ClientEvent variant count: 25 → 27 (was missing JoinLinkResponse,
  JoinLinkDenied in count)
- token_scope comment: "Phase 3" → "Phase 4" (scopes are Phase 4)
- Remove "add crates/agent to workspace" from modified files list
  (workspace uses crates/* glob, no edit needed)
- Fix Phase 1d contradiction: tool list was described as "empty" but
  1e defines schema stubs in same phase
- Fix Phase 2h test sequencing: test_client() is pub(crate) in
  willow-client, so agent crate creates its own local helper for
  Phase 2 (Phase 3d introduces proper test-utils feature)
- Fix check-wasm: no update needed (already lists crates explicitly)
- Add EndpointId parsing for authorize_workers and generate_invite
  tool call descriptions

https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
"10+ more E2E tests" was inaccurate — Phase 4 has 6 E2E tests
(16-21) plus 3 scope unit tests (22-24). Updated to match actual
inventory.

https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
Add the willow-agent crate: an MCP server binary that exposes Willow's
ClientHandle to AI agents via tools, resources, and notifications.

- 37 MCP tools covering messaging, channels, permissions, server mgmt,
  identity, invites, voice, and state verification
- 15 MCP resources for reading identity, connection, server state,
  channels, members, roles, messages, voice status, etc.
- 27 ClientEvent → JSON notification serializers
- CLI with clap (relay, identity, transport, token management)
- Bearer token auth with 256-bit random hex tokens
- lib.rs + main.rs split for integration test access
- 18 unit tests + 15 E2E integration tests (all passing)
- Workspace clippy clean, all workspace tests passing

https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
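
For the bearer-token bullet, a minimal sketch of the check a server might run on an `Authorization` header value. A 256-bit token rendered as hex is 64 hex digits, which is the only shape accepted here. All three function names are hypothetical, and the comparison is a simple non-short-circuiting loop rather than a vetted constant-time primitive.

```rust
/// Extract the token from an `Authorization: Bearer <token>` header value.
/// Accepts only 64 hex digits (256 bits rendered as hex).
fn bearer_token(header_value: &str) -> Option<&str> {
    let token = header_value.strip_prefix("Bearer ")?;
    if token.len() == 64 && token.bytes().all(|b| b.is_ascii_hexdigit()) {
        Some(token)
    } else {
        None
    }
}

/// Compare two byte strings without stopping at the first mismatch.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    // OR together the XOR of every byte pair; zero means all bytes matched.
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}

/// True if the header carries a well-formed token equal to `expected`.
fn is_authorized(header_value: &str, expected: &str) -> bool {
    match bearer_token(header_value) {
        Some(token) => constant_time_eq(token.as_bytes(), expected.as_bytes()),
        None => false,
    }
}
```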
- Add TokenScope enum (Full, ReadOnly, Messaging, Admin, Custom) with
  tool/resource filtering in scopes.rs
- Wire scope enforcement into WillowMcpServer: list_tools and call_tool
  filter by scope, list_resources filters by scope
- Add WillowMcpServer::with_scope() constructor
- Add Streamable HTTP transport via rmcp's transport-streamable-http-server
  feature (serve_http function with axum router)
- Support --transport http in CLI with bearer token auth
- Add 9 new E2E tests: kick_member, server_rename, display_name_updates,
  voice_join_leave, send_reply, create_and_delete_channel, plus 3 scope
  enforcement tests (readonly, messaging, custom allowlist)
- Total: 24 E2E tests + 24 unit tests, all passing
- Workspace clippy clean, all workspace tests passing

https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
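
A sketch of how scope-based tool filtering could work. The five variant names come from the commit message above; everything else is an assumption: the payload of `Custom`, the grouping of tools into messaging and admin sets, the `create_channel`/`delete_channel` tool names, and the choice that `ReadOnly` exposes no tools at all.

```rust
/// Variant names follow the commit message; tool groupings are assumptions.
#[derive(Debug, Clone, PartialEq)]
enum TokenScope {
    Full,
    ReadOnly,
    Messaging,
    Admin,
    /// Explicit allowlist of tool names (hypothetical payload shape).
    Custom(Vec<String>),
}

impl TokenScope {
    /// Whether a token with this scope may see and call `tool`.
    fn allows_tool(&self, tool: &str) -> bool {
        // Hypothetical groupings, for illustration only.
        const MESSAGING: &[&str] = &["send_message", "send_reply"];
        const ADMIN: &[&str] = &["kick_member", "create_channel", "delete_channel"];
        match self {
            TokenScope::Full => true,
            TokenScope::ReadOnly => false,
            TokenScope::Messaging => MESSAGING.contains(&tool),
            TokenScope::Admin => MESSAGING.contains(&tool) || ADMIN.contains(&tool),
            TokenScope::Custom(allow) => allow.iter().any(|t| t == tool),
        }
    }

    /// Filter a tool listing down to what this scope may see.
    fn filter_tools<'a>(&self, all: &[&'a str]) -> Vec<&'a str> {
        all.iter().copied().filter(|t| self.allows_tool(t)).collect()
    }
}
```

Applying the same predicate in both the listing path and the call path means a hidden tool cannot be invoked by guessing its name.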
Resource fixes to match spec:
- willow://connection: add typing_peers field with peer_id + channel
- willow://server/current: add description and display_name fields
- willow://channel/{name}/messages: add reactions field (HashMap)
- willow://server/join-links: rename link_id→id, used→uses per spec

Client accessor additions:
- Add server_description() accessor to ClientHandle
- Add typing_peers() accessor returning Vec<(peer_id, channel)>

Other fixes:
- Listening notification: field renamed topic→address to match spec
- CLI transport help text now mentions http option
- justfile test-all now includes test-agent-e2e

Doc alignment:
- Spec: update transports (SSE+HTTP → Streamable HTTP), crate structure,
  dependencies, notification field names
- Plan: update file list, transport section, E2E test inventory to
  match actual 24 implemented tests

https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
…r tests

- Notification bridge: subscribe to Broker<ClientEvent> and forward events
  as custom MCP notifications (willow/event) via rmcp Peer handle. Wired
  for both stdio (from RunningService) and HTTP (from RequestContext).
- Bearer token auth: axum middleware validates Authorization header on all
  HTTP transport requests. Token passed from CLI to serve_http().
- VoiceSignal notification: include signal payload (Offer/Answer/IceCandidate)
  instead of dropping it with `..`.
- Multi-peer test infrastructure: add test_client_on_hub() to willow-client
  for creating connected clients on a shared MemHub.
- 5 new E2E tests (29 total): notification serialization, JSON roundtrip,
  hub-connected client, two-client different IDs, separate server state.
- 2 new unit tests: voice_signal_includes_payload, token_file_written_and_readable.
- Document resource subscriptions as deferred (rmcp 1.3 limitation).

https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
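
The VoiceSignal fix is a matter of binding the payload in the match arm instead of discarding it with `..`. The sketch below shows the shape of that change on hypothetical types. The variant names follow the commit message, but the `String` payloads, the `VoiceSignalEvent` struct, and the JSON field names are assumptions.

```rust
/// Hypothetical signal payloads; variant names follow the commit message.
enum VoiceSignal {
    Offer(String),
    Answer(String),
    IceCandidate(String),
}

/// Hypothetical event carrying a signal from a peer.
struct VoiceSignalEvent {
    from: String,
    signal: VoiceSignal,
}

/// Quote a string for JSON. Handles only backslash and double quote;
/// a real serializer such as serde_json would cover the remaining cases.
fn quote(s: &str) -> String {
    format!("\"{}\"", s.replace('\\', "\\\\").replace('"', "\\\""))
}

/// Serialize the event, binding the payload in each arm rather than
/// dropping it with a `..` pattern.
fn voice_signal_json(ev: &VoiceSignalEvent) -> String {
    let (kind, payload) = match &ev.signal {
        VoiceSignal::Offer(sdp) => ("offer", sdp),
        VoiceSignal::Answer(sdp) => ("answer", sdp),
        VoiceSignal::IceCandidate(candidate) => ("ice_candidate", candidate),
    };
    format!(
        "{{\"type\":\"voice_signal\",\"from\":{},\"kind\":{},\"payload\":{}}}",
        quote(&ev.from),
        quote(kind),
        quote(payload)
    )
}
```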
willow-agent is a native-only binary (tokio, axum, rmcp) that cannot
compile for wasm32-unknown-unknown. Add --exclude willow-agent to the
WASM check step, matching the existing exclusions for relay/worker/etc.

https://claude.ai/code/session_01DkyZbVvWdew23LjUMdScyr
intendednull merged commit cec0e54 into main on Apr 1, 2026
4 checks passed
intendednull deleted the claude/agentic-peer-api-access-Z8Xie branch on April 1, 2026 15:49
intendednull added a commit that referenced this pull request Apr 26, 2026
lifecycle, fix IrohBlobStore spec drift, track 4 new follow-ups

Round 2 review (two fresh agents) verified all 15 round-1 fixes
land cleanly with no regressions, then surfaced 8 new findings
(0 critical, 3 medium, 5 low) by widening scope to cross-component
interactions, perf, and API surface.

Fixed inline (trivial doc / spec):

- Add an "Actor coordination signal" row to the spec decision tree
  + CLAUDE.md table covering tokio::sync::watch / oneshot /
  broadcast / Notify, with the explicit rule that
  tokio::sync::Mutex is forbidden for business state on the same
  terms as std/parking_lot Mutex. Closes the spec gap that left
  contributors without guidance on async channels. (round-2 #3)
- Reconcile spec § 184 with the corrected IrohBlobStore comment
  (round-1 fixed the code, missed the spec). The blob store is
  not an iroh-callback boundary — it's an interim stub. The relay-
  status timestamp Mutex stays in the iroh boundary list. (round-2 #4)
- Document the web `_event_loop` drop pattern in `crates/web/src/app.rs`
  so future readers see explicitly that the actor System is process-
  scoped on web (page reload tears everything down) and that any
  actor needing pre-close cleanup must route via `beforeunload`,
  not Drop. (round-2 #8)

Tracked as new follow-ups in spec § Follow-up work:

- F5. SearchActor head-of-line + rebuild-storm fix. Rebuild blocks
  Query in FIFO order; the rebuild Effect has no debounce. Fix is
  chunked-Rebuild + Debounce<Rebuild> wrap. (round-2 #1, #2 — Med)
- F6. Browser-tier coverage for SearchIndexHandle consumers. The
  spawn_local + Effect path has no wasm-pack test. (round-2 #6)
- F7. Sealed ClientSpawner to narrow the system() API surface,
  rather than exposing the full SystemHandle. (round-2 #7)
- F8. Search-query debouncing-flicker fix via generation tag or
  Leptos Resource migration. (round-2 #5)
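
To illustrate the debounce half of F5, here is a minimal sketch of the coalescing logic: a burst of rebuild requests results in one rebuild, fired only after a quiet window with no new requests. This is not the proposed `Debounce<Rebuild>` wrapper; the `Debounce` struct, its methods, and the explicit millisecond clock are assumptions chosen to keep the logic deterministic.

```rust
/// Coalesces bursts of requests: fires once no new request has arrived
/// for `quiet_ms`. Time is passed in explicitly so behavior is testable.
struct Debounce {
    quiet_ms: u64,
    last_request_ms: Option<u64>,
}

impl Debounce {
    fn new(quiet_ms: u64) -> Self {
        Self { quiet_ms, last_request_ms: None }
    }

    /// Record a request at `now_ms`, restarting the quiet window.
    fn request(&mut self, now_ms: u64) {
        self.last_request_ms = Some(now_ms);
    }

    /// Returns true once per burst, when the quiet window has elapsed.
    fn poll(&mut self, now_ms: u64) -> bool {
        match self.last_request_ms {
            Some(t) if now_ms.saturating_sub(t) >= self.quiet_ms => {
                // Clear the pending request so the burst fires only once.
                self.last_request_ms = None;
                true
            }
            _ => false,
        }
    }
}
```

With a 100 ms window, requests at 0, 50, and 90 ms produce a single rebuild at or after 190 ms instead of three back-to-back rebuilds.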

Each follow-up has a "Trigger:" line naming the dedicated PR title.

`just check` green: clippy zero warnings, 1003+ tests pass, WASM
compile clean. Loop terminates here per the user's two-round cap;
no Critical issues remain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>