Spec
docs/design/llm-agent-ux-spec.md
Summary
Make LLM-powered agents first-class participants in Willow servers. Agents join channels, read messages, and reply like any other member, but their identity (PeerKind), capabilities, presence, and UI treatment are distinct (bot badge, "Agent" tag, streaming responses, thinking indicators, data-policy disclosure). Implementation introduces a new crates/bot/ worker (separate from the existing willow-agent MCP crate), additive Profile/ProfileDelta fields, ephemeral WireMessage variants for streaming/thinking, an admin-only SetAgentConfig event, and — as a prerequisite — a versioned-envelope forward-compat layer so future EventKind/WireMessage variants don't force flag-day upgrades.
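As a rough illustration of the admin-only SetAgentConfig event mentioned above, the gate could look like the sketch below. Everything here is a simplified stand-in, not the Willow implementation: `PeerId` replaces `EndpointId`, `ServerState` and its fields are hypothetical, the event carries only two of its fields, and `ApplyResult::Applied` is an assumed counterpart to the `ApplyResult::Rejected` named in this document.

```rust
use std::collections::{HashMap, HashSet};

/// Stand-in for EndpointId (assumption: any hashable peer identifier works here).
pub type PeerId = u64;

/// Hypothetical, minimal server state: who is an admin, and per-agent prompts.
#[derive(Default)]
pub struct ServerState {
    pub admins: HashSet<PeerId>,
    pub agent_prompts: HashMap<PeerId, String>,
}

impl ServerState {
    pub fn is_admin(&self, peer: PeerId) -> bool {
        self.admins.contains(&peer)
    }
}

/// Only the variant relevant to this sketch, with a reduced field set.
pub enum EventKind {
    SetAgentConfig { peer_id: PeerId, system_prompt: String },
}

#[derive(Debug, PartialEq)]
pub enum ApplyResult {
    Applied, // assumed name for the success case
    Rejected,
}

/// Applies an event on behalf of `author`; non-admin authors are rejected
/// before any state is touched.
pub fn apply_event(state: &mut ServerState, author: PeerId, kind: EventKind) -> ApplyResult {
    match kind {
        EventKind::SetAgentConfig { peer_id, system_prompt } => {
            if !state.is_admin(author) {
                return ApplyResult::Rejected;
            }
            state.agent_prompts.insert(peer_id, system_prompt);
            ApplyResult::Applied
        }
    }
}
```

The check runs before the state mutation, so a rejected event leaves the agent's stored configuration untouched.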
Build phases
Phases follow §12 of the spec.
Acceptance criteria
- Old peers running pre-Phase-0 builds continue to function (with reduced capability) when a new peer broadcasts unknown EventKind / WireMessage variants: no panics, no deserialization failures.
- Profile / ProfileDelta round-trip tests cover the new peer_kind and data_policy fields and confirm absent fields decode as None (#[serde(default)] semantics).
- Member list renders three ordered sections: Infrastructure, Agents, Members. Agents show bot badge, provider tag, and Online/Thinking/Offline indicator.
- Agent messages render with bot badge and "Agent" tag; messages over 20 lines (any author) auto-collapse with a "Show more" toggle.
- willow-bot binary in crates/bot/ connects via the standard worker runtime, subscribes to SERVER_OPS_TOPIC, WORKERS_TOPIC, and per-channel channel_topic(server_id, channel_id) topics, and announces WorkerRoleInfo::Bot { .. } in heartbeats.
- @mention triggers an inference call; the agent responds via WireMessage::StreamStart/Chunk/End plus a final EventKind::Message carrying the full body, with reply_to set to the invoking event's EventHash.
- WireMessage::StreamCancel from a user causes the agent to emit StreamEnd plus a final EventKind::Message containing the partial body generated so far.
- Stream chunks do not appear in the event store; replay materializes only the final EventKind::Message.
- Thinking / StoppedThinking indicators render in the message-input bar and member list, and auto-clear after 30s without a StoppedThinking.
- EventKind::SetAgentConfig is rejected (ApplyResult::Rejected) when author is not state.is_admin(author); accepted when admin.
- Granting ManageRoles to an agent and proposing ProposedAction::GrantAdmin { peer_id: <agent> } both surface high-friction confirmation dialogs explaining the multi-vote nature on multi-admin servers.
- One-time CloudProvider confirmation dialog fires on first @mention of a cloud-backed agent and is remembered per agent thereafter.
- Rate-limit hits surface an "Agent rate limited" system message; excess messages are otherwise dropped silently.
- All new code passes just check (fmt + clippy + test + WASM) with zero warnings; new behaviour is covered at the lowest viable test tier per CLAUDE.md (state > client > browser > Playwright).
- crates/agent/ (the MCP/JSON-RPC bridge) is unchanged in scope; the new in-protocol bot lives in crates/bot/ to avoid scope collision.
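The streaming criteria above (chunks stay ephemeral, only a final message is persisted, and a cancel still yields the partial body) can be sketched as a small agent-side assembler. This is a minimal sketch under stated assumptions: the `Wire` enum is a hypothetical stand-in for the WireMessage stream variants, the `stream_id` field and the `FinalMessage` type are invented for illustration, and the real payloads may differ.

```rust
use std::collections::HashMap;

/// Hypothetical stand-ins for the ephemeral stream variants; `stream_id` is assumed.
#[derive(Debug, Clone)]
pub enum Wire {
    StreamStart { stream_id: u64 },
    StreamChunk { stream_id: u64, text: String },
    StreamEnd { stream_id: u64 },
    StreamCancel { stream_id: u64 },
}

/// What would be persisted as the single final message for a stream.
#[derive(Debug, PartialEq)]
pub struct FinalMessage {
    pub body: String,
    /// True when the stream was cancelled and `body` is only what was generated so far.
    pub partial: bool,
}

/// Buffers chunks in memory; nothing is persisted until a stream finishes.
#[derive(Default)]
pub struct StreamAssembler {
    in_flight: HashMap<u64, String>,
}

impl StreamAssembler {
    /// Returns Some(final message) only when a known stream ends or is cancelled.
    pub fn handle(&mut self, msg: Wire) -> Option<FinalMessage> {
        match msg {
            Wire::StreamStart { stream_id } => {
                self.in_flight.insert(stream_id, String::new());
                None
            }
            Wire::StreamChunk { stream_id, text } => {
                // Chunks for unknown or already-finished streams are ignored.
                if let Some(buf) = self.in_flight.get_mut(&stream_id) {
                    buf.push_str(&text);
                }
                None
            }
            Wire::StreamEnd { stream_id } => self
                .in_flight
                .remove(&stream_id)
                .map(|body| FinalMessage { body, partial: false }),
            Wire::StreamCancel { stream_id } => self
                .in_flight
                .remove(&stream_id)
                .map(|body| FinalMessage { body, partial: true }),
        }
    }
}
```

Because only the returned `FinalMessage` would ever be written to the event store, replay naturally sees one message per stream and no chunks.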
Out of scope
- File / attachment sharing by agents. The spec notes file attachment is not yet a state event; it would require a new EventKind::AttachFile and an associated permission. Not part of this work.
- Granting agents the ability to delete other users' messages. The spec notes this would require a new ModerateMessages permission and gating DeleteMessage by author-or-moderator; explicitly out of scope for shipping Regenerate.
- Adding ReadMessages, ManageMessages, or BanMembers permissions. None exist today; read access remains implicit at the gossip layer.
- Hard-deleting messages (the DeleteMessage tombstone is soft: replay still sees the original event; this stays as-is).
- Persisting stream chunks in the event store. Stream chunks remain ephemeral / wire-only by design.
- Reusing crates/agent/ for the in-protocol LLM bot. The MCP bridge and the in-protocol bot are deliberately separated into two crates.
Open questions
- E2E encryption path for agent messages (§3 / §8). EventKind::Message.body is plain String today and travels in the state DAG; encrypted human messages travel through the parallel Content/SealedContent path in crates/messaging + crates/crypto. Pick before shipping Phase 4:
  - (a) Treat EventKind::Message.body as opaque ciphertext (e.g. base64 SealedContent) and decrypt in the rendering layer: uniform event-sourced path but breaking schema change.
  - (b) Route agent output through the Content-based message path and reserve EventKind::Message for plaintext-only channels: preserves existing encryption pipeline but bifurcates how agents and humans publish.
- Atomic Kick+Rotate shape (§4 / Phase 6). If we ship Phase 6, pick exactly one of ProposedAction::KickAndRotate (preserves governance, preferred) or a top-level EventKind::KickAndRotate (admin-only, faster, loses the multi-admin checkpoint). Shipping both would give two ways to express the same thing.
- Forward-compat strategy for EventKind / WireMessage (§11). Spec recommends Option B (versioned envelope { kind_tag: u32, payload: Vec<u8> }) over Option A (custom Deserialize with hand-managed tag table). Confirm Option B before starting Phase 0; Option A is documented for completeness but is brittle under variant reordering.
- Pin-management permission tightness (§4). PinMessage / UnpinMessage are unrestricted in required_permission() today. Decide whether agents pinning messages requires adding a new ManagePins permission, or remains implicit.
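To make the Option B discussion concrete, here is a minimal sketch of a versioned envelope whose unknown tags decode to a no-op instead of an error. The `Envelope` fields follow the `{ kind_tag: u32, payload: Vec<u8> }` shape quoted above; everything else (the tag numbers, the `Decoded` and `Outcome` types, the UTF-8 payload encoding, and the `apply` function) is hypothetical and only illustrates the idea.

```rust
/// Envelope shape as quoted in the forward-compat question.
#[derive(Debug, Clone, PartialEq)]
pub struct Envelope {
    pub kind_tag: u32,
    pub payload: Vec<u8>,
}

/// Illustrative decoded view; real variants and encodings will differ.
#[derive(Debug, PartialEq)]
pub enum Decoded {
    Message { body: String },
    Thinking,
    /// Any tag this build does not recognise. Never an error.
    Unknown { kind_tag: u32 },
}

// Assumed tag assignments, stable across builds and independent of enum order.
const TAG_MESSAGE: u32 = 1;
const TAG_THINKING: u32 = 2;

pub fn decode(env: &Envelope) -> Decoded {
    match env.kind_tag {
        TAG_MESSAGE => match String::from_utf8(env.payload.clone()) {
            Ok(body) => Decoded::Message { body },
            // A payload this build cannot parse is treated like an unknown kind.
            Err(_) => Decoded::Unknown { kind_tag: env.kind_tag },
        },
        TAG_THINKING => Decoded::Thinking,
        other => Decoded::Unknown { kind_tag: other },
    }
}

#[derive(Debug, PartialEq)]
pub enum Outcome {
    Applied,
    Ignored,
}

/// Applies an envelope to a toy message log; unknown kinds are a no-op.
pub fn apply(log: &mut Vec<String>, env: &Envelope) -> Outcome {
    match decode(env) {
        Decoded::Message { body } => {
            log.push(body);
            Outcome::Applied
        }
        // Ephemeral indicator: nothing to persist.
        Decoded::Thinking => Outcome::Ignored,
        // Old peers skip variants introduced by newer builds.
        Decoded::Unknown { .. } => Outcome::Ignored,
    }
}
```

Since the tag is an explicit integer rather than a derived enum discriminant, reordering or adding variants in a newer build does not change how older builds interpret existing tags, which is the brittleness attributed to Option A.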
Build phases (detail)
Per §12 of docs/design/llm-agent-ux-spec.md:
- […] kind_tag: u32, payload: Vec<u8>) for both EventKind and WireMessage, with Unknown no-op fallback in apply_event. Without this, every later phase is a breaking wire change.
- […] Profile and ProfileDelta with peer_kind: Option<PeerKind> and data_policy: Option<DataPolicy> (additive #[serde(default)] on Profile; Option<Option<_>> on ProfileDelta).
- […] apply_event(UpdateProfile) to overlay the new fields.
- […] PeerKind and DataPolicy through profile UI (member list, profile card).
- […] crates/web/src/components/message.rs (applies to all messages, not just agents).
- […] crates/bot/ crate shipping willow-bot binary (separate from crates/agent/ MCP crate).
- BotWorker implementing WorkerRole + InferenceBackend trait.
- WorkerRoleInfo::Bot { inference_in_flight, inference_capacity } variant; update WorkerRoleInfo::role_name() arm and round-trip tests.
- […] reply_to: Option<EventHash> chain).
- WireMessage::StreamStart / StreamChunk / StreamEnd / StreamCancel ephemeral variants.
- […] StreamEnd + final EventKind::Message with partial body).
- WireMessage::Thinking / StoppedThinking ephemeral indicators with 30s UI auto-clear timeout.
- EventKind::SetAgentConfig { peer_id: EndpointId, system_prompt, auto_respond_channels, rate_limit } (gated via state.is_admin(author) in the dedicated admin block of check_permission, NOT via required_permission()).
- […] CloudProvider agent.
- […] /trigger).
- […] DeleteMessage of agent's prior response + new EventKind::Message preserving reply_to).
- […] RotateChannelKey sequence proves error-prone, add either ProposedAction::KickAndRotate (preferred, preserves governance) or top-level EventKind::KickAndRotate (admin-only, faster). Pick exactly one shape.
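The additive Profile / ProfileDelta fields, with `Option<_>` on the profile and `Option<Option<_>>` on the delta, can be sketched as follows. The field names `peer_kind` and `data_policy` come from this document; the enum variants, the `display_name` field, and the `overlay` function are assumptions made for illustration, and serde attributes are omitted so the sketch stays dependency-free.

```rust
/// Assumed variants; the spec may define others.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PeerKind {
    Human,
    Agent,
}

/// Assumed variants; CloudProvider is the only name taken from this document.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DataPolicy {
    LocalOnly,
    CloudProvider,
}

/// Profile with the two new optional fields; None means "never set".
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Profile {
    pub display_name: String, // hypothetical pre-existing field
    pub peer_kind: Option<PeerKind>,
    pub data_policy: Option<DataPolicy>,
}

/// Delta semantics for the new fields:
///   None          -> leave the field untouched
///   Some(None)    -> clear the field
///   Some(Some(v)) -> set the field to v
#[derive(Debug, Clone, Default)]
pub struct ProfileDelta {
    pub display_name: Option<String>,
    pub peer_kind: Option<Option<PeerKind>>,
    pub data_policy: Option<Option<DataPolicy>>,
}

/// Overlays a delta onto a profile, touching only the fields the delta mentions.
pub fn overlay(profile: &mut Profile, delta: ProfileDelta) {
    if let Some(name) = delta.display_name {
        profile.display_name = name;
    }
    if let Some(kind) = delta.peer_kind {
        profile.peer_kind = kind;
    }
    if let Some(policy) = delta.data_policy {
        profile.data_policy = policy;
    }
}
```

The nested option is what lets a delta from an older peer, which carries neither field, leave an agent's `peer_kind` intact instead of silently resetting it.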