Add Iroh migration design specification #13
Merged
Comprehensive design for replacing libp2p with iroh as the networking layer. Covers identity mapping, transport changes, gossip protocol migration, blob-based file transfer, relay replacement, WASM support, and a 6-phase migration plan.
https://claude.ai/code/session_014rKQjnqPmhpDxY3jyhTR7o
- Use EndpointId natively instead of shimming into old PeerId API
- Scope to Leptos web UI only (Bevy app out of scope)
- Drop data migration phase: clean break, no backward compat
- Self-hosted relay by default with TLS via reverse proxy
- Resolve gossip max message size (64 KiB, with implications analysis)
- Resolve bootstrap cold start (infra concern, relay + workers)
- Elaborate blob GC strategy (MemStore for clients, size cap for workers)
https://claude.ai/code/session_014rKQjnqPmhpDxY3jyhTR7o
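The 64 KiB gossip limit mentioned above could be enforced with a small guard before broadcasting. This is a minimal sketch, not the spec's code: the names `MAX_GOSSIP_MESSAGE_BYTES`, `Route`, and `route_for` are hypothetical, and treating the limit as inclusive is an assumption.

```rust
/// Cap taken from the commit message: 64 KiB per gossip message.
/// Treating the boundary as inclusive is an assumption of this sketch.
pub const MAX_GOSSIP_MESSAGE_BYTES: usize = 64 * 1024;

/// Hypothetical routing decision for an outgoing payload.
#[derive(Debug, PartialEq, Eq)]
pub enum Route {
    /// Small enough to broadcast inline on the gossip topic.
    Gossip,
    /// Too large for gossip; the caller needs another path
    /// (for example, sending a blob reference instead).
    TooLarge { len: usize },
}

/// Decide how a payload should be sent based only on its size.
pub fn route_for(payload: &[u8]) -> Route {
    if payload.len() <= MAX_GOSSIP_MESSAGE_BYTES {
        Route::Gossip
    } else {
        Route::TooLarge { len: payload.len() }
    }
}
```

A caller would check `route_for(&bytes)` before handing the payload to the gossip sender and fall back to blob transfer for oversized content.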
Restructure the entire networking stack around iroh's native model:
- Network crate exposes iroh handles directly (no wrapper abstractions)
- Client holds GossipSender/GossipReceiver directly (no command enums)
- Workers stream from GossipReceiver (no NetworkEvent polling)
- Drop NetworkCommand/NetworkEvent/bridge indirection entirely
- Consolidate migration into 4 phases instead of 6
https://claude.ai/code/session_014rKQjnqPmhpDxY3jyhTR7o
- Network trait uses iroh types (TopicId, EndpointId, Hash, Bytes) but is swappable: IrohNetwork for production, MemNetwork for tests
- TopicHandle/TopicEvents traits mirror iroh gossip API surface
- BlobStore trait mirrors iroh-blobs operations
- Client and workers are generic over Network: testable without real QUIC connections or tokio runtime
- MemHub provides in-process gossip mesh for test assertions
- Concrete test example showing two clients exchanging messages via MemNetwork without any networking
https://claude.ai/code/session_014rKQjnqPmhpDxY3jyhTR7o
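The swappable-network idea above can be illustrated with a heavily simplified, std-only sketch. It is not the spec's trait: the real design is async and uses iroh types, whereas here `TopicId` is a plain 32-byte array, the trait is synchronous, and every method name and field is an assumption made for illustration.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};

/// Stand-in for iroh's 32-byte topic identifier (assumption).
pub type TopicId = [u8; 32];

/// Minimal synchronous stand-in for the spec's Network trait.
pub trait Network {
    /// Start receiving messages published on `topic`.
    fn subscribe(&self, topic: TopicId) -> Receiver<Vec<u8>>;
    /// Deliver `msg` to every other subscriber of `topic`.
    fn broadcast(&self, topic: TopicId, msg: Vec<u8>);
}

/// In-process hub: maps each topic to (subscriber id, channel sender) pairs.
#[derive(Clone, Default)]
pub struct MemHub {
    topics: Arc<Mutex<HashMap<TopicId, Vec<(usize, Sender<Vec<u8>>)>>>>,
    next_id: Arc<Mutex<usize>>,
}

impl MemHub {
    /// Create a new network endpoint attached to this hub.
    pub fn network(&self) -> MemNetwork {
        let mut next = self.next_id.lock().unwrap();
        *next += 1;
        MemNetwork { id: *next, hub: self.clone() }
    }
}

/// Test double: all traffic stays inside the shared MemHub.
pub struct MemNetwork {
    id: usize,
    hub: MemHub,
}

impl Network for MemNetwork {
    fn subscribe(&self, topic: TopicId) -> Receiver<Vec<u8>> {
        let (tx, rx) = channel();
        let mut topics = self.hub.topics.lock().unwrap();
        topics.entry(topic).or_default().push((self.id, tx));
        rx
    }

    fn broadcast(&self, topic: TopicId, msg: Vec<u8>) {
        let topics = self.hub.topics.lock().unwrap();
        if let Some(subs) = topics.get(&topic) {
            for (id, tx) in subs {
                // Skip the sender's own subscriptions; ignore closed receivers.
                if *id != self.id {
                    let _ = tx.send(msg.clone());
                }
            }
        }
    }
}
```

In a test, two `MemNetwork` values created from the same `MemHub` subscribe to one topic; a broadcast from the first shows up on the second's receiver without any sockets being opened.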
Covers all 7 test tiers: state machine (unchanged), client API (ported to MemNetwork), browser/Leptos (minimal changes), network integration (rewritten against IrohNetwork), scaling (ported), workers (MemNetwork), and E2E state convergence (unchanged). Details MemHub design for deterministic in-process gossip testing, test migration checklist with counts, and per-phase validation gates.
https://claude.ai/code/session_014rKQjnqPmhpDxY3jyhTR7o
- willow-crypto: X25519 key derivation from iroh SecretKey
- willow-channel/messaging: String → EndpointId for peer fields
- willow-common: wire signature format stays our own envelope
- EndpointId serialization: 32 bytes binary, hex string display
- Voice/WebRTC signaling: maps to iroh-gossip topics directly
- Reconnection: iroh handles relay reconnect, client re-subscribes topics via ConnectionEvent stream on Network trait
- just dev flow: relay binary changes
- Playwright E2E tests: added to migration checklist (tier 8)
https://claude.ai/code/session_014rKQjnqPmhpDxY3jyhTR7o
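The "32 bytes binary, hex string display" rule above can be sketched with a newtype. `PeerKey` is a hypothetical stand-in for iroh's `EndpointId`, used so the example compiles without the iroh crate; the real type's constructors and parsing behaviour may differ.

```rust
use std::fmt;

/// Hypothetical stand-in for EndpointId: 32 raw bytes on the wire.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct PeerKey([u8; 32]);

impl PeerKey {
    /// Wrap the 32-byte binary form.
    pub fn from_bytes(bytes: [u8; 32]) -> Self {
        PeerKey(bytes)
    }

    /// Borrow the 32-byte binary form used for serialization.
    pub fn as_bytes(&self) -> &[u8; 32] {
        &self.0
    }

    /// Parse the 64-character hex display form; returns None on bad input.
    pub fn from_hex(s: &str) -> Option<Self> {
        if s.len() != 64 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        let mut out = [0u8; 32];
        for (i, byte) in out.iter_mut().enumerate() {
            *byte = u8::from_str_radix(&s[2 * i..2 * i + 2], 16).ok()?;
        }
        Some(PeerKey(out))
    }
}

/// Display as lowercase hex, two characters per byte.
impl fmt::Display for PeerKey {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for b in &self.0 {
            write!(f, "{:02x}", b)?;
        }
        Ok(())
    }
}
```

Keeping the binary form for storage and wire formats and the hex form for logs and UI avoids the old `String` peer fields while staying human-readable.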
- Remove duplicate test migration checklist
- Fix Rc<RefCell> → Arc<RwLock> for Send+Sync client
- Add unsubscribe() to Network trait
- Clarify willow-files is deleted (replaced by iroh-blobs)
- Note Phase 1 parallelism (state + network are independent)
https://claude.ai/code/session_014rKQjnqPmhpDxY3jyhTR7o
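The `Rc<RefCell>` → `Arc<RwLock>` fix above matters because `Rc<RefCell<T>>` is neither `Send` nor `Sync`, so a handle built on it cannot cross thread or task boundaries. The sketch below shows the shape of a thread-safe handle. Only the `ClientHandle` name comes from the plan; `ClientState`, its field, and the helper functions are assumptions.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical shared client state.
#[derive(Default)]
pub struct ClientState {
    pub messages: Vec<String>,
}

/// Cloneable handle; Arc<RwLock<_>> makes it Send + Sync.
#[derive(Clone, Default)]
pub struct ClientHandle {
    state: Arc<RwLock<ClientState>>,
}

impl ClientHandle {
    /// Append a message under the write lock.
    pub fn push(&self, msg: &str) {
        self.state.write().unwrap().messages.push(msg.to_string());
    }

    /// Count stored messages under the read lock.
    pub fn count(&self) -> usize {
        self.state.read().unwrap().messages.len()
    }
}

/// Compiles only if T is Send + Sync; an Rc<RefCell<_>> field would fail here.
fn assert_send_sync<T: Send + Sync>() {}

/// Move a clone of the handle into another thread and mutate through it.
pub fn push_from_thread(handle: &ClientHandle, msg: &'static str) {
    assert_send_sync::<ClientHandle>();
    let h = handle.clone();
    thread::spawn(move || h.push(msg)).join().unwrap();
}
```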
- Relay/bootstrap contradiction: relay is pure packet forwarding, bootstrap node is a separate lightweight gossip participant deployed alongside it. Relay wrapper binary runs both.
- Wire format non-goal: clarify inner WireMessage enum unchanged, outer signed envelope naturally changes due to EndpointId
- Phase 3: fix to reference TopicHandle/TopicEvents traits, not raw iroh types (matches the trait abstraction decision)
- Add connection_events() to canonical Network trait definition, remove duplicate definition from Reconnection section
- Fix "no tokio runtime" claim: MemNetwork needs #[tokio::test] for async trait methods, but all I/O is in-process channels
https://claude.ai/code/session_014rKQjnqPmhpDxY3jyhTR7o
- Remove orphaned "can either:" fragment from relay section
- Fix MemNetwork doc comment: needs tokio, not "no async runtime"
- Fix "relay's EndpointId" → "bootstrap node's EndpointId"
- Phase 1: add willow-channel, willow-messaging, willow-crypto
- Phase 4: add willow-files deletion
https://claude.ai/code/session_014rKQjnqPmhpDxY3jyhTR7o
- WASM blob store: MemBlobStore stub with step-by-step TODO for IndexedDB-backed IdbBlobStore implementation
- BlobStore trait: add remove() and store_size() methods from day one
- Blob GC: detailed implementation plan for BlobGc struct, GC loop, FsStore integration, CLI flags, and test cases
https://claude.ai/code/session_014rKQjnqPmhpDxY3jyhTR7o
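A size-capped GC pass over a store exposing `remove()` and `store_size()` might look like the sketch below. It is an illustration under stated assumptions, not the planned implementation: the trait is synchronous, `add()` and `oldest_first()` are hypothetical additions, oldest-first eviction is an assumed policy, and `Hash` is a plain byte array rather than the iroh-blobs type.

```rust
/// Stand-in for a 32-byte content hash (assumption).
pub type Hash = [u8; 32];

/// Simplified store interface; only remove() and store_size()
/// are named in the plan, the other methods are hypothetical.
pub trait BlobStore {
    fn add(&mut self, hash: Hash, data: Vec<u8>);
    fn remove(&mut self, hash: &Hash) -> bool;
    fn store_size(&self) -> u64;
    /// Hashes in insertion order, oldest first (assumed eviction order).
    fn oldest_first(&self) -> Vec<Hash>;
}

/// In-memory store that keeps blobs in insertion order.
#[derive(Default)]
pub struct MemBlobStore {
    blobs: Vec<(Hash, Vec<u8>)>,
}

impl BlobStore for MemBlobStore {
    fn add(&mut self, hash: Hash, data: Vec<u8>) {
        // Content-addressed: adding an existing hash is a no-op.
        if !self.blobs.iter().any(|(h, _)| *h == hash) {
            self.blobs.push((hash, data));
        }
    }

    fn remove(&mut self, hash: &Hash) -> bool {
        let before = self.blobs.len();
        self.blobs.retain(|(h, _)| h != hash);
        self.blobs.len() != before
    }

    fn store_size(&self) -> u64 {
        self.blobs.iter().map(|(_, d)| d.len() as u64).sum()
    }

    fn oldest_first(&self) -> Vec<Hash> {
        self.blobs.iter().map(|(h, _)| *h).collect()
    }
}

/// Evicts the oldest blobs until the store fits under `max_bytes`.
pub struct BlobGc {
    pub max_bytes: u64,
}

impl BlobGc {
    /// Run one GC pass; returns how many blobs were removed.
    pub fn run_once<S: BlobStore>(&self, store: &mut S) -> usize {
        let mut removed = 0;
        for hash in store.oldest_first() {
            if store.store_size() <= self.max_bytes {
                break;
            }
            if store.remove(&hash) {
                removed += 1;
            }
        }
        removed
    }
}
```

A worker would call `run_once` periodically from its GC loop, with `max_bytes` supplied by a CLI flag.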
Detailed step-by-step plan covering:
- 1.1: willow-identity rewrite (iroh SecretKey/EndpointId)
- 1.2: Network traits (TopicHandle, TopicEvents, BlobStore, Network)
- 1.3: MemNetwork test double with MemHub
- 1.4: IrohNetwork implementation with integration tests
- 1.5: Delete old libp2p network code
- 1.6: willow-state String → EndpointId (63 tests)
- 1.7: Supporting crates (channel, messaging, crypto, transport, common)
- 1.8: Validation gate

Phases 2-4 scoped but deferred until Phase 1 complete.
https://claude.ai/code/session_014rKQjnqPmhpDxY3jyhTR7o
Phase 2 (Client + Web UI): 8 steps
- Restructure ClientHandle as generic over Network
- Topic listener system (replaces NetworkCommand/Event enums)
- File sharing via BlobStore trait
- Delete old network module
- Port 93 client tests to MemNetwork
- Wire Leptos web UI with IrohNetwork
- Update 39 browser tests

Phase 3 (Relay + Workers): 8 steps
- Relay rewrite (iroh-relay + bootstrap gossip node)
- Worker runtime generic over Network
- Worker actor rewrites (TopicHandle/TopicEvents)
- Replay and storage binary updates
- Port worker tests to MemNetwork
- Port scaling tests to IrohNetwork
- Update just dev flow

Phase 4 (Cleanup): 8 steps
- Remove libp2p deps, delete willow-files
- Remove WASM transport branching
- Update E2E and Playwright tests
- Update Docker deployment and CLAUDE.md
https://claude.ai/code/session_014rKQjnqPmhpDxY3jyhTR7o
Client switches to IrohNetwork in Phase 2, but relay is still libp2p until Phase 3, so the two use incompatible transports. Phase 2 validates via MemNetwork tests and WASM compile checks only. First real end-to-end smoke test waits for Phase 3 when relay is also on iroh.
https://claude.ai/code/session_014rKQjnqPmhpDxY3jyhTR7o
- Move e2e_flow.rs update from Phase 4 to Phase 1.6: these tests use ServerState directly and break as soon as String → EndpointId changes land
- Add warning to Phase 1 gate: do NOT run just check or cargo check --workspace, downstream crates won't compile until Phases 2-3
- Renumber Phase 4 steps after removing duplicate
https://claude.ai/code/session_014rKQjnqPmhpDxY3jyhTR7o
- ops.rs: call out JoinToken/JoinLink peer ID field changes
- invite.rs: invite creation/parsing needs EndpointId
- storage.rs: serialized event format changes, add version check to wipe old data on format mismatch (clean break)
https://claude.ai/code/session_014rKQjnqPmhpDxY3jyhTR7o
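The storage.rs version check described above could reduce to a small decision function. This is a sketch under assumptions: the constant's value, the `OpenAction` enum, and treating a missing version marker as an old format are all hypothetical choices, not taken from the spec.

```rust
/// Hypothetical on-disk format version; bump whenever the
/// serialized event encoding changes.
pub const STORAGE_FORMAT_VERSION: u32 = 2;

/// What to do with existing data when opening the store.
#[derive(Debug, PartialEq, Eq)]
pub enum OpenAction {
    /// Stored data matches the current format and can be read.
    Keep,
    /// Format mismatch: discard old data (clean break, no migration).
    Wipe,
}

/// Compare the stored version marker against the current one.
/// A missing marker is assumed to mean pre-versioned (old) data.
pub fn action_for(stored_version: Option<u32>) -> OpenAction {
    match stored_version {
        Some(v) if v == STORAGE_FORMAT_VERSION => OpenAction::Keep,
        _ => OpenAction::Wipe,
    }
}
```

After a wipe, the store would write the current `STORAGE_FORMAT_VERSION` marker so the next open keeps its data.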
Summary
This PR adds a comprehensive design specification for migrating Willow's networking layer from libp2p to iroh. The document outlines the rationale, architecture mapping, implementation strategy, and phased migration plan.
Key Changes
- Design specification (`docs/superpowers/specs/2026-03-29-iroh-migration-design.md`) covering:
  - … (`willow-identity`, `willow-network`, `willow-client`, `willow-app`, `willow-worker`, `willow-relay`)
  - … `TopicId` values

Notable Details
- `Endpoint` replaces separate TCP/WebSocket stacks, eliminating native/WASM code branching
- `Router` replaces complex `NetworkBehaviour` composition with 6 sub-behaviours

This is a design document only; no code changes are included in this PR.
https://claude.ai/code/session_014rKQjnqPmhpDxY3jyhTR7o