diff --git a/.claude/skills/resolving-issues/SKILL.md b/.claude/skills/resolving-issues/SKILL.md index c12c1537..17e47aec 100644 --- a/.claude/skills/resolving-issues/SKILL.md +++ b/.claude/skills/resolving-issues/SKILL.md @@ -62,6 +62,8 @@ Why this shape: **When same-day audit-to-fix gap, expect ~zero already-fixed hits.** The sweep yields most when the audit ran days/weeks before the fix run (intervening PRs land between audit and fix). When the audit was filed earlier the same day against the current `main`, no fixes can have landed between audit and fix run by definition — the sweep correctly returns empty. Don't over-invest sweep effort in this case; one quick `git log ..origin/main` check on each pick's path is enough to confirm. + **Coordinator pre-flight reads (cheap, high-value).** When an issue cites multiple files or hints at sibling rot ("Pairs with FX/FY", "or the migration's port surface drifted", "compounds Z"), spend 1–2 `mcp__github__get_file_contents` calls before dispatching to read the cited files at HEAD. This costs nothing (read-only, doesn't touch the implementer's working tree) but lets the coordinator: (a) discover wider rot the audit narrowed past — e.g. issue #626 audit prescribed only `--tcp-port`/`--ws-port` swap on the relay entrypoint, but pre-flight reading the worker entrypoints surfaced `--relay` vs `--relay-url`, `--max-events-per-server` vs `--max-events-per-author`, and a libp2p multiaddr the workers built that iroh doesn't accept; (b) pre-script an A vs B brainstorm prompt for the implementer with concrete options grounded in real file content, not speculation. **Skip pre-flight when the issue is genuinely one-token (e.g. a TODO comment swap).** Heuristic: pre-flight when the audit body mentions ≥ 2 file paths or "pairs with"; skip otherwise. The implementer can still investigate independently — pre-flight just sharpens the scope brief and prevents under-scoped fixes that "fix" a symptom while leaving the system broken. 
+ **Ambiguous-fix-path (audit premise real, fix is a design call) — coordinator-skip without close.** Some audit findings flag a real concern but the prescribed fix is ambiguous-by-design with ≥ 2 legitimate approaches that need a design decision the coordinator can't make unilaterally. Common shape: "audit-glob doesn't catch in-file `mod tests` — move to external test file OR update glob." Both are valid; picking either silently is wrong. **Action:** during step 6 picks, if the audit's literal fix conflicts with HEAD reality (e.g. code already satisfies the surface premise) AND the underlying concern requires a design call, **skip the issue from this run's dispatch queue without closing it**. Note it in the run-end `## Lessons Learned` so the next session has full context. Don't comment on the issue — the audit description already captures the concern; a "skipped, design call needed" comment adds noise without progress. The next run with fresh eyes / accumulated context can decide. 7. Per remaining issue, sequential, max 10 per run: - **Pre-dispatch sync:** before spawning each implementer, in the coordinator's checkout: @@ -210,6 +212,7 @@ Coordinator owns this check (metadata work, allowed under "coordinator never cod - "Did the implementer's work land on master?" → check git ref / wait for SHA-advance. - "Did the implementer agent terminate?" → wait for agent-completion notification. Don't treat them as redundant — and don't preemptively conclude the implementer crashed just because one arrived first. +- **Monitor is overhead for short dispatches — skip it when the agent return is the only signal you need.** For TINY/small fixes (single-file docs / config / one-token edits, <2 min implementer runtimes), the foreground Agent call already returns the agent's report synchronously. Don't bother arming Monitor — the return message itself tells you the SHA, and a quick `git fetch + git reset --hard origin/` confirms the push. 
Monitor's value is when you need the SHA-advance signal *independent* of agent completion: long dispatches where you want to do coordinator metadata work in parallel, or background-launched agents (`run_in_background: true`) where the agent return doesn't synchronously surface to you. **Rule of thumb:** foreground Agent + immediate `git fetch + reset --hard` for fast issues; background Agent + Monitor for long ones (≥ 5 min expected runtime, e.g. cross-crate gates, docker overhauls). Mid-run signals from the harness (stop-hook chatter, file writes) tell you the implementer's *alive*; the Agent return tells you it *finished*. ### Implementer cuts off mid-flight (uncommitted edits, agent terminated) - A surprisingly common failure mode: the implementer agent gets cut off (token budget, time slice, harness limit) AFTER making the substantive edits but BEFORE running the local merge gate or committing. Symptoms: diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 38220fa3..76a8ab32 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -177,7 +177,7 @@ jobs: - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - uses: dtolnay/rust-toolchain@29eef336d9b2848a0b548edc03f92a220660cdb8 # stable - name: Install cargo-audit - run: cargo install --locked cargo-audit + run: cargo install --locked --version 0.22.1 cargo-audit # Known advisories are tracked in dedicated issues; fail only on new ones. # When each issue is resolved, remove the matching --ignore entry. 
- name: Run cargo audit diff --git a/CLAUDE.md b/CLAUDE.md index 1c90bd3f..1ee2a129 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -121,7 +121,7 @@ just clean # cargo clean + remove crates/web/dist | Service | Address | Description | |---------|---------|-------------| -| Relay | `localhost:9090` (TCP), `localhost:9091` (WS) | Bridges peers | +| Relay | `localhost:3340` | iroh relay HTTP + bootstrap | | Replay node | connects via relay | In-memory state sync (max 1000 events/server) | | Storage node | connects via relay | Archival SQLite storage | | Web UI | `http://localhost:8080` | Leptos app via `trunk serve` | diff --git a/README.md b/README.md index f550820c..ee27cbea 100644 --- a/README.md +++ b/README.md @@ -127,7 +127,7 @@ This launches all services: | Service | Address | Description | |---------|---------|-------------| -| Relay | `localhost:9090` (TCP), `localhost:9091` (WS) | Bridges peers | +| Relay | `localhost:3340` | iroh relay HTTP + bootstrap | | Replay node | connects via relay | In-memory state sync (max 1000 events/server) | | Storage node | connects via relay | Archival SQLite storage | | Web UI | `http://localhost:8080` | Leptos app via trunk serve | diff --git a/crates/messaging/src/lib.rs b/crates/messaging/src/lib.rs index d56aefc4..357ab71b 100644 --- a/crates/messaging/src/lib.rs +++ b/crates/messaging/src/lib.rs @@ -106,6 +106,22 @@ pub const MAX_FILENAME_BYTES: usize = 255; /// MIME types longer than this are rejected by [`Content::validate`]. pub const MAX_MIME_BYTES: usize = 255; +/// Maximum allowed length, in bytes, for [`SealedContent::ciphertext`]. +/// +/// `SealedContent::ciphertext` is opaque before AEAD verification, so a +/// peer can broadcast an arbitrarily large blob and force receivers to +/// allocate it during deserialise and during decryption before any +/// authentication failure surfaces. Capping the wire length closes that +/// pre-decrypt DoS surface. 
+/// +/// 64 KiB comfortably exceeds any realistic chat payload (text bodies, +/// reactions, replies, edits, system messages) plus the 16-byte +/// ChaCha20-Poly1305 auth tag, while bounding worst-case allocations +/// from a single message at a level that's cheap to absorb. Peer-supplied +/// `Content::Encrypted` values whose `ciphertext` exceeds this bound are +/// rejected by [`Content::validate`]. +pub const MAX_SEALED_CIPHERTEXT_BYTES: usize = 64 * 1024; + /// Errors returned by [`Content::validate`] / [`Message::validate`] when a /// peer-supplied payload exceeds the structural bounds enforced by this /// crate. @@ -135,6 +151,16 @@ pub enum MessageValidationError { /// Configured maximum. max: usize, }, + + /// `Content::Encrypted` carried a `SealedContent::ciphertext` that + /// exceeded [`MAX_SEALED_CIPHERTEXT_BYTES`]. + #[error("sealed ciphertext too long: {actual} bytes (max {max})")] + SealedCiphertextTooLong { + /// Observed length in bytes. + actual: usize, + /// Configured maximum. + max: usize, + }, } /// The payload inside a [`Message`]. @@ -216,34 +242,52 @@ impl Content { /// `Content` is a wire enum decoded with `bincode` before any /// structural check runs, so callers handling peer-supplied values /// MUST invoke this method after decoding and before trusting the - /// fields. Currently the only bounded fields are - /// [`Content::File::filename`] (≤ [`MAX_FILENAME_BYTES`]) and - /// [`Content::File::mime_type`] (≤ [`MAX_MIME_BYTES`]); other - /// variants always validate successfully. + /// fields. Bounded fields: /// - /// `Content::Encrypted` is intentionally not recursed into here — - /// the inner `Content` is opaque ciphertext until the channel key - /// is applied; callers should re-invoke `validate()` on the - /// decrypted `Content` before using its fields. 
+ /// - [`Content::File::filename`] (≤ [`MAX_FILENAME_BYTES`]) + /// - [`Content::File::mime_type`] (≤ [`MAX_MIME_BYTES`]) + /// - [`SealedContent::ciphertext`] inside [`Content::Encrypted`] + /// (≤ [`MAX_SEALED_CIPHERTEXT_BYTES`]) + /// + /// Other variants always validate successfully. + /// + /// The plaintext inside `Content::Encrypted` is opaque until the + /// channel key is applied; callers should re-invoke `validate()` + /// on the decrypted `Content` before trusting its inner fields. pub fn validate(&self) -> Result<(), MessageValidationError> { - if let Content::File { - filename, - mime_type, - .. - } = self - { - if filename.len() > MAX_FILENAME_BYTES { - return Err(MessageValidationError::FilenameTooLong { - actual: filename.len(), - max: MAX_FILENAME_BYTES, - }); + match self { + Content::File { + filename, + mime_type, + .. + } => { + if filename.len() > MAX_FILENAME_BYTES { + return Err(MessageValidationError::FilenameTooLong { + actual: filename.len(), + max: MAX_FILENAME_BYTES, + }); + } + if mime_type.len() > MAX_MIME_BYTES { + return Err(MessageValidationError::MimeTypeTooLong { + actual: mime_type.len(), + max: MAX_MIME_BYTES, + }); + } } - if mime_type.len() > MAX_MIME_BYTES { - return Err(MessageValidationError::MimeTypeTooLong { - actual: mime_type.len(), - max: MAX_MIME_BYTES, - }); + Content::Encrypted(sealed) => { + if sealed.ciphertext.len() > MAX_SEALED_CIPHERTEXT_BYTES { + return Err(MessageValidationError::SealedCiphertextTooLong { + actual: sealed.ciphertext.len(), + max: MAX_SEALED_CIPHERTEXT_BYTES, + }); + } } + Content::Text { .. } + | Content::Reaction { .. } + | Content::Reply { .. } + | Content::Edit { .. } + | Content::Delete { .. } + | Content::System { .. 
} => {} } Ok(()) } @@ -727,6 +771,34 @@ mod tests { } } + #[test] + fn validate_rejects_oversized_sealed_ciphertext() { + let content = Content::Encrypted(SealedContent { + ciphertext: vec![0u8; MAX_SEALED_CIPHERTEXT_BYTES + 1], + nonce: [0u8; 12], + key_epoch: 0, + ratchet_counter: 0, + }); + assert_eq!( + content.validate(), + Err(MessageValidationError::SealedCiphertextTooLong { + actual: MAX_SEALED_CIPHERTEXT_BYTES + 1, + max: MAX_SEALED_CIPHERTEXT_BYTES, + }) + ); + } + + #[test] + fn validate_accepts_sealed_ciphertext_at_boundary() { + let content = Content::Encrypted(SealedContent { + ciphertext: vec![0u8; MAX_SEALED_CIPHERTEXT_BYTES], + nonce: [0u8; 12], + key_epoch: 0, + ratchet_counter: 0, + }); + assert!(content.validate().is_ok()); + } + #[test] fn message_validate_delegates_to_content() { let mut hlc = hlc::HLC::new(); diff --git a/crates/network/src/traits.rs b/crates/network/src/traits.rs index 5147d9d8..ffa35f28 100644 --- a/crates/network/src/traits.rs +++ b/crates/network/src/traits.rs @@ -158,7 +158,7 @@ pub trait Network: Send + Sync + 'static { /// Access the blob store. fn blobs(&self) -> &dyn BlobStore; - // TODO(#119): add connection_events() — stream relay up/down and direct + // TODO(#561): add connection_events() — stream relay up/down and direct // peer connect/disconnect events so the client can surface connectivity // status in the UI. diff --git a/crates/state/src/materialize.rs b/crates/state/src/materialize.rs index c0552f60..b6880dbd 100644 --- a/crates/state/src/materialize.rs +++ b/crates/state/src/materialize.rs @@ -29,6 +29,23 @@ fn truncate_chars(s: &str, cap: usize) -> String { s.chars().take(cap).collect() } +/// Maximum UTF-8 byte length of a `Reaction.emoji` string. +/// +/// Emoji codepoints are short — even multi-codepoint ZWJ sequences fit +/// in ~28 bytes. 
Capping at 32 bytes prevents a peer with +/// `SendMessages` permission from broadcasting multi-MB strings as +/// reaction keys, which would replicate to every receiver. Measured in +/// bytes (not chars) because byte length is what drives wire-format +/// storage cost. See issue #615. +pub const MAX_REACTION_EMOJI_BYTES: usize = 32; + +/// Maximum number of distinct reaction keys per message. +/// +/// Without this cap, a single peer can issue N events with N distinct +/// emoji to grow `ChatMessage.reactions` cardinality unboundedly, +/// since each unique key clones at every replay. See issue #615. +pub const MAX_REACTIONS_PER_MESSAGE: usize = 32; + /// Result of applying an event to state. #[derive(Debug, Clone, PartialEq, Eq)] pub enum ApplyResult { @@ -535,8 +552,30 @@ fn apply_mutation(state: &mut ServerState, event: &Event) -> ApplyResult { } EventKind::Reaction { message_id, emoji } => { + // Reject oversized emoji strings — bounds DAG-time storage + // cost, since each unique emoji is cloned as a BTreeMap key + // on every receiver. See issue #615. + if emoji.len() > MAX_REACTION_EMOJI_BYTES { + return ApplyResult::Rejected(format!( + "reaction emoji exceeds {} bytes ({} bytes)", + MAX_REACTION_EMOJI_BYTES, + emoji.len() + )); + } if let Some(&idx) = state.message_index.get(message_id) { if let Some(msg) = state.messages.get_mut(idx) { + // Reject if adding a *new* reaction key would push + // cardinality past the cap. Adding an author to an + // existing emoji key is fine — that doesn't grow + // cardinality. See issue #615. 
+ if !msg.reactions.contains_key(emoji) + && msg.reactions.len() >= MAX_REACTIONS_PER_MESSAGE + { + return ApplyResult::Rejected(format!( + "message already has {} distinct reactions (cap)", + MAX_REACTIONS_PER_MESSAGE + )); + } msg.reactions .entry(emoji.clone()) .or_default() @@ -594,6 +633,21 @@ fn apply_mutation(state: &mut ServerState, event: &Event) -> ApplyResult { elsewhere, since, } = delta.as_ref(); + // Mirror SetProfile's display_name cap: reject the entire + // event if display_name exceeds 64 chars. Rejection (not + // silent truncation like the sibling profile fields) is + // chosen for parity with SetProfile and because display_name + // is identity-tier — silent truncation could produce + // confusing duplicate names ("alice" vs a truncated + // "alice…"). See issue #614. + if let Some(name) = display_name { + if name.chars().count() > 64 { + return ApplyResult::Rejected(format!( + "display name exceeds 64 chars ({} chars)", + name.chars().count() + )); + } + } let entry = state .profiles .entry(event.author) @@ -1601,6 +1655,81 @@ mod tests { ); } + #[test] + fn update_profile_display_name_over_64_chars_rejected() { + // Mirrors the SetProfile cap (issue #614): an UpdateProfile event + // carrying a >64-char display_name must be rejected so a + // misbehaving peer cannot DoS receivers by broadcasting a + // multi-megabyte name (`Some("a".repeat(10_000_000))`). At the + // same time, a 64-char display name must still apply, and the + // sibling fields (which use silent truncation) must not regress. + let admin = Identity::generate(); + let mut dag = test_dag(&admin); + + // First, set a known-good display name via SetProfile so we have + // a baseline value to assert against after the rejected event. + let baseline = "alice".to_string(); + emit( + &mut dag, + &admin, + EventKind::SetProfile { + display_name: baseline.clone(), + }, + ); + + // 64 crabs (256 bytes) must apply — char-count, not byte-length. 
+ let ok_display: String = "🦀".repeat(64); + emit( + &mut dag, + &admin, + EventKind::UpdateProfile(Box::new(crate::types::ProfileDelta { + display_name: Some(ok_display.clone()), + ..Default::default() + })), + ); + let state_after_ok = materialize(&dag); + assert_eq!( + state_after_ok.profiles[&admin.endpoint_id()].display_name, + ok_display, + "64-char UpdateProfile display_name must apply", + ); + assert_eq!( + state_after_ok.members[&admin.endpoint_id()].display_name, + Some(ok_display.clone()), + "members entry must mirror profiles entry", + ); + + // 65 crabs must be rejected — and the prior value must remain. + let bad_display: String = "🦀".repeat(65); + emit( + &mut dag, + &admin, + EventKind::UpdateProfile(Box::new(crate::types::ProfileDelta { + display_name: Some(bad_display), + // Pair with a benign sibling field; if the event were + // (incorrectly) applied with truncation instead of + // rejection, the pronouns side-effect would still land. + // Rejection must drop the *whole* event. + pronouns: Some(Some("they/them".into())), + ..Default::default() + })), + ); + let state = materialize(&dag); + // Display name pinned at the prior 64-char value. + assert_eq!( + state.profiles[&admin.endpoint_id()].display_name, + ok_display, + "rejected UpdateProfile must leave display_name unchanged", + ); + // Sibling field must NOT have been applied — rejection drops the + // whole event, not just the offending field. 
+ assert_eq!( + state.profiles[&admin.endpoint_id()].pronouns, + None, + "rejected UpdateProfile must not apply sibling fields", + ); + } + #[test] fn kick_cleans_up_pending_votes() { let admin = Identity::generate(); @@ -1667,4 +1796,177 @@ mod tests { let state = materialize(&dag); assert_eq!(state.vote_threshold, VoteThreshold::Unanimous); } + + #[test] + fn reaction_emoji_over_32_bytes_rejected() { + // Issue #615: an oversized Reaction.emoji must be rejected, not + // applied as a BTreeMap key, otherwise a peer with SendMessages + // permission can broadcast a multi-MB string and force every + // receiver to clone it on every replay. + let admin = Identity::generate(); + let mut dag = test_dag(&admin); + let msg = emit( + &mut dag, + &admin, + EventKind::Message { + channel_id: "general".into(), + body: "react to me".into(), + reply_to: None, + }, + ); + + // 33 bytes of ASCII — one byte over the 32-byte cap. + let bad_emoji: String = "a".repeat(MAX_REACTION_EMOJI_BYTES + 1); + assert_eq!(bad_emoji.len(), MAX_REACTION_EMOJI_BYTES + 1); + emit( + &mut dag, + &admin, + EventKind::Reaction { + message_id: msg.hash, + emoji: bad_emoji.clone(), + }, + ); + + let state = materialize(&dag); + assert!( + !state.messages[0].reactions.contains_key(&bad_emoji), + "oversized emoji must not appear as a reaction key", + ); + assert!( + state.messages[0].reactions.is_empty(), + "no reactions should land when the only Reaction is rejected", + ); + } + + #[test] + fn reaction_cardinality_over_32_per_message_rejected() { + // Issue #615: distinct reaction keys per message are capped at + // MAX_REACTIONS_PER_MESSAGE. A 33rd distinct emoji must be + // rejected; the first 32 must remain. 
+ let admin = Identity::generate(); + let mut dag = test_dag(&admin); + let msg = emit( + &mut dag, + &admin, + EventKind::Message { + channel_id: "general".into(), + body: "react to me".into(), + reply_to: None, + }, + ); + + // 32 distinct ASCII keys (3-char strings so they're + // unambiguously distinct keys, well under 32 bytes each). + for i in 0..MAX_REACTIONS_PER_MESSAGE { + let emoji = format!("e{i:02}"); + emit( + &mut dag, + &admin, + EventKind::Reaction { + message_id: msg.hash, + emoji, + }, + ); + } + + // 33rd distinct key — must be rejected. + let overflow_emoji = "EXTRA".to_string(); + emit( + &mut dag, + &admin, + EventKind::Reaction { + message_id: msg.hash, + emoji: overflow_emoji.clone(), + }, + ); + + let state = materialize(&dag); + assert_eq!( + state.messages[0].reactions.len(), + MAX_REACTIONS_PER_MESSAGE, + "exactly MAX_REACTIONS_PER_MESSAGE distinct keys must remain", + ); + assert!( + !state.messages[0].reactions.contains_key(&overflow_emoji), + "33rd distinct reaction must not be added", + ); + } + + #[test] + fn reaction_existing_key_not_blocked_by_cardinality_cap() { + // Issue #615: at the cap, applying a Reaction with an emoji + // *already* present must still succeed — it only adds an author + // to the existing set; cardinality does not grow. + let admin = Identity::generate(); + let bob = Identity::generate(); + let mut dag = test_dag(&admin); + + // Trust bob so he can react. + let grant = emit( + &mut dag, + &admin, + EventKind::GrantPermission { + peer_id: bob.endpoint_id(), + permission: crate::event::Permission::SendMessages, + }, + ); + let msg = emit( + &mut dag, + &admin, + EventKind::Message { + channel_id: "general".into(), + body: "react to me".into(), + reply_to: None, + }, + ); + + // Fill to the cap with admin's reactions. 
+ for i in 0..MAX_REACTIONS_PER_MESSAGE { + let emoji = format!("e{i:02}"); + emit( + &mut dag, + &admin, + EventKind::Reaction { + message_id: msg.hash, + emoji, + }, + ); + } + + // Bob reacts with an *existing* emoji key — must apply, even + // though the message is at MAX_REACTIONS_PER_MESSAGE distinct + // reactions, because cardinality does not grow. Bob's event + // must causally depend on the grant + message so the topo sort + // places it after both, otherwise his reaction may be processed + // before he has SendMessages. + let existing_emoji = "e00".to_string(); + emit_with_deps( + &mut dag, + &bob, + EventKind::Reaction { + message_id: msg.hash, + emoji: existing_emoji.clone(), + }, + vec![grant.hash, msg.hash], + ); + + let state = materialize(&dag); + assert_eq!( + state.messages[0].reactions.len(), + MAX_REACTIONS_PER_MESSAGE, + "cardinality unchanged", + ); + let authors = state.messages[0] + .reactions + .get(&existing_emoji) + .expect("existing emoji key must still be present"); + assert!( + authors.contains(&admin.endpoint_id()), + "admin's original reaction must remain", + ); + assert!( + authors.contains(&bob.endpoint_id()), + "bob's reaction with existing emoji must be added", + ); + } } diff --git a/crates/state/src/tests/permissions.rs b/crates/state/src/tests/permissions.rs index 76f1bf76..70dc1e2e 100644 --- a/crates/state/src/tests/permissions.rs +++ b/crates/state/src/tests/permissions.rs @@ -669,6 +669,19 @@ fn check_permission_rejects_non_admin_rename_server() { assert!(crate::materialize::check_permission(&state, &peer.endpoint_id(), &kind).is_err()); } +#[test] +fn check_permission_rejects_non_admin_set_server_description() { + let owner = Identity::generate(); + let peer = Identity::generate(); + let dag = test_dag(&owner); + let state = materialize(&dag); + + let kind = EventKind::SetServerDescription { + description: "hacked".into(), + }; + assert!(crate::materialize::check_permission(&state, &peer.endpoint_id(), &kind).is_err()); 
+} + #[test] fn create_and_insert_rejects_without_permission() { use crate::dag::InsertError; diff --git a/crates/web/tests/static_assets.rs b/crates/web/tests/static_assets.rs index cd44c083..025e0218 100644 --- a/crates/web/tests/static_assets.rs +++ b/crates/web/tests/static_assets.rs @@ -13,6 +13,7 @@ //! //! See: GitHub issue #312 (SEC-W-07). +use std::collections::{BTreeMap, BTreeSet}; use std::path::PathBuf; /// The complete set of SVG icon paths permitted to appear in the web @@ -112,33 +113,81 @@ fn index_html_only_references_bundled_svg_icons() { assert_all_in_allow_list("index.html", &refs); } -/// Required directives for the CSP meta tag baked into `index.html`. +/// Expected directives for the CSP meta tag baked into `index.html`. /// -/// Each entry is a substring search against the meta tag's `content` value; -/// they collectively encode the policy decisions described in the inline -/// HTML comment beside the tag (see GitHub issue #175). If you intentionally -/// loosen or tighten one of these directives, update both this list and the -/// inline comment so the rationale stays in sync. -const REQUIRED_CSP_DIRECTIVES: &[&str] = &[ - "default-src 'self'", +/// Each entry is `(directive-name, &[expected source tokens])`. The test +/// parses the meta tag's `content` attribute and asserts that the parsed +/// directive set equals this expected set *exactly* — no missing +/// directives, no extra directives, and no extra source tokens within a +/// directive. Token order is irrelevant (compared as sets). +/// +/// The exact-match shape is deliberate: a substring-only check let +/// additions like widening `script-src` with `'unsafe-inline'` slip +/// through unnoticed (see GitHub issue #619). If you intentionally loosen +/// or tighten one of these directives, update both this list and the +/// rationale comment beside the meta tag in `index.html` so the +/// reasoning stays in sync (issue #175). 
+const EXPECTED_CSP_DIRECTIVES: &[(&str, &[&str])] = &[ + ("default-src", &["'self'"]), // WASM module + the still-extant js_sys::eval() sites (tracked by // issues #171 / #425). Drop 'unsafe-eval' once those are gone. - "script-src 'self' 'wasm-unsafe-eval' 'unsafe-eval'", + ( + "script-src", + &["'self'", "'wasm-unsafe-eval'", "'unsafe-eval'"], + ), // Inline style="…" attrs from Leptos views + Google Fonts CSS @import. - "style-src 'self' 'unsafe-inline' https://fonts.googleapis.com", - "font-src 'self' https://fonts.gstatic.com", + ( + "style-src", + &["'self'", "'unsafe-inline'", "https://fonts.googleapis.com"], + ), + ("font-src", &["'self'", "https://fonts.gstatic.com"]), // ws/wss for relay transport, https for the relay HTTP bootstrap probe. - "connect-src 'self' ws: wss: https:", + ("connect-src", &["'self'", "ws:", "wss:", "https:"]), // data: for avatar URIs, blob: for runtime createObjectURL attachments. - "img-src 'self' https: data: blob:", - "media-src 'self' blob:", - "worker-src 'self'", - "object-src 'none'", - "base-uri 'self'", - "form-action 'self'", - "frame-ancestors 'none'", + ("img-src", &["'self'", "https:", "data:", "blob:"]), + ("media-src", &["'self'", "blob:"]), + ("worker-src", &["'self'"]), + ("object-src", &["'none'"]), + ("base-uri", &["'self'"]), + ("form-action", &["'self'"]), + ("frame-ancestors", &["'none'"]), ]; +/// Extract the `content="…"` attribute value of the +/// `<meta http-equiv="Content-Security-Policy">` tag from `index.html`. +fn extract_csp_content(html: &str) -> Option<&str> { + let needle = "http-equiv=\"Content-Security-Policy\""; + let pos = html.find(needle)?; + // Find `content="…"` after the http-equiv attribute (within the same tag). + let after = &html[pos..]; + let tag_end = after.find('>')?; + let tag = &after[..tag_end]; + let content_marker = "content=\""; + let content_start = tag.find(content_marker)?
+ content_marker.len(); + let content_rel = tag[content_start..].find('"')?; + Some(&tag[content_start..content_start + content_rel]) +} + +/// Parse a CSP `content` attribute value into a directive → token-set map. +/// +/// Splits on `;`, trims, then within each directive splits on whitespace, +/// taking the first token as the directive name and the rest as its +/// source tokens. +fn parse_csp(content: &str) -> BTreeMap<&str, BTreeSet<&str>> { + content + .split(';') + .filter_map(|d| { + let d = d.trim(); + if d.is_empty() { + return None; + } + let mut parts = d.split_whitespace(); + let name = parts.next()?; + Some((name, parts.collect())) + }) + .collect() +} + #[test] fn index_html_declares_content_security_policy() { let contents = read_asset("index.html"); @@ -149,12 +198,75 @@ fn index_html_declares_content_security_policy() { See GitHub issue #175 — the CSP guards against script injection \ and clickjacking and must stay in the document head.", ); - for directive in REQUIRED_CSP_DIRECTIVES { + + let csp = extract_csp_content(&contents).expect( + "could not extract content=\"…\" from the Content-Security-Policy meta tag in index.html", + ); + let actual = parse_csp(csp); + let expected: BTreeMap<&str, BTreeSet<&str>> = EXPECTED_CSP_DIRECTIVES + .iter() + .map(|(name, tokens)| (*name, tokens.iter().copied().collect())) + .collect(); + + // Exact-set comparison: catches missing directives, extra directives, + // missing tokens, and — most importantly for issue #619 — additions + // of source tokens (e.g. widening script-src with 'unsafe-inline') + // that a substring check would silently allow through. 
+ assert_eq!( + actual, expected, + "index.html CSP does not match the expected directive set.\n\ + Update both the meta tag in index.html AND \ + EXPECTED_CSP_DIRECTIVES in this test (and the rationale comment \ + beside the meta tag) if you are intentionally changing the CSP.\n\ + Expected: {expected:#?}\nActual: {actual:#?}", + ); +} + +/// Defense-in-depth: even if a future maintainer updates both the meta +/// tag and `EXPECTED_CSP_DIRECTIVES` together, certain source tokens are +/// dangerous enough that they should never appear outside specific +/// directives. This test bypasses the expected-set comparison and asserts +/// those invariants directly. +#[test] +fn csp_rejects_unsafe_inline_outside_style_src() { + let contents = read_asset("index.html"); + let csp = extract_csp_content(&contents).expect("CSP meta tag missing"); + let parsed = parse_csp(csp); + for (name, tokens) in &parsed { + if *name == "style-src" { + // 'unsafe-inline' is intentionally allowed here for Leptos + // inline style="…" attributes. See the comment beside the + // meta tag in index.html. + continue; + } + assert!( + !tokens.contains("'unsafe-inline'"), + "CSP directive `{name}` contains 'unsafe-inline'. \ + Only `style-src` is permitted to use 'unsafe-inline'; any \ + other directive (especially script-src) opens an XSS vector. \ + See GitHub issue #619.", + ); + assert!( + !tokens.contains("'unsafe-hashes'"), + "CSP directive `{name}` contains 'unsafe-hashes'. \ + 'unsafe-hashes' relaxes hash matching to apply to inline \ + event handlers; it should not be added to any directive \ + without an explicit security review. 
See issue #619.", + ); + } +} + +#[test] +fn csp_rejects_data_in_script_src() { + let contents = read_asset("index.html"); + let csp = extract_csp_content(&contents).expect("CSP meta tag missing"); + let parsed = parse_csp(csp); + if let Some(tokens) = parsed.get("script-src") { assert!( - contents.contains(directive), - "index.html CSP is missing required directive `{directive}`. \ - Update both the meta tag in index.html and the rationale \ - comment beside it if you are intentionally changing this.", + !tokens.contains("data:"), + "CSP `script-src` contains `data:`. This permits inline \ + scripts to be sourced from data: URIs and is a known XSS \ + vector — never add it. See GitHub issue #619.", ); } } diff --git a/docker-compose.yml b/docker-compose.yml index e5157040..6d98fe30 100644 --- a/docker-compose.yml +++ b/docker-compose.yml @@ -4,11 +4,9 @@ services: context: . dockerfile: docker/relay.Dockerfile ports: - - "9090:9090" - - "9091:9091" + - "3340:3340" volumes: - relay-identity:/etc/willow - - shared:/shared restart: unless-stopped replay-1: @@ -17,9 +15,9 @@ services: dockerfile: docker/replay.Dockerfile volumes: - replay-1-identity:/etc/willow - - shared:/shared:ro environment: - - MAX_EVENTS_PER_SERVER=1000 + - RELAY_URL=http://relay:3340 + - MAX_EVENTS_PER_AUTHOR=1000 - SYNC_INTERVAL=30 depends_on: - relay @@ -31,9 +29,9 @@ services: dockerfile: docker/replay.Dockerfile volumes: - replay-2-identity:/etc/willow - - shared:/shared:ro environment: - - MAX_EVENTS_PER_SERVER=1000 + - RELAY_URL=http://relay:3340 + - MAX_EVENTS_PER_AUTHOR=1000 - SYNC_INTERVAL=30 depends_on: - relay @@ -46,8 +44,8 @@ services: volumes: - storage-1-identity:/etc/willow - storage-1-data:/var/lib/willow - - shared:/shared:ro environment: + - RELAY_URL=http://relay:3340 - SYNC_INTERVAL=60 depends_on: - relay @@ -60,8 +58,8 @@ services: volumes: - storage-2-identity:/etc/willow - storage-2-data:/var/lib/willow - - shared:/shared:ro environment: + - RELAY_URL=http://relay:3340 - 
SYNC_INTERVAL=60 depends_on: - relay @@ -79,7 +77,6 @@ services: volumes: relay-identity: - shared: replay-1-identity: replay-2-identity: storage-1-identity: diff --git a/docker/relay-entrypoint.sh b/docker/relay-entrypoint.sh index 228b80ac..66aca937 100755 --- a/docker/relay-entrypoint.sh +++ b/docker/relay-entrypoint.sh @@ -1,10 +1,5 @@ #!/bin/sh set -e -mkdir -p /etc/willow /shared +mkdir -p /etc/willow -# Generate key and publish peer ID for workers. -PEER_ID=$(willow-relay --identity /etc/willow/relay.key --print-peer-id) -echo "$PEER_ID" > /shared/relay-peer-id -echo "Relay peer ID: $PEER_ID" - -exec willow-relay --tcp-port 9090 --ws-port 9091 --identity /etc/willow/relay.key +exec willow-relay --relay-port 3340 --identity /etc/willow/relay.key diff --git a/docker/replay-entrypoint.sh b/docker/replay-entrypoint.sh index 01d9a5ef..a5489c3a 100755 --- a/docker/replay-entrypoint.sh +++ b/docker/replay-entrypoint.sh @@ -7,16 +7,10 @@ if [ ! -f /etc/willow/replay.key ]; then willow-replay --generate-identity --identity-path /etc/willow/replay.key fi -# Wait for relay peer ID to be published. -echo "Waiting for relay peer ID..." -while [ ! -f /shared/relay-peer-id ]; do sleep 1; done -RELAY_PEER_ID=$(cat /shared/relay-peer-id) -echo "Relay peer ID: $RELAY_PEER_ID" - -RELAY_ADDR="/dns4/relay/tcp/9091/ws/p2p/$RELAY_PEER_ID" +RELAY_URL="${RELAY_URL:-http://relay:3340}" exec willow-replay \ --identity-path /etc/willow/replay.key \ - --relay "$RELAY_ADDR" \ - --max-events-per-server "${MAX_EVENTS_PER_SERVER:-1000}" \ + --relay-url "$RELAY_URL" \ + --max-events-per-author "${MAX_EVENTS_PER_AUTHOR:-1000}" \ --sync-interval "${SYNC_INTERVAL:-30}" diff --git a/docker/storage-entrypoint.sh b/docker/storage-entrypoint.sh index 114741c2..8e4b1d90 100755 --- a/docker/storage-entrypoint.sh +++ b/docker/storage-entrypoint.sh @@ -7,16 +7,10 @@ if [ ! 
-f /etc/willow/storage.key ]; then willow-storage --generate-identity --identity-path /etc/willow/storage.key fi -# Wait for relay peer ID to be published. -echo "Waiting for relay peer ID..." -while [ ! -f /shared/relay-peer-id ]; do sleep 1; done -RELAY_PEER_ID=$(cat /shared/relay-peer-id) -echo "Relay peer ID: $RELAY_PEER_ID" - -RELAY_ADDR="/dns4/relay/tcp/9091/ws/p2p/$RELAY_PEER_ID" +RELAY_URL="${RELAY_URL:-http://relay:3340}" exec willow-storage \ --identity-path /etc/willow/storage.key \ - --relay "$RELAY_ADDR" \ + --relay-url "$RELAY_URL" \ --db-path "${DB_PATH:-/var/lib/willow/storage.db}" \ --sync-interval "${SYNC_INTERVAL:-60}" diff --git a/justfile b/justfile index 9f6626cf..7f149e99 100644 --- a/justfile +++ b/justfile @@ -1,6 +1,6 @@ # Willow — P2P encrypted chat -# Run ALL checks (fmt, clippy, test, wasm, browser). Use before committing. +# Run ALL checks (fmt, clippy, test, wasm). Use before committing. check: fmt clippy test check-wasm # Format all code @@ -186,7 +186,7 @@ serve-web: build-relay: cargo build --release -p willow-relay -# Run the relay server (TCP 9090, WebSocket 9091) +# Run the relay server (HTTP 3340) relay *args: cargo run -p willow-relay -- {{args}}