Merged
3 changes: 3 additions & 0 deletions .claude/skills/resolving-issues/SKILL.md
@@ -62,6 +62,8 @@ Why this shape:

**When same-day audit-to-fix gap, expect ~zero already-fixed hits.** The sweep yields most when the audit ran days/weeks before the fix run (intervening PRs land between audit and fix). When the audit was filed earlier the same day against the current `main`, no fixes can have landed between audit and fix run by definition — the sweep correctly returns empty. Don't over-invest sweep effort in this case; one quick `git log <audit-ref>..origin/main` check on each pick's path is enough to confirm.

**Coordinator pre-flight reads (cheap, high-value).** When an issue cites multiple files or hints at sibling rot ("Pairs with FX/FY", "or the migration's port surface drifted", "compounds Z"), spend 1–2 `mcp__github__get_file_contents` calls before dispatching to read the cited files at HEAD. This costs nothing (read-only, doesn't touch the implementer's working tree) but lets the coordinator: (a) discover wider rot the audit narrowed past — e.g. issue #626 audit prescribed only `--tcp-port`/`--ws-port` swap on the relay entrypoint, but pre-flight reading the worker entrypoints surfaced `--relay` vs `--relay-url`, `--max-events-per-server` vs `--max-events-per-author`, and a libp2p multiaddr the workers built that iroh doesn't accept; (b) pre-script an A vs B brainstorm prompt for the implementer with concrete options grounded in real file content, not speculation. **Skip pre-flight when the issue is genuinely one-token (e.g. a TODO comment swap).** Heuristic: pre-flight when the audit body mentions ≥ 2 file paths or "pairs with"; skip otherwise. The implementer can still investigate independently — pre-flight just sharpens the scope brief and prevents under-scoped fixes that "fix" a symptom while leaving the system broken.
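The "≥ 2 file paths or 'pairs with'" heuristic above can be encoded as a simple predicate. A minimal illustrative sketch in Rust — `should_preflight` and its crude path detection are invented for this sketch, not part of the skill's actual tooling:

```rust
/// Decide whether the coordinator should spend pre-flight reads on an issue.
/// Heuristic from the skill text: pre-flight when the audit body mentions two
/// or more file paths, or hints at sibling rot with "pairs with".
fn should_preflight(audit_body: &str) -> bool {
    // Crude path detection: a whitespace-delimited token containing both a
    // slash and a dot (e.g. `crates/relay/src/main.rs`) counts as a path.
    let path_mentions = audit_body
        .split_whitespace()
        .filter(|w| w.contains('/') && w.contains('.'))
        .count();
    path_mentions >= 2 || audit_body.to_lowercase().contains("pairs with")
}

fn main() {
    assert!(should_preflight("Pairs with FX/FY; the port surface drifted"));
    assert!(should_preflight(
        "see crates/relay/src/main.rs and crates/worker/src/main.rs"
    ));
    assert!(!should_preflight("one-token TODO comment swap"));
    println!("ok");
}
```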

**Ambiguous-fix-path (audit premise real, fix is a design call) — coordinator-skip without close.** Some audit findings flag a real concern but the prescribed fix is ambiguous-by-design with ≥ 2 legitimate approaches that need a design decision the coordinator can't make unilaterally. Common shape: "audit-glob doesn't catch in-file `mod tests` — move to external test file OR update glob." Both are valid; picking either silently is wrong. **Action:** during step 6 picks, if the audit's literal fix conflicts with HEAD reality (e.g. code already satisfies the surface premise) AND the underlying concern requires a design call, **skip the issue from this run's dispatch queue without closing it**. Note it in the run-end `## Lessons Learned` so the next session has full context. Don't comment on the issue — the audit description already captures the concern; a "skipped, design call needed" comment adds noise without progress. The next run with fresh eyes / accumulated context can decide.
7. Per remaining issue, sequential, max 10 per run:
- **Pre-dispatch sync:** before spawning each implementer, in the coordinator's checkout:
@@ -210,6 +212,7 @@ Coordinator owns this check (metadata work, allowed under "coordinator never cod
- "Did the implementer's work land on master?" → check git ref / wait for SHA-advance.
- "Did the implementer agent terminate?" → wait for agent-completion notification.
Don't treat them as redundant — and don't preemptively conclude the implementer crashed just because one arrived first.
- **Monitor is overhead for short dispatches — skip it when the agent return is the only signal you need.** For TINY/small fixes (single-file docs / config / one-token edits, <2 min implementer runtimes), the foreground Agent call already returns the agent's report synchronously. Don't bother arming Monitor — the return message itself tells you the SHA, and a quick `git fetch + git reset --hard origin/<branch>` confirms the push. Monitor's value is when you need the SHA-advance signal *independent* of agent completion: long dispatches where you want to do coordinator metadata work in parallel, or background-launched agents (`run_in_background: true`) where the agent return doesn't synchronously surface to you. **Rule of thumb:** foreground Agent + immediate `git fetch + reset --hard` for fast issues; background Agent + Monitor for long ones (≥ 5 min expected runtime, e.g. cross-crate gates, docker overhauls). Mid-run signals from the harness (stop-hook chatter, file writes) tell you the implementer's *alive*; the Agent return tells you it *finished*.
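The rule of thumb above reduces to a single threshold. An illustrative Rust encoding — the 5-minute cutoff comes from the text, while the names and types are made up for this sketch:

```rust
/// How the coordinator dispatches an implementer, per the rule of thumb:
/// foreground + immediate fetch/reset for fast issues, background + Monitor
/// for long ones.
#[derive(Debug, PartialEq)]
enum DispatchMode {
    /// Foreground Agent call; confirm the push with `git fetch` +
    /// `git reset --hard origin/<branch>`.
    ForegroundThenFetch,
    /// Background Agent (`run_in_background: true`) with Monitor armed for
    /// the SHA-advance signal.
    BackgroundWithMonitor,
}

fn pick_dispatch_mode(expected_runtime_min: u32) -> DispatchMode {
    if expected_runtime_min >= 5 {
        DispatchMode::BackgroundWithMonitor
    } else {
        DispatchMode::ForegroundThenFetch
    }
}

fn main() {
    // TINY/small fix (single-file docs, one-token edit): no Monitor needed.
    assert_eq!(pick_dispatch_mode(1), DispatchMode::ForegroundThenFetch);
    // Cross-crate gate or docker overhaul: arm Monitor, work in parallel.
    assert_eq!(pick_dispatch_mode(10), DispatchMode::BackgroundWithMonitor);
    println!("ok");
}
```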

### Implementer cuts off mid-flight (uncommitted edits, agent terminated)
- A surprisingly common failure mode: the implementer agent gets cut off (token budget, time slice, harness limit) AFTER making the substantive edits but BEFORE running the local merge gate or committing. Symptoms:
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -177,7 +177,7 @@ jobs:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: dtolnay/rust-toolchain@29eef336d9b2848a0b548edc03f92a220660cdb8 # stable
- name: Install cargo-audit
run: cargo install --locked cargo-audit
run: cargo install --locked --version 0.22.1 cargo-audit
# Known advisories are tracked in dedicated issues; fail only on new ones.
# When each issue is resolved, remove the matching --ignore entry.
- name: Run cargo audit
2 changes: 1 addition & 1 deletion CLAUDE.md
@@ -121,7 +121,7 @@ just clean # cargo clean + remove crates/web/dist

| Service | Address | Description |
|---------|---------|-------------|
| Relay | `localhost:9090` (TCP), `localhost:9091` (WS) | Bridges peers |
| Relay | `localhost:3340` | iroh relay HTTP + bootstrap |
| Replay node | connects via relay | In-memory state sync (max 1000 events/server) |
| Storage node | connects via relay | Archival SQLite storage |
| Web UI | `http://localhost:8080` | Leptos app via `trunk serve` |
2 changes: 1 addition & 1 deletion README.md
@@ -127,7 +127,7 @@ This launches all services:

| Service | Address | Description |
|---------|---------|-------------|
| Relay | `localhost:9090` (TCP), `localhost:9091` (WS) | Bridges peers |
| Relay | `localhost:3340` | iroh relay HTTP + bootstrap |
| Replay node | connects via relay | In-memory state sync (max 1000 events/server) |
| Storage node | connects via relay | Archival SQLite storage |
| Web UI | `http://localhost:8080` | Leptos app via trunk serve |
120 changes: 96 additions & 24 deletions crates/messaging/src/lib.rs
@@ -106,6 +106,22 @@ pub const MAX_FILENAME_BYTES: usize = 255;
/// MIME types longer than this are rejected by [`Content::validate`].
pub const MAX_MIME_BYTES: usize = 255;

/// Maximum allowed length, in bytes, for [`SealedContent::ciphertext`].
///
/// `SealedContent::ciphertext` is opaque before AEAD verification, so a
/// peer can broadcast an arbitrarily large blob and force receivers to
/// allocate it during deserialisation and decryption before any
/// authentication failure surfaces. Capping the wire length closes that
/// pre-decrypt DoS surface.
///
/// 64 KiB comfortably exceeds any realistic chat payload (text bodies,
/// reactions, replies, edits, system messages) plus the 16-byte
/// ChaCha20-Poly1305 auth tag, while bounding worst-case allocations
/// from a single message at a level that's cheap to absorb. Peer-supplied
/// `Content::Encrypted` values whose `ciphertext` exceeds this bound are
/// rejected by [`Content::validate`].
pub const MAX_SEALED_CIPHERTEXT_BYTES: usize = 64 * 1024;

/// Errors returned by [`Content::validate`] / [`Message::validate`] when a
/// peer-supplied payload exceeds the structural bounds enforced by this
/// crate.
@@ -135,6 +151,16 @@ pub enum MessageValidationError {
/// Configured maximum.
max: usize,
},

/// `Content::Encrypted` carried a `SealedContent::ciphertext` that
/// exceeded [`MAX_SEALED_CIPHERTEXT_BYTES`].
#[error("sealed ciphertext too long: {actual} bytes (max {max})")]
SealedCiphertextTooLong {
/// Observed length in bytes.
actual: usize,
/// Configured maximum.
max: usize,
},
}

/// The payload inside a [`Message`].
@@ -216,34 +242,52 @@ impl Content {
/// `Content` is a wire enum decoded with `bincode` before any
/// structural check runs, so callers handling peer-supplied values
/// MUST invoke this method after decoding and before trusting the
/// fields. Currently the only bounded fields are
/// [`Content::File::filename`] (≤ [`MAX_FILENAME_BYTES`]) and
/// [`Content::File::mime_type`] (≤ [`MAX_MIME_BYTES`]); other
/// variants always validate successfully.
/// fields. Bounded fields:
///
/// `Content::Encrypted` is intentionally not recursed into here —
/// the inner `Content` is opaque ciphertext until the channel key
/// is applied; callers should re-invoke `validate()` on the
/// decrypted `Content` before using its fields.
/// - [`Content::File::filename`] (≤ [`MAX_FILENAME_BYTES`])
/// - [`Content::File::mime_type`] (≤ [`MAX_MIME_BYTES`])
/// - [`SealedContent::ciphertext`] inside [`Content::Encrypted`]
/// (≤ [`MAX_SEALED_CIPHERTEXT_BYTES`])
///
/// Other variants always validate successfully.
///
/// The plaintext inside `Content::Encrypted` is opaque until the
/// channel key is applied; callers should re-invoke `validate()`
/// on the decrypted `Content` before trusting its inner fields.
pub fn validate(&self) -> Result<(), MessageValidationError> {
if let Content::File {
filename,
mime_type,
..
} = self
{
if filename.len() > MAX_FILENAME_BYTES {
return Err(MessageValidationError::FilenameTooLong {
actual: filename.len(),
max: MAX_FILENAME_BYTES,
});
match self {
Content::File {
filename,
mime_type,
..
} => {
if filename.len() > MAX_FILENAME_BYTES {
return Err(MessageValidationError::FilenameTooLong {
actual: filename.len(),
max: MAX_FILENAME_BYTES,
});
}
if mime_type.len() > MAX_MIME_BYTES {
return Err(MessageValidationError::MimeTypeTooLong {
actual: mime_type.len(),
max: MAX_MIME_BYTES,
});
}
}
if mime_type.len() > MAX_MIME_BYTES {
return Err(MessageValidationError::MimeTypeTooLong {
actual: mime_type.len(),
max: MAX_MIME_BYTES,
});
Content::Encrypted(sealed) => {
if sealed.ciphertext.len() > MAX_SEALED_CIPHERTEXT_BYTES {
return Err(MessageValidationError::SealedCiphertextTooLong {
actual: sealed.ciphertext.len(),
max: MAX_SEALED_CIPHERTEXT_BYTES,
});
}
}
Content::Text { .. }
| Content::Reaction { .. }
| Content::Reply { .. }
| Content::Edit { .. }
| Content::Delete { .. }
| Content::System { .. } => {}
}
Ok(())
}
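The caller contract stated in the doc comment — validate after decode, re-validate after decrypt — can be illustrated standalone. A minimal sketch with simplified stand-in types, not the crate's real definitions:

```rust
// Stand-in for the crate's bound; same value as MAX_SEALED_CIPHERTEXT_BYTES.
const MAX_SEALED_CIPHERTEXT_BYTES: usize = 64 * 1024;

// Simplified stand-in for the crate's Content enum.
enum Content {
    Text { body: String },
    Encrypted { ciphertext: Vec<u8> },
}

impl Content {
    // Mirrors the structural check above: only Encrypted is bounded here.
    fn validate(&self) -> Result<(), String> {
        match self {
            Content::Encrypted { ciphertext }
                if ciphertext.len() > MAX_SEALED_CIPHERTEXT_BYTES =>
            {
                Err(format!(
                    "sealed ciphertext too long: {} bytes",
                    ciphertext.len()
                ))
            }
            _ => Ok(()),
        }
    }
}

fn main() {
    // 1. Peer-supplied value, freshly decoded: validate BEFORE trusting
    //    fields. An oversized ciphertext is rejected before any decryption.
    let wire = Content::Encrypted {
        ciphertext: vec![0u8; MAX_SEALED_CIPHERTEXT_BYTES + 1],
    };
    assert!(wire.validate().is_err());

    // 2. After decrypting an in-bounds sealed payload, the inner Content
    //    must be validated again before its fields are used.
    let inner = Content::Text { body: "hi".into() };
    assert!(inner.validate().is_ok());
    println!("ok");
}
```

The sketch keeps only the shape of the two-step contract; the real `Content::validate` also bounds `filename` and `mime_type` as shown in the diff.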
@@ -727,6 +771,34 @@ mod tests {
}
}

#[test]
fn validate_rejects_oversized_sealed_ciphertext() {
let content = Content::Encrypted(SealedContent {
ciphertext: vec![0u8; MAX_SEALED_CIPHERTEXT_BYTES + 1],
nonce: [0u8; 12],
key_epoch: 0,
ratchet_counter: 0,
});
assert_eq!(
content.validate(),
Err(MessageValidationError::SealedCiphertextTooLong {
actual: MAX_SEALED_CIPHERTEXT_BYTES + 1,
max: MAX_SEALED_CIPHERTEXT_BYTES,
})
);
}

#[test]
fn validate_accepts_sealed_ciphertext_at_boundary() {
let content = Content::Encrypted(SealedContent {
ciphertext: vec![0u8; MAX_SEALED_CIPHERTEXT_BYTES],
nonce: [0u8; 12],
key_epoch: 0,
ratchet_counter: 0,
});
assert!(content.validate().is_ok());
}

#[test]
fn message_validate_delegates_to_content() {
let mut hlc = hlc::HLC::new();
2 changes: 1 addition & 1 deletion crates/network/src/traits.rs
@@ -158,7 +158,7 @@ pub trait Network: Send + Sync + 'static {
/// Access the blob store.
fn blobs(&self) -> &dyn BlobStore;

// TODO(#119): add connection_events() — stream relay up/down and direct
// TODO(#561): add connection_events() — stream relay up/down and direct
// peer connect/disconnect events so the client can surface connectivity
// status in the UI.
