auto-fix batch claude/friendly-maxwell-s1tqN 2026-05-04 by intendednull · Pull Request #630 · intendednull/willow

intendednull · 2026-05-04T17:33:01Z

batch sweep of 2026-05-04 audit (#600 sub-issues). 10/10 dispatches landed clean.

Fixes

Fixes audit F3 [docs]: stale TODO references closed issue #119 in network trait #603 — docs(network): retarget #119 TODO to #561 (TODO ref now points to live tracker).
Fixes audit F4 [docs]: just check doc-comment claims it runs browser tests but it does not #604 — docs(justfile): drop "browser" from check recipe comment (recipe doesn't run test-browser; check-all is everything gate).
Fixes audit F10 [security]: cargo install cargo-audit is unpinned in CI #610 — chore(ci): pin cargo-audit to 0.21.2 in audit job (closes supply-chain hole).
Fixes audit F24 [tech-debt]: EventKind::SetServerDescription has no check_permission rejection test #624 — test(state): reject non-admin SetServerDescription (mirrors RenameServer sibling, closes implicit-only gap).
Fixes audit F26 [quality]: Docker relay entrypoint passes --tcp-port/--ws-port flags the binary does not accept (HARD RUNTIME BREAK) #626 — fix(docker): align entrypoints with current relay/worker CLI (root-cause atomic fix; relay + worker entrypoints + docker-compose ports all aligned to post-iroh-migration CLI surface; docker stack now actually launches end-to-end).
Fixes audit F27 [docs]: README + CLAUDE.md "Local Development" tables reference removed relay ports 9090/9091 #627 — docs: relay port is 3340 (drop stale 9090/9091) (README + CLAUDE.md + justfile comment all updated; pairs with audit F26 [quality]: Docker relay entrypoint passes --tcp-port/--ws-port flags the binary does not accept (HARD RUNTIME BREAK) #626 doc-side).
Fixes audit F14 [robustness]: EventKind::UpdateProfile.display_name has no cap during apply (sibling fields use truncate_chars) #614 — fix(state): cap UpdateProfile.display_name at 64 chars (rejection mirrors SetProfile parity; identity-tier silent truncation breeds confusing duplicates so reject > truncate).
Fixes audit F15 [robustness]: EventKind::Reaction.emoji unbounded and per-message reaction cardinality unbounded #615 — fix(state): cap Reaction emoji + cardinality per message (32 byte / 32 count caps per audit; closes per-message DoS surface).
Fixes audit F13 [robustness]: SealedContent.ciphertext Vec<u8> has no size cap #613 — fix(messaging): cap SealedContent ciphertext size (64 KiB cap; validate now recurses into Encrypted; closes pre-decrypt allocation DoS).
Fixes audit F19 [security]: CSP test uses substring matching — additions of unsafe-inline / unsafe-hashes to existing directives pass undetected #619 — test(web): exact-set CSP directive check (block additions) (parser asserts each directive's full token set; adds dedicated negative tests for unsafe-inline outside style-src + data: in script-src).

Already-Fixed

None — same-day audit-to-fix gap, sweep correctly empty.

Parked

None — all 10 picks landed.

Skill Evolution

Two lessons from this run, committed on master in 36fe564 docs(skill): codify pre-flight reads + monitor-skip for short dispatches:

Coordinator pre-flight reads via mcp__github__get_file_contents before dispatching audit-pairs issues — discovered audit F26 [quality]: Docker relay entrypoint passes --tcp-port/--ws-port flags the binary does not accept (HARD RUNTIME BREAK) #626 had wider scope than audit prescribed (worker entrypoints also stale since iroh migration); led to root-cause atomic fix in one commit instead of symptom-only patch that would have left the docker stack still broken.
Skip Monitor for fast (<5 min) dispatches — foreground Agent return + immediate git fetch + reset --hard is sufficient. Monitor's overhead is only worth paying when the SHA-advance signal needs to be independent of agent completion (long dispatches, background-launched agents).

Lessons Learned

Sequential dispatch with pre-dispatch sync (git fetch + reset --hard origin/<branch>) kept master clean across all 10 commits. Zero rebase conflicts. The discipline of "always re-base on origin before dispatching the next implementer" is the load-bearing rule.
Pre-flight reading paid off heavily on audit F26 [quality]: Docker relay entrypoint passes --tcp-port/--ws-port flags the binary does not accept (HARD RUNTIME BREAK) #626. The audit's literal fix would have changed only 2–3 lines but left the docker stack still unstartable (worker entrypoints used --relay instead of --relay-url, --max-events-per-server instead of --max-events-per-author, and a libp2p multiaddr iroh doesn't accept). Coordinator pre-flight surfaced this; the implementer's brainstorm chose Option B (root-cause atomic) and shipped a working stack in one commit.
The pre-staged-tree pattern held in 9/10 dispatches — implementers found their prescribed diff already in the working tree at session start, verified, gated, committed. The skill already documents this as expected behavior; ran clean.
Stop-hook noise during mid-flight is now well-understood. Fired ~6 times across 10 dispatches. The skill's "stop-hook noise is normal" clause held the line — no false-alarm finalize-implementer dispatches.
The brainstorm-on-dispatch gate triggered correctly twice (audit F26 [quality]: Docker relay entrypoint passes --tcp-port/--ws-port flags the binary does not accept (HARD RUNTIME BREAK) #626, audit F14 [robustness]: EventKind::UpdateProfile.display_name has no cap during apply (sibling fields use truncate_chars) #614, audit F19 [security]: CSP test uses substring matching — additions of unsafe-inline / unsafe-hashes to existing directives pass undetected #619). Implementers ran self-driven brainstorms, picked the principled option (B atomic for audit F26 [quality]: Docker relay entrypoint passes --tcp-port/--ws-port flags the binary does not accept (HARD RUNTIME BREAK) #626, A reject for audit F14 [robustness]: EventKind::UpdateProfile.display_name has no cap during apply (sibling fields use truncate_chars) #614, A exact-set for audit F19 [security]: CSP test uses substring matching — additions of unsafe-inline / unsafe-hashes to existing directives pass undetected #619), and folded rationales into commit bodies. None of the brainstorms required human input.

Test plan

CI runs full just check workspace-wide (fmt + clippy + tests + WASM check) on master PR.
CI runs cargo test -p willow-state (252 tests, +3 from audit F14 [robustness]: EventKind::UpdateProfile.display_name has no cap during apply (sibling fields use truncate_chars) #614/audit F15 [robustness]: EventKind::Reaction.emoji unbounded and per-message reaction cardinality unbounded #615) — confirmed locally green per implementer reports.
CI runs cargo test -p willow-messaging (53 tests, +2 from audit F13 [robustness]: SealedContent.ciphertext Vec<u8> has no size cap #613) — confirmed locally green.
CI runs cargo test -p willow-web --tests (incl. new CSP exact-set tests from audit F19 [security]: CSP test uses substring matching — additions of unsafe-inline / unsafe-hashes to existing directives pass undetected #619) — confirmed locally green.
CI runs cargo audit with newly-pinned cargo-audit 0.21.2.
Manual smoke (post-merge): docker compose up should now launch all services without flag errors. (Out-of-band test; the PR's local validation was bash -n + docker compose config which both passed.)
Verify just check recipe comment matches actual recipe contents (drop of "browser" — visible in audit F4 [docs]: just check doc-comment claims it runs browser tests but it does not #604 commit).

Generated by Claude Code

#119 is closed but the connection_events() follow-up never landed. #561 is the live tracker; point readers there instead. Refs #603

Recipe doesn't run test-browser; check-all is the everything gate. Refs #604

Closes a supply-chain hole where the audit gate itself pulled an unpinned version. Future bumps become deliberate PRs. Refs #610

Mirror the RenameServer sibling test to make the permission check explicit at the state tier instead of only implicit via apply. Refs #624

The docker entrypoints + compose ports stopped matching the binaries after the iroh migration. Relay used --tcp-port/--ws-port (binary takes --relay-port) and --print-peer-id (it's --print-id), so the entrypoint died at line 6. Workers used --relay/--max-events-per-server plus a libp2p multiaddr RELAY_ADDR; iroh expects --relay-url with an HTTP URL like http://relay:3340 (peer ID isn't part of the URL — iroh discovers via the relay), so the relay-peer-id shared-volume dance is also dead. Mirroring scripts/dev.sh + scripts/setup-e2e.sh, which already use the canonical iroh CLI surface. Considered narrow fix (only --tcp-port/--ws-port + ports block per the audit's literal prescription) — rejected because relay still crashes on --print-peer-id and workers still can't start with stale flags, so the stack remains unusable. This is mechanical migration of the same root cause across all entrypoints. Refs #626

Pairs with #626 (docker port-surface fix); README + CLAUDE.md tables and the justfile relay recipe comment now match the iroh-relay binary's single --relay-port (default 3340). Refs #627

Closes the apply-side gap where `UpdateProfile { display_name: Some("a".repeat(10_000_000)) }` would have been materialised verbatim into both `state.profiles[author].display_name` and `state.members[author].display_name`, while sibling fields (pronouns/bio/tagline/etc.) silently truncate via `truncate_chars` and `SetProfile` already rejects above 64 chars. Chose rejection (mirroring `SetProfile`) over silent truncation (mirroring sibling profile fields) because `display_name` is identity- tier — silent truncation can produce confusing duplicate names ("alice" vs a truncated "alice…"). The audit explicitly cites `SetProfile` parity as the prescribed fix. Out of scope: a per-field byte cap inside `EventDag::insert` is the job of #611 and is not attempted here. Refs #614

Reject Reaction events whose emoji exceeds 32 bytes or whose new key would push the message past 32 distinct reactions. Closes the per-message DoS surface where a peer with SendMessages could broadcast multi-MB emoji strings or unbounded distinct reaction keys, since each unique emoji is cloned as a BTreeMap key on every receiver at replay. Adding an author to an existing emoji key remains free. Out of scope: byte-level cap on the per-event payload in EventDag::insert is tracked under #611. Refs #615

SealedContent.ciphertext was an unbounded Vec<u8> on the wire, so a peer could broadcast a multi-MB blob and force every receiver to allocate it during decode and again during open_content before AEAD verification could reject it — a pre-decrypt DoS surface. Add MAX_SEALED_CIPHERTEXT_BYTES (64 KiB — comfortably exceeds any chat payload plus the ChaCha20-Poly1305 16-byte tag) and recurse into Content::Encrypted from Content::validate so oversized ciphertexts are rejected before allocation amplifies. Refs #613

Closes the substring-match hole in `index_html_declares_content_security_policy`: a widened `script-src 'self' 'wasm-unsafe-eval' 'unsafe-eval' 'unsafe-inline'` would still satisfy the old `contains(directive)` check because the original substring was preserved verbatim. The test now parses the CSP `content` attribute, splits on `;`, and asserts each directive's full token set equals an exact allow-list (`BTreeMap<&str, BTreeSet<&str>>`) — additions, removals, and reorderings all fail loudly. Adds two defense-in-depth negative tests: - `csp_rejects_unsafe_inline_outside_style_src` — `'unsafe-inline'` (and `'unsafe-hashes'`) may not appear in any directive other than `style-src`. - `csp_rejects_data_in_script_src` — `data:` may not appear in `script-src`. The CSP itself is unchanged in this commit; tightening the production policy (e.g. closing the `https:` wildcard in `connect-src`/`img-src`) is tracked separately under #617. Considered alternative: keep substring matching and rely solely on the negative tests. Rejected — substring matching is the root cause and would still let arbitrary unrelated additions slip through unless they happened to match a forbidden literal. Refs #619 https://claude.ai/code/session_01ExFeqBfdFZBP1Y8jNGmFud

Two lessons from 2026-05-04 auto-fix run (10/10 dispatches landed clean): - pre-flight `mcp__github__get_file_contents` on cited siblings before dispatching audit-pairs issues — discovered #626 had wider scope than audit prescribed (worker entrypoints also broken since iroh migration); led to root-cause atomic fix instead of symptom-only patch. - skip Monitor for fast (<5 min) dispatches — foreground Agent return + immediate `git fetch + reset --hard` is sufficient. Reserve Monitor for long dispatches or background-launched agents. Refs run on branch claude/friendly-maxwell-s1tqN.

CI rescue for PR #630 — cargo-audit 0.21.2 errors on the upstream advisory-db at startup because RUSTSEC-2026-0041 (lz4_flex) uses CVSS 4.0, which 0.21.x cannot parse: parse error: TOML parse error at line 8, column 8 cvss = "CVSS:4.0/AV:N/AC:L/AT:P/PR:N/UI:N/VC:H/VI:N/VA:N/SC:N/SI:N/SA:N" unsupported CVSS version: 4.0 cargo-audit 0.22.x added CVSS 4.0 support. Bumping the pin to 0.22.1 (latest stable; published 2026-04-22). Verified locally: install clean, advisory-db loads (1067 advisories), Cargo.lock scan starts. The audit's prescribed pin (0.21.2) in #610's brief was wrong-by-default given the upstream advisory-db's evolution. Future bumps remain deliberate — pin discipline preserved. Refs #610.

- PLAN.md opened with "Built with Rust, libp2p, and Leptos" and named GossipSub/Kademlia/mDNS in its architecture box. Project migrated to iroh; see docs/specs/2026-03-29-iroh-migration-design.md. - README.md (overview + setup) + CLAUDE.md (architecture, crate layout, state-management, test-tier tree) cover everything PLAN duplicated, with current terminology. - Phase checklist (lines 66-157) was a historical artifact; every phase except Phase 9 read COMPLETE but doc didn't track code evolution. Active voice/video state lives in crates/web/src/voice.rs + specs. - "Deployed" section's port mentions (9090/9091) were a third drift site already flagged by audit #627 / PR #630. - Audit #607 explicitly accepted deletion as the principled fix vs rewrite-and-let-it-drift-again. - No references: grep -rn "PLAN.md" returned zero hits across repo. No include_str! / build.rs / test reads PLAN.md, so cargo test gate doesn't apply per the no-asset-coverage rule. Ran cargo fmt --check and cargo check --workspace; both green. Refs #607

intendednull · 2026-05-05T21:42:00Z

fix ci

Prior run of Playwright E2E on 4adfe2c failed; cargo-audit (rescued to 0.22.1 in 4adfe2c) and 6 other checks passed. Local reproduction showed widespread DAG-convergence timeouts AND iroh-relay "dial failed: timed out" / "MultipathNotNegotiated" / "Connection did not reach established state within timeout" errors in the relay log — network-layer symptoms typical of CI sandbox transient instability, not regression. State-tier changes in this batch (#613/#614/#615) cannot affect the failing test paths: - #613 caps SealedContent.ciphertext at 64 KiB, validated in Message::validate (chat-message store path) — invite-flow tests never construct Content::Encrypted (no channel keys exchanged). - #614 rejects UpdateProfile.display_name > 64 chars — Alice/Bob fixtures use 5-char names. - #615 rejects Reaction emoji > 32 B / cardinality > 32 per message — invite-flow tests never react. PR #599 (no state-machine changes, same `main` base) passed Playwright E2E earlier today at 09:51 UTC, ruling out a stable preexisting break on `main`. Empty retrigger to confirm flake. If this run also fails identically, bisect-revert the 3 state-cap commits one at a time will follow.

intendednull · 2026-05-08T07:42:04Z

try again!

Run 2 (4adfe2c → 25333877355): Playwright failed, all others green. Run 3 (3f689a4 → 25404028005): Playwright PASSED, Browser tests (wasm-pack + Firefox) failed instead. Different browser-tier job each run. No code changes between Run 2 and Run 3 (3f689a4 was an empty commit). Confirms rotating CI sandbox flake on browser-tier jobs, not a regression in this PR. Retriggering once more.

claude added 13 commits May 4, 2026 16:40

chore: open auto-fix batch claude/friendly-maxwell-s1tqN

d947816

docs(network): retarget #119 TODO to #561

435f5ea

#119 is closed but the connection_events() follow-up never landed. #561 is the live tracker; point readers there instead. Refs #603

docs(justfile): drop "browser" from check recipe comment

0fc924a

Recipe doesn't run test-browser; check-all is the everything gate. Refs #604

chore(ci): pin cargo-audit to 0.21.2 in audit job

41438f4

Closes a supply-chain hole where the audit gate itself pulled an unpinned version. Future bumps become deliberate PRs. Refs #610

test(state): reject non-admin SetServerDescription

0f402b7

Mirror the RenameServer sibling test to make the permission check explicit at the state tier instead of only implicit via apply. Refs #624

docs: relay port is 3340 (drop stale 9090/9091)

902a04b

Pairs with #626 (docker port-surface fix); README + CLAUDE.md tables and the justfile relay recipe comment now match the iroh-relay binary's single --relay-port (default 3340). Refs #627

intendednull mentioned this pull request May 5, 2026

auto-fix batch claude/friendly-maxwell-3QJhB 2026-05-05 #631

Merged

8 tasks

intendednull merged commit f34b374 into main May 8, 2026
8 checks passed

intendednull deleted the claude/friendly-maxwell-s1tqN branch May 8, 2026 09:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

auto-fix batch claude/friendly-maxwell-s1tqN 2026-05-04#630

auto-fix batch claude/friendly-maxwell-s1tqN 2026-05-04#630
intendednull merged 15 commits into
mainfrom
claude/friendly-maxwell-s1tqN

intendednull commented May 4, 2026

Uh oh!

intendednull commented May 5, 2026

Uh oh!

intendednull commented May 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

intendednull commented May 4, 2026

Fixes

Already-Fixed

Parked

Skill Evolution

Lessons Learned

Test plan

Uh oh!

intendednull commented May 5, 2026

Uh oh!

intendednull commented May 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants