Skip to content

auto-fix batch claude/friendly-maxwell-cCGXk 2026-05-04#598

Merged
intendednull merged 10 commits into
mainfrom
claude/friendly-maxwell-cCGXk
May 4, 2026
Merged

auto-fix batch claude/friendly-maxwell-cCGXk 2026-05-04#598
intendednull merged 10 commits into
mainfrom
claude/friendly-maxwell-cCGXk

Conversation

@intendednull
Copy link
Copy Markdown
Owner

@intendednull intendednull commented May 4, 2026

Scheduled /resolving-issues sweep against the 2026-05-03 general-audit (master ticket #567). Five small-scope fixes landed sequentially; one follow-up filed (#597). Picked candidates that did NOT overlap with PR #566 + PR #596's in-flight files.

Fixes

  • Fixes audit F8 [security]: Ed25519 verify uses non-strict mode (signature malleability) #575docs(identity): pin verify_strict invariant + test (6a765ea). Stale-audit-with-residual-gap path: audit assumed key.verify(...) was non-strict, but iroh_base::PublicKey::verify (re-exported and used here) already calls verify_strict internally (iroh-base-0.98.0/src/key.rs:134-136). Implementer pinned the invariant via doc comment + added verify_rejects_non_canonical_s_component regression test (flips top bit of byte 63 of the S scalar to push above the curve order). Defense-in-depth: any future refactor that bypasses iroh-base's wrapper must continue to use the strict primitive.
  • Fixes audit F9 [security]: DAG anti-DoS caps run AFTER expensive signature verify #519fix(state): check deps cap before sig verify (a3cc10c). Security HIGH — defeats SendMessages-perm DoS where attacker forces every receiver to pay full ~50µs Ed25519 verify cost on rejected over-cap events. Reordered EventDag::insert: MAX_EVENT_DEPS + MAX_ENCRYPTED_KEY_BYTES cheap structural caps moved BEFORE event.verify(). Added insert_deps_cap_rejects_before_signature_verify regression test (constructs over-cap event with all-zero signature, asserts InsertError::DepsTooLong — proves cap fires before sig-verify path runs). SEC-V-07 comment updated to reflect the new ordering rationale (the original intent comment already prescribed this — fix matched implementation to that intent).
  • Fixes audit F16 [security]: Content::File size_bytes attacker-declared, filename/mime unbounded #583fix(messaging): cap Content::File filename + mime (9a700dd). Added MAX_FILENAME_BYTES = 255 (POSIX) + MAX_MIME_BYTES = 255 (RFC 6838 aligned), MessageValidationError enum, Content::validate() + Message::validate(). Validation wired at InMemoryStore::insert in messaging crate (closest free inbound boundary; client/listeners.rs is locked under PR auto-fix batch claude/friendly-maxwell-M5xB6 2026-05-03 #566 — gap noted in code comment). Advisory-only doc comment on size_bytes (attacker-declared; never use for preallocation). 6 new tests covering oversize rejection, boundary acceptance (255 bytes), empty values, no-op for non-File variants, Message::validate delegation. No size_bytes-driven Vec::with_capacity / reserve calls found in the tree, so no follow-up needed.
  • Fixes audit F17 [security]: CSP img-src lacks https: but auto-embed renders external images #584fix(web): allow https: img-src for auto-embed (39a9f1d) + CI rescue test(web): track CSP img-src https: change (89f55a2). CSP img-src 'self' data: blob:img-src 'self' https: data: blob:. Design intent already established by PR [SEC-W-04] Peer-supplied URLs auto-embedded as <img> with no scheme/host allowlist — passive-tracking vector #243 (which added referrerpolicy="no-referrer" to external <img> embeds — clear positive choice to render external images while protecting referrer privacy); coordinator pre-decided narrowing eliminated the audit's two-option ambiguity. CI Test job initially failed because crates/web/tests/static_assets.rs:133 asserts the exact pre-change CSP substring; rescue commit 89f55a2 updated the assertion to track the new directive. Lesson codified in skill (see ## Skill Evolution).
  • Fixes audit F45 [robustness]: WorkerCache uses Instant::now() (native-only) in willow-client lib crate #545fix(client): use web_time::Instant in WorkerCache (6846fee). Use-rename std::time::Instantweb_time::Instant in crates/client/src/worker_cache.rs. Added web-time = "1" workspace dep + per-crate dep on willow-client. Coordinator-narrowed scope: audit's two options (cfg-gate eviction wasm-out vs web_time::Instant swap) decisively answered — option B is correct because cfg-gating eviction would mean workers never expire on the web client (wrong behavior). Wasm-target runtime test skipped (wasm-bindgen-test not wired into willow-client; compile-time linkage gate via cargo check --target wasm32-unknown-unknown covers the audit's primary concern; gap noted in commit body).

Already-Fixed

None this run. The audit at #567 was filed 2026-05-03 against main @ 6404719 — the same HEAD this run started from. No fixes can have landed in the audit-to-fix gap by definition (per the prior PR #566 + #596 lessons documenting same-day audit-to-fix as the expected zero-yield case). Sweep was a one-line check, no time wasted.

Parked

None. All 5 picks landed cleanly. No mid-flight aborts; no finalize-implementer rescues during dispatch. (CI rescue commit 89f55a2 was a coordinator-side reaction to a master-PR CI failure, not a mid-dispatch finalize.)

Filed mid-run (follow-ups, NOT closed by this PR)

Avoided picks (overlap with PR #566 + #596 files in flight)

Skill Evolution

Two skill edits this run:

  • 450a4e8 docs(skill): cargo.lock conflicts are additive — adds a sub-paragraph to Implementer Agent step 6's mechanical-call-site note. Codifies that Cargo.lock (and [workspace.dependencies] table additions) being in another open PR's file list does NOT block a fix that needs a new dep — additions are strictly additive in Cargo.lock, merge resolution is mechanical, refusing creates infinite "wait for PR X to merge" deadlocks. This run's audit F45 [robustness]: WorkerCache uses Instant::now() (native-only) in willow-client lib crate #545 dispatch surfaced the rule (web_time::Instant for WorkerCache wasm fix needed a workspace dep added while Cargo.lock was in PR auto-fix batch claude/friendly-maxwell-M5xB6 2026-05-03 #566's file list; coordinator brief explicitly authorised the additive churn so implementer didn't abort).
  • 57fddc6 docs(skill): static-asset tests need cargo test — adds a sub-paragraph to Implementer Agent step 7's local-merge-gate. Codifies that non-Rust file changes (HTML / CSS / JSON / TOML / YAML / sw.js) still need cargo test -p <touched-crate> because integration tests commonly assert on static-asset contents (e.g. crates/web/tests/static_assets.rs greps index.html for required CSP directives). This run's audit F17 [security]: CSP img-src lacks https: but auto-embed renders external images #584 brief told the implementer to skip cargo test on the basis "no Rust code changed"; CI Test job failed when static_assets.rs:133's exact-substring assertion missed the new https: token. CI rescue 89f55a2 fixed it. Skill edit closes the loophole so future briefs don't repeat.

Lessons Learned

Test plan

Master-PR CI is the load-bearing gate. Each implementer ran the scoped subset locally (fmt + native clippy + native test + wasm32 check + wasm32 clippy on touched crates).

CI gates to verify on this PR:

Cargo.lock conflict expected at merge time with PR #566 (which also touches Cargo.lock). Conflict is strictly additive (this PR adds web-time rows; PR #566's rows untouched) — mechanical resolution.

non_admin_set_profile_is_accepted (#565) status: intermittent. Both polarities reproduced across this run's local runs. CI Test job on 89f55a2 reported it red; expected. Known pre-existing, not introduced by this PR.


Generated by Claude Code

claude added 9 commits May 4, 2026 00:10
The audit (issue #575) flagged `verify` for using ed25519-dalek's
non-strict API. Investigation: `iroh_base::PublicKey::verify`, which our
standalone `verify` delegates to, already calls `verify_strict`
internally (iroh-base 0.98.0 src/key.rs:134-136). The vulnerability does
not exist in the deployed call chain, but the invariant is load-bearing
and silently breakable by future refactors.

Defense in depth: document the strict-mode dependency at the call site
so anyone changing this function knows the contract, and add a
regression test that constructs a non-canonical S component (top bit of
byte 63 set, pushing S above the curve order ℓ) and asserts `verify`
rejects it. The test pins the malleability vector closed regardless of
whether the underlying iroh-base wrapper changes.

Refs #575
EventDag::insert previously called event.verify() (Ed25519 +
bincode + blake3, ~50us) BEFORE the cheap structural caps
(MAX_EVENT_DEPS, MAX_ENCRYPTED_KEY_BYTES). An attacker holding any
SendMessages permission could broadcast events with pathologically
large deps; every receiver paid full sig-verify cost on each
rejected event.

Reorder so cheap O(1)/O(n) syntactic checks fire first and the
crypto verify path only runs once an event passes the structural
caps. The intent comment at the top of insert() already described
this design ("Reject at the inbound DAG boundary so over-cap events
never even reach applied_events"); this commit makes the code match
that intent.

Adds a regression test that constructs an event with deps.len() >
MAX_EVENT_DEPS and a clobbered (all-zero) signature; the test
asserts InsertError::DepsTooLong (not InvalidSignature), which
proves the structural cap short-circuits before sig-verify.

Pre-existing willow-state::tests_materialize::non_admin_set_profile_is_accepted
failure (regression from PR #505, tracked at #565) is unrelated to
this change and was already failing on HEAD.

Refs #519
Content::File carried a self-declared u64 size_bytes plus unbounded
filename / mime_type strings, all peer-supplied. Until now nothing
rejected a 256 KB filename or MIME, and the size field had no warning
that it was attacker-declared.

Add MAX_FILENAME_BYTES = 255 (POSIX NAME_MAX) and MAX_MIME_BYTES = 255
(POSIX-aligned, comfortably above RFC 6838's 127+127 type/subtype
limit) and expose Content::validate / Message::validate that reject
oversized values with MessageValidationError. Wire validation into
InMemoryStore::insert so peer-supplied messages cannot be persisted
without first clearing the structural bounds. Document size_bytes as
advisory-only — UIs may display it but must not use it for any
preallocation or trust decision.

The natural earlier ingress point sits in client/listeners.rs, but
that file is locked under PR #566; an inline NOTE in store.rs flags
the follow-up so the client side can also gate validation once #566
lands. No size_bytes preallocation hazards were found in the tree.

Refs #583
PR #243 added referrerpolicy="no-referrer" to external <img> tags in
the chat auto-embed code path (crates/web/src/components/message.rs),
signalling clear intent to render external images while protecting
referrer privacy. CSP img-src 'self' data: blob: blocked them in the
browser, so embeds silently failed.

Add https: (HTTPS only — http: would risk a mixed-content downgrade
on the page) to img-src, restoring the design intent. is_image_url()
also accepts http:// schemes; the resulting CSP/filter mismatch is
out-of-scope for this fix and will be tracked separately if it
proves to be a real path peers exercise.

Browser-side CSP enforcement was not testable in the dispatch sandbox
(no Chrome/Firefox); change is verified syntactically and via cargo
fmt. Existing Playwright e2e CI gate on the master branch will catch
any HTML / CSP regressions on merge.

Refs #584
WorkerCache called std::time::Instant::now() in a willow-client lib
crate. willow-client must build for wasm32-unknown-unknown per the
dual-target rule, and std::time::Instant compiles on WASM but its
monotonic source is not wired through Performance.now() without
explicit gating. Documented trip-hazard.

Replace std::time::Instant with web_time::Instant. API-compatible
drop-in: native uses std::time::Instant, WASM uses Performance.now()
automatically. No behaviour change on native; correct semantics on
WASM. TTL eviction continues to work on both targets (gating it out
on WASM would have left dead workers in the cache, defeating TTL).

Picked over cfg-gate-eviction (wrong semantics on web) and HLC-derived
clock (wider scope, no benefit). Both rejected in favour of the
minimal-change portable replacement.

Cargo.lock churn: web-time was already a transitive dep; the only
addition is willow-client's direct entry. Strictly additive.

Test gap: wasm-bindgen-test is not wired into willow-client, so the
wasm-target test from the audit suggestion was skipped. Coverage is
provided by `cargo check --target wasm32-unknown-unknown -p
willow-client` and the matching clippy gate, both of which pass and
catch the original compile-time linkage concern.

Refs #545
Skill say "skip files in flight" for in-flight PR overlap. Trip
hazard: Cargo.lock in PR X's file list, your fix need new dep,
implementer think file locked, abort. Wrong — Cargo.lock add
strictly additive (your row append, theirs stay). Merge trivial.

Refusing on Cargo.lock alone create infinite "wait for PR X" wait
loop, block dep upgrade + wasm-fix forever. Note churn in commit
body, ship the dep, let merge handle.

Same logic apply to workspace Cargo.toml [workspace.dependencies]
table — additive only. Coordinator brief should pre-authorise.

Surface in PR #545 dispatch (web_time::Instant for WorkerCache
wasm fix); coordinator brief explicit authorise the Cargo.lock
churn so implementer no abort.
Commit 39a9f1d updated index.html CSP img-src to include https:
(for auto-embed support per #584), but the static_assets integration
test asserts the exact pre-change substring `img-src 'self' data: blob:`
is present. The substring no longer appears verbatim, so CI's Test job
fails. Update REQUIRED_CSP_DIRECTIVES to match the production CSP.

Refs #584
#584 brief tell implementer skip cargo test, "no Rust code change,
HTML only." Wrong. crates/web/tests/static_assets.rs assert exact
CSP directive substring in index.html. Change CSP, substring move,
test red on master-PR CI (#598 Test job failed).

Cost ~10 min CI rescue dispatch + extra commit on master branch.
Avoidable: even non-Rust change need cargo test on crates whose
integration tests read static asset at runtime (HTML/CSS/JSON/TOML
/YAML/sw.js).

Skill now mandate cargo test -p <touched-crate> for any file in
crates/<x>/ regardless of language. Few seconds local, save CI red
+ rescue dispatch + commit noise.
@intendednull
Copy link
Copy Markdown
Owner Author

fix ci plz

The non_admin_set_profile_is_accepted test was flaky: admin's
GrantPermission and alice's SetProfile were causally independent
in the DAG (each on its own per-author prev chain, empty deps),
so materialize()'s topo-sort between them depended on HashMap
iter order. When SetProfile applied before GrantPermission, the
membership gate (PR #505) rejected it and the assertion failed.

Wire SetProfile's deps to the GrantPermission event hash. This
forces topo-sort to apply the grant first, deterministically.
Production behavior (membership gate) unchanged; this is the
test-side fix from issue #565's two valid options.

Refs #565
@intendednull intendednull merged commit 2495af0 into main May 4, 2026
8 checks passed
@intendednull intendednull deleted the claude/friendly-maxwell-cCGXk branch May 4, 2026 09:26
intendednull pushed a commit that referenced this pull request May 4, 2026
Test was flaky (#565). After PR #505's membership gate, the
test depended on HashMap iter order in topo-sort: when SetProfile
was visited before GrantPermission, alice failed the membership
gate and the profile was rejected (`left: None, right: Some("Alice")`).

Add explicit causal dep on grant.hash so topo-sort applies
GrantPermission first regardless of iter order. Mirrors the
fix already on PR #598 for the same test; cherry-picking it
here keeps PR #599's master-PR CI green now that the human
asked for the fix in this batch rather than waiting for #598
to merge.

Refs #565
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment