Skip to content

auto-fix batch claude/friendly-maxwell-EjeTz (2026-05-01)#505

Merged
intendednull merged 4 commits into
mainfrom
claude/friendly-maxwell-EjeTz
May 1, 2026
Merged

auto-fix batch claude/friendly-maxwell-EjeTz (2026-05-01)#505
intendednull merged 4 commits into
mainfrom
claude/friendly-maxwell-EjeTz

Conversation

@intendednull
Copy link
Copy Markdown
Owner

2 small-scope fixes from open issue queue + 1 already-fixed close + skill update folding 2 wait-pattern lessons.

Fixes

  • Fixes [wasm-build] cargo check --target wasm32-unknown-unknown -p willow-network --all-features fails on tokio import #502 — gate crates/network/src/mem.rs behind cfg(all(not(target_arch = "wasm32"), any(test, feature = "test-utils"))) so cargo check --target wasm32-unknown-unknown -p willow-network --all-features no longer trip on tokio::sync::broadcast + tokio::select! import. Single-line edit in crates/network/src/lib.rs:22. No mem::* re-exports in lib.rs need mirror (only topics::* + traits::* re-exported). Acceptance gate cargo check --target wasm32-unknown-unknown -p willow-network --all-features exit 0 + regression just check-wasm still exit 0 + cargo test -p willow-network 18 unit + 5 integration pass + native + wasm clippy -D warnings clean. Commit 58a2965.

  • Fixes [security] Unrestricted events (Pin/Unpin/SetProfile) bypass membership check after kick #177 — gate 4 EventKind handler (SetProfile, UpdateProfile, PinMessage, UnpinMessage) in crates/state/src/materialize.rs on state.members.contains_key(&event.author) per RotateChannelKey defense-in-depth pattern. Issue cite 3 EventKinds (Set/Pin/Unpin); implementer brainstorm pick Option B (expand to UpdateProfile too — same vulnerability shape, identical fix block, in-scope per skill's "mechanical migration" clarification). Spec doc docs/specs/2026-04-12-state-authority-and-mutations.md split old "Unrestricted" tier into **Member-only (server state)** (the 4 EventKinds) + **Per-identity preference (no gate)** (MuteChannel/MuteGrove, intentionally ungated since preferences survive kick). Catch-all comment block in materialize.rs:301-321 updated. Test cascade +756 LOC in crates/state/src/tests/permissions.rs (4 handler × 3 case = 12 new tests: member-applies + kicked-rejects + never-joined-rejects, w/ setup_kicked_peer() helper). Mechanical test adaptation in sync.rs/tests/dag.rs/tests/stress.rs add causal deps link for SetProfile event after grant — load-bearing for new gate to land cleanly under DAG topological ordering. State-crate test count 227 → 239 (+12), all pass. Client-crate test 316 pass (no regression). Implementer's EventKind::* arm scan confirm no other handler with same gap (admin-only + permission-gated + governance + ChannelRevive + genesis all properly gated; MuteChannel/MuteGrove correctly ungated). Commit 8c40e32.

Already-Fixed

Parked

None this run. No mid-fix abort, no follow-up issue filed.

Skill Evolution

Single skill commit d5d6457 (docs(skill): full-SHA capture + mid-flight finalize-implementer pattern). Two failure modes from this run fold into existing ### Waiting for implementer commits without polling section + new ### Implementer cuts off mid-flight (uncommitted edits, agent terminated) subsection:

  1. Full-SHA capture rule. A typo'd hash extension (31c785155a02… vs real 31c7851bf8c6…) make until-loop's first iteration exit immediately on mismatched comparison, then coordinator spuriously conclude implementer crash + dispatch redundant finalize agent. Mandate capture pre-dispatch SHA into shell var via git log -1 --format='%H' before arming wait — never type/synthesize from short prefix.

  2. Mid-flight finalize-implementer pattern. Both implementer this run get cut off after substantive edits but before gate-and-commit step (truncate agent-completion summary, uncommitted tree match scope, stop-hook complain). Document as recognized failure mode w/ finalize-agent recipe: brief on existing diff, hand gate list verbatim, no need redo brainstorm/TDD-red. Also call out two false-alarm pattern (wrong-SHA wait + agent-notification lag) that produce redundant finalize dispatch even when work already cleanly commit.

Lessons Learned

Test plan


Generated by Claude Code


Generated by Claude Code

claude added 4 commits May 1, 2026 00:39
…eatures wasm-check passes

`crates/network/src/mem.rs` uses `tokio::sync::broadcast` and `tokio::select!`
which are unsupported on `wasm32-unknown-unknown`. Previously gated only on
`#[cfg(any(test, feature = "test-utils"))]`, so `cargo check
--target wasm32-unknown-unknown -p willow-network --all-features` enabled the
`test-utils` feature, pulled `mem.rs` in, and failed on the tokio imports.

`MemNetwork` is documented as a native-only test double; production wasm uses
`IrohNetwork`. Adding `not(target_arch = "wasm32")` to the existing gate keeps
native test/test-utils consumers working while removing `mem.rs` from the wasm
build entirely.

Verified the only `pub use` re-exports in `crates/network/src/lib.rs` are of
`topics::*` and `traits::*` (no `mem::*` re-exports to mirror), and that all
in-tree consumers of `willow_network::mem::*` (worker/relay/agent/client tests
and binaries) are native-only.

Acceptance gate output:

    $ cargo check --target wasm32-unknown-unknown -p willow-network --all-features
        Checking iroh-gossip v0.98.0
        Checking willow-network v0.1.0 (/home/user/willow/crates/network)
        Finished `dev` profile [unoptimized + debuginfo] target(s) in 2.93s

Used the fallback for `just check-wasm` (`cargo check --target
wasm32-unknown-unknown -p willow-network`) because `just` is not on PATH in this
sandbox; it also passed.

Refs #502
… post-kick authority

Refs #177.

Four EventKind handlers (`SetProfile`, `UpdateProfile`, `PinMessage`,
`UnpinMessage`) previously fell through `required_permission()`'s
`_ => None` arm with zero authority enforcement. A kicked peer — or any
peer that never joined — could still emit these events, and replaying
peers would apply them. The handlers now reject events whose author is
not a current member.

# Brainstorm: Option A vs Option B

Option A: fix only the 3 EventKinds named in the issue
(`SetProfile`, `PinMessage`, `UnpinMessage`).
Option B: also include `UpdateProfile`, which has the same structural
vulnerability and the same fix shape.

Chose Option B. `UpdateProfile` is in the same `_ => None` comment
block cited by the issue body, has identical authority semantics ("any
current member"), and a partial fix would leave a known-broken handler
of the same shape in place — exactly the failure mode #177 calls out.
The expansion is documented in the source comments and in the spec
doc.

# Test-adaptation cascade

Existing tests in `sync.rs`, `tests/dag.rs`, and `tests/stress.rs`
emitted `SetProfile` events from peers that were members but whose
events lacked causal `deps` linking back to the membership grant.
Under topological replay, this raced: depending on apply order, the
grant might not yet have been applied when the `SetProfile` was
processed, and the new gate would reject it. Fixed by adding explicit
`deps` so each `SetProfile` causally depends on the author's grant.
This is mechanical (≤ a few LOC per call site) and load-bearing for
the gate to land cleanly — without it, existing tests would
intermittently fail under the new authority check.

# Spec update

`docs/specs/2026-04-12-state-authority-and-mutations.md` splits the
old "Unrestricted" tier row into:
- **Member-only (server state)**: the 4 EventKinds gated here, plus
  `ChannelRevive` (already gated separately).
- **Per-identity preference (no gate)**: `MuteChannel`, `MuteGrove`
  (personal preferences, persist across kicks).

The catch-all comment in `required_permission()` and the spec doc
both gain a "Intentionally unrestricted still requires membership"
clarification so future reviewers don't misread the `_ => None` arm
as "no check at all".

# Test count delta

Before: 227 state-crate tests.
After:  239 state-crate tests (+12 = 4 handlers × 3 cases each:
positive `_by_member_is_applied`, negative
`_by_kicked_member_is_rejected`, negative
`_by_never_joined_signer_is_rejected`).

# Verification

```
$ cargo test -p willow-state
test result: ok. 239 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 117.87s
```

All 6 gates green: state tests, native clippy, WASM check, WASM
clippy, fmt, client tests (316 passed).

https://claude.ai/code/session_012hKSRvQYXkqak2zBANqfCZ
Two failure modes from this run folded into the wait-pattern section:

1. Full-SHA capture rule. A typo'd hash extension (31c785155a02… vs the
   real 31c7851…) made the until-loop's first iteration exit
   immediately on the mismatched comparison, then the coordinator
   spuriously concluded the implementer had crashed and dispatched a
   redundant finalize agent. Mandate capturing the pre-dispatch SHA
   into a shell var via git log -1 --format='%H' before arming the
   wait, never typing/synthesizing from a short prefix.

2. Mid-flight finalize-implementer pattern. Both implementers this
   run got cut off after substantive edits but before the gate-and-
   commit step (truncated agent-completion summary, uncommitted tree
   matching scope, stop-hook complaining). Document this as a
   recognized failure mode with a finalize-agent recipe: brief on
   the existing diff, hand the gate list verbatim, no need to redo
   brainstorming or TDD-red. Also note the two false-alarm patterns
   (wrong-SHA wait + agent-notification lag) that produce redundant
   finalize dispatches even when the work already cleanly committed.

Refs: #502, #177 (the run that surfaced both lessons).
@intendednull intendednull merged commit 86da485 into main May 1, 2026
8 checks passed
@intendednull intendednull deleted the claude/friendly-maxwell-EjeTz branch May 1, 2026 04:47
intendednull pushed a commit that referenced this pull request May 3, 2026
Prior PR #511 lessons dismissed
`willow-state::tests_materialize::non_admin_set_profile_is_accepted` as
a sandbox-side flake. This run reproduced it cleanly on coordinator
HEAD post-PR #505 (which added the SetProfile membership gate) — the
"flake" was a real regression all along, just exposed once the gating
PR landed. Filed #565.

Strengthen the implementer-flagged-rot section: always re-verify on
coordinator HEAD; don't rely on a prior dismissal alone. Rot
accumulates between runs; a previously-flaky symptom can become a real
regression as PRs merge.

Refs auto-fix batch claude/friendly-maxwell-M5xB6.
intendednull pushed a commit that referenced this pull request May 4, 2026
EventDag::insert previously called event.verify() (Ed25519 +
bincode + blake3, ~50us) BEFORE the cheap structural caps
(MAX_EVENT_DEPS, MAX_ENCRYPTED_KEY_BYTES). An attacker holding any
SendMessages permission could broadcast events with pathologically
large deps; every receiver paid full sig-verify cost on each
rejected event.

Reorder so cheap O(1)/O(n) syntactic checks fire first and the
crypto verify path only runs once an event passes the structural
caps. The intent comment at the top of insert() already described
this design ("Reject at the inbound DAG boundary so over-cap events
never even reach applied_events"); this commit makes the code match
that intent.

Adds a regression test that constructs an event with deps.len() >
MAX_EVENT_DEPS and a clobbered (all-zero) signature; the test
asserts InsertError::DepsTooLong (not InvalidSignature), which
proves the structural cap short-circuits before sig-verify.

Pre-existing willow-state::tests_materialize::non_admin_set_profile_is_accepted
failure (regression from PR #505, tracked at #565) is unrelated to
this change and was already failing on HEAD.

Refs #519
intendednull pushed a commit that referenced this pull request May 4, 2026
The non_admin_set_profile_is_accepted test was flaky: admin's
GrantPermission and alice's SetProfile were causally independent
in the DAG (each on its own per-author prev chain, empty deps),
so materialize()'s topo-sort between them depended on HashMap
iter order. When SetProfile applied before GrantPermission, the
membership gate (PR #505) rejected it and the assertion failed.

Wire SetProfile's deps to the GrantPermission event hash. This
forces topo-sort to apply the grant first, deterministically.
Production behavior (membership gate) unchanged; this is the
test-side fix from issue #565's two valid options.

Refs #565
intendednull pushed a commit that referenced this pull request May 4, 2026
Test was flaky (#565). After PR #505's membership gate, the
test depended on HashMap iter order in topo-sort: when SetProfile
was visited before GrantPermission, alice failed the membership
gate and the profile was rejected (`left: None, right: Some("Alice")`).

Add explicit causal dep on grant.hash so topo-sort applies
GrantPermission first regardless of iter order. Mirrors the
fix already on PR #598 for the same test; cherry-picking it
here keeps PR #599's master-PR CI green now that the human
asked for the fix in this batch rather than waiting for #598
to merge.

Refs #565
intendednull pushed a commit that referenced this pull request May 4, 2026
The `non_admin_set_profile_is_accepted` test emitted alice's
SetProfile via `do_emit` (deps: vec![]), so topological_sort
ordering depended on hash. When alice's SetProfile sorted before
admin's GrantPermission, the post-#505 membership gate at
materialize.rs:548-555 silently rejected it and the assertion
failed only when the full suite ran.

Mirror the existing pattern at tests/materialize.rs:388-398:
capture the grant hash and use `dag.create_event` directly with
explicit `deps: vec![grant.hash]` so the SetProfile causally
follows the grant under any topological ordering.

Production-side fix (membership-pending buffer in `apply_event`)
is the larger refactor explicitly deferred per #565.

Audit confirms no other test currently relies on lucky topo-sort
to land a non-genesis-author gated event: stress.rs's
non-admin SetProfile loops only assert determinism / sort cost,
not application; permissions.rs uses ManagedDag's incremental
apply path; all SetProfile/UpdateProfile/Pin/Unpin/ChannelRevive
sites in materialize.rs use the genesis author (admin/owner) who
is already a member.

Refs #565
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants