auto-fix batch claude/friendly-maxwell-M5xB6 2026-05-03 #566
Merged
Lib crates use tracing for observability per CLAUDE.md. Refs #551
Triage of audit F34 (#534): every spec.md-named TODO either cites a live issue / plan path or describes a closed gate.
- traits.rs:161: #119 closed before placeholder fixed; cite #561.
- listeners.rs:396: cite #382 (heads-based delta sync).
- views.rs:95: cite #563 (multi-grove members plumbing).
- views.rs:600, message.rs:267 + 1125: cite #562 (WhisperStart wire).
- message.rs:1343 + 1406: cite #564 (mobile sheet emoji recency + full-picker route, distinct from desktop #186).
- views.rs:511 + 700, message.rs:857 + 866, mention.rs:44: cite plan docs/plans/2026-04-21-ui-phase-2c-profile-card.md.
- views.rs:481 + state.rs:142: rewrite stale doc text; the sync-queue gate is closed by Phase 2b (docs/plans/2026-04-21-ui-phase-2b-sync-queue.md).
Comment-only changes; no behaviour drift. cargo fmt / clippy -D warnings / wasm32 check on web + client all clean.
Refs #534
Adds a focused unit test that asserts `set_default_server` actually takes effect, via the only observable read path (events routed through `on_event` are queryable under the configured server ID). Verifies the default value, the post-setter routing, and last-write-wins across multiple setter calls. Also documents that the field is in-memory only and not persisted to SQLite. Closes the audit gap where renaming the field or deleting the assignment in the setter would have slipped past CI. Refs #541
Audit F42 flagged set_config as a public mutator covered only indirectly via bootstrap_tests. Add a focused tests/search.rs module with two assertions:
- set_config_round_trip pins the SetConfig + GetConfig handler pair (write c, read c back).
- set_config_changes_rebuild_query_results indexes a two-grove corpus, opts grove g1 out via per_grove_enabled, rebuilds, and asserts only g0's hit survives. The executor never consults config (gating happens at write time in message_allowed), so rebuild is the path that exposes the post-set_config delta.
Refs #542
Swap rand::thread_rng() for rand::rngs::OsRng so the agent crate's auth module stays portable to wasm32 targets, per CLAUDE.md's WASM-portability guidance for lib crates. Refs #546
Permission variants previously leaked into role lists and other UI
surfaces via `format!("{p:?}")`, which prints variant identifiers
(`SyncProvider`, `ManageChannels`). Debug output is not a stable
format for UI text, and any wording change means grepping across
multiple call sites.
Add `impl Display for Permission` returning user-facing strings:
SyncProvider -> "Sync provider"
ManageChannels -> "Manage channels"
ManageRoles -> "Manage roles"
SendMessages -> "Send messages"
CreateInvite -> "Create invite"
__UnknownLegacy -> "Unknown" (sentinel; should never reach UI)
Switch the role-list render paths in `crates/web/src/state.rs` and
`crates/client/src/views.rs` from `{p:?}` to `{p}`. Migrate the
permission toggle list in `crates/web/src/components/roles.rs` from
the parallel `PERMISSION_NAMES: &[&str]` constant to a typed
`TOGGLABLE_PERMISSIONS: &[Permission]`, eliminating the
string-round-trip via `Permission::from_name` and centralising the
labels behind Display. Lock the strings in a unit test.
Refs #550
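A minimal sketch of the Display change described above. The variant names and label strings are taken from the commit message; the enum definition itself, the contents of `TOGGLABLE_PERMISSIONS`, and the `role_labels` helper are assumptions made for illustration and may differ from the real crate.

```rust
use std::fmt;

/// Stand-in for the crate's `Permission` enum (assumed shape).
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Permission {
    SyncProvider,
    ManageChannels,
    ManageRoles,
    SendMessages,
    CreateInvite,
    /// Sentinel; should never reach UI.
    __UnknownLegacy,
}

impl fmt::Display for Permission {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // One central place for user-facing wording.
        let label = match self {
            Permission::SyncProvider => "Sync provider",
            Permission::ManageChannels => "Manage channels",
            Permission::ManageRoles => "Manage roles",
            Permission::SendMessages => "Send messages",
            Permission::CreateInvite => "Create invite",
            Permission::__UnknownLegacy => "Unknown",
        };
        f.write_str(label)
    }
}

/// Typed toggle list replacing a parallel `&[&str]` table.
/// Which variants are togglable is an assumption here.
pub const TOGGLABLE_PERMISSIONS: &[Permission] = &[
    Permission::SyncProvider,
    Permission::ManageChannels,
    Permission::ManageRoles,
    Permission::SendMessages,
    Permission::CreateInvite,
];

/// Hypothetical render helper: `{p}` (Display) instead of `{p:?}` (Debug).
pub fn role_labels(perms: &[Permission]) -> Vec<String> {
    perms.iter().map(|p| format!("{p}")).collect()
}
```

With this shape, `format!("{:?}", Permission::SyncProvider)` still yields the identifier `SyncProvider` for logs, while `format!("{}", ...)` yields the label shown to users.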
CLAUDE.md requires `// state: lock-ok — <reason>` at each lock use site. Both `WebTrustStore` and `DerivedStateActor::cached` already carry struct-level rationale, but the `.lock()` call sites lacked the inline marker pointing readers there. Audit F44 prescribed migrating `trust_store.rs` to `Rc<RefCell<>>`, but that doesn't compile on the dual-target (native + wasm32) crate — the existing `Mutex<Inner>` is the correct primitive. Residual gap was just the missing inline markers; no behaviour change, panic policy preserved (trust state is security-critical; silent degradation would be unsafe). Refs #544
Switch user-visible error formatting from Debug ({e:?}) to Display
({e}) on the five sites called out by audit F50. Display is
preferable for end-user-facing strings; Debug exposes type internals.
Sites touched:
- crates/web/src/test_hooks/dispatcher.rs:66 — serde_wasm_bindgen
::Error implements Display; direct {e} swap.
- crates/web/src/test_hooks/dispatcher.rs:89, 121, 134 — JsValue
has no Display impl; fall back to e.as_string().unwrap_or_else
(|| format!("{e:?}")) so string-typed JS errors render cleanly
while non-string values still degrade to Debug.
- crates/web/src/voice.rs:60 — JsValue from AudioContext::new();
same as_string fallback. This Result<_, String> is the path that
can surface in user-facing UI, motivating the swap.
Internal tracing::warn!/error! sites with {e:?} are intentionally
left untouched per the audit (internal-only logs).
Refs #549
Without an upper bound on `remote.millis`, a single malicious or buggy peer could broadcast `millis = u64::MAX - k` and permanently poison every receiver's HLC, saturating subsequent `bump_counter` calls and breaking ordering + "now"-based UIs forever. Adopt Kulkarni's recommendation: clamp remote `millis` to `wall + MAX_FORWARD_DRIFT_MS` (24h) and emit a `warn!` so ops can surface peers that consistently exceed the cap. Counter is unaffected since it resets when millis advances.
Rejected alternatives:
- Hard-reject (drop the receive update): silently desyncs HLC from legitimately drifted peers, leaving message ordering inconsistent.
- Clamp without warn: loses the telemetry signal for buggy peers.
Tests: receive_clamps_far_future_remote, receive_accepts_within_drift, receive_clamps_at_exact_boundary.
Refs #516
https://claude.ai/code/session_01LjdWEgoELvKBf7YGbUCdsY
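The clamp can be sketched as follows. This is an illustrative hybrid-logical-clock receive step, not the crate's code: only `MAX_FORWARD_DRIFT_MS` and the "clamp to wall + cap, then warn" behaviour come from the commit message. The `Timestamp` and `Hlc` types, the `receive` signature, and the `eprintln!` standing in for `tracing::warn!` are assumptions.

```rust
/// 24 hours in milliseconds, the cap named in the commit message.
pub const MAX_FORWARD_DRIFT_MS: u64 = 24 * 60 * 60 * 1000;

/// Assumed timestamp shape: wall-clock millis plus a logical counter.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Timestamp {
    pub millis: u64,
    pub counter: u32,
}

/// Assumed clock state: the last timestamp this node issued.
#[derive(Debug, Default)]
pub struct Hlc {
    last: Timestamp,
}

impl Hlc {
    /// Merge a remote timestamp. Returns the new local timestamp and
    /// whether the remote value had to be clamped.
    pub fn receive(&mut self, remote: Timestamp, wall_ms: u64) -> (Timestamp, bool) {
        let cap = wall_ms.saturating_add(MAX_FORWARD_DRIFT_MS);
        let clamped = remote.millis > cap;
        if clamped {
            // Stand-in for a `warn!` so operators can spot the peer.
            eprintln!("warn: remote millis {} exceeds cap {}", remote.millis, cap);
        }
        let remote_ms = remote.millis.min(cap);

        // Standard HLC merge on the (possibly clamped) remote value.
        let max_ms = wall_ms.max(self.last.millis).max(remote_ms);
        let counter = if max_ms == self.last.millis && max_ms == remote_ms {
            self.last.counter.max(remote.counter).saturating_add(1)
        } else if max_ms == self.last.millis {
            self.last.counter.saturating_add(1)
        } else if max_ms == remote_ms {
            remote.counter.saturating_add(1)
        } else {
            0 // wall clock is strictly ahead, so the counter resets
        };
        self.last = Timestamp { millis: max_ms, counter };
        (self.last, clamped)
    }
}
```

In this sketch a peer sending `millis = u64::MAX - 5` moves the local clock at most 24h ahead of the wall clock, so the clock recovers once real time catches up instead of staying pinned near `u64::MAX`.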
Primary fix: Event::verify() (crates/state/src/event.rs) now matches on
bincode::serialize() instead of expect()-ing. On Err it logs a warn and
returns false so a malformed attacker-controlled `kind` (e.g. a String
produced via unsafe with non-UTF-8 bytes) cannot panic the verify hot
path. Defense-in-depth — bincode of owned Vec/String/integers should
not realistically fail, but verify() runs on every received event from
untrusted peers, so a panic here is a DoS vector.
Snapshot::new() (crates/state/src/sync.rs): kept expect() but tightened
the panic message to document the invariant ("...should be unreachable
post-validation"). Option C from the plan. Option A (no change) was the
runner-up; chose C to leave breadcrumbs for future readers about why
the panic is safe here. Option B (return Result) was rejected — it
would ripple through every Snapshot::new() call site, far out of scope
for a defense-in-depth tightening. Snapshot input is locally
materialized state which already passed apply_incremental's validation,
so this path is not attacker-influenced in the same way verify() is.
Event::new() (crates/state/src/event.rs:450) deliberately untouched.
That path is local authorship — we construct the event ourselves from
already-validated input, so a serialization panic would indicate a
real bug in our own code, not an attacker-induced DoS. Catching it
would only hide bugs.
Test: added verify_returns_false_on_garbage_event in event.rs tests.
The bincode-failure branch itself is unreachable from safe Rust on
the current owned types, so the test exercises the adjacent
hash/sig-mismatch path to confirm verify() returns gracefully on
adversarial input. Documented in the test that the bincode-error
arm is defense-in-depth standing without a direct test, reachable
only via attacker-side `unsafe` we cannot reproduce in safe Rust.
Local gate: cargo fmt --check, clippy -p willow-state -D warnings,
test -p willow-state, check --target wasm32 -p willow-state. Pre-
existing failure tests_materialize::non_admin_set_profile_is_accepted
remains (baseline rot, out of scope).
Refs #520
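The `match`-instead-of-`expect()` pattern in `verify()` can be illustrated with a self-contained sketch. Everything here is a stand-in: `encode` is a hypothetical replacement for `bincode::serialize` (it fails on oversized input only so that the error arm is reachable in a test), `toy_hash` is a plain FNV-1a hash rather than the real hash/signature check, and the `Event` fields are assumed.

```rust
/// Hypothetical serialization error.
#[derive(Debug)]
pub struct EncodeError;

/// Assumed minimal event: payload bytes plus the hash the sender claims.
pub struct Event {
    pub kind: Vec<u8>,
    pub hash: u64,
}

/// Stand-in for `bincode::serialize`; rejects payloads over 1 KiB so the
/// failure branch can be exercised from safe code.
pub fn encode(kind: &[u8]) -> Result<Vec<u8>, EncodeError> {
    if kind.len() > 1024 {
        return Err(EncodeError);
    }
    Ok(kind.to_vec())
}

/// FNV-1a, used only as a toy content hash.
pub fn toy_hash(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for b in bytes {
        h ^= u64::from(*b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

impl Event {
    /// Runs on untrusted input, so a serialization failure is reported
    /// as "not valid" instead of panicking.
    pub fn verify(&self) -> bool {
        let bytes = match encode(&self.kind) {
            Ok(bytes) => bytes,
            Err(err) => {
                // Stand-in for a `warn!` log line.
                eprintln!("warn: event failed to serialize in verify: {err:?}");
                return false;
            }
        };
        toy_hash(&bytes) == self.hash
    }
}
```

The design point is the same as in the commit message: on a path fed by remote peers, returning `false` costs one rejected event, whereas a panic takes down the receiver.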
Prior PR #511 lessons dismissed `willow-state::tests_materialize::non_admin_set_profile_is_accepted` as a sandbox-side flake. This run reproduced it cleanly on coordinator HEAD post-PR #505 (which added the SetProfile membership gate) — the "flake" was a real regression all along, just exposed once the gating PR landed. Filed #565. Strengthen the implementer-flagged-rot section: always re-verify on coordinator HEAD; don't rely on a prior dismissal alone. Rot accumulates between runs; a previously-flaky symptom can become a real regression as PRs merge. Refs auto-fix batch claude/friendly-maxwell-M5xB6.
fix conflicts
…ll-M5xB6 # Conflicts: # crates/client/src/lib.rs # crates/client/src/listeners.rs
intendednull pushed a commit that referenced this pull request on May 4, 2026
Content::File carried a self-declared u64 size_bytes plus unbounded filename / mime_type strings, all peer-supplied. Until now nothing rejected a 256 KB filename or MIME, and the size field had no warning that it was attacker-declared. Add MAX_FILENAME_BYTES = 255 (POSIX NAME_MAX) and MAX_MIME_BYTES = 255 (POSIX-aligned, comfortably above RFC 6838's 127+127 type/subtype limit) and expose Content::validate / Message::validate that reject oversized values with MessageValidationError. Wire validation into InMemoryStore::insert so peer-supplied messages cannot be persisted without first clearing the structural bounds. Document size_bytes as advisory-only — UIs may display it but must not use it for any preallocation or trust decision. The natural earlier ingress point sits in client/listeners.rs, but that file is locked under PR #566; an inline NOTE in store.rs flags the follow-up so the client side can also gate validation once #566 lands. No size_bytes preallocation hazards were found in the tree. Refs #583
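A rough sketch of the structural bounds check described in that commit. The two constants and the names `Content::validate` and `MessageValidationError` come from the message above; the enum variants, field types, and error variant names are assumptions for illustration.

```rust
/// POSIX NAME_MAX, per the commit message.
pub const MAX_FILENAME_BYTES: usize = 255;
/// Cap on the MIME string, per the commit message.
pub const MAX_MIME_BYTES: usize = 255;

/// Assumed error shape; the real variants may differ.
#[derive(Debug, PartialEq, Eq)]
pub enum MessageValidationError {
    FilenameTooLong { len: usize },
    MimeTooLong { len: usize },
}

/// Assumed content shape with only the fields the commit mentions.
pub enum Content {
    Text(String),
    File {
        filename: String,
        mime_type: String,
        /// Peer-declared and advisory only: fine to display, never to be
        /// used for preallocation or trust decisions.
        size_bytes: u64,
    },
}

impl Content {
    /// Reject oversized peer-supplied strings before persisting.
    pub fn validate(&self) -> Result<(), MessageValidationError> {
        match self {
            Content::Text(_) => Ok(()),
            Content::File { filename, mime_type, .. } => {
                // `len()` counts bytes, which is what the limits are in.
                if filename.len() > MAX_FILENAME_BYTES {
                    return Err(MessageValidationError::FilenameTooLong {
                        len: filename.len(),
                    });
                }
                if mime_type.len() > MAX_MIME_BYTES {
                    return Err(MessageValidationError::MimeTooLong {
                        len: mime_type.len(),
                    });
                }
                Ok(())
            }
        }
    }
}
```

Note that `size_bytes` is deliberately not checked: any `u64` is structurally valid, and the protection is that nothing downstream trusts it.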
fix
The `non_admin_set_profile_is_accepted` test emitted alice's SetProfile via `do_emit` (deps: vec![]), so topological_sort ordering depended on hash. When alice's SetProfile sorted before admin's GrantPermission, the post-#505 membership gate at materialize.rs:548-555 silently rejected it and the assertion failed only when the full suite ran. Mirror the existing pattern at tests/materialize.rs:388-398: capture the grant hash and use `dag.create_event` directly with explicit `deps: vec![grant.hash]` so the SetProfile causally follows the grant under any topological ordering. Production-side fix (membership-pending buffer in `apply_event`) is the larger refactor explicitly deferred per #565. Audit confirms no other test currently relies on lucky topo-sort to land a non-genesis-author gated event: stress.rs's non-admin SetProfile loops only assert determinism / sort cost, not application; permissions.rs uses ManagedDag's incremental apply path; all SetProfile/UpdateProfile/Pin/Unpin/ChannelRevive sites in materialize.rs use the genesis author (admin/owner) who is already a member. Refs #565
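The ordering hazard can be reproduced with a toy model. This is not the project's `topological_sort`; it is a small Kahn-style sort that breaks ties by ascending hash, which is an assumption about the tie-break rule. The `Event` struct and the numeric hashes are likewise illustrative.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Toy event: an identifying hash plus the hashes it causally depends on.
#[derive(Clone)]
pub struct Event {
    pub hash: u64,
    pub deps: Vec<u64>,
}

/// Kahn-style topological sort. Among events whose dependencies are all
/// satisfied, the one with the smallest hash is emitted first.
pub fn topological_sort(events: &[Event]) -> Vec<u64> {
    // hash -> set of dependencies not yet emitted
    let mut pending: BTreeMap<u64, BTreeSet<u64>> = events
        .iter()
        .map(|e| (e.hash, e.deps.iter().copied().collect()))
        .collect();
    let mut order = Vec::with_capacity(events.len());
    loop {
        // BTreeMap iterates in ascending key order, so the first ready
        // entry is the smallest-hash ready event.
        let next = pending
            .iter()
            .find(|(_, deps)| deps.is_empty())
            .map(|(hash, _)| *hash);
        let Some(hash) = next else { break };
        pending.remove(&hash);
        for deps in pending.values_mut() {
            deps.remove(&hash);
        }
        order.push(hash);
    }
    order
}
```

With a genesis event (hash 1), a grant (hash 50, depending on genesis), and a profile event (hash 10): if the profile event has no deps, it sorts before the grant purely because 10 < 50, and a membership gate would see it too early. Giving it `deps: vec![50]` forces it after the grant whatever its hash is.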
fix conflicts
…ll-M5xB6 # Conflicts: # crates/state/src/tests/materialize.rs
Scheduled `/resolving-issues` sweep. Ten small-scope fixes from the 2026-05-02 general-audit (master ticket #513) + one regression fix landed sequentially. Originally avoided files in flight (PRs #560 + #511); merged main back in cleanly after they landed. Four orphaned-TODO trackers (#561, #562, #563, #564) filed mid-run.
Fixes
- `chore(worker): replace println with tracing::info in identity` (7f1164f). One-liner.
- `chore: link orphaned TODOs to trackers` (7e1ebd3). 10 audit + 2 stale-doc-text TODOs cited under issues / plan paths. Filed 4 trackers: 1 new this session, #564 mobile reactions/picker ("[web] mobile action sheet: wire reactions-pins emoji recency + full picker route"); 3 from prior session, #561 connection_events placeholder ("[network] reopen #119: connection_events() still a never-yielding placeholder"), #562 WhisperStart wiring ("[client/web] wire WhisperStart event end-to-end (whisper-mode.md)"), and #563 multi-grove members plumbing ("[client] plumb state.members per ServerEntry for multi-grove shared-groves intersection"). Profile-card + sync-queue TODOs cite their existing plan paths. Comment-only.
- `test(storage): cover set_default_server round-trip` (e3623cd). Three-phase test covering default routing, post-setter routing, and last-write-wins. Documents in-memory-only nature (no SQLite persistence).
- `test(client): cover SearchIndex set_config effect` (412f1aa). Two tests in new `tests/search.rs`: round-trip + per-grove rebuild observable delta. Wired into `lib.rs` as `tests_search`.
- `fix(agent): use OsRng for bearer token RNG` (7e79ab3). One-line swap from `thread_rng()` (TLS-bound, not WASM-portable) to `OsRng::fill_bytes`.
- `feat(state): add Display impl for Permission` (c4f4de9). User-facing strings (`"Sync provider"`, `"Manage channels"`, etc.) on `Permission`. Migrated 3 UI render paths (`web/state.rs`, `client/views.rs::compute_roles_view`, and `web/components/roles.rs`'s `TOGGLABLE_PERMISSIONS` constant, renamed from `PERMISSION_NAMES: &[&str]` to drop the `Permission::from_name` round-trip).
- `docs(web): mark mutex use sites with lock-ok` (8fa46ad). Coordinator-narrowed scope: audit's `Rc<RefCell>` premise is structurally wrong (`trust_store.rs` is dual-target native+wasm32; `Rc<RefCell>` won't compile native). Residual gap per CLAUDE.md is missing inline `// state: lock-ok` markers at 6 `.lock().expect(...)` use sites; added pointers back to the existing struct-level rationale comments. No primitive change, no panic-policy change.
- `fix(web): prefer Display for error formatting` (74f01ad). 5 sites in `dispatcher.rs` + `voice.rs`. Direct `{e}` on `serde_wasm_bindgen::Error`; for `JsValue` (no `Display`) use `as_string().unwrap_or_else(|| format!("{e:?}"))` so string-typed JS errors render cleanly while non-string values gracefully degrade.
- `fix(messaging): clamp HLC remote ts forward-drift` (de44bb2). Security HIGH: defeats single-message clock-poisoning DoS. Brainstorm picked Option C (clamp + warn) over A (hard-reject, which desyncs HLC from legit drift) and B (clamp without warn, which loses telemetry). `MAX_FORWARD_DRIFT_MS = 24h`. 3 regression tests added.
- `fix(state): graceful bincode fail in event verify` (6170d06). Coordinator-narrowed scope: primary fix at `Event::verify()` (the attacker-influenced hot path) uses `match` instead of `expect()`, logging a warn and returning false on bincode err. `Snapshot::new()` kept `expect()` with tightened panic message (Option C; a `Result` return would ripple through every call site, out of scope). `Event::new()` left untouched (local authorship; panic = real bug).
- `test(state): fix membership-gate causal linkage` (baeee43). Filed + fixed this run. The `SetProfile` membership gate from PR #505 (auto-fix batch claude/friendly-maxwell-EjeTz, 2026-05-01) exposed a pre-existing bug in `non_admin_set_profile_is_accepted`: the test emitted alice's `SetProfile` via `do_emit` (which always passes `deps: vec![]`), so topological-sort order was hash-dependent and the gate intermittently rejected the event. Mirrored the existing pattern at `tests/materialize.rs:388-398`: captured the grant hash and threaded it through `dag.create_event(..., vec![grant.hash], 0)`. Test-side fix only; production-side fix (membership-pending buffer in `apply_event`) deferred (design call needed, larger refactor). willow-state suite went 240/241 → 241/241.
Already-Fixed
None this run. The general-audit at #513 was filed 2026-05-02 against the same `main @ b901575` head this run started from; no fixes can have landed in the gap by definition (per the prior-PR-#560 lessons noting this is the expected zero-yield case for same-day audit-to-fix).
Parked
None. All 10 audit picks + the regression fix landed cleanly. No mid-flight aborts; no finalize-implementer rescues; no scope-creep guards tripped.
Filed mid-run (follow-ups, NOT closed by this PR)
- `traits.rs:161` `connection_events()` placeholder (replaces closed #119, "[network] connection_events() is a placeholder that never yields"); filed prior session, kept this session.
- `WhisperStart` event end-to-end; filed prior session.
- `state.members` per `ServerEntry` for multi-grove; filed prior session.
Skill Evolution
`df17da3 docs(skill): re-verify dismissed baseline failures every run` strengthens the `### Implementer-flagged out-of-scope rot` section. The prior lesson from PR #511 (auto-fix batch claude/friendly-maxwell-f34GI 2026-05-02) said "implementer reports baseline failure → coordinator verifies → dismiss as flake if it doesn't repro." This run produced the inverse: the same test that was dismissed in #511 reproduces cleanly today because PR #505 (auto-fix batch claude/friendly-maxwell-EjeTz, 2026-05-01) landed in between. The skill now mandates re-verification every run regardless of prior dismissal, and filing the follow-up with `git log <last-attempt>..HEAD` output showing what flipped flake → real.
Lessons Learned
- `Rc<RefCell>` prescription was structurally wrong (dual-target). Coordinator narrowed the brief to inline lock-ok markers (the residual CLAUDE.md gap). Implementer landed cleanly without touching primitives.
- `bincode::serialize.expect()` sites; only `Event::verify()` is genuinely attacker-influenced. Coordinator pre-decided narrowing in the brief (primary fix at verify; tighten panic msg at Snapshot; leave Event::new alone). Implementer's brainstorm landed within cap.
- Pre-decided narrowing in the implementer brief saves the brainstorm gate from spending its budget re-deriving the same tradeoffs the coordinator already worked out.
- `non_admin_set_profile_is_accepted` as flake; this run reproduced it cleanly because the gate from PR #505 (auto-fix batch claude/friendly-maxwell-EjeTz, 2026-05-01) had landed in between. Took 2 implementer reports + 1 coordinator-side `cargo test` + a 5-min `git log` trace to confirm. Filed #565 ("test(state): non_admin_set_profile_is_accepted regression after #505 membership gate"), then fixed it test-side this same run after the human asked. Skill edit `df17da3` captures the lesson for future runs.
- `git merge origin/main` after both landed had only 2 real conflicts (`crates/client/src/lib.rs` test-mod declaration; `crates/client/src/listeners.rs` TODO citation vs `compute_sync_reply()` adoption). Both resolved by taking main's better implementation while keeping this batch's TODO citations. No hand-edited fix logic conflicts.
Test plan
Master-PR CI is the load-bearing gate. Locally each implementer ran the scoped subset (fmt + native clippy + native test + wasm32 check on touched crates). Browser tests deferred to CI (no wasm-pack/Firefox in sandbox).
CI gates:
- `cargo fmt`
- `cargo clippy` workspace (native + wasm32)
- `cargo test` workspace (state + client + identity + replay + storage + common + worker + agent + web + actor + messaging + network); was red on `non_admin_set_profile_is_accepted`, now green at `baeee43`.
- `wasm-pack` browser tests (Firefox + geckodriver; only observable on CI)
- `cargo audit` (no advisory changes this run)
Generated by Claude Code