Skip to content

auto-fix batch claude/friendly-maxwell-M5xB6 2026-05-03#566

Merged
intendednull merged 15 commits into
mainfrom
claude/friendly-maxwell-M5xB6
May 8, 2026
Merged

auto-fix batch claude/friendly-maxwell-M5xB6 2026-05-03#566
intendednull merged 15 commits into
mainfrom
claude/friendly-maxwell-M5xB6

Conversation

@intendednull
Copy link
Copy Markdown
Owner

@intendednull intendednull commented May 3, 2026

Scheduled /resolving-issues sweep. Eleven small-scope fixes from the 2026-05-02 general-audit (master ticket #513) + one regression fix landed sequentially. Originally avoided files in flight (PRs #560 + #511); merged main back in cleanly after they landed. Four orphaned-TODO trackers (#561, #562, #563, #564) filed mid-run.

Fixes

Already-Fixed

None this run. The general-audit at #513 was filed 2026-05-02 against the same main @ b901575 head this run started from; no fixes can have landed in the gap by definition (per the prior-PR-#560 lessons noting this is the expected zero-yield case for same-day audit-to-fix).

Parked

None. All 10 audit picks + the regression fix landed cleanly. No mid-flight aborts; no finalize-implementer rescues; no scope-creep guards tripped.

Filed mid-run (follow-ups, NOT closed by this PR)

Skill Evolution

Lessons Learned

Test plan

Master-PR CI is the load-bearing gate. Locally each implementer ran the scoped subset (fmt + native clippy + native test + wasm32 check on touched crates). Browser tests deferred to CI (no wasm-pack/Firefox in sandbox).

CI gates:

  • cargo fmt
  • cargo clippy workspace (native + wasm32)
  • cargo test workspace (state + client + identity + replay + storage + common + worker + agent + web + actor + messaging + network) — was red on non_admin_set_profile_is_accepted; now green at baeee43.
  • wasm-pack browser tests (Firefox + geckodriver — only observable on CI)
  • cargo audit (no advisory changes this run)
  • Playwright e2e (no behaviour changes — sanity only; in-progress at last check)

Generated by Claude Code

claude added 12 commits May 3, 2026 08:07
Lib crates use tracing for observability per CLAUDE.md.

Refs #551
Triage of audit F34 (#534): every spec.md-named TODO either cites
a live issue / plan path or describes a closed gate.

- traits.rs:161 — #119 closed before placeholder fixed; cite #561.
- listeners.rs:396 — cite #382 (heads-based delta sync).
- views.rs:95 — cite #563 (multi-grove members plumbing).
- views.rs:600, message.rs:267 + 1125 — cite #562 (WhisperStart wire).
- message.rs:1343 + 1406 — cite #564 (mobile sheet emoji recency
  + full-picker route, distinct from desktop #186).
- views.rs:511 + 700, message.rs:857 + 866, mention.rs:44 — cite
  plan docs/plans/2026-04-21-ui-phase-2c-profile-card.md.
- views.rs:481 + state.rs:142 — rewrite stale doc text; the
  sync-queue gate is closed by Phase 2b
  (docs/plans/2026-04-21-ui-phase-2b-sync-queue.md).

Comment-only changes; no behaviour drift. cargo fmt / clippy
-D warnings / wasm32 check on web + client all clean.

Refs #534
Adds a focused unit test that asserts `set_default_server` actually
takes effect, via the only observable read path (events routed through
`on_event` are queryable under the configured server ID). Verifies the
default value, the post-setter routing, and last-write-wins across
multiple setter calls. Also documents that the field is in-memory only
and not persisted to SQLite.

Closes the audit gap where renaming the field or deleting the
assignment in the setter would have slipped past CI.

Refs #541
Audit F42 flagged set_config as a public mutator covered only
indirectly via bootstrap_tests. Add a focused tests/search.rs
module with two assertions:

- set_config_round_trip pins the SetConfig + GetConfig handler
  pair (write c, read c back).
- set_config_changes_rebuild_query_results indexes a two-grove
  corpus, opts grove g1 out via per_grove_enabled, rebuilds, and
  asserts only g0's hit survives. The executor never consults
  config (gating happens at write time in message_allowed), so
  rebuild is the path that exposes the post-set_config delta.

Refs #542
Swap rand::thread_rng() for rand::rngs::OsRng so the agent
crate's auth module stays portable to wasm32 targets, per
CLAUDE.md's WASM-portability guidance for lib crates.

Refs #546
Permission variants previously leaked into role lists and other UI
surfaces via `format!("{p:?}")`, which prints variant identifiers
(`SyncProvider`, `ManageChannels`). Debug is non-stable for UI and
forces wording changes to grep across multiple call sites.

Add `impl Display for Permission` returning user-facing strings:
  SyncProvider   -> "Sync provider"
  ManageChannels -> "Manage channels"
  ManageRoles    -> "Manage roles"
  SendMessages   -> "Send messages"
  CreateInvite   -> "Create invite"
  __UnknownLegacy -> "Unknown"  (sentinel; should never reach UI)

Switch the role-list render paths in `crates/web/src/state.rs` and
`crates/client/src/views.rs` from `{p:?}` to `{p}`. Migrate the
permission toggle list in `crates/web/src/components/roles.rs` from
the parallel `PERMISSION_NAMES: &[&str]` constant to a typed
`TOGGLABLE_PERMISSIONS: &[Permission]`, eliminating the
string-round-trip via `Permission::from_name` and centralising the
labels behind Display. Lock the strings in a unit test.

Refs #550
CLAUDE.md requires `// state: lock-ok — <reason>` at each lock use
site. Both `WebTrustStore` and `DerivedStateActor::cached` already
carry struct-level rationale, but the `.lock()` call sites lacked
the inline marker pointing readers there.

Audit F44 prescribed migrating `trust_store.rs` to `Rc<RefCell<>>`,
but that doesn't compile on the dual-target (native + wasm32) crate
— the existing `Mutex<Inner>` is the correct primitive. Residual
gap was just the missing inline markers; no behaviour change, panic
policy preserved (trust state is security-critical; silent
degradation would be unsafe).

Refs #544
Switch user-visible error formatting from Debug ({e:?}) to Display
({e}) on the five sites called out by audit F50. Display is
preferable for end-user-facing strings; Debug exposes type internals.

Sites touched:
- crates/web/src/test_hooks/dispatcher.rs:66 — serde_wasm_bindgen
  ::Error implements Display; direct {e} swap.
- crates/web/src/test_hooks/dispatcher.rs:89, 121, 134 — JsValue
  has no Display impl; fall back to e.as_string().unwrap_or_else
  (|| format!("{e:?}")) so string-typed JS errors render cleanly
  while non-string values still degrade to Debug.
- crates/web/src/voice.rs:60 — JsValue from AudioContext::new();
  same as_string fallback. This Result<_, String> is the path that
  can surface in user-facing UI, motivating the swap.

Internal tracing::warn!/error! sites with {e:?} are intentionally
left untouched per the audit (internal-only logs).

Refs #549
Without an upper bound on `remote.millis`, a single malicious or buggy
peer could broadcast `millis = u64::MAX - k` and permanently poison
every receiver's HLC, saturating subsequent `bump_counter` calls and
breaking ordering + "now"-based UIs forever.

Adopt Kulkarni's recommendation: clamp remote `millis` to
`wall + MAX_FORWARD_DRIFT_MS` (24h) and emit a `warn!` so ops can
surface peers that consistently exceed the cap. Counter is unaffected
since it resets when millis advances.

Rejected alternatives:
- Hard-reject (drop the receive update): silently desyncs HLC from
  legitimately drifted peers, leaving message ordering inconsistent.
- Clamp without warn: loses the telemetry signal for buggy peers.

Tests: receive_clamps_far_future_remote, receive_accepts_within_drift,
receive_clamps_at_exact_boundary.

Refs #516

https://claude.ai/code/session_01LjdWEgoELvKBf7YGbUCdsY
Primary fix: Event::verify() (crates/state/src/event.rs) now matches on
bincode::serialize() instead of expect()-ing. On Err it logs a warn and
returns false so a malformed attacker-controlled `kind` (e.g. a String
produced via unsafe with non-UTF-8 bytes) cannot panic the verify hot
path. Defense-in-depth — bincode of owned Vec/String/integers should
not realistically fail, but verify() runs on every received event from
untrusted peers, so a panic here is a DoS vector.

Snapshot::new() (crates/state/src/sync.rs): kept expect() but tightened
the panic message to document the invariant ("...should be unreachable
post-validation"). Option C from the plan. Option A (no change) was the
runner-up; chose C to leave breadcrumbs for future readers about why
the panic is safe here. Option B (return Result) was rejected — it
would ripple through every Snapshot::new() call site, far out of scope
for a defense-in-depth tightening. Snapshot input is locally
materialized state which already passed apply_incremental's validation,
so this path is not attacker-influenced in the same way verify() is.

Event::new() (crates/state/src/event.rs:450) deliberately untouched.
That path is local authorship — we construct the event ourselves from
already-validated input, so a serialization panic would indicate a
real bug in our own code, not an attacker-induced DoS. Catching it
would only hide bugs.

Test: added verify_returns_false_on_garbage_event in event.rs tests.
The bincode-failure branch itself is unreachable from safe Rust on
the current owned types, so the test exercises the adjacent
hash/sig-mismatch path to confirm verify() returns gracefully on
adversarial input. Documented in the test that the bincode-error
arm is defense-in-depth standing without a direct test, reachable
only via attacker-side `unsafe` we cannot reproduce in safe Rust.

Local gate: cargo fmt --check, clippy -p willow-state -D warnings,
test -p willow-state, check --target wasm32 -p willow-state. Pre-
existing failure tests_materialize::non_admin_set_profile_is_accepted
remains (baseline rot, out of scope).

Refs #520
Prior PR #511 lessons dismissed
`willow-state::tests_materialize::non_admin_set_profile_is_accepted` as
a sandbox-side flake. This run reproduced it cleanly on coordinator
HEAD post-PR #505 (which added the SetProfile membership gate) — the
"flake" was a real regression all along, just exposed once the gating
PR landed. Filed #565.

Strengthen the implementer-flagged-rot section: always re-verify on
coordinator HEAD; don't rely on a prior dismissal alone. Rot
accumulates between runs; a previously-flaky symptom can become a real
regression as PRs merge.

Refs auto-fix batch claude/friendly-maxwell-M5xB6.
@intendednull
Copy link
Copy Markdown
Owner Author

fix conflicts

…ll-M5xB6

# Conflicts:
#	crates/client/src/lib.rs
#	crates/client/src/listeners.rs
intendednull pushed a commit that referenced this pull request May 3, 2026
Adds sibling paragraph to the Implementer Agent step 8 sandbox-interference
note, capturing the inverse phenomenon: pre-staged uncommitted edits at
session start. Observed in ~40-50% of dispatches across PR #566 + #567
runs. Brief implementers to verify + gate + commit (don't redo).

Refs #567
intendednull pushed a commit that referenced this pull request May 4, 2026
Content::File carried a self-declared u64 size_bytes plus unbounded
filename / mime_type strings, all peer-supplied. Until now nothing
rejected a 256 KB filename or MIME, and the size field had no warning
that it was attacker-declared.

Add MAX_FILENAME_BYTES = 255 (POSIX NAME_MAX) and MAX_MIME_BYTES = 255
(POSIX-aligned, comfortably above RFC 6838's 127+127 type/subtype
limit) and expose Content::validate / Message::validate that reject
oversized values with MessageValidationError. Wire validation into
InMemoryStore::insert so peer-supplied messages cannot be persisted
without first clearing the structural bounds. Document size_bytes as
advisory-only — UIs may display it but must not use it for any
preallocation or trust decision.

The natural earlier ingress point sits in client/listeners.rs, but
that file is locked under PR #566; an inline NOTE in store.rs flags
the follow-up so the client side can also gate validation once #566
lands. No size_bytes preallocation hazards were found in the tree.

Refs #583
@intendednull
Copy link
Copy Markdown
Owner Author

fix

The `non_admin_set_profile_is_accepted` test emitted alice's
SetProfile via `do_emit` (deps: vec![]), so topological_sort
ordering depended on hash. When alice's SetProfile sorted before
admin's GrantPermission, the post-#505 membership gate at
materialize.rs:548-555 silently rejected it and the assertion
failed only when the full suite ran.

Mirror the existing pattern at tests/materialize.rs:388-398:
capture the grant hash and use `dag.create_event` directly with
explicit `deps: vec![grant.hash]` so the SetProfile causally
follows the grant under any topological ordering.

Production-side fix (membership-pending buffer in `apply_event`)
is the larger refactor explicitly deferred per #565.

Audit confirms no other test currently relies on lucky topo-sort
to land a non-genesis-author gated event: stress.rs's
non-admin SetProfile loops only assert determinism / sort cost,
not application; permissions.rs uses ManagedDag's incremental
apply path; all SetProfile/UpdateProfile/Pin/Unpin/ChannelRevive
sites in materialize.rs use the genesis author (admin/owner) who
is already a member.

Refs #565
@intendednull
Copy link
Copy Markdown
Owner Author

fix conflicts

…ll-M5xB6

# Conflicts:
#	crates/state/src/tests/materialize.rs
@intendednull intendednull merged commit 261f3cf into main May 8, 2026
8 checks passed
@intendednull intendednull deleted the claude/friendly-maxwell-M5xB6 branch May 8, 2026 07:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment