fix(state): per-author cap for pending buffer (SEC-V-08) by intendednull · Pull Request #451 · intendednull/willow

intendednull · 2026-04-28T01:10:16Z

why

SEC-V-08. PendingBuffer had global cap only. One signer can fill all 5_000 slots with chain-gap events, starve legit out-of-order events for the full 1 h eviction window. Robustness, not impact, but trivial to fix.

what

new sub-cap max_per_author on PendingBuffer. default = max_entries / 50 (floor 1).
- 5_000 client cap -> 100 per author.
- 10_000 state default -> 200 per author.
bookkeeping mirrors cached_count: per_author_count: BTreeMap<EndpointId, usize> updated on insert / resolve / evict_expired / evict_to.
warn-once-per-author via warned_full_authors: BTreeSet<EndpointId>. one chatty offender cannot flood logs.
with_per_author_cap(usize) builder for tests/explicit overrides. production callers stay on derived default.

tradeoffs

runner-up: hardcode the divisor, no override knob. rejected. three existing tests covering global capacity-eviction would need multi-author rewrites or magic-number tuning. override knob is cheaper.
per-author drop is silent after first warn for that author. matches existing pending buffer at capacity warn cadence — debug logs can be raised if needed.

tests (state crate, lowest tier)

new:

per_author_cap_isolates_one_signer — attacker capped, victim untouched
per_author_cap_defaults_to_total_over_divisor — 5_000 / 50 = 100
per_author_cap_floor_is_one — 5 / 50 = 0 floors to 1
per_author_cap_warns_once_per_author — bookkeeping HashSet
per_author_cap_frees_on_resolve — slots freed on resolve

adapted to use explicit sub-cap so they keep testing global policy:

pending_buffer_auto_evicts_when_limit_exceeded
capacity_eviction_drops_oldest_when_exceeded
pending_count_reflects_both_eviction_policies
pending_buffer_eviction_reduces_count_to_cap

deep_pending_chain_does_not_stack_overflow chain shortened 3000 -> 1500 to stay under per-author cap of 2000. iterative-resolution coverage unchanged.

verification

local (no just on the box, ran equivalents):

cargo fmt --check -> clean
cargo clippy --workspace --all-targets -- -D warnings -> clean
cargo test --workspace -> all pass
- state: 222 passed; 0 failed
- client: 281 passed; 0 failed
cargo check --target wasm32-unknown-unknown (state, identity, messaging, crypto, transport, common, network, client, web) -> clean

Refs #237

https://claude.ai/code/session_01VdcwSdvqig423A5CX2fSzR

Generated by Claude Code

SEC-V-08. PendingBuffer had only a global cap (5_000 in client). One chatty signer could fill 100% of the slots with unresolved-prev events and starve legitimate out-of-order events for the whole 1 h eviction window. Adds a per-author sub-cap. Default = max_entries / 50 (200 at the client cap, 100 with the state-crate default of 10_000), with a floor of 1 so tiny test caps remain usable. When an author is at the sub-cap, further events from that author are dropped without touching other authors' buckets. Bookkeeping mirrors the existing cached_count pattern: per_author_count BTreeMap<EndpointId, usize> updated on insert / resolve / evict_expired / evict_to. Warn-once-per-author via warned_full_authors set so a single offender cannot flood logs. Why a config knob and not a hardcoded constant: tests need to override the sub-cap to exercise the global eviction policy in isolation. with_per_author_cap(usize) is the explicit override; production callers stay on the derived default. Runner-up was a hardcoded constant — rejected because three existing tests would either need multi-author rewriting or magic-number tuning, and the override is cheaper to read. Tests at the lowest tier (state crate, sync.rs): * per_author_cap_isolates_one_signer * per_author_cap_defaults_to_total_over_divisor * per_author_cap_floor_is_one * per_author_cap_warns_once_per_author * per_author_cap_frees_on_resolve Adapted three existing capacity-eviction tests to set an explicit sub-cap (so they keep testing the global policy): * sync::tests::pending_buffer_auto_evicts_when_limit_exceeded * sync::tests::capacity_eviction_drops_oldest_when_exceeded * sync::tests::pending_count_reflects_both_eviction_policies * tests::pending_buffer_eviction_reduces_count_to_cap Reduced deep_pending_chain_does_not_stack_overflow chain depth from 3000 to 1500 so it fits within the 100_000-cap'd ManagedDag's per- author budget of 2000. Test still proves the iterative resolution path; capacity policy is validated separately. Refs #237

…ional three lessons from batch 2026-04-28-002530: 1. coordinator must `git fetch + reset` master batch branch before each implementer dispatch — stale local state contaminates the next worktree (#451 implementer found half-applied prior work in fresh worktree). 2. when implementer's pre-flight detects upstream fix already landed (#316/#317/#318 vs PR #402), close issues + caveman-comment, do NOT include in master PR `Fixes` list — record under `## Already-Fixed` instead. 3. github webhook subscriptions for sub-PRs are informational only — implementer owns its merge gate, coordinator must not duplicate that work.

intendednull merged commit 5b78e61 into auto-fix/batch-2026-04-28-002530 Apr 28, 2026

intendednull mentioned this pull request Apr 28, 2026

auto-fix batch 2026-04-28-002530 #457

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(state): per-author cap for pending buffer (SEC-V-08)#451

fix(state): per-author cap for pending buffer (SEC-V-08)#451
intendednull merged 1 commit into
auto-fix/batch-2026-04-28-002530from
auto-fix/issue-237-pending-per-author-cap

intendednull commented Apr 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

intendednull commented Apr 28, 2026

why

what

tradeoffs

tests (state crate, lowest tier)

verification

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants