Skip to content

Implement machine-readable wire-rejection reasons (per spec from #216) #378

@intendednull

Description

@intendednull

Spec

Summary

Introduce a typed WireRejectReason enum carried in a new WireMessage::Reject(RejectPayload) variant in willow-common, so peers can react programmatically to rejections — retry on rate-limit, re-auth on AuthRequired, drop silently on Duplicate, surface a permission prompt on PermissionDenied — instead of matching on free-form error strings that are only fit for logs. This brings Nostr-style machine-readable rejection signaling to Willow's binary bincode wire format, with the added benefit of compile-time exhaustive matching and structured payloads (retry_after_ms, violated Permission, expected/actual hashes, etc.).

Build phases

  • Phase 1 — Wire types in willow-common
    • Add WireRejectReason enum (#[non_exhaustive]) to crates/common/src/wire.rs with all variants from the spec (Duplicate, Invalid, RateLimited, PermissionDenied, ParentHashMismatch, SeqGap, SignatureInvalid, PayloadTooLarge, UnsupportedVersion, AuthRequired, Restricted, NotSyncProvider, UnknownTopic, TopicInvalid, Capacity, ServerError).
    • Add RejectPayload { target_peer, reason, context, human } and RejectContext { Event, Topic([u8; 32]), Envelope }.
    • Add WireMessage::Reject(RejectPayload) variant.
    • Round-trip tests in the existing pack_unpack_*_round_trip style for every variant.
  • Phase 2 — Upstream typing for PermissionDenied
    • Thread typed Permission out of check_permission (return Result<(), CheckPermissionError>).
    • Change InsertError::PermissionDenied(String)InsertError::PermissionDenied { author, missing: Permission } (or similar) and update managed.rs .map_err site.
    • Optionally rename existing got fields on InsertError::SeqGap / PrevMismatch and TransportError::UnsupportedVersion to actual for consistency, or translate at the boundary.
  • Phase 3 — Producer wiring
    • Map every row from the spec's mapping table from InsertError / IdentityError / TransportError / relay sites to WireRejectReason and emit WireMessage::Reject via pack_wire.
    • Relay (crates/relay/src/lib.rs): replace logged-only topic_str_is_valid failures and MAX_TOPICS cap hits with on-wire rejects targeting the offending peer; leave connection-cap saturation logged-only (documented exception).
    • Replay (crates/replay/src/role.rs): emit rejects from DAG-insert failure paths during sync streaming.
    • Storage (crates/storage/src/role.rs): emit rejects from apply / archival-write failure paths.
  • Phase 4 — Receiver wiring
    • Add ClientEvent::Rejected { from: EndpointId, payload: RejectPayload } to crates/client/src/events.rs.
    • Add the WireMessage::Reject arm in each crate's gossip listener that already matches on WireMessage, dropping rejects whose target_peer ≠ local EndpointId.
    • Forward decoded rejects to the client's event stream.
  • Phase 5 — Client consumption
    • Implement the dispatch shown in the spec's "Client consumption pattern" section (backoff, re-auth prompt, permission UI, history re-request, upgrade prompt, payload chunking, etc.).
  • Phase 6 — Tests
    • Round-trip in crates/common/src/wire.rs (one value per variant).
    • Exhaustive InsertErrorWireRejectReason mapping in crates/state/src/tests.rs.
    • check_permission per-Permission rejection asserts PermissionDenied(p).
    • Transport: oversized payload → PayloadTooLarge; v0 / v999 → UnsupportedVersion.
    • Relay: invalid topic → TopicInvalid; topic cap full → Capacity.
    • Forward-compat: synthetic unknown discriminant → log + outbox preserved + no panic.
    • Browser test (crates/web/tests/browser.rs): PermissionDenied(SendMessages) flips a sent bubble to "cannot-send" within one tick().await.
  • Phase 7 — Logging & metrics
    • Structured tracing::warn!(reason = ?r, context = ?c, human = %h) at every emit site.
    • Metric counter keyed on discriminant() for a flat rejection-cause histogram.
  • Phase 8 — Docs
    • Update crates/common and crates/client module-level docs with the new variant and the receiver-side filtering rule.
    • Note in the relay docs which paths now emit rejects vs. stay logged-only.

Acceptance criteria

  • WireMessage::Reject(RejectPayload) exists in willow-common and round-trips through pack_wire / unpack_wire for every WireRejectReason variant.
  • check_permission and InsertError::PermissionDenied carry a typed Permission end-to-end; no format!-based extraction anywhere in the production path.
  • Every rejection site listed in the spec's mapping table either (a) emits a WireMessage::Reject with the variant from the table, or (b) is documented as local-only with a logged justification (currently: iroh gossip receive error, relay connection-cap saturation).
  • Receivers drop Reject payloads whose target_peer doesn't match their own EndpointId before any application-layer processing.
  • Client surfaces the new ClientEvent::Rejected and dispatches each variant per the "Client consumption pattern" section (backoff, re-auth prompt, permission UI, history re-request, upgrade prompt, payload chunking).
  • Forward-compat: a WireMessage::Reject carrying an unknown discriminant fails unpack_wire cleanly, the client logs and preserves the outbox, no panic.
  • All required tests (state, client, transport, relay, browser round-trip, forward-compat) are added and pass under just check.
  • Existing free-form rejection strings remain available only as the optional human field on RejectPayload (logs/UI), never matched on by code.
  • just check (fmt + clippy + test + WASM) passes with zero warnings.

Out of scope

  • Bumping PROTOCOL_VERSION — adding a WireMessage variant is not a breaking wire change (spec, "Extensibility & versioning").
  • Adding a positive-acknowledgement MessageType::Ack (deferred to open question 4 — revisit after Reject ships).
  • Recording rejections as a new EventKind for audit (open question 5 — contradicts the "rejected events never enter the DAG" rule).
  • The relay/src/lib.rs:156 connection-cap-saturation path: connection is already dropped before any topic subscription, so a same-topic reject can't reach the peer; stays logged-only by design.
  • Encrypting PermissionDenied's payload to the recipient — only required if a malicious-relay threat materializes (open question 2).

Open questions

(All from the spec; resolve during build.)

  1. Envelope correlation. How do we correlate RejectContext::Envelope with the offending send when the envelope never carried a hash? Proposal: stamp an outbound send_id (u64) in every Envelope and echo it in the reject.
  2. PermissionDenied leakage. Is telling a third party which Permission a peer lacks a role-enumeration risk? target_peer filtering helps, but a malicious relay reading the gossip stream still observes it. Encrypt the payload to the recipient if this becomes a threat.
  3. RateLimited.retry_after_ms semantics. Advisory (client may ignore) or enforced (peer drops earlier retries)? Nostr leaves this implementation-defined.
  4. Need for a positive Ack? Currently we rely on gossip delivery as implicit ACK; revisit after Reject ships.
  5. Audit trail via new EventKind? Tempting, but contradicts the "rejected events never enter the DAG" rule from 2026-04-12-state-authority-and-mutations.md.

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions