Skip to content

general-audit lessons: 2026-05-04 #628

@intendednull

Description

@intendednull

Lessons from /general-audit run @ 88498a5 (master: #600). Caveman style.

What worked

  • Stalled-agent early detection lived up to spec. Architecture sweep agent stuck at scaffold (31 bytes, 5 min). Backfill dispatched in parallel. Recovered ~3 min wall time. Backfill returned 2 fresh findings (F1 invite-mint authority + 1 internal-dupe of F1). Pattern is repeatable.
  • Pass 3 cargo-audit drift = clean. 7 active advisories, 7 ignore entries, exact match. Last run pruned 3 stale entries (audit F25 [quality]: 3 stale RUSTSEC IDs in ci.yml ignore list (cargo-audit drift) #592 closed). Audit-ignore-list discipline now self-sustaining run-over-run.
  • Sibling-of-closed pass ROI was modest but real. Of 3 raw seeds, 1 surfaced an actual new bug (F11 / F12 / F13 / F14 / F15 cluster — Content::validate File-only ⇒ extends to all Content variants AND to all EventKind String fields AND to ciphertext). Without sibling-of-closed framing, the sweep would have stopped at one or two file-cap sites and missed the wider pattern. Worth keeping.
  • dedup subagent's cross-agent dupe detection caught 4 internal dupes (F35↔F20 + F10↔F17 explicit + 2 implicit). Pre-flagging these to the dedup agent in the prompt also helped.
  • Verification subagent could NOT close one finding (F06) because gh CLI was unavailable in its environment. Orchestrator closed it post-hoc via mcp__github__issue_read. ⇒ next-run brief: tell verification subagent it CAN use the GitHub MCP (mcp__github__issue_read etc.).

What didn't

  • Existing-label assumption guess. I assumed tech-debt + audit labels existed (correct, per prior issues), but didn't verify whether security/robustness labels exist. Used only audit for security/robustness/quality findings to avoid label-create. Filed issues lack severity-discriminating labels at the GitHub UI level. Consequence: severity is in title + body, but reviewers can't filter by security label. Cost: low. Mitigation: pre-fetch label list once at audit start.
  • general agent finished AFTER initial dedup batch was prepared. The 8th–10th findings (F44/F45 in my draft, became F26/F27) had to be appended in a second mini-dedup pass. Skill should explicitly note: don't seal the dedup batch until ALL sweep agents complete. (Otherwise stragglers go through a second, less rigorous, dedup path.)
  • Sub-issue MCP echoes the master body each call. 27 calls × ~3KB body = ~80KB context inflation just from sub-issue linking. The skill already warns about this and prescribes batch parallelism (which I did — 14+13 wide). Stayed inside budget.
  • Each agent's task-completion summary still contains the full finding texts. That's accidental context cost. Skill could ask agents to confine their final summary to "N findings written, file path, no contents echoed."
  • One CSP wildcard finding (F30/F17 connect-src) is borderline-architectural — the relay-URL pinning fix touches relay-config code. Filed as security not architecture; could go either way. Consistent with prior audits filing CSP as security.
  • Audit time budget exceeded for one sweep agent (general review, ~7 min vs 6 min budget). 2 of 10 agents finished after the 6-min watchdog cutoff but before the 10-min hard kill. Findings counted. Pattern: large-scope agents (general review, sibling-of-closed across many areas) need a slightly higher budget (8 min?). Consider a per-concern budget tunable.

Recurring patterns worth codifying

  • Content::validate File-only ⇒ all Content variants is a high-yield sibling. The same shape was flagged by both sibling-of-closed AND security-input-dos sweep agents (became internal dupe). Predictable cross-talk between concern lanes. Skill could add a single "audit-glob-of-validators" reminder.
  • Authority-not-in-apply_event (F1) was caught by the architecture-backfill agent only, not the security-auth agent. Authority-spec drift is more an architecture concern than a security concern by framing — agents should not assume the other lane covers it. Make this explicit.

Suggested edits to .claude/skills/general-audit/SKILL.md

  1. Add to Synthesis → Verification step: "Verification subagent has access to mcp__github__issue_read. When a finding cites a closed-issue number, the agent should verify the issue is closed via that tool rather than returning partially-verified."
  2. Add to Hard Rules → Setup: "Pre-fetch the label list once at audit start (mcp__github__list_issue_types + a quick scan of recent audit issues' labels field) and pass the available severity labels to the issue-filing step. Don't guess."
  3. Add to Synthesis → Dedup step: "Don't dispatch the dedup subagent until ALL sweep agents have written their final batch. Recompute the raw-findings file count immediately before dispatching dedup — if it differs from the count when the dedup brief was prepared, append the new entries to the brief."
  4. Add to Pass 2 (standard sweep) bullet: "Authority-spec drift (events emitted outside apply_event / authority not centralised in required_permission()) is an architecture concern, NOT a security concern by framing. Both lanes should look, but the architecture lane owns it."
  5. Add to Agent prompts (mandatory fields): "Final summary message: 'N findings written to <path>. .' Do NOT echo finding bodies in the summary — they're already on disk and re-echoing inflates orchestrator context."
  6. Adjust Time budget rule: "6-min default, but for general / sibling-of-closed agents with broad scope, allow up to 8 min. Cap at 10 min hard."
  7. Add to Synthesis → Master issue body bullet: "Don't list dropped-finding numbers in the master body. Drops live in the lessons issue. (Already partially codified; reiterate.)"
  8. Add a new sub-pattern (d) to Pass 1 (sibling-of-closed): "(d) Validator scope creep — when a closed PR added a per-field validator/cap on one variant of an enum / map / struct, look for the same shape on all other variants. PR audit F16 [security]: Content::File size_bytes attacker-declared, filename/mime unbounded #583 added Content::File caps; this run found Content::Text/Reply/Edit/System/Reaction + SealedContent.ciphertext + per-field EventKind String fields are all unbounded by the same logic. Single-variant validator is almost always wrong scope."

Filed by /general-audit @ 88498a5 (2026-05-04). master: #600.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions