From 77fc6e2d22f2c673ff8a4fb95aae4984ebe637ed Mon Sep 17 00:00:00 2001 From: Claude Date: Mon, 4 May 2026 13:51:03 +0000 Subject: [PATCH] chore(skill): fold audit lessons #628 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Applied: - Pass 1 sub-pattern (d): validator scope creep — single-variant validators (PR #583's `Content::File`-only caps) almost always need to extend to all sibling variants. Caught 5 findings this run. - Verification subagent has access to `mcp__github__issue_read` — closed-issue check should not return `partially-verified` when GH MCP can confirm. - Don't dispatch dedup until ALL sweep agents land; recompute raw count just before dispatch. The 2026-05-04 run's `general` agent landed F9/F10 (relay docker break + stale port docs) AFTER initial dedup brief was prepared. - Authority-spec drift = architecture concern, not security. The architecture lane owns events emitted outside `apply_event` (e.g. invite-mint side channels). Security-auth lane should look but architecture lane catches. - Agent prompts: 8 min budget for broad-scope agents (`general`, `sibling-of-closed`); per-concern agents stay at 6 min. - Agent final-summary rule: report path + one-line strongest finding only, never echo finding bodies. Saves orchestrator context. - Pre-fetch available severity-label set before filing child issues. Don't guess `security`/`robustness` exist — verify against recent audit issues' `labels` field. Skipped: - "Reiterate master-body keep-dedup-metadata-minimal" — already codified at the existing "Master issue body" paragraph; second copy would just drift over time. https://claude.ai/code/session_01H1t1f3KpZuQfXV3CmG1JhS --- .claude/skills/general-audit/SKILL.md | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/.claude/skills/general-audit/SKILL.md b/.claude/skills/general-audit/SKILL.md index 0082eb39..f27313fa 100644 --- a/.claude/skills/general-audit/SKILL.md +++ b/.claude/skills/general-audit/SKILL.md @@ -42,6 +42,8 @@ Default split — one agent per concern: Spawn more if an area needs depth. For very large diffs, add per-crate agents on top of per-concern agents — both axes, not one or the other. +**Authority-spec drift is an architecture concern, not a security concern by framing.** Events emitted outside `apply_event` / authority not centralised in `required_permission()` (e.g. invite-mint as an out-of-band side channel) belongs in the architecture lane. The security/auth lane should look but the architecture lane owns it. The 2026-05-04 audit caught the invite-mint authority case ONLY in the architecture-backfill agent — the security-auth agent's framing assumed all authority lives in `apply_event`, so it missed events that don't. + ## Audit Pass Order Run passes in order. Sibling-of-closed FIRST — small fix-scope mismatches are a top source of real findings on every run. @@ -79,6 +81,8 @@ git diff .. -- '*.rs' | rg "^\+\s*pub (async )?fn (\w+)" -or ' # Then for each fn name: rg "\.\(" crates --glob '!**/tests*' ``` +**(d) Validator scope creep.** When a closed PR added a per-field validator/cap on **one** variant of an enum / map / struct, look for the same shape on **all** other variants. PR #583 added `Content::File` filename + mime caps; the 2026-05-04 audit found `Content::Text/Reply/Edit/System/Reaction` + `SealedContent.ciphertext` + per-field `EventKind` String fields are all unbounded by the same logic. Single-variant validator is almost always wrong scope — emit a finding for every sibling variant left uncapped. + **Commit-prefix filter when N merged PRs > 5.** Auto-fix-batch PRs (`#auto-fix batch ...`) + skill-only PRs add noise. Drilling each commit per-PR is expensive. Pre-filter by commit-subject prefix: ```bash @@ -158,12 +162,16 @@ Dedup subagent's job: Orchestrator drops `dup`/`superseded` findings from the file-list. Survivors get filed. +**Don't dispatch the dedup subagent until ALL sweep agents have written their final batch.** Recompute the raw-findings file count immediately before dispatching dedup — if it differs from the count when the dedup brief was prepared, append the new entries to the brief before sending. Stragglers slipping through a second, less-rigorous dedup path is the most common way bad dupes leak into filed issues. + ### Verification step (fresh subagent) Second fresh subagent verifies surviving findings real via grep/rg for exact patterns cited. Drop any finding whose verification grep returns 0 hits. **Drop `partially-verified` findings whose body claim contradicts the verification spot-check, not just `FAILED`.** A finding marked "lock-ok marker missing at line 31" verified with "marker exists at line 23" is contradicted, not partially supported — drop it. Filing inaccurate child issues wastes reviewer time and undermines confidence in audit output. Orchestrator must enforce this drop, not just defer to the subagent's softer "accepted on review" verdict. +**Verification subagent has access to `mcp__github__issue_read`.** When a finding cites a closed-issue number (e.g. `TODO(#119)`), the agent should verify the issue is actually closed via that tool rather than returning `partially-verified` for "could not verify issue closed via gh CLI." Brief the verification subagent explicitly that the GitHub MCP is available — the previous run had to close one such finding post-hoc in the orchestrator. + ### File the issues 1. **Master issue** = commit hash + survivors list. @@ -208,7 +216,8 @@ This closes the loop: each audit run feeds the next. ### Agent prompts (mandatory fields) -- Time budget: 6 min, stop+save if exceeded. +- Time budget: 6 min default, stop+save if exceeded. **Broad-scope agents (`general` review, `sibling-of-closed`) get 8 min** — both have wider sweeps than per-concern agents and consistently brush against the 6-min cap. Hard kill at 10 min. +- **Final summary message: "N findings written to ``. ." Do NOT echo finding bodies in the summary** — they're already on disk and re-echoing inflates orchestrator context. Per-finding rationale belongs in the appended file, not the agent's return summary. - **Write findings in small chunks. Never big batches.** Big batched writes are the dominant timeout cause — stream-idle timeouts hit at the final large-write step and lose the entire run's work. Append each finding **as soon as it's identified** — one rg hit + one Read confirmation = one append. One finding per write. Do NOT accumulate findings in memory and dump them at the end. - Scaffold report file before 2nd tool call. Then append one finding per write thereafter. - Per-finding entry stays small: file:line, severity (split: security = confidentiality/integrity; robustness = availability/DoS), Obvious? yes/no. One short paragraph max. If a finding's evidence is large (a long grep result, a code block), summarise it in the entry and link to file:line — don't paste it inline. @@ -222,6 +231,7 @@ This closes the loop: each audit run feeds the next. ### Setup - Pre-worktree: `git stash` or `git restore` main dir; `.claude/worktrees/` in `.gitignore`. One worktree per subagent AND one for the lessons-PR. Tear down audit worktrees after report submitted; lessons-PR worktree stays until PR merges/closes. - `cargo install --locked cargo-audit` upfront (or verify); orchestrator runs `cargo audit` directly — no agent needed. Yank-check 403s are harmless noise. +- **Pre-fetch the available severity-label set** before filing child issues — scan a couple of recent audit issues' `labels` arrays to learn what's defined (`tech-debt`, `audit`, etc.). Don't guess `security`/`robustness` exist; pass the verified label set to the issue-filing step. Severity that has no label gets captured in the title + body only, not as a filterable label. ### Quality - Quality > speed. Always thorough path.