Sync cross-toolkit improvements: numbering, comment-aware triage, shared utilities #9
…red utilities
- Global non-restarting finding numbering in explore.md synthesis
- scan_common.py (stdlib-only): extract_nearby_comments, has_safety_annotation, make_finding, load_json_data with stderr warnings
- Wire annotation-aware suppression into 4 scanners
- Harden test assertions to catch silent-failure / empty-data bugs
- --max-files flag on all file-walking scanners

Ported patterns from ft-review-toolkit (comment-aware triage, silent-failure detection) and code-review-toolkit (memory optimizations).
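The "load_json_data with stderr warnings" item is about avoiding silent failures: an unreadable or malformed data file should be visible in the run output rather than quietly becoming empty data. A minimal sketch of that idea follows. The signature, the `default` parameter, and the exact message wording are assumptions for illustration; the real helper in scan_common.py may differ.

```python
import json
import sys


def load_json_data(path, default=None):
    """Load JSON from `path`; on failure, warn on stderr and return `default`.

    Hypothetical signature: the PR only says the helper emits a
    "WARNING:" line on stderr when loading fails.
    """
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except (OSError, ValueError) as exc:
        # json.JSONDecodeError and UnicodeDecodeError are both ValueError
        # subclasses; OSError covers missing or unreadable files.
        print(f"WARNING: could not load {path}: {exc}", file=sys.stderr)
        return default
```

A test can then assert on the returned value being non-empty, so a broken fixture shows up as a failure instead of passing vacuously.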
…mplexity docstrings
Complements the MCP-pushed scan_common.py and agent-prompt updates with
the remaining integration work:
- scan_refcounts/error_paths/null_checks/gil_usage.py now call
extract_nearby_comments + has_safety_annotation on each candidate
finding and suppress (noqa / SAFETY:) or downgrade confidence.
- tests/test_scan_common.py covers the new helpers.
- Extended test_scan_{refcounts,error_paths,null_checks,gil_usage}.py
with safety-annotation cases.
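The suppress-or-downgrade flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the toolkit's code: the function signatures, the 3-line window, the `triage` wrapper, the finding dictionary shape, and the keyword tuple are all hypothetical (the PR names only a few of the real keywords), and treating "SAFETY:" as a downgrade rather than a suppression is a guess.

```python
import re

# Assumed vocabulary; the real _SAFETY_KEYWORDS list is longer.
_SAFETY_KEYWORDS = ("safety:", "gil held", "already locked", "refcount safe")

# Matches /* ... */ block comments (possibly multi-line) and // line comments.
# Being regex-based, it will also match comment-like text inside string literals.
_COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)


def extract_nearby_comments(source, line, window=3):
    """Return comments overlapping lines [line - window, line + window] (1-based)."""
    comments = []
    for m in _COMMENT_RE.finditer(source):
        start = source.count("\n", 0, m.start()) + 1
        end = source.count("\n", 0, m.end()) + 1
        if start <= line + window and end >= line - window:
            comments.append(m.group(0))
    return comments


def has_safety_annotation(comments):
    """True if any comment contains one of the assumed safety keywords."""
    text = " ".join(comments).lower()
    return any(keyword in text for keyword in _SAFETY_KEYWORDS)


def triage(finding, source):
    """Return None to suppress a finding, else the finding (possibly downgraded)."""
    comments = extract_nearby_comments(source, finding["line"])
    if any("noqa" in c.lower() for c in comments):
        return None  # explicit opt-out: drop the finding
    if has_safety_annotation(comments):
        return {**finding, "confidence": "low"}  # keep it, but downgrade
    return finding
```

Downgrading instead of dropping keeps annotated findings visible to a reviewer while lowering their priority; only an explicit noqa removes a finding entirely.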
…ions
Follow-on prompt-only sync pass (companion to PR #9 code/test changes).

Scope 1: operational footer for script-backed agents. Appends a "## Running the script" block covering Bash timeout (300000 ms), unique /tmp JSON filename with $$ PID suffix, --max-files/--workers forwarding, and no-retry-on-timeout fallback to Grep/Read. Applied to the 7 script-backed agents:
- refcount-auditor
- error-path-analyzer
- null-safety-scanner
- gil-discipline-checker
- c-complexity-analyzer
- pep7-style-checker
- include-graph-mapper

git-history-analyzer already carries equivalent guidance from v0.4.0, so it is intentionally skipped for Scope 1.

Scope 2: confidence definitions. Appends a "## Confidence" block defining HIGH (>=90%), MEDIUM (70-89%), and LOW (50-69%) likelihoods for agents that use HIGH/MEDIUM/LOW as a confidence axis in findings:
- refcount-auditor
- error-path-analyzer
- null-safety-scanner
- gil-discipline-checker
- git-history-analyzer

Agents without a HIGH/MEDIUM/LOW confidence axis (api-deprecation-tracker, c-complexity-analyzer, include-graph-mapper, macro-hygiene-reviewer, memory-pattern-analyzer, pep7-style-checker) are untouched by Scope 2.
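The Scope 2 thresholds can be read as a simple bucketing rule. The sketch below is only an illustration of those bands; the function name is hypothetical, and returning None below 50% is an assumption, since the commit message does not say what happens to likelihoods under the LOW band.

```python
def confidence_label(likelihood):
    """Map a likelihood percentage (0-100) to a confidence label.

    Bands follow the commit message: HIGH >= 90, MEDIUM 70-89, LOW 50-69.
    Returning None below 50 is an assumption, not part of the source.
    """
    if likelihood >= 90:
        return "HIGH"
    if likelihood >= 70:
        return "MEDIUM"
    if likelihood >= 50:
        return "LOW"
    return None
```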
Follow-on prompt-only scaffolding commit pushed, covering Scope 1 (operational footer) and Scope 2 (confidence definitions) as described in the commit message above. Agents without a HIGH/MEDIUM/LOW confidence axis are untouched. Scope 3 (git-history-analyzer methodology) was already complete and required no changes. 8 files touched, +89 lines, no code/test changes. Not merging.
Generated by Claude Code
Summary
Cross-toolkit sync: ports top cross-cutting improvements into cpython-review-toolkit. Stdlib-only constraint preserved: no tree-sitter, no third-party deps.
- commands/explore.md (Phase 3 synthesis) and commands/health.md: findings numbered sequentially across categories (FIX 1..N, CONSIDER N+1..M, POLICY M+1..P, ACCEPTABLE P+1..Q), with the Action Plan referencing those global numbers. Rubric ported from cext-review-toolkit (issue #33).
- scripts/scan_common.py (stdlib-only): extract_nearby_comments (regex-based for /* ... */ and // over source text), has_safety_annotation, make_finding, load_json_data (with WARNING: stderr on failure). Tuned _SAFETY_KEYWORDS for CPython C review vocabulary (gil held, already locked, refcount safe, etc.).
- scan_refcounts.py, scan_error_paths.py, scan_null_checks.py, scan_gil_usage.py: each consults nearby comments on every candidate finding; noqa suppresses, other safety keywords downgrade confidence to low.
- Agent prompts (refcount-auditor, error-path-analyzer, null-safety-scanner, gil-discipline-checker) with new "## Safety Annotations" sections documenting the vocabulary so reviewers know what suppresses a finding.
- --max-files flag documented in the measure_c_complexity.py, check_pep7.py, analyze_includes.py docstrings (already supported in codepaths).
- test_scan_common.py (16 tests); extended scanner test files with safety-annotation cases.

Ports from
- ft-review-toolkit: comment-aware triage helpers + silent-failure pattern (adapted from tree-sitter → regex for stdlib-only).
- cext-review-toolkit: global non-restarting numbering rubric.
- code-review-toolkit: --max-files memory optimization.

Test plan
- python -m unittest discover tests: 131 tests pass.
- scan_common helpers covered by 16 unit tests.

Do not merge; open for review.
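For reviewers unfamiliar with the numbering rubric in the Summary, the rule is that a single counter runs across all categories instead of restarting at 1 in each. Since the rubric lives in prompt files rather than code, the sketch below is purely illustrative; the function name and the input shape (a dict of category to finding titles) are hypothetical.

```python
from itertools import count

# Category order as listed in the Summary.
CATEGORY_ORDER = ("FIX", "CONSIDER", "POLICY", "ACCEPTABLE")


def number_findings(findings_by_category):
    """Assign one running number across all categories, in CATEGORY_ORDER."""
    counter = count(1)  # shared counter: never restarts between categories
    numbered = []
    for category in CATEGORY_ORDER:
        for title in findings_by_category.get(category, []):
            numbered.append((next(counter), category, title))
    return numbered
```

With two FIX findings and one CONSIDER finding, the CONSIDER item becomes number 3, so an Action Plan entry such as "address 1 and 3" is unambiguous.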