Skip to content

Sync cross-toolkit improvements: numbering, comment-aware triage, shared utilities#9

Open
devdanzin wants to merge 9 commits intomainfrom
claude/sync-review-toolkit-improvements-NG190
Open

Sync cross-toolkit improvements: numbering, comment-aware triage, shared utilities#9
devdanzin wants to merge 9 commits intomainfrom
claude/sync-review-toolkit-improvements-NG190

Conversation

@devdanzin
Copy link
Copy Markdown
Owner

Summary

Cross-toolkit sync: ports top cross-cutting improvements into cpython-review-toolkit. Stdlib-only constraint preserved — no tree-sitter, no third-party deps.

  • Global non-restarting finding numbering in commands/explore.md (Phase 3 synthesis) and commands/health.md. Findings numbered sequentially across categories (FIX 1..N, CONSIDER N+1..M, POLICY M+1..P, ACCEPTABLE P+1..Q), with the Action Plan referencing those global numbers. Rubric ported from cext-review-toolkit (issue #33).
  • New scripts/scan_common.py (stdlib-only): extract_nearby_comments (regex-based for /* ... */ and // over source text), has_safety_annotation, make_finding, load_json_data (with WARNING: stderr on failure). Tuned _SAFETY_KEYWORDS for CPython C review vocabulary (gil held, already locked, refcount safe, etc.).
  • Annotation-aware suppression wired into 4 scanners: scan_refcounts.py, scan_error_paths.py, scan_null_checks.py, scan_gil_usage.py. Each consults nearby comments on every candidate finding; noqa suppresses, other safety keywords downgrade confidence to low.
  • Agent prompts updated (refcount-auditor, error-path-analyzer, null-safety-scanner, gil-discipline-checker) with new ## Safety Annotations sections documenting the vocabulary so reviewers know what suppresses a finding.
  • --max-files flag on measure_c_complexity.py, check_pep7.py, analyze_includes.py docstrings (already supported in codepaths).
  • Tests: new test_scan_common.py (16 tests); extended scanner test files with safety-annotation cases.

Ports from

  • ft-review-toolkit — comment-aware triage helpers + silent-failure pattern (adapted from tree-sitter → regex for stdlib-only).
  • cext-review-toolkit — global non-restarting numbering rubric.
  • code-review-toolkit--max-files memory optimization.

Test plan

  • python -m unittest discover tests — 131 tests pass.
  • New scan_common helpers covered by 16 unit tests.
  • Safety-annotation cases added to each wired scanner's test file.

Do not merge — open for review.

devdanzin and others added 9 commits April 17, 2026 16:40
…red utilities

- Global non-restarting finding numbering in explore.md synthesis
- scan_common.py (stdlib-only): extract_nearby_comments, has_safety_annotation,
  make_finding, load_json_data with stderr warnings
- Wire annotation-aware suppression into 4 scanners
- Harden test assertions to catch silent-failure / empty-data bugs
- --max-files flag on all file-walking scanners

Ported patterns from ft-review-toolkit (comment-aware triage, silent-failure
detection) and code-review-toolkit (memory optimizations).
Complements the MCP-pushed scan_common.py and agent-prompt updates with
the remaining integration work:

- scan_refcounts/error_paths/null_checks/gil_usage.py now call
  extract_nearby_comments + has_safety_annotation on each candidate
  finding and suppress (noqa / SAFETY:) or downgrade confidence.
- tests/test_scan_common.py covers the new helpers.
- Extended test_scan_{refcounts,error_paths,null_checks,gil_usage}.py
  with safety-annotation cases.
…ions

Follow-on prompt-only sync pass (companion to PR #9 code/test changes).

Scope 1 — operational footer for script-backed agents:
Appends a "## Running the script" block covering Bash timeout (300000 ms),
unique /tmp JSON filename with $$ PID suffix, --max-files/--workers
forwarding, and no-retry-on-timeout fallback to Grep/Read. Applied to
the 7 script-backed agents:
  - refcount-auditor
  - error-path-analyzer
  - null-safety-scanner
  - gil-discipline-checker
  - c-complexity-analyzer
  - pep7-style-checker
  - include-graph-mapper
git-history-analyzer already carries equivalent guidance from v0.4.0,
so it is intentionally skipped for Scope 1.

Scope 2 — confidence definitions:
Appends a "## Confidence" block defining HIGH (>=90%), MEDIUM (70-89%),
and LOW (50-69%) likelihoods for agents that use HIGH/MEDIUM/LOW as a
confidence axis in findings:
  - refcount-auditor
  - error-path-analyzer
  - null-safety-scanner
  - gil-discipline-checker
  - git-history-analyzer
Agents without a HIGH/MEDIUM/LOW confidence axis (api-deprecation-tracker,
c-complexity-analyzer, include-graph-mapper, macro-hygiene-reviewer,
memory-pattern-analyzer, pep7-style-checker) are untouched by Scope 2.
Copy link
Copy Markdown
Owner Author

Follow-on prompt-only scaffolding commit pushed: 9f19369 on claude/sync-review-toolkit-improvements-NG190.

Scope 1 — ## Running the script footer (Bash timeout 300000 ms, unique /tmp/<slug>_<scope>_$$.json filename, --max-files / --workers forwarding, no-retry-on-timeout fallback to Grep/Read) appended to the 7 script-backed agents:

  • refcount-auditor
  • error-path-analyzer
  • null-safety-scanner
  • gil-discipline-checker
  • c-complexity-analyzer
  • pep7-style-checker
  • include-graph-mapper

git-history-analyzer already carries equivalent guidance under Phase 1 from the v0.4.0 work, so it is intentionally skipped for Scope 1.

Scope 2 — ## Confidence block (HIGH ≥90%, MEDIUM 70–89%, LOW 50–69%; findings below LOW not reported) appended to the 5 agents that use HIGH/MEDIUM/LOW as a confidence axis in findings:

  • refcount-auditor
  • error-path-analyzer
  • null-safety-scanner
  • gil-discipline-checker
  • git-history-analyzer

Agents without a HIGH/MEDIUM/LOW confidence axis (api-deprecation-tracker, c-complexity-analyzer, include-graph-mapper, macro-hygiene-reviewer, memory-pattern-analyzer, pep7-style-checker) are untouched by Scope 2 — they use FIX/CONSIDER/POLICY/ACCEPTABLE severity labels rather than a numeric confidence axis.

Scope 3 (git-history-analyzer methodology) was already complete and required no changes.

8 files touched, +89 lines, no code/test changes. Not merging.


Generated by Claude Code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants