feat(ce-resolve-pr-feedback): drop bot noise, centralize test runs by tmchow · Pull Request #610 · EveryInc/compound-engineering-plugin

tmchow · 2026-04-20T07:26:54Z

Summary

Two wins for ce-resolve-pr-feedback, both cutting per-thread work that should happen once per invocation:

Review-bot wrapper comments no longer reach the agent. CodeRabbit, Codex, Gemini Code Assist, Copilot, and Codecov post boilerplate the skill repeatedly deliberated on and then narrated dropping. Their actionable asks live in inline review threads, which still come through untouched.
Resolver agents no longer run the full project test suite. They run only targeted tests for what they changed; the skill runs project validation once against the combined diff before commit.

Bot wrapper filter

scripts/get-pr-comments drops known review-bot and CI-bot authors from pr_comments and review_bodies in the jq pipeline, before the skill sees them:

coderabbitai, chatgpt-codex-connector, gemini-code-assist, copilot-pull-request-reviewer — AI review bots whose top-level output is wrapper text; their actionable asks are inline and still processed as review_threads.
codecov — CI summary, never actionable on its own.

The filter list is narrow by design: exact logins, not patterns, so future bots that post genuinely actionable top-level content are not suppressed preemptively. The GraphQL payload also drops url and createdAt fields that weren't consumed downstream.

The skill gains an explicit silent-drop rule for any wrapper that slips through the script — no announcing, counting, or surfacing dropped items. Removes the "ignoring the Codex bot comment" chatter from the skill's output.

Test split between resolver and orchestrator

Parallel resolvers previously ran the full project test suite in isolation, duplicating work (N × full-suite time, N × test output read by the model) and never validating the merged diff.

Resolver agent writes tests when warranted and runs only targeted tests for what it changed (specific file, test pattern, or the test it just wrote). Trivial edits (doc, comment, string-literal) skip tests entirely. If targeted tests can't be located, the agent notes that in reason and the orchestrator catches issues in the combined run. No verdict downgrade.
Orchestrator gains a new step 6 (Validate Combined State). After all resolvers return, it aggregates files_changed and runs the project's validation command once against the combined diff. Three outcomes: green proceeds to commit, red in touched files gets one inline fix attempt before escalating needs-human, red in untouched files is treated as pre-existing and noted in the commit footer.

Steps 6-9 renumbered to 7-10 to accommodate the new Validate step. Step 10 summary includes a one-line validation outcome (Validation: bun test passed (893/893)).

Test plan

bun test green (893/893). Frontmatter and converter tests would catch a malformed skill change or a broken script.
bun run release:validate in sync.
Manual verification of the bot filter against live PRs: siddharthvaddem/openscreen#471 (CodeRabbit + Codex), GoogleCloudPlatform/cluster-toolkit#5531 (Gemini), jaegertracing/jaeger-ui#3740 (Copilot + Codecov). Every bot wrapper filtered; human reviewer comments preserved in each case.

…pers, split test responsibility - Filter CodeRabbit, Codex, Gemini, Copilot, and Codecov top-level comments and review bodies in get-pr-comments. These bots put actionable feedback in inline review threads; their wrapper bodies are reliable noise and the agent repeatedly narrated ignoring them. Also trim unused url/createdAt fields from the GraphQL payload. - Add silent-drop rule so the agent stops announcing non-actionable items. - Split test responsibility: resolvers run only targeted tests for what they changed (specific file, pattern, or the test they wrote), never the full project suite. Orchestrator runs the project's validation once against the combined diff before commit. Covers pre-existing failures (note in footer), failures in touched files (one inline retry, then escalate), and the zero-changes skip path.

tmchow merged commit b35de99 into main Apr 20, 2026
2 checks passed

This was referenced Apr 20, 2026

chore: release main #596

Merged

chore: release main #654

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(ce-resolve-pr-feedback): drop bot noise, centralize test runs#610

feat(ce-resolve-pr-feedback): drop bot noise, centralize test runs#610
tmchow merged 1 commit intomainfrom
tmchow/optimize-pr-feedback

tmchow commented Apr 20, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

tmchow commented Apr 20, 2026

Summary

Bot wrapper filter

Test split between resolver and orchestrator

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant