feat: add `ce:review-beta` with structured persona pipeline by tmchow · Pull Request #348 · EveryInc/compound-engineering-plugin

tmchow · 2026-03-23T07:19:01Z

Summary

Adds ce:review-beta, a new beta review skill that replaces ce:review's free-form agent dispatch with a structured 6-stage pipeline adapted from the iterative-engineering plugin's code-review skill
Adds 8 new persona agents with confidence calibration, false-positive suppression, and structured JSON output
Integrates CE-specific agents (agent-native-reviewer, learnings-researcher, schema-drift-detector, deployment-verification-agent) into the pipeline

Why this is different from ce:review

The existing ce:review dispatches autonomous agents (security-sentinel, performance-oracle, etc.) that return unstructured prose reports. The orchestrator collects these reports and synthesizes them for the user.

In practice, this produces reviews where real bugs get buried in noise. An agent told to "be paranoid and leave no stone unturned" will generate 15 findings, but only 3 are real problems -- the rest are vague suggestions like "consider adding rate limiting" or defensive coding advice for values that can't actually be null. With 5+ agents running, the same unbounded query gets flagged by both the security and performance agents with different framing, so the report looks twice as long as it should. And every finding looks equally important because there's no way to distinguish "this will crash in production" from "this might theoretically be an issue at extreme scale." The user ends up manually triaging the review output itself, which defeats the purpose of automated review.

ce:review-beta fixes this with pipeline-level quality control:

| Aspect | ce:review | ce:review-beta |
| --- | --- | --- |
| Agent output | Free-form prose (Executive Summary, Risk Matrix, etc.) | Structured JSON matching a schema (severity, confidence, evidence, file:line) |
| False positive control | None -- agents told to "be paranoid, leave no stone unturned" | Each persona has explicit "what you don't flag" anti-patterns and per-domain confidence thresholds |
| Confidence scoring | None -- every finding looks equally important | 0.0-1.0 per finding, calibrated per domain (security at 0.60 = actionable; performance at 0.60 = noise). Findings below threshold are suppressed before the user sees them |
| Multi-agent dedup | None -- overlapping findings from different agents appear as duplicates | Fingerprint-based merge: same file + line range + issue = single finding with highest severity, unioned evidence |
| Agent selection | User-configured roster via settings file | Tiered: 3 always-on + 5 conditional selected by diff content |
| Intent awareness | None -- agents review code in a vacuum | Intent discovery step before spawning; shapes how hard each reviewer looks |
| Quality gates | None | Verify actionability, ban vague language ("consider improving..."), check line accuracy, calibrate severity |
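
The "Multi-agent dedup" row can be illustrated with a minimal sketch. This is not the skill's actual implementation: the `Finding` fields, the `SEVERITY_RANK` labels, and the function names are assumptions made for illustration. Only the rule itself (same file + line range + issue collapses to one finding with the highest severity and unioned evidence) comes from the description above.

```python
from dataclasses import dataclass, field

# Hypothetical severity labels; the PR description does not list the real ones.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


@dataclass
class Finding:
    # Field names are illustrative, not taken from findings-schema.json.
    file: str
    start_line: int
    end_line: int
    issue: str
    severity: str
    evidence: list[str] = field(default_factory=list)


def fingerprint(finding: Finding) -> tuple[str, int, int, str]:
    """Same file + line range + issue yields the same fingerprint."""
    return (finding.file, finding.start_line, finding.end_line, finding.issue)


def merge_findings(findings: list[Finding]) -> list[Finding]:
    """Collapse findings that share a fingerprint into a single finding."""
    merged: dict[tuple[str, int, int, str], Finding] = {}
    for f in findings:
        key = fingerprint(f)
        kept = merged.get(key)
        if kept is None:
            # First time this fingerprint is seen: keep a copy.
            merged[key] = Finding(
                f.file, f.start_line, f.end_line, f.issue, f.severity, list(f.evidence)
            )
            continue
        # Keep the highest severity reported by any reviewer.
        if SEVERITY_RANK[f.severity] > SEVERITY_RANK[kept.severity]:
            kept.severity = f.severity
        # Union the evidence, preserving first-seen order.
        for item in f.evidence:
            if item not in kept.evidence:
                kept.evidence.append(item)
    return list(merged.values())
```

With this sketch, an unbounded query flagged by both the security and the performance reviewer on the same lines would surface once, carrying both reviewers' evidence.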

Review modes

The skill supports three execution modes, chosen by passing a mode token in the arguments:

| Mode | Trigger | Behavior |
| --- | --- | --- |
| Interactive (default) | No mode token | Asks the user one question to establish intent before spawning reviewers. Presents findings and offers to fix. |
| Autonomous | `mode:autonomous` | No user interaction. Runs the full pipeline, applies only policy-allowed `safe_auto` fixes, re-reviews in bounded rounds, writes a run artifact, and emits residual work as `file-todos` for downstream resolution. |
| Report-only | `mode:report-only` | Strictly read-only. Reviews and reports findings, then stops. No edits, todos, commits, pushes, or PR actions. Safe for parallel execution on the same checkout (e.g., alongside browser testing). Will refuse to switch the shared checkout if given a PR or branch target -- must run in an isolated worktree or on the already checked-out branch. |

The mode design supports pipeline composition: an orchestrator can run `mode:report-only` in parallel with other tools on the same checkout, then follow up with `mode:autonomous` in a dedicated worktree for fixes.

The persona agents

8 new agents in agents/review/, each with:

Confidence calibration with per-domain thresholds (security has a lower bar because the cost of missing a vuln is high; performance has a higher bar because perf issues are easy to measure later)
"What you don't flag" sections that prevent common LLM false positives (e.g., security reviewer won't suggest "consider adding rate limiting" without a specific exploitable finding)
`tools: Read, Grep, Glob, Bash` frontmatter for read-only enforcement
Structured JSON output matching findings-schema.json
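
The per-domain calibration in the list above could be applied roughly as follows. The threshold numbers and the `domain` field are placeholders: the PR description only says that security at 0.60 is actionable, that performance at 0.60 is noise, and that findings below threshold are suppressed. The actual values and the shape of findings-schema.json are not shown here.

```python
# Hypothetical thresholds. Only the relationship (security bar lower than
# performance bar, security at 0.60 kept, performance at 0.60 dropped)
# is taken from the PR description.
THRESHOLDS = {"security": 0.60, "performance": 0.80}
DEFAULT_THRESHOLD = 0.70  # assumption for domains not listed above


def suppress_low_confidence(findings: list[dict]) -> list[dict]:
    """Keep only findings whose confidence meets their domain's threshold."""
    kept = []
    for finding in findings:
        threshold = THRESHOLDS.get(finding["domain"], DEFAULT_THRESHOLD)
        if finding["confidence"] >= threshold:
            kept.append(finding)
    return kept
```

The same 0.60 score therefore leads to different outcomes depending on which persona produced it, which is the point of calibrating per domain instead of using one global cutoff.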

| Persona | Tier | Focus |
| --- | --- | --- |
| correctness-reviewer | Always-on | Logic errors, edge cases, state bugs, error propagation |
| testing-reviewer | Always-on | Coverage gaps, weak assertions, brittle tests |
| maintainability-reviewer | Always-on | Coupling, complexity, naming, dead code |
| security-reviewer | Conditional | Exploitable vulnerabilities (attacker mindset, not compliance checklist) |
| performance-reviewer | Conditional | Runtime perf and scalability (N+1, unbounded memory, blocking I/O) |
| api-contract-reviewer | Conditional | Breaking changes to public interfaces |
| data-migrations-reviewer | Conditional | Migration safety, swapped IDs, dual-write, orphaned references |
| reliability-reviewer | Conditional | Failure modes, missing timeouts, error swallowing, retry storms |

The data-migrations-reviewer was enhanced beyond the iterative-engineering source with items from CE's existing data-migration-expert (swapped ID/enum mapping detection, dual-write validation, orphaned reference search).
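
The tiered roster (3 always-on personas plus 5 conditional ones selected by diff content) can be sketched as below. The trigger patterns are invented for illustration, and matching on changed file paths is a stand-in for however the skill actually inspects the diff. Only the persona names and their tiers come from the table above.

```python
import re

ALWAYS_ON = ["correctness-reviewer", "testing-reviewer", "maintainability-reviewer"]

# Hypothetical path-based triggers; the real selection criteria are not
# specified in the PR description.
CONDITIONAL_TRIGGERS = {
    "security-reviewer": r"auth|session|token|password",
    "performance-reviewer": r"query|cache|\.sql$",
    "api-contract-reviewer": r"(^|/)api/|routes|openapi",
    "data-migrations-reviewer": r"(^|/)migrat",
    "reliability-reviewer": r"retry|timeout|queue|worker|job",
}


def select_reviewers(changed_paths: list[str]) -> list[str]:
    """Return always-on personas plus any conditional persona whose trigger matches."""
    selected = list(ALWAYS_ON)
    for persona, pattern in CONDITIONAL_TRIGGERS.items():
        if any(re.search(pattern, path, re.IGNORECASE) for path in changed_paths):
            selected.append(persona)
    return selected
```

In this sketch a docs-only change gets just the three always-on reviewers, while a change touching a migration file also pulls in `data-migrations-reviewer`.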

CE-specific adaptations

On top of the iterative-engineering base:

agent-native-reviewer and learnings-researcher as always-on CE agents
schema-drift-detector and deployment-verification-agent as conditional (migration PRs)
Protected artifacts: findings against docs/brainstorms/, docs/plans/, docs/solutions/ are discarded
Quality gates from structured review best practices
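
The protected-artifacts rule above amounts to a path filter over the merged findings. A minimal sketch, assuming each finding is a dict with a `file` key (the real schema is not shown) and that paths are repo-relative POSIX paths:

```python
from pathlib import PurePosixPath

# Directories named in the PR description as protected artifacts.
PROTECTED_DIRS = ("docs/brainstorms", "docs/plans", "docs/solutions")


def is_protected(path: str) -> bool:
    """True when `path` lives under one of the protected directories."""
    candidate = PurePosixPath(path)
    return any(candidate.is_relative_to(d) for d in PROTECTED_DIRS)


def discard_protected(findings: list[dict]) -> list[dict]:
    """Drop findings raised against protected artifact paths."""
    return [f for f in findings if not is_protected(f["file"])]
```

`PurePosixPath.is_relative_to` requires Python 3.9 or later. It compares whole path components, so a sibling such as `docs/plans-archive/` would not be treated as protected.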

Test plan

Run /ce:review-beta on a PR with code changes -- verify structured output with confidence scores
Run /ce:review-beta on a PR with migration files -- verify conditional agents (data-migrations, schema-drift-detector) are selected
Run /ce:review-beta with no argument on a feature branch -- verify standalone mode works
Run /ce:review-beta <branch-name> for a branch without a PR -- verify branch mode doesn't fail
Run /ce:review-beta mode:autonomous -- verify no user interaction, safe_auto fixes applied, residual todos created
Run /ce:review-beta mode:report-only -- verify strictly read-only, no edits or mutations
Run /ce:review-beta mode:report-only <PR-number> -- verify it refuses to switch the shared checkout
Verify bun run release:validate passes

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e8ebed3cba

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 01ca2b4dee

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9d42aaba2b

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8cd6d67fff

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e56c3fba36

tmchow · 2026-03-23T18:17:51Z

@codex review again.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d007aa6784

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dc18dc1646

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0f882638cc

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a721eecf9c

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3febc62a74

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c79ecb07f8

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: db5db938c5

…ona agents

Add a structured, multi-agent code review skill with:

- 8 specialized reviewer persona agents (correctness, security, testing, maintainability, performance, reliability, api-contract, data-migrations)
- Tiered review modes: interactive, autonomous, and report-only
- Structured findings schema with severity, confidence, autofix classification
- Read-only reviewer contract with non-mutating git/gh inspection
- Promotion contract documenting path from beta to stable ce:review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 70e87292c7

… FETCH_HEAD in review-beta

- Branch-mode and standalone-mode base resolution now mirrors PR-mode logic, resolving from the PR's actual base repository instead of assuming origin
- FETCH_HEAD is only read after a successful fetch to avoid stale refs from unrelated prior fetches

Adds codex-reviewer agent for cross-model code review validation. Rebased onto main after EveryInc#348 merge. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

tmchow changed the title ~~feat: add ce:review-beta with structured persona pipeline~~ feat: add `ce:review-beta` with structured persona pipeline Mar 23, 2026

chatgpt-codex-connector Bot reviewed Mar 23, 2026

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

chatgpt-codex-connector Bot reviewed Mar 23, 2026

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md

Comment thread plugins/compound-engineering/agents/review/correctness-reviewer.md

chatgpt-codex-connector Bot reviewed Mar 23, 2026

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

chatgpt-codex-connector Bot reviewed Mar 23, 2026

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

tmchow mentioned this pull request Mar 23, 2026

feat(review): add codex-reviewer agent for cross-model validation #352

Closed

chatgpt-codex-connector Bot reviewed Mar 23, 2026

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

chatgpt-codex-connector Bot reviewed Mar 23, 2026

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

chatgpt-codex-connector Bot reviewed Mar 23, 2026

Comment thread src/converters/claude-to-copilot.ts Outdated

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

tmchow force-pushed the feat/compare-review-skills branch from dc18dc1 to 00d7deb on March 23, 2026 18:56

chatgpt-codex-connector Bot reviewed Mar 23, 2026

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

chatgpt-codex-connector Bot reviewed Mar 23, 2026

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

Comment thread src/converters/claude-to-copilot.ts Outdated

chatgpt-codex-connector Bot reviewed Mar 23, 2026

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

Comment thread src/converters/claude-to-kiro.ts Outdated

chatgpt-codex-connector Bot reviewed Mar 23, 2026

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md

Comment thread src/utils/agent-content.ts Outdated

Comment thread src/converters/claude-to-codex.ts Outdated

mvanhorn mentioned this pull request Mar 24, 2026

feat(review): integrate codex-reviewer into stable review pipeline #356

Open

chatgpt-codex-connector Bot reviewed Mar 24, 2026

Comment thread src/converters/claude-to-copilot.ts Outdated

tmchow force-pushed the feat/compare-review-skills branch from 76bf361 to 70e8729 on March 24, 2026 03:02

chatgpt-codex-connector Bot reviewed Mar 24, 2026

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

tmchow added 2 commits March 23, 2026 20:11

fix(compound-engineering): reference file-todos skill for todo format…

8e41755

… in review-beta The autonomous mode handoff creates todos/ items but didn't specify the format. Now explicitly loads the file-todos skill and maps finding severity to todo priority.

tmchow added a commit that referenced this pull request Mar 24, 2026

docs: update plan with PR #348 alignment and 2-bucket routing

a209cac

tmchow mentioned this pull request Mar 24, 2026

feat: redesign document-review skill with persona-based review #359

Merged

5 tasks

tmchow merged commit e932276 into main Mar 24, 2026
2 checks passed

github-actions Bot mentioned this pull request Mar 24, 2026

chore: release main #360

Merged
