feat: add ce:review-beta with structured persona pipeline #348

Merged: tmchow merged 3 commits into main from feat/compare-review-skills on Mar 24, 2026
@tmchow tmchow commented Mar 23, 2026

Summary

  • Adds ce:review-beta, a new beta review skill that replaces ce:review's free-form agent dispatch with a structured 6-stage pipeline adapted from the iterative-engineering plugin's code-review skill
  • Adds 8 new persona agents with confidence calibration, false-positive suppression, and structured JSON output
  • Integrates CE-specific agents (agent-native-reviewer, learnings-researcher, schema-drift-detector, deployment-verification-agent) into the pipeline

Why this is different from ce:review

The existing ce:review dispatches autonomous agents (security-sentinel, performance-oracle, etc.) that return unstructured prose reports. The orchestrator collects these reports and synthesizes them for the user.

In practice, this produces reviews where real bugs get buried in noise. An agent told to "be paranoid and leave no stone unturned" will generate 15 findings, but only 3 are real problems -- the rest are vague suggestions like "consider adding rate limiting" or defensive coding advice for values that can't actually be null. With 5+ agents running, the same unbounded query gets flagged by both the security and performance agents with different framing, so the report looks twice as long as it should. And every finding looks equally important because there's no way to distinguish "this will crash in production" from "this might theoretically be an issue at extreme scale." The user ends up manually triaging the review output itself, which defeats the purpose of automated review.

ce:review-beta fixes this with pipeline-level quality control:

| | ce:review | ce:review-beta |
|---|---|---|
| Agent output | Free-form prose (Executive Summary, Risk Matrix, etc.) | Structured JSON matching a schema (severity, confidence, evidence, file:line) |
| False positive control | None -- agents told to "be paranoid, leave no stone unturned" | Each persona has explicit "what you don't flag" anti-patterns and per-domain confidence thresholds |
| Confidence scoring | None -- every finding looks equally important | 0.0-1.0 per finding, calibrated per domain (security at 0.60 = actionable; performance at 0.60 = noise). Findings below threshold are suppressed before the user sees them |
| Multi-agent dedup | None -- overlapping findings from different agents appear as duplicates | Fingerprint-based merge: same file + line range + issue = single finding with highest severity, unioned evidence |
| Agent selection | User-configured roster via settings file | Tiered: 3 always-on + 5 conditional selected by diff content |
| Intent awareness | None -- agents review code in a vacuum | Intent discovery step before spawning; shapes how hard each reviewer looks |
| Quality gates | None | Verify actionability, ban vague language ("consider improving..."), check line accuracy, calibrate severity |
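
The confidence-scoring and dedup rows can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the `Finding` fields, the severity labels, and the exact-match fingerprint are hypothetical, and of the threshold values only "security at 0.60 is actionable, performance at 0.60 is noise" comes from the comparison above (the 0.75 and 0.70 numbers are assumptions).

```python
from dataclasses import dataclass, field, replace
from typing import List, Tuple

# Hypothetical severity ordering; the real schema's labels may differ.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}

# Assumed thresholds. Only the relative bar (security lower than
# performance, with 0.60 passing for security) is taken from the PR text.
THRESHOLDS = {"security": 0.60, "performance": 0.75}
DEFAULT_THRESHOLD = 0.70  # assumption for every other domain


@dataclass
class Finding:
    domain: str
    file: str
    lines: Tuple[int, int]
    issue: str
    severity: str
    confidence: float
    evidence: List[str] = field(default_factory=list)


def suppress_low_confidence(findings):
    """Drop findings whose confidence is below their domain's threshold."""
    return [
        f for f in findings
        if f.confidence >= THRESHOLDS.get(f.domain, DEFAULT_THRESHOLD)
    ]


def merge_duplicates(findings):
    """Merge findings that share file, line range, and issue.

    The merged finding keeps the highest severity and the union of evidence.
    """
    merged = {}
    for f in findings:
        key = (f.file, f.lines, f.issue)  # simplistic fingerprint
        kept = merged.get(key)
        if kept is None:
            # Copy so the caller's objects are not mutated.
            merged[key] = replace(f, evidence=list(f.evidence))
            continue
        if SEVERITY_RANK[f.severity] > SEVERITY_RANK[kept.severity]:
            kept.severity = f.severity
        kept.confidence = max(kept.confidence, f.confidence)
        for item in f.evidence:
            if item not in kept.evidence:
                kept.evidence.append(item)
    return list(merged.values())
```

With this sketch, an unbounded query flagged by both the security and performance reviewers on the same lines would surface once, at the higher of the two severities, with both agents' evidence attached.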

Review modes

The skill supports three execution modes, chosen by passing a mode token in the arguments:

| Mode | Trigger | Behavior |
|---|---|---|
| Interactive (default) | No mode token | Asks the user one question to establish intent before spawning reviewers. Presents findings and offers to fix. |
| Autonomous | mode:autonomous | No user interaction. Runs the full pipeline, applies only policy-allowed safe_auto fixes, re-reviews in bounded rounds, writes a run artifact, and emits residual work as file-todos for downstream resolution. |
| Report-only | mode:report-only | Strictly read-only. Reviews and reports findings, then stops. No edits, todos, commits, pushes, or PR actions. Safe for parallel execution on the same checkout (e.g., alongside browser testing). Will refuse to switch the shared checkout if given a PR or branch target -- must run in an isolated worktree or on the already checked-out branch. |

The mode design supports pipeline composition: an orchestrator can run mode:report-only in parallel with other tools on the same checkout, then follow up with mode:autonomous in a dedicated worktree for fixes.
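
As a rough sketch of how a mode token might be separated from the review target, and how the report-only checkout guard could be expressed: the function names and the "target equals current branch" check are hypothetical, and isolated-worktree detection is deliberately not modeled here.

```python
# Mode tokens as described above; anything else is treated as the target.
MODES = {
    "mode:autonomous": "autonomous",
    "mode:report-only": "report-only",
}


def parse_review_args(raw):
    """Split an argument string into (mode, target).

    With no mode token the mode defaults to "interactive".
    The target (PR number or branch name) is None when absent.
    """
    mode = "interactive"
    rest = []
    for token in raw.split():
        if token in MODES:
            mode = MODES[token]
        else:
            rest.append(token)
    target = " ".join(rest) or None
    return mode, target


def must_refuse_checkout_switch(mode, target, current_branch):
    """Report-only runs must not switch a shared checkout.

    Simplification: a target is allowed only when it is the branch that
    is already checked out.
    """
    return (
        mode == "report-only"
        and target is not None
        and target != current_branch
    )
```

For example, `parse_review_args("mode:report-only 348")` yields `("report-only", "348")`, which the guard would reject on a shared checkout of any other branch.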

The persona agents

8 new agents in agents/review/, each with:

  • Confidence calibration with per-domain thresholds (security has a lower bar because the cost of missing a vuln is high; performance has a higher bar because perf issues are easy to measure later)
  • "What you don't flag" sections that prevent common LLM false positives (e.g., security reviewer won't suggest "consider adding rate limiting" without a specific exploitable finding)
  • tools: Read, Grep, Glob, Bash frontmatter for read-only enforcement
  • Structured JSON output matching findings-schema.json

| Persona | Tier | Focus |
|---|---|---|
| correctness-reviewer | Always-on | Logic errors, edge cases, state bugs, error propagation |
| testing-reviewer | Always-on | Coverage gaps, weak assertions, brittle tests |
| maintainability-reviewer | Always-on | Coupling, complexity, naming, dead code |
| security-reviewer | Conditional | Exploitable vulnerabilities (attacker mindset, not compliance checklist) |
| performance-reviewer | Conditional | Runtime perf and scalability (N+1, unbounded memory, blocking I/O) |
| api-contract-reviewer | Conditional | Breaking changes to public interfaces |
| data-migrations-reviewer | Conditional | Migration safety, swapped IDs, dual-write, orphaned references |
| reliability-reviewer | Conditional | Failure modes, missing timeouts, error swallowing, retry storms |

The data-migrations-reviewer was enhanced beyond the iterative-engineering source with items from CE's existing data-migration-expert (swapped ID/enum mapping detection, dual-write validation, orphaned reference search).
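
To illustrate the tiering, here is a deliberately simple sketch that picks conditional personas from changed file paths. The persona names and tiers come from the table above; the regex triggers are invented for illustration, and the actual skill may select reviewers from diff content in a different way entirely.

```python
import re

ALWAYS_ON = [
    "correctness-reviewer",
    "testing-reviewer",
    "maintainability-reviewer",
]

# Hypothetical triggers: regexes matched against changed file paths.
CONDITIONAL_TRIGGERS = {
    "security-reviewer": r"(auth|session|token|password)",
    "performance-reviewer": r"(query|cache|worker)",
    "api-contract-reviewer": r"(^|/)(api|routes)/",
    "data-migrations-reviewer": r"(^|/)migrations?/",
    "reliability-reviewer": r"(retry|timeout|queue)",
}


def select_reviewers(changed_paths):
    """Return always-on personas plus any conditional ones whose
    trigger matches at least one changed path."""
    selected = list(ALWAYS_ON)
    for persona, pattern in CONDITIONAL_TRIGGERS.items():
        if any(re.search(pattern, p, re.IGNORECASE) for p in changed_paths):
            selected.append(persona)
    return selected
```

A docs-only change would get just the three always-on reviewers, while a change under a `migrations/` directory would also pull in data-migrations-reviewer.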

CE-specific adaptations

On top of the iterative-engineering base:

  • agent-native-reviewer and learnings-researcher as always-on CE agents
  • schema-drift-detector and deployment-verification-agent as conditional (migration PRs)
  • Protected artifacts: findings against docs/brainstorms/, docs/plans/, docs/solutions/ are discarded
  • Quality gates from structured review best practices
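
The protected-artifacts rule amounts to a path filter over findings. A minimal sketch, assuming each finding is a dict with a repo-relative `file` key (that shape is hypothetical; the directory list is from the bullet above):

```python
# Directories whose findings are discarded, per the list above.
PROTECTED_PREFIXES = (
    "docs/brainstorms/",
    "docs/plans/",
    "docs/solutions/",
)


def drop_protected(findings):
    """Discard findings that point into a protected directory."""
    return [
        f for f in findings
        if not f["file"].startswith(PROTECTED_PREFIXES)
    ]
```

Other paths under `docs/` are left alone; only the three listed directories are filtered.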

Test plan

  • Run /ce:review-beta on a PR with code changes -- verify structured output with confidence scores
  • Run /ce:review-beta on a PR with migration files -- verify conditional agents (data-migrations, schema-drift-detector) are selected
  • Run /ce:review-beta with no argument on a feature branch -- verify standalone mode works
  • Run /ce:review-beta <branch-name> for a branch without a PR -- verify branch mode doesn't fail
  • Run /ce:review-beta mode:autonomous -- verify no user interaction, safe_auto fixes applied, residual todos created
  • Run /ce:review-beta mode:report-only -- verify strictly read-only, no edits or mutations
  • Run /ce:review-beta mode:report-only <PR-number> -- verify it refuses to switch the shared checkout
  • Verify bun run release:validate passes

@tmchow tmchow changed the title feat: add ce:review-beta with structured persona pipeline feat: add ce:review-beta with structured persona pipeline Mar 23, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e8ebed3cba

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated
Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated
Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 01ca2b4dee

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated
Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md
Comment thread plugins/compound-engineering/agents/review/correctness-reviewer.md

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 9d42aaba2b

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated
Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated
Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 8cd6d67fff

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md
Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated
Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: e56c3fba36

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated
Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated
Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

tmchow commented Mar 23, 2026

@codex review again.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: d007aa6784

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md
Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated
Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: dc18dc1646

Comment thread src/converters/claude-to-copilot.ts Outdated
Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated
@tmchow tmchow force-pushed the feat/compare-review-skills branch from dc18dc1 to 00d7deb on March 23, 2026 18:56

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 0f882638cc

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated
Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated
Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: a721eecf9c

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated
Comment thread src/converters/claude-to-copilot.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 3febc62a74

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated
Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated
Comment thread src/converters/claude-to-kiro.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: c79ecb07f8

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md
Comment thread src/utils/agent-content.ts Outdated
Comment thread src/converters/claude-to-codex.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: db5db938c5

Comment thread src/converters/claude-to-copilot.ts Outdated
…ona agents

Add a structured, multi-agent code review skill with:
- 8 specialized reviewer persona agents (correctness, security, testing,
  maintainability, performance, reliability, api-contract, data-migrations)
- Tiered review modes: interactive, autonomous, and report-only
- Structured findings schema with severity, confidence, autofix classification
- Read-only reviewer contract with non-mutating git/gh inspection
- Promotion contract documenting path from beta to stable ce:review
@tmchow tmchow force-pushed the feat/compare-review-skills branch from 76bf361 to 70e8729 on March 24, 2026 03:02

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 70e87292c7

Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated
Comment thread plugins/compound-engineering/skills/ce-review-beta/SKILL.md Outdated
tmchow added 2 commits March 23, 2026 20:11
… in review-beta

The autonomous mode handoff creates todos/ items but didn't specify
the format. Now explicitly loads the file-todos skill and maps finding
severity to todo priority.
… FETCH_HEAD in review-beta

- Branch-mode and standalone-mode base resolution now mirrors PR-mode
  logic, resolving from the PR's actual base repository instead of
  assuming origin
- FETCH_HEAD is only read after a successful fetch to avoid stale refs
  from unrelated prior fetches
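
The FETCH_HEAD point can be illustrated with a short sketch: resolve the ref only when the fetch itself succeeded. The function name and the injectable `run` parameter are hypothetical (the latter is there so the logic can be exercised without a real repository); the skill itself expresses this as instructions rather than Python.

```python
import subprocess


def resolve_base_sha(remote, branch, run=subprocess.run):
    """Fetch `branch` from `remote` and return the fetched commit SHA.

    Returns None if the fetch fails, without consulting FETCH_HEAD.
    """
    fetch = run(
        ["git", "fetch", "--no-tags", remote, branch],
        capture_output=True,
        text=True,
    )
    if fetch.returncode != 0:
        # FETCH_HEAD may still exist from an unrelated earlier fetch,
        # so reading it here could yield a stale or wrong base.
        return None
    parsed = run(
        ["git", "rev-parse", "FETCH_HEAD"],
        capture_output=True,
        text=True,
    )
    if parsed.returncode != 0:
        return None
    return parsed.stdout.strip()
```

Callers would pass the PR's actual base remote rather than assuming `origin`, and fall back to another strategy when `None` comes back.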
@tmchow tmchow merged commit e932276 into main Mar 24, 2026
2 checks passed
@github-actions github-actions Bot mentioned this pull request Mar 24, 2026
mvanhorn added a commit to mvanhorn/compound-engineering-plugin that referenced this pull request Mar 24, 2026
Adds codex-reviewer agent for cross-model code review validation.
Rebased onto main after EveryInc#348 merge.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>