Multi-model consensus PR review Squad + mode support in team.md #200

Merged
PureWeen merged 2 commits into main from multi-model-pr-review on Feb 23, 2026

PureWeen (Owner) commented Feb 23, 2026

Summary

Two changes to improve the Squad-based PR review workflow.

How to use

  1. Click + in the sidebar → select PR Review Squad from "📂 From Repo"
  2. Click the orchestrator session
  3. Send: Review PRs #194, #193, #191, #192, #190
  4. The orchestrator assigns 1 PR per worker, each worker dispatches 5 sub-agents across different models, and the orchestrator produces a summary table with verdicts

The mode is automatically set to Orchestrator via mode: orchestrator in team.md — no manual mode selection needed.
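The "1 PR per worker" assignment in step 4 can be pictured with a small sketch. This is only an illustration: the real orchestrator is a model following a prompt, and `assign_prs`, the worker names, and the error raised when PRs outnumber workers are all hypothetical choices made for this example.

```python
import re


def assign_prs(request: str, workers: list[str]) -> dict[str, int]:
    """Pair each PR number mentioned in the request with one worker.

    Hypothetical helper: the actual assignment is done by the
    orchestrator model, not by code like this.
    """
    # Collect "#123"-style references in the order they appear.
    numbers = [int(n) for n in re.findall(r"#(\d+)", request)]
    # Drop repeats while preserving order, so a PR is reviewed only once.
    unique = list(dict.fromkeys(numbers))
    if len(unique) > len(workers):
        # Assumption: the source does not say what happens in this case.
        raise ValueError("more PRs than available workers")
    # Only as many workers as there are PRs receive an assignment.
    return dict(zip(workers, unique))


if __name__ == "__main__":
    squad = [f"worker-{i}" for i in range(1, 6)]
    print(assign_prs("Review PRs #194, #193, #191, #192, #190", squad))
```

Workers without a PR simply get no entry, so a single-PR request occupies a single worker.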

Example output

The orchestrator produces a summary table like:

| PR | Verdict | Key Issues |
|---|---|---|
| #194 | ⚠️ Needs changes | cursor:pointer without click handler, brittle test regex |
| #193 | ⚠️ Needs changes | Incomplete state sync → infinite unthrottled renders |
| #191 | ✅ Ready to merge | Clean; concerns were pre-existing patterns |
| #192 | ⚠️ Needs changes | GC-finalization test unreliable, flaky IP test |
| #190 | ⚠️ Needs changes | Tests don't guard actual regression |

Each worker's detailed report includes only issues flagged by 2+ models (consensus filter), with file:line references and severity ratings.


1. Restructure PR Review Squad for multi-model consensus

Replaces 5 specialized reviewers (bug-hunter, security-analyst, etc.) with 5 generic reviewers that each independently perform a full multi-model consensus review.

Each worker:

  1. Fetches the PR diff via gh pr diff / gh pr view
  2. Dispatches 5 parallel sub-agent reviews across different models:
    • claude-opus-4.6 ×2 (deep bug analysis + architecture review)
    • claude-sonnet-4.6 (correctness + edge cases)
    • gemini-3-pro-preview (security focus)
    • gpt-5.3-codex (code quality + logic errors)
  3. Synthesizes findings using a consensus filter (2+ models must flag an issue)
  4. Produces a severity-ranked report with file:line references and a verdict
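The consensus filter in step 3 and the ranking in step 4 can be sketched as follows. This is a minimal illustration, not the Squad's implementation: in the PR the synthesis is done by the worker model itself. The `Finding` type, the `consensus_filter` name, matching on an exact file:line pair, and the three severity levels are all assumptions. The sketch counts each sub-agent as one vote; the source does not say whether the two claude-opus-4.6 sub-agents count separately.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumed severity scale; lower rank sorts first in the report.
SEVERITY_RANK = {"high": 0, "medium": 1, "low": 2}


@dataclass(frozen=True)
class Finding:
    reviewer: str   # label of the sub-agent that raised the issue
    file: str
    line: int
    severity: str   # one of the SEVERITY_RANK keys
    summary: str


def consensus_filter(findings, min_reviewers=2):
    """Keep locations flagged by at least `min_reviewers` distinct reviewers."""
    by_location = defaultdict(list)
    for finding in findings:
        by_location[(finding.file, finding.line)].append(finding)

    report = []
    for (file, line), group in by_location.items():
        reviewers = sorted({f.reviewer for f in group})
        if len(reviewers) < min_reviewers:
            continue  # only one reviewer flagged it: dropped as noise
        # Report the most severe rating any reviewer gave this location.
        worst = min(group, key=lambda f: SEVERITY_RANK[f.severity])
        report.append({
            "file": file,
            "line": line,
            "severity": worst.severity,
            "reviewers": reviewers,
            "summary": worst.summary,
        })
    # Severity-ranked, then stable by location.
    report.sort(key=lambda r: (SEVERITY_RANK[r["severity"]], r["file"], r["line"]))
    return report
```

Exact file:line matching is the simplest grouping rule; a model-driven synthesis can also merge findings that describe the same issue at nearby lines, which this sketch does not attempt.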

2. Support mode: field in team.md

SquadDiscovery now reads an optional mode: line from team.md to set the multi-agent mode automatically. Previously the mode was hardcoded to OrchestratorReflect, which remains the default when no mode: line is present.

Supported values: broadcast, sequential, orchestrator, orchestrator-reflect

# My Team
mode: orchestrator
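The parsing behavior described above (optional line, case-insensitive, default when absent) can be sketched as follows. This is a hedged illustration, not the project's ParseMode: the function name `parse_mode`, the string constants, and the fallback to the default for an unrecognized value are assumptions made for this example.

```python
# Mode names as listed in the PR description.
SUPPORTED_MODES = {"broadcast", "sequential", "orchestrator", "orchestrator-reflect"}
DEFAULT_MODE = "orchestrator-reflect"


def parse_mode(team_md: str) -> str:
    """Return the mode named by the first `mode:` line, or the default."""
    for raw_line in team_md.splitlines():
        line = raw_line.strip()
        # Match the key case-insensitively ("Mode:", "MODE:", ...).
        if line.lower().startswith("mode:"):
            value = line[len("mode:"):].strip().lower()
            # Assumption: an unknown value falls back to the default.
            return value if value in SUPPORTED_MODES else DEFAULT_MODE
    # No mode: line at all keeps the previous hardcoded behavior.
    return DEFAULT_MODE


if __name__ == "__main__":
    print(parse_mode("# My Team\nmode: orchestrator"))
```

Returning the default for a missing line means a team.md written before this change keeps behaving as it did.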

Tests

  • 6 new tests for ParseMode covering all modes + case insensitivity + default behavior
  • All tests pass

PureWeen and others added 2 commits February 23, 2026 10:25
Replace 5 specialized reviewers (bug-hunter, security-analyst, etc.) with
5 generic reviewers that each dispatch 5 sub-agents across different models:
2× Opus 4.6, 1× Sonnet 4.6, 1× Gemini 3 Pro, 1× Codex 5.3.

Each worker fetches the PR diff, fans out to 5 models in parallel, and
synthesizes findings using a consensus filter (2+ models must flag an issue).
The orchestrator just assigns 1 PR per worker and produces a summary table.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

SquadDiscovery now reads an optional 'mode:' line from team.md to set
the multi-agent mode (broadcast, sequential, orchestrator, or
orchestrator-reflect). Defaults to OrchestratorReflect when not specified.

PR Review Squad team.md now specifies 'mode: orchestrator' so it's
correct out of the box without manual mode selection.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen force-pushed the multi-model-pr-review branch from e0db264 to 76f65eb February 23, 2026 16:26
PureWeen merged commit 3e7382c into main Feb 23, 2026
PureWeen deleted the multi-model-pr-review branch February 23, 2026 16:32
PureWeen added a commit that referenced this pull request Mar 25, 2026
)

## Problem

The orchestrator dispatch prompt led with **"Assign ALL workers that
have relevant work"** which caused models to invent micro-tasks to fill
all 5 workers even for single-item requests. For example, a single PR
re-review would get split into:
- Worker-1: check commits
- Worker-2: do 5-model dispatch  
- Worker-3: grep one file
- Worker-4: build and test
- Worker-5: grep another file

This wastes resources and makes sessions appear stuck (all 5 workers
busy for 10+ minutes on what should be a single-worker job).

## Fix

Flip the emphasis so the **default is ONE worker** and fan-out is the
exception:
- `Assign the MINIMUM number of workers needed`
- `Do NOT split a single task into micro-tasks across workers`
- Only fan out for genuinely independent tasks (e.g., "review PR #100
and PR #200")

## History

- **Mar 10** (`a79f9f40`): Introduced aggressive fan-out with retry loop
- **Mar 12** (`9c58ebf0`): Removed retry loop, added "only relevant"
caveat
- **This PR**: Flips the default from "assign all" to "assign minimum"

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
arisng pushed a commit to arisng/PolyPilot that referenced this pull request Apr 4, 2026
…ureWeen#429)
