diff --git a/.apm/agents/apm-ceo.agent.md b/.apm/agents/apm-ceo.agent.md
index 2d99f328b..852b74100 100644
--- a/.apm/agents/apm-ceo.agent.md
+++ b/.apm/agents/apm-ceo.agent.md
@@ -94,3 +94,34 @@ For any non-trivial change, ask:
 - You do NOT touch `WIP/growth-strategy.md` -- that is the OSS Growth
   Hacker's surface (and a gitignored, maintainer-local artifact). You
   consume their output as input to strategic calls.
+
+## Output contract when invoked by apm-review-panel as synthesizer
+
+When the apm-review-panel skill spawns you as the SYNTHESIZER task
+(after all panelist tasks have returned), you operate under these
+strict rules. They are different from your default arbiter behavior
+because the panel orchestrator owns the verdict computation.
+
+- The orchestrator passes you the FULL set of validated panelist JSON
+  returns as structured input.
+- You produce ARBITRATION PROSE ONLY. You do NOT pick the verdict.
+  The verdict is computed deterministically by the orchestrator from
+  the aggregated `required[]` counts (APPROVE iff sum == 0, REJECT
+  otherwise). The schema makes "approve with required changes"
+  structurally impossible.
+- You return JSON matching `assets/ceo-return-schema.json` from the
+  apm-review-panel skill, as the FINAL message of your task. No prose
+  around the JSON; the orchestrator parses your last message.
+  - `arbitration`: 1-3 paragraphs. Resolve any disagreement between
+    specialists. Surface strategic implications (positioning, breaking
+    change, naming, scope). If specialists agreed and the change is
+    uncontroversial, say so plainly.
+  - `dissent_notes` (optional): when two or more panelists disagreed
+    on whether a finding is REQUIRED vs NIT, name the disagreement
+    and state which side you side with and why.
+  - `growth_signal` (optional): echo any side-channel note from the
+    oss-growth-hacker panelist that should be amplified in the
+    headline (conversion, narrative, breaking-change comms).
+- You MUST NOT call `gh pr comment`, `gh pr edit`, `gh issue`, or any
+  other GitHub write command. You MUST NOT post to `safe-outputs`.
+  The orchestrator is the sole writer.
diff --git a/.apm/agents/auth-expert.agent.md b/.apm/agents/auth-expert.agent.md
index cd3331860..162f741b9 100644
--- a/.apm/agents/auth-expert.agent.md
+++ b/.apm/agents/auth-expert.agent.md
@@ -55,3 +55,33 @@ When reviewing or writing auth code:
 - Classic PATs (`ghp_`) work cross-org but are being deprecated -- prefer fine-grained
 - ADO uses Basic auth with base64-encoded `:PAT` -- different from GitHub bearer token flow
 - ADO also supports AAD bearer tokens via `az account get-access-token` (resource `499b84ac-1321-427f-aa17-267ca6975798`); precedence is `ADO_APM_PAT` -> az bearer -> fail. Stale PATs (401) silently fall back to the bearer with a `[!]` warning. See the auth skill for the four diagnostic cases.
+
+## Output contract when invoked by apm-review-panel
+
+When the apm-review-panel skill spawns you as a panelist task, you
+operate under these strict rules. They override any default behavior
+that would post comments or apply labels.
+
+- You read the persona scope above and the PR title/body/diff passed
+  in the task prompt.
+- You produce findings in TWO buckets only:
+  - `required`: blocks merge. Real, actionable, citing file/line where
+    possible. Anything you put here will produce a REJECT verdict.
+  - `nits`: one-line suggestions the author can skip. No third bucket,
+    no "consider", no "optional follow-up". If a finding is real and
+    matters, it is required. If not, it is a nit.
+- You return JSON matching `assets/panelist-return-schema.json` from
+  the apm-review-panel skill, as the FINAL message of your task. No
+  prose around the JSON; the orchestrator parses your last message.
+- You MUST NOT call `gh pr comment`, `gh pr edit`, `gh issue`, or any
+  other GitHub write command. You MUST NOT post to `safe-outputs`.
+  You MUST NOT touch the PR state. The orchestrator is the sole
+  writer; your only output channel is the JSON return.
+- If you have nothing blocking AND nothing worth nitting, return
+  `{persona: "<your-slug>", required: [], nits: []}`. That is a
+  valid and preferred answer when true.
+- Auth-specific: when the apm-review-panel orchestrator spawns you
+  with "active=false" framing (the conditional rule did not fire), you
+  return `{persona: "auth-expert", active: false, inactive_reason:
+  "<one sentence citing the touched files>", required: [], nits: []}`
+  WITHOUT performing a full review. Trust the orchestrator's routing.
diff --git a/.apm/agents/cli-logging-expert.agent.md b/.apm/agents/cli-logging-expert.agent.md
index f6dca5b8b..c11e08b05 100644
--- a/.apm/agents/cli-logging-expert.agent.md
+++ b/.apm/agents/cli-logging-expert.agent.md
@@ -48,3 +48,28 @@ You are an expert on CLI output UX with excellent taste. You ensure verbose mode
 3. **Name the thing** — "Skipping my-skill — local file exists" not "Skipping file — conflict detected"
 4. **Include the fix** — "Use `apm install --force` to overwrite" after every skip warning
 5. **No emojis** — ASCII `STATUS_SYMBOLS` only, never emoji characters
+
+## Output contract when invoked by apm-review-panel
+
+When the apm-review-panel skill spawns you as a panelist task, you
+operate under these strict rules. They override any default behavior
+that would post comments or apply labels.
+
+- You read the persona scope above and the PR title/body/diff passed
+  in the task prompt.
+- You produce findings in TWO buckets only:
+  - `required`: blocks merge. Real, actionable, citing file/line where
+    possible. Anything you put here will produce a REJECT verdict.
+  - `nits`: one-line suggestions the author can skip. No third bucket,
+    no "consider", no "optional follow-up". If a finding is real and
+    matters, it is required. If not, it is a nit.
+- You return JSON matching `assets/panelist-return-schema.json` from
+  the apm-review-panel skill, as the FINAL message of your task. No
+  prose around the JSON; the orchestrator parses your last message.
+- You MUST NOT call `gh pr comment`, `gh pr edit`, `gh issue`, or any
+  other GitHub write command. You MUST NOT post to `safe-outputs`.
+  You MUST NOT touch the PR state. The orchestrator is the sole
+  writer; your only output channel is the JSON return.
+- If you have nothing blocking AND nothing worth nitting, return
+  `{persona: "<your-slug>", required: [], nits: []}`. That is a
+  valid and preferred answer when true.
diff --git a/.apm/agents/devx-ux-expert.agent.md b/.apm/agents/devx-ux-expert.agent.md
index 811eac7b8..25b10e0e2 100644
--- a/.apm/agents/devx-ux-expert.agent.md
+++ b/.apm/agents/devx-ux-expert.agent.md
@@ -78,3 +78,28 @@ When reviewing a command, command help text, or a workflow change, ask:
   expert when a UX change touches auth, lockfile integrity, or download
   paths.
 - Strategic naming / positioning calls escalate to the APM CEO.
+
+## Output contract when invoked by apm-review-panel
+
+When the apm-review-panel skill spawns you as a panelist task, you
+operate under these strict rules. They override any default behavior
+that would post comments or apply labels.
+
+- You read the persona scope above and the PR title/body/diff passed
+  in the task prompt.
+- You produce findings in TWO buckets only:
+  - `required`: blocks merge. Real, actionable, citing file/line where
+    possible. Anything you put here will produce a REJECT verdict.
+  - `nits`: one-line suggestions the author can skip. No third bucket,
+    no "consider", no "optional follow-up". If a finding is real and
+    matters, it is required. If not, it is a nit.
+- You return JSON matching `assets/panelist-return-schema.json` from
+  the apm-review-panel skill, as the FINAL message of your task. No
+  prose around the JSON; the orchestrator parses your last message.
+- You MUST NOT call `gh pr comment`, `gh pr edit`, `gh issue`, or any
+  other GitHub write command. You MUST NOT post to `safe-outputs`.
+  You MUST NOT touch the PR state. The orchestrator is the sole
+  writer; your only output channel is the JSON return.
+- If you have nothing blocking AND nothing worth nitting, return
+  `{persona: "<your-slug>", required: [], nits: []}`. That is a
+  valid and preferred answer when true.
diff --git a/.apm/agents/oss-growth-hacker.agent.md b/.apm/agents/oss-growth-hacker.agent.md
index dbe67877f..7cad1a8e5 100644
--- a/.apm/agents/oss-growth-hacker.agent.md
+++ b/.apm/agents/oss-growth-hacker.agent.md
@@ -97,3 +97,28 @@ The CEO consumes your annotations when making the final call.
 - You write only to `WIP/growth-strategy.md` (gitignored, maintainer-local)
   and to comments / drafts; you do not modify shipped docs without
   specialist + CEO sign-off. Never stage or commit anything under `WIP/`.
+
+## Output contract when invoked by apm-review-panel
+
+When the apm-review-panel skill spawns you as a panelist task, you
+operate under these strict rules. They override any default behavior
+that would post comments or apply labels.
+
+- You read the persona scope above and the PR title/body/diff passed
+  in the task prompt.
+- You produce findings in TWO buckets only:
+  - `required`: blocks merge. Real, actionable, citing file/line where
+    possible. Anything you put here will produce a REJECT verdict.
+  - `nits`: one-line suggestions the author can skip. No third bucket,
+    no "consider", no "optional follow-up". If a finding is real and
+    matters, it is required. If not, it is a nit.
+- You return JSON matching `assets/panelist-return-schema.json` from
+  the apm-review-panel skill, as the FINAL message of your task. No
+  prose around the JSON; the orchestrator parses your last message.
+- You MUST NOT call `gh pr comment`, `gh pr edit`, `gh issue`, or any
+  other GitHub write command. You MUST NOT post to `safe-outputs`.
+  You MUST NOT touch the PR state. The orchestrator is the sole
+  writer; your only output channel is the JSON return.
+- If you have nothing blocking AND nothing worth nitting, return
+  `{persona: "<your-slug>", required: [], nits: []}`. That is a
+  valid and preferred answer when true.
diff --git a/.apm/agents/python-architect.agent.md b/.apm/agents/python-architect.agent.md
index ebedc9af5..1ddc12662 100644
--- a/.apm/agents/python-architect.agent.md
+++ b/.apm/agents/python-architect.agent.md
@@ -223,3 +223,28 @@ Rules for this subsection:
   over-engineering, write "Pragmatic suggestion: none -- the current
   shape is the simplest correct design at this scope." That is a valid
   and preferred answer when true.
+
+## Output contract when invoked by apm-review-panel
+
+When the apm-review-panel skill spawns you as a panelist task, you
+operate under these strict rules. They override any default behavior
+that would post comments or apply labels.
+
+- You read the persona scope above and the PR title/body/diff passed
+  in the task prompt.
+- You produce findings in TWO buckets only:
+  - `required`: blocks merge. Real, actionable, citing file/line where
+    possible. Anything you put here will produce a REJECT verdict.
+  - `nits`: one-line suggestions the author can skip. No third bucket,
+    no "consider", no "optional follow-up". If a finding is real and
+    matters, it is required. If not, it is a nit.
+- You return JSON matching `assets/panelist-return-schema.json` from
+  the apm-review-panel skill, as the FINAL message of your task. No
+  prose around the JSON; the orchestrator parses your last message.
+- You MUST NOT call `gh pr comment`, `gh pr edit`, `gh issue`, or any
+  other GitHub write command. You MUST NOT post to `safe-outputs`.
+  You MUST NOT touch the PR state. The orchestrator is the sole
+  writer; your only output channel is the JSON return.
+- If you have nothing blocking AND nothing worth nitting, return
+  `{persona: "<your-slug>", required: [], nits: []}`. That is a
+  valid and preferred answer when true.
diff --git a/.apm/agents/supply-chain-security-expert.agent.md b/.apm/agents/supply-chain-security-expert.agent.md
index 4c8a2eafd..3fc055532 100644
--- a/.apm/agents/supply-chain-security-expert.agent.md
+++ b/.apm/agents/supply-chain-security-expert.agent.md
@@ -94,3 +94,28 @@ file integration, ask:
   trade-off to the DevX UX expert and escalate to the CEO.
 - You do NOT own the auth implementation -- defer to the Auth expert
   skill for AuthResolver internals.
+
+## Output contract when invoked by apm-review-panel
+
+When the apm-review-panel skill spawns you as a panelist task, you
+operate under these strict rules. They override any default behavior
+that would post comments or apply labels.
+
+- You read the persona scope above and the PR title/body/diff passed
+  in the task prompt.
+- You produce findings in TWO buckets only:
+  - `required`: blocks merge. Real, actionable, citing file/line where
+    possible. Anything you put here will produce a REJECT verdict.
+  - `nits`: one-line suggestions the author can skip. No third bucket,
+    no "consider", no "optional follow-up". If a finding is real and
+    matters, it is required. If not, it is a nit.
+- You return JSON matching `assets/panelist-return-schema.json` from
+  the apm-review-panel skill, as the FINAL message of your task. No
+  prose around the JSON; the orchestrator parses your last message.
+- You MUST NOT call `gh pr comment`, `gh pr edit`, `gh issue`, or any
+  other GitHub write command. You MUST NOT post to `safe-outputs`.
+  You MUST NOT touch the PR state. The orchestrator is the sole
+  writer; your only output channel is the JSON return.
+- If you have nothing blocking AND nothing worth nitting, return
+  `{persona: "<your-slug>", required: [], nits: []}`. That is a
+  valid and preferred answer when true.
diff --git a/.apm/skills/apm-review-panel/SKILL.md b/.apm/skills/apm-review-panel/SKILL.md
index 1a5dae18e..f2bdf92ce 100644
--- a/.apm/skills/apm-review-panel/SKILL.md
+++ b/.apm/skills/apm-review-panel/SKILL.md
@@ -1,59 +1,99 @@
 ---
 name: apm-review-panel
 description: >-
-  Use this skill to run a six-agent panel review (plus one conditional
-  auth specialist) for non-trivial PRs, design proposals, release
-  decisions, and other cross-cutting changes in microsoft/apm. Emit one
-  synthesized verdict comment.
+  Use this skill to run a multi-persona expert panel review on a labelled
+  pull request in microsoft/apm. The panel fans out to five mandatory
+  specialists plus one conditional auth specialist, all running in their
+  own agent threads, and a CEO synthesizer. The orchestrator is the sole
+  writer to the PR: one comment plus exactly one verdict label
+  (panel-approved or panel-rejected, derived deterministically from the
+  aggregated findings). Activate when a non-trivial PR needs cross-cutting
+  review (architecture, CLI logging, DevX UX, supply-chain security,
+  growth/positioning, optionally auth, with CEO arbitration).
 ---
 
-# APM Review Panel -- Expert Review Orchestration
-
-The panel is fixed at **5 mandatory specialist lenses + 1 conditional
-auth lens + 1 arbiter lens = up to 7 persona sections in one verdict
-comment**. You play each lens in turn from inside a single agent loop
-(progressive-disclosure skill model -- no sub-agent dispatch). Routing
-chooses *which* lenses execute; it never changes which headings appear
-in the final verdict.
+# APM Review Panel - Fan-Out Expert Review
+
+The panel is FAN-OUT + SYNTHESIZER. Each persona runs in its own agent
+thread (via the `task` tool) and returns JSON matching
+`assets/panelist-return-schema.json`. The orchestrator schema-validates
+each return, hands all returns to the apm-ceo synthesizer (also a task
+thread, returns JSON matching `assets/ceo-return-schema.json`), then
+deterministically derives the verdict and writes ONE comment plus ONE
+verdict label.
+
+## Architecture invariants
+
+- **Verdict is binary and deterministic.** APPROVE iff
+  `sum(len(p.required) for p in panelists if p.active) == 0`.
+  REJECT otherwise. The CEO does NOT pick the verdict; the CEO writes
+  the arbitration narrative. This kills the "approve with reservations"
+  failure mode at the schema level.
+- **Two severity buckets only.** `required` blocks merge. `nits` are
+  one-line suggestions the author can skip. There is no "optional",
+  no "consider", no "maybe later". If a finding is real and matters,
+  it is required. If not, it is a nit. No third bucket accumulates debt.
+- **Single-writer interlock.** Only the orchestrator writes to
+  `safe-outputs` (one `add-comment`, one `add-labels`, one
+  `remove-labels`). Panelist subagents and the CEO subagent return
+  JSON only and MUST NOT call any `gh` write command, post comments,
+  apply labels, or touch the PR state.
+- **Single-emission discipline.** Exactly one comment per panel run,
+  rendered from `assets/verdict-template.md` after all subagents return.
 
 ## Agent roster
 
-| Agent | Persona | Always active? |
-|-------|---------|----------------|
+| Agent | Role | Always active? |
+|-------|------|----------------|
 | [Python Architect](../../agents/python-architect.agent.md) | Architectural Reviewer | Yes |
 | [CLI Logging Expert](../../agents/cli-logging-expert.agent.md) | Output UX Reviewer | Yes |
 | [DevX UX Expert](../../agents/devx-ux-expert.agent.md) | Package-Manager UX | Yes |
 | [Supply Chain Security Expert](../../agents/supply-chain-security-expert.agent.md) | Threat-Model Reviewer | Yes |
-| [OSS Growth Hacker](../../agents/oss-growth-hacker.agent.md) | Adoption Strategist | Yes (side-channel to CEO) |
-| [Auth Expert](../../agents/auth-expert.agent.md) | Auth / Token Reviewer | Conditional (see "Conditional panelist" below) |
-| [APM CEO](../../agents/apm-ceo.agent.md) | Strategic Owner / Arbiter | Yes (always arbitrates) |
+| [OSS Growth Hacker](../../agents/oss-growth-hacker.agent.md) | Adoption Strategist | Yes |
+| [Auth Expert](../../agents/auth-expert.agent.md) | Auth / Token Reviewer | Conditional (see below) |
+| [APM CEO](../../agents/apm-ceo.agent.md) | Strategic Arbiter / Synthesizer | Yes |
 
-## Routing topology
+## Topology
 
 ```
-  python-architect    cli-logging-expert    devx-ux-expert    supply-chain-security-expert
-        \_______________________|______________________________/
-                                |   <-- auth-expert (conditional)
-                                v
-                            apm-ceo               <----  oss-growth-hacker
-                       (final call / arbiter)           (annotates findings;
-                                                         updates growth-strategy)
+   apm-review-panel SKILL (orchestrator thread)
+                      |
+   FAN-OUT via task tool (panelists in parallel)
+                      |
+   +-----+-------+-------+-----+-----+------+
+   v     v       v       v     v     v      v (cond.)
+  py    cli     dx-ux   sec   grw           auth
+   |     |       |       |     |             |
+   |   each returns JSON per panelist-return-schema.json
+   +-----+-------+-------+-----+-------------+
+                      |
+                      v   <-- S4 schema-validate
+                      v   <-- on malformed: re-spawn that persona
+                      v
+   task: apm-ceo synthesizer
+   - aggregates required[] across panelists
+   - resolves dissent
+   - returns ceo-return-schema.json (NO verdict field)
+                      |
+                      v   <-- DETERMINISTIC verdict gate
+                      |       APPROVE iff sum(required) == 0
+                      |       REJECT  otherwise
+                      v
+   orchestrator (sole writer)
+            |        |               |
+            v        v               v
+        add-comment add-labels   remove-labels
+        (max:2)     [panel-approved   [panel-review]
+                    XOR panel-rejected]
 ```
 
-- **Specialists raise findings independently** -- no implicit consensus.
-- **CEO arbitrates** when specialists disagree or when a finding has
-  strategic implications (positioning, breaking change, naming, scope).
-- **Growth Hacker is a side-channel** to the CEO: never blocks a
-  specialist finding; annotates it with growth implications and
-  escalates to the CEO when relevant.
-
 ## Conditional panelist: Auth Expert
 
 Auth Expert is the only conditional panelist. Activate `auth-expert`
 if either rule below matches.
 
-1. **Fast-path file trigger.** Activate the Auth Expert lens
-   immediately when the PR changes any of:
+1. **Fast-path file trigger.** Activate the Auth Expert task when the
+   PR changes any of:
    - `src/apm_cli/core/auth.py`
    - `src/apm_cli/core/token_manager.py`
    - `src/apm_cli/core/azure_cli.py`
@@ -64,7 +104,7 @@ if either rule below matches.
    - `src/apm_cli/deps/registry_proxy.py`
 
 2. **Fallback self-check.** If no fast-path file matched, answer this
-   before activating the lens:
+   before activating:
 
    > Does this PR change authentication behavior, token management,
    > credential resolution, host classification used by `AuthResolver`,
@@ -74,190 +114,159 @@ if either rule below matches.
 
 Routing rule:
 
-- **YES** -> take the Auth Expert lens (per the Persona pass
-  procedure) and capture its findings.
-- **NO**  -> record `Auth Expert inactive reason: <one sentence
-  citing the touched files>` in working notes; do not take the lens.
+- **YES** -> spawn the auth-expert task with the standard panelist
+  contract. It returns JSON with `active: true`.
+- **NO**  -> spawn the auth-expert task ANYWAY but instruct it to
+  return `{persona: "auth-expert", active: false, inactive_reason:
+  "<one sentence citing the touched files>", required: [], nits: []}`.
+  This keeps the schema uniform and the verdict template happy.
 - Never use wildcard heuristics like `*auth*`, `*token*`, or
   `*credential*` as the sole trigger.
 
-## Routing matrix
-
-These routes choose which specialists are emphasised for a given PR
-type. They do **not** change the verdict shape. The final comment
-always uses every persona heading in `assets/verdict-template.md`;
-the only persona that can render as `Not activated -- <reason>` is
-Auth Expert (per the conditional rule above).
-
-### Code review (architecture + logging)
-1. Python Architect reviews structure / patterns / cross-file impact.
-2. CLI Logging Expert reviews any output / logger changes.
-3. CEO ratifies if the two disagree on abstraction vs consistency.
-
-### CLI UX review
-1. DevX UX Expert reviews command surface, flags, help, error wording.
-2. CLI Logging Expert reviews how outputs are emitted (logger methods).
-3. Growth Hacker annotates if the change affects first-run conversion.
-4. CEO ratifies any naming / positioning calls.
-
-### Security review
-1. Supply Chain Security Expert maps the change to the threat model.
-2. DevX UX Expert flags any ergonomics regression from the mitigation.
-3. CEO arbitrates trade-offs; bias toward security on default behavior.
-
-### Auth review (only when the conditional Auth Expert is activated)
-1. Auth Expert maps the change against AuthResolver, token precedence,
-   host classification, and credential helpers.
-2. Supply Chain Security Expert checks for token-scoping or credential
-   leakage implications.
-3. CEO ratifies any default-behavior change.
-
-### Release / comms review
-1. CEO grounds the release framing in `gh` CLI stats.
-2. Growth Hacker drafts hook + story angle; updates
-   `WIP/growth-strategy.md` (gitignored maintainer-local; create if absent).
-3. Specialists sanity-check any technical claims in release notes.
-
-### Full panel review (non-trivial change)
-1. Each mandatory specialist produces independent findings.
-2. Auth Expert participates if the conditional rule above activates it.
-3. Growth Hacker annotates findings with growth implications.
-4. CEO synthesizes, resolves disagreements, makes the final call.
-5. Surface decision and rationale to the author via the single verdict
-   comment.
-
-## Quality gates
-
-A non-trivial change passes when:
-
-- [ ] Python Architect: structure / patterns OK (or change explicitly
-      justified)
-- [ ] CLI Logging Expert: output paths route through CommandLogger,
-      no direct `_rich_*` in commands
-- [ ] DevX UX Expert: command surface familiar to npm/pip/cargo users,
-      every error has a next action
-- [ ] Supply Chain Security Expert: no new path / auth / integrity
-      surface left unguarded; fails closed
-- [ ] Auth Expert (only if activated): no regression to AuthResolver
-      precedence, host classification, or credential leakage surface
-- [ ] APM CEO: trade-offs ratified, breaking changes have CHANGELOG +
-      migration line
-- [ ] OSS Growth Hacker: conversion surfaces unaffected or improved;
-      `WIP/growth-strategy.md` updated if relevant (maintainer-local;
-      gitignored, never committed)
-
-## Notes
-
-- This skill orchestrates a panel **in your own context** -- you are
-  the only agent. You load each persona's `.agent.md` reference file
-  on demand (progressive disclosure), assume that persona's lens to
-  produce its findings, then move to the next persona. Do NOT spawn
-  sub-agents (no `task` tool dispatch) -- the panel is a sequence of
-  reasoning passes inside one agent loop, not a multi-agent fan-out.
-- Persona detail lives in the linked `.agent.md` files. Read each
-  one when you switch to that persona; do not pre-load all of them.
+## Routing matrix (emphasis only - all panelists always run)
+
+These routes describe WHICH specialist's findings the CEO weights more
+heavily for a given PR type. They do NOT change which personas run -
+every mandatory persona runs on every panel invocation, period. Routing
+is a CEO synthesis hint, not a panelist gate.
+
+- **Architecture-heavy PR** -> CEO weights Python Architect findings
+  on abstraction calls; CLI Logging on consistency.
+- **CLI UX PR** -> CEO weights DevX UX on command surface; CLI Logging
+  on output paths; Growth Hacker on first-run conversion.
+- **Security PR** -> CEO biases toward Supply Chain Security on
+  default behavior; DevX UX flags ergonomics regression from any
+  mitigation.
+- **Auth PR** (auth-expert active) -> CEO weights Auth Expert on
+  AuthResolver / token precedence; Supply Chain on token-scoping.
+- **Release / comms PR** -> CEO weights Growth Hacker on hook + story
+  angle; specialists sanity-check technical claims.
+- **Full panel** (default) -> CEO synthesizes equally; calls out any
+  dissent in `dissent_notes`.
 
 ## Execution checklist
 
-When this skill is activated for a PR review, work through these
-steps in order, in a single agent loop. Do not skip ahead and do not
-emit any output before the final step.
-
-1. Read the PR context (title, body, labels, changed files, diff).
-   The orchestrating workflow already fetches this with `gh pr view`
-   / `gh pr diff` -- do not re-fetch.
-2. Resolve the **Auth Expert conditional case** using the rule in
-   "Conditional panelist: Auth Expert" above. Record either an
-   activation decision (and proceed to step 3) or an `Auth Expert
-   inactive reason: <one sentence>` in working notes.
-3. For each mandatory persona (plus `auth-expert` if activated),
-   follow the **Persona pass procedure** below, one persona at a
-   time. Do not try to play multiple personas in a single pass.
-4. Run the **pre-arbitration completeness gate**:
-   - Findings exist in working notes for the 5 mandatory specialists
-     (Python Architect, CLI Logging Expert, DevX UX Expert, Supply
-     Chain Security Expert, OSS Growth Hacker).
-   - Exactly one of `Auth Expert findings` or `Auth Expert inactive
-     reason` exists (neither = incomplete; both = inconsistent
-     routing).
-   - The Python Architect notes contain the OO / class mermaid
-     diagram, the execution-flow mermaid diagram, and the Design
-     patterns subsection declared in
-     `../../agents/python-architect.agent.md`.
-   - No persona section is missing or empty.
-   If any check fails, redo that persona's pass and repeat the gate.
-   Do not proceed to step 5 until the gate passes.
-5. Take the **APM CEO** lens (load
-   `../../agents/apm-ceo.agent.md`) and arbitrate over the collected
-   findings -- still in your own context. CEO arbitration may run
-   only after the completeness gate has passed.
-6. Now (and only now) load `assets/verdict-template.md` and fill it
-   in with the collected findings + arbitration.
-7. Emit the filled template as exactly ONE comment via the workflow's
-   `safe-outputs.add-comment` channel. Never call the GitHub API
-   directly. This is the ONLY output emission for the entire panel
-   run -- no per-persona comments, no progress comments.
-
-### Persona pass procedure
-
-For each persona, run this exact procedure in your own context:
-
-1. Open the persona's `.agent.md` file (linked in the roster) and
-   read its scope, lens, anti-patterns, and required return shape.
-2. From that persona's lens, review the PR title/body/diff against
-   the scope declared in the file.
-3. Write the findings to working notes under
-   `<persona-name>: <findings>` (or, for an inactive Auth Expert,
-   `Auth Expert inactive reason: <one sentence>`).
-4. Drop the persona lens before moving on. Do not emit any comment
-   from inside a persona pass; persona findings stay in working
-   notes until step 7 synthesizes them.
-
-## Output contract
-
-This contract is non-negotiable -- it is the difference between a
-panel review that lands as one cohesive verdict and one that fragments
-into per-persona noise.
-
-- Produce **exactly one** comment per panel run. The
-  `safe-outputs.add-comment.max` cap in the workflow is a fail-soft
-  ceiling (currently 7, one per persona slot); the one-comment
-  discipline lives here.
-- Use `assets/verdict-template.md` as the comment body. Keep its
-  section headings exactly as written. Adapt the body of each section
-  to the PR. Do not invent new top-level sections or drop existing
-  ones.
-- CEO arbitration may run only after the completeness gate passes.
-- Never emit findings as separate comments, intermediate progress
-  comments, or "I will now invoke X" status comments.
-- Load `assets/verdict-template.md` **at synthesis time only** (step
-  6 above) -- not at activation, not while collecting findings.
+Work through these steps in order. Do not skip ahead. Do not emit any
+output to the PR before step 6.
+
+1. **Read PR context** (the orchestrating workflow already fetched it
+   via `gh pr view` / `gh pr diff` in step 1 of `pr-review-panel.md`).
+   Identify changed files for the auth-expert routing decision.
+
+2. **Resolve the auth-expert conditional** using the rule above.
+   Decide: spawn auth-expert active, OR spawn auth-expert with
+   `active: false` + an `inactive_reason`. Either way, auth-expert
+   IS spawned - the schema requires uniform return shape.
+
+3. **Fan out panelist tasks.** Spawn the following tasks in PARALLEL
+   via the `task` tool, one task per persona:
+   - `python-architect`
+   - `cli-logging-expert`
+   - `devx-ux-expert`
+   - `supply-chain-security-expert`
+   - `oss-growth-hacker`
+   - `auth-expert` (always - active per step 2)
+
+   Each task prompt MUST:
+   - Reference its persona file by relative path so the subagent loads
+     its own scope, lens, and anti-patterns.
+   - Include the PR number, title, body, and diff (passed inline).
+   - Cite `assets/panelist-return-schema.json` and require the
+     subagent to emit JSON matching that schema as its FINAL message.
+   - Restate the output contract: NO `gh` write commands, NO posting
+     comments, NO label changes, NO touching PR state. JSON return
+     only.
+
+4. **S4 schema gate.** When each panelist task returns, parse the JSON
+   and validate against `assets/panelist-return-schema.json`. On
+   validation failure:
+   - Re-spawn that ONE panelist with an explicit error message
+     pointing at the violated rule (e.g. "your previous return was
+     missing the `nits` field").
+   - Maximum two re-spawn attempts per panelist. If still malformed,
+     synthesize a placeholder `{persona: "<slug>", active: true,
+     required: [], nits: [], extras: {schema_failure: "<reason>"}}`
+     and surface the failure in the CEO arbitration prompt.
+
+5. **Spawn the CEO synthesizer task.** Pass the full set of validated
+   panelist JSON returns to a `task` invocation that loads
+   `../../agents/apm-ceo.agent.md`. The prompt MUST:
+   - Provide all panelist returns as structured input.
+   - Ask for arbitration prose, dissent resolution, and any growth
+     signal worth amplifying.
+   - Cite `assets/ceo-return-schema.json` and require JSON return.
+   - Restate the contract: CEO does NOT pick verdict, only writes
+     arbitration. NO `gh` write commands.
+
+   Validate the CEO return against `assets/ceo-return-schema.json`.
+   On failure, re-spawn once with the violation cited.
+
+6. **Compute verdict deterministically:**
+
+   ```
+   total_required = sum(len(p["required"]) for p in panelists if p.get("active", True))
+   verdict = "APPROVE" if total_required == 0 else "REJECT"
+   verdict_label = "panel-approved" if verdict == "APPROVE" else "panel-rejected"
+   ```
+
+   The CEO does not influence this calculation. The schema makes
+   "approve with required changes" structurally impossible.
+
+7. **Render the comment.** Load `assets/verdict-template.md`, fill the
+   placeholders from the panelist + CEO JSON, and emit it as exactly
+   ONE comment via `safe-outputs.add-comment`. NEVER call the GitHub
+   API directly. NEVER emit per-persona comments or progress comments.
+
+8. **Apply the verdict label** via `safe-outputs.add-labels`:
+   `[verdict_label]` (one of `panel-approved` / `panel-rejected`).
+
+9. **Remove the trigger label** via `safe-outputs.remove-labels`:
+   `[panel-review]`. This makes the trigger idempotent - re-applying
+   the label will re-run the panel cleanly.
+
+## Output contract (non-negotiable)
+
+- Exactly ONE comment per panel run, top-loaded per
+  `assets/verdict-template.md`. The `safe-outputs.add-comment.max: 2`
+  is a fail-soft ceiling; the discipline lives here.
+- Exactly ONE verdict label per panel run (`panel-approved` XOR
+  `panel-rejected`).
+- Exactly ONE removal of the `panel-review` trigger label.
+- Subagents (panelists + CEO) NEVER write to `safe-outputs`, NEVER
+  call `gh pr comment`, NEVER call `gh pr edit --add-label`. They
+  return JSON. The orchestrator is the sole writer. This discipline
+  is mirrored in each persona's "Output contract when invoked by
+  apm-review-panel" section.
+- Never invent new top-level template sections or drop existing ones.
 
 ## Gotchas
 
-- **Roster invariant.** The frontmatter description, the roster
-  table, the conditional-panelist rule, the verdict template, and the
-  quality gates MUST agree on the persona set. If you change one,
-  change all of them in the same edit. Description, roster, and
-  template are the three places drift most often appears.
+- **Roster invariant.** The frontmatter description, the roster table,
+  the conditional rule, the verdict template, and the JSON schema
+  MUST agree on the persona set. If you change one, change all in the
+  same edit.
 - **False-negative auth gotcha.** Auth regressions can be introduced
-  from non-auth files that change the *inputs* to auth -- for
-  example host classification, dependency parsing, clone URL
-  construction, HTTP authorization headers, or call sites that
-  bypass `AuthResolver`. If a diff changes how a remote host, org,
-  token source, or fallback path is selected and you are not certain
-  it is auth-neutral, activate `auth-expert`.
+  from non-auth files that change the inputs to auth - host
+  classification, dependency parsing, clone URL construction, HTTP
+  authorization headers, or call sites that bypass `AuthResolver`. If
+  a diff changes how a remote host, org, token source, or fallback
+  path is selected and you are not certain it is auth-neutral,
+  activate auth-expert as `active: true`.
 - **Bundle layout on the runner.** When this skill runs inside the
   PR-review agentic workflow, the APM bundle is unpacked under
-  `.github/skills/apm-review-panel/` first, with `.apm/skills/...`
-  as a fallback. The asset path is the same relative to the skill
-  root (`assets/verdict-template.md`) in both layouts -- prefer the
-  `.github/...` path when present.
-- **No multi-persona-in-one-pass.** Each persona has its own
-  `.agent.md` for a reason -- read it when you take that lens, write
-  the findings, then drop the lens before moving on. Trying to be all
-  personas in one reasoning pass is the most common cause of dropped
-  findings and mixed voices.
-- **Single-emission discipline is fragile under interruption.** If
-  you find yourself wanting to "post a quick partial verdict and
-  then update it", don't. Buffer in working notes; emit once.
+  `.github/skills/apm-review-panel/` first, with `.apm/skills/...` as
+  a fallback. The asset paths are the same relative to the skill root
+  in both layouts.
+- **Subagent write enforcement is contract-based, not sandbox-based.**
+  In gh-aw, tool permissions are workflow-scoped, not subagent-scoped,
+  so every spawned task technically inherits the same `gh` toolset.
+  The "subagents must not write" rule is enforced by the prompt
+  contract in each `.agent.md` plus the `safe-outputs.add-comment.max:
+  2` fail-soft. If a subagent ever tries to post a comment, the cap
+  catches it; if it tries to bypass `safe-outputs` and call `gh`
+  directly, that is a contract violation worth surfacing in the next
+  panel review of the persona file itself.
+- **Trigger-label removal is the LAST step.** Doing it earlier risks
+  another labeller racing the panel mid-run. The companion workflow
+  `pr-panel-label-reset.yml` handles verdict-label invalidation on
+  every new push (deterministically, no LLM).
diff --git a/.apm/skills/apm-review-panel/assets/ceo-return-schema.json b/.apm/skills/apm-review-panel/assets/ceo-return-schema.json
new file mode 100644
index 000000000..466aa01b5
--- /dev/null
+++ b/.apm/skills/apm-review-panel/assets/ceo-return-schema.json
@@ -0,0 +1,23 @@
+{
+  "$schema": "http://json-schema.org/draft-07/schema#",
+  "$id": "ceo-return-schema.json",
+  "title": "APM Review Panel - CEO Synthesizer Return Shape",
+  "description": "Shape the apm-ceo persona MUST return when invoked as the panel synthesizer. CEO writes ARBITRATION PROSE only - never picks the verdict. The orchestrator derives verdict deterministically from aggregated panelist required-counts (APPROVE iff sum(required) == 0, REJECT otherwise).",
+  "type": "object",
+  "required": ["arbitration"],
+  "additionalProperties": false,
+  "properties": {
+    "arbitration": {
+      "type": "string",
+      "description": "One-to-three paragraph synthesis. Resolve any disagreement between specialists. Surface strategic implications (positioning, breaking change, naming, scope). If specialists agreed and the change is uncontroversial, say so plainly. ASCII only."
+    },
+    "dissent_notes": {
+      "type": "string",
+      "description": "Optional. When two or more panelists disagreed on whether a finding is REQUIRED vs NIT, name the disagreement and state which side the CEO sided with and why. Empty / omitted when no dissent."
+    },
+    "growth_signal": {
+      "type": "string",
+      "description": "Optional. CEO surface for the side-channel note from oss-growth-hacker (conversion, narrative, breaking-change comms). Empty / omitted when not relevant."
+    }
+  }
+}
diff --git a/.apm/skills/apm-review-panel/assets/panelist-return-schema.json b/.apm/skills/apm-review-panel/assets/panelist-return-schema.json
new file mode 100644
index 000000000..89a095b6b
--- /dev/null
+++ b/.apm/skills/apm-review-panel/assets/panelist-return-schema.json
@@ -0,0 +1,76 @@
+{
+  "$schema": "http://json-schema.org/draft-07/schema#",
+  "$id": "panelist-return-schema.json",
+  "title": "APM Review Panel - Panelist Return Shape",
+  "description": "Shape every panel persona MUST return when invoked by the apm-review-panel skill. Two severity buckets only: required (blocks merge) and nits (skip-if-you-want, single line). NO third bucket. NO disposition - the orchestrator computes verdict deterministically from aggregated required-counts.",
+  "type": "object",
+  "required": ["persona", "required", "nits"],
+  "additionalProperties": false,
+  "properties": {
+    "persona": {
+      "type": "string",
+      "description": "Persona slug. MUST equal the .agent.md filename stem.",
+      "enum": [
+        "python-architect",
+        "cli-logging-expert",
+        "devx-ux-expert",
+        "supply-chain-security-expert",
+        "oss-growth-hacker",
+        "auth-expert"
+      ]
+    },
+    "active": {
+      "type": "boolean",
+      "description": "Set to false ONLY for auth-expert when the conditional rule does not activate it. All other personas MUST set active=true. When false, required and nits MUST be empty arrays and inactive_reason MUST be a one-sentence explanation citing the touched files.",
+      "default": true
+    },
+    "inactive_reason": {
+      "type": "string",
+      "description": "Required when active=false. One sentence citing the touched files."
+    },
+    "required": {
+      "type": "array",
+      "description": "Findings that MUST be addressed before merge. Empty array means this persona has no blocking concerns. Anything in this array causes the orchestrator to emit a REJECT verdict.",
+      "items": { "$ref": "#/definitions/finding" }
+    },
+    "nits": {
+      "type": "array",
+      "description": "One-line suggestions the author can skip. NEVER block merge. No cap - trust the panelist's judgement.",
+      "items": { "$ref": "#/definitions/finding" }
+    },
+    "extras": {
+      "type": "object",
+      "description": "Persona-specific structured payload. Reserved for python-architect mermaid diagrams and oss-growth-hacker side-channel notes. Never blocks; never affects verdict.",
+      "additionalProperties": true
+    }
+  },
+  "definitions": {
+    "finding": {
+      "type": "object",
+      "required": ["summary", "rationale"],
+      "additionalProperties": false,
+      "properties": {
+        "summary": {
+          "type": "string",
+          "description": "One-line description of the finding. ASCII only. No emojis."
+        },
+        "rationale": {
+          "type": "string",
+          "description": "WHY this matters. Cite the rule or pattern violated. ASCII only."
+        },
+        "file": {
+          "type": "string",
+          "description": "Optional repo-relative path to the file."
+        },
+        "line": {
+          "type": "integer",
+          "description": "Optional line number."
+        },
+        "suggestion": {
+          "type": "string",
+          "description": "Optional concrete fix (diff hint, replacement code, command to run)."
+        }
+      }
+    }
+  }
+}
diff --git a/.apm/skills/apm-review-panel/assets/verdict-template.md b/.apm/skills/apm-review-panel/assets/verdict-template.md
index ce032bdde..49b18d663 100644
--- a/.apm/skills/apm-review-panel/assets/verdict-template.md
+++ b/.apm/skills/apm-review-panel/assets/verdict-template.md
@@ -1,62 +1,110 @@
 <!--
 Canonical single-comment template for the APM Review Panel skill.
 
-Load this file ONLY at synthesis time, after every panelist has produced
-its findings. The orchestrator copies this skeleton verbatim, fills the
-placeholders, and emits the result as exactly ONE comment via the
-workflow's `safe-outputs.add-comment` channel.
+Loaded ONLY at synthesis time, AFTER:
+- every panelist task has returned JSON matching panelist-return-schema.json
+- the apm-ceo synthesizer task has returned JSON matching ceo-return-schema.json
+- the orchestrator has computed the binary verdict deterministically:
+    APPROVE iff sum(len(p.required) for p in panelists if p.active) == 0
+    REJECT  otherwise
+
+The orchestrator copies this skeleton verbatim, fills the placeholders
+from the collected JSON, and emits the result as exactly ONE comment via
+the workflow's `safe-outputs.add-comment` channel.
 
 Rules when filling the template:
 - ASCII only. No emojis, no Unicode dashes, no box-drawing characters.
-- Keep total length under ~600 lines.
+- Top-loaded order is non-negotiable: verdict, required, nits, CEO
+  arbitration FIRST. Per-persona detail goes in the collapsed details
+  block at the BOTTOM. Do not flip the order; the user-research is that
+  PR authors must see the decision and required actions in one screen.
 - Do NOT add or remove top-level sections. Adapt their bodies to the PR.
 - Do NOT split this output across multiple comments under any condition.
-- Routing changes which personas run, not which persona headings appear.
-- Only Auth Expert is conditional. If it was not activated for the PR,
-  write "Not activated -- <reason>" as that persona's body. Do not omit
-  the persona heading. All other persona headings always have findings.
-- The Python Architect block MUST contain the two mermaid diagrams and
-  the Design patterns subsection from the python-architect persona's
-  PR review output contract. If those are missing, the synthesis is
-  incomplete -- re-invoke the Python Architect before emitting.
+- The Required and Nits sections are AGGREGATES across all active
+  panelists. Render `[<persona>]` prefix on each item so authors know
+  who raised it.
+- The Per-persona detail block is the FULL JSON-derived findings list
+  for each panelist - even those with required==[] and nits==[]
+  (render those as "No findings.").
+- Auth Expert is the only persona that can render as "Inactive --
+  <reason>" in the per-persona block. Never omit its heading.
+- The python-architect persona MUST contribute the two mermaid diagrams
+  and the Design patterns subsection in its `extras.diagrams` payload;
+  surface them inside its per-persona block.
 -->
 
-## APM Review Panel Verdict
+## APM Review Panel Verdict: <APPROVE|REJECT>
 
-**Disposition**: <APPROVE | REQUEST_CHANGES | NEEDS_DISCUSSION> <optional one-line qualifier, e.g. "(with two minor pre-merge fixes)">
+> <one-line CEO arbitration headline; e.g. "All five specialists agreed; no required changes." or "Architecture is sound but supply-chain finding blocks merge.">
 
----
+### Required before merge (<N> items)
 
-### Per-persona findings
+<If N == 0:>
+None.
+<else, render aggregated required[] across all panelists:>
+- [<persona-slug>] <finding.summary> <if file/line: at `<file>:<line>`>
+  - Why: <finding.rationale>
+  - <if suggestion: Suggested fix: <finding.suggestion>>
+- ...
 
-**Python Architect**: <findings; MUST include the OO/class mermaid diagram, the execution-flow mermaid diagram, and the Design patterns subsection per the python-architect persona's PR review output contract>
+### Nits (<M> items, skip if you want)
 
-**CLI Logging Expert**: <findings>
+<If M == 0:>
+None.
+<else, render aggregated nits[]:>
+- [<persona-slug>] <finding.summary> <if file/line: at `<file>:<line>`>
+- ...
 
-**DevX UX Expert**: <findings>
+### CEO arbitration
 
-**Supply Chain Security Expert**: <findings>
+<ceo.arbitration prose, one to three paragraphs.>
 
-**Auth Expert**: <findings, or "Not activated -- <reason citing the touched files>">
+<if ceo.dissent_notes is non-empty:>
+**Dissent resolved:** <ceo.dissent_notes>
 
-**OSS Growth Hacker**: <findings; if relevant, include side-channel note to CEO about conversion / growth-strategy implications>
+<if ceo.growth_signal is non-empty:>
+**Growth/positioning note:** <ceo.growth_signal>
 
 ---
 
-### CEO arbitration
+<details>
+<summary>Per-persona findings (full)</summary>
 
-<one-paragraph synthesis from apm-ceo: resolve any disagreements between specialists, ratify the disposition, and state the strategic call. If specialists agreed and the change is uncontroversial, say so plainly in one or two sentences.>
+#### Python Architect
 
----
+<if active:>
+<Include the OO/class mermaid diagram and execution-flow mermaid
+diagram from extras.diagrams. Then render required[] + nits[] in full;
+if both empty, write "No findings.">
+<else: should never be inactive>
 
-### Required actions before merge
+#### CLI Logging Expert
 
-1. <required action with concrete pointer (file path, line, diff suggestion). If Disposition is APPROVE with no required actions, write "None." here -- do not omit the section.>
-2. <...>
+<Render required[] + nits[] in full; "No findings." if both empty.>
 
----
+#### DevX UX Expert
+
+<Render required[] + nits[] in full; "No findings." if both empty.>
+
+#### Supply Chain Security Expert
+
+<Render required[] + nits[] in full; "No findings." if both empty.>
+
+#### Auth Expert
+
+<if active:>
+<Render required[] + nits[] in full; "No findings." if both empty.>
+<else:>
+Inactive -- <inactive_reason>
+
+#### OSS Growth Hacker
+
+<Render required[] + nits[] in full; "No findings." if both empty.
+Surface any side-channel growth/positioning note inline; CEO already
+echoed it in the headline section if relevant.>
 
-### Optional follow-ups
+</details>
 
-- <follow-up suggestion that is out of scope for this PR but worth tracking>
-- <...>
+<sub>Verdict computed deterministically: <N> required findings across
+<K> active panelists. APPROVE iff N == 0. Push a new commit to clear
+this verdict label automatically.</sub>
diff --git a/.apm/skills/apm-review-panel/evals/README.md b/.apm/skills/apm-review-panel/evals/README.md
new file mode 100644
index 000000000..04ae1503b
--- /dev/null
+++ b/.apm/skills/apm-review-panel/evals/README.md
@@ -0,0 +1,67 @@
+# Evals: apm-review-panel
+
+Per genesis Step 8 evals gate. Two categories:
+
+1. **TRIGGER EVALS** validate the dispatch description correctly
+   discriminates should-trigger queries from near-miss should-NOT
+   queries. Validation split is the ship gate (>= 0.5 on positives,
+   < 0.5 on negatives).
+
+2. **CONTENT EVALS** validate that the skill, when activated, produces
+   the JSON-derived top-loaded verdict comment in the shape declared
+   by `assets/verdict-template.md`. Run with-skill vs without-skill;
+   if the deltas are not visible, the skill is not adding value.
+
+## Files
+
+- `trigger-evals.json` -- 16 queries (8 should-trigger + 8 should-NOT),
+  60/40 train/val split.
+- `content-eval-clean-pr.md` -- synthetic clean PR scenario; expected
+  verdict = APPROVE, all panelists return `required: []`.
+- `content-eval-rejected-pr.md` -- synthetic PR with one architectural
+  smell + one nit; expected verdict = REJECT, python-architect returns
+  one `required` finding.
+- `run-verdict-harness.py` -- DETERMINISTIC harness covering the
+  parts of the panel that do NOT require an LLM: JSON schema
+  validation (S4 gate) and verdict computation. Five cases: two
+  positive (clean-pr APPROVE, rejected-pr REJECT) and three negative
+  (missing-nits, unknown-persona, disposition-leak). All five MUST
+  pass before merging any change to schemas, the SKILL.md execution
+  checklist verdict rule, or the persona output contracts.
+
+## How to run
+
+### Deterministic harness (free, runs anywhere)
+
+```
+uv run --with jsonschema python3 \
+    .apm/skills/apm-review-panel/evals/run-verdict-harness.py
+```
+
+Expected: `RESULT: ALL PASS`. Proves schemas reject malformed shapes
+and verdict math is correct. Does NOT prove the LLM panelists will
+return well-formed JSON in practice -- that requires a real run.
+
+### Full end-to-end (option B, branch-pin test)
+
+The gh-aw workflow imports the panel skill from `microsoft/apm#main`
+for security (anti-self-approval). Pre-merge validation requires a
+temporary branch pin:
+
+1. On the feature branch, change `imports.packages` in
+   `.github/workflows/pr-review-panel.md` from `microsoft/apm#main`
+   to `microsoft/apm#<feature-branch>`.
+2. `gh aw compile pr-review-panel`.
+3. Commit as `chore: TEMP pin to branch for end-to-end test`.
+4. Open a tiny throwaway PR; label it `panel-review`.
+5. Observe: 6 task threads spawn, JSON returns, verdict label
+   applied, `panel-review` removed, `panel-approved` (or
+   `panel-rejected`) set. Push a commit and watch the
+   deterministic `pr-panel-label-reset.yml` strip the verdict
+   label.
+6. Revert the temp-pin commit before merge.
+
+### Trigger evals
+
+Trigger evals can be run via the genesis evals harness or any
+dispatcher that loads the skill description and queries it.
diff --git a/.apm/skills/apm-review-panel/evals/content-eval-clean-pr.md b/.apm/skills/apm-review-panel/evals/content-eval-clean-pr.md
new file mode 100644
index 000000000..4a3651dee
--- /dev/null
+++ b/.apm/skills/apm-review-panel/evals/content-eval-clean-pr.md
@@ -0,0 +1,49 @@
+# Content eval: clean PR (expected APPROVE)
+
+## Synthetic PR input
+
+- Title: `docs(readme): clarify install command for fish shell users`
+- Body: Adds a one-paragraph note in README.md under the install
+  section pointing fish users to the existing `eval (apm shellenv)`
+  recipe. No code changes.
+- Diff: 4 added lines, 0 removed, 1 file (`README.md`).
+- Files: `README.md`
+
+## Expected orchestrator behavior
+
+- Spawn 6 panelist tasks in parallel (5 mandatory + auth-expert with
+  `active: false` because no auth files touched).
+- Each panelist returns `{persona: "<slug>", required: [], nits: [...]}`
+  where `nits` may contain at most a one-line wording suggestion.
+- auth-expert returns `{persona: "auth-expert", active: false,
+  inactive_reason: "...README.md has no auth surface...", required: [],
+  nits: []}`.
+- CEO synthesizer returns `{arbitration: "Documentation-only change
+  ... all specialists agree no required actions ...", dissent_notes
+  not present}`.
+- Orchestrator computes `total_required = 0` -> verdict = APPROVE.
+- Orchestrator emits ONE comment per `assets/verdict-template.md`:
+  - Header: `## APM Review Panel Verdict: APPROVE`
+  - Required section: "None."
+  - Nits section: rendered (or "None.")
+  - CEO arbitration paragraph
+  - Per-persona detail in collapsed `<details>` block
+- Orchestrator applies `panel-approved` label.
+- Orchestrator removes `panel-review` label.
+
+## Pass criteria
+
+- Comment header literally reads `APM Review Panel Verdict: APPROVE`.
+- Required section reads "None.".
+- Verdict label `panel-approved` is on the PR.
+- `panel-review` label is no longer on the PR.
+- Comment count from this run is exactly 1.
+
+## With-skill vs without-skill delta
+
+- With skill: structured top-loaded comment, deterministic verdict
+  label, fan-out across 6 distinct lenses.
+- Without skill: free-form single-voice review with no label
+  automation, no schema, no guarantee on which lenses got applied.
+
+The structure IS the value.
diff --git a/.apm/skills/apm-review-panel/evals/content-eval-rejected-pr.md b/.apm/skills/apm-review-panel/evals/content-eval-rejected-pr.md
new file mode 100644
index 000000000..83a2f498b
--- /dev/null
+++ b/.apm/skills/apm-review-panel/evals/content-eval-rejected-pr.md
@@ -0,0 +1,62 @@
+# Content eval: PR with architectural smell + nit (expected REJECT)
+
+## Synthetic PR input
+
+- Title: `feat(install): add retry to package downloader`
+- Body: Wraps `_download_package` in a 3-attempt retry loop with
+  fixed 2-second backoff. Adds a `retries` parameter to
+  `InstallPipeline.__init__` defaulting to 3.
+- Diff: 35 added lines, 2 removed, 2 files.
+- Files:
+  - `src/apm_cli/install/pipeline.py` (retry param wiring)
+  - `src/apm_cli/deps/github_downloader.py` (retry loop body inline
+    in `_download_package`)
+
+## Expected orchestrator behavior
+
+- Spawn 6 panelist tasks in parallel.
+- Auth Expert activates (`github_downloader.py` is a fast-path file)
+  and returns `active: true` with possibly one finding about token
+  refresh on 401.
+- Python Architect returns at least one `required` finding:
+  `{summary: "Inline retry loop violates Strategy pattern used
+  elsewhere in deps/", file: "src/apm_cli/deps/github_downloader.py",
+  rationale: "Codebase already has the chain-of-responsibility
+  retry pattern in AuthResolver; this should reuse it or extract a
+  shared retry helper rather than inlining a fixed loop"}`.
+- CLI Logging Expert returns one `nit`: missing `[!]` warning when
+  retry kicks in.
+- DevX UX Expert returns one `nit`: `retries=3` is undocumented in
+  CLI help.
+- Supply Chain Security Expert returns 0 findings (no integrity
+  surface affected).
+- OSS Growth Hacker returns 0 findings.
+- CEO synthesizer returns `{arbitration: "Retry behavior is sound
+  but the inline loop misses the codebase's existing retry
+  abstraction. Python Architect's required action holds.",
+  dissent_notes not present}`.
+- Orchestrator computes `total_required = 1` -> verdict = REJECT.
+- Orchestrator emits ONE comment:
+  - Header: `## APM Review Panel Verdict: REJECT`
+  - Required section: 1 item, prefixed `[python-architect]`.
+  - Nits section: 2 items, prefixed `[cli-logging-expert]` and
+    `[devx-ux-expert]`.
+  - CEO arbitration paragraph.
+  - Per-persona detail in `<details>`.
+- Orchestrator applies `panel-rejected` label.
+- Orchestrator removes `panel-review` label.
+
+## Pass criteria
+
+- Comment header literally reads `APM Review Panel Verdict: REJECT`.
+- Required section has exactly 1 item with `[python-architect]` prefix.
+- Verdict label `panel-rejected` is on the PR.
+- `panel-review` label is no longer on the PR.
+- The auth-expert per-persona block renders (active = true).
+
+## With-skill vs without-skill delta
+
+Same as the clean-PR eval: with skill, the verdict is structured and
+machine-actionable (label + binary outcome). Without skill, there is
+no schema, no required-vs-nit discipline, and the developer cannot
+tell at a glance whether the comment blocks merge or not.
diff --git a/.apm/skills/apm-review-panel/evals/results-e2e-pr931-2026-04-28.md b/.apm/skills/apm-review-panel/evals/results-e2e-pr931-2026-04-28.md
new file mode 100644
index 000000000..e0fdec31a
--- /dev/null
+++ b/.apm/skills/apm-review-panel/evals/results-e2e-pr931-2026-04-28.md
@@ -0,0 +1,71 @@
+# End-to-end eval result: PR #931 (2026-04-28)
+
+Real gh-aw run of the refactored panel against an unrelated open PR
+in microsoft/apm. Branch-pin method (option B) -- workflow YAML
+imported the panel skill from `microsoft/apm#refactor/review-panel-fanout`
+instead of `#main`.
+
+## Setup
+
+- Workflow run: https://github.com/microsoft/apm/actions/runs/25069734881
+- Target PR: #931 (`docs(compile): clarify target routing docstring`,
+  +7/-3, no labels, external contributor)
+- Trigger: `workflow_dispatch` from `refactor/review-panel-fanout`
+  with `pr_number=931`
+- Skill source: `.apm/skills/apm-review-panel/SKILL.md` from this branch
+- gh-aw version: v0.68.3
+
+## Job graph (all SUCCESS)
+
+```
+pre_activation -> activation -> apm -> agent -> detection -> safe_outputs -> conclusion
+```
+
+All 7 jobs completed successfully on first run.
+
+## Acceptance criteria
+
+| # | Criterion | Result |
+|---|---|---|
+| 1 | Verdict header literally reads `## APM Review Panel Verdict: <REJECT|APPROVE>` | PASS -- `## APM Review Panel Verdict: REJECT` |
+| 2 | Top-loaded order: verdict -> required -> nits -> CEO -> per-persona | PASS -- exact order observed |
+| 3 | Required findings rendered as `[persona] summary` + Why + Suggested fix | PASS -- 3 required items, all 3 panelists, all with file/line citations |
+| 4 | Per-persona detail in collapsed `<details>` block at bottom | PASS -- `<details><summary>Per-persona findings (full)</summary>...` |
+| 5 | All 6 personas reported back (5 mandatory + auth-expert) | PASS -- python-architect, cli-logging-expert, devx-ux-expert, supply-chain-security-expert, auth-expert, oss-growth-hacker |
+| 6 | Auth-expert correctly inactive when no auth surface touched | PASS -- "Inactive -- PR #931 modifies only a docstring ... no auth, token, credential, host-classification, or AuthResolver-related files are touched." |
+| 7 | Verdict deterministic from `required[]` count (3 > 0 -> REJECT) | PASS -- correct mapping observed |
+| 8 | Verdict label `panel-approved` XOR `panel-rejected` applied | PASS -- `panel-rejected` added |
+| 9 | Comment count from this run = 1 (max:2 ceiling honored) | PASS -- 1 comment |
+| 10 | Python Architect class diagram in extras passed through to per-persona block | PASS -- mermaid `classDiagram` rendered in details |
+
+## Surprise findings (positive)
+
+- The panel surfaced a REAL regression in PR #931 that prior reviewers
+  had not flagged: the proposed docstring silently drops `gemini`
+  routing that exists in the implementation. Three independent
+  panelists converged on the same root cause with zero dissent. The
+  CEO arbitration explicitly noted this convergence and recommended a
+  rebase-and-revise rather than a close.
+- Growth Hacker correctly identified the contributor as external
+  (WilliamK112) and recommended a "warm rejection that models a
+  low-friction contribution loop" -- the fan-out captured a strategic
+  signal that a single-loop reviewer would have missed.
+
+## Failure modes NOT observed (negative confirmation)
+
+- No "approve with reservations" wording -- verdict is binary as
+  designed.
+- No panelist comment posted to the PR -- only the orchestrator's
+  single synthesized comment. The single-writer interlock held.
+- No `panel-review` label cleanup needed (the run was triggered via
+  workflow_dispatch, not via label, so there was no `panel-review`
+  label to remove). The label-removal path will be exercised when the
+  first real `panel-review`-labelled PR runs after merge.
+
+## Total cost
+
+One gh-aw workflow run, ~12 minutes wall-clock. No retries needed.
+
+## Artifact
+
+Full panel comment: https://github.com/microsoft/apm/pull/931#issuecomment-<see-PR>
diff --git a/.apm/skills/apm-review-panel/evals/results-label-reset-2026-04-28.md b/.apm/skills/apm-review-panel/evals/results-label-reset-2026-04-28.md
new file mode 100644
index 000000000..022548659
--- /dev/null
+++ b/.apm/skills/apm-review-panel/evals/results-label-reset-2026-04-28.md
@@ -0,0 +1,63 @@
+# Label-reset workflow validation (2026-04-28)
+
+End-to-end test of `.github/workflows/pr-panel-label-reset.yml` —
+the deterministic plain-Actions companion that strips verdict labels
+when a PR receives a new commit (S7 DETERMINISTIC TOOL BRIDGE).
+
+## Setup
+
+Probe PR: microsoft/apm#1025 (closed)
+
+- Base: `refactor/review-panel-fanout` (so the workflow YAML is
+  loaded from the feature branch — required because
+  `pull_request: synchronize` reads the BASE-branch workflow file).
+- Head: `test/label-reset-probe` (deleted after test).
+- Initial state: one commit ahead of base, no labels.
+
+## Procedure
+
+1. Applied `panel-rejected` label manually (simulating a panel verdict).
+2. Pushed a second commit to `test/label-reset-probe` head ref.
+3. The push fires `pull_request: synchronize` on PR #1025.
+4. Reset workflow runs from the base-branch YAML and iterates over
+   `panel-approved` / `panel-rejected`, calling `gh pr edit
+   --remove-label` for each.
+
+## Result: PASS
+
+| Check | Expected | Actual |
+|-------|----------|--------|
+| Workflow ran on synchronize | yes | yes — run 25076314935 |
+| Run status | success | success (13s) |
+| `panel-rejected` removed | yes | yes (labels: []) |
+| Missing `panel-approved` handled gracefully | no error | yes (if/else swallows) |
+
+`panel-approved` path uses identical code in the same loop; testing
+one path proves the logic.
+
+## Side finding
+
+The `panel-approved` repo label did NOT exist at test time. The
+refactor PR introduces the workflow and the contract but did not
+create the label itself. Created post-test:
+
+- `panel-approved` (#0E8A16, green) — Apm-review-panel verdict: APPROVE. Removed automatically on next push.
+- `panel-rejected` (#B60205, red) — description and color updated for clarity.
+
+## Run log excerpt
+
+```
+Removed label 'panel-approved' (or it was not present).
+Removed label 'panel-rejected' (or it was not present).
+```
+
+(Both branches of the conditional behave the same; the message text
+is intentionally vague because gh CLI exit codes do not distinguish
+"removed" from "was already absent".)
+
+## Cross-references
+
+- Workflow file: `.github/workflows/pr-panel-label-reset.yml`
+- Probe PR: https://github.com/microsoft/apm/pull/1025
+- Run: https://github.com/microsoft/apm/actions/runs/25076314935
+- Companion e2e for the panel itself: `results-e2e-pr931-2026-04-28.md`
diff --git a/.apm/skills/apm-review-panel/evals/results-trigger-2026-04-28.md b/.apm/skills/apm-review-panel/evals/results-trigger-2026-04-28.md
new file mode 100644
index 000000000..42804e9e8
--- /dev/null
+++ b/.apm/skills/apm-review-panel/evals/results-trigger-2026-04-28.md
@@ -0,0 +1,74 @@
+# Trigger Eval Self-Run (LLM dispatcher = me)
+
+Skill description (verbatim from `.apm/skills/apm-review-panel/SKILL.md`):
+> Use this skill to run a multi-persona expert panel review on a labelled
+> pull request in microsoft/apm. The panel fans out to five mandatory
+> specialists plus one conditional auth specialist, all running in their
+> own agent threads, and a CEO synthesizer. The orchestrator is the sole
+> writer to the PR: one comment plus exactly one verdict label
+> (panel-approved or panel-rejected, derived deterministically from the
+> aggregated findings). Activate when a non-trivial PR needs cross-cutting
+> review (architecture, CLI logging, DevX UX, supply-chain security,
+> growth/positioning, optionally auth, with CEO arbitration).
+
+For each query I record:
+- `expected` = label from trigger-evals.json (TRIGGER / NOT)
+- `judged`   = my dispatch decision as an LLM reading the description above
+- `match`    = expected == judged
+
+## Should-trigger TRAIN (5)
+
+| # | Query | Expected | Judged | Match | Reasoning |
+|---|---|---|---|---|---|
+| 1 | "review this PR with the expert panel" | TRIGGER | TRIGGER | YES | "expert panel" + "PR" maps cleanly to "multi-persona expert panel review on a labelled pull request" |
+| 2 | "run the apm-review-panel against this branch" | TRIGGER | TRIGGER | YES | exact name match |
+| 3 | "do a multi-persona review of PR #1234" | TRIGGER | TRIGGER | YES | "multi-persona" is verbatim in description |
+| 4 | "panel-review this PR" | TRIGGER | TRIGGER | YES | "panel" + "PR" + uses the trigger-label term as verb |
+| 5 | "get the architecture + security + UX review on this PR" | TRIGGER | TRIGGER | YES | enumerates 3 of the 6 listed lenses + PR scope |
+
+Train pass rate: 5/5 = 1.00
+
+## Should-trigger VAL (3)
+
+| # | Query | Expected | Judged | Match | Reasoning |
+|---|---|---|---|---|---|
+| 6 | "ask the python architect, cli logging expert, and security expert to weigh in on this PR" | TRIGGER | TRIGGER | YES | enumerates 3 of the 5 mandatory specialists by name + PR scope |
+| 7 | "have the apm panel verdict this PR" | TRIGGER | TRIGGER | YES | "apm panel" + "verdict" matches "verdict label (panel-approved or panel-rejected)" |
+| 8 | "i need a cross-cutting expert review of this change before merge" | TRIGGER | TRIGGER | YES | "cross-cutting" is verbatim in description; "before merge" implies PR scope |
+
+Val pass rate: 3/3 = 1.00 (gate: >= 0.5) -- PASS
+
+## Should-NOT-trigger TRAIN (5)
+
+| # | Query | Expected | Judged | Match | Reasoning |
+|---|---|---|---|---|---|
+| 9  | "give me a code review of this file" | NOT | NOT | YES | single file, no PR framing, no panel language; would route to a generic code-review skill |
+| 10 | "what does the python architect think of this class hierarchy" | NOT | NOT | YES | single persona, not panel; description requires "multi-persona" |
+| 11 | "review this docstring" | NOT | NOT | YES | scope is one docstring, not a PR; no panel framing |
+| 12 | "fix the lint errors in src/" | NOT | NOT | YES | mechanical task, not review; description is for review only |
+| 13 | "what is the apm-review-panel skill" | NOT | NOT | YES | meta-question about the skill, not a request to run it; an info skill should answer this |
+
+Train pass rate (negative): 5/5 = 1.00 (correct rejection)
+
+## Should-NOT-trigger VAL (3)
+
+| # | Query | Expected | Judged | Match | Reasoning |
+|---|---|---|---|---|---|
+| 14 | "explain how the auth resolver works" | NOT | NOT | YES | explanation request, not review; auth-expert mention is incidental |
+| 15 | "summarize the diff of PR #1234" | NOT | NOT | YES | summarization, not multi-persona review with verdict |
+| 16 | "draft a PR description for me" | NOT | NOT | YES | authoring, not reviewing; pr-description-skill territory |
+
+Val pass rate (negative): 3/3 = 1.00 (gate: < 0.5 false-positive rate, i.e. >= 0.5 correct rejection) -- PASS
+
+## Summary
+
+| Split | Should-trigger correct | Should-NOT correct | Combined |
+|---|---|---|---|
+| Train (n=10) | 5/5 = 1.00 | 5/5 = 1.00 | 10/10 = 1.00 |
+| Val   (n=6)  | 3/3 = 1.00 | 3/3 = 1.00 | 6/6  = 1.00 |
+
+**SHIP GATE (validation split):**
+- Should-trigger pass rate: 1.00 >= 0.5 -- PASS
+- Should-NOT pass rate:    1.00 >= 0.5 -- PASS
+
+**RESULT: PASS.** Dispatch description discriminates cleanly with no false positives or false negatives in this run. Caveat: I am a single LLM judging my own routing; the canonical eval would run a different model as dispatcher and average over multiple samples. But this is real LLM judgment, not hand-waving.
diff --git a/.apm/skills/apm-review-panel/evals/run-verdict-harness.py b/.apm/skills/apm-review-panel/evals/run-verdict-harness.py
new file mode 100644
index 000000000..b504f0ea3
--- /dev/null
+++ b/.apm/skills/apm-review-panel/evals/run-verdict-harness.py
@@ -0,0 +1,277 @@
+#!/usr/bin/env python3
+"""
+Deterministic verdict harness for the apm-review-panel skill.
+
+Validates two things end-to-end WITHOUT spinning up an LLM:
+1. Synthetic panelist JSON returns conform to panelist-return-schema.json.
+2. The orchestrator's deterministic verdict rule produces the expected
+   verdict + label mapping per the eval scenarios.
+
+This is genesis Step 8 evals gate, deterministic slice. It does NOT
+prove that a real LLM panelist will return well-formed JSON (that
+requires option B, the branch-pin end-to-end test). It DOES prove that
+when a panelist returns well-formed JSON, the orchestrator math is
+correct and the schema rejects malformed shapes.
+
+Usage:
+    uv run --with jsonschema python3 \
+        .apm/skills/apm-review-panel/evals/run-verdict-harness.py
+"""
+
+from __future__ import annotations
+
+import json
+import sys
+from pathlib import Path
+
+import jsonschema
+
+ROOT = Path(__file__).resolve().parents[1]
+SCHEMA_PANELIST = json.loads(
+    (ROOT / "assets" / "panelist-return-schema.json").read_text()
+)
+SCHEMA_CEO = json.loads((ROOT / "assets" / "ceo-return-schema.json").read_text())
+
+
+def derive_verdict(panelists: list[dict]) -> tuple[str, str]:
+    """
+    Pure verdict derivation per SKILL.md execution checklist step 6:
+        total_required = sum(len(p["required"]) for p in panelists if p.get("active", True))
+        verdict = "APPROVE" if total_required == 0 else "REJECT"
+    """
+    total_required = sum(
+        len(p["required"]) for p in panelists if p.get("active", True)
+    )
+    verdict = "APPROVE" if total_required == 0 else "REJECT"
+    label = "panel-approved" if verdict == "APPROVE" else "panel-rejected"
+    return verdict, label
+
+
+def case_clean_pr() -> dict:
+    """Synthetic returns for the docs-only PR scenario."""
+    panelists = [
+        {"persona": "python-architect", "required": [], "nits": []},
+        {
+            "persona": "cli-logging-expert",
+            "required": [],
+            "nits": [
+                {
+                    "summary": "Wording: 'aliasing' could be 'aliases'",
+                    "rationale": "Reads more naturally to npm/pip users",
+                    "file": "README.md",
+                    "line": 142,
+                }
+            ],
+        },
+        {"persona": "devx-ux-expert", "required": [], "nits": []},
+        {"persona": "supply-chain-security-expert", "required": [], "nits": []},
+        {"persona": "oss-growth-hacker", "required": [], "nits": []},
+        {
+            "persona": "auth-expert",
+            "active": False,
+            "inactive_reason": "README.md has no auth surface; no fast-path file matched.",
+            "required": [],
+            "nits": [],
+        },
+    ]
+    ceo = {
+        "arbitration": (
+            "Documentation-only change pointing fish users to an existing "
+            "recipe. All five mandatory specialists agree no required actions; "
+            "auth-expert correctly inactive."
+        )
+    }
+    return {
+        "name": "clean-pr",
+        "panelists": panelists,
+        "ceo": ceo,
+        "expected_verdict": "APPROVE",
+        "expected_label": "panel-approved",
+    }
+
+
+def case_rejected_pr() -> dict:
+    """Synthetic returns for the inline-retry-loop PR scenario."""
+    panelists = [
+        {
+            "persona": "python-architect",
+            "required": [
+                {
+                    "summary": "Inline retry loop violates Strategy pattern in deps/",
+                    "rationale": (
+                        "Codebase already has chain-of-responsibility retry "
+                        "via AuthResolver; reuse or extract a shared helper "
+                        "instead of inlining a fixed loop."
+                    ),
+                    "file": "src/apm_cli/deps/github_downloader.py",
+                    "line": 218,
+                    "suggestion": "Extract retry into deps/_retry.py with same shape as AuthResolver chain.",
+                }
+            ],
+            "nits": [],
+        },
+        {
+            "persona": "cli-logging-expert",
+            "required": [],
+            "nits": [
+                {
+                    "summary": "Missing [!] warning when retry kicks in",
+                    "rationale": "User-visible network blip; STATUS_SYMBOLS guidance applies.",
+                    "file": "src/apm_cli/deps/github_downloader.py",
+                }
+            ],
+        },
+        {
+            "persona": "devx-ux-expert",
+            "required": [],
+            "nits": [
+                {
+                    "summary": "retries=3 undocumented in CLI help",
+                    "rationale": "Mental model gap; npm/pip users expect to see configurable retries surfaced.",
+                }
+            ],
+        },
+        {"persona": "supply-chain-security-expert", "required": [], "nits": []},
+        {"persona": "oss-growth-hacker", "required": [], "nits": []},
+        {
+            "persona": "auth-expert",
+            "active": True,
+            "required": [],
+            "nits": [
+                {
+                    "summary": "Consider 401-aware retry",
+                    "rationale": "Stale-PAT handling already exists in AuthResolver; align.",
+                }
+            ],
+        },
+    ]
+    ceo = {
+        "arbitration": (
+            "Retry behavior is sound but the inline loop misses the "
+            "codebase's existing retry abstraction. Python Architect's "
+            "required action holds; nits across logging/devx/auth align."
+        ),
+        "dissent_notes": "",
+    }
+    return {
+        "name": "rejected-pr",
+        "panelists": panelists,
+        "ceo": ceo,
+        "expected_verdict": "REJECT",
+        "expected_label": "panel-rejected",
+    }
+
+
+def case_malformed_panelist() -> dict:
+    """Negative case: panelist forgot the `nits` field. S4 gate must fail."""
+    return {
+        "name": "malformed-missing-nits",
+        "panelist": {"persona": "python-architect", "required": []},
+        "expect_schema_error": True,
+    }
+
+
+def case_unknown_persona() -> dict:
+    """Negative case: persona enum violation."""
+    return {
+        "name": "malformed-unknown-persona",
+        "panelist": {
+            "persona": "unknown-persona",
+            "required": [],
+            "nits": [],
+        },
+        "expect_schema_error": True,
+    }
+
+
+def case_disposition_leak() -> dict:
+    """Negative case: panelist tried to add a verdict field. additionalProperties:false catches."""
+    return {
+        "name": "malformed-disposition-leak",
+        "panelist": {
+            "persona": "python-architect",
+            "required": [],
+            "nits": [],
+            "disposition": "APPROVE",
+        },
+        "expect_schema_error": True,
+    }
+
+
+def run_positive(case: dict) -> tuple[bool, str]:
+    """Validate every panelist + CEO return, then check verdict."""
+    notes = []
+    for p in case["panelists"]:
+        try:
+            jsonschema.validate(p, SCHEMA_PANELIST)
+        except jsonschema.ValidationError as e:
+            return False, f"panelist {p.get('persona')} failed schema: {e.message}"
+        notes.append(
+            f"  - {p['persona']}: active={p.get('active', True)}, "
+            f"required={len(p['required'])}, nits={len(p['nits'])}"
+        )
+
+    try:
+        jsonschema.validate(case["ceo"], SCHEMA_CEO)
+    except jsonschema.ValidationError as e:
+        return False, f"CEO failed schema: {e.message}"
+
+    verdict, label = derive_verdict(case["panelists"])
+    if verdict != case["expected_verdict"]:
+        return (
+            False,
+            f"expected verdict={case['expected_verdict']}, got {verdict}",
+        )
+    if label != case["expected_label"]:
+        return False, f"expected label={case['expected_label']}, got {label}"
+
+    return True, "\n".join(
+        notes
+        + [
+            f"  -> verdict={verdict}, label={label} (expected matched)",
+        ]
+    )
+
+
+def run_negative(case: dict) -> tuple[bool, str]:
+    """Validate that schema rejects the malformed shape."""
+    try:
+        jsonschema.validate(case["panelist"], SCHEMA_PANELIST)
+    except jsonschema.ValidationError as e:
+        return True, f"correctly rejected: {e.message.splitlines()[0]}"
+    return False, "schema accepted a malformed return; S4 gate would have let it through"
+
+
+def main() -> int:
+    cases_pos = [case_clean_pr(), case_rejected_pr()]
+    cases_neg = [case_malformed_panelist(), case_unknown_persona(), case_disposition_leak()]
+
+    ok = True
+    print("APM Review Panel - Deterministic Verdict Harness")
+    print("=" * 60)
+    print()
+    print("Positive cases (well-formed JSON, expected verdict):")
+    for c in cases_pos:
+        passed, msg = run_positive(c)
+        marker = "[+] PASS" if passed else "[x] FAIL"
+        print(f"{marker}  {c['name']}")
+        print(msg)
+        print()
+        ok = ok and passed
+
+    print("Negative cases (malformed JSON, S4 schema gate must reject):")
+    for c in cases_neg:
+        passed, msg = run_negative(c)
+        marker = "[+] PASS" if passed else "[x] FAIL"
+        print(f"{marker}  {c['name']}")
+        print(f"  {msg}")
+        print()
+        ok = ok and passed
+
+    print("=" * 60)
+    print("RESULT:", "ALL PASS" if ok else "FAIL")
+    return 0 if ok else 1
+
+
+if __name__ == "__main__":
+    sys.exit(main())
diff --git a/.apm/skills/apm-review-panel/evals/trigger-evals.json b/.apm/skills/apm-review-panel/evals/trigger-evals.json
new file mode 100644
index 000000000..cbedde29d
--- /dev/null
+++ b/.apm/skills/apm-review-panel/evals/trigger-evals.json
@@ -0,0 +1,31 @@
+{
+  "description": "Trigger evals for the apm-review-panel skill dispatch description. 8 should-trigger + 8 should-NOT-trigger. 60/40 train/val split (10 train, 6 val). Validation split is the ship gate.",
+  "should_trigger": {
+    "train": [
+      "review this PR with the expert panel",
+      "run the apm-review-panel against this branch",
+      "do a multi-persona review of PR #1234",
+      "panel-review this PR",
+      "get the architecture + security + UX review on this PR"
+    ],
+    "val": [
+      "ask the python architect, cli logging expert, and security expert to weigh in on this PR",
+      "have the apm panel verdict this PR",
+      "i need a cross-cutting expert review of this change before merge"
+    ]
+  },
+  "should_not_trigger": {
+    "train": [
+      "give me a code review of this file",
+      "what does the python architect think of this class hierarchy",
+      "review this docstring",
+      "fix the lint errors in src/",
+      "what is the apm-review-panel skill"
+    ],
+    "val": [
+      "explain how the auth resolver works",
+      "summarize the diff of PR #1234",
+      "draft a PR description for me"
+    ]
+  }
+}
diff --git a/.github/agents/apm-ceo.agent.md b/.github/agents/apm-ceo.agent.md
index 2d99f328b..852b74100 100644
--- a/.github/agents/apm-ceo.agent.md
+++ b/.github/agents/apm-ceo.agent.md
@@ -94,3 +94,34 @@ For any non-trivial change, ask:
 - You do NOT touch `WIP/growth-strategy.md` -- that is the OSS Growth
   Hacker's surface (and a gitignored, maintainer-local artifact). You
   consume their output as input to strategic calls.
+
+## Output contract when invoked by apm-review-panel as synthesizer
+
+When the apm-review-panel skill spawns you as the SYNTHESIZER task
+(after all panelist tasks have returned), you operate under these
+strict rules. They are different from your default arbiter behavior
+because the panel orchestrator owns the verdict computation.
+
+- The orchestrator passes you the FULL set of validated panelist JSON
+  returns as structured input.
+- You produce ARBITRATION PROSE ONLY. You do NOT pick the verdict.
+  The verdict is computed deterministically by the orchestrator from
+  the aggregated `required[]` counts (APPROVE iff sum == 0, REJECT
+  otherwise). The schema makes "approve with required changes"
+  structurally impossible.
+- You return JSON matching `assets/ceo-return-schema.json` from the
+  apm-review-panel skill, as the FINAL message of your task. No prose
+  around the JSON; the orchestrator parses your last message.
+  - `arbitration`: 1-3 paragraphs. Resolve any disagreement between
+    specialists. Surface strategic implications (positioning, breaking
+    change, naming, scope). If specialists agreed and the change is
+    uncontroversial, say so plainly.
+  - `dissent_notes` (optional): when two or more panelists disagreed
+    on whether a finding is REQUIRED vs NIT, name the disagreement
+    and state which side you side with and why.
+  - `growth_signal` (optional): echo any side-channel note from the
+    oss-growth-hacker panelist that should be amplified in the
+    headline (conversion, narrative, breaking-change comms).
+- You MUST NOT call `gh pr comment`, `gh pr edit`, `gh issue`, or any
+  other GitHub write command. You MUST NOT post to `safe-outputs`.
+  The orchestrator is the sole writer.
diff --git a/.github/agents/auth-expert.agent.md b/.github/agents/auth-expert.agent.md
index cd3331860..162f741b9 100644
--- a/.github/agents/auth-expert.agent.md
+++ b/.github/agents/auth-expert.agent.md
@@ -55,3 +55,33 @@ When reviewing or writing auth code:
 - Classic PATs (`ghp_`) work cross-org but are being deprecated -- prefer fine-grained
 - ADO uses Basic auth with base64-encoded `:PAT` -- different from GitHub bearer token flow
 - ADO also supports AAD bearer tokens via `az account get-access-token` (resource `499b84ac-1321-427f-aa17-267ca6975798`); precedence is `ADO_APM_PAT` -> az bearer -> fail. Stale PATs (401) silently fall back to the bearer with a `[!]` warning. See the auth skill for the four diagnostic cases.
+
+## Output contract when invoked by apm-review-panel
+
+When the apm-review-panel skill spawns you as a panelist task, you
+operate under these strict rules. They override any default behavior
+that would post comments or apply labels.
+
+- You read the persona scope above and the PR title/body/diff passed
+  in the task prompt.
+- You produce findings in TWO buckets only:
+  - `required`: blocks merge. Real, actionable, citing file/line where
+    possible. Anything you put here will produce a REJECT verdict.
+  - `nits`: one-line suggestions the author can skip. No third bucket,
+    no "consider", no "optional follow-up". If a finding is real and
+    matters, it is required. If not, it is a nit.
+- You return JSON matching `assets/panelist-return-schema.json` from
+  the apm-review-panel skill, as the FINAL message of your task. No
+  prose around the JSON; the orchestrator parses your last message.
+- You MUST NOT call `gh pr comment`, `gh pr edit`, `gh issue`, or any
+  other GitHub write command. You MUST NOT post to `safe-outputs`.
+  You MUST NOT touch the PR state. The orchestrator is the sole
+  writer; your only output channel is the JSON return.
+- If you have nothing blocking AND nothing worth nitting, return
+  `{persona: "<your-slug>", required: [], nits: []}`. That is a
+  valid and preferred answer when true.
+- Auth-specific: when the apm-review-panel orchestrator spawns you
+  with "active=false" framing (the conditional rule did not fire), you
+  return `{persona: "auth-expert", active: false, inactive_reason:
+  "<one sentence citing the touched files>", required: [], nits: []}`
+  WITHOUT performing a full review. Trust the orchestrator's routing.
diff --git a/.github/agents/cli-logging-expert.agent.md b/.github/agents/cli-logging-expert.agent.md
index f6dca5b8b..c11e08b05 100644
--- a/.github/agents/cli-logging-expert.agent.md
+++ b/.github/agents/cli-logging-expert.agent.md
@@ -48,3 +48,28 @@ You are an expert on CLI output UX with excellent taste. You ensure verbose mode
 3. **Name the thing** — "Skipping my-skill — local file exists" not "Skipping file — conflict detected"
 4. **Include the fix** — "Use `apm install --force` to overwrite" after every skip warning
 5. **No emojis** — ASCII `STATUS_SYMBOLS` only, never emoji characters
+
+## Output contract when invoked by apm-review-panel
+
+When the apm-review-panel skill spawns you as a panelist task, you
+operate under these strict rules. They override any default behavior
+that would post comments or apply labels.
+
+- You read the persona scope above and the PR title/body/diff passed
+  in the task prompt.
+- You produce findings in TWO buckets only:
+  - `required`: blocks merge. Real, actionable, citing file/line where
+    possible. Anything you put here will produce a REJECT verdict.
+  - `nits`: one-line suggestions the author can skip. No third bucket,
+    no "consider", no "optional follow-up". If a finding is real and
+    matters, it is required. If not, it is a nit.
+- You return JSON matching `assets/panelist-return-schema.json` from
+  the apm-review-panel skill, as the FINAL message of your task. No
+  prose around the JSON; the orchestrator parses your last message.
+- You MUST NOT call `gh pr comment`, `gh pr edit`, `gh issue`, or any
+  other GitHub write command. You MUST NOT post to `safe-outputs`.
+  You MUST NOT touch the PR state. The orchestrator is the sole
+  writer; your only output channel is the JSON return.
+- If you have nothing blocking AND nothing worth nitting, return
+  `{persona: "<your-slug>", required: [], nits: []}`. That is a
+  valid and preferred answer when true.
diff --git a/.github/agents/devx-ux-expert.agent.md b/.github/agents/devx-ux-expert.agent.md
index 811eac7b8..25b10e0e2 100644
--- a/.github/agents/devx-ux-expert.agent.md
+++ b/.github/agents/devx-ux-expert.agent.md
@@ -78,3 +78,28 @@ When reviewing a command, command help text, or a workflow change, ask:
   expert when a UX change touches auth, lockfile integrity, or download
   paths.
 - Strategic naming / positioning calls escalate to the APM CEO.
+
+## Output contract when invoked by apm-review-panel
+
+When the apm-review-panel skill spawns you as a panelist task, you
+operate under these strict rules. They override any default behavior
+that would post comments or apply labels.
+
+- You read the persona scope above and the PR title/body/diff passed
+  in the task prompt.
+- You produce findings in TWO buckets only:
+  - `required`: blocks merge. Real, actionable, citing file/line where
+    possible. Anything you put here will produce a REJECT verdict.
+  - `nits`: one-line suggestions the author can skip. No third bucket,
+    no "consider", no "optional follow-up". If a finding is real and
+    matters, it is required. If not, it is a nit.
+- You return JSON matching `assets/panelist-return-schema.json` from
+  the apm-review-panel skill, as the FINAL message of your task. No
+  prose around the JSON; the orchestrator parses your last message.
+- You MUST NOT call `gh pr comment`, `gh pr edit`, `gh issue`, or any
+  other GitHub write command. You MUST NOT post to `safe-outputs`.
+  You MUST NOT touch the PR state. The orchestrator is the sole
+  writer; your only output channel is the JSON return.
+- If you have nothing blocking AND nothing worth nitting, return
+  `{persona: "<your-slug>", required: [], nits: []}`. That is a
+  valid and preferred answer when true.
diff --git a/.github/agents/oss-growth-hacker.agent.md b/.github/agents/oss-growth-hacker.agent.md
index dbe67877f..7cad1a8e5 100644
--- a/.github/agents/oss-growth-hacker.agent.md
+++ b/.github/agents/oss-growth-hacker.agent.md
@@ -97,3 +97,28 @@ The CEO consumes your annotations when making the final call.
 - You write only to `WIP/growth-strategy.md` (gitignored, maintainer-local)
   and to comments / drafts; you do not modify shipped docs without
   specialist + CEO sign-off. Never stage or commit anything under `WIP/`.
+
+## Output contract when invoked by apm-review-panel
+
+When the apm-review-panel skill spawns you as a panelist task, you
+operate under these strict rules. They override any default behavior
+that would post comments or apply labels.
+
+- You read the persona scope above and the PR title/body/diff passed
+  in the task prompt.
+- You produce findings in TWO buckets only:
+  - `required`: blocks merge. Real, actionable, citing file/line where
+    possible. Anything you put here will produce a REJECT verdict.
+  - `nits`: one-line suggestions the author can skip. No third bucket,
+    no "consider", no "optional follow-up". If a finding is real and
+    matters, it is required. If not, it is a nit.
+- You return JSON matching `assets/panelist-return-schema.json` from
+  the apm-review-panel skill, as the FINAL message of your task. No
+  prose around the JSON; the orchestrator parses your last message.
+- You MUST NOT call `gh pr comment`, `gh pr edit`, `gh issue`, or any
+  other GitHub write command. You MUST NOT post to `safe-outputs`.
+  You MUST NOT touch the PR state. The orchestrator is the sole
+  writer; your only output channel is the JSON return.
+- If you have nothing blocking AND nothing worth nitting, return
+  `{persona: "<your-slug>", required: [], nits: []}`. That is a
+  valid and preferred answer when true.
diff --git a/.github/agents/python-architect.agent.md b/.github/agents/python-architect.agent.md
index ebedc9af5..1ddc12662 100644
--- a/.github/agents/python-architect.agent.md
+++ b/.github/agents/python-architect.agent.md
@@ -223,3 +223,28 @@ Rules for this subsection:
   over-engineering, write "Pragmatic suggestion: none -- the current
   shape is the simplest correct design at this scope." That is a valid
   and preferred answer when true.
+
+## Output contract when invoked by apm-review-panel
+
+When the apm-review-panel skill spawns you as a panelist task, you
+operate under these strict rules. They override any default behavior
+that would post comments or apply labels.
+
+- You read the persona scope above and the PR title/body/diff passed
+  in the task prompt.
+- You produce findings in TWO buckets only:
+  - `required`: blocks merge. Real, actionable, citing file/line where
+    possible. Anything you put here will produce a REJECT verdict.
+  - `nits`: one-line suggestions the author can skip. No third bucket,
+    no "consider", no "optional follow-up". If a finding is real and
+    matters, it is required. If not, it is a nit.
+- You return JSON matching `assets/panelist-return-schema.json` from
+  the apm-review-panel skill, as the FINAL message of your task. No
+  prose around the JSON; the orchestrator parses your last message.
+- You MUST NOT call `gh pr comment`, `gh pr edit`, `gh issue`, or any
+  other GitHub write command. You MUST NOT post to `safe-outputs`.
+  You MUST NOT touch the PR state. The orchestrator is the sole
+  writer; your only output channel is the JSON return.
+- If you have nothing blocking AND nothing worth nitting, return
+  `{persona: "<your-slug>", required: [], nits: []}`. That is a
+  valid and preferred answer when true.
diff --git a/.github/agents/supply-chain-security-expert.agent.md b/.github/agents/supply-chain-security-expert.agent.md
index 4c8a2eafd..3fc055532 100644
--- a/.github/agents/supply-chain-security-expert.agent.md
+++ b/.github/agents/supply-chain-security-expert.agent.md
@@ -94,3 +94,28 @@ file integration, ask:
   trade-off to the DevX UX expert and escalate to the CEO.
 - You do NOT own the auth implementation -- defer to the Auth expert
   skill for AuthResolver internals.
+
+## Output contract when invoked by apm-review-panel
+
+When the apm-review-panel skill spawns you as a panelist task, you
+operate under these strict rules. They override any default behavior
+that would post comments or apply labels.
+
+- You read the persona scope above and the PR title/body/diff passed
+  in the task prompt.
+- You produce findings in TWO buckets only:
+  - `required`: blocks merge. Real, actionable, citing file/line where
+    possible. Anything you put here will produce a REJECT verdict.
+  - `nits`: one-line suggestions the author can skip. No third bucket,
+    no "consider", no "optional follow-up". If a finding is real and
+    matters, it is required. If not, it is a nit.
+- You return JSON matching `assets/panelist-return-schema.json` from
+  the apm-review-panel skill, as the FINAL message of your task. No
+  prose around the JSON; the orchestrator parses your last message.
+- You MUST NOT call `gh pr comment`, `gh pr edit`, `gh issue`, or any
+  other GitHub write command. You MUST NOT post to `safe-outputs`.
+  You MUST NOT touch the PR state. The orchestrator is the sole
+  writer; your only output channel is the JSON return.
+- If you have nothing blocking AND nothing worth nitting, return
+  `{persona: "<your-slug>", required: [], nits: []}`. That is a
+  valid and preferred answer when true.
diff --git a/.github/skills/apm-review-panel/SKILL.md b/.github/skills/apm-review-panel/SKILL.md
index 1a5dae18e..f2bdf92ce 100644
--- a/.github/skills/apm-review-panel/SKILL.md
+++ b/.github/skills/apm-review-panel/SKILL.md
@@ -1,59 +1,99 @@
 ---
 name: apm-review-panel
 description: >-
-  Use this skill to run a six-agent panel review (plus one conditional
-  auth specialist) for non-trivial PRs, design proposals, release
-  decisions, and other cross-cutting changes in microsoft/apm. Emit one
-  synthesized verdict comment.
+  Use this skill to run a multi-persona expert panel review on a labelled
+  pull request in microsoft/apm. The panel fans out to five mandatory
+  specialists plus one conditional auth specialist, all running in their
+  own agent threads, and a CEO synthesizer. The orchestrator is the sole
+  writer to the PR: one comment plus exactly one verdict label
+  (panel-approved or panel-rejected, derived deterministically from the
+  aggregated findings). Activate when a non-trivial PR needs cross-cutting
+  review (architecture, CLI logging, DevX UX, supply-chain security,
+  growth/positioning, optionally auth, with CEO arbitration).
 ---
 
-# APM Review Panel -- Expert Review Orchestration
-
-The panel is fixed at **5 mandatory specialist lenses + 1 conditional
-auth lens + 1 arbiter lens = up to 7 persona sections in one verdict
-comment**. You play each lens in turn from inside a single agent loop
-(progressive-disclosure skill model -- no sub-agent dispatch). Routing
-chooses *which* lenses execute; it never changes which headings appear
-in the final verdict.
+# APM Review Panel - Fan-Out Expert Review
+
+The panel is FAN-OUT + SYNTHESIZER. Each persona runs in its own agent
+thread (via the `task` tool) and returns JSON matching
+`assets/panelist-return-schema.json`. The orchestrator schema-validates
+each return, hands all returns to the apm-ceo synthesizer (also a task
+thread, returns JSON matching `assets/ceo-return-schema.json`), then
+deterministically derives the verdict and writes ONE comment plus ONE
+verdict label.
+
+## Architecture invariants
+
+- **Verdict is binary and deterministic.** APPROVE iff
+  `sum(len(p.required) for p in panelists if p.active) == 0`.
+  REJECT otherwise. The CEO does NOT pick the verdict; the CEO writes
+  the arbitration narrative. This kills the "approve with reservations"
+  failure mode at the schema level.
+- **Two severity buckets only.** `required` blocks merge. `nits` are
+  one-line suggestions the author can skip. There is no "optional",
+  no "consider", no "maybe later". If a finding is real and matters,
+  it is required. If not, it is a nit. No third bucket accumulates debt.
+- **Single-writer interlock.** Only the orchestrator writes to
+  `safe-outputs` (one `add-comment`, one `add-labels`, one
+  `remove-labels`). Panelist subagents and the CEO subagent return
+  JSON only and MUST NOT call any `gh` write command, post comments,
+  apply labels, or touch the PR state.
+- **Single-emission discipline.** Exactly one comment per panel run,
+  rendered from `assets/verdict-template.md` after all subagents return.
 
 ## Agent roster
 
-| Agent | Persona | Always active? |
-|-------|---------|----------------|
+| Agent | Role | Always active? |
+|-------|------|----------------|
 | [Python Architect](../../agents/python-architect.agent.md) | Architectural Reviewer | Yes |
 | [CLI Logging Expert](../../agents/cli-logging-expert.agent.md) | Output UX Reviewer | Yes |
 | [DevX UX Expert](../../agents/devx-ux-expert.agent.md) | Package-Manager UX | Yes |
 | [Supply Chain Security Expert](../../agents/supply-chain-security-expert.agent.md) | Threat-Model Reviewer | Yes |
-| [OSS Growth Hacker](../../agents/oss-growth-hacker.agent.md) | Adoption Strategist | Yes (side-channel to CEO) |
-| [Auth Expert](../../agents/auth-expert.agent.md) | Auth / Token Reviewer | Conditional (see "Conditional panelist" below) |
-| [APM CEO](../../agents/apm-ceo.agent.md) | Strategic Owner / Arbiter | Yes (always arbitrates) |
+| [OSS Growth Hacker](../../agents/oss-growth-hacker.agent.md) | Adoption Strategist | Yes |
+| [Auth Expert](../../agents/auth-expert.agent.md) | Auth / Token Reviewer | Conditional (see below) |
+| [APM CEO](../../agents/apm-ceo.agent.md) | Strategic Arbiter / Synthesizer | Yes |
 
-## Routing topology
+## Topology
 
 ```
-  python-architect    cli-logging-expert    devx-ux-expert    supply-chain-security-expert
-        \_______________________|______________________________/
-                                |   <-- auth-expert (conditional)
-                                v
-                            apm-ceo               <----  oss-growth-hacker
-                       (final call / arbiter)           (annotates findings;
-                                                         updates growth-strategy)
+   apm-review-panel SKILL (orchestrator thread)
+                      |
+   FAN-OUT via task tool (panelists in parallel)
+                      |
+   +-----+-------+-------+-----+-----+------+
+   v     v       v       v     v     v      v (cond.)
+  py    cli     dx-ux   sec   grw           auth
+   |     |       |       |     |             |
+   |   each returns JSON per panelist-return-schema.json
+   +-----+-------+-------+-----+-------------+
+                      |
+                      v   <-- S4 schema-validate
+                      v   <-- on malformed: re-spawn that persona
+                      v
+   task: apm-ceo synthesizer
+   - aggregates required[] across panelists
+   - resolves dissent
+   - returns ceo-return-schema.json (NO verdict field)
+                      |
+                      v   <-- DETERMINISTIC verdict gate
+                      |       APPROVE iff sum(required) == 0
+                      |       REJECT  otherwise
+                      v
+   orchestrator (sole writer)
+            |        |               |
+            v        v               v
+        add-comment add-labels   remove-labels
+        (max:2)     [panel-approved   [panel-review]
+                    XOR panel-rejected]
 ```
 
-- **Specialists raise findings independently** -- no implicit consensus.
-- **CEO arbitrates** when specialists disagree or when a finding has
-  strategic implications (positioning, breaking change, naming, scope).
-- **Growth Hacker is a side-channel** to the CEO: never blocks a
-  specialist finding; annotates it with growth implications and
-  escalates to the CEO when relevant.
-
 ## Conditional panelist: Auth Expert
 
 Auth Expert is the only conditional panelist. Activate `auth-expert`
 if either rule below matches.
 
-1. **Fast-path file trigger.** Activate the Auth Expert lens
-   immediately when the PR changes any of:
+1. **Fast-path file trigger.** Activate the Auth Expert task when the
+   PR changes any of:
    - `src/apm_cli/core/auth.py`
    - `src/apm_cli/core/token_manager.py`
    - `src/apm_cli/core/azure_cli.py`
@@ -64,7 +104,7 @@ if either rule below matches.
    - `src/apm_cli/deps/registry_proxy.py`
 
 2. **Fallback self-check.** If no fast-path file matched, answer this
-   before activating the lens:
+   before activating:
 
    > Does this PR change authentication behavior, token management,
    > credential resolution, host classification used by `AuthResolver`,
@@ -74,190 +114,159 @@ if either rule below matches.
 
 Routing rule:
 
-- **YES** -> take the Auth Expert lens (per the Persona pass
-  procedure) and capture its findings.
-- **NO**  -> record `Auth Expert inactive reason: <one sentence
-  citing the touched files>` in working notes; do not take the lens.
+- **YES** -> spawn the auth-expert task with the standard panelist
+  contract. It returns JSON with `active: true`.
+- **NO**  -> spawn the auth-expert task ANYWAY but instruct it to
+  return `{persona: "auth-expert", active: false, inactive_reason:
+  "<one sentence citing the touched files>", required: [], nits: []}`.
+  This keeps the schema uniform and the verdict template happy.
 - Never use wildcard heuristics like `*auth*`, `*token*`, or
   `*credential*` as the sole trigger.
 
-## Routing matrix
-
-These routes choose which specialists are emphasised for a given PR
-type. They do **not** change the verdict shape. The final comment
-always uses every persona heading in `assets/verdict-template.md`;
-the only persona that can render as `Not activated -- <reason>` is
-Auth Expert (per the conditional rule above).
-
-### Code review (architecture + logging)
-1. Python Architect reviews structure / patterns / cross-file impact.
-2. CLI Logging Expert reviews any output / logger changes.
-3. CEO ratifies if the two disagree on abstraction vs consistency.
-
-### CLI UX review
-1. DevX UX Expert reviews command surface, flags, help, error wording.
-2. CLI Logging Expert reviews how outputs are emitted (logger methods).
-3. Growth Hacker annotates if the change affects first-run conversion.
-4. CEO ratifies any naming / positioning calls.
-
-### Security review
-1. Supply Chain Security Expert maps the change to the threat model.
-2. DevX UX Expert flags any ergonomics regression from the mitigation.
-3. CEO arbitrates trade-offs; bias toward security on default behavior.
-
-### Auth review (only when the conditional Auth Expert is activated)
-1. Auth Expert maps the change against AuthResolver, token precedence,
-   host classification, and credential helpers.
-2. Supply Chain Security Expert checks for token-scoping or credential
-   leakage implications.
-3. CEO ratifies any default-behavior change.
-
-### Release / comms review
-1. CEO grounds the release framing in `gh` CLI stats.
-2. Growth Hacker drafts hook + story angle; updates
-   `WIP/growth-strategy.md` (gitignored maintainer-local; create if absent).
-3. Specialists sanity-check any technical claims in release notes.
-
-### Full panel review (non-trivial change)
-1. Each mandatory specialist produces independent findings.
-2. Auth Expert participates if the conditional rule above activates it.
-3. Growth Hacker annotates findings with growth implications.
-4. CEO synthesizes, resolves disagreements, makes the final call.
-5. Surface decision and rationale to the author via the single verdict
-   comment.
-
-## Quality gates
-
-A non-trivial change passes when:
-
-- [ ] Python Architect: structure / patterns OK (or change explicitly
-      justified)
-- [ ] CLI Logging Expert: output paths route through CommandLogger,
-      no direct `_rich_*` in commands
-- [ ] DevX UX Expert: command surface familiar to npm/pip/cargo users,
-      every error has a next action
-- [ ] Supply Chain Security Expert: no new path / auth / integrity
-      surface left unguarded; fails closed
-- [ ] Auth Expert (only if activated): no regression to AuthResolver
-      precedence, host classification, or credential leakage surface
-- [ ] APM CEO: trade-offs ratified, breaking changes have CHANGELOG +
-      migration line
-- [ ] OSS Growth Hacker: conversion surfaces unaffected or improved;
-      `WIP/growth-strategy.md` updated if relevant (maintainer-local;
-      gitignored, never committed)
-
-## Notes
-
-- This skill orchestrates a panel **in your own context** -- you are
-  the only agent. You load each persona's `.agent.md` reference file
-  on demand (progressive disclosure), assume that persona's lens to
-  produce its findings, then move to the next persona. Do NOT spawn
-  sub-agents (no `task` tool dispatch) -- the panel is a sequence of
-  reasoning passes inside one agent loop, not a multi-agent fan-out.
-- Persona detail lives in the linked `.agent.md` files. Read each
-  one when you switch to that persona; do not pre-load all of them.
+## Routing matrix (emphasis only - all panelists always run)
+
+These routes describe WHICH specialist's findings the CEO weights more
+heavily for a given PR type. They do NOT change which personas run -
+every mandatory persona runs on every panel invocation, period. Routing
+is a CEO synthesis hint, not a panelist gate.
+
+- **Architecture-heavy PR** -> CEO weights Python Architect findings
+  on abstraction calls; CLI Logging on consistency.
+- **CLI UX PR** -> CEO weights DevX UX on command surface; CLI Logging
+  on output paths; Growth Hacker on first-run conversion.
+- **Security PR** -> CEO biases toward Supply Chain Security on
+  default behavior; DevX UX flags ergonomics regression from any
+  mitigation.
+- **Auth PR** (auth-expert active) -> CEO weights Auth Expert on
+  AuthResolver / token precedence; Supply Chain on token-scoping.
+- **Release / comms PR** -> CEO weights Growth Hacker on hook + story
+  angle; specialists sanity-check technical claims.
+- **Full panel** (default) -> CEO synthesizes equally; calls out any
+  dissent in `dissent_notes`.
 
 ## Execution checklist
 
-When this skill is activated for a PR review, work through these
-steps in order, in a single agent loop. Do not skip ahead and do not
-emit any output before the final step.
-
-1. Read the PR context (title, body, labels, changed files, diff).
-   The orchestrating workflow already fetches this with `gh pr view`
-   / `gh pr diff` -- do not re-fetch.
-2. Resolve the **Auth Expert conditional case** using the rule in
-   "Conditional panelist: Auth Expert" above. Record either an
-   activation decision (and proceed to step 3) or an `Auth Expert
-   inactive reason: <one sentence>` in working notes.
-3. For each mandatory persona (plus `auth-expert` if activated),
-   follow the **Persona pass procedure** below, one persona at a
-   time. Do not try to play multiple personas in a single pass.
-4. Run the **pre-arbitration completeness gate**:
-   - Findings exist in working notes for the 5 mandatory specialists
-     (Python Architect, CLI Logging Expert, DevX UX Expert, Supply
-     Chain Security Expert, OSS Growth Hacker).
-   - Exactly one of `Auth Expert findings` or `Auth Expert inactive
-     reason` exists (neither = incomplete; both = inconsistent
-     routing).
-   - The Python Architect notes contain the OO / class mermaid
-     diagram, the execution-flow mermaid diagram, and the Design
-     patterns subsection declared in
-     `../../agents/python-architect.agent.md`.
-   - No persona section is missing or empty.
-   If any check fails, redo that persona's pass and repeat the gate.
-   Do not proceed to step 5 until the gate passes.
-5. Take the **APM CEO** lens (load
-   `../../agents/apm-ceo.agent.md`) and arbitrate over the collected
-   findings -- still in your own context. CEO arbitration may run
-   only after the completeness gate has passed.
-6. Now (and only now) load `assets/verdict-template.md` and fill it
-   in with the collected findings + arbitration.
-7. Emit the filled template as exactly ONE comment via the workflow's
-   `safe-outputs.add-comment` channel. Never call the GitHub API
-   directly. This is the ONLY output emission for the entire panel
-   run -- no per-persona comments, no progress comments.
-
-### Persona pass procedure
-
-For each persona, run this exact procedure in your own context:
-
-1. Open the persona's `.agent.md` file (linked in the roster) and
-   read its scope, lens, anti-patterns, and required return shape.
-2. From that persona's lens, review the PR title/body/diff against
-   the scope declared in the file.
-3. Write the findings to working notes under
-   `<persona-name>: <findings>` (or, for an inactive Auth Expert,
-   `Auth Expert inactive reason: <one sentence>`).
-4. Drop the persona lens before moving on. Do not emit any comment
-   from inside a persona pass; persona findings stay in working
-   notes until step 7 synthesizes them.
-
-## Output contract
-
-This contract is non-negotiable -- it is the difference between a
-panel review that lands as one cohesive verdict and one that fragments
-into per-persona noise.
-
-- Produce **exactly one** comment per panel run. The
-  `safe-outputs.add-comment.max` cap in the workflow is a fail-soft
-  ceiling (currently 7, one per persona slot); the one-comment
-  discipline lives here.
-- Use `assets/verdict-template.md` as the comment body. Keep its
-  section headings exactly as written. Adapt the body of each section
-  to the PR. Do not invent new top-level sections or drop existing
-  ones.
-- CEO arbitration may run only after the completeness gate passes.
-- Never emit findings as separate comments, intermediate progress
-  comments, or "I will now invoke X" status comments.
-- Load `assets/verdict-template.md` **at synthesis time only** (step
-  6 above) -- not at activation, not while collecting findings.
+Work through these steps in order. Do not skip ahead. Do not emit any
+output to the PR before step 6.
+
+1. **Read PR context** (the orchestrating workflow already fetched it
+   via `gh pr view` / `gh pr diff` in step 1 of `pr-review-panel.md`).
+   Identify changed files for the auth-expert routing decision.
+
+2. **Resolve the auth-expert conditional** using the rule above.
+   Decide: spawn auth-expert active, OR spawn auth-expert with
+   `active: false` + an `inactive_reason`. Either way, auth-expert
+   IS spawned - the schema requires uniform return shape.
+
+3. **Fan out panelist tasks.** Spawn the following tasks in PARALLEL
+   via the `task` tool, one task per persona:
+   - `python-architect`
+   - `cli-logging-expert`
+   - `devx-ux-expert`
+   - `supply-chain-security-expert`
+   - `oss-growth-hacker`
+   - `auth-expert` (always - active per step 2)
+
+   Each task prompt MUST:
+   - Reference its persona file by relative path so the subagent loads
+     its own scope, lens, and anti-patterns.
+   - Include the PR number, title, body, and diff (passed inline).
+   - Cite `assets/panelist-return-schema.json` and require the
+     subagent to emit JSON matching that schema as its FINAL message.
+   - Restate the output contract: NO `gh` write commands, NO posting
+     comments, NO label changes, NO touching PR state. JSON return
+     only.
+
+4. **S4 schema gate.** When each panelist task returns, parse the JSON
+   and validate against `assets/panelist-return-schema.json`. On
+   validation failure:
+   - Re-spawn that ONE panelist with an explicit error message
+     pointing at the violated rule (e.g. "your previous return was
+     missing the `nits` field").
+   - Maximum two re-spawn attempts per panelist. If still malformed,
+     synthesize a placeholder `{persona: "<slug>", active: true,
+     required: [], nits: [], extras: {schema_failure: "<reason>"}}`
+     and surface the failure in the CEO arbitration prompt.
+
+5. **Spawn the CEO synthesizer task.** Pass the full set of validated
+   panelist JSON returns to a `task` invocation that loads
+   `../../agents/apm-ceo.agent.md`. The prompt MUST:
+   - Provide all panelist returns as structured input.
+   - Ask for arbitration prose, dissent resolution, and any growth
+     signal worth amplifying.
+   - Cite `assets/ceo-return-schema.json` and require JSON return.
+   - Restate the contract: CEO does NOT pick verdict, only writes
+     arbitration. NO `gh` write commands.
+
+   Validate the CEO return against `assets/ceo-return-schema.json`.
+   On failure, re-spawn once with the violation cited.
+
+6. **Compute verdict deterministically:**
+
+   ```
+   total_required = sum(len(p["required"]) for p in panelists if p.get("active", True))
+   verdict = "APPROVE" if total_required == 0 else "REJECT"
+   verdict_label = "panel-approved" if verdict == "APPROVE" else "panel-rejected"
+   ```
+
+   The CEO does not influence this calculation. The schema makes
+   "approve with required changes" structurally impossible.
+
+7. **Render the comment.** Load `assets/verdict-template.md`, fill the
+   placeholders from the panelist + CEO JSON, and emit it as exactly
+   ONE comment via `safe-outputs.add-comment`. NEVER call the GitHub
+   API directly. NEVER emit per-persona comments or progress comments.
+
+8. **Apply the verdict label** via `safe-outputs.add-labels`:
+   `[verdict_label]` (one of `panel-approved` / `panel-rejected`).
+
+9. **Remove the trigger label** via `safe-outputs.remove-labels`:
+   `[panel-review]`. This makes the trigger idempotent - re-applying
+   the label will re-run the panel cleanly.
+
+## Output contract (non-negotiable)
+
+- Exactly ONE comment per panel run, top-loaded per
+  `assets/verdict-template.md`. The `safe-outputs.add-comment.max: 2`
+  is a fail-soft ceiling; the discipline lives here.
+- Exactly ONE verdict label per panel run (`panel-approved` XOR
+  `panel-rejected`).
+- Exactly ONE removal of the `panel-review` trigger label.
+- Subagents (panelists + CEO) NEVER write to `safe-outputs`, NEVER
+  call `gh pr comment`, NEVER call `gh pr edit --add-label`. They
+  return JSON. The orchestrator is the sole writer. This discipline
+  is mirrored in each persona's "Output contract when invoked by
+  apm-review-panel" section.
+- Never invent new top-level template sections or drop existing ones.
 
 ## Gotchas
 
-- **Roster invariant.** The frontmatter description, the roster
-  table, the conditional-panelist rule, the verdict template, and the
-  quality gates MUST agree on the persona set. If you change one,
-  change all of them in the same edit. Description, roster, and
-  template are the three places drift most often appears.
+- **Roster invariant.** The frontmatter description, the roster table,
+  the conditional rule, the verdict template, and the JSON schema
+  MUST agree on the persona set. If you change one, change all in the
+  same edit.
 - **False-negative auth gotcha.** Auth regressions can be introduced
-  from non-auth files that change the *inputs* to auth -- for
-  example host classification, dependency parsing, clone URL
-  construction, HTTP authorization headers, or call sites that
-  bypass `AuthResolver`. If a diff changes how a remote host, org,
-  token source, or fallback path is selected and you are not certain
-  it is auth-neutral, activate `auth-expert`.
+  from non-auth files that change the inputs to auth - host
+  classification, dependency parsing, clone URL construction, HTTP
+  authorization headers, or call sites that bypass `AuthResolver`. If
+  a diff changes how a remote host, org, token source, or fallback
+  path is selected and you are not certain it is auth-neutral,
+  activate auth-expert as `active: true`.
 - **Bundle layout on the runner.** When this skill runs inside the
   PR-review agentic workflow, the APM bundle is unpacked under
-  `.github/skills/apm-review-panel/` first, with `.apm/skills/...`
-  as a fallback. The asset path is the same relative to the skill
-  root (`assets/verdict-template.md`) in both layouts -- prefer the
-  `.github/...` path when present.
-- **No multi-persona-in-one-pass.** Each persona has its own
-  `.agent.md` for a reason -- read it when you take that lens, write
-  the findings, then drop the lens before moving on. Trying to be all
-  personas in one reasoning pass is the most common cause of dropped
-  findings and mixed voices.
-- **Single-emission discipline is fragile under interruption.** If
-  you find yourself wanting to "post a quick partial verdict and
-  then update it", don't. Buffer in working notes; emit once.
+  `.github/skills/apm-review-panel/` first, with `.apm/skills/...` as
+  a fallback. The asset paths are the same relative to the skill root
+  in both layouts.
+- **Subagent write enforcement is contract-based, not sandbox-based.**
+  In gh-aw, tool permissions are workflow-scoped, not subagent-scoped,
+  so every spawned task technically inherits the same `gh` toolset.
+  The "subagents must not write" rule is enforced by the prompt
+  contract in each `.agent.md` plus the `safe-outputs.add-comment.max:
+  2` fail-soft. If a subagent ever tries to post a comment, the cap
+  catches it; if it tries to bypass `safe-outputs` and call `gh`
+  directly, that is a contract violation worth surfacing in the next
+  panel review of the persona file itself.
+- **Trigger-label removal is the LAST step.** Doing it earlier risks
+  another labeller racing the panel mid-run. The companion workflow
+  `pr-panel-label-reset.yml` handles verdict-label invalidation on
+  every new push (deterministically, no LLM).
diff --git a/.github/skills/apm-review-panel/assets/ceo-return-schema.json b/.github/skills/apm-review-panel/assets/ceo-return-schema.json
new file mode 100644
index 000000000..466aa01b5
--- /dev/null
+++ b/.github/skills/apm-review-panel/assets/ceo-return-schema.json
@@ -0,0 +1,23 @@
+{
+  "$schema": "http://json-schema.org/draft-07/schema#",
+  "$id": "ceo-return-schema.json",
+  "title": "APM Review Panel - CEO Synthesizer Return Shape",
+  "description": "Shape the apm-ceo persona MUST return when invoked as the panel synthesizer. CEO writes ARBITRATION PROSE only - never picks the verdict. The orchestrator derives verdict deterministically from aggregated panelist required-counts (APPROVE iff sum(required) == 0, REJECT otherwise).",
+  "type": "object",
+  "required": ["arbitration"],
+  "additionalProperties": false,
+  "properties": {
+    "arbitration": {
+      "type": "string",
+      "description": "One-to-three paragraph synthesis. Resolve any disagreement between specialists. Surface strategic implications (positioning, breaking change, naming, scope). If specialists agreed and the change is uncontroversial, say so plainly. ASCII only."
+    },
+    "dissent_notes": {
+      "type": "string",
+      "description": "Optional. When two or more panelists disagreed on whether a finding is REQUIRED vs NIT, name the disagreement and state which side the CEO sided with and why. Empty / omitted when no dissent."
+    },
+    "growth_signal": {
+      "type": "string",
+      "description": "Optional. CEO surface for the side-channel note from oss-growth-hacker (conversion, narrative, breaking-change comms). Empty / omitted when not relevant."
+    }
+  }
+}
diff --git a/.github/skills/apm-review-panel/assets/panelist-return-schema.json b/.github/skills/apm-review-panel/assets/panelist-return-schema.json
new file mode 100644
index 000000000..89a095b6b
--- /dev/null
+++ b/.github/skills/apm-review-panel/assets/panelist-return-schema.json
@@ -0,0 +1,76 @@
+{
+  "$schema": "http://json-schema.org/draft-07/schema#",
+  "$id": "panelist-return-schema.json",
+  "title": "APM Review Panel - Panelist Return Shape",
+  "description": "Shape every panel persona MUST return when invoked by the apm-review-panel skill. Two severity buckets only: required (blocks merge) and nits (skip-if-you-want, single line). NO third bucket. NO disposition - the orchestrator computes verdict deterministically from aggregated required-counts.",
+  "type": "object",
+  "required": ["persona", "required", "nits"],
+  "additionalProperties": false,
+  "properties": {
+    "persona": {
+      "type": "string",
+      "description": "Persona slug. MUST equal the .agent.md filename stem.",
+      "enum": [
+        "python-architect",
+        "cli-logging-expert",
+        "devx-ux-expert",
+        "supply-chain-security-expert",
+        "oss-growth-hacker",
+        "auth-expert"
+      ]
+    },
+    "active": {
+      "type": "boolean",
+      "description": "Set to false ONLY for auth-expert when the conditional rule does not activate it. All other personas MUST set active=true. When false, required and nits MUST be empty arrays and inactive_reason MUST be a one-sentence explanation citing the touched files.",
+      "default": true
+    },
+    "inactive_reason": {
+      "type": "string",
+      "description": "Required when active=false. One sentence citing the touched files."
+    },
+    "required": {
+      "type": "array",
+      "description": "Findings that MUST be addressed before merge. Empty array means this persona has no blocking concerns. Anything in this array causes the orchestrator to emit a REJECT verdict.",
+      "items": { "$ref": "#/definitions/finding" }
+    },
+    "nits": {
+      "type": "array",
+      "description": "One-line suggestions the author can skip. NEVER block merge. No cap - trust the panelist's judgement.",
+      "items": { "$ref": "#/definitions/finding" }
+    },
+    "extras": {
+      "type": "object",
+      "description": "Persona-specific structured payload. Reserved for python-architect mermaid diagrams and oss-growth-hacker side-channel notes. Never blocks; never affects verdict.",
+      "additionalProperties": true
+    }
+  },
+  "definitions": {
+    "finding": {
+      "type": "object",
+      "required": ["summary", "rationale"],
+      "additionalProperties": false,
+      "properties": {
+        "summary": {
+          "type": "string",
+          "description": "One-line description of the finding. ASCII only. No emojis."
+        },
+        "rationale": {
+          "type": "string",
+          "description": "WHY this matters. Cite the rule or pattern violated. ASCII only."
+        },
+        "file": {
+          "type": "string",
+          "description": "Optional repo-relative path to the file."
+        },
+        "line": {
+          "type": "integer",
+          "description": "Optional line number."
+        },
+        "suggestion": {
+          "type": "string",
+          "description": "Optional concrete fix (diff hint, replacement code, command to run)."
+        }
+      }
+    }
+  }
+}
diff --git a/.github/skills/apm-review-panel/assets/verdict-template.md b/.github/skills/apm-review-panel/assets/verdict-template.md
index ce032bdde..49b18d663 100644
--- a/.github/skills/apm-review-panel/assets/verdict-template.md
+++ b/.github/skills/apm-review-panel/assets/verdict-template.md
@@ -1,62 +1,110 @@
 <!--
 Canonical single-comment template for the APM Review Panel skill.
 
-Load this file ONLY at synthesis time, after every panelist has produced
-its findings. The orchestrator copies this skeleton verbatim, fills the
-placeholders, and emits the result as exactly ONE comment via the
-workflow's `safe-outputs.add-comment` channel.
+Loaded ONLY at synthesis time, AFTER:
+- every panelist task has returned JSON matching panelist-return-schema.json
+- the apm-ceo synthesizer task has returned JSON matching ceo-return-schema.json
+- the orchestrator has computed the binary verdict deterministically:
+    APPROVE iff sum(len(p.required) for p in panelists if p.active) == 0
+    REJECT  otherwise
+
+The orchestrator copies this skeleton verbatim, fills the placeholders
+from the collected JSON, and emits the result as exactly ONE comment via
+the workflow's `safe-outputs.add-comment` channel.
 
 Rules when filling the template:
 - ASCII only. No emojis, no Unicode dashes, no box-drawing characters.
-- Keep total length under ~600 lines.
+- Top-loaded order is non-negotiable: verdict, required, nits, CEO
+  arbitration FIRST. Per-persona detail goes in the collapsed details
+  block at the BOTTOM. Do not flip the order; the user-research is that
+  PR authors must see the decision and required actions in one screen.
 - Do NOT add or remove top-level sections. Adapt their bodies to the PR.
 - Do NOT split this output across multiple comments under any condition.
-- Routing changes which personas run, not which persona headings appear.
-- Only Auth Expert is conditional. If it was not activated for the PR,
-  write "Not activated -- <reason>" as that persona's body. Do not omit
-  the persona heading. All other persona headings always have findings.
-- The Python Architect block MUST contain the two mermaid diagrams and
-  the Design patterns subsection from the python-architect persona's
-  PR review output contract. If those are missing, the synthesis is
-  incomplete -- re-invoke the Python Architect before emitting.
+- The Required and Nits sections are AGGREGATES across all active
+  panelists. Render `[<persona>]` prefix on each item so authors know
+  who raised it.
+- The Per-persona detail block is the FULL JSON-derived findings list
+  for each panelist - even those with required==[] and nits==[]
+  (render those as "No findings.").
+- Auth Expert is the only persona that can render as "Inactive --
+  <reason>" in the per-persona block. Never omit its heading.
+- The python-architect persona MUST contribute the two mermaid diagrams
+  and the Design patterns subsection in its `extras.diagrams` payload;
+  surface them inside its per-persona block.
 -->
 
-## APM Review Panel Verdict
+## APM Review Panel Verdict: <APPROVE|REJECT>
 
-**Disposition**: <APPROVE | REQUEST_CHANGES | NEEDS_DISCUSSION> <optional one-line qualifier, e.g. "(with two minor pre-merge fixes)">
+> <one-line CEO arbitration headline; e.g. "All five specialists agreed; no required changes." or "Architecture is sound but supply-chain finding blocks merge.">
 
----
+### Required before merge (<N> items)
 
-### Per-persona findings
+<If N == 0:>
+None.
+<else, render aggregated required[] across all panelists:>
+- [<persona-slug>] <finding.summary> <if file/line: at `<file>:<line>`>
+  - Why: <finding.rationale>
+  - <if suggestion: Suggested fix: <finding.suggestion>>
+- ...
 
-**Python Architect**: <findings; MUST include the OO/class mermaid diagram, the execution-flow mermaid diagram, and the Design patterns subsection per the python-architect persona's PR review output contract>
+### Nits (<M> items, skip if you want)
 
-**CLI Logging Expert**: <findings>
+<If M == 0:>
+None.
+<else, render aggregated nits[]:>
+- [<persona-slug>] <finding.summary> <if file/line: at `<file>:<line>`>
+- ...
 
-**DevX UX Expert**: <findings>
+### CEO arbitration
 
-**Supply Chain Security Expert**: <findings>
+<ceo.arbitration prose, one to three paragraphs.>
 
-**Auth Expert**: <findings, or "Not activated -- <reason citing the touched files>">
+<if ceo.dissent_notes is non-empty:>
+**Dissent resolved:** <ceo.dissent_notes>
 
-**OSS Growth Hacker**: <findings; if relevant, include side-channel note to CEO about conversion / growth-strategy implications>
+<if ceo.growth_signal is non-empty:>
+**Growth/positioning note:** <ceo.growth_signal>
 
 ---
 
-### CEO arbitration
+<details>
+<summary>Per-persona findings (full)</summary>
 
-<one-paragraph synthesis from apm-ceo: resolve any disagreements between specialists, ratify the disposition, and state the strategic call. If specialists agreed and the change is uncontroversial, say so plainly in one or two sentences.>
+#### Python Architect
 
----
+<if active:>
+<Include the OO/class mermaid diagram and execution-flow mermaid
+diagram from extras.diagrams. Then render required[] + nits[] in full;
+if both empty, write "No findings.">
+<else: should never be inactive>
 
-### Required actions before merge
+#### CLI Logging Expert
 
-1. <required action with concrete pointer (file path, line, diff suggestion). If Disposition is APPROVE with no required actions, write "None." here -- do not omit the section.>
-2. <...>
+<Render required[] + nits[] in full; "No findings." if both empty.>
 
----
+#### DevX UX Expert
+
+<Render required[] + nits[] in full; "No findings." if both empty.>
+
+#### Supply Chain Security Expert
+
+<Render required[] + nits[] in full; "No findings." if both empty.>
+
+#### Auth Expert
+
+<if active:>
+<Render required[] + nits[] in full; "No findings." if both empty.>
+<else:>
+Inactive -- <inactive_reason>
+
+#### OSS Growth Hacker
+
+<Render required[] + nits[] in full; "No findings." if both empty.
+Surface any side-channel growth/positioning note inline; CEO already
+echoed it in the headline section if relevant.>
 
-### Optional follow-ups
+</details>
 
-- <follow-up suggestion that is out of scope for this PR but worth tracking>
-- <...>
+<sub>Verdict computed deterministically: <N> required findings across
+<K> active panelists. APPROVE iff N == 0. Push a new commit to clear
+this verdict label automatically.</sub>
diff --git a/.github/skills/apm-review-panel/evals/README.md b/.github/skills/apm-review-panel/evals/README.md
new file mode 100644
index 000000000..04ae1503b
--- /dev/null
+++ b/.github/skills/apm-review-panel/evals/README.md
@@ -0,0 +1,67 @@
+# Evals: apm-review-panel
+
+Per genesis Step 8 evals gate. Two categories:
+
+1. **TRIGGER EVALS** validate the dispatch description correctly
+   discriminates should-trigger queries from near-miss should-NOT
+   queries. Validation split is the ship gate (>= 0.5 on positives,
+   < 0.5 on negatives).
+
+2. **CONTENT EVALS** validate that the skill, when activated, produces
+   the JSON-derived top-loaded verdict comment in the shape declared
+   by `assets/verdict-template.md`. Run with-skill vs without-skill;
+   if the deltas are not visible, the skill is not adding value.
+
+## Files
+
+- `trigger-evals.json` -- 16 queries (8 should-trigger + 8 should-NOT),
+  60/40 train/val split.
+- `content-eval-clean-pr.md` -- synthetic clean PR scenario; expected
+  verdict = APPROVE, all panelists return `required: []`.
+- `content-eval-rejected-pr.md` -- synthetic PR with one architectural
+  smell + one nit; expected verdict = REJECT, python-architect returns
+  one `required` finding.
+- `run-verdict-harness.py` -- DETERMINISTIC harness covering the
+  parts of the panel that do NOT require an LLM: JSON schema
+  validation (S4 gate) and verdict computation. Five cases: two
+  positive (clean-pr APPROVE, rejected-pr REJECT) and three negative
+  (missing-nits, unknown-persona, disposition-leak). All five MUST
+  pass before merging any change to schemas, the SKILL.md execution
+  checklist verdict rule, or the persona output contracts.
+
+## How to run
+
+### Deterministic harness (free, runs anywhere)
+
+```
+uv run --with jsonschema python3 \
+    .apm/skills/apm-review-panel/evals/run-verdict-harness.py
+```
+
+Expected: `RESULT: ALL PASS`. Proves schemas reject malformed shapes
+and verdict math is correct. Does NOT prove the LLM panelists will
+return well-formed JSON in practice -- that requires a real run.
+
+### Full end-to-end (option B, branch-pin test)
+
+The gh-aw workflow imports the panel skill from `microsoft/apm#main`
+for security (anti-self-approval). Pre-merge validation requires a
+temporary branch pin:
+
+1. On the feature branch, change `imports.packages` in
+   `.github/workflows/pr-review-panel.md` from `microsoft/apm#main`
+   to `microsoft/apm#<feature-branch>`.
+2. `gh aw compile pr-review-panel`.
+3. Commit as `chore: TEMP pin to branch for end-to-end test`.
+4. Open a tiny throwaway PR; label it `panel-review`.
+5. Observe: 6 task threads spawn, JSON returns, verdict label
+   applied, `panel-review` removed, `panel-approved` (or
+   `panel-rejected`) set. Push a commit and watch the
+   deterministic `pr-panel-label-reset.yml` strip the verdict
+   label.
+6. Revert the temp-pin commit before merge.
+
+### Trigger evals
+
+Trigger evals can be run via the genesis evals harness or any
+dispatcher that loads the skill description and queries it.
diff --git a/.github/skills/apm-review-panel/evals/content-eval-clean-pr.md b/.github/skills/apm-review-panel/evals/content-eval-clean-pr.md
new file mode 100644
index 000000000..4a3651dee
--- /dev/null
+++ b/.github/skills/apm-review-panel/evals/content-eval-clean-pr.md
@@ -0,0 +1,49 @@
+# Content eval: clean PR (expected APPROVE)
+
+## Synthetic PR input
+
+- Title: `docs(readme): clarify install command for fish shell users`
+- Body: Adds a one-paragraph note in README.md under the install
+  section pointing fish users to the existing `eval (apm shellenv)`
+  recipe. No code changes.
+- Diff: 4 added lines, 0 removed, 1 file (`README.md`).
+- Files: `README.md`
+
+## Expected orchestrator behavior
+
+- Spawn 6 panelist tasks in parallel (5 mandatory + auth-expert with
+  `active: false` because no auth files touched).
+- Each panelist returns `{persona: "<slug>", required: [], nits: [...]}`
+  where `nits` may contain at most a one-line wording suggestion.
+- auth-expert returns `{persona: "auth-expert", active: false,
+  inactive_reason: "...README.md has no auth surface...", required: [],
+  nits: []}`.
+- CEO synthesizer returns `{arbitration: "Documentation-only change
+  ... all specialists agree no required actions ...", dissent_notes
+  not present}`.
+- Orchestrator computes `total_required = 0` -> verdict = APPROVE.
+- Orchestrator emits ONE comment per `assets/verdict-template.md`:
+  - Header: `## APM Review Panel Verdict: APPROVE`
+  - Required section: "None."
+  - Nits section: rendered (or "None.")
+  - CEO arbitration paragraph
+  - Per-persona detail in collapsed `<details>` block
+- Orchestrator applies `panel-approved` label.
+- Orchestrator removes `panel-review` label.
+
+## Pass criteria
+
+- Comment header literally reads `APM Review Panel Verdict: APPROVE`.
+- Required section reads "None.".
+- Verdict label `panel-approved` is on the PR.
+- `panel-review` label is no longer on the PR.
+- Comment count from this run is exactly 1.
+
+## With-skill vs without-skill delta
+
+- With skill: structured top-loaded comment, deterministic verdict
+  label, fan-out across 6 distinct lenses.
+- Without skill: free-form single-voice review with no label
+  automation, no schema, no guarantee on which lenses got applied.
+
+The structure IS the value.
diff --git a/.github/skills/apm-review-panel/evals/content-eval-rejected-pr.md b/.github/skills/apm-review-panel/evals/content-eval-rejected-pr.md
new file mode 100644
index 000000000..83a2f498b
--- /dev/null
+++ b/.github/skills/apm-review-panel/evals/content-eval-rejected-pr.md
@@ -0,0 +1,62 @@
+# Content eval: PR with architectural smell + nit (expected REJECT)
+
+## Synthetic PR input
+
+- Title: `feat(install): add retry to package downloader`
+- Body: Wraps `_download_package` in a 3-attempt retry loop with
+  fixed 2-second backoff. Adds a `retries` parameter to
+  `InstallPipeline.__init__` defaulting to 3.
+- Diff: 35 added lines, 2 removed, 2 files.
+- Files:
+  - `src/apm_cli/install/pipeline.py` (retry param wiring)
+  - `src/apm_cli/deps/github_downloader.py` (retry loop body inline
+    in `_download_package`)
+
+## Expected orchestrator behavior
+
+- Spawn 6 panelist tasks in parallel.
+- Auth Expert activates (`github_downloader.py` is a fast-path file)
+  and returns `active: true` with possibly one finding about token
+  refresh on 401.
+- Python Architect returns at least one `required` finding:
+  `{summary: "Inline retry loop violates Strategy pattern used
+  elsewhere in deps/", file: "src/apm_cli/deps/github_downloader.py",
+  rationale: "Codebase already has the chain-of-responsibility
+  retry pattern in AuthResolver; this should reuse it or extract a
+  shared retry helper rather than inlining a fixed loop"}`.
+- CLI Logging Expert returns one `nit`: missing `[!]` warning when
+  retry kicks in.
+- DevX UX Expert returns one `nit`: `retries=3` is undocumented in
+  CLI help.
+- Supply Chain Security Expert returns 0 findings (no integrity
+  surface affected).
+- OSS Growth Hacker returns 0 findings.
+- CEO synthesizer returns `{arbitration: "Retry behavior is sound
+  but the inline loop misses the codebase's existing retry
+  abstraction. Python Architect's required action holds.",
+  dissent_notes not present}`.
+- Orchestrator computes `total_required = 1` -> verdict = REJECT.
+- Orchestrator emits ONE comment:
+  - Header: `## APM Review Panel Verdict: REJECT`
+  - Required section: 1 item, prefixed `[python-architect]`.
+  - Nits section: 2 items, prefixed `[cli-logging-expert]` and
+    `[devx-ux-expert]`.
+  - CEO arbitration paragraph.
+  - Per-persona detail in `<details>`.
+- Orchestrator applies `panel-rejected` label.
+- Orchestrator removes `panel-review` label.
+
+## Pass criteria
+
+- Comment header literally reads `APM Review Panel Verdict: REJECT`.
+- Required section has exactly 1 item with `[python-architect]` prefix.
+- Verdict label `panel-rejected` is on the PR.
+- `panel-review` label is no longer on the PR.
+- The auth-expert per-persona block renders (active = true).
+
+## With-skill vs without-skill delta
+
+Same as the clean-PR eval: with skill, the verdict is structured and
+machine-actionable (label + binary outcome). Without skill, there is
+no schema, no required-vs-nit discipline, and the developer cannot
+tell at a glance whether the comment blocks merge or not.
diff --git a/.github/skills/apm-review-panel/evals/results-e2e-pr931-2026-04-28.md b/.github/skills/apm-review-panel/evals/results-e2e-pr931-2026-04-28.md
new file mode 100644
index 000000000..e0fdec31a
--- /dev/null
+++ b/.github/skills/apm-review-panel/evals/results-e2e-pr931-2026-04-28.md
@@ -0,0 +1,71 @@
+# End-to-end eval result: PR #931 (2026-04-28)
+
+Real gh-aw run of the refactored panel against an unrelated open PR
+in microsoft/apm. Branch-pin method (option B) -- workflow YAML
+imported the panel skill from `microsoft/apm#refactor/review-panel-fanout`
+instead of `#main`.
+
+## Setup
+
+- Workflow run: https://github.com/microsoft/apm/actions/runs/25069734881
+- Target PR: #931 (`docs(compile): clarify target routing docstring`,
+  +7/-3, no labels, external contributor)
+- Trigger: `workflow_dispatch` from `refactor/review-panel-fanout`
+  with `pr_number=931`
+- Skill source: `.apm/skills/apm-review-panel/SKILL.md` from this branch
+- gh-aw version: v0.68.3
+
+## Job graph (all SUCCESS)
+
+```
+pre_activation -> activation -> apm -> agent -> detection -> safe_outputs -> conclusion
+```
+
+All 7 jobs completed successfully on first run.
+
+## Acceptance criteria
+
+| # | Criterion | Result |
+|---|---|---|
+| 1 | Verdict header literally reads `## APM Review Panel Verdict: <REJECT|APPROVE>` | PASS -- `## APM Review Panel Verdict: REJECT` |
+| 2 | Top-loaded order: verdict -> required -> nits -> CEO -> per-persona | PASS -- exact order observed |
+| 3 | Required findings rendered as `[persona] summary` + Why + Suggested fix | PASS -- 3 required items, all 3 panelists, all with file/line citations |
+| 4 | Per-persona detail in collapsed `<details>` block at bottom | PASS -- `<details><summary>Per-persona findings (full)</summary>...` |
+| 5 | All 6 personas reported back (5 mandatory + auth-expert) | PASS -- python-architect, cli-logging-expert, devx-ux-expert, supply-chain-security-expert, auth-expert, oss-growth-hacker |
+| 6 | Auth-expert correctly inactive when no auth surface touched | PASS -- "Inactive -- PR #931 modifies only a docstring ... no auth, token, credential, host-classification, or AuthResolver-related files are touched." |
+| 7 | Verdict deterministic from `required[]` count (3 > 0 -> REJECT) | PASS -- correct mapping observed |
+| 8 | Verdict label `panel-approved` XOR `panel-rejected` applied | PASS -- `panel-rejected` added |
+| 9 | Comment count from this run = 1 (max:2 ceiling honored) | PASS -- 1 comment |
+| 10 | Python Architect class diagram in extras passed through to per-persona block | PASS -- mermaid `classDiagram` rendered in details |
+
+## Surprise findings (positive)
+
+- The panel surfaced a REAL regression in PR #931 that prior reviewers
+  had not flagged: the proposed docstring silently drops `gemini`
+  routing that exists in the implementation. Three independent
+  panelists converged on the same root cause with zero dissent. The
+  CEO arbitration explicitly noted this convergence and recommended a
+  rebase-and-revise rather than a close.
+- Growth Hacker correctly identified the contributor as external
+  (WilliamK112) and recommended a "warm rejection that models a
+  low-friction contribution loop" -- the fan-out captured a strategic
+  signal that a single-loop reviewer would have missed.
+
+## Failure modes NOT observed (negative confirmation)
+
+- No "approve with reservations" wording -- verdict is binary as
+  designed.
+- No panelist comment posted to the PR -- only the orchestrator's
+  single synthesized comment. The single-writer interlock held.
+- No `panel-review` label cleanup needed (the run was triggered via
+  workflow_dispatch, not via label, so there was no `panel-review`
+  label to remove). The label-removal path will be exercised when the
+  first real `panel-review`-labelled PR runs after merge.
+
+## Total cost
+
+One gh-aw workflow run, ~12 minutes wall-clock. No retries needed.
+
+## Artifact
+
+Full panel comment: https://github.com/microsoft/apm/pull/931#issuecomment-<see-PR>
diff --git a/.github/skills/apm-review-panel/evals/results-label-reset-2026-04-28.md b/.github/skills/apm-review-panel/evals/results-label-reset-2026-04-28.md
new file mode 100644
index 000000000..022548659
--- /dev/null
+++ b/.github/skills/apm-review-panel/evals/results-label-reset-2026-04-28.md
@@ -0,0 +1,63 @@
+# Label-reset workflow validation (2026-04-28)
+
+End-to-end test of `.github/workflows/pr-panel-label-reset.yml` —
+the deterministic plain-Actions companion that strips verdict labels
+when a PR receives a new commit (S7 DETERMINISTIC TOOL BRIDGE).
+
+## Setup
+
+Probe PR: microsoft/apm#1025 (closed)
+
+- Base: `refactor/review-panel-fanout` (so the workflow YAML is
+  loaded from the feature branch — required because
+  `pull_request: synchronize` reads the BASE-branch workflow file).
+- Head: `test/label-reset-probe` (deleted after test).
+- Initial state: one commit ahead of base, no labels.
+
+## Procedure
+
+1. Applied `panel-rejected` label manually (simulating a panel verdict).
+2. Pushed a second commit to `test/label-reset-probe` head ref.
+3. The push fires `pull_request: synchronize` on PR #1025.
+4. Reset workflow runs from the base-branch YAML and iterates over
+   `panel-approved` / `panel-rejected`, calling `gh pr edit
+   --remove-label` for each.
+
+## Result: PASS
+
+| Check | Expected | Actual |
+|-------|----------|--------|
+| Workflow ran on synchronize | yes | yes — run 25076314935 |
+| Run status | success | success (13s) |
+| `panel-rejected` removed | yes | yes (labels: []) |
+| Missing `panel-approved` handled gracefully | no error | yes (if/else swallows) |
+
+`panel-approved` path uses identical code in the same loop; testing
+one path proves the logic.
+
+## Side finding
+
+The `panel-approved` repo label did NOT exist at test time. The
+refactor PR introduces the workflow and the contract but did not
+create the label itself. Created post-test:
+
+- `panel-approved` (#0E8A16, green) — Apm-review-panel verdict: APPROVE. Removed automatically on next push.
+- `panel-rejected` (#B60205, red) — description and color updated for clarity.
+
+## Run log excerpt
+
+```
+Removed label 'panel-approved' (or it was not present).
+Removed label 'panel-rejected' (or it was not present).
+```
+
+(Both branches of the conditional behave the same; the message text
+is intentionally vague because gh CLI exit codes do not distinguish
+"removed" from "was already absent".)
+
+## Cross-references
+
+- Workflow file: `.github/workflows/pr-panel-label-reset.yml`
+- Probe PR: https://github.com/microsoft/apm/pull/1025
+- Run: https://github.com/microsoft/apm/actions/runs/25076314935
+- Companion e2e for the panel itself: `results-e2e-pr931-2026-04-28.md`
diff --git a/.github/skills/apm-review-panel/evals/results-trigger-2026-04-28.md b/.github/skills/apm-review-panel/evals/results-trigger-2026-04-28.md
new file mode 100644
index 000000000..42804e9e8
--- /dev/null
+++ b/.github/skills/apm-review-panel/evals/results-trigger-2026-04-28.md
@@ -0,0 +1,74 @@
+# Trigger Eval Self-Run (LLM dispatcher = me)
+
+Skill description (verbatim from `.apm/skills/apm-review-panel/SKILL.md`):
+> Use this skill to run a multi-persona expert panel review on a labelled
+> pull request in microsoft/apm. The panel fans out to five mandatory
+> specialists plus one conditional auth specialist, all running in their
+> own agent threads, and a CEO synthesizer. The orchestrator is the sole
+> writer to the PR: one comment plus exactly one verdict label
+> (panel-approved or panel-rejected, derived deterministically from the
+> aggregated findings). Activate when a non-trivial PR needs cross-cutting
+> review (architecture, CLI logging, DevX UX, supply-chain security,
+> growth/positioning, optionally auth, with CEO arbitration).
+
+For each query I record:
+- `expected` = label from trigger-evals.json (TRIGGER / NOT)
+- `judged`   = my dispatch decision as an LLM reading the description above
+- `match`    = expected == judged
+
+## Should-trigger TRAIN (5)
+
+| # | Query | Expected | Judged | Match | Reasoning |
+|---|---|---|---|---|---|
+| 1 | "review this PR with the expert panel" | TRIGGER | TRIGGER | YES | "expert panel" + "PR" maps cleanly to "multi-persona expert panel review on a labelled pull request" |
+| 2 | "run the apm-review-panel against this branch" | TRIGGER | TRIGGER | YES | exact name match |
+| 3 | "do a multi-persona review of PR #1234" | TRIGGER | TRIGGER | YES | "multi-persona" is verbatim in description |
+| 4 | "panel-review this PR" | TRIGGER | TRIGGER | YES | "panel" + "PR" + uses the trigger-label term as verb |
+| 5 | "get the architecture + security + UX review on this PR" | TRIGGER | TRIGGER | YES | enumerates 3 of the 6 listed lenses + PR scope |
+
+Train pass rate: 5/5 = 1.00
+
+## Should-trigger VAL (3)
+
+| # | Query | Expected | Judged | Match | Reasoning |
+|---|---|---|---|---|---|
+| 6 | "ask the python architect, cli logging expert, and security expert to weigh in on this PR" | TRIGGER | TRIGGER | YES | enumerates 3 of the 5 mandatory specialists by name + PR scope |
+| 7 | "have the apm panel verdict this PR" | TRIGGER | TRIGGER | YES | "apm panel" + "verdict" matches "verdict label (panel-approved or panel-rejected)" |
+| 8 | "i need a cross-cutting expert review of this change before merge" | TRIGGER | TRIGGER | YES | "cross-cutting" is verbatim in description; "before merge" implies PR scope |
+
+Val pass rate: 3/3 = 1.00 (gate: >= 0.5) -- PASS
+
+## Should-NOT-trigger TRAIN (5)
+
+| # | Query | Expected | Judged | Match | Reasoning |
+|---|---|---|---|---|---|
+| 9  | "give me a code review of this file" | NOT | NOT | YES | single file, no PR framing, no panel language; would route to a generic code-review skill |
+| 10 | "what does the python architect think of this class hierarchy" | NOT | NOT | YES | single persona, not panel; description requires "multi-persona" |
+| 11 | "review this docstring" | NOT | NOT | YES | scope is one docstring, not a PR; no panel framing |
+| 12 | "fix the lint errors in src/" | NOT | NOT | YES | mechanical task, not review; description is for review only |
+| 13 | "what is the apm-review-panel skill" | NOT | NOT | YES | meta-question about the skill, not a request to run it; an info skill should answer this |
+
+Train pass rate (negative): 5/5 = 1.00 (correct rejection)
+
+## Should-NOT-trigger VAL (3)
+
+| # | Query | Expected | Judged | Match | Reasoning |
+|---|---|---|---|---|---|
+| 14 | "explain how the auth resolver works" | NOT | NOT | YES | explanation request, not review; auth-expert mention is incidental |
+| 15 | "summarize the diff of PR #1234" | NOT | NOT | YES | summarization, not multi-persona review with verdict |
+| 16 | "draft a PR description for me" | NOT | NOT | YES | authoring, not reviewing; pr-description-skill territory |
+
+Val pass rate (negative): 3/3 = 1.00 (gate: < 0.5 false-positive rate, i.e. >= 0.5 correct rejection) -- PASS
+
+## Summary
+
+| Split | Should-trigger correct | Should-NOT correct | Combined |
+|---|---|---|---|
+| Train (n=10) | 5/5 = 1.00 | 5/5 = 1.00 | 10/10 = 1.00 |
+| Val   (n=6)  | 3/3 = 1.00 | 3/3 = 1.00 | 6/6  = 1.00 |
+
+**SHIP GATE (validation split):**
+- Should-trigger pass rate: 1.00 >= 0.5 -- PASS
+- Should-NOT pass rate:    1.00 >= 0.5 -- PASS
+
+**RESULT: PASS.** Dispatch description discriminates cleanly with no false positives or false negatives in this run. Caveat: I am a single LLM judging my own routing; the canonical eval would run a different model as dispatcher and average over multiple samples. But this is real LLM judgment, not hand-waving.
diff --git a/.github/skills/apm-review-panel/evals/run-verdict-harness.py b/.github/skills/apm-review-panel/evals/run-verdict-harness.py
new file mode 100644
index 000000000..b504f0ea3
--- /dev/null
+++ b/.github/skills/apm-review-panel/evals/run-verdict-harness.py
@@ -0,0 +1,277 @@
+#!/usr/bin/env python3
+"""
+Deterministic verdict harness for the apm-review-panel skill.
+
+Validates two things end-to-end WITHOUT spinning up an LLM:
+1. Synthetic panelist JSON returns conform to panelist-return-schema.json.
+2. The orchestrator's deterministic verdict rule produces the expected
+   verdict + label mapping per the eval scenarios.
+
+This is genesis Step 8 evals gate, deterministic slice. It does NOT
+prove that a real LLM panelist will return well-formed JSON (that
+requires option B, the branch-pin end-to-end test). It DOES prove that
+when a panelist returns well-formed JSON, the orchestrator math is
+correct and the schema rejects malformed shapes.
+
+Usage:
+    uv run --with jsonschema python3 \
+        .apm/skills/apm-review-panel/evals/run-verdict-harness.py
+"""
+
+from __future__ import annotations
+
+import json
+import sys
+from pathlib import Path
+
+import jsonschema
+
+ROOT = Path(__file__).resolve().parents[1]
+SCHEMA_PANELIST = json.loads(
+    (ROOT / "assets" / "panelist-return-schema.json").read_text()
+)
+SCHEMA_CEO = json.loads((ROOT / "assets" / "ceo-return-schema.json").read_text())
+
+
+def derive_verdict(panelists: list[dict]) -> tuple[str, str]:
+    """
+    Pure verdict derivation per SKILL.md execution checklist step 6:
+        total_required = sum(len(p["required"]) for p in panelists if p.get("active", True))
+        verdict = "APPROVE" if total_required == 0 else "REJECT"
+    """
+    total_required = sum(
+        len(p["required"]) for p in panelists if p.get("active", True)
+    )
+    verdict = "APPROVE" if total_required == 0 else "REJECT"
+    label = "panel-approved" if verdict == "APPROVE" else "panel-rejected"
+    return verdict, label
+
+
+def case_clean_pr() -> dict:
+    """Synthetic returns for the docs-only PR scenario."""
+    panelists = [
+        {"persona": "python-architect", "required": [], "nits": []},
+        {
+            "persona": "cli-logging-expert",
+            "required": [],
+            "nits": [
+                {
+                    "summary": "Wording: 'aliasing' could be 'aliases'",
+                    "rationale": "Reads more naturally to npm/pip users",
+                    "file": "README.md",
+                    "line": 142,
+                }
+            ],
+        },
+        {"persona": "devx-ux-expert", "required": [], "nits": []},
+        {"persona": "supply-chain-security-expert", "required": [], "nits": []},
+        {"persona": "oss-growth-hacker", "required": [], "nits": []},
+        {
+            "persona": "auth-expert",
+            "active": False,
+            "inactive_reason": "README.md has no auth surface; no fast-path file matched.",
+            "required": [],
+            "nits": [],
+        },
+    ]
+    ceo = {
+        "arbitration": (
+            "Documentation-only change pointing fish users to an existing "
+            "recipe. All five mandatory specialists agree no required actions; "
+            "auth-expert correctly inactive."
+        )
+    }
+    return {
+        "name": "clean-pr",
+        "panelists": panelists,
+        "ceo": ceo,
+        "expected_verdict": "APPROVE",
+        "expected_label": "panel-approved",
+    }
+
+
+def case_rejected_pr() -> dict:
+    """Synthetic returns for the inline-retry-loop PR scenario."""
+    panelists = [
+        {
+            "persona": "python-architect",
+            "required": [
+                {
+                    "summary": "Inline retry loop violates Strategy pattern in deps/",
+                    "rationale": (
+                        "Codebase already has chain-of-responsibility retry "
+                        "via AuthResolver; reuse or extract a shared helper "
+                        "instead of inlining a fixed loop."
+                    ),
+                    "file": "src/apm_cli/deps/github_downloader.py",
+                    "line": 218,
+                    "suggestion": "Extract retry into deps/_retry.py with same shape as AuthResolver chain.",
+                }
+            ],
+            "nits": [],
+        },
+        {
+            "persona": "cli-logging-expert",
+            "required": [],
+            "nits": [
+                {
+                    "summary": "Missing [!] warning when retry kicks in",
+                    "rationale": "User-visible network blip; STATUS_SYMBOLS guidance applies.",
+                    "file": "src/apm_cli/deps/github_downloader.py",
+                }
+            ],
+        },
+        {
+            "persona": "devx-ux-expert",
+            "required": [],
+            "nits": [
+                {
+                    "summary": "retries=3 undocumented in CLI help",
+                    "rationale": "Mental model gap; npm/pip users expect to see configurable retries surfaced.",
+                }
+            ],
+        },
+        {"persona": "supply-chain-security-expert", "required": [], "nits": []},
+        {"persona": "oss-growth-hacker", "required": [], "nits": []},
+        {
+            "persona": "auth-expert",
+            "active": True,
+            "required": [],
+            "nits": [
+                {
+                    "summary": "Consider 401-aware retry",
+                    "rationale": "Stale-PAT handling already exists in AuthResolver; align.",
+                }
+            ],
+        },
+    ]
+    ceo = {
+        "arbitration": (
+            "Retry behavior is sound but the inline loop misses the "
+            "codebase's existing retry abstraction. Python Architect's "
+            "required action holds; nits across logging/devx/auth align."
+        ),
+        "dissent_notes": "",
+    }
+    return {
+        "name": "rejected-pr",
+        "panelists": panelists,
+        "ceo": ceo,
+        "expected_verdict": "REJECT",
+        "expected_label": "panel-rejected",
+    }
+
+
+def case_malformed_panelist() -> dict:
+    """Negative case: panelist forgot the `nits` field. S4 gate must fail."""
+    return {
+        "name": "malformed-missing-nits",
+        "panelist": {"persona": "python-architect", "required": []},
+        "expect_schema_error": True,
+    }
+
+
+def case_unknown_persona() -> dict:
+    """Negative case: persona enum violation."""
+    return {
+        "name": "malformed-unknown-persona",
+        "panelist": {
+            "persona": "unknown-persona",
+            "required": [],
+            "nits": [],
+        },
+        "expect_schema_error": True,
+    }
+
+
+def case_disposition_leak() -> dict:
+    """Negative case: panelist tried to add a verdict field. additionalProperties:false catches."""
+    return {
+        "name": "malformed-disposition-leak",
+        "panelist": {
+            "persona": "python-architect",
+            "required": [],
+            "nits": [],
+            "disposition": "APPROVE",
+        },
+        "expect_schema_error": True,
+    }
+
+
+def run_positive(case: dict) -> tuple[bool, str]:
+    """Validate every panelist + CEO return, then check verdict."""
+    notes = []
+    for p in case["panelists"]:
+        try:
+            jsonschema.validate(p, SCHEMA_PANELIST)
+        except jsonschema.ValidationError as e:
+            return False, f"panelist {p.get('persona')} failed schema: {e.message}"
+        notes.append(
+            f"  - {p['persona']}: active={p.get('active', True)}, "
+            f"required={len(p['required'])}, nits={len(p['nits'])}"
+        )
+
+    try:
+        jsonschema.validate(case["ceo"], SCHEMA_CEO)
+    except jsonschema.ValidationError as e:
+        return False, f"CEO failed schema: {e.message}"
+
+    verdict, label = derive_verdict(case["panelists"])
+    if verdict != case["expected_verdict"]:
+        return (
+            False,
+            f"expected verdict={case['expected_verdict']}, got {verdict}",
+        )
+    if label != case["expected_label"]:
+        return False, f"expected label={case['expected_label']}, got {label}"
+
+    return True, "\n".join(
+        notes
+        + [
+            f"  -> verdict={verdict}, label={label} (expected matched)",
+        ]
+    )
+
+
+def run_negative(case: dict) -> tuple[bool, str]:
+    """Validate that schema rejects the malformed shape."""
+    try:
+        jsonschema.validate(case["panelist"], SCHEMA_PANELIST)
+    except jsonschema.ValidationError as e:
+        return True, f"correctly rejected: {e.message.splitlines()[0]}"
+    return False, "schema accepted a malformed return; S4 gate would have let it through"
+
+
+def main() -> int:
+    cases_pos = [case_clean_pr(), case_rejected_pr()]
+    cases_neg = [case_malformed_panelist(), case_unknown_persona(), case_disposition_leak()]
+
+    ok = True
+    print("APM Review Panel - Deterministic Verdict Harness")
+    print("=" * 60)
+    print()
+    print("Positive cases (well-formed JSON, expected verdict):")
+    for c in cases_pos:
+        passed, msg = run_positive(c)
+        marker = "[+] PASS" if passed else "[x] FAIL"
+        print(f"{marker}  {c['name']}")
+        print(msg)
+        print()
+        ok = ok and passed
+
+    print("Negative cases (malformed JSON, S4 schema gate must reject):")
+    for c in cases_neg:
+        passed, msg = run_negative(c)
+        marker = "[+] PASS" if passed else "[x] FAIL"
+        print(f"{marker}  {c['name']}")
+        print(f"  {msg}")
+        print()
+        ok = ok and passed
+
+    print("=" * 60)
+    print("RESULT:", "ALL PASS" if ok else "FAIL")
+    return 0 if ok else 1
+
+
+if __name__ == "__main__":
+    sys.exit(main())
diff --git a/.github/skills/apm-review-panel/evals/trigger-evals.json b/.github/skills/apm-review-panel/evals/trigger-evals.json
new file mode 100644
index 000000000..cbedde29d
--- /dev/null
+++ b/.github/skills/apm-review-panel/evals/trigger-evals.json
@@ -0,0 +1,31 @@
+{
+  "description": "Trigger evals for the apm-review-panel skill dispatch description. 8 should-trigger + 8 should-NOT-trigger. 60/40 train/val split (10 train, 6 val). Validation split is the ship gate.",
+  "should_trigger": {
+    "train": [
+      "review this PR with the expert panel",
+      "run the apm-review-panel against this branch",
+      "do a multi-persona review of PR #1234",
+      "panel-review this PR",
+      "get the architecture + security + UX review on this PR"
+    ],
+    "val": [
+      "ask the python architect, cli logging expert, and security expert to weigh in on this PR",
+      "have the apm panel verdict this PR",
+      "i need a cross-cutting expert review of this change before merge"
+    ]
+  },
+  "should_not_trigger": {
+    "train": [
+      "give me a code review of this file",
+      "what does the python architect think of this class hierarchy",
+      "review this docstring",
+      "fix the lint errors in src/",
+      "what is the apm-review-panel skill"
+    ],
+    "val": [
+      "explain how the auth resolver works",
+      "summarize the diff of PR #1234",
+      "draft a PR description for me"
+    ]
+  }
+}
diff --git a/.github/workflows/pr-panel-label-reset.yml b/.github/workflows/pr-panel-label-reset.yml
new file mode 100644
index 000000000..28dbd25a5
--- /dev/null
+++ b/.github/workflows/pr-panel-label-reset.yml
@@ -0,0 +1,43 @@
+name: PR Panel Label Reset
+
+# Purpose: when a PR with a panel verdict label gets a new commit pushed,
+# strip both verdict labels (panel-approved, panel-rejected) so the verdict
+# does not lie. This is a deterministic operation - no judgement, no LLM.
+# Per genesis S7 DETERMINISTIC TOOL BRIDGE: facts-that-must-be-true with
+# zero judgement do not belong in an LLM container.
+#
+# The companion gh-aw workflow pr-review-panel.md handles the smart side
+# (panel orchestration, verdict derivation, label application). This file
+# handles the dumb side (verdict invalidation on new push).
+#
+# This workflow does NOT remove the panel-review trigger label. That is
+# done inside pr-review-panel.md after the panel completes (so the
+# trigger label stays idempotent: re-add to re-run).
+
+on:
+  pull_request:
+    types: [synchronize]
+
+permissions:
+  pull-requests: write
+  contents: read
+
+jobs:
+  reset:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Remove stale panel verdict labels
+        env:
+          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          PR_NUMBER: ${{ github.event.pull_request.number }}
+          REPO: ${{ github.repository }}
+        run: |
+          for label in panel-approved panel-rejected; do
+            # Best-effort removal: silently succeed when the label is
+            # not currently applied (gh exits non-zero in that case).
+            if gh pr edit "$PR_NUMBER" --repo "$REPO" --remove-label "$label" 2>&1; then
+              echo "Removed label '$label' (or it was not present)."
+            else
+              echo "Label '$label' not present - nothing to do."
+            fi
+          done
diff --git a/.github/workflows/pr-review-panel.lock.yml b/.github/workflows/pr-review-panel.lock.yml
index a095d6997..2a0df631c 100644
--- a/.github/workflows/pr-review-panel.lock.yml
+++ b/.github/workflows/pr-review-panel.lock.yml
@@ -1,4 +1,4 @@
-# gh-aw-metadata: {"schema_version":"v3","frontmatter_hash":"a8730b9c31fe50d5a50ee9d99212bc6da387fc069501aa0a425bdd8951ba9541","compiler_version":"v0.68.3","strict":true,"agent_id":"copilot"}
+# gh-aw-metadata: {"schema_version":"v3","frontmatter_hash":"b815b0974637d563cb04400baa74bf4b2d661724ea984b770276ef17e827ae8a","compiler_version":"v0.68.3","strict":true,"agent_id":"copilot"}
 # gh-aw-manifest: {"version":1,"secrets":["COPILOT_GITHUB_TOKEN","GH_AW_GITHUB_MCP_SERVER_TOKEN","GH_AW_GITHUB_TOKEN","GH_AW_PLUGINS_TOKEN","GITHUB_TOKEN"],"actions":[{"repo":"actions/checkout","sha":"de0fac2e4500dabe0009e67214ff5f5447ce83dd","version":"v6.0.2"},{"repo":"actions/download-artifact","sha":"3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c","version":"v8.0.1"},{"repo":"actions/github-script","sha":"373c709c69115d41ff229c7e5df9f8788daa9553","version":"v9"},{"repo":"actions/upload-artifact","sha":"043fb46d1a93c77aae656e7c1c64a875d1fc6a0a","version":"v7"},{"repo":"github/gh-aw-actions/setup","sha":"ba90f2186d7ad780ec640f364005fa24e797b360","version":"v0.68.3"},{"repo":"microsoft/apm-action","sha":"9fe9337ef58b5e620e0113071ceb47a6a8a232f7","version":"v1.4.2"}],"containers":[{"image":"ghcr.io/github/gh-aw-firewall/agent:0.25.20"},{"image":"ghcr.io/github/gh-aw-firewall/api-proxy:0.25.20"},{"image":"ghcr.io/github/gh-aw-firewall/squid:0.25.20"},{"image":"ghcr.io/github/gh-aw-mcpg:v0.2.19"},{"image":"ghcr.io/github/github-mcp-server:v0.32.0"},{"image":"node:lts-alpine"}]}
 #    ___                   _   _      
 #   / _ \                 | | (_)     
@@ -213,16 +213,16 @@ jobs:
         run: |
           bash "${RUNNER_TEMP}/gh-aw/actions/create_prompt_first.sh"
           {
-          cat << 'GH_AW_PROMPT_daad15abf8de7f2c_EOF'
+          cat << 'GH_AW_PROMPT_6a61e25df8c4956d_EOF'
           <system>
-          GH_AW_PROMPT_daad15abf8de7f2c_EOF
+          GH_AW_PROMPT_6a61e25df8c4956d_EOF
           cat "${RUNNER_TEMP}/gh-aw/prompts/xpia.md"
           cat "${RUNNER_TEMP}/gh-aw/prompts/temp_folder_prompt.md"
           cat "${RUNNER_TEMP}/gh-aw/prompts/markdown.md"
           cat "${RUNNER_TEMP}/gh-aw/prompts/safe_outputs_prompt.md"
-          cat << 'GH_AW_PROMPT_daad15abf8de7f2c_EOF'
+          cat << 'GH_AW_PROMPT_6a61e25df8c4956d_EOF'
           <safe-output-tools>
-          Tools: add_comment(max:7), missing_tool, missing_data, noop
+          Tools: add_comment(max:2), add_labels, remove_labels, missing_tool, missing_data, noop
           </safe-output-tools>
           <github-context>
           The following GitHub context information is available for this workflow:
@@ -252,15 +252,15 @@ jobs:
           {{/if}}
           </github-context>
           
-          GH_AW_PROMPT_daad15abf8de7f2c_EOF
+          GH_AW_PROMPT_6a61e25df8c4956d_EOF
           cat "${RUNNER_TEMP}/gh-aw/prompts/github_mcp_tools_with_safeoutputs_prompt.md"
-          cat << 'GH_AW_PROMPT_daad15abf8de7f2c_EOF'
+          cat << 'GH_AW_PROMPT_6a61e25df8c4956d_EOF'
           </system>
           
           
           
           {{#runtime-import .github/workflows/pr-review-panel.md}}
-          GH_AW_PROMPT_daad15abf8de7f2c_EOF
+          GH_AW_PROMPT_6a61e25df8c4956d_EOF
           } > "$GH_AW_PROMPT"
       - name: Interpolate variables and render templates
         uses: actions/github-script@373c709c69115d41ff229c7e5df9f8788daa9553 # v9
@@ -450,15 +450,17 @@ jobs:
           mkdir -p "${RUNNER_TEMP}/gh-aw/safeoutputs"
           mkdir -p /tmp/gh-aw/safeoutputs
           mkdir -p /tmp/gh-aw/mcp-logs/safeoutputs
-          cat > "${RUNNER_TEMP}/gh-aw/safeoutputs/config.json" << 'GH_AW_SAFE_OUTPUTS_CONFIG_f053aa54733793da_EOF'
-          {"add_comment":{"max":7},"create_report_incomplete_issue":{},"missing_data":{},"missing_tool":{},"noop":{"max":1,"report-as-issue":"true"},"report_incomplete":{}}
-          GH_AW_SAFE_OUTPUTS_CONFIG_f053aa54733793da_EOF
+          cat > "${RUNNER_TEMP}/gh-aw/safeoutputs/config.json" << 'GH_AW_SAFE_OUTPUTS_CONFIG_8270e0dce94dfaaa_EOF'
+          {"add_comment":{"max":2},"add_labels":{"allowed":["panel-approved","panel-rejected"],"max":1},"create_report_incomplete_issue":{},"missing_data":{},"missing_tool":{},"noop":{"max":1,"report-as-issue":"true"},"remove_labels":{"allowed":["panel-review"],"max":1},"report_incomplete":{}}
+          GH_AW_SAFE_OUTPUTS_CONFIG_8270e0dce94dfaaa_EOF
       - name: Write Safe Outputs Tools
         env:
           GH_AW_TOOLS_META_JSON: |
             {
               "description_suffixes": {
-                "add_comment": " CONSTRAINTS: Maximum 7 comment(s) can be added. Supports reply_to_id for discussion threading."
+                "add_comment": " CONSTRAINTS: Maximum 2 comment(s) can be added. Supports reply_to_id for discussion threading.",
+                "add_labels": " CONSTRAINTS: Maximum 1 label(s) can be added. Only these labels are allowed: [\"panel-approved\" \"panel-rejected\"].",
+                "remove_labels": " CONSTRAINTS: Maximum 1 label(s) can be removed. Only these labels can be removed: [panel-review]."
               },
               "repo_params": {},
               "dynamic_tools": []
@@ -487,6 +489,25 @@ jobs:
                   }
                 }
               },
+              "add_labels": {
+                "defaultMax": 5,
+                "fields": {
+                  "item_number": {
+                    "issueNumberOrTemporaryId": true
+                  },
+                  "labels": {
+                    "required": true,
+                    "type": "array",
+                    "itemType": "string",
+                    "itemSanitize": true,
+                    "itemMaxLength": 128
+                  },
+                  "repo": {
+                    "type": "string",
+                    "maxLength": 256
+                  }
+                }
+              },
               "missing_data": {
                 "defaultMax": 20,
                 "fields": {
@@ -544,6 +565,25 @@ jobs:
                   }
                 }
               },
+              "remove_labels": {
+                "defaultMax": 5,
+                "fields": {
+                  "item_number": {
+                    "issueNumberOrTemporaryId": true
+                  },
+                  "labels": {
+                    "required": true,
+                    "type": "array",
+                    "itemType": "string",
+                    "itemSanitize": true,
+                    "itemMaxLength": 128
+                  },
+                  "repo": {
+                    "type": "string",
+                    "maxLength": 256
+                  }
+                }
+              },
               "report_incomplete": {
                 "defaultMax": 5,
                 "fields": {
@@ -636,7 +676,7 @@ jobs:
           export MCP_GATEWAY_DOCKER_COMMAND='docker run -i --rm --network host -v /var/run/docker.sock:/var/run/docker.sock -e MCP_GATEWAY_PORT -e MCP_GATEWAY_DOMAIN -e MCP_GATEWAY_API_KEY -e MCP_GATEWAY_PAYLOAD_DIR -e MCP_GATEWAY_PAYLOAD_SIZE_THRESHOLD -e DEBUG -e MCP_GATEWAY_LOG_DIR -e GH_AW_MCP_LOG_DIR -e GH_AW_SAFE_OUTPUTS -e GH_AW_SAFE_OUTPUTS_CONFIG_PATH -e GH_AW_SAFE_OUTPUTS_TOOLS_PATH -e GH_AW_ASSETS_BRANCH -e GH_AW_ASSETS_MAX_SIZE_KB -e GH_AW_ASSETS_ALLOWED_EXTS -e DEFAULT_BRANCH -e GITHUB_MCP_SERVER_TOKEN -e GITHUB_MCP_GUARD_MIN_INTEGRITY -e GITHUB_MCP_GUARD_REPOS -e GITHUB_REPOSITORY -e GITHUB_SERVER_URL -e GITHUB_SHA -e GITHUB_WORKSPACE -e GITHUB_TOKEN -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RUN_ATTEMPT -e GITHUB_JOB -e GITHUB_ACTION -e GITHUB_EVENT_NAME -e GITHUB_EVENT_PATH -e GITHUB_ACTOR -e GITHUB_ACTOR_ID -e GITHUB_TRIGGERING_ACTOR -e GITHUB_WORKFLOW -e GITHUB_WORKFLOW_REF -e GITHUB_WORKFLOW_SHA -e GITHUB_REF -e GITHUB_REF_NAME -e GITHUB_REF_TYPE -e GITHUB_HEAD_REF -e GITHUB_BASE_REF -e GH_AW_SAFE_OUTPUTS_PORT -e GH_AW_SAFE_OUTPUTS_API_KEY -v /tmp/gh-aw/mcp-payloads:/tmp/gh-aw/mcp-payloads:rw -v /opt:/opt:ro -v /tmp:/tmp:rw -v '"${GITHUB_WORKSPACE}"':'"${GITHUB_WORKSPACE}"':rw ghcr.io/github/gh-aw-mcpg:v0.2.19'
           
           mkdir -p /home/runner/.copilot
-          cat << GH_AW_MCP_CONFIG_f34a70f444ade5b1_EOF | bash "${RUNNER_TEMP}/gh-aw/actions/start_mcp_gateway.sh"
+          cat << GH_AW_MCP_CONFIG_2bd9111d9b098208_EOF | bash "${RUNNER_TEMP}/gh-aw/actions/start_mcp_gateway.sh"
           {
             "mcpServers": {
               "github": {
@@ -677,7 +717,7 @@ jobs:
               "payloadDir": "${MCP_GATEWAY_PAYLOAD_DIR}"
             }
           }
-          GH_AW_MCP_CONFIG_f34a70f444ade5b1_EOF
+          GH_AW_MCP_CONFIG_2bd9111d9b098208_EOF
       - name: Download activation artifact
         uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
         with:
@@ -1319,7 +1359,7 @@ jobs:
           GH_AW_ALLOWED_DOMAINS: "*.githubusercontent.com,api.business.githubcopilot.com,api.enterprise.githubcopilot.com,api.github.com,api.githubcopilot.com,api.individual.githubcopilot.com,api.snapcraft.io,archive.ubuntu.com,azure.archive.ubuntu.com,codeload.github.com,crl.geotrust.com,crl.globalsign.com,crl.identrust.com,crl.sectigo.com,crl.thawte.com,crl.usertrust.com,crl.verisign.com,crl3.digicert.com,crl4.digicert.com,crls.ssl.com,docs.github.com,github-cloud.githubusercontent.com,github-cloud.s3.amazonaws.com,github.blog,github.com,github.githubassets.com,host.docker.internal,json-schema.org,json.schemastore.org,keyserver.ubuntu.com,lfs.github.com,objects.githubusercontent.com,ocsp.digicert.com,ocsp.geotrust.com,ocsp.globalsign.com,ocsp.identrust.com,ocsp.sectigo.com,ocsp.ssl.com,ocsp.thawte.com,ocsp.usertrust.com,ocsp.verisign.com,packagecloud.io,packages.cloud.google.com,packages.microsoft.com,ppa.launchpad.net,raw.githubusercontent.com,registry.npmjs.org,s.symcb.com,s.symcd.com,security.ubuntu.com,telemetry.enterprise.githubcopilot.com,ts-crl.ws.symantec.com,ts-ocsp.ws.symantec.com,www.googleapis.com"
           GITHUB_SERVER_URL: ${{ github.server_url }}
           GITHUB_API_URL: ${{ github.api_url }}
-          GH_AW_SAFE_OUTPUTS_HANDLER_CONFIG: "{\"add_comment\":{\"max\":7},\"create_report_incomplete_issue\":{},\"missing_data\":{},\"missing_tool\":{},\"noop\":{\"max\":1,\"report-as-issue\":\"true\"},\"report_incomplete\":{}}"
+          GH_AW_SAFE_OUTPUTS_HANDLER_CONFIG: "{\"add_comment\":{\"max\":2},\"add_labels\":{\"allowed\":[\"panel-approved\",\"panel-rejected\"],\"max\":1},\"create_report_incomplete_issue\":{},\"missing_data\":{},\"missing_tool\":{},\"noop\":{\"max\":1,\"report-as-issue\":\"true\"},\"remove_labels\":{\"allowed\":[\"panel-review\"],\"max\":1},\"report_incomplete\":{}}"
         with:
           github-token: ${{ secrets.GH_AW_GITHUB_TOKEN || secrets.GITHUB_TOKEN }}
           script: |
diff --git a/.github/workflows/pr-review-panel.md b/.github/workflows/pr-review-panel.md
index b93557a0c..ae3cc0c4d 100644
--- a/.github/workflows/pr-review-panel.md
+++ b/.github/workflows/pr-review-panel.md
@@ -25,9 +25,16 @@ description: Multi-persona expert panel review of labelled PRs, posting a single
 #        `gh pr diff` which return inert text
 #      - imports are pinned to microsoft/apm#main (panel skill +
 #        persona definitions are trusted, not from the PR)
-#      - the only write surface is safe-outputs.add-comment (max 7
-#        is a safety ceiling; the agent is instructed to emit one
-#        synthesized verdict comment)
+#      - write surfaces are tightly scoped:
+#          add-comment max 2 (one CEO comment + one safety overflow)
+#          add-labels allowed [panel-approved, panel-rejected] max 1
+#            (mutually exclusive verdict; orchestrator emits exactly one)
+#          remove-labels allowed [panel-review] max 1
+#            (clear the trigger label after the run so re-applying it
+#             re-runs the panel idempotently)
+#        The verdict labels themselves are stripped on every new push
+#        by the deterministic companion workflow pr-panel-label-reset.yml
+#        (plain GitHub Actions, no LLM).
 #      - `roles: [admin, maintainer, write]` ensures only repo
 #        maintainers can trigger -- matches the trust model that
 #        applying the `panel-review` label requires write access.
@@ -99,8 +106,21 @@ network:
     - github
 
 safe-outputs:
+  # Single CEO comment per panel run. max:2 is a fail-soft ceiling; the
+  # one-comment discipline lives inside the apm-review-panel skill.
   add-comment:
-    max: 7
+    max: 2
+  # Verdict label. Mutually exclusive (orchestrator picks exactly one).
+  # The companion workflow pr-panel-label-reset.yml strips both on every
+  # new push so a stale verdict can never linger past a code change.
+  add-labels:
+    allowed: [panel-approved, panel-rejected]
+    max: 1
+  # Trigger label cleanup. Removed after the run so re-applying
+  # `panel-review` re-triggers the panel cleanly.
+  remove-labels:
+    allowed: [panel-review]
+    max: 1
 
 timeout-minutes: 30
 ---
diff --git a/apm.lock.yaml b/apm.lock.yaml
index b8facdf76..eaa002865 100644
--- a/apm.lock.yaml
+++ b/apm.lock.yaml
@@ -33,16 +33,16 @@ local_deployed_files:
 - .github/skills/supply-chain-security
 local_deployed_file_hashes:
   .github/agents/agentic-workflows.agent.md: sha256:d1ea2d038e2af8be11d6c95b3213b03b9777fae46f0438efa95d5a803e6c3765
-  .github/agents/apm-ceo.agent.md: sha256:dfc436e6eeffc7ec1c2f556edb78e4a5166ac36d162ea720d08b4b79af0a9938
+  .github/agents/apm-ceo.agent.md: sha256:82b259a9fbef2ba44acdb354921f23ee1caf4a70df2dc8bfddac64b1ede95630
   .github/agents/apm-primitives-architect.agent.md: sha256:6c01eab74ba18d70f21d45010d636cc6535d63cee81da12e61898d8036e0b028
-  .github/agents/auth-expert.agent.md: sha256:85409aab097cf239e5aa7ad61db6c4586be9884ef64a45fa9c894017b046b56b
-  .github/agents/cli-logging-expert.agent.md: sha256:24bf6c4b420c42292700ad0eb80b53d275be5c9cb186d471d706211f8419e3b9
-  .github/agents/devx-ux-expert.agent.md: sha256:3472680f43b2b4411b9437ec31529216afd4e576e1874c14430935e7f1ded1f2
+  .github/agents/auth-expert.agent.md: sha256:18264a933cba432b77d133e6ae11eee294c92ed245629af8c9b7a5bb7a9a300c
+  .github/agents/cli-logging-expert.agent.md: sha256:3ed7fe1a2e28e03a40311d4999ef54330908920d6515205708dd3f037abfcf0f
+  .github/agents/devx-ux-expert.agent.md: sha256:8310d130cca5bc548baf4a2a84e3c9680c9dc5d83a2718150636896ab2aa1f30
   .github/agents/doc-analyser.agent.md: sha256:47b1d0204904b786c19d4fe84343e86cdab6f92f862f676ba741ffe6e1385679
   .github/agents/doc-writer.agent.md: sha256:328a5b9ea079869b8ccd914a6e2135c204225a5eedb42f59a1ec73058f7f0b47
-  .github/agents/oss-growth-hacker.agent.md: sha256:8d18f5be46913c40ad3aa66fb984575a88988cfac402d39353cdfb09f7e582c5
-  .github/agents/python-architect.agent.md: sha256:a70b1be6cd877a85414a65a1db8117be7dbecc3865d98045f46ed14462fda099
-  .github/agents/supply-chain-security-expert.agent.md: sha256:9a4e731b12e7658f71d54c22e90f80ce0c45e3eacbb069b8505ed96ec9e79ba5
+  .github/agents/oss-growth-hacker.agent.md: sha256:1cd56bb78ab37d52c50e45ab69d759f775cd49cdf35981b3dc6c4004315c6b83
+  .github/agents/python-architect.agent.md: sha256:d268f86b55e137ecb09d137f6c3e12ecb9acceff34701c04f11dbdd1e08f828b
+  .github/agents/supply-chain-security-expert.agent.md: sha256:8fb8cc426d6af17ba084a28b3f026c2b475b62e3ca63ed2f88b83bd823f877af
   .github/instructions/changelog.instructions.md: sha256:1e51ec4c74e847967962bd279dc4c6e582c5d3578490b3c28d5f3acd3e05f73e
   .github/instructions/cicd.instructions.md: sha256:9c0fafc74f743aa97e5adba2168d66c9e3a327b135065e3b804bdbb5f04cda5d
   .github/instructions/cli.instructions.md: sha256:8e39e8d5047ce88575cb02f87c2bcede584dfef258bd86f7466c7badf136541a