Skip to content

feat(review): AI review agents, local worktree, and UI polish#491

Merged
backnotprop merged 38 commits intomainfrom
feat/hook-up-review-ai
Apr 6, 2026
Merged

feat(review): AI review agents, local worktree, and UI polish#491
backnotprop merged 38 commits intomainfrom
feat/hook-up-review-ai

Conversation

@backnotprop
Copy link
Copy Markdown
Owner

@backnotprop backnotprop commented Apr 6, 2026

Summary

  • AI review agents: Codex and Claude Code as background review providers with live log streaming, structured output parsing, and findings as external annotations in the diff viewer
  • Claude-specific review model: Own schema (severity-based: important/nit/pre_existing), multi-agent prompt (Bug+Regression, Security, Code Quality, Guideline Compliance), reasoning traces per finding, converged with Claude Code's remote review service
  • Local worktree (default for PRs): Automatically creates a temporary worktree/clone so agents have full file access. Same-repo uses fast git worktree with fresh origin/; cross-repo uses shallow clone with local + remote tracking refs. --no-local to opt out
  • Separated concerns: agentCwd option keeps agent sandbox completely independent from the diff rendering pipeline — no contamination of file contents, diff switching, or expandable context
  • PR comments panel overhaul: Search, sort, collapse, author filter, inline review threads with resolved/outdated state and diff hunk previews via @pierre/diffs
  • Pi extension parity: Full agent review mirroring — stdin/stdout/stderr handling, live log streaming, result ingestion, buildCommand/onJobComplete callbacks. Vendored review modules with import rewriting. Runtime-agnostic codex-review.ts (node:fs/promises instead of Bun APIs)
  • UI polish: Unified 33px panel headers, consistent borders/backgrounds/typography, responsive toolbar buttons, dockview tab focus fix, shared ReviewAgentsIcon, severity markers + reasoning in inline annotations and copy/export paths
  • Security: XSS fix (removed onerror from DOMPurify allowlist), removed find/ls/cat from Claude allowed tools (exec bypass), added glab CLI support, git flag injection guards, baseSha hex validation, shell provider removed, same-repo detection includes host comparison for GHE
  • Reliability: stdout drain race fix (2s grace period after proc.exited), Codex verdict override for P0/P1 findings, Claude parse failure logging, temp clone cleanup on failed fetch/checkout
  • @pierre/diffs upgrade: 1.1.0-beta.19 → 1.1.12 (13 stable releases of bug fixes)
  • Docs: AI Code Review Agents guide with full prompt/command/schema transparency

Test plan

  • plannotator review <pr-url> — PR loads with local worktree by default, worktree badge visible
  • plannotator review <pr-url> --no-local — no worktree, agents run without file access
  • plannotator review (local, no URL) — diff switcher works, merge-base option available
  • Launch Codex agent — findings appear with [P0]/[P1] tags, verdict red when P0/P1 present
  • Launch Claude agent — findings appear with [important]/[nit] tags, severity dots, reasoning text
  • Agent detail panel: live logs stream, findings sorted by severity, legend visible, copy includes reasoning
  • PR comments panel: search works, threads show resolved/outdated state, no j/k conflict with file tree
  • Panel headers aligned at same height across all three columns
  • Same-repo detection works for GHE (host comparison)
  • Cross-repo clone creates both local and origin/ refs
  • No debug log growth unless PLANNOTATOR_DEBUG=1
  • Pi extension: agent launch succeeds with vendored review modules
  • No type errors introduced (5 pre-existing errors in review tsconfig)

…ree, and panel UI

Hook up Codex as the first AI review agent in the code review system:

- Spawn `codex exec` with Codex's native review prompt and output schema
- Parse structured findings (ReviewOutputEvent) and push as external annotations
- Annotations appear inline in the diff viewer pinned to specific lines
- Review verdict (correct/incorrect + confidence + explanation) displayed in panel

Agent job infrastructure enhancements:
- Server-side command building via `buildCommand` callback (providers don't need frontend commands)
- Result ingestion via `onJobComplete` callback (reads output file, transforms findings)
- `addAnnotations` method on external annotation handler (bypasses HTTP for server-internal producers)
- Live stderr streaming via `job:log` SSE events with 200ms buffer-and-flush
- `cwd` and `summary` fields on AgentJobInfo

PR review with --local worktree:
- `plannotator review <PR_URL> --local` creates a temp git worktree with the PR branch
- Agent gets full local file access without touching the user's working tree
- Hybrid server mode: both prMetadata (platform features) and gitContext (local access)
- Automatic worktree cleanup on session end
- Runtime-agnostic worktree primitives in packages/shared/worktree.ts

Panel UI redesign:
- Findings | Logs tab system with underline-style tabs
- LiveLogViewer component with auto-scroll, truncation, and copy
- Review verdict card (correct/incorrect with confidence and explanation)
- Pending state with labeled "Review Verdict — Pending..." (not skeleton bars)
- Job card click opens detail panel directly (removed separate icon button)
- Dismissed annotation tracking (deleted annotations persist as "dismissed" in panel)
- Copy all annotations as formatted markdown
- Worktree badge in header with info dialog showing path
- CopyButton extended with inline variant for reuse
- ConfirmDialog extended with wide option

For provenance purposes, this commit was AI assisted.
…de review UI

- VerdictCard: remove AI-template left-border, use background-only tint
- Inline annotations: 6px radius, subtle shadow, hover elevation, action button scale
- File tree: tighter indentation (4 + depth*10), reduced container padding
- Select dropdowns: normalized to 4px border-radius matching pierre diffs
- Dockview tabs: close button pushed to far right with margin-left auto, visible at 0.25 opacity
- PR icons moved from sidebar to header (next to PR link)
- AnnotationRow: translate-x hover feedback
- Border-radius normalized to `rounded` (4px) across all components
- Type scale consolidated: text-[8px]/[9px] → text-[10px], text-[11px] → text-xs
- Sidebar tab hit targets increased (px-2.5 py-1.5, w-4 icons)
- Sticky file group headers in annotation sidebar
- FileTree controls collapsed (worktree + diff selectors share one row)
- Tab micro-animations (transition-all duration-150)
- ScrollFade component for gradient indicators on scrollable containers
- Prose containment: max-w-2xl + px-6 padding on PR Summary/Comments/Checks panels
- MarkdownBody: leading-relaxed for comfortable reading
- PR Comments: surface lift (bg-muted/10), hover feedback, author font-semibold
- PR Checks: link affordance (text-primary, underline on hover, external link icon)

For provenance purposes, this commit was AI assisted.
…n, and polish

Comments panel enhancements:
- Search: real-time text filter across author and body, match count display
- Keyboard navigation: j/k to move between comments, scroll-to-selected
- Sort: toggle between oldest/newest first
- Collapsible comments: click header to collapse, collapse/expand all controls
- Author exclusion filter: click authors to hide their comments (not inclusion)
- Comment actions: hover-reveal "View on GitHub" link + copy button (bottom-right)
- Review URLs: PRReview type now includes optional url field, populated from GitHub API

PR Summary fixes:
- Label contrast: use theme foreground color for label text instead of raw GitHub hex
- Linked issues: replaced broken hardcoded SVG with proper GitHub Octicons issue-opened icon

Data plumbing:
- platformUser exposed through ReviewStateContext for "Mine" filtering
- Panel wrapper changed to overflow-hidden for sticky toolbar support
- "Commented" review badge hidden (noise — only show Approved/Changes Requested/Dismissed)

For provenance purposes, this commit was AI assisted.
…diff hunk previews

PR Review Threads:
- Fetch inline code review comments via GitHub GraphQL (reviewThreads query)
- PRReviewThread and PRThreadComment types with isResolved, isOutdated, path, line, diffSide
- ThreadCard component: file/line context, Outdated/Resolved badges, nested replies
- Resolved/outdated threads: gradient fade on body with "Show full comment" expand
- GitLab: reviewThreads placeholder (TODO: parse DiffNote positions from notes)

DiffHunkPreview:
- Renders diff hunks using @pierre/diffs FileDiff component (read-only, compact)
- Full theme integration: reads computed CSS vars, injects via unsafeCSS (same as main DiffViewer)
- Respects user font settings from ReviewState context
- Handles bare GitHub diffHunk format (prepends synthetic file headers for pierre parsing)

Comments Panel Polish:
- Comment cards: bg-card + subtle shadow for depth and isolation from panel background
- Hover: shadow elevation (0_2px_6px) for interactive feedback
- Thread cards: dimmed shadow for resolved/outdated, full shadow for active
- Prose padding: px-8 (32px) across all dockview panels (Summary, Comments, Checks, Findings)

For provenance purposes, this commit was AI assisted.
Claude Code review agent:
- claude-review.ts: prompt (adapted from code-review plugin), command builder
  (dontAsk + granular allowedTools/disallowedTools), JSONL stream output parser
- Prompt sent via stdin (not argv) to avoid quoting/variadic flag conflicts
- stream-json --verbose for live JSONL streaming + final structured_output
- Same schema as Codex — transformReviewFindings is now provider-agnostic

Agent jobs infrastructure:
- stdout capture (captureStdout option) for providers that return results on stdout
- stdin prompt writing (stdinPrompt option) for providers that read prompt from stdin
- cwd override in buildCommand return for providers without -C flag
- await stdoutDone before onJobComplete to prevent drain race condition
- job.prompt field for transparent prompt display in detail panel

Cross-repo --local:
- Detect same-repo vs cross-repo via parseRemoteUrl comparison
- Cross-repo: shallow clone via gh/glab repo clone (--depth 1 --no-checkout + targeted fetch)
- Cross-repo uses platform diff (gh pr diff) for display, clone for agent file access
- Same-repo: existing worktree path unchanged
- Cleanup: rmSync for clones, worktree remove for same-repo

Performance: jobLogs context split
- Separate JobLogsContext to prevent high-frequency log SSE from re-rendering all panels
- Only ReviewAgentJobDetailPanel subscribes to JobLogsProvider
- Standard React pattern: split contexts by update frequency

Image error handling:
- SafeHtmlBlock component wraps dangerouslySetInnerHTML with img onerror handlers
- Broken images (expired GitHub JWTs) hide on first 404 instead of flickering
- Prevents console 404 flood from re-render retry loops

For provenance purposes, this commit was AI assisted.
…etup, default local for PRs

The --local flag was setting gitContext from the worktree, which contaminated
the diff rendering pipeline — causing pierre "trailing context mismatch" errors
because /api/file-content read worktree files instead of using the GitHub API.

Root cause: gitContext serves two purposes — diff pipeline (file contents, diff
switching, staging) and agent sandbox (cwd for agent processes). These are now
properly separated via a new agentCwd option on ReviewServerOptions.

Changes:
- Add agentCwd to ReviewServerOptions, independent of gitContext
- Agent handler (getCwd, buildCommand, onJobComplete) prefers agentCwd
- Stop setting gitContext in --local PR path — diff pipeline untouched
- Revert band-aid !isPRMode guards (no longer needed)
- Fix same-repo worktree: fetch origin/<baseBranch> so agents see correct diff
- Fix cross-repo clone: create local branch at baseSha for git diff accuracy
- Fix FETCH_HEAD ordering: fetch base branch before PR head (createWorktree needs PR tip)
- Fix macOS path mismatch: realpathSync(tmpdir()) so agent paths strip correctly
- Change buildCodexReviewUserMessage signature from GitContext to focused options
- Make --local the default for PR/MR reviews (--no-local to opt out)
- Pass agentCwd to client for worktree badge display

For provenance purposes, this commit was AI assisted.
- Unify all panel headers at 33px via --panel-header-h CSS variable
- FileHeader now uses shared variable instead of hardcoded 30px
- Dockview tab bar height, font size, and padding aligned with sidebars
- FileTree header uses fixed height instead of padding-based sizing
- Consistent border opacity (border-border/50) across all panels
- Consistent font weight (font-semibold) on all header labels
- Remove dockview tab focus outline (::after pseudo-element)
- Dockview tab close button pushed to right edge
- Tab bar void area uses muted background
- Top app header compacted from h-12 to py-1
- Sidebar footer: copy button and diff stats side by side
- FeedbackButton responsive labels (Send/Post at md, full labels at lg)
- Move ReviewAgentsIcon to packages/ui for shared use
- Agent empty state uses shared ReviewAgentsIcon instead of hardcoded SVG
- Sidebar header label truncates when narrow (tabs never clip)
- Detail panel prompt disclosures get proper spacing
- React.memo on PR tab components (PRSummaryTab, PRCommentsTab)
- Inline onerror on img tags for broken GitHub image handling

For provenance purposes, this commit was AI assisted.
Bun's proc.stdin is typed as `number | FileSink` but we need to call
.write() and .end() on it. Cast to FileSink to satisfy tsc --noEmit.

For provenance purposes, this commit was AI assisted.
…mismatch

Security:
- Remove `onerror` from DOMPurify ALLOWED_ATTR — was allowing arbitrary JS
  execution via PR descriptions containing `<img onerror="...">`. Replace
  with SafeHtml component that attaches error handlers via useEffect + ref.
- Add `--` end-of-options separator to git fetch and git branch calls to
  prevent flag injection via crafted branch names from API responses.

Bug fix:
- Cross-repo clones now create both local branch AND remote-tracking ref
  (`refs/remotes/origin/<baseBranch>`) at baseSha, so agents can use either
  `git diff main...HEAD` or `git diff origin/main...HEAD`.

Polish:
- Add copy button to verdict card in agent detail panel
- Remove hover:translate-x animation from finding rows
- Reduce file tree indent per level, remove extra file offset
- Add pr-action endpoint logging for debugging submit failures

For provenance purposes, this commit was AI assisted.
The beta pin was needed for processFile() API (expandable diff context),
which shipped in 1.1.0 stable on March 14. We were 13 releases behind.

Notable fixes in the upgrade path:
- 1.1.5: Fix diffAcceptRejectHunk with partial FileDiffMetadata
- 1.1.6: Patch parsing fix for renames and dotfiles
- 1.1.8: Fix maxLineDiffLength regression

May resolve intermittent "trailing context mismatch" errors in diff rendering.

For provenance purposes, this commit was AI assisted.
Copy link
Copy Markdown
Owner Author

@backnotprop backnotprop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review from Plannotator

Comment on lines 235 to 238
onLaunch({
provider: selectedProvider,
// Stub command — will be replaced with real agent invocation
command: ['sleep', String(10 + Math.floor(Math.random() * 21))],
label: provider ? `${provider.name}` : selectedProvider,
label: provider ? `${provider.name} Review` : selectedProvider,
});
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P1] Shell-capable installs can no longer start any review agent

This launch payload now omits command, but the server only synthesizes commands for non-shell providers. On machines where shell is the only available capability, or when a user explicitly selects it, /api/agents/jobs returns Missing "command" array and launchJob() drops the error, so the newly re-enabled Run Agent button becomes a silent no-op.

Comment on lines +181 to +184
if (output.findings.length > 0) {
const cwd = options.agentCwd ?? resolveVcsCwd(currentDiffType, gitContext?.cwd) ?? process.cwd();
const authorName = job.provider === "codex" ? "Codex" : job.provider === "claude" ? "Claude Code" : "Review Agent";
const annotations = transformReviewFindings(output.findings, job.source, cwd, authorName);
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P2] Findings are normalized against the wrong working tree after a diff switch

This path normalization uses currentDiffType at completion time instead of the job's launch-time cwd from meta.cwd. If the reviewer switches worktrees or diff modes while an agent is still running, absolute paths from the agent are rebased against the wrong root, so the generated annotations no longer match the relative diff file paths and show up unattached or on the wrong file.

Comment on lines +307 to +312
const hostnameArgs = (host === "github.com" || host === "gitlab.com") ? [] : ["--hostname", host];

// Step 1: Fast skeleton clone (no checkout, depth 1 — minimal data transfer)
console.error(`Cloning ${prRepo} (shallow)...`);
const cloneResult = Bun.spawnSync(
[cli, "repo", "clone", prRepo, localPath, ...hostnameArgs, "--", "--depth=1", "--no-checkout"],
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P2] GitHub Enterprise cross-repo --local reviews never get a local checkout

In the cross-repo path this expands to gh repo clone owner/repo <dir> --hostname <host> -- ..., but gh repo clone does not support a --hostname flag. For self-hosted GitHub PRs the shallow clone step fails immediately, so the code falls back to remote-only mode and agents lose the local worktree access that --local is supposed to provide.

Comment on lines +78 to +80
const handleAnnotationClick = useCallback((ann: CodeAnnotation) => {
state.onSelectAnnotation(ann.id);
}, [state.onSelectAnnotation]);
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P1] openDiffFile omitted from annotation click handler — clicking findings no longer navigates to the file

In handleAnnotationClick, the call to state.openDiffFile(ann.filePath) was removed during the refactor that added live logging. The handler now only calls state.onSelectAnnotation(ann.id), which means clicking a finding row in the agent job detail panel selects the annotation in state but the diff viewer does not switch to the relevant file. The function openDiffFile still exists in ReviewStateContext (line 80) and is properly typed. The missing call should be restored:

const handleAnnotationClick = useCallback((ann: CodeAnnotation) => {
  state.openDiffFile(ann.filePath);  // was dropped
  state.onSelectAnnotation(ann.id);
}, [state.openDiffFile, state.onSelectAnnotation]);

Comment on lines +311 to +314
const cloneResult = Bun.spawnSync(
[cli, "repo", "clone", prRepo, localPath, ...hostnameArgs, "--", "--depth=1", "--no-checkout"],
{ stderr: "pipe" },
);
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P1] gh repo clone passes unvalidated owner/repo before --, enabling flag injection via crafted PR URL

The prRepo value (owner/repo) comes from a URL regex that only excludes / — it allows any other characters, including --flag=value. The -- end-of-flags marker is placed after prRepo and localPath, so it does not protect against flag injection:

[cli, "repo", "clone", prRepo, localPath, ...hostnameArgs, "--", "--depth=1", "--no-checkout"]

A URL like https://github.com/--upload-pack=malicious/repo/pull/1 would produce owner = "--upload-pack=malicious", which gh interprets as a flag. The --upload-pack option causes gh/git to execute an arbitrary command on clone.

Fix: move -- before prRepo:

[cli, "repo", "clone", "--", prRepo, localPath, ...hostnameArgs, "--depth=1", "--no-checkout"]

Or validate owner and repo match ^[a-zA-Z0-9._-]+$ before use.

Comment on lines +219 to +228
// Forward JSONL lines as log events (skip result events)
const lines = text.split('\n');
for (const line of lines) {
if (!line.trim()) continue;
try {
const event = JSON.parse(line);
if (event.type === 'result') continue; // handled in onJobComplete
} catch { /* not JSON — forward as raw log */ }
broadcast({ type: "job:log", jobId: id, delta: line + '\n' });
}
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P2] JSONL lines split on chunk boundaries in stdout drain — garbled live log output

The stdout drain loop splits text (the current chunk) on '\n' directly, without maintaining a partial-line buffer across chunks:

const lines = text.split('\n');

When a JSONL line (e.g. a Claude assistant message) spans two network chunks, the tail of the first chunk and the head of the second are treated as separate lines. Both fail JSON.parse and are broadcast as raw fragments, producing garbled output in LiveLogViewer. The final result is unaffected (stdoutBuf accumulates correctly), but the live view is broken for large messages.

Fix: maintain a lineBuffer that persists the incomplete last segment across iterations:

let lineBuffer = "";
for await (const chunk of reader) {
  const text = ...;
  stdoutBuf += text;
  lineBuffer += text;
  const parts = lineBuffer.split('\n');
  lineBuffer = parts.pop()!;
  for (const line of parts) { /* broadcast */ }
}

Comment on lines +203 to +232
export function formatClaudeLogEvent(line: string): string | null {
try {
const event = JSON.parse(line);

// Skip the final result event — handled separately
if (event.type === 'result') return null;

// Assistant messages (the agent's thinking/responses)
if (event.type === 'assistant' && event.message?.content) {
const parts = Array.isArray(event.message.content) ? event.message.content : [event.message.content];
const texts = parts
.filter((p: any) => p.type === 'text' && p.text)
.map((p: any) => p.text);
if (texts.length > 0) return texts.join('\n');
}

// Tool use events
if (event.type === 'assistant' && event.message?.content) {
const tools = (Array.isArray(event.message.content) ? event.message.content : [])
.filter((p: any) => p.type === 'tool_use');
if (tools.length > 0) {
return tools.map((t: any) => `[${t.name}] ${typeof t.input === 'string' ? t.input.slice(0, 100) : JSON.stringify(t.input).slice(0, 100)}`).join('\n');
}
}

return null;
} catch {
return null;
}
}
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P2] formatClaudeLogEvent is exported but never imported or called — live logs show raw JSONL instead of formatted text

formatClaudeLogEvent in packages/server/claude-review.ts was written to extract human-readable text from Claude's stream-json JSONL events (assistant messages, tool calls). However, it is never imported or called anywhere in the codebase — confirmed by grep showing exactly one occurrence (the definition itself).

The stdout drain in agent-jobs.ts broadcasts raw JSONL lines directly. Users watching a Claude review job's Logs tab see raw JSON like {"type":"assistant","message":{"content":[{"type":"text","text":"..."}],...}} instead of formatted output.

Fix: import formatClaudeLogEvent in agent-jobs.ts and use it to filter/format lines before broadcasting:

import { formatClaudeLogEvent } from "./claude-review";
// then in the drain loop:
const formatted = formatClaudeLogEvent(line);
if (formatted !== null) broadcast({ type: "job:log", jobId: id, delta: formatted + '\n' });

…ting, copy UX

Security:
- Remove shell provider from agent capabilities (unauthenticated RCE vector)
- Move `--` before prRepo in `gh repo clone` to prevent flag injection

Features:
- Wire up formatClaudeLogEvent — Claude live logs now show readable text
  instead of raw JSONL
- Sidebar annotations: copy + delete buttons appear on hover (no overlap)
- Agent finding rows: copy button on hover (progressive disclosure)
- Verdict card: copy button pushed to the right
- File tree: reduced indent per level, files aligned with folders

Infra:
- PR action endpoint logging for debugging submit failures

For provenance purposes, this commit was AI assisted.
The `--` separator in `gh repo clone` separates gh args from git args,
not positional args from flags. Using `--` before prRepo would break
the git flags. Instead, validate that the repo identifier doesn't start
with `-` to prevent flag injection via crafted PR URLs.

For provenance purposes, this commit was AI assisted.
…and multi-agent prompt

Claude review agent now has its own schema, prompt, and transform — separate
from Codex's P0-P3 priority model. Each provider uses its natural review style.

Schema changes:
- Claude findings use severity (important/nit/pre_existing) instead of priority (0-3)
- Flat structure: file, line, end_line instead of nested code_location
- description (single field) instead of title + body
- reasoning field captures the validation chain per finding
- summary with counts instead of overall_correctness/confidence

Prompt: Converges the open-source Claude Code review prompt with the remote
review service model. 4 parallel agents (Bug+Regression at Opus, Security at
Opus, Code Quality at Sonnet, Guideline Compliance at Haiku), validation step,
deduplication, severity classification. CLAUDE.md and REVIEW.md awareness.

UI: Severity markers (colored dots) on finding rows. Collapsible reasoning
section via <details>. New optional severity/reasoning fields on CodeAnnotation
and the external annotation store — backward compatible, only set by Claude.

Transform: transformClaudeFindings normalizes Claude output into the shared
annotation format. Codex path (transformReviewFindings) is completely untouched.

For provenance purposes, this commit was AI assisted.
…ined)

For provenance purposes, this commit was AI assisted.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR expands the review experience by adding first-class AI “review agents” (Codex + Claude Code) with structured outputs and live log streaming, introduces default PR-local worktrees for agent file access without affecting the user’s checkout, and polishes the review UI (comments panel overhaul, headers, dock tabs, icons, and styling).

Changes:

  • Add Codex/Claude review-agent integration with server-side command building, output parsing → external diff annotations, and live log streaming (job:log).
  • Implement default PR-local worktree/sandbox creation (--no-local opt-out), plus shared git worktree primitives vendored to Pi.
  • Overhaul PR comments UI (search, navigation, threads, diff hunk previews) and apply broad UI polish across panels/toolbars.

Reviewed changes

Copilot reviewed 45 out of 50 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
packages/ui/hooks/useAgentJobs.ts Adds jobLogs accumulation from new job:log SSE events.
packages/ui/components/ToolbarButtons.tsx Adds responsive “short label” support for toolbar buttons.
packages/ui/components/ReviewAgentsIcon.tsx Introduces shared icon component for review agents tab.
packages/ui/components/ConfirmDialog.tsx Adds wide layout option for dialogs.
packages/ui/components/AgentsTab.tsx Refines agent job cards and launches server-built agent runs.
packages/shared/worktree.ts Adds runtime-agnostic git worktree primitives (fetch/create/remove).
packages/shared/pr-provider.ts Extends PR context with review threads + optional review URL.
packages/shared/pr-gitlab.ts Stubs reviewThreads for GitLab (TODO parsing).
packages/shared/pr-github.ts Fetches GitHub review threads via GraphQL and maps into shared types.
packages/shared/package.json Exports new ./worktree entry.
packages/shared/agent-jobs.ts Extends AgentJobInfo (cwd/prompt/summary) + adds job:log event.
packages/server/vcs.ts Re-exports bun git runtime for shared worktree utilities.
packages/server/review.ts Adds agentCwd sandboxing, agent command builders, ingestion → annotations, and cleanup hook.
packages/server/PR_LOCAL_WORKTREE.md Design notes for PR local worktree behavior.
packages/server/git.ts Exports bun runtime implementing ReviewGitRuntime.
packages/server/external-annotations.ts Adds direct addAnnotations() helper (bypassing HTTP) with validation reuse.
packages/server/codex-review.ts New Codex review agent module (prompt/command/schema/output parsing/transform).
packages/server/codex-review-schema.json Adds on-disk schema file (also duplicated inline in code).
packages/server/claude-review.ts New Claude Code review agent module (command builder + JSONL parser + log formatter).
packages/server/agent-jobs.ts Server-side command building, stdin prompt support, stdout capture, and live log SSE deltas.
packages/server/AGENT_LIVE_LOGS.md Design notes for job:log SSE live logs.
packages/review-editor/package.json Updates @pierre/diffs dependency.
packages/review-editor/index.css Dockview theme/header sizing polish + focus outline changes.
packages/review-editor/dock/ReviewStateContext.tsx Adds platformUser to shared dock state.
packages/review-editor/dock/ReviewDockTabRenderer.tsx Minor sizing tweak for tab close icon.
packages/review-editor/dock/panels/ReviewPRCommentsPanel.tsx Passes platformUser through; adjusts overflow behavior.
packages/review-editor/dock/panels/ReviewAgentJobDetailPanel.tsx Major redesign: findings/logs tabs, verdict, prompt disclosure, live logs, copy/export affordances.
packages/review-editor/dock/JobLogsContext.tsx Adds separate context for high-frequency job logs.
packages/review-editor/components/ScrollFade.tsx Adds scroll fade wrapper component for better overflow cues.
packages/review-editor/components/ReviewSidebar.tsx Re-enables agents tab, UI tweaks, and shared icon import.
packages/review-editor/components/PRSummaryTab.tsx Safer HTML rendering + broken image hiding; layout polish; memoization.
packages/review-editor/components/PRCommentsTab.tsx Comments panel overhaul: search, keyboard nav, filters, threads + diff hunk previews.
packages/review-editor/components/PRChecksTab.tsx Layout polish and clearer check link affordances.
packages/review-editor/components/LiveLogViewer.tsx New reusable streaming log viewer with truncation and copy.
packages/review-editor/components/FileTreeNode.tsx Adjusts file tree indentation/padding.
packages/review-editor/components/FileTree.tsx Combines selectors, aligns header height, updates footer copy-diff UI.
packages/review-editor/components/FileHeader.tsx Aligns header height with shared panel header variable.
packages/review-editor/components/DiffHunkPreview.tsx Adds inline diff hunk preview rendering via @pierre/diffs.
packages/review-editor/components/CopyButton.tsx Adds inline/overlay variants + optional labels.
packages/review-editor/App.tsx Wires job logs context, worktree badge/dialog, PR panel buttons, and short feedback labels.
package.json Updates @pierre/diffs dependency.
import-octi/issue-reopened.svg Adds SVG asset.
import-octi/issue-opened.svg Adds SVG asset.
import-octi/issue-closed.svg Adds SVG asset.
bun.lock Locks updated dependencies/versions.
apps/review/package.json Updates @pierre/diffs dependency.
apps/pi-extension/vendor.sh Vendors new shared worktree.ts into Pi extension generated code.
apps/hook/server/index.ts Implements default PR-local worktree/clone creation and passes agentCwd + cleanup into server.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +349 to +353
function toRelativePath(absolutePath: string, cwd?: string): string {
if (!cwd) return absolutePath;
const prefix = cwd.endsWith("/") ? cwd : cwd + "/";
return absolutePath.startsWith(prefix) ? absolutePath.slice(prefix.length) : absolutePath;
}
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

toRelativePath() assumes POSIX / separators when stripping cwd from absolute paths. On Windows (or when Codex returns backslashes), this will fail to relativize paths and annotations may not match diff file paths. Using path.relative(cwd, absolutePath) (and normalizing to /) is more robust across platforms and symlinks.

Copilot uses AI. Check for mistakes.

worktreeCleanup = () => removeWorktree(gitRuntime, localPath, { force: true, cwd: repoDir });
process.on("exit", () => {
try { Bun.spawnSync(["git", "worktree", "remove", "--force", localPath]); } catch {}
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The explicit process.on("exit") cleanup runs git worktree remove without setting cwd to the source repo. If the process cwd changes during runtime, cleanup may fail. Prefer reusing worktreeCleanup (which already supplies cwd) or pass { cwd: repoDir } to the spawn call.

Suggested change
try { Bun.spawnSync(["git", "worktree", "remove", "--force", localPath]); } catch {}
try { Bun.spawnSync(["git", "worktree", "remove", "--force", localPath], { cwd: repoDir }); } catch {}

Copilot uses AI. Check for mistakes.
if (spawnOptions?.prompt) info.prompt = spawnOptions.prompt;
jobs.set(id, { info, proc });
if (outputPath) jobOutputPaths.set(id, outputPath);
if (spawnOptions?.cwd) jobOutputPaths.set(`${id}:cwd`, spawnOptions.cwd);
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jobOutputPaths stores ${id}:cwd only when an explicit spawnOptions.cwd override is provided, but spawnCwd can also come from the default getCwd(). This makes the cwd metadata used in onJobComplete inconsistent. Consider storing spawnCwd for every job (e.g., always set ${id}:cwd or pass entry.info.cwd).

Suggested change
if (spawnOptions?.cwd) jobOutputPaths.set(`${id}:cwd`, spawnOptions.cwd);
jobOutputPaths.set(`${id}:cwd`, spawnCwd);

Copilot uses AI. Check for mistakes.
Comment on lines +265 to +273
let isSameRepo = false;
try {
const remoteResult = await gitRuntime.runGit(["remote", "get-url", "origin"]);
if (remoteResult.exitCode === 0) {
const currentRepo = parseRemoteUrl(remoteResult.stdout.trim());
const prRepo = prMetadata.platform === "github"
? `${prMetadata.owner}/${prMetadata.repo}`
: prMetadata.projectPath;
isSameRepo = !!currentRepo && currentRepo.toLowerCase() === prRepo.toLowerCase();
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same-repo detection compares only org/repo (parseRemoteUrl) and ignores the remote host. On GitHub Enterprise / self-hosted GitLab, this can incorrectly classify a PR as “same repo” and attempt to fetch from the wrong origin. Consider parsing/comparing the host too (e.g., ensure it matches prMetadata.host).

Suggested change
let isSameRepo = false;
try {
const remoteResult = await gitRuntime.runGit(["remote", "get-url", "origin"]);
if (remoteResult.exitCode === 0) {
const currentRepo = parseRemoteUrl(remoteResult.stdout.trim());
const prRepo = prMetadata.platform === "github"
? `${prMetadata.owner}/${prMetadata.repo}`
: prMetadata.projectPath;
isSameRepo = !!currentRepo && currentRepo.toLowerCase() === prRepo.toLowerCase();
const getRemoteHost = (remoteUrl: string): string | null => {
const trimmed = remoteUrl.trim();
if (!trimmed) return null;
// SCP-like SSH syntax: git@host:owner/repo.git
const scpLikeMatch = trimmed.match(/^[^@]+@([^:]+):.+$/);
if (scpLikeMatch?.[1]) {
return scpLikeMatch[1].toLowerCase();
}
try {
return new URL(trimmed).host.toLowerCase();
} catch {
return null;
}
};
let isSameRepo = false;
try {
const remoteResult = await gitRuntime.runGit(["remote", "get-url", "origin"]);
if (remoteResult.exitCode === 0) {
const remoteUrl = remoteResult.stdout.trim();
const currentRepo = parseRemoteUrl(remoteUrl);
const currentHost = getRemoteHost(remoteUrl);
const prRepo = prMetadata.platform === "github"
? `${prMetadata.owner}/${prMetadata.repo}`
: prMetadata.projectPath;
const prHost = typeof prMetadata.host === "string" ? prMetadata.host.toLowerCase() : null;
const repoMatches = !!currentRepo && currentRepo.toLowerCase() === prRepo.toLowerCase();
const hostMatches = !prHost || (!!currentHost && currentHost === prHost);
isSameRepo = repoMatches && hostMatches;

Copilot uses AI. Check for mistakes.
Comment on lines +14 to +16
New `--local` flag on the review command. When reviewing a PR with `--local`:

1. **Fetch the PR head** into the user's local repo (doesn't change their checkout):
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doc still describes --local as opt-in, but the CLI logic now makes local checkout the default for PR/MR reviews and introduces --no-local to opt out. Updating the wording and examples here will prevent confusion.

Copilot uses AI. Check for mistakes.
<svg className="w-3 h-3" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
<path strokeLinecap="round" strokeLinejoin="round" d="M10 6H6a2 2 0 00-2 2v10a2 2 0 002 2h10a2 2 0 002-2v-4M14 4h6m0 0v6m0-6L10 14" />
</svg>
View on GitHub
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The link action label is hardcoded as “View on GitHub”, but PR context can also come from GitLab. This will be misleading for GitLab reviews. Consider deriving the label from the PR platform or using a neutral label like “View on platform”.

Suggested change
View on GitHub
View on platform

Copilot uses AI. Check for mistakes.
Comment on lines +18 to +22
const DEBUG_LOG_PATH = join(homedir(), ".plannotator", "codex-review-debug.log");

async function debugLog(label: string, data?: unknown): Promise<void> {
try {
await mkdir(join(homedir(), ".plannotator"), { recursive: true });
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This module writes a persistent debug log to ~/.plannotator/codex-review-debug.log on every run (and later logs raw outputs). That can silently persist sensitive repo/PR content on disk and adds ongoing I/O. Consider gating debug logging behind an explicit env/config flag and avoid logging full raw outputs by default.

Copilot uses AI. Check for mistakes.
Comment on lines +88 to +92
case 'job:log':
setJobLogs((prev) => {
const next = new Map(prev);
next.set(parsed.jobId, (prev.get(parsed.jobId) ?? '') + parsed.delta);
return next;
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

job:log deltas are appended into jobLogs but there’s no cleanup path (e.g., on terminal completion, jobs:cleared, or when a job disappears from a future snapshot). Over time this can grow without bound. Consider deleting/clearing per-job log entries when jobs finish or are cleared, and/or trimming here.

Copilot uses AI. Check for mistakes.
…findings UX

Security:
- Add -- separator to ensureObjectAvailable git fetch (worktree.ts)
- Validate baseBranch against path traversal (..) before git ref operations
- Use process.once('exit') instead of process.on for worktree cleanup

Debug:
- Gate debugLog behind PLANNOTATOR_DEBUG env var (no more unconditional writes)
- Remove PARSE_OUTPUT_RAW dump that logged full JSON output to disk

Deduplication:
- Extract toRelativePath to packages/server/path-utils.ts (was duplicated
  in codex-review.ts and claude-review.ts)

Pi extension:
- Remove shell provider from capabilities
- Add buildCommand callback to AgentJobHandlerOptions
- POST handler calls buildCommand for server-side command synthesis

UI:
- Findings sorted by severity (important → nit → pre_existing)
- Severity legend under findings header (colored dots)
- Reasoning always visible (not collapsible) — fixes click navigation bug
  where <details> captured click events and broke annotation linkage
- Full finding text shown (removed line-clamp-2 truncation)
- Sidebar annotation hover actions aligned to the right
- DiffHunkPreview: cancel requestAnimationFrame on unmount

For provenance purposes, this commit was AI assisted.
For provenance purposes, this commit was AI assisted.
… remove design docs

- Same-repo detection now compares both owner/repo AND hostname from the
  git remote URL against prMetadata.host. Prevents false positives on
  GitHub Enterprise where different instances share org/repo names.
- "View on GitHub" label in PR comments tab now shows "View on GitLab"
  for GitLab MR comments based on the comment URL.
- Remove internal design docs (PR_LOCAL_WORKTREE.md, AGENT_LIVE_LOGS.md)
  that were development artifacts, not user-facing documentation.

For provenance purposes, this commit was AI assisted.
…ations

The severity and reasoning fields from Claude findings were only visible in
the agent detail panel, not in the inline diff annotations. Now:

- DiffAnnotationMetadata carries severity and reasoning fields
- DiffViewer passes them through when mapping annotations
- InlineAnnotation renders colored severity dot and reasoning text

For provenance purposes, this commit was AI assisted.
Findings now show as "[important] description", "[nit] description",
"[pre_existing] description" — consistent with Codex's [P0]/[P1] tags.

For provenance purposes, this commit was AI assisted.
…ult ingestion

Pi extension agent-jobs handler now mirrors the Bun server's full capabilities:
- stdin piping for Claude prompt delivery
- stdout capture for Claude JSONL stream parsing
- Live stderr streaming with 200ms buffer-and-flush for job:log events
- Claude JSONL formatting via vendored formatClaudeLogEvent
- onJobComplete callback for result parsing and annotation push
- Full buildCommand integration in POST handler
- jobOutputPaths tracking with cleanup on kill

Pi serverReview.ts now wires buildCommand and onJobComplete with the same
logic as the Bun review server — Codex and Claude commands are built
server-side, results are parsed and transformed into external annotations.

Runtime compatibility: replaced Bun.file/Bun.write in codex-review.ts with
node:fs/promises equivalents (writeFile, readFile, existsSync) that work on
both Bun and Node. Verified Bun build passes.

Vendoring: vendor.sh now copies codex-review.ts, claude-review.ts, and
path-utils.ts from packages/server/ with import path rewriting for the
generated/ layout.

Also: Review Prompt label, px-8 padding on agent detail header/tabs/logs.

For provenance purposes, this commit was AI assisted.
For provenance purposes, this commit was AI assisted.
- exportReviewFeedback: appends **Reasoning:** after finding text
- Per-finding copy button: appends reasoning to copy text
- Sidebar annotation copy: appends reasoning to copy text

This ensures reasoning flows through Copy All, Send Feedback, and
individual copy actions — not just the visual rendering.

For provenance purposes, this commit was AI assisted.
…atch, Windows paths

1. openDiffFile: clicking a finding now navigates to the correct file
   before selecting the annotation (was silently selecting in wrong file)

2. SEVERITY_STYLES: extracted to packages/ui/types.ts as shared constant,
   imported in both ReviewAgentJobDetailPanel and InlineAnnotation
   (was duplicated with per-render rebuild in InlineAnnotation)

3. killJob: added jobOutputPaths.delete calls to match Pi's version
   (was leaking two strings per killed job)

4. CommentActions: replaced hand-rolled copy with CopyButton inline
   variant (was reimplementing useState/clipboard/setTimeout pattern)

5. Branch mode prompt: changed from three-dot to two-dot to match
   the UI's actual diff computation (agent was reviewing different diff)

6. toRelativePath: uses path.relative + forward-slash normalization
   for Windows compatibility (was string prefix matching with / only)

For provenance purposes, this commit was AI assisted.
…og streaming

Every job:log SSE event triggers setJobLogs in useAgentJobs, which re-renders
App.tsx. Without memo, the sidebar re-renders on every event (~5/sec) even
though its props (agentJobs.jobs, capabilities, callbacks) haven't changed.
This caused visible flickering when a review tab was open during agent runs.

React.memo shallow-compares props — all sidebar props are stable references
(jobs array only changes on status events, callbacks are useCallback-wrapped),
so the sidebar correctly skips re-renders during log streaming.

For provenance purposes, this commit was AI assisted.
… CLI

Security: Bash(find:*) allowed find -exec to spawn arbitrary subprocesses
that bypassed --disallowedTools. Removed find, ls, and cat — Claude has
Glob, Read, and Grep built-in which cover file access without shell exec.

Feature: Added glab mr view/diff/list and glab api to allowed tools so
Claude can inspect GitLab MR context in remote-mode reviews.

For provenance purposes, this commit was AI assisted.
Pi extension:
- Add addAnnotations() to external-annotations.ts return object —
  serverReview.ts calls it when agent jobs complete but the method
  was missing (build/runtime error)
- Change proc.on('exit') to proc.on('close') in agent-jobs.ts —
  Node's 'exit' fires before stdio streams drain, so stdoutBuf could
  be incomplete when onJobComplete parses Claude's JSONL result

Cross-repo --local:
- Check git checkout FETCH_HEAD exit code — throw if it fails so the
  outer catch falls back to remote-only with a clear warning
- Log warning if baseSha fetch fails (non-fatal, agents just can't
  diff locally)

For provenance purposes, this commit was AI assisted.
Critical:
- Move ensureObjectAvailable before PR head fetch — it can overwrite
  FETCH_HEAD if baseSha needs fetching, causing createWorktree to
  check out the base commit instead of the PR head
- Pi serverReview.ts: extract resolveAgentCwd() helper used by getCwd,
  buildCommand, and onJobComplete — was bypassing worktree-aware path
  resolution, causing agents to run in wrong directory

Cleanup:
- Hoist SEVERITY_ORDER to module scope in ReviewAgentJobDetailPanel
  (was recreated inside component body on every render)

For provenance purposes, this commit was AI assisted.
- Bun agent-jobs: change provider default from "shell" to "" (shell was
  removed from capabilities, default should match Pi)
- Pi serverReview: log errors from addAnnotations in onJobComplete
  (Bun logs them, Pi was silently ignoring)

For provenance purposes, this commit was AI assisted.
New docs page covering:
- Overview of Codex and Claude review agents
- How findings work (severity/priority, reasoning, navigation)
- Local worktree behavior (same-repo vs cross-repo)
- Full transparency section with:
  - Claude multi-agent pipeline prompt (all 6 steps)
  - Claude command and allowed/blocked tools
  - Codex review prompt and command
  - Both output schemas (Claude severity + Codex priority)
- Security notes (read-only, no network, local execution, no commenting)
- Customization via CLAUDE.md and REVIEW.md

For provenance purposes, this commit was AI assisted.
Credit Anthropic's Claude Code Review service, the open-source
code-review plugin, and OpenAI's Codex CLI as the foundations
for our review agent integrations.

For provenance purposes, this commit was AI assisted.
- Cross-repo: clean up localPath in catch block when fetch/checkout
  fails after clone succeeds (directory was leaking in /tmp)
- Remove j/k/arrow keyboard navigation from PR comments panel —
  these shortcuts belong to the file tree only, both registering
  global handlers caused double-navigation
- Thread header null guard: check thread.line before building range
  label to prevent "L12–null" for outdated GitHub threads

For provenance purposes, this commit was AI assisted.
The previous rmSync(localPath) in the catch block was dead code —
const declarations inside try are not in scope in catch (separate
lexical environments per ECMAScript spec). The ReferenceError was
silently swallowed by the inner try/catch, so temp directories
still leaked on failed fetch/checkout.

Fix: hoist `let localPath` before the try block so it's accessible
in catch. Guard with `if (localPath)` since the error could occur
before assignment.

Also: validate baseSha is a hex SHA (40-64 chars) to prevent git
flag injection via crafted API responses. Validate baseBranch
rejects both '..' and '-' prefixes.

For provenance purposes, this commit was AI assisted.
Restructured for a technical audience: shorter paragraphs, cleaner
tables, removed em-dashes and filler prose, tightened the pipeline
diagram, streamlined security section into scannable single-line items.

For provenance purposes, this commit was AI assisted.
Replaced summarized/paraphrased transparency section with the actual
prompts, commands, schemas, and tool allowlists as they exist in the
code. One short security note at the top, then raw content.

For provenance purposes, this commit was AI assisted.
For provenance purposes, this commit was AI assisted.
… memo removal

Critical:
- Race stdoutDone against 2s timeout after proc.exited — prevents
  permanent job hang when Bun's ReadableStream doesn't close after
  process exit. The process is dead; 2s is a cleanup deadline.

Bug fix:
- Codex verdict: override to "Issues Found" when P0/P1 findings exist,
  regardless of the freeform overall_correctness string. Prevents green
  "Correct" badge when Codex says "mostly correct but has issues."
  P2/P3-only findings still trust Codex's verdict.

Observability:
- Log Claude parse failures with buffer size and last 200 bytes so we
  can diagnose empty-findings cases.

Performance:
- Remove React.memo from ReviewSidebar — was blocking legitimate
  re-renders (job status, findings) to prevent cosmetic log flickering.
  The tradeoff was wrong.

Docs:
- Remove "shell" from CLAUDE.md capabilities table (provider was removed).

For provenance purposes, this commit was AI assisted.
Bun's proc.stdout/stderr support for-await at runtime but TypeScript's
ReadableStream type doesn't declare [Symbol.asyncIterator]. Cast through
unknown to AsyncIterable<Uint8Array> — standard Bun workaround, same
pattern as the FileSink cast for stdin.

For provenance purposes, this commit was AI assisted.
@backnotprop backnotprop merged commit b375a80 into main Apr 6, 2026
5 checks passed
backnotprop added a commit that referenced this pull request Apr 7, 2026
Delete import-octi/ octicon SVGs accidentally committed in #491
(no references in source). Relocate root-level test-fixtures/
markdown files under tests/test-fixtures/ for consistency.

For provenance purposes, this commit was AI assisted.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants