feat(guardrails): autonomous pipeline — orchestration, auto-review, ship agent by terisuke · Pull Request #145 · Cor-Incorporated/opencode

terisuke · 2026-04-08T15:47:56Z

Summary

Phase 0: Fix 8 foundation bugs: findings from PR #90 (feat(guardrail): add git freshness, merge gate, test falsifiability, doc reminder hooks) plus Issues #140 (apply_patch bypasses tool.execute.before preflight checks), #141 (tighten bash mut regex to reduce false review_state resets), and #142 (.env.example reads blocked despite opencode.json explicitly allowing them)
Phase 1: Workflow orchestration engine — /implement and /auto auto-chain through test→review→ship
Phase 2: Auto-review trigger — spawns code-reviewer on session.idle when edits ≥ 3
Phase 3: Ship agent with gh pr merge capability + /ship command rewrite
Phase 4: Issue management — tracks gh issue create in workflow state
Phase 5: Delegation worktree fix (Issue #144: agent worktrees created as empty git init, team/subagent delegation non-functional) + auto-commit on merge-back

Key Changes

File	Lines	Description
`guardrail.ts`	+290	Client type, autoReview pipeline, checklist gate, workflow state machine, system.transform
`team.ts`	+25	Worktree verification, directory injection, auto-commit, regex fix
`agents/ship.md`	+42	New agent with merge permission
`commands/ship.md`	rewrite	Executable merge workflow (was read-only gate)
`commands/auto.md`	+18	New autonomous pipeline command
`AGENTS.md`	+1	Ship agent entry

Test plan

bun test all pass (2062 pass, pre-existing LSP fail only)
Typecheck clean (bun turbo typecheck)
Local deploy + binary launch
Firing test: /implement sets workflow_phase
Firing test: session.idle triggers auto-review
Firing test: /ship executes gh pr merge
Firing test: /auto completes full pipeline

Closes #140, #141, #142, #144

🤖 Generated with Claude Code

…#141/#142

Bug 0-1: Add exit code guards to all git() helper callers to prevent using stdout from failed commands in security decisions.
Bug 0-2: Fix team.ts mut() merge regex to use (\s|$) instead of \b, preventing false matches on "git merge-base".
Bug 0-3: Create MUTATING_TOOLS constant (edit/write/apply_patch/multiedit) and use it consistently for review_state resets and advisories.
Bug 0-4: Verified Python test detector regex already handles subdirectory paths correctly via (^|\/) pattern; no change needed.
Bug 0-5: Convert chat.message git fetch from blocking await to fire-and-forget with deferred advisory surfaced on next message.
Bug 0-6: Add secExempt pattern to skip .env.example/.sample/.template files from secret material read blocks.
Bug 0-7: Add apply_patch to tool.execute.before preflight checks for security files, config files, version baseline, and context budget.
Bug 0-8: Tighten mut[] regex patterns: \brm\b → \brm\s+, \bmv\b → \bmv\s+, \bcp\b → \bcp\s+ to reduce false positives from arguments.

Closes #140, Closes #141, Closes #142

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
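The difference between `\b` and `(\s|$)` in Bug 0-2 can be checked in isolation. The following is a minimal sketch; the names `loose`, `strict`, and `samples` are illustrative and are not taken from team.ts.

```typescript
// Word boundary: "-" is a non-word character, so \b matches between
// "merge" and "-base" and the read-only command is flagged.
const loose = /\bgit\s+merge\b/i

// Require whitespace or end of input after "merge" instead.
const strict = /\bgit\s+merge(\s|$)/i

const samples = ["git merge feature", "git merge", "git merge-base HEAD main"]

// For each sample command, record whether each pattern flags it as mutating.
const results = samples.map((cmd) => ({
  cmd,
  loose: loose.test(cmd),
  strict: strict.test(cmd),
}))

console.log(results)
```

Only the third sample differs between the two patterns: the loose form flags `git merge-base`, the strict form does not.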

- Add ship agent with gh pr merge permission for /ship command
- Rewrite /ship from read-only gate to executable merge workflow
- Fix delegation worktree empty init (Issue #144): verify files after creation
- Inject working directory path into child session prompts
- Auto-commit applied worker changes in merge() to prevent uncommitted drift
- Add ship agent to AGENTS.md subagents table

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…p command

Add a dedicated ship agent with `gh pr merge` permission, replacing the read-only review agent for the /ship command. The ship agent uses deny-by-default permissions with explicit allowlists for git read commands and gh PR operations, enabling OpenCode to actually execute merges instead of saying "please merge manually."

- New: packages/guardrails/profile/agents/ship.md (deny-by-default, gh pr merge allowed)
- Rewrite: /ship command from read-only gate check to executable merge workflow
- Update: AGENTS.md subagents table with ship agent entry
- Update: test replay fixture and assertion to use ship agent

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

… management

Add autonomous pipeline support to guardrail.ts:

- Client type for session API (create, prompt, status, messages, abort)
- Workflow state tracking (phase, PR URL, review attempts, issues)
- Auto-review trigger on session.idle (3+ edits, review pending)
- pollIdle/readResult/parseFindings helpers for review pipeline
- checklist() gate for merge readiness (tests, review, CI, findings)
- Workflow phase transitions in tool.execute.after (PR/test/merge/CI/issue)
- experimental.chat.system.transform hook for pipeline guidance
- command.execute.before: /implement and /auto workflow initialization
- Compacting context includes workflow phase, PR, review attempts
- New /auto command for full autonomous pipeline execution

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
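A merge-readiness gate over tests, review, CI, and findings could look roughly like the sketch below. The real `checklist()` in guardrail.ts is not shown on this page, so the `GateState` shape, its field names, and the rule that high-severity findings block a merge are all assumptions made for illustration.

```typescript
// Hypothetical state shape; the field names are assumptions, not the plugin's schema.
interface GateState {
  testsPassed: boolean
  reviewState: "done" | "pending" | "failed"
  ciGreen: boolean
  criticalCount: number
  highCount: number
}

// Returns the list of unmet conditions; an empty list means the merge may proceed.
function checklist(state: GateState): string[] {
  const blockers: string[] = []
  if (!state.testsPassed) blockers.push("tests have not passed")
  if (state.reviewState !== "done") blockers.push("review is not complete")
  if (!state.ciGreen) blockers.push("CI is not green")
  // Assumption: both critical and high findings block the merge.
  if (state.criticalCount > 0 || state.highCount > 0) {
    blockers.push(`unresolved findings: ${state.criticalCount} critical, ${state.highCount} high`)
  }
  return blockers
}

const ready = checklist({
  testsPassed: true,
  reviewState: "done",
  ciGreen: true,
  criticalCount: 0,
  highCount: 0,
})
console.log(ready.length === 0 ? "ready to merge" : ready.join("; "))
```

Returning the blockers rather than a boolean lets the caller report exactly which gate is still closed.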

…pipeline-wave1 # Conflicts: # packages/guardrails/profile/agents/ship.md

github-actions · 2026-04-08T15:49:22Z

New PR opened -- automated review will run on the next push.

To trigger a manual review, comment /review on this PR.

github-actions · 2026-04-08T15:49:23Z

This PR doesn't fully meet our contributing guidelines and PR template.

What needs to be fixed:

PR description is missing required template sections. Please use the PR template.

Please edit this PR description to address the above within 2 hours, or it will be automatically closed.

If you believe this was flagged incorrectly, please let a maintainer know.

github-actions · 2026-04-08T15:50:00Z

The following comment was made by an LLM, it may be inaccurate:

Copilot

Pull request overview

This PR extends the guardrails profile toward an autonomous workflow pipeline by adding auto-review orchestration, a dedicated /ship subagent that can execute gh pr merge, and tightening/repairing delegation worktree behavior.

Changes:

Add an auto-review pipeline triggered on session.idle (edits ≥ 3) plus workflow phase/state tracking and system prompt injection.
Introduce a new ship agent + rewrite /ship command to verify gates and run gh pr merge; add /auto command for end-to-end automation.
Improve delegation/worktree handling in team.ts (worktree verification, auto-commit, mutating regex adjustments) and update scenario tests/replays.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.

File	Description
packages/guardrails/profile/plugins/guardrail.ts	Adds auto-review + workflow orchestration/state, expands mutation handling, and adjusts secret detection.
packages/guardrails/profile/plugins/team.ts	Worktree creation verification, auto-commit after merge-back, and mutating command regex tweaks.
packages/guardrails/profile/agents/ship.md	New ship subagent with restricted permissions and merge capability.
packages/guardrails/profile/commands/ship.md	Rewrites `/ship` into an executable merge workflow (gates → merge).
packages/guardrails/profile/commands/auto.md	Adds `/auto` command instructions for autonomous pipeline execution.
packages/guardrails/profile/AGENTS.md	Documents the new `ship` agent.
packages/opencode/test/scenario/replay.ts	Updates ship replay metadata to use the new ship agent/prompt.
packages/opencode/test/scenario/guardrails.test.ts	Updates expectation that `ship` maps to the `ship` agent.

Comments suppressed due to low confidence (1)

packages/guardrails/profile/plugins/guardrail.ts:797

tool.execute.before applies deny/version/context-budget checks to edit/write/apply_patch, but not to multiedit. Since multiedit takes a filePath and performs sequential edits, this creates a policy bypass for secret/config protection and the version/context gates. Consider extending the same checks to multiedit (and if needed, iterating through edits to validate each operation) to keep behavior consistent with other mutating tools.

    "tool.execute.before": async (
      item: { tool: string; args?: unknown },
      out: { args: Record<string, unknown> },
    ) => {
      const file = pick(out.args ?? item.args)
      if (file && (item.tool === "read" || item.tool === "edit" || item.tool === "write" || item.tool === "apply_patch")) {
        const err = deny(file, item.tool === "read" ? "read" : "edit")
        if (err) {
          await mark({ last_block: item.tool, last_file: rel(input.worktree, file), last_reason: err })
          throw new Error(text(err))
        }
      }
      if (item.tool === "edit" || item.tool === "write" || item.tool === "apply_patch") {
        const err = await version(out.args ?? {})
        if (err) {
          await mark({ last_block: item.tool, last_file: file ? rel(input.worktree, file) : "", last_reason: err })
          throw new Error(text(err))
        }
      }
      if ((item.tool === "edit" || item.tool === "write" || item.tool === "apply_patch") && file && code(file)) {
        const count = await budget()
        if (count >= 4) {
          const budgetData = await stash(state)
          const readFiles = list(budgetData.read_files).slice(-5).join(", ")
          const err = `context budget exceeded after ${count} source reads (recent: ${readFiles || "unknown"}). Recovery options:\n(1) call \`team\` tool to delegate edit to isolated worker\n(2) use \`background\` tool for side work\n(3) narrow edit scope to a specific function/section rather than whole file\n(4) start a new session and continue from where you left off`
          await mark({ last_block: item.tool, last_file: rel(input.worktree, file), last_reason: err })
          throw new Error(text(err))
        }
      }
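One way to keep every preflight branch in step is to route all tool checks through a single set, so that adding `multiedit` in one place covers the deny, version, and budget gates together. This is a sketch only: the tool names come from the Bug 0-3 commit message, while `isMutating` and `denyMode` are hypothetical helpers that do not appear in guardrail.ts.

```typescript
// Tool names listed in the commit message for Bug 0-3.
const MUTATING_TOOLS = new Set(["edit", "write", "apply_patch", "multiedit"])

// True when the tool can modify files and should pass through the edit preflight checks.
function isMutating(tool: string): boolean {
  return MUTATING_TOOLS.has(tool)
}

// Which deny() mode applies: reads are checked as "read", mutating tools as "edit",
// and any other tool is not subject to the file-level check.
function denyMode(tool: string): "read" | "edit" | undefined {
  if (tool === "read") return "read"
  return isMutating(tool) ? "edit" : undefined
}

console.log(denyMode("multiedit"), denyMode("read"), denyMode("bash"))
```

With this shape, the three `item.tool === ...` chains in the excerpt would each reduce to a single `isMutating(item.tool)` call.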


Copilot · 2026-04-08T15:53:22Z

packages/guardrails/profile/plugins/guardrail.ts

+const secExempt = /\.(example|sample|template)$/i
+
 function has(file: string, list: RegExp[]) {
+  if (list === sec && secExempt.test(file)) return false
  return list.some((item) => item.test(file))


secExempt currently exempts any path ending with .example, .sample, or .template from all sec protections. This weakens the secret-file guard broadly (e.g., a file matching *.key or *.pem but suffixed with .template would no longer be blocked). If the goal is only to allow .env.example (Issue #142), please narrow the exemption to explicit .env.(example|sample|template) (or a small allowlist) rather than a generic extension-based bypass.
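A narrowed exemption along the lines suggested here might look like the following sketch. The `sec` list below is an illustrative stand-in (the plugin's real pattern list is not shown on this page), and `isSecret` is a hypothetical wrapper around the `has(file, sec)` check.

```typescript
// Exempt only dotenv templates such as ".env.example" or "config/.env.sample".
const secExempt = /(^|\/)\.env\.(example|sample|template)$/i

// Illustrative stand-in for the plugin's secret-file patterns (assumption).
const sec = [/(^|\/)\.env(\.|$)/i, /\.(pem|key)(\.|$)/i]

// A file is secret material when it matches a pattern and is not an exempt dotenv template.
function isSecret(file: string): boolean {
  if (secExempt.test(file)) return false
  return sec.some((item) => item.test(file))
}

for (const file of [".env", ".env.example", "config/.env.sample", "deploy/server.key.template"]) {
  console.log(file, isSecret(file))
}
```

Because the exemption is anchored to a `.env.` basename, a key or certificate with a `.template` suffix still falls through to the secret patterns.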

Copilot · 2026-04-08T15:53:23Z

packages/guardrails/profile/plugins/guardrail.ts

+  const REVIEW_POLL_GAP = 750
+
+  async function pollIdle(sessionID: string) {
+    for (;;) {
+      const stat = await input.client.session.status({ query: { directory: input.directory } })
+      const item = stat.data?.[sessionID]
+      if (!item || item.type === "idle") return


pollIdle() loops indefinitely with no timeout/abort path. If the review session never reaches idle (API issue, stuck run, missing status entry), autoReview() never resolves/rejects and auto_review_in_progress can remain stuck true, preventing future auto-reviews and leaking a background task. Consider adding a max wait (and/or using input.client.session.abort) and ensuring auto_review_in_progress is cleared on timeout.

Suggested change

-  const REVIEW_POLL_GAP = 750
-
-  async function pollIdle(sessionID: string) {
-    for (;;) {
-      const stat = await input.client.session.status({ query: { directory: input.directory } })
-      const item = stat.data?.[sessionID]
-      if (!item || item.type === "idle") return
+  const REVIEW_POLL_GAP = 750
+  const REVIEW_POLL_TIMEOUT = 60_000
+
+  async function pollIdle(sessionID: string) {
+    const deadline = Date.now() + REVIEW_POLL_TIMEOUT
+    const sessionClient = input.client.session as typeof input.client.session & {
+      abort?: (args: { path: { id: string }; query: { directory: string } }) => Promise<unknown>
+    }
+    for (;;) {
+      const stat = await input.client.session.status({ query: { directory: input.directory } })
+      const item = stat.data?.[sessionID]
+      if (!item || item.type === "idle") return
+      if (Date.now() >= deadline) {
+        if (typeof sessionClient.abort === "function") {
+          await sessionClient.abort({ path: { id: sessionID }, query: { directory: input.directory } }).catch(() => undefined)
+        }
+        throw new Error(`Timed out waiting for review session ${sessionID} to become idle`)
+      }
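The bounded-polling idea can also be expressed without a dependency on the session client, which makes the timeout path easy to exercise. The sketch below is an assumption-laden illustration, not the merged code: `pollUntilIdle`, `PollOptions`, and the injectable `now` and `sleep` hooks are hypothetical names.

```typescript
type Status = { type: string } | undefined

// The clock and sleep function are injectable so the loop can run without real timers.
interface PollOptions {
  gapMs: number
  timeoutMs: number
  now?: () => number
  sleep?: (ms: number) => Promise<void>
}

async function pollUntilIdle(
  getStatus: () => Promise<Status>,
  opts: PollOptions,
): Promise<"idle" | "timeout"> {
  const now = opts.now ?? Date.now
  const sleep = opts.sleep ?? ((ms: number) => new Promise<void>((done) => setTimeout(done, ms)))
  const deadline = now() + opts.timeoutMs
  for (;;) {
    const item = await getStatus()
    // A missing status entry is treated as idle, matching the original loop.
    if (!item || item.type === "idle") return "idle"
    // Give up once the deadline has passed instead of looping forever.
    if (now() >= deadline) return "timeout"
    await sleep(opts.gapMs)
  }
}
```

A caller would clear `auto_review_in_progress` (and optionally abort the review session) when the result is `"timeout"`, so a stuck review cannot block later ones.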

Copilot · 2026-04-08T15:53:23Z

packages/guardrails/profile/plugins/guardrail.ts

+    await pollIdle(made.data.id)
+    const result = await readResult(made.data.id)
+    const findings = parseFindings(result.text)
+    await mark({
+      auto_review_in_progress: false,
+      auto_review_session: made.data.id,
+      review_state: "done",
+      reviewed: true,
+      review_at: new Date().toISOString(),
+      edits_since_review: 0,
+      review_critical_count: findings.critical,
+      review_high_count: findings.high,
+    })


autoReview() marks review_state: "done" even when the spawned session returns no assistant message (result.text === "") or reports an error (result.error). This can incorrectly satisfy merge gates without a valid review result. Consider treating empty/error results as a failed review: keep review_state unset, record the error, and notify the parent session.
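One way to separate a valid review from a failed one is to compute the state patch in a pure function and only set `review_state` on success. This is a sketch under assumptions: `ReviewResult`, `reviewPatch`, and the `auto_review_error` key are hypothetical, while the other keys mirror those passed to `mark()` in the excerpt above.

```typescript
interface ReviewResult {
  text: string
  error?: string
}

type Patch = Record<string, string | number | boolean>

function reviewPatch(result: ReviewResult, findings: { critical: number; high: number }): Patch {
  const failed = Boolean(result.error) || result.text.trim() === ""
  if (failed) {
    // Leave review_state untouched so the merge gate stays closed.
    return {
      auto_review_in_progress: false,
      auto_review_error: result.error || "empty review result", // hypothetical key
    }
  }
  return {
    auto_review_in_progress: false,
    review_state: "done",
    reviewed: true,
    review_at: new Date().toISOString(),
    edits_since_review: 0,
    review_critical_count: findings.critical,
    review_high_count: findings.high,
  }
}

console.log(reviewPatch({ text: "" }, { critical: 0, high: 0 }))
```

In both branches the in-progress flag is cleared, so a failed review does not prevent the next trigger.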

Copilot · 2026-04-08T15:53:23Z

packages/guardrails/profile/plugins/guardrail.ts

@@ -37,6 +37,8 @@ const mut = [
  />/,


The mutating-bash detector includes a bare />/, which matches > inside quoted strings (e.g., echo "version > 2") and will trigger mutating behavior like review_state resets and protected-mutation blocks. To avoid false positives (Issue #141), tighten this to only match redirections outside quotes (or add a scrub/parse step similar to team.ts before applying the regexes).

Suggested change

-  />/,
+  /(?:^|[ \t;&|()])\d*>>?\s*(?:"[^"]+"|'[^']+'|(?!&?\d+\b)[^\s"'&|;()<>]+)/i,

Copilot · 2026-04-08T15:53:24Z

packages/guardrails/profile/plugins/team.ts

 function scrub(cmd: string) {
  return cmd
    .replace(/(?:\d*>>?|\&>>?|\&>)\s*\/dev\/null\b/g, " ")
    .replace(/\d*>\s*&\s*\d+\b/g, " ")
    .replace(/\d*>\s*&-/g, " ")
 }

 function mut(cmd: string) {
  const data = scrub(cmd)
  return [
-    /\brm\b/i,
-    /\bmv\b/i,
-    /\bcp\b/i,
+    /\brm\s+/i,
+    /\bmv\s+/i,
+    /\bcp\s+/i,
    /\bchmod\b/i,
    /\bchown\b/i,
    /\btouch\b/i,
    /\btruncate\b/i,
    /\btee\b/i,
    /\bsed\s+-i\b/i,
    /\bperl\s+-pi\b/i,
    />/,
-    /\bgit\s+(apply|am|merge|rebase|cherry-pick|checkout\s+--|reset\s+--hard)\b/i,
+    /\bgit\s+(apply|am|rebase|cherry-pick|checkout\s+--|reset\s+--hard)\b/i,
+    /\bgit\s+merge(\s|$)/i,
  ].some((item) => item.test(data))
 }


mut() still uses a bare />/ on the scrubbed command. Since scrub() does not remove quoted strings, this will still treat > inside quotes as a mutation (e.g., echo "version > 2"), which can incorrectly reset review state / trigger mutation gates. Consider tightening > detection to actual shell redirections outside quotes (or implement a small unquoted-token scan) per Issue #141.
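The "small unquoted-token scan" mentioned here could be sketched as a single pass that tracks quote state. `hasUnquotedRedirect` is a hypothetical helper, not code from team.ts. It deliberately ignores heredocs, command substitution, and process substitution, and it assumes `scrub()` has already removed harmless forms such as `2>&1` and `> /dev/null`.

```typescript
// Returns true when ">" appears outside single or double quotes.
function hasUnquotedRedirect(cmd: string): boolean {
  let quote: string | null = null
  for (let i = 0; i < cmd.length; i++) {
    const ch = cmd[i]
    if (quote) {
      // Inside double quotes a backslash escapes the next character;
      // inside single quotes every character is literal.
      if (ch === "\\" && quote === '"') i++
      else if (ch === quote) quote = null
      continue
    }
    if (ch === "\\") {
      i++ // skip the escaped character outside quotes
      continue
    }
    if (ch === '"' || ch === "'") {
      quote = ch
      continue
    }
    if (ch === ">") return true
  }
  return false
}

console.log(hasUnquotedRedirect('echo "version > 2"'), hasUnquotedRedirect("echo hi > out.txt"))
```

A scan like this trades regex brevity for predictable behavior on quoted arguments, which is the false-positive case described in Issue #141.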

Copilot · 2026-04-08T15:53:24Z

packages/guardrails/profile/plugins/team.ts

+  const files = await git(next, ["ls-files", "--cached"]).catch(() => ({ code: 1, out: "", err: "" }))
+  if (files.code !== 0 || !files.out.trim()) {
+    // Worktree might be empty — force checkout HEAD contents
+    await git(next, ["checkout", "HEAD", "--", "."])


In yardadd(), if the worktree appears empty you run git checkout HEAD -- . but don't check the return code or re-validate that files are now present. If checkout fails (or the worktree is misconfigured), this will silently return a still-broken worktree and later steps will fail in harder-to-debug ways. Consider checking the checkout result and throwing (or at least surfacing a clear error) if the worktree remains empty.

Suggested change

-  const files = await git(next, ["ls-files", "--cached"]).catch(() => ({ code: 1, out: "", err: "" }))
-  if (files.code !== 0 || !files.out.trim()) {
-    // Worktree might be empty — force checkout HEAD contents
-    await git(next, ["checkout", "HEAD", "--", "."])
+  let files = await git(next, ["ls-files", "--cached"]).catch(() => ({ code: 1, out: "", err: "" }))
+  if (files.code !== 0 || !files.out.trim()) {
+    // Worktree might be empty — force checkout HEAD contents
+    const checkout = await git(next, ["checkout", "HEAD", "--", "."]).catch(() => ({ code: 1, out: "", err: "" }))
+    if (checkout.code !== 0) {
+      throw new Error(checkout.err || checkout.out || "Failed to restore files in git worktree")
+    }
+    files = await git(next, ["ls-files", "--cached"]).catch(() => ({ code: 1, out: "", err: "" }))
+    if (files.code !== 0 || !files.out.trim()) {
+      throw new Error("Git worktree is still empty after checking out HEAD")
+    }

Copilot · 2026-04-08T15:53:24Z

packages/guardrails/profile/plugins/team.ts

+    await git(dir, ["add", "-A"])
+    const commitMsg = `chore(team): apply worker changes from task ${id}`
+    const commit = await git(dir, ["commit", "-m", commitMsg, "--no-verify"])
+    if (commit.code !== 0 && !commit.err.includes("nothing to commit")) {


Auto-commit failure detection only checks commit.err.includes("nothing to commit"), but git commit commonly prints "nothing to commit" to stdout (with non-zero exit) depending on config/version. This can incorrectly report an auto-commit failure in verification.issues. Consider checking both stdout and stderr for the "nothing to commit" condition (or using git diff --cached --quiet to decide whether to commit).

Suggested change

-    if (commit.code !== 0 && !commit.err.includes("nothing to commit")) {
+    const commitOutput = `${commit.out || ""}\n${commit.err || ""}`
+    if (commit.code !== 0 && !commitOutput.includes("nothing to commit")) {
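The alternative mentioned in the comment, deciding up front with `git diff --cached --quiet`, avoids parsing commit output entirely. The sketch below injects the git runner so it can be exercised without a repository; `Git`, `GitResult`, and `commitIfStaged` are hypothetical names, and the result shape is assumed from the `code`/`out`/`err` fields used in the excerpt.

```typescript
interface GitResult {
  code: number
  out: string
  err: string
}

// Hypothetical runner signature; the real git() helper also takes a directory.
type Git = (args: string[]) => Promise<GitResult>

async function commitIfStaged(git: Git, message: string): Promise<"committed" | "skipped"> {
  // `git diff --cached --quiet` exits 0 when nothing is staged and 1 when staged changes exist.
  const staged = await git(["diff", "--cached", "--quiet"])
  if (staged.code === 0) return "skipped"
  if (staged.code !== 1) throw new Error(staged.err || "git diff failed")
  const commit = await git(["commit", "-m", message, "--no-verify"])
  if (commit.code !== 0) throw new Error(commit.err || commit.out || "git commit failed")
  return "committed"
}
```

With this ordering, any non-zero exit from `git commit` is a genuine failure, so no string matching on "nothing to commit" is needed.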

Copilot · 2026-04-08T15:53:25Z

packages/guardrails/profile/plugins/guardrail.ts

+      if (event.type === "session.idle") {
+        const data = await stash(state)
+        const edits = num(data.edit_count)
+        const pending = str(data.review_state) !== "done"
+        const inProgress = flag(data.auto_review_in_progress)
+        const sessionID = str(event.properties?.sessionID)
+
+        if (edits >= 3 && pending && !inProgress && sessionID) {
+          await mark({ auto_review_in_progress: true })
+          await seen("auto_review.triggered", { edit_count: edits, sessionID })
+          void autoReview(sessionID, data).catch(async (err) => {
+            await mark({ auto_review_in_progress: false })
+            await seen("auto_review.failed", { error: String(err) })
+          })
+        }


Auto-review behavior (trigger on session.idle after edit_count >= 3, spawning a review session, and updating review_state/counts) is newly introduced here, but there are no scenario/unit tests covering the trigger, the spawned-session polling, or the resulting state transitions (auto_review_in_progress, review_critical_count, etc.). Since packages/opencode/test/scenario/guardrails.test.ts already exercises other guardrail hooks, adding coverage for this path would help prevent regressions (especially around stuck in-progress state and merge gating).
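Extracting the trigger condition into a pure predicate would make this path straightforward to cover. The sketch below is illustrative only: `shouldAutoReview`, `IdleState`, and `AUTO_REVIEW_MIN_EDITS` are hypothetical names, while the threshold of 3 and the field names come from the excerpt above.

```typescript
interface IdleState {
  edit_count: number
  review_state: string
  auto_review_in_progress: boolean
}

const AUTO_REVIEW_MIN_EDITS = 3

// Mirrors the inline condition: enough edits, review still pending,
// no review already running, and a known session to report back to.
function shouldAutoReview(data: IdleState, sessionID: string): boolean {
  const pending = data.review_state !== "done"
  return (
    data.edit_count >= AUTO_REVIEW_MIN_EDITS &&
    pending &&
    !data.auto_review_in_progress &&
    sessionID !== ""
  )
}

// Table of cases a scenario test could assert on.
const cases: Array<[IdleState, string, boolean]> = [
  [{ edit_count: 3, review_state: "", auto_review_in_progress: false }, "s1", true],
  [{ edit_count: 2, review_state: "", auto_review_in_progress: false }, "s1", false],
  [{ edit_count: 5, review_state: "done", auto_review_in_progress: false }, "s1", false],
  [{ edit_count: 5, review_state: "", auto_review_in_progress: true }, "s1", false],
  [{ edit_count: 5, review_state: "", auto_review_in_progress: false }, "", false],
]

const failures = cases.filter(([data, id, want]) => shouldAutoReview(data, id) !== want)
console.log(`${cases.length - failures.length}/${cases.length} cases pass`)
```

The stuck `auto_review_in_progress` case in particular is worth a dedicated test, since it silently disables all later auto-reviews.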

- parseFindings(): skip negation lines ("No CRITICAL issues") to avoid false positive severity counts blocking clean reviews
- autoReview(): do not mark review_state="done" when session errored or returned empty, preventing merge gate bypass on failed reviews
- team.ts merge(): scope auto-commit to patch-applied files only instead of git add -A, preventing unrelated local edits from being swept in

Codex review: codex review --base dev (3 findings, all addressed)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
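The negation handling can be illustrated with a small line-based counter. This is a sketch, not the merged `parseFindings()`: it assumes the reviewer writes one finding per line and tags it with the uppercase word CRITICAL or HIGH, and it only treats a line as negated when "no", "zero", or "0" directly precedes the severity word.

```typescript
// Assumed convention: one finding per line, tagged CRITICAL or HIGH in uppercase.
function parseFindings(text: string): { critical: number; high: number } {
  // Matches phrases such as "No CRITICAL issues" or "0 HIGH findings".
  const negated = /\b(no|zero|0)\s+(CRITICAL|HIGH)\b/i
  let critical = 0
  let high = 0
  for (const line of text.split("\n")) {
    if (negated.test(line)) continue
    if (/\bCRITICAL\b/.test(line)) critical++
    else if (/\bHIGH\b/.test(line)) high++
  }
  return { critical, high }
}

const report = [
  "No CRITICAL issues found",
  "CRITICAL: no input validation on login form",
  "HIGH: missing timeout in pollIdle",
  "No HIGH findings in team.ts",
].join("\n")

console.log(parseFindings(report))
```

Anchoring the negation to the severity word keeps a real finding such as "CRITICAL: no input validation" from being dropped just because the line contains "no".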

- Scope secExempt to .env.example/.sample/.template only (not *.key.template)
- Add 2-minute timeout to pollIdle() to prevent stuck auto-review
- Tighten bare `>` regex to avoid matching inside quoted strings
- Add error check + re-verify after yardadd() checkout fallback

Review: PR #145 inline comments (6 findings, 4 new + 1 already fixed + 1 truncated)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

terisuke and others added 5 commits April 8, 2026 23:59

Merge branch 'feat/ship-agent-merge-capability' into feat/autonomous-…

540e5e9

…pipeline-wave1 # Conflicts: # packages/guardrails/profile/agents/ship.md

Copilot AI review requested due to automatic review settings April 8, 2026 15:47

Copilot started reviewing on behalf of terisuke April 8, 2026 15:48 View session

github-actions bot added the needs:compliance label Apr 8, 2026

Copilot AI reviewed Apr 8, 2026

View reviewed changes

terisuke and others added 2 commits April 9, 2026 00:53

terisuke merged commit 58e6b24 into dev Apr 8, 2026
5 of 7 checks passed

terisuke merged 7 commits into dev from feat/autonomous-pipeline-wave1
