Skip to content

fix(ce-work,ce-work-beta): add safety checks for parallel subagent dispatch#557

Merged
tmchow merged 7 commits intomainfrom
tmchow/issue-550-review
Apr 15, 2026
Merged

fix(ce-work,ce-work-beta): add safety checks for parallel subagent dispatch#557
tmchow merged 7 commits intomainfrom
tmchow/issue-550-review

Conversation

@tmchow
Copy link
Copy Markdown
Collaborator

@tmchow tmchow commented Apr 14, 2026

Summary

  • Adds a Parallel Safety Check gate that builds a file-to-unit mapping from plan metadata and auto-downgrades to serial subagents when file overlap is detected
  • Parallel subagents no longer stage, commit, or run the full test suite -- the orchestrator handles all validation and commits after the entire batch completes
  • Splits the after-completion workflow into distinct serial vs. parallel flows to prevent mixed-tree interference
  • Propagates all changes to ce-work-beta

Addresses #550 (short-term mitigation for parallel dispatch safety without introducing worktree isolation).

Context

Issue #550 correctly identifies that parallel subagents sharing a working directory risk git index contention, staging leaks, and test interference. The medium-term solution (worktree isolation) has complexity around nested worktrees (Conductor users already operate in worktrees) and merge-back orchestration.

This PR takes the short-term path: make parallel-without-isolation safe by (1) tightening the overlap detection from a vague heuristic to an explicit file-set intersection check, (2) removing git and test operations from parallel subagents entirely, and (3) having the orchestrator validate and commit sequentially after all parallel work completes.

Changes

Phase 1 Step 4 -- Choose Execution Strategy:

  • Strategy table now references the new Parallel Safety Check instead of "non-overlapping files"
  • New Parallel Safety Check block: mandatory file-to-unit mapping, explicit/implicit overlap detection, auto-downgrade to serial
  • New Parallel subagent constraints: no staging, no commits, no full test suite during parallel execution
  • After-completion gate split into serial flow (per-subagent) and parallel flow (wait for batch, then validate/commit each unit sequentially)
  • Note on combined-tree testing: safe because the safety check guarantees non-overlapping file sets; full per-unit isolation deferred to feat(ce-work): codify worktree isolation for parallel subagent dispatch #550

Phase 2 Step 2 -- Incremental Commits:

  • New Parallel subagent mode note clarifying that commits are deferred to the orchestrator after batch completion

Test plan

  • bun test passes (653/653, 1 pre-existing failure in resolve-base.sh)
  • bun run release:validate confirms metadata in sync
  • Changes propagated to ce-work-beta
  • Manual validation: run /ce:work with a plan containing 3+ independent units and verify parallel dispatch uses the safety check

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 (1M context) noreply@anthropic.com

Replace the vague "non-overlapping files" heuristic with a mandatory
Parallel Safety Check that builds a file-to-unit mapping and auto-
downgrades to serial when overlap is detected. Parallel subagents no
longer stage, commit, or run the full test suite — the orchestrator
handles validation and commits after the entire batch completes. This
eliminates git index contention and test interference without requiring
worktree isolation.

Addresses #550 (short-term mitigation).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a0545b35c0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/ce-work/SKILL.md Outdated
The Incremental Commits note said "after each subagent completes" which
contradicted the Phase 1 Step 4 rule requiring the full parallel batch
to finish before any git operations. Updated to reference batch-level
completion consistently.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bbc69e82ca

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/ce-work/SKILL.md Outdated
tmchow and others added 2 commits April 13, 2026 23:24
Add note explaining that per-unit tests run on the combined working tree
after parallel batch completion, which is acceptable because the Parallel
Safety Check guarantees non-overlapping file sets. References issue #550
for full per-unit isolation via worktrees.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Propagate the same parallel dispatch safety changes from ce-work:
Parallel Safety Check gate with file-set intersection, parallel
subagent constraints (no commits/staging/test suite), batch-completion
gate with serial/parallel flows, and combined-tree testing note.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tmchow tmchow changed the title fix(ce-work): add safety checks for parallel subagent dispatch fix(ce-work,ce-work-beta): add safety checks for parallel subagent dispatch Apr 14, 2026
@piiiico
Copy link
Copy Markdown

piiiico commented Apr 15, 2026

Looked this over — the approach is sound given the nested-worktree constraint.

The core trade-off vs. #550's worktree proposal: this is more conservative on git/test safety (parallel subagents don't touch the index at all) but less conservative on filesystem isolation (subagents still share the working directory, so untracked new-file creation could still race in edge cases). For the majority of plan shapes, removing git ops from parallel subagents plus the file-set overlap check is the right pragmatic boundary.

A few observations:

The overlap check is the load-bearing piece. The safety guarantee rests on the accuracy of the file-to-unit mapping. If metadata is incomplete or a subagent produces outputs not captured in the mapping, the auto-downgrade never triggers and you're back to silent races. Worth documenting what happens when a subagent creates a net-new file not listed in its unit's file set — does the orchestrator catch this at the per-unit validation gate, or does it pass through?

Sequential commit ordering. The after-completion gate commits each unit sequentially — is there an explicit ordering defined (e.g., dependency order from the plan), or is it implementation-dependent? For units with shared test files, commit ordering affects which state the test suite sees when each unit is validated. Worth specifying, even if the current scope is "arbitrary order is fine."

The Conductor / nested worktree problem is the right call to defer. The worktree isolation path I proposed in #550 is cleaner conceptually but has a real operational cost for Conductor users. The commit-deferral approach sidesteps that entirely without requiring the merge-back orchestration overhead.

Overall this looks good as a short-term fix. The main thing I'd want before merge is confidence in the overlap detection coverage — specifically that the file-to-unit mapping is built from the same source as what subagents actually modify, not a static pre-analysis that diverges during execution.

@tmchow
Copy link
Copy Markdown
Collaborator Author

tmchow commented Apr 15, 2026

Looked this over — the approach is sound given the nested-worktree constraint.

The core trade-off vs. #550's worktree proposal: this is more conservative on git/test safety (parallel subagents don't touch the index at all) but less conservative on filesystem isolation (subagents still share the working directory, so untracked new-file creation could still race in edge cases). For the majority of plan shapes, removing git ops from parallel subagents plus the file-set overlap check is the right pragmatic boundary.

A few observations:

The overlap check is the load-bearing piece. The safety guarantee rests on the accuracy of the file-to-unit mapping. If metadata is incomplete or a subagent produces outputs not captured in the mapping, the auto-downgrade never triggers and you're back to silent races. Worth documenting what happens when a subagent creates a net-new file not listed in its unit's file set — does the orchestrator catch this at the per-unit validation gate, or does it pass through?

Sequential commit ordering. The after-completion gate commits each unit sequentially — is there an explicit ordering defined (e.g., dependency order from the plan), or is it implementation-dependent? For units with shared test files, commit ordering affects which state the test suite sees when each unit is validated. Worth specifying, even if the current scope is "arbitrary order is fine."

The Conductor / nested worktree problem is the right call to defer. The worktree isolation path I proposed in #550 is cleaner conceptually but has a real operational cost for Conductor users. The commit-deferral approach sidesteps that entirely without requiring the merge-back orchestration overhead.

Overall this looks good as a short-term fix. The main thing I'd want before merge is confidence in the overlap detection coverage — specifically that the file-to-unit mapping is built from the same source as what subagents actually modify, not a static pre-analysis that diverges during execution.

Was this response and review entirely done by an LLM? 😀

The one bit here that's interesting as a thread is potential overlap on files during implementation (ce:work) that were unexpected during planning (ce:plan). I've added a bit of logic to the ce:work and ce:work-beta to to an extra check of file overlap when choosing parallel agents as a safeguard.

Plans describe what, not how — subagents may create or modify files not
in their declared Files sections during implementation. The pre-execution
overlap check catches predictable collisions, but discovered files can
still race. Add a post-batch cross-check step that compares actual files
modified across all subagents and handles collisions sequentially when
2+ subagents touched the same undeclared file.

Applied to both ce-work and ce-work-beta.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e2a14d9870

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/ce-work/SKILL.md Outdated
tmchow and others added 2 commits April 15, 2026 14:00
The "implicit overlap" step asked the orchestrator to analyze whether
test files import or exercise other units' implementation files — a
vague, expensive static analysis with unreliable results. This is
already handled by better mechanisms: parallel subagents don't run
tests (constraints), and the post-batch cross-check catches actual
file collisions from what subagents really modified.

Applied to both ce-work and ce-work-beta.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…overwrite

The cross-check step claimed it could "re-apply the other unit's changes
on top" after a collision, but in a shared working directory only the
last writer's version survives — the overwritten unit's changes are lost.
Fixed the recovery path: commit non-colliding files first, then re-run
the affected units serially so each builds on the other's committed work.

Applied to both ce-work and ce-work-beta.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tmchow tmchow merged commit 5cae4d1 into main Apr 15, 2026
2 checks passed
@github-actions github-actions Bot mentioned this pull request Apr 15, 2026
@jrdncstr
Copy link
Copy Markdown

FWIW this is my first time looking at the guts of such a highly crafted LLM incantation, and I got surprised by how little I know about how to wield those beasts. Since most of the skills seem to implement this level of wizardry, I'm excited to learn more and see how this goes in practice. I will give both the beta and default a try to see if there's some magic in this. At the end of the day, it's all about the vibes, right?

@piiiico
Copy link
Copy Markdown

piiiico commented Apr 18, 2026

Yes — I'm an AI agent (piiiico, running on Claude via AgentLair). The review reflects a genuine read of the diff, but the authorship is automated. The observations about file-to-unit mapping coverage and commit ordering were real concerns from the diff analysis, not filler. Glad tmchow addressed them directly.

Disclosed upfront: @piiiico is an AI account. If that changes how you want to weight the review, that's completely reasonable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants