feat(proof, ce-brainstorm, ce-plan, ce-ideate): HITL review-loop mode by tmchow · Pull Request #580 · EveryInc/compound-engineering-plugin

tmchow · 2026-04-17T04:57:27Z

Summary

Iterative Proof review is now a single shared workflow. Users annotate docs in Proof, and the agent reads the comments, applies fixes, and syncs the reviewed markdown back to disk. ce-brainstorm, ce-plan, and ce-ideate all plumb into the same mode through one consistent handoff menu.

The work also produces a reusable set of design rules for blocking question menus. AGENTS.md picks those up as a checklist so future menus stay consistent by default.

Loop at a glance

sequenceDiagram
    actor U as User
    participant T as Terminal (agent)
    participant P as Proof (web)

    T->>P: Upload draft
    T-->>U: Share Proof link
    U->>P: Read + leave comments
    U-->>T: "Done with feedback"
    P-->>T: Comments pulled in
    T->>P: Apply edits + reply in-thread
    T-->>U: Report + "What's next?"

    loop until Save or Pause
        U-->>T: Re-check / Save / Pause
    end

    T->>T: On Save, write reviewed markdown to local file

The terminal side takes direction and reports; the Proof side is where the user reads and comments. Full API details live in plugins/compound-engineering/skills/proof/references/hitl-review.md.

Design decisions

Dual entry. The human-in-the-loop (HITL) mode serves both upstream-skill delegation (source path, doc title, recommended next step passed in) and direct user invocation. Both paths share the upload, review-loop, and end-sync phases, so there is one implementation of the loop.
Retry before surfacing upload failure. A single silent retry hides transient flakiness without masking a sustained outage.
Skip end-sync when Proof content is unchanged. No apply step is offered when there is nothing to apply.
Offer a pull after a declined sync or done_for_now. If the user edited in Proof but declined to sync, the local file is silently stale. The agent offers to pull the Proof content, and a one-line warning when the local copy is stale keeps the next handoff from reading old content.
Self-contained menu labels. Some harnesses render only the option label, not the accompanying description, so the label alone must convey what the option does. Every Proof handoff uses the same label: Open in Proof (web app), review and comment to iterate with the agent.
Routing coupled to verbatim labels. Each display label has a matching **If user selects "X":** routing block. Renaming one without the other silently breaks routing, so the two change together.
Fixed agent identity. Proof API calls always use ai:compound-engineering. Callers do not override, because doing so risks transferring document ownership and blocking the user from claiming the doc into their own Proof library.
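
The retry and skip-sync decisions above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: `upload_with_one_retry` and `needs_end_sync` are hypothetical names, and the `upload` callable stands in for whatever performs the Proof upload.

```python
from typing import Callable


def upload_with_one_retry(upload: Callable[[], str]) -> str:
    """Call upload(); retry exactly once on failure, then let the error surface."""
    try:
        return upload()
    except Exception:
        # The first failure is assumed to be transient and is retried silently.
        # A second failure propagates, so a sustained outage is not masked.
        return upload()


def needs_end_sync(uploaded_markdown: str, current_markdown: str) -> bool:
    """Offer the end-sync step only when Proof content changed since upload."""
    return uploaded_markdown != current_markdown
```

A caller would only show the apply prompt when `needs_end_sync(...)` returns `True`, which matches the "nothing to apply" rule.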

AGENTS.md checklist

The new "Interactive Question Tool Design" section in plugins/compound-engineering/AGENTS.md captures 10 rules: self-contained labels, 4-option cap, no "I'll come back" no-ops, third-person agent references, user-intent phrasing, question stem as teaching surface, routing-label coupling, front-loaded distinguishers, artifact naming when ambiguous, and voice consistency.
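
Two of these rules (the 4-option cap and routing-label coupling) lend themselves to a mechanical check. The sketch below is an assumption about how such a check could look, not something this PR ships: `check_menu` is a hypothetical helper, and it assumes routing blocks use the `**If user selects "X":**` form quoted in the design decisions.

```python
import re
from typing import List

MAX_OPTIONS = 4  # the 4-option cap from the checklist

# Matches routing headers of the form: **If user selects "Label":**
ROUTING_RE = re.compile(r'\*\*If user selects "(.+?)":\*\*')


def check_menu(labels: List[str], skill_text: str) -> List[str]:
    """Return a list of problems; an empty list means the menu passes."""
    problems = []
    if len(labels) > MAX_OPTIONS:
        problems.append(f"menu has {len(labels)} options; cap is {MAX_OPTIONS}")
    routed = set(ROUTING_RE.findall(skill_text))
    # Every display label needs a routing block with the verbatim label.
    for label in labels:
        if label not in routed:
            problems.append(f'no routing block for label "{label}"')
    # A routing block whose label is no longer offered is also a mismatch.
    for label in sorted(routed - set(labels)):
        problems.append(f'routing block for unknown label "{label}"')
    return problems
```

Renaming a label without touching its routing block produces two findings at once (a missing block and an orphaned block), which is the silent breakage the coupling rule is meant to prevent.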

- Add dual-entry HITL mode (caller or direct user invocation)
- Rename Proof option to "Open in Proof" with clearer description across all callers
- Skip end-sync prompt when Proof content is unchanged since upload
- Retry once on upload failure before surfacing it to the user
- Unify done_for_now behavior across callers (return to options, not closing summary)
- Offer pull on proceeded+declined and done_for_now; warn when local is stale
- Remove session log from ce-ideate (overly detailed)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Extract learnings from testing the Proof HITL handoff: 10 rules for blocking-question menus (label self-containment, 4-option cap, third-person agent references, user-intent labels, voice consistency, etc.), plus a Known External Limitations note on the Proof ghost-agent auto-join so we don't reintroduce the ownerId workaround.

Real-world HITL testing surfaced UX gaps in the Proof handoff flow and its callers. Apply the Interactive Question Tool Design rules (newly codified in AGENTS.md): make labels self-contained so they work in harnesses that hide descriptions, add missing question stems, normalize voice and label casing, drop the no-op "Keep working" option, rename the opaque "Both" label, and add authoring notes to the Proof skill itself (structural list-item deletion caveat, /edit/v2 op body shapes, jq parens gotcha).

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 90ccde8192

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

The SKILL.md insert_after snippet used a plural `blocks` array that would fail /edit/v2 server validation. Updated to the singular `block` object shape documented in references/hitl-review.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

tmchow · 2026-04-17T05:42:28Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bf15785620

…oice

Address PR review feedback on HITL review-loop docs:

- Add `Idempotency-Key: $(uuidgen)` to the `mutate()` recipe so every mutation is safe to retry on transient failures and works when `/state.contract.idempotencyRequired` is true.
- Rewrite the Phase 4 "next-signal" menu to third-person imperative voice and drop the "I'll come back" wait-style wrapper that the blocking question tool already subsumes (per the Interactive Question Tool Design rules in plugins/compound-engineering/AGENTS.md).
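
As a rough illustration of why an idempotency key makes a mutation safe to retry, the sketch below generates one key per logical mutation and reuses it on the retry. It is written under assumptions: `mutation_headers` and `mutate_with_retry` are hypothetical helpers, `send` stands in for the actual HTTP call, and reusing the key across attempts is the common convention for such headers rather than something this PR states about the Proof API.

```python
import uuid
from typing import Callable, Dict, Optional


def mutation_headers(idempotency_key: Optional[str] = None) -> Dict[str, str]:
    """Build headers for a mutation; a fresh UUID is used when no key is given."""
    return {
        "Content-Type": "application/json",
        "Idempotency-Key": idempotency_key or str(uuid.uuid4()),
    }


def mutate_with_retry(send: Callable[[Dict[str, str], dict], dict], body: dict) -> dict:
    """Send a mutation, retrying once with the SAME key on a transient failure."""
    headers = mutation_headers()  # one key per logical mutation
    try:
        return send(headers, body)
    except ConnectionError:
        # Same headers, same key: a server honoring the key can deduplicate
        # the retry instead of applying the edit twice.
        return send(headers, body)
```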

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 98ec05d2ab

…g pull

Re-run document-review after HITL materially edits the plan so the 5.3.8 handoff gate is not bypassed on localSynced:true or on any post-decline pull. Rewrite the Proof pull recipe to stream markdown via `jq -jr` directly to the sync target; `$(...)` was stripping trailing newlines and making the byte-for-byte claim false. Apply the same fix to the HITL Phase 5 end-sync.
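
Shell command substitution removes every trailing newline from the captured output, which is why a capture-then-write recipe cannot be byte-for-byte. The sketch below mimics that effect in Python to show the difference; it is an illustration only, and `via_command_substitution` and `write_exact` are hypothetical names rather than anything in the Proof skill.

```python
import os
import tempfile


def via_command_substitution(markdown: str) -> str:
    """Mimic shell $(...) capture, which drops all trailing newlines."""
    return markdown.rstrip("\n")


def write_exact(path: str, markdown: str) -> None:
    """Write the markdown bytes to the sync target without altering them."""
    with open(path, "wb") as f:
        f.write(markdown.encode("utf-8"))


doc = "# Plan\n\nStep one.\n"

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "plan.md")

    write_exact(target, via_command_substitution(doc))
    with open(target, "rb") as f:
        print(f.read() == doc.encode("utf-8"))  # False: final newline was lost

    write_exact(target, doc)
    with open(target, "rb") as f:
        print(f.read() == doc.encode("utf-8"))  # True: bytes match the source
```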

tmchow and others added 3 commits April 16, 2026 20:08

chatgpt-codex-connector Bot reviewed Apr 17, 2026

Comment thread plugins/compound-engineering/skills/proof/SKILL.md Outdated

tmchow changed the title ~~fix(proof, ce-brainstorm, ce-plan, ce-ideate): polish HITL handoff UX~~ feat(proof, ce-brainstorm, ce-plan, ce-ideate): HITL review-loop mode Apr 17, 2026

chatgpt-codex-connector Bot reviewed Apr 17, 2026

Comment thread plugins/compound-engineering/skills/proof/references/hitl-review.md

Comment thread plugins/compound-engineering/skills/proof/references/hitl-review.md Outdated

chatgpt-codex-connector Bot reviewed Apr 17, 2026

Comment thread plugins/compound-engineering/skills/ce-plan/references/plan-handoff.md Outdated

Comment thread plugins/compound-engineering/skills/proof/SKILL.md Outdated

tmchow merged commit e7cf0ae into main Apr 17, 2026
2 checks passed

github-actions Bot mentioned this pull request Apr 17, 2026

chore: release main #586

Merged

SiTaggart mentioned this pull request Apr 17, 2026

Sync upstream: HITL review loop + new skills (v2.60→v2.68) SiTaggart/.agents#9

Merged

5 tasks

tmchow merged 6 commits into main from tmchow/proof-hitl
