feat(proof, ce-brainstorm, ce-plan, ce-ideate): HITL review-loop mode #580
- Add dual-entry HITL mode (caller or direct user invocation)
- Rename Proof option to "Open in Proof" with clearer description across all callers
- Skip end-sync prompt when Proof content is unchanged since upload
- Retry once on upload failure before surfacing it to the user
- Unify done_for_now behavior across callers (return to options, not closing summary)
- Offer pull on proceeded+declined and done_for_now; warn when local is stale
- Remove session log from ce-ideate (overly detailed)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Extract learnings from testing the Proof HITL handoff: 10 rules for blocking-question menus (label self-containment, 4-option cap, third-person agent references, user-intent labels, voice consistency, etc.), plus a Known External Limitations note on the Proof ghost-agent auto-join so we don't reintroduce the ownerId workaround.
Real-world HITL testing surfaced UX gaps in the Proof handoff flow and its callers. Apply the Interactive Question Tool Design rules (newly codified in AGENTS.md): make labels self-contained so they work in harnesses that hide descriptions, add missing question stems, normalize voice and label casing, drop the no-op "Keep working" option, rename the opaque "Both" label, and add authoring notes to the Proof skill itself (structural list-item deletion caveat, /edit/v2 op body shapes, jq parens gotcha).
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 90ccde8192
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
The SKILL.md insert_after snippet used a plural `blocks` array that would fail /edit/v2 server validation. Updated to the singular `block` object shape documented in references/hitl-review.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bf15785620
…oice

Address PR review feedback on HITL review-loop docs:

- Add `Idempotency-Key: $(uuidgen)` to the `mutate()` recipe so every mutation is safe to retry on transient failures and works when `/state.contract.idempotencyRequired` is true.
- Rewrite the Phase 4 "next-signal" menu to third-person imperative voice and drop the "I'll come back" wait-style wrapper that the blocking question tool already subsumes (per the Interactive Question Tool Design rules in plugins/compound-engineering/AGENTS.md).
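As a rough illustration of why the idempotency key matters for retries, here is a minimal sketch of a `mutate()`-style helper. It is not the recipe from the skill: the argument order, the `HTTP_CLIENT` override, the two-attempt limit, and the `/proc` fallback for hosts without `uuidgen` are all assumptions made for this example. The point it shows is that the key is generated once per logical mutation and reused on the retry, so the server can deduplicate.

```shell
# Sketch only: argument order, HTTP_CLIENT, and the retry count are assumptions.
mutate() {
  local method=$1 url=$2 body=$3
  local key attempt

  # One key per logical mutation. The /proc fallback is an assumption for
  # Linux hosts that lack uuidgen.
  key=$(uuidgen 2>/dev/null || cat /proc/sys/kernel/random/uuid)

  for attempt in 1 2; do
    # The same key is sent on both attempts, so a retried request can be
    # recognized as a duplicate of the first.
    if "${HTTP_CLIENT:-curl}" -fsS -X "$method" "$url" \
         -H "Content-Type: application/json" \
         -H "Idempotency-Key: $key" \
         -d "$body"; then
      return 0
    fi
  done
  return 1
}
```

Generating the key inside the retry loop instead would give each attempt a fresh key, which defeats the purpose of sending one.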
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 98ec05d2ab
…g pull

Re-run document-review after HITL materially edits the plan so the 5.3.8 handoff gate is not bypassed on localSynced:true or on any post-decline pull. Rewrite the Proof pull recipe to stream markdown via `jq -jr` directly to the sync target; $(...) was stripping trailing newlines and making the byte-for-byte claim false. Apply the same fix to the HITL Phase 5 end-sync.
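The newline-stripping behavior is easy to reproduce locally. The sketch below uses a made-up JSON payload (the `markdown` field name and the sample content are assumptions, not the Proof API's response shape) to compare capturing the value through `$(...)` with streaming it straight to a file via `jq -jr`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical response body; the "markdown" field name is an assumption.
response='{"markdown":"# Plan\n\nBody text\n\n"}'
workdir=$(mktemp -d)

# Lossy: command substitution drops every trailing newline from the value.
captured=$(printf '%s' "$response" | jq -r '.markdown')
printf '%s' "$captured" > "$workdir/lossy.md"

# Faithful: -j emits the raw string with no newline appended, and the
# redirect writes it to the target without passing through a variable.
printf '%s' "$response" | jq -jr '.markdown' > "$workdir/faithful.md"

lossy_bytes=$(wc -c < "$workdir/lossy.md" | tr -d ' ')
faithful_bytes=$(wc -c < "$workdir/faithful.md" | tr -d ' ')
echo "lossy=$lossy_bytes faithful=$faithful_bytes"
```

The two files differ in size because the captured copy has lost the document's trailing blank line, which is exactly the case where a byte-for-byte comparison against the remote content would fail.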
Summary
Iterative Proof review is now a single shared workflow. Users annotate docs in Proof, and the agent reads the comments, applies fixes, and syncs the reviewed markdown back to disk.
`ce-brainstorm`, `ce-plan`, and `ce-ideate` all plumb into the same mode through one consistent handoff menu.
The work also produces a reusable set of design rules for blocking question menus.
`AGENTS.md` picks those up as a checklist so future menus stay consistent by default.
Loop at a glance
    sequenceDiagram
        actor U as User
        participant T as Terminal (agent)
        participant P as Proof (web)
        T->>P: Upload draft
        T-->>U: Share Proof link
        U->>P: Read + leave comments
        U-->>T: "Done with feedback"
        P-->>T: Comments pulled in
        T->>P: Apply edits + reply in-thread
        T-->>U: Report + "What's next?"
        loop until Save or Pause
            U-->>T: Re-check / Save / Pause
        end
        T->>T: On Save, write reviewed markdown to local file

The terminal side takes direction and reports; the Proof side is where the user reads and comments. Full API details live in `plugins/compound-engineering/skills/proof/references/hitl-review.md`.
Design decisions
- … `done_for_now`. If the user edited in Proof but declined to sync, the local file is silently stale. A one-line warning prevents the next handoff from reading old content.
- … Open in Proof (web app), review and comment to iterate with the agent.
- … `**If user selects "X":**` routing block. Renaming one without the other silently breaks routing, so the two change together.
- … `ai:compound-engineering`. Callers do not override, because doing so risks transferring document ownership and blocking the user from claiming the doc into their own Proof library.
AGENTS.md checklist
The new "Interactive Question Tool Design" section in `plugins/compound-engineering/AGENTS.md` captures 10 rules: self-contained labels, 4-option cap, no "I'll come back" no-ops, third-person agent references, user-intent phrasing, question stem as teaching surface, routing-label coupling, front-loaded distinguishers, artifact naming when ambiguous, and voice consistency.
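Two of these rules are mechanical enough to check with a script. The following is a hypothetical lint, not something this PR ships: the `lint_menu` name, the one-label-per-line input format, and the specific phrases it flags ("I'll come back", "Keep working") are assumptions chosen for illustration. The remaining rules concern wording and intent and still need a human read.

```shell
# Hypothetical lint: reads option labels on stdin, one per line.
# Checks only the 4-option cap and wait-style no-op options.
lint_menu() {
  local count=0 status=0 label
  while IFS= read -r label; do
    [ -z "$label" ] && continue
    count=$((count + 1))
    case "$label" in
      *"I'll come back"*|*"Keep working"*)
        echo "no-op wait-style option: $label" >&2
        status=1
        ;;
    esac
  done
  if [ "$count" -gt 4 ]; then
    echo "too many options: $count (cap is 4)" >&2
    status=1
  fi
  return "$status"
}
```

A call such as `printf '%s\n' "Open in Proof" "Save to disk" | lint_menu` exits zero, while a five-option menu or one containing a wait-style label exits non-zero with a message on stderr.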