feat(proof, ce-brainstorm, ce-plan, ce-ideate): HITL review-loop mode#580

Merged
tmchow merged 6 commits into main from tmchow/proof-hitl on Apr 17, 2026

Collaborator

@tmchow tmchow commented Apr 17, 2026

Summary

Iterative Proof review is now a single shared workflow. Users annotate docs in Proof, and the agent reads the comments, applies fixes, and syncs the reviewed markdown back to disk. ce-brainstorm, ce-plan, and ce-ideate all plumb into the same mode through one consistent handoff menu.

The work also produces a reusable set of design rules for blocking question menus. AGENTS.md picks those up as a checklist so future menus stay consistent by default.

Loop at a glance

sequenceDiagram
    actor U as User
    participant T as Terminal (agent)
    participant P as Proof (web)

    T->>P: Upload draft
    T-->>U: Share Proof link
    U->>P: Read + leave comments
    U-->>T: "Done with feedback"
    P-->>T: Comments pulled in
    T->>P: Apply edits + reply in-thread
    T-->>U: Report + "What's next?"

    loop until Save or Pause
        U-->>T: Re-check / Save / Pause
    end

    T->>T: On Save, write reviewed markdown to local file

The terminal side takes direction and reports; the Proof side is where the user reads and comments. Full API details live in plugins/compound-engineering/skills/proof/references/hitl-review.md.

Design decisions

  • Dual entry. The human-in-the-loop (HITL) mode serves both upstream-skill delegation (source path, doc title, recommended next step passed in) and direct user invocation. Both paths share the upload, review-loop, and end-sync phases, so there is one implementation of the loop.
  • Retry before surfacing upload failure. A single silent retry hides transient flakiness without masking a sustained outage.
  • Skip end-sync when Proof content is unchanged. No apply step is offered when there is nothing to apply.
  • Offer a pull on declined proceed or done_for_now. If the user edited in Proof but declined to sync, the local file is silently stale. Offering a pull, with a one-line warning when the local copy is stale, keeps the next handoff from reading old content.
  • Self-contained menu labels. Some harnesses render only the option label, not the accompanying description, so the label alone must convey what the option does. Every Proof handoff uses the same label: Open in Proof (web app), review and comment to iterate with the agent.
  • Routing coupled to verbatim labels. Each display label has a matching **If user selects "X":** routing block. Renaming one without the other silently breaks routing, so the two change together.
  • Fixed agent identity. Proof API calls always use ai:compound-engineering. Callers do not override, because doing so risks transferring document ownership and blocking the user from claiming the doc into their own Proof library.
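
The retry and skip-end-sync decisions above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: `UploadError`, `upload_with_single_retry`, and `needs_end_sync` are hypothetical names, and the upload step is passed in as a plain callable so the sketch does not assume anything about the Proof API.

```python
import hashlib
from typing import Callable


class UploadError(Exception):
    """Hypothetical error raised by the upload callable on failure."""


def upload_with_single_retry(upload: Callable[[], str]) -> str:
    """Call upload(); on failure retry exactly once, then let the error surface."""
    try:
        return upload()
    except UploadError:
        # One silent retry absorbs a transient blip. If the second attempt
        # also fails, the exception propagates, so a sustained outage is
        # still reported to the user.
        return upload()


def _digest(text: str) -> str:
    """Stable fingerprint of a markdown document."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def needs_end_sync(uploaded_markdown: str, current_markdown: str) -> bool:
    """Return True only when the Proof content differs from what was uploaded."""
    return _digest(uploaded_markdown) != _digest(current_markdown)
```

With this shape, the end-sync prompt would only be shown when `needs_end_sync(...)` returns `True`, which matches the "no apply step when there is nothing to apply" rule.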

AGENTS.md checklist

The new "Interactive Question Tool Design" section in plugins/compound-engineering/AGENTS.md captures 10 rules: self-contained labels, 4-option cap, no "I'll come back" no-ops, third-person agent references, user-intent phrasing, question stem as teaching surface, routing-label coupling, front-loaded distinguishers, artifact naming when ambiguous, and voice consistency.
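
Two of these rules, the 4-option cap and routing-label coupling, are mechanical enough to check automatically. The sketch below is one possible lint, assuming routing blocks follow the `**If user selects "X":**` pattern quoted in the design decisions; `lint_menu` and the sample labels in the usage note are hypothetical and not part of the repository.

```python
import re

MAX_OPTIONS = 4  # the 4-option cap from the checklist

# Matches routing headers of the form: **If user selects "Label":**
ROUTING_RE = re.compile(r'\*\*If user selects "([^"]+)":\*\*')


def lint_menu(labels: list[str], skill_text: str) -> list[str]:
    """Return a list of problems; an empty list means the menu passes."""
    problems = []
    if len(labels) > MAX_OPTIONS:
        problems.append(f"menu has {len(labels)} options; cap is {MAX_OPTIONS}")

    routed = set(ROUTING_RE.findall(skill_text))
    # Every displayed label needs a routing block with the same verbatim text.
    for label in labels:
        if label not in routed:
            problems.append(f"no routing block for label: {label!r}")
    # A routing block with no label is the other half of a broken rename.
    for label in sorted(routed - set(labels)):
        problems.append(f"routing block without a menu label: {label!r}")
    return problems
```

For example, renaming a label from "Save reviewed doc" to "Save doc" without touching its routing block would produce two findings: one for the unrouted new label and one for the orphaned old routing block.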



tmchow and others added 3 commits April 16, 2026 20:08
- Add dual-entry HITL mode (caller or direct user invocation)
- Rename Proof option to "Open in Proof" with clearer description across all callers
- Skip end-sync prompt when Proof content is unchanged since upload
- Retry once on upload failure before surfacing it to the user
- Unify done_for_now behavior across callers (return to options, not closing summary)
- Offer pull on proceeded+declined and done_for_now; warn when local is stale
- Remove session log from ce-ideate (overly detailed)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Extract learnings from testing the Proof HITL handoff: 10 rules for
blocking-question menus (label self-containment, 4-option cap,
third-person agent references, user-intent labels, voice consistency,
etc.), plus a Known External Limitations note on the Proof ghost-agent
auto-join so we don't reintroduce the ownerId workaround.

Real-world HITL testing surfaced UX gaps in the Proof handoff flow and
its callers. Apply the Interactive Question Tool Design rules (newly
codified in AGENTS.md): make labels self-contained so they work in
harnesses that hide descriptions, add missing question stems, normalize
voice and label casing, drop the no-op "Keep working" option, rename
the opaque "Both" label, and add authoring notes to the Proof skill
itself (structural list-item deletion caveat, /edit/v2 op body shapes,
jq parens gotcha).

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 90ccde8192

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/proof/SKILL.md Outdated
tmchow changed the title from "fix(proof, ce-brainstorm, ce-plan, ce-ideate): polish HITL handoff UX" to "feat(proof, ce-brainstorm, ce-plan, ce-ideate): HITL review-loop mode" on Apr 17, 2026
The SKILL.md insert_after snippet used a plural `blocks` array that
would fail /edit/v2 server validation. Updated to the singular `block`
object shape documented in references/hitl-review.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Collaborator Author

tmchow commented Apr 17, 2026

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bf15785620

Comment thread plugins/compound-engineering/skills/proof/references/hitl-review.md
Comment thread plugins/compound-engineering/skills/proof/references/hitl-review.md Outdated
…oice

Address PR review feedback on HITL review-loop docs:

- Add `Idempotency-Key: $(uuidgen)` to the `mutate()` recipe so every
  mutation is safe to retry on transient failures and works when
  `/state.contract.idempotencyRequired` is true.
- Rewrite the Phase 4 "next-signal" menu to third-person imperative
  voice and drop the "I'll come back" wait-style wrapper that the
  blocking question tool already subsumes (per the Interactive Question
  Tool Design rules in plugins/compound-engineering/AGENTS.md).
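
The idempotency change in the first bullet can be illustrated with a small Python analogue of the shell recipe. This is a sketch under assumptions: the real `mutate()` recipe is shell and uses `$(uuidgen)`, while the `send` callable, `TransientError`, and the retry count here are hypothetical stand-ins. The sketch also assumes the same key is reused across retries of one mutation, which is how idempotency keys are normally used so that a repeated request is not applied twice.

```python
import uuid
from typing import Callable, Mapping


class TransientError(Exception):
    """Hypothetical error for a failure that is worth retrying."""


def mutate(
    send: Callable[[Mapping[str, str], dict], dict],
    body: dict,
    max_attempts: int = 2,
) -> dict:
    """Send one mutation, retrying transient failures with the same key."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    # Generate the key once per logical mutation, not once per attempt,
    # so the server can recognise a retry as a duplicate of the first try.
    headers = {"Idempotency-Key": str(uuid.uuid4())}

    last_error = None
    for _ in range(max_attempts):
        try:
            return send(headers, body)
        except TransientError as error:
            last_error = error
    raise last_error
```

Sending the header on every mutation also covers the case named in the commit message, where `/state.contract.idempotencyRequired` is true and the key is mandatory rather than optional.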

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 98ec05d2ab

Comment thread plugins/compound-engineering/skills/ce-plan/references/plan-handoff.md Outdated
Comment thread plugins/compound-engineering/skills/proof/SKILL.md Outdated
…g pull

Re-run document-review after HITL materially edits the plan so the 5.3.8
handoff gate is not bypassed on localSynced:true or on any post-decline pull.

Rewrite the Proof pull recipe to stream markdown via `jq -jr` directly to the
sync target; $(...) was stripping trailing newlines and making the byte-
for-byte claim false. Apply the same fix to the HITL Phase 5 end-sync.
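
The trailing-newline problem described in this commit message can be modelled in a few lines. The sketch below is a pure-Python model, not the actual recipe: `command_substitution` imitates the shell rule that `$(...)` removes all trailing newlines, and `echo` imitates writing the captured value back with a single newline appended. How the original recipe wrote the captured value is an assumption here; the function names are hypothetical.

```python
def command_substitution(output: str) -> str:
    """Model of shell $(...): every trailing newline is removed."""
    return output.rstrip("\n")


def echo(text: str) -> str:
    """Model of `echo "$text"`: the text followed by exactly one newline."""
    return text + "\n"


def capture_then_write(markdown: str) -> str:
    """Old approach: capture into a variable, then write the variable out."""
    return echo(command_substitution(markdown))


def stream(markdown: str) -> str:
    """New approach: pipe the output straight to the file, bytes unchanged."""
    return markdown


def is_byte_for_byte(original: str, written: str) -> bool:
    """True when the written file matches the source exactly."""
    return original.encode("utf-8") == written.encode("utf-8")
```

Under this model, a document ending in exactly one newline survives the capture-and-write path, but one ending in a blank line (two newlines) or in no newline at all does not, which is why streaming is the only variant that supports a byte-for-byte claim.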
@tmchow tmchow merged commit e7cf0ae into main Apr 17, 2026
2 checks passed
@github-actions github-actions Bot mentioned this pull request Apr 17, 2026