feat(agent): add optional plan-before-execute turn for complex codergen sessions#137
Conversation
Agent-Logs-Url: https://github.com/2389-research/tracker/sessions/13e98b22-007e-4fb3-9f06-d88c3546f4f5 Co-authored-by: clintecker <26985+clintecker@users.noreply.github.com>
Agent-Logs-Url: https://github.com/2389-research/tracker/sessions/13e98b22-007e-4fb3-9f06-d88c3546f4f5 Co-authored-by: clintecker <26985+clintecker@users.noreply.github.com>
Agent-Logs-Url: https://github.com/2389-research/tracker/sessions/13e98b22-007e-4fb3-9f06-d88c3546f4f5 Co-authored-by: clintecker <26985+clintecker@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Adds an opt-in “plan before execute” phase to the agent session loop to reduce thrash on multi-step codergen tasks by injecting a single planning-only LLM call before normal tool-using turns.
Changes:
- Introduces
SessionConfig.PlanBeforeExecute(defaultfalse) and wires a one-time planning call intoSession.Run. - Adds codergen config mapping from graph/node attrs (
plan_before_execute, plus node aliasplan) and tests for the mapping. - Updates tests and changelog to cover the new planning behavior and defaults.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
agent/config.go |
Adds PlanBeforeExecute to SessionConfig. |
agent/config_test.go |
Verifies PlanBeforeExecute default is false. |
agent/session.go |
Runs the optional planning turn before the main turn loop. |
agent/session_run.go |
Implements planning prompt constants, planning-call path without tools, and context injection. |
agent/session_test.go |
Adds unit tests ensuring planning call is injected/omitted correctly and tools are suppressed during planning. |
pipeline/handlers/codergen.go |
Maps pipeline attrs into SessionConfig.PlanBeforeExecute. |
pipeline/handlers/codergen_test.go |
Adds tests for node attr + alias mapping. |
CHANGELOG.md |
Documents the new opt-in plan-before-execute phase. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| planBeforeExecutePrompt = "Before you start executing this task, outline your plan: what files to examine, what changes to make, and in what order. Do not call tools yet." | ||
| executeAfterPlanPrompt = "Now execute the task using the plan above. Start making progress with tool calls." |
There was a problem hiding this comment.
The planning prompts don’t account for sessions using structured output (ResponseFormat/strict json_schema). Since doPlanningCall reuses buildResponseFormat(), the planning response may be forced into a JSON schema that was designed for the final execution output, causing provider-side schema failures or invalid planning output. Consider either (a) omitting ResponseFormat for the planning call, or (b) adapting planBeforeExecutePrompt to explicitly instruct the model how to express the plan within the configured response format/schema (e.g., a dedicated plan field).
| planBeforeExecutePrompt = "Before you start executing this task, outline your plan: what files to examine, what changes to make, and in what order. Do not call tools yet." | |
| executeAfterPlanPrompt = "Now execute the task using the plan above. Start making progress with tool calls." | |
| planBeforeExecutePrompt = "Before you start executing this task, outline your plan: what files to examine, what changes to make, and in what order. Do not call tools yet. If a structured response format or JSON schema is active for this session, express the plan using that required format instead of free-form text: put the plan in the most appropriate existing text/string field(s), keep it concise, and do not invent new fields that are not allowed by the schema." | |
| executeAfterPlanPrompt = "Now execute the task using the plan above. Start making progress with tool calls. If the prior plan was expressed through a structured response format or JSON schema, continue based on that plan rather than repeating it." |
| result.Usage = result.Usage.Add(resp.Usage) | ||
| if resp.Usage.EstimatedCost == 0 { | ||
| result.Usage.EstimatedCost += llm.EstimateCost(s.config.Model, resp.Usage) | ||
| } | ||
| s.messages = append(s.messages, resp.Message) |
There was a problem hiding this comment.
maybeRunPlanningTurn appends the planning response directly into the conversation without handling the case where the model returns tool calls (despite the prompt). If resp.ToolCalls() is non-empty / finish reason indicates tool calls, those calls will never be executed (planning runs outside the normal tool-execution loop), leaving the conversation in an inconsistent state for the first execution turn. Add an explicit guard here (error out, or strip tool-call parts and re-prompt for a plain plan) before appending resp.Message.
| // and node-level overrides. Node-level "plan" is accepted as a shorthand alias. | ||
| func (h *CodergenHandler) applyPlanningConfig(config *agent.SessionConfig, node *pipeline.Node) { | ||
| if v, ok := h.graphAttrs["plan_before_execute"]; ok { | ||
| config.PlanBeforeExecute = v == "true" | ||
| } | ||
| if v, ok := node.Attrs["plan_before_execute"]; ok { | ||
| config.PlanBeforeExecute = v == "true" | ||
| } | ||
| if v, ok := node.Attrs["plan"]; ok { |
There was a problem hiding this comment.
applyPlanningConfig lets the shorthand alias plan override an explicit plan_before_execute value on the same node because it’s applied last. That makes configuration ambiguous (e.g., plan_before_execute="false" + plan="true" ends up enabled). Prefer treating plan as an alias only when plan_before_execute is not present, or define and enforce a clear precedence order (typically the explicit key wins).
| // and node-level overrides. Node-level "plan" is accepted as a shorthand alias. | |
| func (h *CodergenHandler) applyPlanningConfig(config *agent.SessionConfig, node *pipeline.Node) { | |
| if v, ok := h.graphAttrs["plan_before_execute"]; ok { | |
| config.PlanBeforeExecute = v == "true" | |
| } | |
| if v, ok := node.Attrs["plan_before_execute"]; ok { | |
| config.PlanBeforeExecute = v == "true" | |
| } | |
| if v, ok := node.Attrs["plan"]; ok { | |
| // and node-level overrides. Node-level "plan" is accepted as a shorthand alias | |
| // only when the explicit "plan_before_execute" attribute is not present. | |
| func (h *CodergenHandler) applyPlanningConfig(config *agent.SessionConfig, node *pipeline.Node) { | |
| if v, ok := h.graphAttrs["plan_before_execute"]; ok { | |
| config.PlanBeforeExecute = v == "true" | |
| } | |
| if v, ok := node.Attrs["plan_before_execute"]; ok { | |
| config.PlanBeforeExecute = v == "true" | |
| } else if v, ok := node.Attrs["plan"]; ok { |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7d83ae8daa
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| return err | ||
| } | ||
|
|
||
| result.Usage = result.Usage.Add(resp.Usage) |
There was a problem hiding this comment.
Exclude planning tokens from empty-response accounting
Adding resp.Usage from the planning-only call into SessionResult.Usage causes downstream empty-response handling to misclassify failed execution turns when planning is enabled. buildEmptyResponseOutcome in pipeline/handlers/codergen.go treats Usage.OutputTokens == 0 as the signal to retry swallowed provider responses; after this change, a normal planning reply (non-zero output tokens) can mask a subsequent execution turn that returned zero tokens/no tool calls, so the node is marked as normal/turn-limit failure instead of retryable. This is most visible with low max_turns (for example 1), where the execution retry prompt cannot run and the pipeline now loses its fallback retry path.
Useful? React with 👍 / 👎.
|
@copilot address all the feedback |
Agent-Logs-Url: https://github.com/2389-research/tracker/sessions/3b682829-1719-4abb-bca3-487cb08d8ac4 Co-authored-by: clintecker <26985+clintecker@users.noreply.github.com>
Addressed all requested feedback in commit eafeae3:
Validation is green ( |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
go build ./...,go test ./... -short)go build ./...,go test ./... -short) and parallel validation