Problem
In hosted workflow mode, {{variable}} substitutions in step prompts inject the full text of prior step outputs into the orchestrating LLM's context. For a 10-step workflow with large reports, this accumulates tens of thousands of tokens in the main conversation context — directly causing the context compaction that orphans background agents (see related issue).
Example token accumulation (ftm-analysis workflow)
| Step output injected | Approx. tokens |
| --- | --- |
| ranging_report | ~4K |
| rssi_report | ~2K |
| lowi_report | ~5K |
| firmware_report | ~4K |
| spike_report | ~5K |
| consistency_report | ~3K |
| trilat_report | ~2K |
| final_report (synthesize) | ~9K |
| Total injected into orchestrating context | ~34K tokens |
By Wave 5 (synthesize), the orchestrating context is carrying ~34K tokens of step output on top of the conversation history — making compaction near-certain on longer sessions.
Root Cause
The {{variable}} substitution in step prompts is designed for the step-executing agent's context (it needs the prior outputs to do its work). But the rendered prompt is returned to the orchestrating host via workflow_next, meaning the full substituted content also lands in the orchestrating context even though the host never needs to read it.
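To make the mechanism concrete, here is a minimal sketch of eager substitution. The `render_prompt` function, the regex, and the sample sizes are assumptions made for illustration; they are not the workflow engine's actual code. It only shows that once placeholders are expanded before the prompt is handed back, whoever receives the prompt also receives every referenced report in full.

```python
import re

# Matches {{name}} placeholders; the exact syntax the engine accepts is assumed.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")

def render_prompt(template, outputs):
    """Hypothetical eager renderer: swap each {{name}} for the stored output text.

    Unknown names are left untouched.
    """
    return PLACEHOLDER.sub(lambda m: outputs.get(m.group(1), m.group(0)), template)

# Stand-in report bodies; real reports would be prose, not repeated characters.
outputs = {
    "ranging_report": "x" * 16_000,
    "rssi_report": "y" * 8_000,
}
template = "Synthesize {{ranging_report}} and {{rssi_report}}."
rendered = render_prompt(template, outputs)

# The template is a few dozen characters; the rendered prompt carries both reports.
print(len(template), len(rendered))
```

If the rendered string is what `workflow_next` hands back, the host's context grows by the size of every referenced output on each step, even though only the step-executing agent needs that text.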
Requested Fix
Option A (preferred): Return step prompts to the host with {{variable}} references unresolved (or replaced by a lightweight summary/key). The step-executing subagent calls a workflow_get_output(session, key) tool to fetch the full content on demand — keeping large text out of the orchestrating context entirely.
Option B: Document that all step prompts should be executed inside an Agent() subagent call so the large substituted content stays isolated in the subagent's context and is never written back into the orchestrating conversation.
Option C: In the workflow_next response, provide both a full_prompt (for the subagent) and a summary_prompt (for the orchestrating host to log/track progress) — the host passes full_prompt directly to the subagent without ever parsing or storing it.
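As a rough illustration of Option A, the sketch below keeps placeholders unresolved in what the host sees and resolves them only inside the subagent. Everything here is hypothetical: the in-memory `_OUTPUTS` store, `save_output`, `workflow_next_for_host`, `resolve_in_subagent`, and the response shape are invented for the example. Only the `workflow_get_output(session, key)` name and signature come from the proposal above.

```python
import re

# Matches {{name}} placeholders; the exact syntax the engine accepts is assumed.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")

# Hypothetical in-memory store standing in for the engine's session state.
_OUTPUTS = {}

def save_output(session, key, text):
    """Record a finished step's output under (session, key)."""
    _OUTPUTS[(session, key)] = text

def workflow_get_output(session, key):
    """Proposed tool: fetch one stored output on demand."""
    return _OUTPUTS[(session, key)]

def workflow_next_for_host(session, template):
    """Hypothetical host-facing response: placeholders stay unresolved."""
    return {
        "session": session,
        "prompt": template,
        "output_keys": PLACEHOLDER.findall(template),
    }

def resolve_in_subagent(step):
    """Runs inside the subagent: expand placeholders via workflow_get_output."""
    session = step["session"]
    return PLACEHOLDER.sub(
        lambda m: workflow_get_output(session, m.group(1)), step["prompt"]
    )

save_output("s1", "spike_report", "spike details " * 1000)
step = workflow_next_for_host("s1", "Check consistency against {{spike_report}}.")
full_prompt = resolve_in_subagent(step)

# The host only ever holds the short template; the subagent holds the full text.
print(len(step["prompt"]), len(full_prompt))
```

In this arrangement the host forwards `step` to the subagent unchanged, so the orchestrating context grows by the template length rather than by the size of the referenced reports. Option C could reuse the same store by returning the short template as `summary_prompt` next to a `full_prompt`, at the cost of the full text still passing through the `workflow_next` response.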
Impact
Without this fix, Issues 1 and 2 are coupled: large outputs cause compaction, compaction orphans agents, orphaned agents require manual recovery. Fixing either issue independently reduces the blast radius; fixing both eliminates it.
Related
Closes / related to #3