
[workflow] Large step outputs injected into host context cause token bloat and trigger compaction #4

@tvhc84


Problem

In hosted workflow mode, {{variable}} substitutions in step prompts inject the full text of prior step outputs into the orchestrating LLM's context. For a 10-step workflow with large reports, this accumulates tens of thousands of tokens in the main conversation context — directly causing the context compaction that orphans background agents (see related issue).

Example token accumulation (ftm-analysis workflow)

| Step output injected | Approx. tokens |
| --- | --- |
| ranging_report | ~4K |
| rssi_report | ~2K |
| lowi_report | ~5K |
| firmware_report | ~4K |
| spike_report | ~5K |
| consistency_report | ~3K |
| trilat_report | ~2K |
| final_report (synthesize) | ~9K |
| Total injected into orchestrating context | ~34K |

By Wave 5 (synthesize), the orchestrating context is carrying ~34K tokens of step output on top of the conversation history — making compaction near-certain on longer sessions.
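
The running total can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the figures are the approximate per-report estimates listed above (in thousands of tokens), and the injection order is assumed to follow the order in which they are listed.

```python
from itertools import accumulate

# Approximate token counts in thousands, copied from the estimates above.
step_outputs = {
    "ranging_report": 4,
    "rssi_report": 2,
    "lowi_report": 5,
    "firmware_report": 4,
    "spike_report": 5,
    "consistency_report": 3,
    "trilat_report": 2,
    "final_report": 9,
}

# Cumulative load on the orchestrating context after each injection
# (assumes injections happen in the listed order).
running = list(accumulate(step_outputs.values()))
for name, total in zip(step_outputs, running):
    print(f"{name:<20} ~{total}K cumulative")

total_k = running[-1]
```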

Root Cause

The {{variable}} substitution in step prompts is designed for the step-executing agent's context (it needs the prior outputs to do its work). But the rendered prompt is returned to the orchestrating host via workflow_next, meaning the full substituted content also lands in the orchestrating context even though the host never needs to read it.
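
A toy model of this behaviour, to make the mechanism concrete. Everything here is an illustrative assumption: `render_prompt`, the `host_context` list, and the stored outputs are hypothetical stand-ins, not the actual workflow implementation, and character counts are used instead of a real tokenizer.

```python
import re

def render_prompt(template: str, outputs: dict[str, str]) -> str:
    """Replace each {{name}} with the full text of the stored output.

    Unknown names are left untouched.
    """
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda m: outputs.get(m.group(1), m.group(0)),
        template,
    )

# Hypothetical stored outputs; the long strings stand in for large reports.
outputs = {"ranging_report": "r" * 16_000, "rssi_report": "s" * 8_000}
template = "Synthesize findings from {{ranging_report}} and {{rssi_report}}."

# Stand-in for the orchestrating conversation.
host_context: list[str] = []

# The rendered prompt is what comes back to the host, so the full
# report text ends up in the host context as well.
rendered = render_prompt(template, outputs)
host_context.append(rendered)

template_chars = len(template)
host_chars = sum(len(message) for message in host_context)
print(f"template: {template_chars} chars, host context: {host_chars} chars")
```

The template itself is tiny; nearly all of the growth in the host context comes from substituted content the host never reads.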

Requested Fix

Option A (preferred): Return step prompts to the host with {{variable}} references unresolved (or replaced by a lightweight summary/key). The step-executing subagent calls a workflow_get_output(session, key) tool to fetch the full content on demand — keeping large text out of the orchestrating context entirely.
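
A minimal sketch of how Option A might look. The `workflow_get_output(session, key)` signature is the one proposed above; the in-memory store, the `resolve_in_subagent` helper, and the session id are hypothetical and only there to make the example runnable.

```python
import re

# Hypothetical in-memory store: session id -> {output key -> full text}.
_OUTPUTS: dict[str, dict[str, str]] = {
    "sess-1": {"ranging_report": "r" * 16_000},
}

def workflow_get_output(session: str, key: str) -> str:
    """Proposed tool: return one stored step output in full."""
    return _OUTPUTS[session][key]

def resolve_in_subagent(session: str, template: str) -> str:
    """Meant to run inside the step-executing subagent.

    Fetches each referenced output on demand and substitutes it locally.
    """
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda m: workflow_get_output(session, m.group(1)),
        template,
    )

# The host only ever sees the unresolved template.
host_prompt = "Summarize the anomalies in {{ranging_report}}."

# The subagent resolves the reference in its own context.
subagent_prompt = resolve_in_subagent("sess-1", host_prompt)
print(len(host_prompt), len(subagent_prompt))
```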

Option B: Document that all step prompts should be executed inside an Agent() subagent call so the large substituted content stays isolated in the subagent's context and is never written back into the orchestrating conversation.

Option C: In the workflow_next response, provide both a full_prompt (for the subagent) and a summary_prompt (for the orchestrating host to log/track progress) — the host passes full_prompt directly to the subagent without ever parsing or storing it.
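
One possible shape for the Option C response, as a sketch. The `full_prompt` and `summary_prompt` field names come from the proposal above; the `NextStep` class, the `step_id` field, the `dispatch` helper, and the `run_subagent` callback are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class NextStep:
    """Hypothetical shape of a workflow_next response under Option C."""
    step_id: str
    full_prompt: str      # fully substituted; intended for the subagent only
    summary_prompt: str   # short description; safe to keep in the host context

def dispatch(
    step: NextStep,
    run_subagent: Callable[[str], str],
    host_log: list[str],
) -> str:
    """Log only the summary and hand full_prompt through without inspecting it."""
    host_log.append(f"[{step.step_id}] {step.summary_prompt}")
    return run_subagent(step.full_prompt)

host_log: list[str] = []
step = NextStep(
    step_id="synthesize",
    full_prompt="Synthesize: " + "x" * 20_000,
    summary_prompt="Synthesize final report from 7 prior reports",
)

# A stub subagent that just reports how much text it was given.
result = dispatch(
    step,
    run_subagent=lambda prompt: f"subagent received {len(prompt)} chars",
    host_log=host_log,
)
print(result)
print(host_log)
```

This only helps if the host runtime can forward `full_prompt` by reference; if the whole response is first written into the model's conversation, the large text still lands in the orchestrating context.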

Impact

Without this fix, Issues 1 and 2 are coupled: large outputs cause compaction, compaction orphans agents, orphaned agents require manual recovery. Fixing either issue independently reduces the blast radius; fixing both eliminates it.

Related

Closes / related to #3
