
feat: add clean context and token budget for subagents #2337

Closed
ossaidqadri wants to merge 1 commit into QwenLM:main from ossaidqadri:feature/subagent-clean-context


@ossaidqadri
Contributor

This pull request introduces advanced runtime configuration options for subagents, enabling more granular control over context management and output formatting. The changes add new fields to the subagent configuration, update the core logic to support these options, and enhance documentation and tests to reflect the new capabilities.

Key changes:

Subagent Runtime Configuration Enhancements

  • Added new runtime configuration options to RunConfig: useCleanContext, maxContextTokens, and useStructuredOutput. These allow subagents to start with a clean context, enforce a token budget, and produce structured output summaries, respectively. Also introduced the SubagentStructuredSummary interface for structured results.
  • Updated the subagent creation flow (SubagentManager, SubAgentScope) to accept and apply runtime configuration overrides, and to pass these options to context initialization and prompt construction. [1] [2] [3] [4] [5] [6]
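
The override flow described in the bullets above could look roughly like the following. This is a minimal sketch, not the PR's code: the `RunConfig` shape is reduced to the fields named in this description, and `mergeRunConfig` is a hypothetical helper name.

```typescript
// Hypothetical subset of RunConfig; field names are taken from the PR description.
interface RunConfig {
  max_time_minutes?: number;
  max_turns?: number;
  useCleanContext?: boolean;
  maxContextTokens?: number;
  useStructuredOutput?: boolean;
}

// Apply per-call overrides on top of the subagent's stored config.
// `??` keeps the stored value when an override is missing or undefined,
// while an explicit `false` override is still honored.
function mergeRunConfig(
  base: RunConfig,
  overrides: Partial<RunConfig> = {},
): RunConfig {
  return {
    max_time_minutes: overrides.max_time_minutes ?? base.max_time_minutes,
    max_turns: overrides.max_turns ?? base.max_turns,
    useCleanContext: overrides.useCleanContext ?? base.useCleanContext,
    maxContextTokens: overrides.maxContextTokens ?? base.maxContextTokens,
    useStructuredOutput:
      overrides.useStructuredOutput ?? base.useStructuredOutput,
  };
}

const merged = mergeRunConfig(
  { max_turns: 10, useCleanContext: false },
  { useCleanContext: true, maxContextTokens: 4000 },
);
console.log(merged.useCleanContext, merged.max_turns, merged.maxContextTokens);
```

Merging field by field avoids the pitfall of object spread, where an override explicitly set to `undefined` would erase the stored value.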

Task Tool and API Support

  • Extended the TaskParams interface and TaskTool logic to accept and forward runtime configuration overrides, enabling dynamic customization of subagent behavior via the Task tool API. [1] [2] [3]
  • Added tests to verify that runtime configuration overrides are correctly passed through the Task tool to subagent instantiation. [1] [2]

Documentation Improvements

  • Expanded the subagent documentation to describe the new runtime configuration options, including usage examples and explanations of their benefits and output formats.

Test and Mock Adjustments

  • Updated and extended mocks and tests for environment context and subagent initialization to accommodate new parameters for context management. [1] [2] [3] [4] [5]

These changes provide users with greater flexibility and cost control when delegating tasks to subagents, and ensure consistent, structured output for downstream processing.

Closes #2332

Contributor

Copilot AI left a comment

Pull request overview

This PR adds runtime configuration controls for subagents—allowing optional “clean context” delegation, context token budgeting, and (prompt-level) structured output formatting—and wires these overrides through the Task tool and subagent creation flow.

Changes:

  • Extended RunConfig with useCleanContext, maxContextTokens, and useStructuredOutput, plus a SubagentStructuredSummary type.
  • Updated subagent scope creation to accept runtime RunConfig overrides and apply them when initializing chat history and constructing prompts.
  • Added context truncation utilities/tests and expanded subagent documentation with examples of the new options.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| `packages/core/src/utils/environmentContext.ts` | Adds token-budget truncation and extends getInitialChatHistory to support clean context + max token budgeting + session history inclusion. |
| `packages/core/src/utils/environmentContext.test.ts` | Adds/updates tests for clean-context behavior and token budget truncation. |
| `packages/core/src/tools/task.ts` | Extends TaskParams with optional runConfig overrides and forwards them into subagent scope creation. |
| `packages/core/src/tools/task.test.ts` | Verifies runConfig overrides are passed through the Task tool invocation path. |
| `packages/core/src/subagents/types.ts` | Extends RunConfig and introduces SubagentStructuredSummary. |
| `packages/core/src/subagents/subagent.ts` | Applies useCleanContext/maxContextTokens in history initialization and appends structured-output instructions to the system prompt. |
| `packages/core/src/subagents/subagent.test.ts` | Updates mocks to match the updated getInitialChatHistory signature. |
| `packages/core/src/subagents/subagent-manager.ts` | Adds runConfigOverrides parameter and merges overrides into the runtime config. |
| `docs/users/features/sub-agents.md` | Documents the new runtime configuration options with examples. |

Comment on lines 40 to 49
export interface TaskParams {
  description: string;
  prompt: string;
  subagent_type: string;
  /**
   * Optional runtime configuration overrides for the subagent.
   * Allows customizing context behavior like useCleanContext, maxContextTokens, etc.
   */
  runConfig?: Partial<RunConfig>;
}

Copilot AI Mar 12, 2026

TaskParams now includes runConfig, but the Task tool's JSON schema (defined in the constructor) still has additionalProperties: false and does not declare a runConfig property. That means tool calls containing runConfig will be rejected at schema-validation time, so this override won't be usable via the actual Task tool API unless the schema is updated accordingly.
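
A minimal sketch of what the schema fix might look like. The schema object below is hypothetical (the real one lives in the Task tool constructor and may differ), and `undeclaredKeys` is only a tiny stand-in for the top-level `additionalProperties: false` check that a real JSON Schema validator performs.

```typescript
// Hypothetical Task tool parameter schema with runConfig declared.
const taskParamsSchema = {
  type: 'object',
  additionalProperties: false,
  required: ['description', 'prompt', 'subagent_type'],
  properties: {
    description: { type: 'string' },
    prompt: { type: 'string' },
    subagent_type: { type: 'string' },
    runConfig: {
      type: 'object',
      additionalProperties: false,
      properties: {
        useCleanContext: { type: 'boolean' },
        maxContextTokens: { type: 'number' },
        useStructuredOutput: { type: 'boolean' },
      },
    },
  },
};

// Stand-in for schema validation: lists top-level argument keys that the
// schema does not declare. With additionalProperties: false, any such key
// would cause the tool call to be rejected.
function undeclaredKeys(
  schema: { properties: Record<string, unknown> },
  args: Record<string, unknown>,
): string[] {
  return Object.keys(args).filter((key) => !(key in schema.properties));
}

const toolCall = {
  description: 'Research task',
  prompt: 'Summarize the module',
  subagent_type: 'focused-researcher',
  runConfig: { useCleanContext: true },
};
console.log(undeclaredKeys(taskParamsSchema, toolCall));
```

With `runConfig` declared, the list of undeclared keys is empty; against a schema that lacks the property, `runConfig` would be reported and the call rejected.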

Comment on lines +272 to +285
   * This prevents context bloat during long sessions.
   */
  useCleanContext?: boolean;
  /**
   * Maximum number of tokens allowed for context injection.
   * When exceeded, context is truncated to fit within this budget.
   * If not specified, no token budget is enforced.
   */
  maxContextTokens?: number;
  /**
   * When true, instructs the subagent to format its output using a structured
   * summary schema (findings, files changed, conclusion). This ensures only
   * distilled summaries are injected back into the main context.
   */

Copilot AI Mar 12, 2026

RunConfig uses snake_case keys for existing fields (max_time_minutes, max_turns), but the newly added fields use camelCase (useCleanContext, maxContextTokens, useStructuredOutput). This inconsistency makes YAML/frontmatter configuration error-prone and complicates validation/SDK schemas. Consider switching these to snake_case (or supporting both names with a clear precedence) to match existing RunConfig conventions.

Suggested change
   * This prevents context bloat during long sessions.
   *
   * Prefer using the snake_case `use_clean_context` key in YAML/frontmatter.
   * The camelCase `useCleanContext` field is kept for backwards compatibility.
   */
  use_clean_context?: boolean;
  useCleanContext?: boolean;
  /**
   * Maximum number of tokens allowed for context injection.
   * When exceeded, context is truncated to fit within this budget.
   * If not specified, no token budget is enforced.
   *
   * Prefer using the snake_case `max_context_tokens` key in YAML/frontmatter.
   * The camelCase `maxContextTokens` field is kept for backwards compatibility.
   */
  max_context_tokens?: number;
  maxContextTokens?: number;
  /**
   * When true, instructs the subagent to format its output using a structured
   * summary schema (findings, files changed, conclusion). This ensures only
   * distilled summaries are injected back into the main context.
   *
   * Prefer using the snake_case `use_structured_output` key in YAML/frontmatter.
   * The camelCase `useStructuredOutput` field is kept for backwards compatibility.
   */
  use_structured_output?: boolean;
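
If both spellings are accepted, something has to decide which one wins. The sketch below is one possible normalization step, assuming snake_case takes precedence; the `RawRunConfig` and `NormalizedRunConfig` types and the `normalizeRunConfig` helper are hypothetical names, not part of the PR.

```typescript
// Hypothetical raw config as parsed from YAML/frontmatter, accepting both styles.
interface RawRunConfig {
  use_clean_context?: boolean;
  useCleanContext?: boolean;
  max_context_tokens?: number;
  maxContextTokens?: number;
  use_structured_output?: boolean;
  useStructuredOutput?: boolean;
}

// Hypothetical internal shape used by the rest of the code.
interface NormalizedRunConfig {
  useCleanContext: boolean;
  maxContextTokens?: number;
  useStructuredOutput: boolean;
}

// Assumed precedence: snake_case wins, camelCase is the fallback,
// and booleans default to false when neither spelling is present.
function normalizeRunConfig(raw: RawRunConfig): NormalizedRunConfig {
  return {
    useCleanContext: raw.use_clean_context ?? raw.useCleanContext ?? false,
    maxContextTokens: raw.max_context_tokens ?? raw.maxContextTokens,
    useStructuredOutput:
      raw.use_structured_output ?? raw.useStructuredOutput ?? false,
  };
}

const normalized = normalizeRunConfig({
  use_clean_context: true,
  useCleanContext: false,
  maxContextTokens: 4000,
});
console.log(normalized);
```

Normalizing once at the parsing boundary keeps the precedence rule in a single place, so downstream code only ever reads one spelling.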

Comment on lines +163 to +189
```yaml
---
name: focused-researcher
description: Researches topics without carrying main session context
runConfig:
  useCleanContext: true
---
```

**Benefits:**

- Reduces token usage for focused tasks
- Prevents context pollution from unrelated conversations
- Improves performance for long-running sessions

#### `maxContextTokens`

Sets a maximum token budget for the subagent's context. When exceeded, older messages are truncated to fit within the budget.

```yaml
---
name: budget-conscious-agent
description: Works within strict token limits
runConfig:
  maxContextTokens: 4000
---
```
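
To illustrate the "older messages are truncated" behavior described above, here is a minimal sketch that drops the oldest messages first. It is not the PR's `truncateContentToTokenBudget`; the `Content`/`Part` types are simplified stand-ins, and the value 4 for `CHARS_PER_TOKEN_ESTIMATE` is an assumption.

```typescript
// Simplified stand-ins for the chat content types.
interface Part {
  text?: string;
}
interface Content {
  role: string;
  parts?: Part[];
}

// Assumed heuristic: roughly four characters per token.
const CHARS_PER_TOKEN_ESTIMATE = 4;

// Estimate the token cost of one message from its total text length.
function estimateTokens(item: Content): number {
  const chars = (item.parts ?? []).reduce(
    (sum, part) => sum + (part.text ?? '').length,
    0,
  );
  return Math.ceil(chars / CHARS_PER_TOKEN_ESTIMATE);
}

// Walk from newest to oldest, keeping messages until the budget is used up.
// Returns a new array; the input is not modified.
function keepNewestWithinBudget(
  history: readonly Content[],
  maxTokens: number,
): Content[] {
  const kept: Content[] = [];
  let used = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = estimateTokens(history[i]);
    if (used + cost > maxTokens) break;
    kept.unshift(history[i]);
    used += cost;
  }
  return kept;
}
```

A character-based estimate is cheap but imprecise, so a real budget would likely need some headroom or a tokenizer-backed count.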

Copilot AI Mar 12, 2026

The YAML examples document runConfig keys in camelCase (e.g. useCleanContext, maxContextTokens, useStructuredOutput) but the rest of RunConfig (and prior docs) use snake_case (e.g. max_turns, max_time_minutes). If the implementation expects one naming style, this doc will lead to misconfiguration. Align the examples with the actual supported key names (or explicitly document that both styles are accepted).

Comment on lines +179 to 190
 * @param {Config} config - The runtime configuration and services.
 * @param {boolean} useCleanContext - When true, omits accumulated session history.
 * @param {number} maxContextTokens - Optional maximum token budget for context.
 * @param extraHistory - Optional additional history to append.
 * @returns A promise that resolves to an array of `Content` objects for chat history.
 */
export async function getInitialChatHistory(
  config: Config,
  useCleanContext: boolean = false,
  maxContextTokens?: number,
  extraHistory?: Content[],
): Promise<Content[]> {

Copilot AI Mar 12, 2026

getInitialChatHistory now treats the 2nd argument as useCleanContext, but there are existing call sites that pass extraHistory as the 2nd arg (e.g. GeminiClient.startChat() in packages/core/src/core/client.ts). This will be a TypeScript type error and also changes behavior at runtime. Consider either keeping the old parameter order (or switching to an options object) and updating all call sites accordingly.

Suggested change
 *
 * Backward compatibility:
 * - Old calling convention:
 *   getInitialChatHistory(config, extraHistory?, maxContextTokens?, useCleanContext?)
 * - New calling convention:
 *   getInitialChatHistory(config, useCleanContext?, maxContextTokens?, extraHistory?)
 *
 * @param {Config} config - The runtime configuration and services.
 * @param {boolean | Content[]} arg2 - Either useCleanContext (boolean) or extraHistory (Content[]).
 * @param {number | Content[]} arg3 - Either maxContextTokens (number) or extraHistory when using old convention.
 * @param {Content[]} arg4 - extraHistory when using the new convention.
 * @returns A promise that resolves to an array of `Content` objects for chat history.
 */
export async function getInitialChatHistory(
  config: Config,
  arg2?: boolean | Content[],
  arg3?: number | Content[],
  arg4?: Content[],
): Promise<Content[]> {
  let useCleanContext = false;
  let maxContextTokens: number | undefined;
  let extraHistory: Content[] | undefined;
  // Detect whether the second argument is extraHistory (old convention) or useCleanContext (new convention).
  if (Array.isArray(arg2)) {
    // Old convention: (config, extraHistory?, maxContextTokens?, useCleanContext?)
    extraHistory = arg2;
    if (typeof arg3 === 'number') {
      maxContextTokens = arg3;
    }
    if (typeof arg3 === 'boolean') {
      useCleanContext = arg3;
    } else if (typeof arg4 === 'boolean') {
      useCleanContext = arg4;
    }
  } else {
    // New convention: (config, useCleanContext?, maxContextTokens?, extraHistory?)
    useCleanContext = arg2 ?? false;
    if (typeof arg3 === 'number') {
      maxContextTokens = arg3;
    }
    if (Array.isArray(arg3)) {
      extraHistory = arg3;
    } else if (Array.isArray(arg4)) {
      extraHistory = arg4;
    }
  }
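
The other alternative mentioned in the comment, an options object, avoids positional ambiguity altogether. The following is a rough sketch under simplifying assumptions: `buildInitialHistory` and `InitialHistoryOptions` are hypothetical names, the environment context and session history are passed in directly rather than read from `Config`, and token budgeting is left out for brevity.

```typescript
// Simplified stand-ins for the chat content types.
interface Part {
  text?: string;
}
interface Content {
  role: string;
  parts?: Part[];
}

// Hypothetical options bag replacing the positional parameters.
interface InitialHistoryOptions {
  useCleanContext?: boolean;
  maxContextTokens?: number; // accepted but not applied in this sketch
  extraHistory?: Content[];
}

// Build the initial history: env context, canned ack, then (unless a clean
// context is requested) the session history, then any extra history.
function buildInitialHistory(
  envContext: string,
  sessionHistory: readonly Content[],
  options: InitialHistoryOptions = {},
): Content[] {
  const { useCleanContext = false, extraHistory = [] } = options;
  return [
    { role: 'user', parts: [{ text: envContext }] },
    { role: 'model', parts: [{ text: 'Got it. Thanks for the context!' }] },
    ...(useCleanContext ? [] : sessionHistory),
    ...extraHistory,
  ];
}
```

Named options keep existing call sites valid when new settings are added, since callers only mention the fields they care about.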

Comment on lines +215 to +216
const sessionHistory = config.getGeminiClient()?.getHistory() ?? [];


Copilot AI Mar 12, 2026

In the useCleanContext === false path, sessionHistory is retrieved via config.getGeminiClient()?.getHistory(). Config.getGeminiClient() is not optional and getHistory() throws when the client hasn't been initialized yet (e.g. during the first startChat() call). This can cause chat initialization to fail; consider guarding with isInitialized() (or equivalent) and defaulting to an empty history when uninitialized.

Suggested change
let sessionHistory: Content[] = [];
try {
  sessionHistory = config.getGeminiClient().getHistory() ?? [];
} catch {
  // If the Gemini client is not initialized yet or history retrieval fails,
  // fall back to an empty session history.
  sessionHistory = [];
}

Comment on lines +215 to +228
const sessionHistory = config.getGeminiClient()?.getHistory() ?? [];

history = [
  {
    role: 'user',
    parts: [{ text: envContextString }],
  },
  {
    role: 'model',
    parts: [{ text: 'Got it. Thanks for the context!' }],
  },
  ...sessionHistory,
  ...(extraHistory ?? []),
];

Copilot AI Mar 12, 2026

When including sessionHistory, the code prepends a fresh environment context + canned model ack and then appends the full sessionHistory. Since sessionHistory already begins with the previous environment context + ack, this will duplicate those messages (and may re-inject stale folder structure). Consider dropping/replacing the initial env+ack from sessionHistory before concatenation so the resulting history contains exactly one env context block.
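
One way to drop the stale preamble before concatenation is sketched below. The detection heuristic is an assumption: it treats the first two messages as the old env context + ack only when the second one matches the canned acknowledgement text quoted above. `stripLeadingEnvPreamble` is a hypothetical helper name.

```typescript
// Simplified stand-ins for the chat content types.
interface Part {
  text?: string;
}
interface Content {
  role: string;
  parts?: Part[];
}

// Canned acknowledgement used after the environment context message.
const ENV_ACK_TEXT = 'Got it. Thanks for the context!';

// Return a copy of the session history without its leading
// env-context/ack pair, if that pair is present.
function stripLeadingEnvPreamble(sessionHistory: readonly Content[]): Content[] {
  const [first, second] = sessionHistory;
  const looksLikePreamble =
    first?.role === 'user' &&
    second?.role === 'model' &&
    second?.parts?.[0]?.text === ENV_ACK_TEXT;
  return looksLikePreamble ? sessionHistory.slice(2) : sessionHistory.slice();
}
```

The result can then be spread after the freshly built env context block, so the final history contains that block exactly once.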

Comment on lines +69 to +78
if (availableChars > 0 && availableChars < firstItemText.length) {
  firstItem.parts = [
    {
      text:
        firstItemText.slice(0, availableChars) +
        '... [truncated due to token budget]',
    },
  ];
}
return firstItem ? [firstItem] : [];

Copilot AI Mar 12, 2026

truncateContentToTokenBudget mutates Content objects in-place (e.g., assigning firstItem.parts = ...). If the input contents includes references to the live chat history (like GeminiClient.getHistory()), this can permanently alter the main conversation history. Consider treating inputs as immutable: clone any Content you plan to modify and return a new array with new objects/parts.

Suggested change
let truncatedText = firstItemText;
if (availableChars > 0 && availableChars < firstItemText.length) {
  truncatedText =
    firstItemText.slice(0, availableChars) +
    '... [truncated due to token budget]';
}
// Clone the first item to avoid mutating the original content object.
const truncatedFirstItem: Content = {
  ...firstItem,
  parts: [
    {
      ...(firstItem.parts?.[0] ?? {}),
      text: truncatedText,
    } as Part,
  ],
};
return firstItem ? [truncatedFirstItem] : [];

Comment on lines +95 to +107
const lastItem = truncatedItems[0];
const lastText = lastItem.parts?.map((p) => p.text || '').join('') || '';

const availableTokensForLast = maxTokens - firstItemTokens;
const availableCharsForLast =
  availableTokensForLast * CHARS_PER_TOKEN_ESTIMATE;

if (availableCharsForLast > 0 && availableCharsForLast < lastText.length) {
  const truncatedText =
    lastText.slice(0, availableCharsForLast) +
    '... [truncated due to token budget]';
  lastItem.parts = [{ text: truncatedText }];
}

Copilot AI Mar 12, 2026

truncateContentToTokenBudget also mutates lastItem.parts when truncating the last remaining item. If lastItem comes from sessionHistory or extraHistory, this will modify the original objects outside this function. Consider cloning lastItem (and its parts) before truncating so callers don't observe unexpected side-effects.
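
A non-mutating variant of this step might look like the sketch below. It is an illustration only: `truncateLastItem` is a hypothetical helper that takes a character budget directly, and the `Content`/`Part` types are simplified stand-ins.

```typescript
// Simplified stand-ins for the chat content types.
interface Part {
  text?: string;
}
interface Content {
  role: string;
  parts?: Part[];
}

const TRUNCATION_MARKER = '... [truncated due to token budget]';

// Truncate the text of the last item to maxChars without touching the input:
// the last item is replaced by a shallow clone with new parts, and all other
// items are reused as-is in a new array.
function truncateLastItem(
  items: readonly Content[],
  maxChars: number,
): Content[] {
  if (items.length === 0) return [];
  const last = items[items.length - 1];
  const text = (last.parts ?? []).map((p) => p.text ?? '').join('');
  if (maxChars <= 0 || text.length <= maxChars) return items.slice();
  const truncated: Content = {
    ...last,
    parts: [{ text: text.slice(0, maxChars) + TRUNCATION_MARKER }],
  };
  return [...items.slice(0, -1), truncated];
}
```

Because only the truncated item is cloned, the cost stays small even for long histories, and callers holding references to the live history see no change.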

- Add useCleanContext flag to skip session history inheritance
- Add maxContextTokens for context token budget enforcement
- Add useStructuredOutput for structured summary format
- Implement truncateContentToTokenBudget utility
- Update Task tool to support runtime config overrides
- Add comprehensive tests for new features
- Document advanced subagent configuration options

Fixes QwenLM#2332
@tanzhenxin
Collaborator

@ossaidqadri Thanks for your contribution, great work here!
However, we are in the middle of a major refactor of the entire subagent system in PR #1912, so we are not accepting PRs that target subagents at the moment. That PR is expected to be merged into main next week.

@tanzhenxin added the status/on-hold (Temporarily paused) label on Mar 18, 2026
@tanzhenxin
Collaborator

Thanks for the contribution, @ossaidqadri! We appreciate the effort you put into this PR around subagent runtime configuration (clean context, token budgets, etc.).

Unfortunately, the subagent interface has undergone significant changes since this PR was opened, and there are now substantial conflicts with the current codebase that would make merging impractical.

We do plan to address the goals proposed here — we'll be opening new PRs that build on the current subagent architecture. Your work here has been helpful in informing the direction.

Closing this for now, but thanks again for the contribution!

@tanzhenxin closed this on Apr 7, 2026
Labels

status/on-hold Temporarily paused

Development

Successfully merging this pull request may close these issues.

feat: Add Delegation Layer with isolated clean context windows for subagents

3 participants