feat: add clean context and token budget for subagents #2337
ossaidqadri wants to merge 1 commit into QwenLM:main
Pull request overview
This PR adds runtime configuration controls for subagents—allowing optional “clean context” delegation, context token budgeting, and (prompt-level) structured output formatting—and wires these overrides through the Task tool and subagent creation flow.
Changes:
- Extended `RunConfig` with `useCleanContext`, `maxContextTokens`, and `useStructuredOutput`, plus a `SubagentStructuredSummary` type.
- Updated subagent scope creation to accept runtime `RunConfig` overrides and apply them when initializing chat history and constructing prompts.
- Added context truncation utilities/tests and expanded subagent documentation with examples of the new options.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| packages/core/src/utils/environmentContext.ts | Adds token-budget truncation and extends getInitialChatHistory to support clean context + max token budgeting + session history inclusion. |
| packages/core/src/utils/environmentContext.test.ts | Adds/updates tests for clean-context behavior and token budget truncation. |
| packages/core/src/tools/task.ts | Extends TaskParams with optional runConfig overrides and forwards them into subagent scope creation. |
| packages/core/src/tools/task.test.ts | Verifies runConfig overrides are passed through the Task tool invocation path. |
| packages/core/src/subagents/types.ts | Extends RunConfig and introduces SubagentStructuredSummary. |
| packages/core/src/subagents/subagent.ts | Applies useCleanContext/maxContextTokens in history initialization and appends structured-output instructions to the system prompt. |
| packages/core/src/subagents/subagent.test.ts | Updates mocks to match the updated getInitialChatHistory signature. |
| packages/core/src/subagents/subagent-manager.ts | Adds runConfigOverrides parameter and merges overrides into the runtime config. |
| docs/users/features/sub-agents.md | Documents the new runtime configuration options with examples. |

    export interface TaskParams {
      description: string;
      prompt: string;
      subagent_type: string;
      /**
       * Optional runtime configuration overrides for the subagent.
       * Allows customizing context behavior like useCleanContext, maxContextTokens, etc.
       */
      runConfig?: Partial<RunConfig>;
    }

TaskParams now includes runConfig, but the Task tool's JSON schema (defined in the constructor) still has additionalProperties: false and does not declare a runConfig property. That means tool calls containing runConfig will be rejected at schema-validation time, so this override won't be usable via the actual Task tool API unless the schema is updated accordingly.
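
To illustrate the fix this comment asks for, here is a minimal sketch of a schema that declares `runConfig` explicitly. The object name `taskToolSchema` and the helper `unknownTopLevelKeys` are hypothetical, and the real schema in `task.ts` may be shaped differently; only the property names come from this PR.

```typescript
// Hypothetical sketch: property names mirror TaskParams/RunConfig from this PR,
// but the actual schema defined in the Task tool constructor may differ.
const taskToolSchema = {
  type: 'object',
  properties: {
    description: { type: 'string' },
    prompt: { type: 'string' },
    subagent_type: { type: 'string' },
    // Declaring runConfig here is what lets it pass `additionalProperties: false`.
    runConfig: {
      type: 'object',
      properties: {
        useCleanContext: { type: 'boolean' },
        maxContextTokens: { type: 'number' },
        useStructuredOutput: { type: 'boolean' },
      },
      additionalProperties: false,
    },
  },
  required: ['description', 'prompt', 'subagent_type'],
  additionalProperties: false,
} as const;

// Toy check that mimics what `additionalProperties: false` enforces at the
// top level: any key not declared under `properties` is reported.
function unknownTopLevelKeys(params: Record<string, unknown>): string[] {
  const allowed = Object.keys(taskToolSchema.properties);
  return Object.keys(params).filter((key) => !allowed.includes(key));
}
```

With `runConfig` declared, a call carrying overrides has no unknown keys, whereas an undeclared key would still be flagged.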

       * This prevents context bloat during long sessions.
       */
      useCleanContext?: boolean;
      /**
       * Maximum number of tokens allowed for context injection.
       * When exceeded, context is truncated to fit within this budget.
       * If not specified, no token budget is enforced.
       */
      maxContextTokens?: number;
      /**
       * When true, instructs the subagent to format its output using a structured
       * summary schema (findings, files changed, conclusion). This ensures only
       * distilled summaries are injected back into the main context.
       */

RunConfig uses snake_case keys for existing fields (max_time_minutes, max_turns), but the newly added fields use camelCase (useCleanContext, maxContextTokens, useStructuredOutput). This inconsistency makes YAML/frontmatter configuration error-prone and complicates validation/SDK schemas. Consider switching these to snake_case (or supporting both names with a clear precedence) to match existing RunConfig conventions.

Suggested change:

       * This prevents context bloat during long sessions.
       *
       * Prefer using the snake_case `use_clean_context` key in YAML/frontmatter.
       * The camelCase `useCleanContext` field is kept for backwards compatibility.
       */
      use_clean_context?: boolean;
      useCleanContext?: boolean;
      /**
       * Maximum number of tokens allowed for context injection.
       * When exceeded, context is truncated to fit within this budget.
       * If not specified, no token budget is enforced.
       *
       * Prefer using the snake_case `max_context_tokens` key in YAML/frontmatter.
       * The camelCase `maxContextTokens` field is kept for backwards compatibility.
       */
      max_context_tokens?: number;
      maxContextTokens?: number;
      /**
       * When true, instructs the subagent to format its output using a structured
       * summary schema (findings, files changed, conclusion). This ensures only
       * distilled summaries are injected back into the main context.
       *
       * Prefer using the snake_case `use_structured_output` key in YAML/frontmatter.
       * The camelCase `useStructuredOutput` field is kept for backwards compatibility.
       */
      use_structured_output?: boolean;

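
If both spellings are accepted, the precedence between them has to be resolved somewhere. The sketch below shows one way to do that; the `RawRunConfig` and `NormalizedContextOptions` types, the `normalizeContextOptions` helper, and the choice that snake_case wins are all assumptions for illustration, not code from this PR.

```typescript
// Hypothetical input shape: both spellings of each new option may be present.
interface RawRunConfig {
  use_clean_context?: boolean;
  useCleanContext?: boolean;
  max_context_tokens?: number;
  maxContextTokens?: number;
  use_structured_output?: boolean;
  useStructuredOutput?: boolean;
}

// Hypothetical resolved shape used by the rest of the code.
interface NormalizedContextOptions {
  useCleanContext: boolean;
  maxContextTokens: number | undefined;
  useStructuredOutput: boolean;
}

// Assumed precedence: the snake_case key wins when both are set, and the
// camelCase key is only a fallback. Booleans default to false.
function normalizeContextOptions(raw: RawRunConfig): NormalizedContextOptions {
  return {
    useCleanContext: raw.use_clean_context ?? raw.useCleanContext ?? false,
    maxContextTokens: raw.max_context_tokens ?? raw.maxContextTokens,
    useStructuredOutput:
      raw.use_structured_output ?? raw.useStructuredOutput ?? false,
  };
}
```

Resolving the two spellings in a single function keeps the precedence rule in one place, so the history and prompt code only ever see one name per option.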

    ```yaml
    ---
    name: focused-researcher
    description: Researches topics without carrying main session context
    runConfig:
      useCleanContext: true
    ---
    ```

    **Benefits:**

    - Reduces token usage for focused tasks
    - Prevents context pollution from unrelated conversations
    - Improves performance for long-running sessions

    #### `maxContextTokens`

    Sets a maximum token budget for the subagent's context. When exceeded, older messages are truncated to fit within the budget.

    ```yaml
    ---
    name: budget-conscious-agent
    description: Works within strict token limits
    runConfig:
      maxContextTokens: 4000
    ---
    ```

The YAML examples document runConfig keys in camelCase (e.g. useCleanContext, maxContextTokens, useStructuredOutput) but the rest of RunConfig (and prior docs) use snake_case (e.g. max_turns, max_time_minutes). If the implementation expects one naming style, this doc will lead to misconfiguration. Align the examples with the actual supported key names (or explicitly document that both styles are accepted).

     * @param {Config} config - The runtime configuration and services.
     * @param {boolean} useCleanContext - When true, omits accumulated session history.
     * @param {number} maxContextTokens - Optional maximum token budget for context.
     * @param extraHistory - Optional additional history to append.
     * @returns A promise that resolves to an array of `Content` objects for chat history.
     */
    export async function getInitialChatHistory(
      config: Config,
      useCleanContext: boolean = false,
      maxContextTokens?: number,
      extraHistory?: Content[],
    ): Promise<Content[]> {

getInitialChatHistory now treats the 2nd argument as useCleanContext, but there are existing call sites that pass extraHistory as the 2nd arg (e.g. GeminiClient.startChat() in packages/core/src/core/client.ts). This will be a TypeScript type error and also changes behavior at runtime. Consider either keeping the old parameter order (or switching to an options object) and updating all call sites accordingly.

Suggested change:

     *
     * Backward compatibility:
     * - Old calling convention:
     *     getInitialChatHistory(config, extraHistory?, maxContextTokens?, useCleanContext?)
     * - New calling convention:
     *     getInitialChatHistory(config, useCleanContext?, maxContextTokens?, extraHistory?)
     *
     * @param {Config} config - The runtime configuration and services.
     * @param {boolean | Content[]} arg2 - Either useCleanContext (boolean) or extraHistory (Content[]).
     * @param {number | Content[]} arg3 - Either maxContextTokens (number) or extraHistory when using old convention.
     * @param {Content[]} arg4 - extraHistory when using the new convention.
     * @returns A promise that resolves to an array of `Content` objects for chat history.
     */
    export async function getInitialChatHistory(
      config: Config,
      arg2?: boolean | Content[],
      arg3?: number | Content[],
      arg4?: Content[],
    ): Promise<Content[]> {
      let useCleanContext = false;
      let maxContextTokens: number | undefined;
      let extraHistory: Content[] | undefined;
      // Detect whether the second argument is extraHistory (old convention) or useCleanContext (new convention).
      if (Array.isArray(arg2)) {
        // Old convention: (config, extraHistory?, maxContextTokens?, useCleanContext?)
        extraHistory = arg2;
        if (typeof arg3 === 'number') {
          maxContextTokens = arg3;
        }
        if (typeof arg3 === 'boolean') {
          useCleanContext = arg3;
        } else if (typeof arg4 === 'boolean') {
          useCleanContext = arg4;
        }
      } else {
        // New convention: (config, useCleanContext?, maxContextTokens?, extraHistory?)
        useCleanContext = arg2 ?? false;
        if (typeof arg3 === 'number') {
          maxContextTokens = arg3;
        }
        if (Array.isArray(arg3)) {
          extraHistory = arg3;
        } else if (Array.isArray(arg4)) {
          extraHistory = arg4;
        }
      }

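
The review comment also mentions an options object as an alternative to positional detection. A minimal sketch of that variant follows, assuming a hypothetical `InitialHistoryOptions` type and `resolveInitialHistoryOptions` helper; `Content` here is a simplified stand-in for the SDK type, not its real definition.

```typescript
// Simplified stand-in for the SDK's Content type (assumption for this sketch).
interface Content {
  role: string;
  parts?: Array<{ text?: string }>;
}

// Hypothetical options bag replacing the three positional parameters.
interface InitialHistoryOptions {
  useCleanContext?: boolean;
  maxContextTokens?: number;
  extraHistory?: Content[];
}

// Accept either the legacy second argument (an extraHistory array) or the
// new options object, and return a single normalized options object.
function resolveInitialHistoryOptions(
  arg?: Content[] | InitialHistoryOptions,
): InitialHistoryOptions {
  if (Array.isArray(arg)) {
    // Legacy call sites pass extraHistory directly as the second argument.
    return { extraHistory: arg };
  }
  return arg ?? {};
}
```

Because only one argument needs to be inspected, legacy callers that pass `extraHistory` keep working, and new options can be added later without another round of positional detection.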

    const sessionHistory = config.getGeminiClient()?.getHistory() ?? [];

In the useCleanContext === false path, sessionHistory is retrieved via config.getGeminiClient()?.getHistory(). Config.getGeminiClient() is not optional and getHistory() throws when the client hasn't been initialized yet (e.g. during the first startChat() call). This can cause chat initialization to fail; consider guarding with isInitialized() (or equivalent) and defaulting to an empty history when uninitialized.

Suggested change:

    let sessionHistory: Content[] = [];
    try {
      sessionHistory = config.getGeminiClient().getHistory() ?? [];
    } catch {
      // If the Gemini client is not initialized yet or history retrieval fails,
      // fall back to an empty session history.
      sessionHistory = [];
    }

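
The suggestion above uses `try`/`catch`; the comment's other option, an explicit initialization check, could look roughly like the sketch below. The `HistoryClient` interface and `getSessionHistorySafely` helper are hypothetical, and whether the real client exposes `isInitialized()` with this shape is an assumption taken from the comment's wording.

```typescript
// Simplified stand-in for the SDK's Content type (assumption for this sketch).
interface Content {
  role: string;
  parts?: Array<{ text?: string }>;
}

// Hypothetical minimal view of the client: only the two methods used here.
interface HistoryClient {
  isInitialized(): boolean;
  getHistory(): Content[];
}

// Return the session history only when the client is ready; otherwise fall
// back to an empty history instead of letting getHistory() throw.
function getSessionHistorySafely(client: HistoryClient): Content[] {
  return client.isInitialized() ? client.getHistory() : [];
}
```

Compared with a blanket `catch`, an explicit guard only covers the uninitialized case, so unrelated failures inside `getHistory()` would still surface.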

    const sessionHistory = config.getGeminiClient()?.getHistory() ?? [];

    history = [
      {
        role: 'user',
        parts: [{ text: envContextString }],
      },
      {
        role: 'model',
        parts: [{ text: 'Got it. Thanks for the context!' }],
      },
      ...sessionHistory,
      ...(extraHistory ?? []),
    ];

When including sessionHistory, the code prepends a fresh environment context + canned model ack and then appends the full sessionHistory. Since sessionHistory already begins with the previous environment context + ack, this will duplicate those messages (and may re-inject stale folder structure). Consider dropping/replacing the initial env+ack from sessionHistory before concatenation so the resulting history contains exactly one env context block.
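
One way to end up with exactly one environment block is to strip the leading pair from `sessionHistory` before concatenating. The sketch below is illustrative only: `mergeWithFreshEnvContext` is a hypothetical helper, `Content` is a simplified stand-in type, and detecting the old block by matching the canned acknowledgement text is an assumed heuristic.

```typescript
// Simplified stand-in for the SDK's Content type (assumption for this sketch).
interface Content {
  role: string;
  parts?: Array<{ text?: string }>;
}

// Canned acknowledgement text, copied from the snippet under review.
const ENV_ACK_TEXT = 'Got it. Thanks for the context!';

function mergeWithFreshEnvContext(
  envContextString: string,
  sessionHistory: Content[],
  extraHistory: Content[] = [],
): Content[] {
  const [first, second] = sessionHistory;
  // Assumed heuristic: the old env block is a user message followed by a
  // model message containing the canned acknowledgement.
  const startsWithEnvBlock =
    first?.role === 'user' &&
    second?.role === 'model' &&
    second.parts?.some((part) => part.text === ENV_ACK_TEXT) === true;
  const rest = startsWithEnvBlock ? sessionHistory.slice(2) : sessionHistory;

  // Build a new array; the input arrays are not modified.
  return [
    { role: 'user', parts: [{ text: envContextString }] },
    { role: 'model', parts: [{ text: ENV_ACK_TEXT }] },
    ...rest,
    ...extraHistory,
  ];
}
```

A more robust variant might tag the environment messages explicitly rather than matching on text, but that would require changes outside the code shown in this PR.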

    if (availableChars > 0 && availableChars < firstItemText.length) {
      firstItem.parts = [
        {
          text:
            firstItemText.slice(0, availableChars) +
            '... [truncated due to token budget]',
        },
      ];
    }
    return firstItem ? [firstItem] : [];

truncateContentToTokenBudget mutates Content objects in-place (e.g., assigning firstItem.parts = ...). If the input contents includes references to the live chat history (like GeminiClient.getHistory()), this can permanently alter the main conversation history. Consider treating inputs as immutable: clone any Content you plan to modify and return a new array with new objects/parts.

Suggested change:

    let truncatedText = firstItemText;
    if (availableChars > 0 && availableChars < firstItemText.length) {
      truncatedText =
        firstItemText.slice(0, availableChars) +
        '... [truncated due to token budget]';
    }
    // Clone the first item to avoid mutating the original content object.
    const truncatedFirstItem: Content = {
      ...firstItem,
      parts: [
        {
          ...(firstItem.parts?.[0] ?? {}),
          text: truncatedText,
        } as Part,
      ],
    };
    return firstItem ? [truncatedFirstItem] : [];


    const lastItem = truncatedItems[0];
    const lastText = lastItem.parts?.map((p) => p.text || '').join('') || '';

    const availableTokensForLast = maxTokens - firstItemTokens;
    const availableCharsForLast =
      availableTokensForLast * CHARS_PER_TOKEN_ESTIMATE;

    if (availableCharsForLast > 0 && availableCharsForLast < lastText.length) {
      const truncatedText =
        lastText.slice(0, availableCharsForLast) +
        '... [truncated due to token budget]';
      lastItem.parts = [{ text: truncatedText }];
    }

truncateContentToTokenBudget also mutates lastItem.parts when truncating the last remaining item. If lastItem comes from sessionHistory or extraHistory, this will modify the original objects outside this function. Consider cloning lastItem (and its parts) before truncating so callers don't observe unexpected side-effects.
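
A non-mutating version of this truncation step could be factored into a small helper, sketched below. `truncateItemText` is a hypothetical name and `Content` is a simplified stand-in type; the truncation condition and suffix are taken from the snippet under review.

```typescript
// Simplified stand-in for the SDK's Content type (assumption for this sketch).
interface Content {
  role: string;
  parts?: Array<{ text?: string }>;
}

const TRUNCATION_SUFFIX = '... [truncated due to token budget]';

// Return a truncated copy of `item` when its text exceeds `availableChars`;
// the input object and its parts array are never modified.
function truncateItemText(item: Content, availableChars: number): Content {
  const text = item.parts?.map((p) => p.text || '').join('') || '';
  if (availableChars <= 0 || availableChars >= text.length) {
    // Nothing to truncate: returning the same reference is safe because
    // this function does not touch it.
    return item;
  }
  return {
    ...item,
    parts: [{ text: text.slice(0, availableChars) + TRUNCATION_SUFFIX }],
  };
}
```

Both the first-item and last-item branches could call such a helper, which would keep live history returned by `getHistory()` untouched.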
- Add useCleanContext flag to skip session history inheritance
- Add maxContextTokens for context token budget enforcement
- Add useStructuredOutput for structured summary format
- Implement truncateContentToTokenBudget utility
- Update Task tool to support runtime config overrides
- Add comprehensive tests for new features
- Document advanced subagent configuration options

Fixes QwenLM#2332
62bd6a1 to d374125
@ossaidqadri Thanks for your contribution, great work done here!

Thanks for the contribution, @ossaidqadri! We appreciate the effort you put into this PR around subagent runtime configuration (clean context, token budgets, etc.). Unfortunately, the subagent interface has undergone significant changes since this PR was opened, and there are now substantial conflicts with the current codebase that would make merging impractical. We do plan to address the goals proposed here; we'll be opening new PRs that build on the current subagent architecture. Your work here has been helpful in informing the direction. Closing this for now, but thanks again for the contribution!

This pull request introduces advanced runtime configuration options for subagents, enabling more granular control over context management and output formatting. The changes add new fields to the subagent configuration, update the core logic to support these options, and enhance documentation and tests to reflect the new capabilities.
Key changes:

Subagent Runtime Configuration Enhancements
- Added new fields to `RunConfig`: `useCleanContext`, `maxContextTokens`, and `useStructuredOutput`. These allow subagents to start with a clean context, enforce a token budget, and produce structured output summaries, respectively. Also introduced the `SubagentStructuredSummary` interface for structured results.
- Updated the core logic (`SubagentManager`, `SubAgentScope`) to accept and apply runtime configuration overrides, and to pass these options to context initialization and prompt construction.

Task Tool and API Support
- Updated the `TaskParams` interface and `TaskTool` logic to accept and forward runtime configuration overrides, enabling dynamic customization of subagent behavior via the Task tool API.

Documentation Improvements

Test and Mock Adjustments

These changes provide users with greater flexibility and cost control when delegating tasks to subagents, and ensure consistent, structured output for downstream processing.
Closes #2332