Skip to content

feat: automatically compress long-task context based on call count and context length #219

@huangrichao2020

Description

@huangrichao2020

Scenario

I hit this pitfall directly.

Long agent tasks often involve many rounds of tool calls: browsing web pages, reading files, running commands, editing code, and validating again. Context grows quickly, especially when tool results and thinking/tool_use structures are long. Without automatic compression, the final symptoms are often model 400 errors, empty responses, slowness, or the model starting to forget the current goal.

Current Pain Points

  • Simple truncation loses important task chains.
  • Trimming only by message count is not enough, because a single tool result may be very large.
  • Waiting until the provider reports a context-limit error is too late.
  • There is no compression trigger based on call count, context length, or tool-result length.

Suggested Direction

Add a layered compression strategy:

  • Triggers: LLM call count, estimated token/character count, tool-result length, and message count.
  • Compression targets: old thinking, tool_use, tool_result, web-page output, and long command output.
  • Preserved content: recent turns, the original user goal, current plan, unfinished TODOs, key file paths, and diff summaries.
  • The compressed output should ideally be structured: Goal / Completed / Current State / TODO / Key Evidence / Risks.
  • If compression fails, fall back to safe truncation and record it in logs.

A lightweight direction is to first compress old <thinking>/<tool_use>/<tool_result> blocks at the tag level, then force deeper compression when the context crosses a threshold. Longer term, this could evolve into a real context engine.

Acceptance Criteria

  • A synthetic 50+ tool-call long task should not directly fail with 400 because of context growth.
  • After compression, the user goal, current plan, and recent tool results are still preserved.
  • Logs show when compression was triggered and the before/after size.
  • Compression thresholds are configurable.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions