feat(core): enhanced anchored iterative context compression with self-verification #15710

Merged
NTaylorMullen merged 6 commits into google-gemini:main from
rmedranollamas:feature/factory-compression
Jan 20, 2026
Conversation

@rmedranollamas
Contributor

@rmedranollamas rmedranollamas commented Dec 30, 2025

Summary

This PR implements an enhanced context compression mechanism inspired by Factory.ai's research on evaluating context compression for AI agents. It shifts from simple summarization to a structured, anchored, and self-verifying distillation process.

Details

  • Structured XML Snapshot: Updated the <state_snapshot> schema in packages/core/src/core/prompts.ts to include:
    • <artifact_trail>: Tracks the evolution of critical files and symbols.
    • <active_constraints>: Explicitly preserves user-defined or discovered constraints.
    • <task_state>: Refined from 'plan' to emphasize continuity and the immediate next focus.
  • Anchored Iterative Logic: ChatCompressionService now detects existing snapshots in history and instructs the model to integrate and update them rather than starting from scratch, preventing 'information drift'.
  • Self-Verification Turn ('Probe'): Implemented a two-turn process. After the initial summary, a second turn asks the model to critique its own summary against the original history and provide a refined version if any technical details or constraints were omitted.
  • Improved Test Coverage: Updated existing tests and added new cases to verify the two-turn flow and anchored integration.
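
The anchored step described above can be sketched roughly as follows. This is a minimal illustration, not the code in ChatCompressionService: the `Turn` type, the function names, and the instruction wording are all assumptions made for the example. Only the tag names (`<state_snapshot>`, `<artifact_trail>`, `<active_constraints>`, `<task_state>`) come from the PR description.

```typescript
// Hypothetical shape of a history entry; the real service uses its own content types.
interface Turn {
  role: 'user' | 'model';
  text: string;
}

const SNAPSHOT_TAG = '<state_snapshot>';

// Walk the history from newest to oldest and return the most recent turn
// that already contains a snapshot, if any.
function findLatestSnapshot(history: Turn[]): string | undefined {
  for (let i = history.length - 1; i >= 0; i--) {
    if (history[i].text.includes(SNAPSHOT_TAG)) {
      return history[i].text;
    }
  }
  return undefined;
}

// Build the first-turn instruction. When an earlier snapshot exists, ask the
// model to update it in place (the "anchor") instead of summarizing from scratch.
function buildCompressionInstruction(history: Turn[]): string {
  const base =
    'Distill the conversation into a <state_snapshot> containing ' +
    '<artifact_trail>, <active_constraints>, and <task_state>.';
  if (findLatestSnapshot(history) === undefined) {
    return base;
  }
  return (
    base +
    ' A previous <state_snapshot> exists in the history: ' +
    'integrate and update it rather than starting from scratch.'
  );
}

// Second-turn ("probe") instruction: ask the model to check its own summary
// against the original history and refine it if something was omitted.
function buildProbeInstruction(): string {
  return (
    'Critique the snapshot above against the original history. ' +
    'If any technical details or constraints were omitted, output a refined <state_snapshot>.'
  );
}
```

Keeping the anchor check as a pure function over the history makes it straightforward to unit test without mocking a model call.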

Related Issues

Closes: #16500

How to Validate

Run the core unit tests:

npx vitest packages/core/src/services/chatCompressionService.test.ts

Pre-Merge Checklist

  • Updated relevant documentation and README (if needed)
  • Added/updated tests (if needed)
  • Noted breaking changes (if any)
  • Validated on required platforms/methods:
    • Linux
      • npm run

@gemini-code-assist
Contributor

Summary of Changes

Hello @rmedranollamas, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly upgrades the core context compression capabilities for AI agents, introducing a more robust and intelligent system for managing conversational state. It integrates a self-correction mechanism and anchored updates to existing snapshots, ensuring critical information is retained and refined over time. Additionally, it provides new configuration options for automatic management of MCP servers and experimental model routing, enhancing the system's operational resilience and flexibility.

Highlights

  • Enhanced Context Compression: Implemented a new context compression mechanism for AI agents, moving beyond simple summarization to a structured, anchored, and self-verifying distillation process, inspired by Factory.ai's research.
  • Structured XML Snapshot Schema Update: The <state_snapshot> schema in packages/core/src/core/prompts.ts has been updated to include new elements like <artifact_trail> (for tracking file/symbol evolution), <active_constraints> (for preserving explicit rules), and a refined <task_state>.
  • Anchored Iterative Logic: The ChatCompressionService now intelligently detects existing state snapshots in the conversation history and instructs the model to integrate and update them, preventing 'information drift' rather than generating new summaries from scratch.
  • Self-Verification Turn ('Probe'): A two-turn compression process has been introduced where, after an initial summary, the model is prompted to critique its own output against the original history and provide a refined version if any critical details were omitted.
  • MCP Server Auto-Restart Configuration: New configuration options (mcp.autoRestart.enabled, mcp.autoRestart.healthCheckIntervalMs, mcp.autoRestart.unhealthyTimeoutMs) have been added to manage the automatic restarting of MCP servers based on health checks.
  • Experimental Model Router Settings: New experimental settings (experimental.modelRouter.enabled, experimental.modelRouter.simpleTaskModel, experimental.modelRouter.complexTaskModel) are introduced to enable and configure model routing based on task complexity.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces significant enhancements to context compression, including a structured XML snapshot, anchored iteration, and self-verification. It also adds auto-restarting for MCP servers and an experimental model router. My review focuses on potential race conditions and resource management. I've identified a critical issue in the MCP client manager related to unsafe iteration and a high-severity issue in the chat compression service concerning uncancellable background tasks. Addressing these will improve the robustness and efficiency of the new features.

Comment thread packages/core/src/tools/mcp-client-manager.ts Outdated
Comment thread packages/core/src/services/chatCompressionService.ts Outdated
@rmedranollamas rmedranollamas force-pushed the feature/factory-compression branch from b14ada6 to 5cf14a8 Compare December 30, 2025 13:37
@rmedranollamas rmedranollamas marked this pull request as ready for review December 30, 2025 13:46
@rmedranollamas rmedranollamas requested a review from a team as a code owner December 30, 2025 13:46
@gemini-cli gemini-cli Bot added the status/need-issue Pull requests that need to have an associated issue. label Jan 7, 2026
@mrcabbage972
Contributor

mrcabbage972 commented Jan 9, 2026

Very interesting PR! Can you please open a corresponding issue for tracking?

Also, it seems that the PR combines a few different issues: memory compression, MCP auto-restart, and model router configuration. It would be easier to review if they were split into individual PRs.

@mrcabbage972 mrcabbage972 added the area/agent Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality label Jan 9, 2026
@rmedranollamas rmedranollamas force-pushed the feature/factory-compression branch from 611b187 to b3f1342 Compare January 13, 2026 10:05
@rmedranollamas
Contributor Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces an enhanced context compression mechanism with a two-turn self-verification process and anchored iterative logic, which is a significant improvement. However, a high-severity prompt injection vulnerability has been identified in the new chat compression logic, where direct use of chat history in prompts could allow an attacker to manipulate the agent's memory and behavior. Additionally, there are two high-severity issues: one related to handling potentially empty summaries that could corrupt chat history, and another concerning the lack of cancellation support for the compression process, potentially leading to resource leaks and race conditions. Addressing these points will make the new feature more robust, reliable, and secure.

Comment thread packages/core/src/services/chatCompressionService.ts Outdated
@rmedranollamas
Contributor Author

Filed #16500 for this work and cleaned up the PR. Rebased, and the tests pass.

@bdmorgan
Collaborator

Hi @rmedranollamas, thank you so much for your contribution to Gemini CLI! We really appreciate the time and effort you've put into this.

We're making some updates to our contribution process to improve how we track and review changes. Please take a moment to review our recent discussion post: Improving Our Contribution Process & Introducing New Guidelines.

Key Update: Starting January 26, 2026, the Gemini CLI project will require all pull requests to be associated with an existing issue. Any pull requests not linked to an issue by that date will be automatically closed.

Thank you for your understanding and for being a part of our community!

@rmedranollamas
Contributor Author

Attached to #16500.

Comment thread packages/core/src/core/prompts.ts
@NTaylorMullen
Collaborator

Looks like you may have some test failures here too

@rmedranollamas rmedranollamas force-pushed the feature/factory-compression branch from ffbc0ef to 380afed Compare January 19, 2026 10:50
@rmedranollamas
Contributor Author

/gemini review

rmedranollamas added a commit to rmedranollamas/gemini-cli that referenced this pull request Jan 19, 2026
- Add security warnings against prompt injection in compression prompt.
- Clarify artifact_trail purpose in compression prompt.
- Fix unsafe iteration in McpClientManager.restart.
- Ensure all chatCompressionService tests use correct tokenLimit mocks.
@rmedranollamas
Contributor Author

I've addressed the review comments:

  • Security: Added explicit warnings against prompt injection in the compression prompt.
  • Robustness: Added checks for empty/whitespace summaries to prevent context corruption.
  • Cancellation: Ensured abortSignal is correctly propagated through all stages of compression.
  • MCP Client Manager: Fixed an unsafe iteration in restart() by using Array.from(this.clients.keys()).
  • Clarity: Updated the artifact_trail description in the prompt to clarify it tracks evolution and reasoning, as suggested by @NTaylorMullen.
  • Tests: Fixed some test cases that were incorrectly assuming token limits.
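
The unsafe-iteration fix mentioned above can be illustrated in isolation. This is only a sketch: `FakeClient` and `restartAll` are hypothetical stand-ins, and the real restart() in mcp-client-manager.ts is presumably asynchronous and does more work. The point it shows is that a JavaScript `Map` iterator is live, so an entry that is deleted and re-added during iteration is appended to the end and visited again. Snapshotting the keys with `Array.from(...)` avoids that.

```typescript
// Hypothetical stand-in for an MCP client; not the real class.
interface FakeClient {
  name: string;
  generation: number; // how many times this client has been recreated
}

// Recreate every client exactly once and return the names that were restarted.
function restartAll(clients: Map<string, FakeClient>): string[] {
  const restarted: string[] = [];
  // Snapshot the keys before mutating the map. Looping over clients.keys()
  // directly would revisit each re-added entry and never terminate.
  for (const name of Array.from(clients.keys())) {
    const old = clients.get(name);
    if (!old) {
      continue; // removed by someone else in the meantime
    }
    clients.delete(name);
    clients.set(name, { name, generation: old.generation + 1 });
    restarted.push(name);
  }
  return restarted;
}

const demoClients = new Map<string, FakeClient>([
  ['files', { name: 'files', generation: 0 }],
  ['search', { name: 'search', generation: 0 }],
]);
const restartedNames = restartAll(demoClients);
```

After the call, each entry in `demoClients` has been replaced once, and `restartedNames` lists the names in their original insertion order.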

All tests passed locally.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a sophisticated enhancement to the chat history compression mechanism, incorporating an anchored, iterative approach with a self-verification step. The changes are well-implemented, with a more structured XML snapshot format and corresponding updates to the compression service and tests. The two-turn verification process is a clever way to improve summary quality. I've identified one high-severity issue in how the final summary is determined, which could lead to discarding a valid summary if the verification step returns only whitespace. Overall, this is a great improvement.

Comment thread packages/core/src/services/chatCompressionService.ts
@rmedranollamas
Contributor Author

Fixed the subtle bug in finalSummary logic where a whitespace-only verification response would cause compression to fail instead of falling back to the initial valid summary. Added a regression test for this case.
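
The fallback behaviour described above might look something like the following. This is a sketch under assumptions: `pickFinalSummary` is a hypothetical helper, not the function in chatCompressionService.ts, and returning `null` here simply stands in for whatever the service does when no usable summary exists.

```typescript
// Choose the summary to keep after the two-turn flow.
// - Prefer the verification turn's output when it has real content.
// - Fall back to the initial summary when verification is empty or whitespace.
// - Return null when neither turn produced usable text, so the caller can
//   treat compression as failed instead of writing an empty summary to history.
function pickFinalSummary(
  initialSummary: string | undefined,
  verifiedSummary: string | undefined,
): string | null {
  const verified = (verifiedSummary ?? '').trim();
  if (verified.length > 0) {
    return verified;
  }
  const initial = (initialSummary ?? '').trim();
  return initial.length > 0 ? initial : null;
}
```

Trimming before the emptiness check is what distinguishes a whitespace-only response from a genuine one; a plain truthiness test on the raw string would accept `'   '` as a valid summary.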

@gemini-cli gemini-cli Bot removed the status/need-issue Pull requests that need to have an associated issue. label Jan 19, 2026
NTaylorMullen pushed a commit to rmedranollamas/gemini-cli that referenced this pull request Jan 19, 2026
- Add security warnings against prompt injection in compression prompt.
- Clarify artifact_trail purpose in compression prompt.
- Fix unsafe iteration in McpClientManager.restart.
- Ensure all chatCompressionService tests use correct tokenLimit mocks.
@NTaylorMullen NTaylorMullen force-pushed the feature/factory-compression branch from 50fbdce to f3b47eb Compare January 19, 2026 22:10
@rmedranollamas
Contributor Author

Fixed integration test failures by:

  1. Adding telemetry logging for the EMPTY_SUMMARY failure case.
  2. Updating integration test mocks to provide responses for both the initial summarization and the new verification turn.

auto-merge was automatically disabled January 20, 2026 07:59

Head branch was pushed to by a user without write access

… with self-verification turn based on Factory.ai research
- Add security warnings against prompt injection in compression prompt.
- Clarify artifact_trail purpose in compression prompt.
- Fix unsafe iteration in McpClientManager.restart.
- Ensure all chatCompressionService tests use correct tokenLimit mocks.
…ication is empty

- Address gemini-code-assist review comment regarding subtle fallback bug.
- Add regression test for empty verification response.
@rmedranollamas
Contributor Author

Rebased onto main and verified. Ready for another look! @NTaylorMullen

@rmedranollamas rmedranollamas force-pushed the feature/factory-compression branch from db5d831 to a5ce377 Compare January 20, 2026 08:15
@NTaylorMullen NTaylorMullen added this pull request to the merge queue Jan 20, 2026
Merged via the queue into google-gemini:main with commit 1182168 Jan 20, 2026
25 checks passed
@rmedranollamas rmedranollamas deleted the feature/factory-compression branch January 20, 2026 08:55
Thomas-Shephard pushed a commit to Thomas-Shephard/gemini-cli that referenced this pull request Jan 21, 2026
thacio added a commit to thacio/auditaria that referenced this pull request Jan 24, 2026
kuishou68 pushed a commit to iOfficeAI/aioncli that referenced this pull request Feb 27, 2026
Labels

area/agent Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality

Development

Successfully merging this pull request may close these issues.

enhanced anchored iterative context compression with self-verification

4 participants