- [x] Feature hasn't been suggested before.

### Describe the enhancement you want to request

### Type

Performance Optimization / Token Efficiency

### Description
Currently, OpenCode injects the entire `input_schema` of every connected MCP tool into the LLM context at the start of each session. While this follows a literal reading of the MCP protocol, it imposes a heavy "Token Tax," especially with complex MCP servers.

I ran a comparative test between OpenCode and Cursor using an MCP server with a large `input_schema`:
| Behavior | OpenCode | Cursor |
| --- | --- | --- |
| Initial Context | Full schema loaded immediately | Minimal metadata (name/description) only |
| On Tool Call | N/A | Fetches `input_schema` just-in-time |
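As a rough illustration of where the tokens go, one can measure how much of a serialized tool listing is schema versus metadata. The sketch below uses a made-up tool (not the actual `lark-mcp-docx` definition), with a stand-in schema of 200 fields:

```typescript
// Rough illustration: how much of a serialized MCP tool listing is schema
// versus metadata. The tool shape mirrors an MCP tools/list entry; the
// example tool and its schema are invented.

interface McpTool {
  name: string;
  description: string;
  inputSchema: object; // JSON Schema; can run to thousands of tokens
}

const tool: McpTool = {
  name: "docx_create",
  description: "Create a new document",
  inputSchema: {
    type: "object",
    properties: Object.fromEntries(
      // Stand-in for a genuinely large schema: 200 string fields.
      Array.from({ length: 200 }, (_, i) => [`field_${i}`, { type: "string" }])
    ),
  },
};

const fullBytes = JSON.stringify(tool).length;
const metaBytes = JSON.stringify({
  name: tool.name,
  description: tool.description,
}).length;

// The schema dominates the payload; only name + description are needed
// for the model to decide whether the tool is relevant at all.
console.log(`schema share: ${(100 * (1 - metaBytes / fullBytes)).toFixed(1)}%`);
```

The exact ratio depends on the server, but for schema-heavy servers the metadata is a tiny fraction of what gets injected today.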
### The Problem
As discussed in previous issues like #2841, the current implementation causes the system prompt to grow linearly with the number and complexity of MCP tools. For power users with many MCP servers, this makes the tool almost unusable due to:
- **High Token Costs**: the same large schemas are paid for on every single turn
- **Context Exhaustion**: large schemas leave less room for actual code and reasoning
- **Model Confusion**: overloading the prompt with irrelevant schemas can lead to hallucinations
### Past Evidence of Context Bloat (Screenshots Attached)

I have attached a comparison showing the extreme token consumption caused by connecting a single MCP server (`lark-mcp-docx`):
| Configuration | Token Count | Overhead |
| --- | --- | --- |
| Without MCP | ~21k tokens | – |
| With MCP Enabled | ~168k tokens | +147k tokens |
This demonstrates that the entire schema set is injected into the system prompt, consuming roughly 87% of the occupied context (147k of ~168k tokens) before a single user message is even processed. This makes the tool impractical for large-scale MCP integrations.
### Steps to Reproduce

1. Connect an MCP server with a very large/complex `input_schema`.
2. Start a session and ask a simple question such as "How is the weather?"
3. Inspect the token/context usage. OpenCode shows high token usage because it loaded every tool definition in full, whereas Cursor remains lean.
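For reproduction, wiring up a local MCP server in `opencode.json` looks roughly like the fragment below. The server name and command are placeholders, and the exact keys may differ by OpenCode version, so treat this as a sketch rather than authoritative config:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "my-large-server": {
      "type": "local",
      "command": ["npx", "-y", "my-large-mcp-server"]
    }
  }
}
```

Any server whose tools carry large JSON Schemas will reproduce the token growth.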
### Proposed Solution

Adopt a "Lazy Loading" or "Two-Step Discovery" mechanism similar to Cursor or the latest Claude Code updates:

- **Initial Context**: send only the name and a short description of each MCP tool
- **Just-in-Time Injection**: when the LLM's intent matches a tool's description, the client injects the detailed `input_schema` for that specific tool before the final inference step
- **Toggle**: provide a setting to lazy-load specific MCP servers
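The two-step flow above could be sketched as follows. All names here are hypothetical, not OpenCode's actual API; the point is only that the client keeps the full definitions locally and promotes a schema into the context on demand:

```typescript
// Hypothetical sketch of two-step tool discovery: the initial context gets
// metadata only, and the full schema is resolved just-in-time when the
// model signals intent to call a specific tool.

interface ToolMeta {
  name: string;
  description: string;
}

interface ToolFull extends ToolMeta {
  inputSchema: object;
}

class LazyToolRegistry {
  constructor(private tools: ToolFull[]) {}

  // Step 1: what goes into the initial system prompt — metadata only.
  initialContext(): ToolMeta[] {
    return this.tools.map(({ name, description }) => ({ name, description }));
  }

  // Step 2: when the model picks a tool, inject its full schema before
  // the final inference step.
  resolve(name: string): ToolFull {
    const tool = this.tools.find((t) => t.name === name);
    if (!tool) throw new Error(`unknown tool: ${name}`);
    return tool;
  }
}

const registry = new LazyToolRegistry([
  { name: "docx_create", description: "Create a document", inputSchema: { type: "object" } },
]);
console.log(registry.initialContext()); // schemas omitted from the initial prompt
```

The per-server toggle could then simply decide whether a server's tools go through this registry or are injected in full, preserving today's behavior for small servers.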
### Additional Context
Addressing this would bring OpenCode's MCP efficiency on par with Cursor and Windsurf, making it much more viable for large-scale enterprise MCP integrations.