feat: Context Mode v2 - Tiered Output Routing #9

Open
effortprogrammer wants to merge 2 commits into main from feature/context-mode-v2

@effortprogrammer
Owner

Summary

Reduces tool output context bloat while preserving information access through tiered routing:

| Tier | Threshold | Action |
|------|-----------|--------|
| L0 | < 1 KB | Pass through unchanged |
| L1 | 1-10 KB | Algorithmic summary + file link |
| L2 | > 10 KB | Agent delegation (falls back to L1) |
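
As a rough illustration of the tier table, size-based selection might look like the sketch below. The names `Tier` and `select_tier` are hypothetical and not taken from context_router.py, and measuring size in UTF-8 bytes (rather than characters) is an assumption.

```python
from enum import Enum

# Defaults inferred from the tier table; the real module may differ.
L0_THRESHOLD = 1024        # outputs below 1 KB pass through
L1_THRESHOLD = 10 * 1024   # outputs above 10 KB go to L2


class Tier(Enum):
    L0 = "L0"
    L1 = "L1"
    L2 = "L2"


def select_tier(output: str, l0: int = L0_THRESHOLD, l1: int = L1_THRESHOLD) -> Tier:
    """Pick a tier from the encoded size of a tool output (hypothetical helper)."""
    size = len(output.encode("utf-8"))  # assumption: thresholds are in bytes
    if size < l0:
        return Tier.L0
    if size <= l1:
        return Tier.L1
    return Tier.L2
```

Counting bytes rather than characters means multi-byte text such as Korean reaches a threshold sooner than ASCII of the same length.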

Key Changes

  • context_router.py: Core routing logic (ContextStore, OutputRouter, StreamProcessor)
  • Integration hook in opencode_gateway_server.py streaming loop
  • Configurable via CONTEXT_MODE_* environment variables
  • Original outputs preserved in /tmp/ctx/ for drill-down access

How It Works

  1. Tool outputs are routed based on size
  2. L1/L2 outputs are saved to /tmp/ctx/{tool}_{hash}.txt
  3. A summary replaces the full output in context
  4. Model can still access full output via cat /tmp/ctx/...
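
The steps above can be sketched as follows. This is only an illustration: `store_output` and `summarize` are hypothetical helpers, the hash function and its truncated length are guesses, and the head/tail excerpt stands in for whatever "algorithmic summary" the module actually produces.

```python
import hashlib
import tempfile
from pathlib import Path


def store_output(tool: str, output: str, root: Path = Path("/tmp/ctx")) -> Path:
    """Save the full output as {tool}_{hash}.txt under root and return its path."""
    # Assumption: a short SHA-256 prefix; the PR does not say which hash is used.
    digest = hashlib.sha256(output.encode("utf-8")).hexdigest()[:12]
    root.mkdir(parents=True, exist_ok=True)
    path = root / f"{tool}_{digest}.txt"
    path.write_text(output, encoding="utf-8")
    return path


def summarize(tool: str, output: str, path: Path, head: int = 5, tail: int = 3) -> str:
    """Build a short stand-in for the output: a header, first lines, last lines."""
    lines = output.splitlines()
    if len(lines) <= head + tail:
        excerpt = lines
    else:
        omitted = len(lines) - head - tail
        excerpt = lines[:head] + [f"... ({omitted} lines omitted) ..."] + lines[-tail:]
    size = len(output.encode("utf-8"))
    header = f"[{tool}] {len(lines)} lines, {size} bytes. Full output: cat {path}"
    return "\n".join([header, *excerpt])


if __name__ == "__main__":
    # The demo writes to a throwaway directory instead of /tmp/ctx.
    with tempfile.TemporaryDirectory() as tmp:
        full = "\n".join(f"match {i}: src/file_{i}.py" for i in range(200))
        saved = store_output("grep", full, Path(tmp) / "ctx")
        print(summarize("grep", full, saved))
```

Because the file name is derived from the content hash, identical outputs from the same tool map to the same file in this sketch.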

Configuration

# Disable
export CONTEXT_MODE_DISABLED=1

# Adjust thresholds
export CONTEXT_MODE_L0_THRESHOLD=2048
export CONTEXT_MODE_L1_THRESHOLD=20480

# Force specific tools to tiers
export CONTEXT_MODE_TOOL_OVERRIDES="glob:L0,playwright:L2"
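
For illustration, the variables above could be read into a config object roughly like this. `ContextModeConfig`, `parse_overrides`, and `load_config` are hypothetical names; the default thresholds are inferred from the tier table, and treating only the value `1` as "disabled" is an assumption.

```python
import os
from dataclasses import dataclass, field
from typing import Dict, Mapping

VALID_TIERS = {"L0", "L1", "L2"}


@dataclass
class ContextModeConfig:
    disabled: bool = False
    l0_threshold: int = 1024
    l1_threshold: int = 10 * 1024
    tool_overrides: Dict[str, str] = field(default_factory=dict)


def parse_overrides(raw: str) -> Dict[str, str]:
    """Parse a 'tool:TIER,tool:TIER' string such as 'glob:L0,playwright:L2'."""
    overrides: Dict[str, str] = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty input and trailing commas
        tool, sep, tier = item.partition(":")
        tier = tier.strip().upper()
        if not sep or tier not in VALID_TIERS:
            raise ValueError(f"invalid tool override: {item!r}")
        overrides[tool.strip()] = tier
    return overrides


def load_config(env: Mapping[str, str] = os.environ) -> ContextModeConfig:
    """Build the config from CONTEXT_MODE_* variables (hypothetical loader)."""
    return ContextModeConfig(
        disabled=env.get("CONTEXT_MODE_DISABLED", "") == "1",
        l0_threshold=int(env.get("CONTEXT_MODE_L0_THRESHOLD", "1024")),
        l1_threshold=int(env.get("CONTEXT_MODE_L1_THRESHOLD", "10240")),
        tool_overrides=parse_overrides(env.get("CONTEXT_MODE_TOOL_OVERRIDES", "")),
    )
```

Passing the environment as a mapping keeps the loader testable without mutating `os.environ`.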

Synergy with Tool Selection

This complements mcpflow-router's existing tool selection:

| mcpflow-router | Context Mode v2 |
|----------------|-----------------|
| Reduces tool definitions | Reduces tool outputs |
| Input optimization | Output optimization |

Together: bidirectional context optimization

Reduces tool output context bloat while preserving information access:

- L0 (< 1KB): Pass through unchanged
- L1 (1-10KB): Algorithmic summary + file link
- L2 (> 10KB): Agent delegation (falls back to L1)

Key changes:
- New context_router.py module with ContextStore, OutputRouter, StreamProcessor
- Integration hook in opencode_gateway_server.py streaming loop
- Configurable via CONTEXT_MODE_* environment variables
- Original outputs preserved in /tmp/ctx/ for drill-down access

This complements tool selection (input optimization) with output optimization.

- Add SSE event buffering to handle chunk boundaries correctly
- Add threading locks for singleton initialization (double-check pattern)
- Add threading lock for ContextStore operations
- Use atomic write (temp file + rename) for index.json
- Clarify L2 behavior in docs (currently falls back to L1)
- Add comprehensive tests for:
  - Thread safety
  - Chunked event buffering
  - Unicode/Korean content handling
  - Agent function callback
  - Flush of incomplete events
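
To illustrate the chunk-boundary issue that the SSE buffering item addresses, here is a minimal sketch of an event buffer. `SSEEventBuffer` is a hypothetical class, not the StreamProcessor from this PR. It assumes events are separated by a blank line and handles LF and CRLF line endings only.

```python
from typing import List


class SSEEventBuffer:
    """Accumulate raw stream chunks and return only complete SSE events."""

    def __init__(self) -> None:
        self._pending = b""

    def feed(self, chunk: bytes) -> List[str]:
        """Add a chunk; return every event completed by it, in order."""
        # Normalize CRLF so the event delimiter is always a blank line (b"\n\n").
        data = (self._pending + chunk).replace(b"\r\n", b"\n")
        # Everything after the last delimiter is an incomplete event: keep it.
        *complete, self._pending = data.split(b"\n\n")
        # Decoding whole events avoids breaking multi-byte UTF-8 characters
        # that were split across chunk boundaries.
        return [event.decode("utf-8") for event in complete if event]

    def flush(self) -> List[str]:
        """Return any trailing incomplete event when the stream ends."""
        rest, self._pending = self._pending, b""
        return [rest.decode("utf-8", errors="replace")] if rest.strip() else []
```

Buffering bytes rather than decoded text is what makes Unicode content such as Korean safe here: a character split across two chunks is only decoded once its event is complete.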