Problem
When using OpenCode across multiple LLM providers (Claude, Gemini, Kimi, OpenRouter, etc.), there's no persistent memory between sessions or across models. Each conversation starts fresh, losing valuable context about:
- User preferences and coding style
- Project-specific decisions and architecture
- Previously solved problems and their solutions
- Domain knowledge accumulated over time
Proposed Solution
Add a model-agnostic memory layer to OpenCode that works transparently regardless of which LLM provider is being used.
Key Features
- Two memory scopes:
  - Global memory (~/.opencode/memory/) - personal preferences, cross-project knowledge
  - Project memory (.opencode/memory/) - codebase-specific context, architecture decisions
- Provider independence:
  - Use a local embedding model (e.g., Qwen3-Embedding-0.6B) - no API dependency
  - Local vector storage (LanceDB, SQLite, or similar)
  - Optional: configurable "memory LLM" separate from the main model (could be a cheap/fast model)
- Transparent operation:
  - Automatically extract relevant facts from conversations
  - Inject relevant context before each LLM call
  - Works the same whether using Claude, Gemini, Kimi, or any other provider
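The two scopes above could be resolved with a small helper (a sketch only; the directory names follow the proposal, while the function name and the precedence order, project before global, are assumptions):

```python
from pathlib import Path

def memory_dirs(project_root: str) -> list[Path]:
    """Return existing memory directories in precedence order.

    Project memory is listed before global memory (an assumption:
    project-specific context should shadow personal defaults).
    """
    project = Path(project_root) / ".opencode" / "memory"  # project scope
    home = Path.home() / ".opencode" / "memory"            # global scope
    return [d for d in (project, home) if d.is_dir()]
```

A retrieval step could then search each returned directory's vector index in order, letting project memories rank ahead of global ones.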
Example Architecture
User Query
│
▼
┌──────────────────────┐
│ Memory Retrieval │ ◄── Local embeddings + vector search
│ (relevant context) │
└──────────┬───────────┘
│
▼
┌──────────────────────┐
│ Context Injection │ ◄── Prepend relevant memories to prompt
└──────────┬───────────┘
│
▼
┌──────────────────────┐
│ LLM Call (any) │ ◄── Claude, Gemini, Kimi, etc.
└──────────┬───────────┘
│
▼
┌──────────────────────┐
│ Memory Extraction │ ◄── Extract facts from response (async)
│ (background task) │
└──────────────────────┘
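The flow above can be sketched as a provider-agnostic wrapper (a sketch under stated assumptions: `retrieve`, `call_llm`, and `extract_facts` are hypothetical stand-ins for the real retrieval, provider, and extraction components):

```python
import asyncio

async def answer(query: str, retrieve, call_llm, extract_facts) -> str:
    """Memory-augmented call: retrieve -> inject -> call -> extract.

    All three callables are hypothetical stand-ins: `retrieve` does the
    local vector search, `call_llm` hits whichever provider is active,
    and `extract_facts` runs fact extraction as a background task.
    """
    memories = retrieve(query)                      # local embeddings + vector search
    prompt = "\n".join(memories) + "\n\n" + query   # prepend relevant memories
    response = await call_llm(prompt)               # Claude, Gemini, Kimi, ...
    asyncio.create_task(extract_facts(query, response))  # async, non-blocking
    return response
```

Because injection happens on the prompt string and extraction on the response string, nothing here depends on which provider served the call.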
Use Cases
- Preference persistence: "I prefer functional style" remembered across all projects and models
- Project context: Architecture decisions stay available even when switching models mid-project
- Cross-session continuity: Resume work without re-explaining context
- Knowledge accumulation: Build up domain expertise over time
Configuration (example)
# opencode.toml
[memory]
enabled = true
global_enabled = true
project_enabled = true
# Optional: dedicated model for memory operations (cheap/fast)
extraction_model = "openai/gpt-4.1-mini"
# Local embedding model (no API needed)
embedding_model = "Qwen/Qwen3-Embedding-0.6B"
Prior Art / Inspiration
- SimpleMem (https://github.com/aiming-lab/SimpleMem) - Efficient lifelong memory for LLM agents using semantic compression
- Mem0 (https://github.com/mem0ai/mem0) - Memory layer for AI applications
Additional Context
This would be especially valuable for users who:
- Switch between providers based on task (coding vs reasoning vs speed)
- Work on multiple projects with different contexts
- Want continuity without provider lock-in