Memory you can trust.
A memory system for AI applications that preserves ground truth, tracks confidence, and prevents hallucinations.
AI memory systems have an accuracy crisis:
"All systems achieve answer accuracies below 70%, with hallucination rate and omission rate remaining high."
Why? Existing systems (Mem0, basic RAG, LangChain memory) lose source data after LLM extraction. When extraction errors occur, there's no recovery path. The original truth is gone.
| Traditional Memory | Engram |
|---|---|
| Embed immediately, discard source | Store verbatim first, derive later |
| Single similarity score | Composite confidence (extraction + corroboration + recency + verification) |
| No error recovery | Re-derive from source when errors occur |
| Flat retrieval (top-K) | Multi-hop reasoning via bidirectional links |
| No contradiction handling | Negation tracking filters outdated facts |
| No auditability | Full provenance — trace any memory to source |
Episodes (raw interactions) are immutable. All derived memories maintain source_episode_ids for traceability. If extraction errors occur, re-derive from source — the truth is never lost.
```python
# Every derived memory traces back to ground truth
verified = await engram.verify(memory_id, user_id="user_123")
print(verified.source_episodes)  # Original verbatim content
print(verified.explanation)      # How confidence was calculated
```

Confidence is not just cosine similarity. It's a composite score you can explain:
```text
Confidence: 0.73
├── Extraction method: 0.9 (regex pattern match)
├── Corroboration: 0.6 (3 supporting sources)
├── Recency: 0.8 (confirmed 2 months ago)
└── Verification: 1.0 (format validated)
```
Filter by confidence for high-stakes queries: min_confidence=0.8
Fast writes, smart background processing:
```text
Encode (immediate, <100ms):
  User input → Episode (verbatim) + StructuredMemory (regex extraction)

Consolidate (background):
  N Episodes → LLM synthesis → SemanticMemory (with links to similar memories)
```
Benefits: Low latency, batched LLM costs, error recovery possible.
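The fast path can stay under 100ms because structured extraction is deterministic pattern matching, with no LLM in the loop. A minimal sketch of that idea (the patterns are illustrative, not Engram's actual extractors):

```python
import re

# Illustrative patterns only; a production extractor would be more robust
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
URL_RE = re.compile(r"https?://\S+")

def extract_structured(text: str) -> dict:
    """Fast, deterministic extraction: no LLM call in the hot path."""
    return {
        "emails": EMAIL_RE.findall(text),
        "urls": URL_RE.findall(text),
    }

print(extract_structured("Reach me at john@example.com, docs at https://example.com/docs"))
```

Because extraction is a pure function of the verbatim episode, it can be re-run at any time if a pattern is later improved.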
Memories link to related memories bidirectionally (A-MEM research shows 2x improvement on multi-hop benchmarks):
```python
# Follow links for deeper context
results = await engram.recall(
    query="What database?",
    user_id="user_123",
    follow_links=True,  # Traverse related memories
    max_hops=2,
)
```

Negation tracking records what isn't true to prevent returning outdated information:
```text
Episode 1: "I use MongoDB"     → preference: MongoDB
Episode 2: "Switched to Redis" → preference: Redis, negation: "no longer uses MongoDB"

Query:  "What database?"
Result: Redis (MongoDB filtered by negation)
```
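Conceptually, negation filtering drops any candidate whose claim is contradicted by a later-recorded negation. A simplified sketch (the data model here is hypothetical, not Engram's actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class Memory:
    claim: str                                       # e.g. "preference: Redis"
    negates: set[str] = field(default_factory=set)   # claims this memory contradicts

def filter_negated(candidates: list) -> list:
    """Remove candidates whose claim is negated by any other candidate."""
    if not candidates:
        return []
    negated = set().union(*(m.negates for m in candidates))
    return [m for m in candidates if m.claim not in negated]

mongo = Memory("preference: MongoDB")
redis = Memory("preference: Redis", negates={"preference: MongoDB"})
print([m.claim for m in filter_negated([mongo, redis])])  # ['preference: Redis']
```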
Memories strengthen through retrieval (Roediger & Karpicke, 2006: tested group forgot only 13% vs 52% for study-only):
```python
# Memories used more often become more stable
memory.consolidation_strength  # 0.0-1.0, increases with use
memory.consolidation_passes    # How many times refined
```

```bash
git clone https://github.com/ashita-ai/engram.git
cd engram
uv sync --extra dev

# Start Qdrant (vector database)
docker run -p 6333:6333 qdrant/qdrant
```

```python
from engram.service import EngramService

async with EngramService.create() as engram:
    # Store interaction (immediate, preserves ground truth)
    result = await engram.encode(
        content="My email is john@example.com and I prefer PostgreSQL",
        role="user",
        user_id="user_123",
    )
    print(f"Episode: {result.episode.id}")
    print(f"Emails: {result.structured.emails}")  # ["john@example.com"]

    # Retrieve with confidence filtering
    memories = await engram.recall(
        query="What's the user's email?",
        user_id="user_123",
        min_confidence=0.7,
    )

    # Verify any memory back to source
    verified = await engram.verify(memories[0].memory_id, user_id="user_123")
    print(verified.explanation)

    # Run consolidation (N episodes → semantic memory, scoped to project)
    await engram.consolidate(user_id="user_123", org_id="my-project")
```

```bash
# Start the server
uv run uvicorn engram.api.app:app --port 8000

# Encode a memory
curl -X POST http://localhost:8000/api/v1/encode \
  -H "Content-Type: application/json" \
  -d '{"content": "My email is john@example.com", "role": "user", "user_id": "user_123"}'

# Recall memories
curl -X POST http://localhost:8000/api/v1/recall \
  -H "Content-Type: application/json" \
  -d '{"query": "email", "user_id": "user_123", "min_confidence": 0.7}'
```

```text
┌──────────────────────────────────────────────────────────────┐
│ ENCODE (Fast Path, <100ms)                                   │
│   Input → Episode (verbatim, immutable)                      │
│         → StructuredMemory (regex: emails, phones, URLs)     │
└──────────────────────────────────────────────────────────────┘
                               ↓
┌──────────────────────────────────────────────────────────────┐
│ CONSOLIDATE (Background, Deferred)                           │
│   N Episodes → LLM synthesis → SemanticMemory                │
│             → Link to similar memories (bidirectional)       │
│             → Strengthen linked memories (Testing Effect)    │
└──────────────────────────────────────────────────────────────┘
                               ↓
┌──────────────────────────────────────────────────────────────┐
│ RECALL (Multi-Signal Reranking)                              │
│   Query → Vector search (all types)                          │
│         → Negation filtering (remove contradicted)           │
│         → Confidence reranking (5 signals)                   │
│         → Multi-hop traversal (follow links)                 │
└──────────────────────────────────────────────────────────────┘
```
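The RECALL stage can be approximated as a filter-then-rerank pipeline. A schematic sketch (field names and the 50/50 blend of signals are hypothetical; Engram reranks over five signals):

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    similarity: float   # from vector search
    confidence: float   # composite confidence score
    negated: bool       # contradicted by a later memory?

def rerank(candidates: list, min_confidence: float = 0.0) -> list:
    """Drop negated and low-confidence hits, then rank by a blended score."""
    live = [c for c in candidates if not c.negated and c.confidence >= min_confidence]
    # Hypothetical blend: similarity and confidence weighted equally
    return sorted(live, key=lambda c: 0.5 * c.similarity + 0.5 * c.confidence, reverse=True)

hits = [
    Candidate("uses MongoDB", 0.92, 0.4, True),   # negated -> filtered out
    Candidate("uses Redis", 0.88, 0.85, False),
    Candidate("likes SQL", 0.70, 0.9, False),
]
print([c.text for c in rerank(hits, min_confidence=0.5)])  # ['uses Redis', 'likes SQL']
```

Note that the highest-similarity hit loses here: a strong vector match that has been negated never reaches the caller.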
| Type | Confidence | Purpose |
|---|---|---|
| Episode | 1.0 (verbatim) | Ground truth, immutable raw interactions |
| Structured | 0.9 (regex) / 0.8 (LLM) | Per-episode extraction (emails, phones, negations) |
| Semantic | Variable (0.6 base) | Cross-episode synthesis, LLM-consolidated |
| Procedural | Variable (0.6 base) | Behavioral patterns, long-term preferences |
| Feature | Engram | Mem0 | Zep/Graphiti | LangChain |
|---|---|---|---|---|
| Ground truth preservation | ✅ Immutable episodes | ❌ Lost after extraction | ✅ Episode subgraph | ❌ |
| Confidence tracking | ✅ Composite + auditable | ❌ | ❌ | ❌ |
| Error recovery | ✅ Re-derive from source | ❌ Permanent errors | ❌ | |
| Multi-hop reasoning | ✅ Bidirectional links | ✅ Graph | ✅ Graph | |
| Negation handling | ✅ Explicit filtering | ❌ | ❌ | ❌ |
| Consolidation strength | ✅ Testing Effect | ❌ | ❌ | ❌ |
When to use Engram: High-stakes applications requiring accuracy, auditability, and ground truth preservation (healthcare, legal, finance, enterprise assistants).
When to use alternatives: Rapid prototyping (LangChain), managed service needs (Mem0), temporal graph queries (Zep).
| Method | Endpoint | Description |
|---|---|---|
| POST | /encode | Store memory, extract structured data |
| POST | /encode/batch | Bulk import (up to 100 items) |
| POST | /recall | Semantic search with confidence filtering |
| GET | /memories/{id}/verify | Trace memory to source with explanation |
| GET | /memories/{id}/provenance | Full derivation chain |
| Method | Endpoint | Description |
|---|---|---|
| POST | /workflows/consolidate | N episodes → semantic memory |
| POST | /workflows/structure | LLM enrichment for episode |
| POST | /workflows/promote | Semantic → procedural synthesis |
| POST | /workflows/decay | Apply confidence decay |
See docs/api.md for complete reference.
```python
confidence = (
    extraction_method * 0.50 +  # VERBATIM=1.0, EXTRACTED=0.9, INFERRED=0.6
    corroboration * 0.25 +      # log scale: 1 source=0.5, 10 sources=1.0
    recency * 0.15 +            # exponential decay, 365-day half-life
    verification * 0.10         # 1.0 if format validated
) - contradiction_penalty       # -10% per contradiction, floor 0.1
```

Every score includes .explain() for auditability.
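Under the stated weights, the component scores can be sketched as follows. The helper formulas are inferred from the comments (log-scale corroboration anchored at 1 source = 0.5 and 10 sources = 1.0; exponential recency with a 365-day half-life), not taken from Engram's source:

```python
import math

def corroboration_score(n_sources: int) -> float:
    # log scale: 1 source -> 0.5, 10 sources -> 1.0 (one possible interpolation)
    return min(1.0, 0.5 + 0.5 * math.log10(max(n_sources, 1)))

def recency_score(days_old: float, half_life: float = 365.0) -> float:
    # exponential decay with a 365-day half-life
    return 0.5 ** (days_old / half_life)

def composite(extraction: float, n_sources: int, days_old: float,
              verified: bool, contradictions: int) -> float:
    score = (
        extraction * 0.50
        + corroboration_score(n_sources) * 0.25
        + recency_score(days_old) * 0.15
        + (1.0 if verified else 0.0) * 0.10
    ) - 0.10 * contradictions
    return max(score, 0.1)  # floor at 0.1

print(round(composite(0.9, 3, 60, True, 0), 2))
```

Because each component is computed separately before weighting, the breakdown shown earlier (extraction, corroboration, recency, verification) falls out of the same arithmetic.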
Engram is inspired by cognitive science research:
| Paper | Finding | Engram Implementation |
|---|---|---|
| Roediger & Karpicke (2006) | Retrieval slows forgetting (13% vs 52% after 1 week) | consolidation_strength, Testing Effect |
| A-MEM (2025) | 2x multi-hop improvement via linking | related_ids, bidirectional links |
| HaluMem (2025) | <70% accuracy without source preservation | Immutable episodes, verify() |
| Cognitive Workspace (2025) | 58.6% memory reuse vs 0% for naive RAG | Hierarchical memory tiers |
Note: Engram uses these concepts as engineering abstractions, not strict cognitive implementations.
| Variable | Default | Description |
|---|---|---|
| ENGRAM_QDRANT_URL | http://localhost:6333 | Qdrant connection |
| ENGRAM_EMBEDDING_PROVIDER | fastembed | Embedding backend |
| ENGRAM_AUTH_ENABLED | auto | Bearer token auth |
| ENGRAM_CONSOLIDATION_THRESHOLD | 10 | Episodes before auto-consolidation |
See docs/development.md for full configuration.
```bash
uv run pytest tests/ -v --no-cov   # Run tests (990+ tests)
uv run ruff check src/engram/      # Lint
uv run mypy src/engram/            # Type check
uv run pre-commit run --all-files  # All checks
```

Engram uses OpenAI for embeddings and LLM operations. Set either of these environment variables; Engram accepts both:
- OPENAI_API_KEY (standard OpenAI convention)
- ENGRAM_OPENAI_API_KEY (Engram-prefixed)
If both are set, ENGRAM_OPENAI_API_KEY takes precedence.
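The precedence rule amounts to a one-line lookup. A sketch of how a client might resolve it (note that `or` also treats an empty string as unset, which is usually the desired behavior for API keys):

```python
import os

def resolve_api_key():
    # ENGRAM_OPENAI_API_KEY wins when both are set
    return os.getenv("ENGRAM_OPENAI_API_KEY") or os.getenv("OPENAI_API_KEY")

os.environ["OPENAI_API_KEY"] = "sk-standard"
os.environ["ENGRAM_OPENAI_API_KEY"] = "sk-engram"
print(resolve_api_key())  # sk-engram
```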
Docker gotcha: Docker's -e VAR_NAME syntax (without =value) inherits the variable from the parent process, not from your shell profile. When Claude Code launches the MCP server, the parent process is Claude Code's Node runtime, which may not have your shell's env vars. To be safe, pass both names so whichever is available gets picked up:

```
"-e", "OPENAI_API_KEY",
"-e", "ENGRAM_OPENAI_API_KEY"
```

DBOS provides durable workflow execution with automatic recovery.
```bash
git clone https://github.com/ashita-ai/engram.git
cd engram
docker compose -f docker-compose.full.yml up -d  # Starts Qdrant + PostgreSQL
docker build -t engram-mcp .
```

Add to your project's MCP config in `~/.claude.json` (under `projects.<path>.mcpServers`):
```json
{
  "engram": {
    "type": "stdio",
    "command": "docker",
    "args": [
      "run", "-i", "--rm",
      "-e", "ENGRAM_QDRANT_URL=http://host.docker.internal:6333",
      "-e", "ENGRAM_USER=your-username",
      "-e", "ENGRAM_EMBEDDING_PROVIDER=openai",
      "-e", "OPENAI_API_KEY",
      "-e", "ENGRAM_OPENAI_API_KEY",
      "-e", "ENGRAM_DURABLE_BACKEND=dbos",
      "-e", "ENGRAM_DATABASE_URL=postgresql://engram:engram@host.docker.internal:5432/engram_dbos",
      "engram-mcp"
    ]
  }
}
```

Both -e OPENAI_API_KEY and -e ENGRAM_OPENAI_API_KEY are passed so that whichever is set in Claude Code's environment is forwarded to the container. You don't need to hardcode the key value.
For quick testing without workflow durability:
```bash
docker compose up -d          # Starts Qdrant only
docker build -t engram-mcp .
```

```json
{
  "engram": {
    "type": "stdio",
    "command": "docker",
    "args": [
      "run", "-i", "--rm",
      "-e", "ENGRAM_QDRANT_URL=http://host.docker.internal:6333",
      "-e", "ENGRAM_EMBEDDING_PROVIDER=openai",
      "-e", "OPENAI_API_KEY",
      "-e", "ENGRAM_OPENAI_API_KEY",
      "engram-mcp"
    ]
  }
}
```

```bash
docker run -d -p 6333:6333 qdrant/qdrant  # Vector DB
docker run -d -p 5432:5432 -e POSTGRES_USER=engram -e POSTGRES_PASSWORD=engram -e POSTGRES_DB=engram_dbos postgres:15-alpine  # For DBOS
uv sync --extra mcp
```

```json
{
  "engram": {
    "command": "uv",
    "args": ["run", "--directory", "/path/to/engram", "python", "-m", "engram.mcp"],
    "env": {
      "OPENAI_API_KEY": "sk-proj-YOUR_KEY_HERE",
      "ENGRAM_DURABLE_BACKEND": "dbos",
      "ENGRAM_DATABASE_URL": "postgresql://engram:engram@localhost:5432/engram_dbos"
    }
  }
}
```

For local (non-Docker) setup, the env block sets environment variables directly in the subprocess, so either variable name works.
10 MCP tools: engram_encode, engram_recall, engram_verify, engram_stats, engram_delete, engram_get, engram_consolidate, engram_promote, engram_search, engram_recall_at. See docs/mcp.md.
- Architecture — Memory types, confidence scoring, consolidation
- API Reference — Complete endpoint documentation
- Development — Setup, configuration, contributing
- Research — Scientific foundations and limitations
- Competitive Analysis — Comparison with alternatives
Beta. Core functionality complete, 990+ tests. Ready for production evaluation.
MIT
