
Engram

Memory you can trust.

A memory system for AI applications that preserves ground truth, tracks confidence, and prevents hallucinations.

The Problem

AI memory systems have an accuracy crisis:

"All systems achieve answer accuracies below 70%, with hallucination rate and omission rate remaining high."

HaluMem: Hallucinations in LLM Memory Systems

Why? Existing systems (Mem0, basic RAG, LangChain memory) lose source data after LLM extraction. When extraction errors occur, there's no recovery path. The original truth is gone.

How Engram Is Different

| Traditional Memory | Engram |
|---|---|
| Embed immediately, discard source | Store verbatim first, derive later |
| Single similarity score | Composite confidence (extraction + corroboration + recency + verification) |
| No error recovery | Re-derive from source when errors occur |
| Flat retrieval (top-K) | Multi-hop reasoning via bidirectional links |
| No contradiction handling | Negation tracking filters outdated facts |
| No auditability | Full provenance: trace any memory to source |

Key Differentiators

1. Ground Truth Preservation

Episodes (raw interactions) are immutable. All derived memories maintain source_episode_ids for traceability. If extraction errors occur, re-derive from source — the truth is never lost.

# Every derived memory traces back to ground truth
verified = await engram.verify(memory_id, user_id="user_123")
print(verified.source_episodes)  # Original verbatim content
print(verified.explanation)      # How confidence was calculated

2. Auditable Confidence Scoring

Confidence is not just cosine similarity. It's a composite score you can explain:

Confidence: 0.73
├── Extraction method: 0.9 (regex pattern match)
├── Corroboration: 0.6 (3 supporting sources)
├── Recency: 0.8 (confirmed 2 months ago)
└── Verification: 1.0 (format validated)

Filter by confidence for high-stakes queries: min_confidence=0.8

3. Deferred Consolidation

Fast writes, smart background processing:

Encode (immediate, <100ms):
  User input → Episode (verbatim) + StructuredMemory (regex extraction)

Consolidate (background):
  N Episodes → LLM synthesis → SemanticMemory (with links to similar memories)

Benefits: Low latency, batched LLM costs, error recovery possible.
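The fast path can be pictured as a plain regex pass with no LLM in the loop. This is a minimal sketch of that idea; the patterns and function name below are illustrative assumptions, not engram's actual extraction rules:

```python
import re

# Illustrative patterns -- engram's real extraction rules are not shown here.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
URL_RE = re.compile(r"https?://\S+")

def fast_path_extract(content: str) -> dict[str, list[str]]:
    """Synchronous regex pass run at encode time, before any LLM is involved."""
    return {
        "emails": EMAIL_RE.findall(content),
        "urls": URL_RE.findall(content),
    }

extracted = fast_path_extract(
    "My email is john@example.com, docs at https://example.com/docs"
)
```

Because this pass is pure regex, it stays well under the 100ms budget; the expensive LLM synthesis is deferred to the background consolidation step.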

4. A-MEM Style Multi-Hop Reasoning

Memories link to related memories bidirectionally (A-MEM research shows 2x improvement on multi-hop benchmarks):

# Follow links for deeper context
results = await engram.recall(
    query="What database?",
    user_id="user_123",
    follow_links=True,  # Traverse related memories
    max_hops=2,
)

5. Negation-Aware Retrieval

Tracks what isn't true to prevent returning outdated information:

Episode 1: "I use MongoDB"     → preference: MongoDB
Episode 2: "Switched to Redis" → preference: Redis, negation: "no longer uses MongoDB"

Query: "What database?"
Result: Redis (MongoDB filtered by negation)
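The filtering step above can be sketched as a set-difference over claims. The Memory shape and field names here are assumptions for illustration, not engram's internal representation:

```python
from dataclasses import dataclass, field

@dataclass
class Memory:
    claim: str
    negations: list[str] = field(default_factory=list)  # claims this memory contradicts

def filter_negated(memories: list[Memory]) -> list[Memory]:
    """Drop any memory whose claim is explicitly negated by another memory."""
    negated = {n for m in memories for n in m.negations}
    return [m for m in memories if m.claim not in negated]

hits = [
    Memory(claim="uses MongoDB"),
    Memory(claim="uses Redis", negations=["uses MongoDB"]),
]
surviving = filter_negated(hits)
```

The key point is that negations are stored as first-class data at extraction time, so retrieval can filter contradicted facts without re-reading every episode.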

6. Consolidation Strength (Testing Effect)

Memories strengthen through retrieval (Roediger & Karpicke, 2006: tested group forgot only 13% vs 52% for study-only):

# Memories used more often become more stable
memory.consolidation_strength  # 0.0-1.0, increases with use
memory.consolidation_passes    # How many times refined
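One plausible update rule, shown purely as a sketch (engram's actual strengthening curve is not documented here), moves strength toward 1.0 by a fixed fraction of the remaining gap on each retrieval, so gains diminish as the memory stabilises:

```python
def strengthen(strength: float, rate: float = 0.1) -> float:
    """One retrieval event: close a fixed fraction of the gap to 1.0.
    The rate and functional form are assumptions for illustration."""
    return min(1.0, strength + rate * (1.0 - strength))

s = 0.5
for _ in range(3):  # three retrievals of the same memory
    s = strengthen(s)
```

This mirrors the Testing Effect qualitatively: frequently retrieved memories asymptote toward full stability, while untouched ones stay where they are (and are subject to the separate confidence decay workflow).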

Quick Start

Installation

git clone https://github.com/ashita-ai/engram.git
cd engram
uv sync --extra dev

# Start Qdrant (vector database)
docker run -p 6333:6333 qdrant/qdrant

Python SDK

from engram.service import EngramService

async with EngramService.create() as engram:
    # Store interaction (immediate, preserves ground truth)
    result = await engram.encode(
        content="My email is john@example.com and I prefer PostgreSQL",
        role="user",
        user_id="user_123",
    )
    print(f"Episode: {result.episode.id}")
    print(f"Emails: {result.structured.emails}")  # ["john@example.com"]

    # Retrieve with confidence filtering
    memories = await engram.recall(
        query="What's the user's email?",
        user_id="user_123",
        min_confidence=0.7,
    )

    # Verify any memory back to source
    verified = await engram.verify(memories[0].memory_id, user_id="user_123")
    print(verified.explanation)

    # Run consolidation (N episodes → semantic memory, scoped to project)
    await engram.consolidate(user_id="user_123", org_id="my-project")

REST API

# Start the server
uv run uvicorn engram.api.app:app --port 8000

# Encode a memory
curl -X POST http://localhost:8000/api/v1/encode \
  -H "Content-Type: application/json" \
  -d '{"content": "My email is john@example.com", "role": "user", "user_id": "user_123"}'

# Recall memories
curl -X POST http://localhost:8000/api/v1/recall \
  -H "Content-Type: application/json" \
  -d '{"query": "email", "user_id": "user_123", "min_confidence": 0.7}'

Memory Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                     ENCODE (Fast Path, <100ms)                       │
│  Input → Episode (verbatim, immutable)                              │
│       → StructuredMemory (regex: emails, phones, URLs)              │
└─────────────────────────────────────────────────────────────────────┘
                                ↓
┌─────────────────────────────────────────────────────────────────────┐
│                  CONSOLIDATE (Background, Deferred)                  │
│  N Episodes → LLM synthesis → SemanticMemory                        │
│            → Link to similar memories (bidirectional)               │
│            → Strengthen linked memories (Testing Effect)            │
└─────────────────────────────────────────────────────────────────────┘
                                ↓
┌─────────────────────────────────────────────────────────────────────┐
│                   RECALL (Multi-Signal Reranking)                    │
│  Query → Vector search (all types)                                  │
│       → Negation filtering (remove contradicted)                    │
│       → Confidence reranking (5 signals)                            │
│       → Multi-hop traversal (follow links)                          │
└─────────────────────────────────────────────────────────────────────┘

Memory Types

| Type | Confidence | Purpose |
|---|---|---|
| Episode | 1.0 (verbatim) | Ground truth, immutable raw interactions |
| Structured | 0.9 (regex) / 0.8 (LLM) | Per-episode extraction (emails, phones, negations) |
| Semantic | Variable (0.6 base) | Cross-episode synthesis, LLM-consolidated |
| Procedural | Variable (0.6 base) | Behavioral patterns, long-term preferences |

Comparison with Alternatives

| Feature | Engram | Mem0 | Zep/Graphiti | LangChain |
|---|---|---|---|---|
| Ground truth preservation | ✅ Immutable episodes | ❌ Lost after extraction | ✅ Episode subgraph | |
| Confidence tracking | ✅ Composite + auditable | | | |
| Error recovery | ✅ Re-derive from source | ❌ Permanent errors | ⚠️ Partial | |
| Multi-hop reasoning | ✅ Bidirectional links | ✅ Graph | ✅ Graph | ⚠️ |
| Negation handling | ✅ Explicit filtering | | | |
| Consolidation strength | ✅ Testing Effect | | | |

When to use Engram: High-stakes applications requiring accuracy, auditability, and ground truth preservation (healthcare, legal, finance, enterprise assistants).

When to use alternatives: Rapid prototyping (LangChain), managed service needs (Mem0), temporal graph queries (Zep).

API Reference

Core Endpoints

| Method | Endpoint | Description |
|---|---|---|
| POST | /encode | Store memory, extract structured data |
| POST | /encode/batch | Bulk import (up to 100 items) |
| POST | /recall | Semantic search with confidence filtering |
| GET | /memories/{id}/verify | Trace memory to source with explanation |
| GET | /memories/{id}/provenance | Full derivation chain |

Workflow Endpoints

| Method | Endpoint | Description |
|---|---|---|
| POST | /workflows/consolidate | N episodes → semantic memory |
| POST | /workflows/structure | LLM enrichment for episode |
| POST | /workflows/promote | Semantic → procedural synthesis |
| POST | /workflows/decay | Apply confidence decay |

See docs/api.md for complete reference.

Confidence Formula

confidence = (
    extraction_method * 0.50 +  # VERBATIM=1.0, EXTRACTED=0.9, INFERRED=0.6
    corroboration * 0.25 +      # log scale: 1 source=0.5, 10 sources=1.0
    recency * 0.15 +            # exponential decay, 365-day half-life
    verification * 0.10         # 1.0 if format validated
) - contradiction_penalty       # -10% per contradiction, floor 0.1

Every score includes .explain() for auditability.
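The formula can be written out as a small function. The weights come straight from the block above; the exact corroboration and recency curves below are interpolated from the inline comments (log scale anchored at 1 source = 0.5 and 10 sources = 1.0; exponential decay with a 365-day half-life) and should be read as a sketch, not engram's exact implementation:

```python
import math

def confidence(extraction_method: float, n_sources: int,
               days_since_confirmed: float, verified: bool,
               contradictions: int = 0) -> float:
    """Composite confidence mirroring the weighted formula above."""
    # 1 source -> 0.5, 10 sources -> 1.0, capped at 1.0
    corroboration = min(1.0, 0.5 + 0.5 * math.log10(max(n_sources, 1)))
    # Exponential decay with a 365-day half-life
    recency = 0.5 ** (days_since_confirmed / 365)
    verification = 1.0 if verified else 0.0
    score = (extraction_method * 0.50
             + corroboration * 0.25
             + recency * 0.15
             + verification * 0.10)
    score -= 0.10 * contradictions  # -10% per contradiction
    return max(score, 0.1)          # floor at 0.1

# A freshly confirmed, format-validated regex extraction with one source:
c = confidence(extraction_method=0.9, n_sources=1,
               days_since_confirmed=0.0, verified=True)
```

Note how corroboration dominates the difference between a single-source claim and a well-supported one, while contradictions subtract directly from the final score rather than from any one signal.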

Research Foundations

Engram is inspired by cognitive science research:

| Paper | Finding | Engram Implementation |
|---|---|---|
| Roediger & Karpicke (2006) | Retrieval slows forgetting (13% vs 52% after 1 week) | consolidation_strength, Testing Effect |
| A-MEM (2025) | 2x multi-hop improvement via linking | related_ids, bidirectional links |
| HaluMem (2025) | <70% accuracy without source preservation | Immutable episodes, verify() |
| Cognitive Workspace (2025) | 58.6% memory reuse vs 0% for naive RAG | Hierarchical memory tiers |

Note: Engram uses these concepts as engineering abstractions, not strict cognitive implementations.

Configuration

| Variable | Default | Description |
|---|---|---|
| ENGRAM_QDRANT_URL | http://localhost:6333 | Qdrant connection |
| ENGRAM_EMBEDDING_PROVIDER | fastembed | Embedding backend |
| ENGRAM_AUTH_ENABLED | auto | Bearer token auth |
| ENGRAM_CONSOLIDATION_THRESHOLD | 10 | Episodes before auto-consolidation |

See docs/development.md for full configuration.

Development

uv run pytest tests/ -v --no-cov  # Run tests (990+ tests)
uv run ruff check src/engram/     # Lint
uv run mypy src/engram/           # Type check
uv run pre-commit run --all-files # All checks

Claude Code Integration

Required: OpenAI API Key

Engram uses OpenAI for embeddings and LLM operations. Set either of these environment variables — engram accepts both:

  • OPENAI_API_KEY (standard OpenAI convention)
  • ENGRAM_OPENAI_API_KEY (engram-prefixed)

If both are set, ENGRAM_OPENAI_API_KEY takes precedence.

Docker gotcha: Docker's -e VAR_NAME syntax (without =value) inherits the variable from the parent process, not from your shell profile. When Claude Code launches the MCP server, the parent process is Claude Code's Node runtime — which may not have your shell's env vars. To be safe, pass both names so whichever is available gets picked up:

"-e", "OPENAI_API_KEY",
"-e", "ENGRAM_OPENAI_API_KEY"

Docker with DBOS (Recommended)

DBOS provides durable workflow execution with automatic recovery.

git clone https://github.com/ashita-ai/engram.git
cd engram
docker compose -f docker-compose.full.yml up -d  # Starts Qdrant + PostgreSQL
docker build -t engram-mcp .

Add to your project's MCP config in ~/.claude.json (under projects.<path>.mcpServers):

{
  "engram": {
    "type": "stdio",
    "command": "docker",
    "args": [
      "run", "-i", "--rm",
      "-e", "ENGRAM_QDRANT_URL=http://host.docker.internal:6333",
      "-e", "ENGRAM_USER=your-username",
      "-e", "ENGRAM_EMBEDDING_PROVIDER=openai",
      "-e", "OPENAI_API_KEY",
      "-e", "ENGRAM_OPENAI_API_KEY",
      "-e", "ENGRAM_DURABLE_BACKEND=dbos",
      "-e", "ENGRAM_DATABASE_URL=postgresql://engram:engram@host.docker.internal:5432/engram_dbos",
      "engram-mcp"
    ]
  }
}

Both -e OPENAI_API_KEY and -e ENGRAM_OPENAI_API_KEY are passed so whichever is set in Claude Code's environment gets forwarded to the container. You don't need to hardcode the key value.

Docker Minimal (No Durability)

For quick testing without workflow durability:

docker compose up -d  # Starts Qdrant only
docker build -t engram-mcp .
{
  "engram": {
    "type": "stdio",
    "command": "docker",
    "args": [
      "run", "-i", "--rm",
      "-e", "ENGRAM_QDRANT_URL=http://host.docker.internal:6333",
      "-e", "ENGRAM_EMBEDDING_PROVIDER=openai",
      "-e", "OPENAI_API_KEY",
      "-e", "ENGRAM_OPENAI_API_KEY",
      "engram-mcp"
    ]
  }
}

Local (Alternative)

docker run -d -p 6333:6333 qdrant/qdrant  # Vector DB
docker run -d -p 5432:5432 -e POSTGRES_USER=engram -e POSTGRES_PASSWORD=engram -e POSTGRES_DB=engram_dbos postgres:15-alpine  # For DBOS
uv sync --extra mcp
{
  "engram": {
    "command": "uv",
    "args": ["run", "--directory", "/path/to/engram", "python", "-m", "engram.mcp"],
    "env": {
      "OPENAI_API_KEY": "sk-proj-YOUR_KEY_HERE",
      "ENGRAM_DURABLE_BACKEND": "dbos",
      "ENGRAM_DATABASE_URL": "postgresql://engram:engram@localhost:5432/engram_dbos"
    }
  }
}

For local (non-Docker) setup, the env block sets environment variables directly in the subprocess, so either variable name works.

Tools

10 MCP tools: engram_encode, engram_recall, engram_verify, engram_stats, engram_delete, engram_get, engram_consolidate, engram_promote, engram_search, engram_recall_at. See docs/mcp.md.

Status

Beta. Core functionality complete. 990+ tests. Production-ready for evaluation.

License

MIT
