Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
212 changes: 195 additions & 17 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,9 @@
<p align="center">
<a href="#quick-start">Quick Start</a> &bull;
<a href="#how-it-works">How It Works</a> &bull;
<a href="#self-evaluation">Self-Evaluation</a> &bull;
<a href="#mcp-server">MCP Server</a> &bull;
<a href="#skills">Skills</a> &bull;
<a href="#configuration">Configuration</a> &bull;
<a href="#api">API</a> &bull;
<a href="#plugin-install">Plugin Install</a>
Expand All @@ -31,6 +34,19 @@ Session 2: "Now add rate limiting"
Claude Code starts with full project awareness
```

## What's New in v0.3.0

- **Self-evaluation framework** — Zod I/O validation, quality scoring (0-100), self-correcting LLM retries, per-function metrics
- **Health monitoring** — Real-time CPU, memory, event loop lag tracking with degraded/critical alerts
- **Circuit breaker** — Automatic failover when LLM providers go down (closed → open → half-open recovery)
- **BM25 search** — Replaced basic TF scoring with BM25 (k1=1.2, b=0.75) for better search relevance
- **Deduplication** — SHA-256 content hashing with 5-minute TTL window prevents duplicate observations
- **Memory eviction** — Age-based + importance-based + per-project cap eviction with dry-run support
- **MCP server** — 5 tools for any MCP-compatible client (memory_recall, memory_save, memory_file_history, memory_patterns, memory_sessions)
- **4 skills** — /recall, /remember, /session-history, /forget
- **12 hooks** — All Claude Code hook types covered (up from 5)
- **OTEL telemetry** — Counters and histograms for observability

## Quick Start

### 1. Install the Plugin
Expand All @@ -40,7 +56,7 @@ Session 2: "Now add rate limiting"
/plugin install agentmemory
```

All 5 hooks are registered automatically. No manual config needed.
All 12 hooks, 4 skills, and MCP server are registered automatically.

### 2. Start the Worker

Expand All @@ -59,10 +75,28 @@ npm install && npm run build && npm start

```bash
curl http://localhost:3111/agentmemory/health
# {"status":"ok","service":"agentmemory","version":"0.1.0"}
```

That's it. Start a Claude Code session and agentmemory begins capturing. Start another session and it injects context from all previous sessions.
Returns real health data:

```json
{
"status": "healthy",
"service": "agentmemory",
"version": "0.3.0",
"health": {
"memory": { "heapUsed": 42000000, "heapTotal": 67000000 },
"cpu": { "percent": 2.1 },
"eventLoopLagMs": 1.2,
"status": "healthy",
"alerts": []
},
"functionMetrics": [...],
"circuitBreaker": { "state": "closed", "failures": 0 }
}
```

HTTP 503 when status is `critical`.

### Manual Hook Setup (alternative)

Expand All @@ -73,7 +107,14 @@ If you prefer not to use the plugin, add hooks directly to `~/.claude/settings.j
"hooks": {
"SessionStart": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/session-start.mjs" }],
"UserPromptSubmit": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/prompt-submit.mjs" }],
"PreToolUse": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/pre-tool-use.mjs" }],
"PostToolUse": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/post-tool-use.mjs" }],
"PostToolUseFailure": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/post-tool-failure.mjs" }],
"PreCompact": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/pre-compact.mjs" }],
"SubagentStart": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/subagent-start.mjs" }],
"SubagentStop": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/subagent-stop.mjs" }],
"Notification": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/notification.mjs" }],
"TaskCompleted": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/task-completed.mjs" }],
"Stop": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/stop.mjs" }],
"SessionEnd": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/session-end.mjs" }]
}
Expand All @@ -88,10 +129,13 @@ Every tool use flows through this pipeline:

```
PostToolUse hook fires
-> Dedup check SHA-256 hash of tool_name + tool_input (5min window)
-> mem::privacy Strip secrets, API keys, <private> tags
-> mem::observe Store raw observation, push to real-time stream
-> mem::compress LLM extracts: type, facts, narrative, concepts, files
(async -- never blocks your session)
Validates output with Zod, scores quality (0-100)
Self-corrects on validation failure (1 retry)
Records latency + quality in per-function metrics
```

### Context Injection
Expand All @@ -101,6 +145,7 @@ On `SessionStart`, agentmemory builds context from your history:
```
SessionStart hook fires
-> mem::context Load recent sessions for this project
BM25-ranked search across observations
Prefer summaries, fall back to high-importance observations
Apply token budget (default: 2000 tokens)
Return formatted context block
Expand All @@ -113,7 +158,14 @@ SessionStart hook fires
|------|----------|
| `SessionStart` | Project path, session ID, working directory |
| `UserPromptSubmit` | User prompts (privacy-filtered) |
| `PostToolUse` | Tool name, input, output (Read, Write, Edit, Bash, etc.) |
| `PreToolUse` | File access patterns (Read, Write, Edit, Glob, Grep) |
| `PostToolUse` | Tool name, input, output |
| `PostToolUseFailure` | Failed tool invocations with error context |
| `PreCompact` | Re-injects memory context before context compaction |
| `SubagentStart` | Sub-agent spawning events |
| `SubagentStop` | Sub-agent completion events |
| `Notification` | System notifications |
| `TaskCompleted` | Task completion events |
| `Stop` | Triggers end-of-session summary |
| `SessionEnd` | Marks session complete |

Expand All @@ -123,11 +175,101 @@ All data passes through `mem::privacy` before storage:
- `<private>...</private>` tags are stripped
- API keys (`sk-*`, `ghp_*`, `xoxb-*`, `AKIA*`), JWTs, and secrets are redacted

## Self-Evaluation

agentmemory v0.3.0 monitors its own health and validates its own I/O.

### Quality Scoring

Every LLM-generated compression and summary is scored 0-100:

| Check | Points |
|-------|--------|
| Has structured facts | +20 |
| Narrative length > 10 chars | +20 |
| Concepts extracted | +20 |
| Title quality (not truncated) | +20 |
| Importance in valid range (1-10) | +20 |

Scores are tracked per-function and exposed via the `/health` endpoint.

### Self-Correction

When LLM output fails Zod validation, agentmemory retries once with a stricter prompt suffix explaining the exact validation errors. This recovers from malformed JSON, missing fields, and out-of-range values.

### Circuit Breaker

LLM providers fail. The circuit breaker prevents cascading failures:

```
Closed (normal)
-> 3 failures in 60s -> Open (all calls rejected)
-> 30s cooldown -> Half-Open (one test call allowed)
-> Success -> Closed (normal)
-> Failure -> Open (restart cooldown)
```

When the circuit is open, observations are stored raw without compression. No data is lost.

### Health Monitor

Collects every 30 seconds:
- **Memory**: heap used/total, RSS, external
- **CPU**: user/system time, real percentage via delta sampling
- **Event loop lag**: detect blocking operations
- **Connection state**: engine connectivity

Thresholds:
| Metric | Warning | Critical |
|--------|---------|----------|
| Event loop lag | > 100ms | > 500ms |
| CPU | > 80% | > 90% |
| Heap usage | > 80% | > 95% |
| Connection | reconnecting | disconnected |

## MCP Server

agentmemory exposes 5 tools via MCP for any compatible client:

| Tool | Description |
|------|-------------|
| `memory_recall` | Search past observations by keyword |
| `memory_save` | Save an insight, decision, or pattern to long-term memory |
| `memory_file_history` | Get past observations about specific files |
| `memory_patterns` | Detect recurring patterns across sessions |
| `memory_sessions` | List recent sessions with status |

### MCP Endpoints

```
GET /agentmemory/mcp/tools — List available tools
POST /agentmemory/mcp/call — Execute a tool
```

### Example

```bash
curl -X POST http://localhost:3111/agentmemory/mcp/call \
-H "Content-Type: application/json" \
-d '{"name": "memory_recall", "arguments": {"query": "authentication", "limit": 5}}'
```

## Skills

Four slash commands for interacting with memory:

| Skill | Usage |
|-------|-------|
| `/recall` | Search memory for past context (`/recall auth middleware`) |
| `/remember` | Save something to long-term memory (`/remember always use jose for JWT`) |
| `/session-history` | Show recent session summaries |
| `/forget` | Delete specific observations or entire sessions |

## Configuration

### LLM Providers

agentmemory needs an LLM for two things: compressing observations and generating session summaries.
agentmemory needs an LLM for compressing observations and generating session summaries.

| Provider | Config | Cost |
|----------|--------|------|
Expand Down Expand Up @@ -172,17 +314,26 @@ All endpoints on port `3111`. Protected endpoints require `Authorization: Bearer

| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/agentmemory/health` | Health check (always public) |
| `GET` | `/agentmemory/health` | Health check with metrics (always public) |
| `POST` | `/agentmemory/session/start` | Start session + get context |
| `POST` | `/agentmemory/session/end` | Mark session complete |
| `POST` | `/agentmemory/observe` | Capture observation |
| `POST` | `/agentmemory/context` | Generate context |
| `POST` | `/agentmemory/search` | Search observations |
| `POST` | `/agentmemory/search` | Search observations (BM25) |
| `POST` | `/agentmemory/summarize` | Generate session summary |
| `POST` | `/agentmemory/remember` | Save to long-term memory |
| `POST` | `/agentmemory/forget` | Delete observations/sessions |
| `POST` | `/agentmemory/consolidate` | Merge duplicate observations |
| `POST` | `/agentmemory/patterns` | Detect recurring patterns |
| `POST` | `/agentmemory/generate-rules` | Generate CLAUDE.md rules from patterns |
| `POST` | `/agentmemory/file-context` | Get file-specific history |
| `POST` | `/agentmemory/evict` | Evict stale memories (supports `?dryRun=true`) |
| `POST` | `/agentmemory/migrate` | Import from SQLite |
| `GET` | `/agentmemory/sessions` | List all sessions |
| `GET` | `/agentmemory/observations?sessionId=X` | Session observations |
| `GET` | `/agentmemory/viewer` | Real-time web viewer |
| `POST` | `/agentmemory/migrate` | Import from SQLite |
| `GET` | `/agentmemory/mcp/tools` | List MCP tools |
| `POST` | `/agentmemory/mcp/call` | Execute MCP tool |

### Examples

Expand All @@ -204,15 +355,18 @@ curl -X POST http://localhost:3111/agentmemory/observe \
"data": {"tool": "Edit", "file": "src/auth.ts", "content": "Added JWT validation"}
}'

# Search across all observations
# Search across all observations (BM25-ranked)
curl -X POST http://localhost:3111/agentmemory/search \
-H "Content-Type: application/json" \
-d '{"query": "authentication middleware", "limit": 10}'

# Get context for a new session
curl -X POST http://localhost:3111/agentmemory/context \
# Save something to long-term memory
curl -X POST http://localhost:3111/agentmemory/remember \
-H "Content-Type: application/json" \
-d '{"sessionId": "ses-2", "project": "/my/project"}'
-d '{"content": "Always use jose for JWT in Edge environments", "type": "preference"}'

# Preview memory eviction (dry run)
curl -X POST "http://localhost:3111/agentmemory/evict?dryRun=true"

# List sessions
curl http://localhost:3111/agentmemory/sessions
Expand All @@ -233,7 +387,7 @@ Connects to iii-engine's WebSocket stream. Dark theme, timeline layout, session
/plugin install agentmemory
```

Restart Claude Code. All 5 hooks are registered automatically.
Restart Claude Code. All 12 hooks, 4 skills, and MCP tools are registered automatically.

### Plugin Commands

Expand Down Expand Up @@ -268,8 +422,28 @@ agentmemory is built on iii-engine's three primitives:
| SQLite / Postgres | iii KV State |
| SSE / Socket.io | iii Streams (WebSocket) |
| pm2 / systemd | iii-engine worker management |

**~30 source files. ~1,800 LOC. 37KB bundled.**
| Prometheus / Grafana | iii OTEL + built-in health monitor |
| Redis (circuit breaker) | In-process circuit breaker |

**52 source files. ~4,800 LOC. 82 tests. 195KB bundled.**

### Functions

| Function | Purpose |
|----------|---------|
| `mem::observe` | Store raw observation with dedup check |
| `mem::compress` | LLM compression with validation + quality scoring |
| `mem::search` | BM25-ranked full-text search |
| `mem::context` | Build session context within token budget |
| `mem::summarize` | Generate validated session summaries |
| `mem::remember` | Save to long-term memory |
| `mem::forget` | Delete observations, sessions, or memories |
| `mem::file-index` | File-specific observation lookup |
| `mem::consolidate` | Merge duplicate observations |
| `mem::patterns` | Detect recurring patterns |
| `mem::generate-rules` | Generate CLAUDE.md rules from patterns |
| `mem::migrate` | Import from SQLite |
| `mem::evict` | Age + importance + cap-based memory eviction |

### Data Model

Expand All @@ -278,13 +452,17 @@ agentmemory is built on iii-engine's three primitives:
| `mem:sessions` | `{session_id}` | Session metadata, project, timestamps |
| `mem:obs:{session_id}` | `{obs_id}` | Compressed observations |
| `mem:summaries` | `{session_id}` | End-of-session summaries |
| `mem:memories` | `{memory_id}` | Long-term memories (remember/forget) |
| `mem:metrics` | `{function_id}` | Per-function metrics (latency, quality, success rate) |
| `mem:health` | `latest` | Latest health snapshot |
| `mem:config` | `{key}` | Runtime configuration overrides |

## Development

```bash
npm run dev # Hot reload
npm run build # Production build
npm test # Unit tests (45 tests, ~250ms)
npm test # Unit tests (82 tests, ~300ms)
npm run test:integration # API tests (requires running services)
```

Expand Down