diff --git a/README.md b/README.md
index 84709fc..0bdd638 100644
--- a/README.md
+++ b/README.md
@@ -6,6 +6,9 @@
Quick Start •
How It Works •
+ Self-Evaluation •
+ MCP Server •
+ Skills •
Configuration •
API •
Plugin Install
@@ -31,6 +34,19 @@ Session 2: "Now add rate limiting"
Claude Code starts with full project awareness
```
+## What's New in v0.3.0
+
+- **Self-evaluation framework** — Zod I/O validation, quality scoring (0-100), self-correcting LLM retries, per-function metrics
+- **Health monitoring** — Real-time CPU, memory, event loop lag tracking with degraded/critical alerts
+- **Circuit breaker** — Automatic failover when LLM providers go down (closed → open → half-open recovery)
+- **BM25 search** — Replaced basic TF scoring with BM25 (k1=1.2, b=0.75) for better search relevance
+- **Deduplication** — SHA-256 content hashing with 5-minute TTL window prevents duplicate observations
+- **Memory eviction** — Age-based + importance-based + per-project cap eviction with dry-run support
+- **MCP server** — 5 tools for any MCP-compatible client (memory_recall, memory_save, memory_file_history, memory_patterns, memory_sessions)
+- **4 skills** — /recall, /remember, /session-history, /forget
+- **12 hooks** — All Claude Code hook types covered (up from 5)
+- **OTEL telemetry** — Counters and histograms for observability
+
## Quick Start
### 1. Install the Plugin
@@ -40,7 +56,7 @@ Session 2: "Now add rate limiting"
/plugin install agentmemory
```
-All 5 hooks are registered automatically. No manual config needed.
+All 12 hooks, 4 skills, and MCP server are registered automatically.
### 2. Start the Worker
@@ -59,10 +75,28 @@ npm install && npm run build && npm start
```bash
curl http://localhost:3111/agentmemory/health
-# {"status":"ok","service":"agentmemory","version":"0.1.0"}
```
-That's it. Start a Claude Code session and agentmemory begins capturing. Start another session and it injects context from all previous sessions.
+Returns real health data:
+
+```json
+{
+ "status": "healthy",
+ "service": "agentmemory",
+ "version": "0.3.0",
+ "health": {
+ "memory": { "heapUsed": 42000000, "heapTotal": 67000000 },
+ "cpu": { "percent": 2.1 },
+ "eventLoopLagMs": 1.2,
+ "status": "healthy",
+ "alerts": []
+ },
+ "functionMetrics": [...],
+ "circuitBreaker": { "state": "closed", "failures": 0 }
+}
+```
+
+HTTP 503 when status is `critical`.
### Manual Hook Setup (alternative)
@@ -73,7 +107,14 @@ If you prefer not to use the plugin, add hooks directly to `~/.claude/settings.j
"hooks": {
"SessionStart": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/session-start.mjs" }],
"UserPromptSubmit": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/prompt-submit.mjs" }],
+ "PreToolUse": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/pre-tool-use.mjs" }],
"PostToolUse": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/post-tool-use.mjs" }],
+ "PostToolUseFailure": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/post-tool-failure.mjs" }],
+ "PreCompact": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/pre-compact.mjs" }],
+ "SubagentStart": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/subagent-start.mjs" }],
+ "SubagentStop": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/subagent-stop.mjs" }],
+ "Notification": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/notification.mjs" }],
+ "TaskCompleted": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/task-completed.mjs" }],
"Stop": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/stop.mjs" }],
"SessionEnd": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/session-end.mjs" }]
}
@@ -88,10 +129,13 @@ Every tool use flows through this pipeline:
```
PostToolUse hook fires
+ -> Dedup check SHA-256 hash of tool_name + tool_input (5min window)
-> mem::privacy Strip secrets, API keys, tags
-> mem::observe Store raw observation, push to real-time stream
-> mem::compress LLM extracts: type, facts, narrative, concepts, files
- (async -- never blocks your session)
+ Validates output with Zod, scores quality (0-100)
+ Self-corrects on validation failure (1 retry)
+ Records latency + quality in per-function metrics
```
### Context Injection
@@ -101,6 +145,7 @@ On `SessionStart`, agentmemory builds context from your history:
```
SessionStart hook fires
-> mem::context Load recent sessions for this project
+ BM25-ranked search across observations
Prefer summaries, fall back to high-importance observations
Apply token budget (default: 2000 tokens)
Return formatted context block
@@ -113,7 +158,14 @@ SessionStart hook fires
|------|----------|
| `SessionStart` | Project path, session ID, working directory |
| `UserPromptSubmit` | User prompts (privacy-filtered) |
-| `PostToolUse` | Tool name, input, output (Read, Write, Edit, Bash, etc.) |
+| `PreToolUse` | File access patterns (Read, Write, Edit, Glob, Grep) |
+| `PostToolUse` | Tool name, input, output |
+| `PostToolUseFailure` | Failed tool invocations with error context |
+| `PreCompact` | Re-injects memory context before context compaction |
+| `SubagentStart` | Sub-agent spawning events |
+| `SubagentStop` | Sub-agent completion events |
+| `Notification` | System notifications |
+| `TaskCompleted` | Task completion events |
| `Stop` | Triggers end-of-session summary |
| `SessionEnd` | Marks session complete |
@@ -123,11 +175,101 @@ All data passes through `mem::privacy` before storage:
- `...` tags are stripped
- API keys (`sk-*`, `ghp_*`, `xoxb-*`, `AKIA*`), JWTs, and secrets are redacted
+## Self-Evaluation
+
+agentmemory v0.3.0 monitors its own health and validates its own I/O.
+
+### Quality Scoring
+
+Every LLM-generated compression and summary is scored 0-100:
+
+| Check | Points |
+|-------|--------|
+| Has structured facts | +20 |
+| Narrative length > 10 chars | +20 |
+| Concepts extracted | +20 |
+| Title quality (not truncated) | +20 |
+| Importance in valid range (1-10) | +20 |
+
+Scores are tracked per-function and exposed via the `/health` endpoint.
+
+### Self-Correction
+
+When LLM output fails Zod validation, agentmemory retries once with a stricter prompt suffix explaining the exact validation errors. This recovers from malformed JSON, missing fields, and out-of-range values.
+
+### Circuit Breaker
+
+LLM providers fail. The circuit breaker prevents cascading failures:
+
+```
+Closed (normal)
+ -> 3 failures in 60s -> Open (all calls rejected)
+ -> 30s cooldown -> Half-Open (one test call allowed)
+ -> Success -> Closed (normal)
+ -> Failure -> Open (restart cooldown)
+```
+
+When the circuit is open, observations are stored raw without compression. No data is lost.
+
+### Health Monitor
+
+Collects every 30 seconds:
+- **Memory**: heap used/total, RSS, external
+- **CPU**: user/system time, real percentage via delta sampling
+- **Event loop lag**: detect blocking operations
+- **Connection state**: engine connectivity
+
+Thresholds:
+| Metric | Warning | Critical |
+|--------|---------|----------|
+| Event loop lag | > 100ms | > 500ms |
+| CPU | > 80% | > 90% |
+| Heap usage | > 80% | > 95% |
+| Connection | reconnecting | disconnected |
+
+## MCP Server
+
+agentmemory exposes 5 tools via MCP for any compatible client:
+
+| Tool | Description |
+|------|-------------|
+| `memory_recall` | Search past observations by keyword |
+| `memory_save` | Save an insight, decision, or pattern to long-term memory |
+| `memory_file_history` | Get past observations about specific files |
+| `memory_patterns` | Detect recurring patterns across sessions |
+| `memory_sessions` | List recent sessions with status |
+
+### MCP Endpoints
+
+```
+GET /agentmemory/mcp/tools — List available tools
+POST /agentmemory/mcp/call — Execute a tool
+```
+
+### Example
+
+```bash
+curl -X POST http://localhost:3111/agentmemory/mcp/call \
+ -H "Content-Type: application/json" \
+ -d '{"name": "memory_recall", "arguments": {"query": "authentication", "limit": 5}}'
+```
+
+## Skills
+
+Four slash commands for interacting with memory:
+
+| Skill | Usage |
+|-------|-------|
+| `/recall` | Search memory for past context (`/recall auth middleware`) |
+| `/remember` | Save something to long-term memory (`/remember always use jose for JWT`) |
+| `/session-history` | Show recent session summaries |
+| `/forget` | Delete specific observations or entire sessions |
+
## Configuration
### LLM Providers
-agentmemory needs an LLM for two things: compressing observations and generating session summaries.
+agentmemory needs an LLM for compressing observations and generating session summaries.
| Provider | Config | Cost |
|----------|--------|------|
@@ -172,17 +314,26 @@ All endpoints on port `3111`. Protected endpoints require `Authorization: Bearer
| Method | Path | Description |
|--------|------|-------------|
-| `GET` | `/agentmemory/health` | Health check (always public) |
+| `GET` | `/agentmemory/health` | Health check with metrics (always public) |
| `POST` | `/agentmemory/session/start` | Start session + get context |
| `POST` | `/agentmemory/session/end` | Mark session complete |
| `POST` | `/agentmemory/observe` | Capture observation |
| `POST` | `/agentmemory/context` | Generate context |
-| `POST` | `/agentmemory/search` | Search observations |
+| `POST` | `/agentmemory/search` | Search observations (BM25) |
| `POST` | `/agentmemory/summarize` | Generate session summary |
+| `POST` | `/agentmemory/remember` | Save to long-term memory |
+| `POST` | `/agentmemory/forget` | Delete observations/sessions |
+| `POST` | `/agentmemory/consolidate` | Merge duplicate observations |
+| `POST` | `/agentmemory/patterns` | Detect recurring patterns |
+| `POST` | `/agentmemory/generate-rules` | Generate CLAUDE.md rules from patterns |
+| `POST` | `/agentmemory/file-context` | Get file-specific history |
+| `POST` | `/agentmemory/evict` | Evict stale memories (supports `?dryRun=true`) |
+| `POST` | `/agentmemory/migrate` | Import from SQLite |
| `GET` | `/agentmemory/sessions` | List all sessions |
| `GET` | `/agentmemory/observations?sessionId=X` | Session observations |
| `GET` | `/agentmemory/viewer` | Real-time web viewer |
-| `POST` | `/agentmemory/migrate` | Import from SQLite |
+| `GET` | `/agentmemory/mcp/tools` | List MCP tools |
+| `POST` | `/agentmemory/mcp/call` | Execute MCP tool |
### Examples
@@ -204,15 +355,18 @@ curl -X POST http://localhost:3111/agentmemory/observe \
"data": {"tool": "Edit", "file": "src/auth.ts", "content": "Added JWT validation"}
}'
-# Search across all observations
+# Search across all observations (BM25-ranked)
curl -X POST http://localhost:3111/agentmemory/search \
-H "Content-Type: application/json" \
-d '{"query": "authentication middleware", "limit": 10}'
-# Get context for a new session
-curl -X POST http://localhost:3111/agentmemory/context \
+# Save something to long-term memory
+curl -X POST http://localhost:3111/agentmemory/remember \
-H "Content-Type: application/json" \
- -d '{"sessionId": "ses-2", "project": "/my/project"}'
+ -d '{"content": "Always use jose for JWT in Edge environments", "type": "preference"}'
+
+# Preview memory eviction (dry run)
+curl -X POST "http://localhost:3111/agentmemory/evict?dryRun=true"
# List sessions
curl http://localhost:3111/agentmemory/sessions
@@ -233,7 +387,7 @@ Connects to iii-engine's WebSocket stream. Dark theme, timeline layout, session
/plugin install agentmemory
```
-Restart Claude Code. All 5 hooks are registered automatically.
+Restart Claude Code. All 12 hooks, 4 skills, and MCP tools are registered automatically.
### Plugin Commands
@@ -268,8 +422,28 @@ agentmemory is built on iii-engine's three primitives:
| SQLite / Postgres | iii KV State |
| SSE / Socket.io | iii Streams (WebSocket) |
| pm2 / systemd | iii-engine worker management |
-
-**~30 source files. ~1,800 LOC. 37KB bundled.**
+| Prometheus / Grafana | iii OTEL + built-in health monitor |
+| Redis (circuit breaker) | In-process circuit breaker |
+
+**52 source files. ~4,800 LOC. 82 tests. 195KB bundled.**
+
+### Functions
+
+| Function | Purpose |
+|----------|---------|
+| `mem::observe` | Store raw observation with dedup check |
+| `mem::compress` | LLM compression with validation + quality scoring |
+| `mem::search` | BM25-ranked full-text search |
+| `mem::context` | Build session context within token budget |
+| `mem::summarize` | Generate validated session summaries |
+| `mem::remember` | Save to long-term memory |
+| `mem::forget` | Delete observations, sessions, or memories |
+| `mem::file-index` | File-specific observation lookup |
+| `mem::consolidate` | Merge duplicate observations |
+| `mem::patterns` | Detect recurring patterns |
+| `mem::generate-rules` | Generate CLAUDE.md rules from patterns |
+| `mem::migrate` | Import from SQLite |
+| `mem::evict` | Age + importance + cap-based memory eviction |
### Data Model
@@ -278,13 +452,17 @@ agentmemory is built on iii-engine's three primitives:
| `mem:sessions` | `{session_id}` | Session metadata, project, timestamps |
| `mem:obs:{session_id}` | `{obs_id}` | Compressed observations |
| `mem:summaries` | `{session_id}` | End-of-session summaries |
+| `mem:memories` | `{memory_id}` | Long-term memories (remember/forget) |
+| `mem:metrics` | `{function_id}` | Per-function metrics (latency, quality, success rate) |
+| `mem:health` | `latest` | Latest health snapshot |
+| `mem:config` | `{key}` | Runtime configuration overrides |
## Development
```bash
npm run dev # Hot reload
npm run build # Production build
-npm test # Unit tests (45 tests, ~250ms)
+npm test # Unit tests (82 tests, ~300ms)
npm run test:integration # API tests (requires running services)
```