From 17fef54478d2dc881a62477066f37e6a04838bc2 Mon Sep 17 00:00:00 2001
From: Rohit Ghumare <ghumare64@gmail.com>
Date: Thu, 26 Feb 2026 21:04:10 +0530
Subject: [PATCH] docs: update README for v0.3.0

Reflects all v0.3.0 features: self-evaluation framework, health monitoring,
circuit breaker, BM25 search, dedup, eviction, MCP server (5 tools),
4 skills, 12 hooks, 20 API endpoints, and updated architecture stats.
---
 README.md | 212 +++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 195 insertions(+), 17 deletions(-)
diff --git a/README.md b/README.md
index 84709fc..0bdd638 100644
--- a/README.md
+++ b/README.md
@@ -6,6 +6,9 @@
 <p align="center">
   <a href="#quick-start">Quick Start</a> &bull;
   <a href="#how-it-works">How It Works</a> &bull;
+  <a href="#self-evaluation">Self-Evaluation</a> &bull;
+  <a href="#mcp-server">MCP Server</a> &bull;
+  <a href="#skills">Skills</a> &bull;
   <a href="#configuration">Configuration</a> &bull;
   <a href="#api">API</a> &bull;
   <a href="#plugin-install">Plugin Install</a>
@@ -31,6 +34,19 @@ Session 2: "Now add rate limiting"
   Claude Code starts with full project awareness
 ```
 
+## What's New in v0.3.0
+
+- **Self-evaluation framework** — Zod I/O validation, quality scoring (0-100), self-correcting LLM retries, per-function metrics
+- **Health monitoring** — Real-time CPU, memory, event loop lag tracking with degraded/critical alerts
+- **Circuit breaker** — Automatic failover when LLM providers go down (closed → open → half-open recovery)
+- **BM25 search** — Replaced basic TF scoring with BM25 (k1=1.2, b=0.75) for better search relevance
+- **Deduplication** — SHA-256 content hashing with 5-minute TTL window prevents duplicate observations
+- **Memory eviction** — Age-based + importance-based + per-project cap eviction with dry-run support
+- **MCP server** — 5 tools for any MCP-compatible client (memory_recall, memory_save, memory_file_history, memory_patterns, memory_sessions)
+- **4 skills** — /recall, /remember, /session-history, /forget
+- **12 hooks** — All Claude Code hook types covered (up from 5)
+- **OTEL telemetry** — Counters and histograms for observability
+
 ## Quick Start
 
 ### 1. Install the Plugin
@@ -40,7 +56,7 @@ Session 2: "Now add rate limiting"
 /plugin install agentmemory
 ```
 
-All 5 hooks are registered automatically. No manual config needed.
+All 12 hooks, 4 skills, and MCP server are registered automatically.
 
 ### 2. Start the Worker
 
@@ -59,10 +75,28 @@ npm install && npm run build && npm start
 
 ```bash
 curl http://localhost:3111/agentmemory/health
-# {"status":"ok","service":"agentmemory","version":"0.1.0"}
 ```
 
-That's it. Start a Claude Code session and agentmemory begins capturing. Start another session and it injects context from all previous sessions.
+Returns real health data:
+
+```json
+{
+  "status": "healthy",
+  "service": "agentmemory",
+  "version": "0.3.0",
+  "health": {
+    "memory": { "heapUsed": 42000000, "heapTotal": 67000000 },
+    "cpu": { "percent": 2.1 },
+    "eventLoopLagMs": 1.2,
+    "status": "healthy",
+    "alerts": []
+  },
+  "functionMetrics": [...],
+  "circuitBreaker": { "state": "closed", "failures": 0 }
+}
+```
+
+HTTP 503 when status is `critical`.
 
 ### Manual Hook Setup (alternative)
 
@@ -73,7 +107,14 @@ If you prefer not to use the plugin, add hooks directly to `~/.claude/settings.j
   "hooks": {
     "SessionStart": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/session-start.mjs" }],
     "UserPromptSubmit": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/prompt-submit.mjs" }],
+    "PreToolUse": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/pre-tool-use.mjs" }],
     "PostToolUse": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/post-tool-use.mjs" }],
+    "PostToolUseFailure": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/post-tool-failure.mjs" }],
+    "PreCompact": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/pre-compact.mjs" }],
+    "SubagentStart": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/subagent-start.mjs" }],
+    "SubagentStop": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/subagent-stop.mjs" }],
+    "Notification": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/notification.mjs" }],
+    "TaskCompleted": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/task-completed.mjs" }],
     "Stop": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/stop.mjs" }],
     "SessionEnd": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/session-end.mjs" }]
   }
@@ -88,10 +129,13 @@ Every tool use flows through this pipeline:
 
 ```
 PostToolUse hook fires
+  -> Dedup check      SHA-256 hash of tool_name + tool_input (5min window)
   -> mem::privacy     Strip secrets, API keys, <private> tags
   -> mem::observe     Store raw observation, push to real-time stream
   -> mem::compress    LLM extracts: type, facts, narrative, concepts, files
-                      (async -- never blocks your session)
+                      Validates output with Zod, scores quality (0-100)
+                      Self-corrects on validation failure (1 retry)
+                      Records latency + quality in per-function metrics
 ```
 
 ### Context Injection
@@ -101,6 +145,7 @@ On `SessionStart`, agentmemory builds context from your history:
 ```
 SessionStart hook fires
   -> mem::context     Load recent sessions for this project
+                      BM25-ranked search across observations
                       Prefer summaries, fall back to high-importance observations
                       Apply token budget (default: 2000 tokens)
                       Return formatted context block
@@ -113,7 +158,14 @@ SessionStart hook fires
 |------|----------|
 | `SessionStart` | Project path, session ID, working directory |
 | `UserPromptSubmit` | User prompts (privacy-filtered) |
-| `PostToolUse` | Tool name, input, output (Read, Write, Edit, Bash, etc.) |
+| `PreToolUse` | File access patterns (Read, Write, Edit, Glob, Grep) |
+| `PostToolUse` | Tool name, input, output |
+| `PostToolUseFailure` | Failed tool invocations with error context |
+| `PreCompact` | Re-injects memory context before context compaction |
+| `SubagentStart` | Sub-agent spawning events |
+| `SubagentStop` | Sub-agent completion events |
+| `Notification` | System notifications |
+| `TaskCompleted` | Task completion events |
 | `Stop` | Triggers end-of-session summary |
 | `SessionEnd` | Marks session complete |
 
@@ -123,11 +175,101 @@ All data passes through `mem::privacy` before storage:
 - `<private>...</private>` tags are stripped
 - API keys (`sk-*`, `ghp_*`, `xoxb-*`, `AKIA*`), JWTs, and secrets are redacted
 
+## Self-Evaluation
+
+agentmemory v0.3.0 monitors its own health and validates its own I/O.
+
+### Quality Scoring
+
+Every LLM-generated compression and summary is scored 0-100:
+
+| Check | Points |
+|-------|--------|
+| Has structured facts | +20 |
+| Narrative length > 10 chars | +20 |
+| Concepts extracted | +20 |
+| Title quality (not truncated) | +20 |
+| Importance in valid range (1-10) | +20 |
+
+Scores are tracked per-function and exposed via the `/health` endpoint.
+
+### Self-Correction
+
+When LLM output fails Zod validation, agentmemory retries once with a stricter prompt suffix explaining the exact validation errors. This recovers from malformed JSON, missing fields, and out-of-range values.
+
+### Circuit Breaker
+
+LLM providers fail. The circuit breaker prevents cascading failures:
+
+```
+Closed (normal)
+  -> 3 failures in 60s -> Open (all calls rejected)
+  -> 30s cooldown      -> Half-Open (one test call allowed)
+  -> Success           -> Closed (normal)
+  -> Failure           -> Open (restart cooldown)
+```
+
+When the circuit is open, observations are stored raw without compression. No data is lost.
+
+### Health Monitor
+
+Collects every 30 seconds:
+- **Memory**: heap used/total, RSS, external
+- **CPU**: user/system time, real percentage via delta sampling
+- **Event loop lag**: detect blocking operations
+- **Connection state**: engine connectivity
+
+Thresholds:
+| Metric | Warning | Critical |
+|--------|---------|----------|
+| Event loop lag | > 100ms | > 500ms |
+| CPU | > 80% | > 90% |
+| Heap usage | > 80% | > 95% |
+| Connection | reconnecting | disconnected |
+
+## MCP Server
+
+agentmemory exposes 5 tools via MCP for any compatible client:
+
+| Tool | Description |
+|------|-------------|
+| `memory_recall` | Search past observations by keyword |
+| `memory_save` | Save an insight, decision, or pattern to long-term memory |
+| `memory_file_history` | Get past observations about specific files |
+| `memory_patterns` | Detect recurring patterns across sessions |
+| `memory_sessions` | List recent sessions with status |
+
+### MCP Endpoints
+
+```
+GET  /agentmemory/mcp/tools   — List available tools
+POST /agentmemory/mcp/call    — Execute a tool
+```
+
+### Example
+
+```bash
+curl -X POST http://localhost:3111/agentmemory/mcp/call \
+  -H "Content-Type: application/json" \
+  -d '{"name": "memory_recall", "arguments": {"query": "authentication", "limit": 5}}'
+```
+
+## Skills
+
+Four slash commands for interacting with memory:
+
+| Skill | Usage |
+|-------|-------|
+| `/recall` | Search memory for past context (`/recall auth middleware`) |
+| `/remember` | Save something to long-term memory (`/remember always use jose for JWT`) |
+| `/session-history` | Show recent session summaries |
+| `/forget` | Delete specific observations or entire sessions |
+
 ## Configuration
 
 ### LLM Providers
 
-agentmemory needs an LLM for two things: compressing observations and generating session summaries.
+agentmemory needs an LLM for compressing observations and generating session summaries.
 
 | Provider | Config | Cost |
 |----------|--------|------|
@@ -172,17 +314,26 @@ All endpoints on port `3111`. Protected endpoints require `Authorization: Bearer
 
 | Method | Path | Description |
 |--------|------|-------------|
-| `GET` | `/agentmemory/health` | Health check (always public) |
+| `GET` | `/agentmemory/health` | Health check with metrics (always public) |
 | `POST` | `/agentmemory/session/start` | Start session + get context |
 | `POST` | `/agentmemory/session/end` | Mark session complete |
 | `POST` | `/agentmemory/observe` | Capture observation |
 | `POST` | `/agentmemory/context` | Generate context |
-| `POST` | `/agentmemory/search` | Search observations |
+| `POST` | `/agentmemory/search` | Search observations (BM25) |
 | `POST` | `/agentmemory/summarize` | Generate session summary |
+| `POST` | `/agentmemory/remember` | Save to long-term memory |
+| `POST` | `/agentmemory/forget` | Delete observations/sessions |
+| `POST` | `/agentmemory/consolidate` | Merge duplicate observations |
+| `POST` | `/agentmemory/patterns` | Detect recurring patterns |
+| `POST` | `/agentmemory/generate-rules` | Generate CLAUDE.md rules from patterns |
+| `POST` | `/agentmemory/file-context` | Get file-specific history |
+| `POST` | `/agentmemory/evict` | Evict stale memories (supports `?dryRun=true`) |
+| `POST` | `/agentmemory/migrate` | Import from SQLite |
 | `GET` | `/agentmemory/sessions` | List all sessions |
 | `GET` | `/agentmemory/observations?sessionId=X` | Session observations |
 | `GET` | `/agentmemory/viewer` | Real-time web viewer |
-| `POST` | `/agentmemory/migrate` | Import from SQLite |
+| `GET` | `/agentmemory/mcp/tools` | List MCP tools |
+| `POST` | `/agentmemory/mcp/call` | Execute MCP tool |
 
 ### Examples
 
@@ -204,15 +355,18 @@ curl -X POST http://localhost:3111/agentmemory/observe \
     "data": {"tool": "Edit", "file": "src/auth.ts", "content": "Added JWT validation"}
   }'
 
-# Search across all observations
+# Search across all observations (BM25-ranked)
 curl -X POST http://localhost:3111/agentmemory/search \
   -H "Content-Type: application/json" \
   -d '{"query": "authentication middleware", "limit": 10}'
 
-# Get context for a new session
-curl -X POST http://localhost:3111/agentmemory/context \
+# Save something to long-term memory
+curl -X POST http://localhost:3111/agentmemory/remember \
   -H "Content-Type: application/json" \
-  -d '{"sessionId": "ses-2", "project": "/my/project"}'
+  -d '{"content": "Always use jose for JWT in Edge environments", "type": "preference"}'
+
+# Preview memory eviction (dry run)
+curl -X POST "http://localhost:3111/agentmemory/evict?dryRun=true"
 
 # List sessions
 curl http://localhost:3111/agentmemory/sessions
@@ -233,7 +387,7 @@ Connects to iii-engine's WebSocket stream. Dark theme, timeline layout, session
 /plugin install agentmemory
 ```
 
-Restart Claude Code. All 5 hooks are registered automatically.
+Restart Claude Code. All 12 hooks, 4 skills, and MCP tools are registered automatically.
 
 ### Plugin Commands
 
@@ -268,8 +422,28 @@ agentmemory is built on iii-engine's three primitives:
 | SQLite / Postgres | iii KV State |
 | SSE / Socket.io | iii Streams (WebSocket) |
 | pm2 / systemd | iii-engine worker management |
-
-**~30 source files. ~1,800 LOC. 37KB bundled.**
+| Prometheus / Grafana | iii OTEL + built-in health monitor |
+| Redis (circuit breaker) | In-process circuit breaker |
+
+**52 source files. ~4,800 LOC. 82 tests. 195KB bundled.**
+
+### Functions
+
+| Function | Purpose |
+|----------|---------|
+| `mem::observe` | Store raw observation with dedup check |
+| `mem::compress` | LLM compression with validation + quality scoring |
+| `mem::search` | BM25-ranked full-text search |
+| `mem::context` | Build session context within token budget |
+| `mem::summarize` | Generate validated session summaries |
+| `mem::remember` | Save to long-term memory |
+| `mem::forget` | Delete observations, sessions, or memories |
+| `mem::file-index` | File-specific observation lookup |
+| `mem::consolidate` | Merge duplicate observations |
+| `mem::patterns` | Detect recurring patterns |
+| `mem::generate-rules` | Generate CLAUDE.md rules from patterns |
+| `mem::migrate` | Import from SQLite |
+| `mem::evict` | Age + importance + cap-based memory eviction |
 
 ### Data Model
 
@@ -278,13 +452,17 @@ agentmemory is built on iii-engine's three primitives:
 | `mem:sessions` | `{session_id}` | Session metadata, project, timestamps |
 | `mem:obs:{session_id}` | `{obs_id}` | Compressed observations |
 | `mem:summaries` | `{session_id}` | End-of-session summaries |
+| `mem:memories` | `{memory_id}` | Long-term memories (remember/forget) |
+| `mem:metrics` | `{function_id}` | Per-function metrics (latency, quality, success rate) |
+| `mem:health` | `latest` | Latest health snapshot |
+| `mem:config` | `{key}` | Runtime configuration overrides |
 
 ## Development
 
 ```bash
 npm run dev               # Hot reload
 npm run build             # Production build
-npm test                  # Unit tests (45 tests, ~250ms)
+npm test                  # Unit tests (82 tests, ~300ms)
 npm run test:integration  # API tests (requires running services)
 ```