From 17fef54478d2dc881a62477066f37e6a04838bc2 Mon Sep 17 00:00:00 2001 From: Rohit Ghumare Date: Thu, 26 Feb 2026 21:04:10 +0530 Subject: [PATCH] docs: update README for v0.3.0 Reflects all v0.3.0 features: self-evaluation framework, health monitoring, circuit breaker, BM25 search, dedup, eviction, MCP server (5 tools), 4 skills, 12 hooks, 20 API endpoints, and updated architecture stats. --- README.md | 212 +++++++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 195 insertions(+), 17 deletions(-) diff --git a/README.md b/README.md index 84709fc..0bdd638 100644 --- a/README.md +++ b/README.md @@ -6,6 +6,9 @@

Quick StartHow It Works • + Self-Evaluation • + MCP Server • + SkillsConfigurationAPIPlugin Install @@ -31,6 +34,19 @@ Session 2: "Now add rate limiting" Claude Code starts with full project awareness ``` +## What's New in v0.3.0 + +- **Self-evaluation framework** — Zod I/O validation, quality scoring (0-100), self-correcting LLM retries, per-function metrics +- **Health monitoring** — Real-time CPU, memory, event loop lag tracking with degraded/critical alerts +- **Circuit breaker** — Automatic failover when LLM providers go down (closed → open → half-open recovery) +- **BM25 search** — Replaced basic TF scoring with BM25 (k1=1.2, b=0.75) for better search relevance +- **Deduplication** — SHA-256 content hashing with 5-minute TTL window prevents duplicate observations +- **Memory eviction** — Age-based + importance-based + per-project cap eviction with dry-run support +- **MCP server** — 5 tools for any MCP-compatible client (memory_recall, memory_save, memory_file_history, memory_patterns, memory_sessions) +- **4 skills** — /recall, /remember, /session-history, /forget +- **12 hooks** — All Claude Code hook types covered (up from 5) +- **OTEL telemetry** — Counters and histograms for observability + ## Quick Start ### 1. Install the Plugin @@ -40,7 +56,7 @@ Session 2: "Now add rate limiting" /plugin install agentmemory ``` -All 5 hooks are registered automatically. No manual config needed. +All 12 hooks, 4 skills, and MCP server are registered automatically. ### 2. Start the Worker @@ -59,10 +75,28 @@ npm install && npm run build && npm start ```bash curl http://localhost:3111/agentmemory/health -# {"status":"ok","service":"agentmemory","version":"0.1.0"} ``` -That's it. Start a Claude Code session and agentmemory begins capturing. Start another session and it injects context from all previous sessions. +Returns real health data: + +```json +{ + "status": "healthy", + "service": "agentmemory", + "version": "0.3.0", + "health": { + "memory": { "heapUsed": 42000000, "heapTotal": 67000000 }, + "cpu": { "percent": 2.1 }, + "eventLoopLagMs": 1.2, + "status": "healthy", + "alerts": [] + }, + "functionMetrics": [...], + "circuitBreaker": { "state": "closed", "failures": 0 } +} +``` + +HTTP 503 when status is `critical`. ### Manual Hook Setup (alternative) @@ -73,7 +107,14 @@ If you prefer not to use the plugin, add hooks directly to `~/.claude/settings.j "hooks": { "SessionStart": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/session-start.mjs" }], "UserPromptSubmit": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/prompt-submit.mjs" }], + "PreToolUse": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/pre-tool-use.mjs" }], "PostToolUse": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/post-tool-use.mjs" }], + "PostToolUseFailure": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/post-tool-failure.mjs" }], + "PreCompact": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/pre-compact.mjs" }], + "SubagentStart": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/subagent-start.mjs" }], + "SubagentStop": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/subagent-stop.mjs" }], + "Notification": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/notification.mjs" }], + "TaskCompleted": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/task-completed.mjs" }], "Stop": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/stop.mjs" }], "SessionEnd": [{ "type": "command", "command": "node ~/agentmemory/dist/hooks/session-end.mjs" }] } @@ -88,10 +129,13 @@ Every tool use flows through this pipeline: ``` PostToolUse hook fires + -> Dedup check SHA-256 hash of tool_name + tool_input (5min window) -> mem::privacy Strip secrets, API keys, tags -> mem::observe Store raw observation, push to real-time stream -> mem::compress LLM extracts: type, facts, narrative, concepts, files - (async -- never blocks your session) + Validates output with Zod, scores quality (0-100) + Self-corrects on validation failure (1 retry) + Records latency + quality in per-function metrics ``` ### Context Injection @@ -101,6 +145,7 @@ On `SessionStart`, agentmemory builds context from your history: ``` SessionStart hook fires -> mem::context Load recent sessions for this project + BM25-ranked search across observations Prefer summaries, fall back to high-importance observations Apply token budget (default: 2000 tokens) Return formatted context block @@ -113,7 +158,14 @@ SessionStart hook fires |------|----------| | `SessionStart` | Project path, session ID, working directory | | `UserPromptSubmit` | User prompts (privacy-filtered) | -| `PostToolUse` | Tool name, input, output (Read, Write, Edit, Bash, etc.) | +| `PreToolUse` | File access patterns (Read, Write, Edit, Glob, Grep) | +| `PostToolUse` | Tool name, input, output | +| `PostToolUseFailure` | Failed tool invocations with error context | +| `PreCompact` | Re-injects memory context before context compaction | +| `SubagentStart` | Sub-agent spawning events | +| `SubagentStop` | Sub-agent completion events | +| `Notification` | System notifications | +| `TaskCompleted` | Task completion events | | `Stop` | Triggers end-of-session summary | | `SessionEnd` | Marks session complete | @@ -123,11 +175,101 @@ All data passes through `mem::privacy` before storage: - `...` tags are stripped - API keys (`sk-*`, `ghp_*`, `xoxb-*`, `AKIA*`), JWTs, and secrets are redacted +## Self-Evaluation + +agentmemory v0.3.0 monitors its own health and validates its own I/O. + +### Quality Scoring + +Every LLM-generated compression and summary is scored 0-100: + +| Check | Points | +|-------|--------| +| Has structured facts | +20 | +| Narrative length > 10 chars | +20 | +| Concepts extracted | +20 | +| Title quality (not truncated) | +20 | +| Importance in valid range (1-10) | +20 | + +Scores are tracked per-function and exposed via the `/health` endpoint. + +### Self-Correction + +When LLM output fails Zod validation, agentmemory retries once with a stricter prompt suffix explaining the exact validation errors. This recovers from malformed JSON, missing fields, and out-of-range values. + +### Circuit Breaker + +LLM providers fail. The circuit breaker prevents cascading failures: + +``` +Closed (normal) + -> 3 failures in 60s -> Open (all calls rejected) + -> 30s cooldown -> Half-Open (one test call allowed) + -> Success -> Closed (normal) + -> Failure -> Open (restart cooldown) +``` + +When the circuit is open, observations are stored raw without compression. No data is lost. + +### Health Monitor + +Collects every 30 seconds: +- **Memory**: heap used/total, RSS, external +- **CPU**: user/system time, real percentage via delta sampling +- **Event loop lag**: detect blocking operations +- **Connection state**: engine connectivity + +Thresholds: +| Metric | Warning | Critical | +|--------|---------|----------| +| Event loop lag | > 100ms | > 500ms | +| CPU | > 80% | > 90% | +| Heap usage | > 80% | > 95% | +| Connection | reconnecting | disconnected | + +## MCP Server + +agentmemory exposes 5 tools via MCP for any compatible client: + +| Tool | Description | +|------|-------------| +| `memory_recall` | Search past observations by keyword | +| `memory_save` | Save an insight, decision, or pattern to long-term memory | +| `memory_file_history` | Get past observations about specific files | +| `memory_patterns` | Detect recurring patterns across sessions | +| `memory_sessions` | List recent sessions with status | + +### MCP Endpoints + +``` +GET /agentmemory/mcp/tools — List available tools +POST /agentmemory/mcp/call — Execute a tool +``` + +### Example + +```bash +curl -X POST http://localhost:3111/agentmemory/mcp/call \ + -H "Content-Type: application/json" \ + -d '{"name": "memory_recall", "arguments": {"query": "authentication", "limit": 5}}' +``` + +## Skills + +Four slash commands for interacting with memory: + +| Skill | Usage | +|-------|-------| +| `/recall` | Search memory for past context (`/recall auth middleware`) | +| `/remember` | Save something to long-term memory (`/remember always use jose for JWT`) | +| `/session-history` | Show recent session summaries | +| `/forget` | Delete specific observations or entire sessions | + ## Configuration ### LLM Providers -agentmemory needs an LLM for two things: compressing observations and generating session summaries. +agentmemory needs an LLM for compressing observations and generating session summaries. | Provider | Config | Cost | |----------|--------|------| @@ -172,17 +314,26 @@ All endpoints on port `3111`. Protected endpoints require `Authorization: Bearer | Method | Path | Description | |--------|------|-------------| -| `GET` | `/agentmemory/health` | Health check (always public) | +| `GET` | `/agentmemory/health` | Health check with metrics (always public) | | `POST` | `/agentmemory/session/start` | Start session + get context | | `POST` | `/agentmemory/session/end` | Mark session complete | | `POST` | `/agentmemory/observe` | Capture observation | | `POST` | `/agentmemory/context` | Generate context | -| `POST` | `/agentmemory/search` | Search observations | +| `POST` | `/agentmemory/search` | Search observations (BM25) | | `POST` | `/agentmemory/summarize` | Generate session summary | +| `POST` | `/agentmemory/remember` | Save to long-term memory | +| `POST` | `/agentmemory/forget` | Delete observations/sessions | +| `POST` | `/agentmemory/consolidate` | Merge duplicate observations | +| `POST` | `/agentmemory/patterns` | Detect recurring patterns | +| `POST` | `/agentmemory/generate-rules` | Generate CLAUDE.md rules from patterns | +| `POST` | `/agentmemory/file-context` | Get file-specific history | +| `POST` | `/agentmemory/evict` | Evict stale memories (supports `?dryRun=true`) | +| `POST` | `/agentmemory/migrate` | Import from SQLite | | `GET` | `/agentmemory/sessions` | List all sessions | | `GET` | `/agentmemory/observations?sessionId=X` | Session observations | | `GET` | `/agentmemory/viewer` | Real-time web viewer | -| `POST` | `/agentmemory/migrate` | Import from SQLite | +| `GET` | `/agentmemory/mcp/tools` | List MCP tools | +| `POST` | `/agentmemory/mcp/call` | Execute MCP tool | ### Examples @@ -204,15 +355,18 @@ curl -X POST http://localhost:3111/agentmemory/observe \ "data": {"tool": "Edit", "file": "src/auth.ts", "content": "Added JWT validation"} }' -# Search across all observations +# Search across all observations (BM25-ranked) curl -X POST http://localhost:3111/agentmemory/search \ -H "Content-Type: application/json" \ -d '{"query": "authentication middleware", "limit": 10}' -# Get context for a new session -curl -X POST http://localhost:3111/agentmemory/context \ +# Save something to long-term memory +curl -X POST http://localhost:3111/agentmemory/remember \ -H "Content-Type: application/json" \ - -d '{"sessionId": "ses-2", "project": "/my/project"}' + -d '{"content": "Always use jose for JWT in Edge environments", "type": "preference"}' + +# Preview memory eviction (dry run) +curl -X POST "http://localhost:3111/agentmemory/evict?dryRun=true" # List sessions curl http://localhost:3111/agentmemory/sessions @@ -233,7 +387,7 @@ Connects to iii-engine's WebSocket stream. Dark theme, timeline layout, session /plugin install agentmemory ``` -Restart Claude Code. All 5 hooks are registered automatically. +Restart Claude Code. All 12 hooks, 4 skills, and MCP tools are registered automatically. ### Plugin Commands @@ -268,8 +422,28 @@ agentmemory is built on iii-engine's three primitives: | SQLite / Postgres | iii KV State | | SSE / Socket.io | iii Streams (WebSocket) | | pm2 / systemd | iii-engine worker management | - -**~30 source files. ~1,800 LOC. 37KB bundled.** +| Prometheus / Grafana | iii OTEL + built-in health monitor | +| Redis (circuit breaker) | In-process circuit breaker | + +**52 source files. ~4,800 LOC. 82 tests. 195KB bundled.** + +### Functions + +| Function | Purpose | +|----------|---------| +| `mem::observe` | Store raw observation with dedup check | +| `mem::compress` | LLM compression with validation + quality scoring | +| `mem::search` | BM25-ranked full-text search | +| `mem::context` | Build session context within token budget | +| `mem::summarize` | Generate validated session summaries | +| `mem::remember` | Save to long-term memory | +| `mem::forget` | Delete observations, sessions, or memories | +| `mem::file-index` | File-specific observation lookup | +| `mem::consolidate` | Merge duplicate observations | +| `mem::patterns` | Detect recurring patterns | +| `mem::generate-rules` | Generate CLAUDE.md rules from patterns | +| `mem::migrate` | Import from SQLite | +| `mem::evict` | Age + importance + cap-based memory eviction | ### Data Model @@ -278,13 +452,17 @@ agentmemory is built on iii-engine's three primitives: | `mem:sessions` | `{session_id}` | Session metadata, project, timestamps | | `mem:obs:{session_id}` | `{obs_id}` | Compressed observations | | `mem:summaries` | `{session_id}` | End-of-session summaries | +| `mem:memories` | `{memory_id}` | Long-term memories (remember/forget) | +| `mem:metrics` | `{function_id}` | Per-function metrics (latency, quality, success rate) | +| `mem:health` | `latest` | Latest health snapshot | +| `mem:config` | `{key}` | Runtime configuration overrides | ## Development ```bash npm run dev # Hot reload npm run build # Production build -npm test # Unit tests (45 tests, ~250ms) +npm test # Unit tests (82 tests, ~300ms) npm run test:integration # API tests (requires running services) ```