diff --git a/README.md b/README.md index 45f4de5..e635814 100644 --- a/README.md +++ b/README.md @@ -84,7 +84,7 @@ npx @agentmemory/agentmemory -> Embedding model: `all-MiniLM-L6-v2` (local, free, no API key). Full reports: [`benchmark/LONGMEMEVAL.md`](benchmark/LONGMEMEVAL.md), [`benchmark/QUALITY.md`](benchmark/QUALITY.md), [`benchmark/SCALE.md`](benchmark/SCALE.md) +> Embedding model: `all-MiniLM-L6-v2` (local, free, no API key). Full reports: [`benchmark/LONGMEMEVAL.md`](benchmark/LONGMEMEVAL.md), [`benchmark/QUALITY.md`](benchmark/QUALITY.md), [`benchmark/SCALE.md`](benchmark/SCALE.md). Competitor comparison: [`benchmark/COMPARISON.md`](benchmark/COMPARISON.md) — agentmemory vs mem0, Letta, Khoj, claude-mem, Hippo. --- @@ -210,6 +210,20 @@ agentmemory works with any agent that supports hooks, MCP, or REST API. All agen ## Quick Start +### Try it in 30 seconds + +```bash +# Terminal 1: start the server +npx @agentmemory/agentmemory + +# Terminal 2: seed sample data and see recall in action +npx @agentmemory/agentmemory demo +``` + +`demo` seeds 3 realistic sessions (JWT auth, N+1 query fix, rate limiting) and runs semantic searches against them. You'll see it find "N+1 query fix" when you search "database performance optimization" — keyword matching can't do that. + +Open `http://localhost:3113` to watch the memory build live. + ### Claude Code (one block, paste it) ``` @@ -225,7 +239,7 @@ Then add the MCP config for your agent: | Agent | Setup | |---|---| | **Cursor** | Add to `~/.cursor/mcp.json`: `{"mcpServers": {"agentmemory": {"command": "npx", "args": ["agentmemory-mcp"]}}}` | -| **OpenClaw** | Add to MCP config: `{"mcpServers": {"agentmemory": {"command": "npx", "args": ["agentmemory-mcp"]}}}` | +| **OpenClaw** | Add to MCP config: `{"mcpServers": {"agentmemory": {"command": "npx", "args": ["agentmemory-mcp"]}}}` or use the [gateway plugin](integrations/openclaw/) | | **Gemini CLI** | `gemini mcp add agentmemory -- npx agentmemory-mcp` | | **Codex CLI** | Add to `.codex/config.yaml`: `mcp_servers: {agentmemory: {command: npx, args: ["agentmemory-mcp"]}}` | | **OpenCode** | Add to `.opencode/config.json`: `{"mcpServers": {"agentmemory": {"command": "npx", "args": ["agentmemory-mcp"]}}}` | diff --git a/benchmark/COMPARISON.md b/benchmark/COMPARISON.md new file mode 100644 index 0000000..83f7a39 --- /dev/null +++ b/benchmark/COMPARISON.md @@ -0,0 +1,151 @@ +# AI Agent Memory: Benchmark Comparison + +How agentmemory compares against other persistent memory solutions for AI coding agents. + +All numbers here come from published benchmarks or public repositories. We link to primary sources wherever possible so you can reproduce. + +--- + +## Retrieval Accuracy (LongMemEval) + +[LongMemEval](https://arxiv.org/abs/2410.10813) (ICLR 2025) measures long-term memory retrieval across ~48 sessions per question on the S variant (500 questions, ~115K tokens each). + +| System | Benchmark | R@5 | Notes | +|---|---|---|---| +| **agentmemory** (BM25 + Vector) | LongMemEval-S | **95.2%** | `all-MiniLM-L6-v2` embeddings, no API key | +| agentmemory (BM25-only) | LongMemEval-S | 86.2% | Fallback when no embedding provider available | +| MemPalace | LongMemEval-S | ~96.6% | Vector-only, bigger embedding model | +| Letta / MemGPT | LoCoMo | 83.2% | Different benchmark (LoCoMo, not LongMemEval) | +| Mem0 | LoCoMo | 68.5% | Different benchmark (LoCoMo, not LongMemEval) | + +**⚠️ Apples vs oranges caveat:** agentmemory and MemPalace are measured on LongMemEval-S. Letta and Mem0 publish on [LoCoMo](https://snap-stanford.github.io/LoCoMo/), a different benchmark. We're showing both so you can see the ballpark. We'd love to run all four on the same dataset — if any maintainer wants to collaborate, open an issue. + +Full agentmemory methodology: [`LONGMEMEVAL.md`](LONGMEMEVAL.md) + +--- + +## Feature Matrix + +| Feature | agentmemory | mem0 | Letta/MemGPT | Khoj | claude-mem | Hippo | +|---|---|---|---|---|---|---| +| **GitHub stars** | Growing | 53K+ | 22K+ | 34K+ | 46K+ | Trending | +| **Type** | Memory engine + MCP server | Memory layer API | Full agent runtime | Personal AI | MCP server | Memory system | +| **Auto-capture via hooks** | ✅ 12 lifecycle hooks | ❌ Manual `add()` | ❌ Agent self-edits | ❌ Manual | ✅ Limited | ❌ Manual | +| **Search strategy** | BM25 + Vector + Graph | Vector + Graph | Vector (archival) | Semantic | FTS5 | Decay-weighted | +| **Multi-agent coordination** | ✅ Leases + signals + mesh | ❌ | Runtime-internal only | ❌ | ❌ | Multi-agent shared | +| **Framework lock-in** | None | None | High | Standalone | Claude Code | None | +| **External deps** | None | Qdrant/pgvector | Postgres + vector | Multiple | None (SQLite) | None | +| **Self-hostable** | ✅ default | Optional | Optional | ✅ | ✅ | ✅ | +| **Knowledge graph** | ✅ Entity extraction + BFS | ✅ Mem0g variant | ❌ | Doc links | ❌ | ❌ | +| **Memory decay** | ✅ Ebbinghaus + tiered | ❌ | ❌ | ❌ | ❌ | ✅ Half-lives | +| **4-tier consolidation** | ✅ Working → episodic → semantic → procedural | ❌ | OS-inspired tiers | ❌ | ❌ | Episodic + semantic | +| **Version / supersession** | ✅ Jaccard-based | Passive | ❌ | ❌ | ❌ | ❌ | +| **Real-time viewer** | ✅ Port 3113 | Cloud dashboard | Cloud dashboard | Web UI | ❌ | ❌ | +| **Privacy filtering** | ✅ Strips secrets pre-store | ❌ | ❌ | ❌ | ❌ | ❌ | +| **Obsidian export** | ✅ Built-in | ❌ | ❌ | Native format | ❌ | ❌ | +| **Cross-agent** | ✅ MCP + REST | API calls | Within runtime | Standalone | Claude-only | Multi-agent shared | +| **Audit trail** | ✅ All mutations logged | ❌ | Limited | ❌ | ❌ | ❌ | +| **Language SDKs** | Any (REST + MCP) | Python + TS | Python only | API | Any (MCP) | Node | + +--- + +## Token Efficiency + +The main reason to use persistent memory at all: token cost. Here's what one year of heavy agent use looks like across approaches. + +| Approach | Tokens / year | Cost / year | Notes | +|---|---|---|---| +| Paste full history into context | 19.5M+ | Impossible | Exceeds context window after ~200 observations | +| LLM-summarized memory (extraction-based) | ~650K | ~$500 | Lossy — summarization drops detail | +| **agentmemory (API embeddings)** | **~170K** | **~$10** | Token-budgeted, only relevant memories injected | +| **agentmemory (local embeddings)** | **~170K** | **$0** | `all-MiniLM-L6-v2` runs in-process | +| claude-mem | Reports ~10x savings | — | SQLite + FTS5 + 3-layer filter | +| Mem0 | Varies by integration | — | Extraction-based, no token budget | + +**agentmemory ships with a built-in token savings calculator.** Run `npx @agentmemory/agentmemory status` after a few sessions and you'll see exactly how many tokens you've saved vs. pasting the full history. + +--- + +## What Each Tool Is Best At + +This isn't a "agentmemory wins everything" page. Different tools solve different problems. + +**Choose agentmemory if you want:** +- Automatic capture with zero manual `add()` calls +- MCP server that works across Claude Code, Cursor, Codex, Gemini CLI, etc. +- Hybrid BM25 + vector + graph search +- Real-time viewer to see what your agent is learning +- Self-hostable with zero external databases +- Privacy filtering on API keys and secrets +- Multi-agent coordination (leases, signals, routines) + +**Choose Mem0 if you want:** +- Framework-agnostic API to bolt onto an existing agent +- Managed cloud option with a dashboard +- Python + TypeScript SDKs for direct integration +- Entity/relationship extraction as the primary abstraction + +**Choose Letta/MemGPT if you want:** +- A full agent runtime, not just memory +- OS-inspired memory tiers (core/archival/recall) +- Agents that self-edit their memory via function calls +- Long-running conversational agents (weeks/months) + +**Choose Khoj if you want:** +- A personal AI second brain, not agent infrastructure +- Document-first search over your files and the web +- Obsidian/Notion/Emacs integrations +- Scheduled automations and research tasks + +**Choose claude-mem if you want:** +- Claude Code-specific tooling with SQLite + FTS5 +- Minimal install footprint +- Token compression via LLM + +**Choose Hippo if you want:** +- Biologically-inspired memory model (decay, consolidation, sleep) +- Multi-agent shared memory as a primary feature +- "Forget by default, earn persistence through use" philosophy + +--- + +## Running Your Own Benchmarks + +We encourage you to measure this yourself rather than trust any README. Here's how: + +```bash +# Clone the repo +git clone https://github.com/rohitg00/agentmemory.git +cd agentmemory && npm install + +# Run LongMemEval-S +npm run bench:longmemeval + +# Run quality benchmark (240 observations, 20 queries) +npm run bench:quality + +# Run scale benchmark +npm run bench:scale + +# Run real embeddings benchmark +npm run bench:real-embeddings +``` + +Results land in `benchmark/results/`. All scripts, datasets, and results are committed for reproducibility. + +--- + +## Corrections Welcome + +If you maintain one of these tools and we got a number wrong, please open an issue or PR. We'd rather have accurate numbers than convenient ones. + +If you want to add your tool to this comparison, open a PR with: +1. A link to your benchmark methodology +2. The metric and dataset you're measuring on +3. A commit hash / version so we can reproduce + +**Sources:** +- Mem0 LoCoMo benchmark: [mem0.ai blog](https://mem0.ai) +- Letta LoCoMo benchmark: [letta.com/blog/benchmarking-ai-agent-memory](https://letta.com/blog/benchmarking-ai-agent-memory) +- LongMemEval paper: [arxiv.org/abs/2410.10813](https://arxiv.org/abs/2410.10813) +- LoCoMo paper: [snap-stanford.github.io/LoCoMo](https://snap-stanford.github.io/LoCoMo/) diff --git a/integrations/openclaw/README.md b/integrations/openclaw/README.md new file mode 100644 index 0000000..2bd4d90 --- /dev/null +++ b/integrations/openclaw/README.md @@ -0,0 +1,122 @@ +# agentmemory for OpenClaw + +Persistent cross-session memory for [OpenClaw](https://github.com/openclaw/openclaw) via agentmemory. Gives every OpenClaw agent a searchable long-term memory with 95.2% retrieval accuracy on [LongMemEval-S](https://arxiv.org/abs/2410.10813). + +## Why you want this + +OpenClaw agents restart fresh every session. You waste tokens re-explaining architecture, re-discovering bugs, re-teaching preferences. agentmemory captures every tool use automatically and injects relevant context when the next session starts. + +- **92% fewer tokens** per session vs full-context pasting +- **12 auto-capture hooks** — zero manual `memory.add()` calls +- **MCP-native** — same server works for Claude Code, Cursor, Gemini CLI, Hermes, and OpenClaw at the same time +- **Self-hosted** — no external database, no cloud, no API key needed for embeddings + +## Quick setup + +### Option 1: MCP server (zero code) + +Start the agentmemory server in a separate terminal: + +```bash +npx @agentmemory/agentmemory +``` + +Then add to your OpenClaw MCP config: + +```json +{ + "mcpServers": { + "agentmemory": { + "command": "npx", + "args": ["agentmemory-mcp"] + } + } +} +``` + +OpenClaw now has access to all 43 MCP tools including `memory_recall`, `memory_save`, `memory_smart_search`, `memory_timeline`, `memory_profile`, and more. + +### Option 2: Gateway plugin (deeper integration) + +If you're running an OpenClaw gateway, drop this folder into your gateway's plugins directory: + +```bash +cp -r integrations/openclaw ~/.openclaw/plugins/memory/agentmemory +``` + +Start the agentmemory server: + +```bash +npx @agentmemory/agentmemory +``` + +The plugin auto-detects the running server and hooks into the OpenClaw agent loop: + +- `onSessionStart` starts a new session on the agentmemory server and injects any returned context +- `onPreLlmCall` injects token-budgeted memories before each LLM call (BM25 + vector + graph fusion) +- `onPostToolUse` records every tool use, error, and decision after execution +- `onSessionEnd` marks the session complete so raw observations can be compressed into structured memory + +Configure via `~/.openclaw/plugins/memory/agentmemory/config.yaml`: + +```yaml +enabled: true +base_url: http://localhost:3111 +token_budget: 2000 +min_confidence: 0.5 +``` + +## What your agent gets + +### Automatic context injection + +When a session starts, agentmemory injects ~1,900 tokens of the most relevant past context: + +```text +Project profile: + - Auth uses JWT middleware in src/middleware/auth.ts (jose, not jsonwebtoken) + - Tests in test/auth.test.ts cover token validation + - Database uses Prisma with include{} to avoid N+1 queries + - Rate limiting: 100 req/min default, Redis for prod + +Recent decisions: + - Chose jose over jsonwebtoken for Edge compatibility (2026-03-15) + - N+1 fix dropped query time 450ms → 28ms (2026-03-20) +``` + +### Semantic search across sessions + +Ask "what was that fix for slow user queries?" and the agent finds the Prisma include{} decision from three weeks ago. BM25 + vector + knowledge graph fusion. + +### Privacy filtering + +Every captured observation is scanned for API keys, secrets, bearer tokens, and `` tags. These are stripped before storage. Modern token formats supported: `sk-`, `sk-proj-`, `ghp_/ghs_/ghu_`, AWS keys, and more. + +### Multi-agent coordination + +If you're running multiple OpenClaw agents on the same codebase: + +- **Leases** give one agent exclusive claim on an action so they don't stomp each other +- **Signals** let agents send threaded messages to each other with read receipts +- **Mesh sync** shares memory between agentmemory instances (requires `AGENTMEMORY_SECRET`) + +## Troubleshooting + +**"Connection refused on port 3111"** — The agentmemory server isn't running. Start it with `npx @agentmemory/agentmemory` in a separate terminal. + +**"No memories returned"** — Check `http://localhost:3113` (the real-time viewer). If there are no observations, the hooks aren't firing. Make sure your OpenClaw plugin is loaded and enabled. + +**"Search returns irrelevant results"** — Install local embeddings: `npm install @xenova/transformers`. This enables vector search for +8pp recall over BM25-only. + +**"I want to see what agentmemory is learning"** — Open `http://localhost:3113` in a browser. Live observation stream, session explorer, memory graph, and health dashboard. + +## See also + +- [agentmemory main README](../../README.md) +- [Benchmark results](../../benchmark/LONGMEMEVAL.md) — 95.2% R@5 on LongMemEval-S +- [Competitor comparison](../../benchmark/COMPARISON.md) — vs mem0, Letta, Khoj, claude-mem, Hippo +- [Hermes integration](../hermes/README.md) — same server also works with Hermes Agent + +## License + +Apache-2.0 (same as agentmemory) diff --git a/integrations/openclaw/plugin.mjs b/integrations/openclaw/plugin.mjs new file mode 100644 index 0000000..7850f3f --- /dev/null +++ b/integrations/openclaw/plugin.mjs @@ -0,0 +1,97 @@ +/** + * agentmemory plugin for OpenClaw gateway + * + * Hooks into the OpenClaw agent loop: + * - onSessionStart: starts a session on the memory server and injects any returned context + * - onPreLlmCall: injects token-budgeted memories before each LLM call + * - onPostToolUse: records every tool use, error, and decision after execution + * - onSessionEnd: marks the session complete for downstream compression + * + * Requires the agentmemory server running on localhost:3111. + * Start it with: npx @agentmemory/agentmemory + */ + +const DEFAULT_BASE_URL = "http://localhost:3111"; +const DEFAULT_TIMEOUT_MS = 5000; + +export class AgentmemoryPlugin { + constructor(config = {}) { + this.enabled = config.enabled !== false; + this.baseUrl = config.base_url ?? DEFAULT_BASE_URL; + this.tokenBudget = config.token_budget ?? 2000; + this.minConfidence = config.min_confidence ?? 0.5; + this.fallbackOnError = config.fallback_on_error !== false; + this.timeoutMs = config.timeout_ms ?? DEFAULT_TIMEOUT_MS; + this.secret = process.env.AGENTMEMORY_SECRET; + } + + get name() { + return "agentmemory"; + } + + async postJson(path, payload) { + const headers = { "Content-Type": "application/json" }; + if (this.secret) headers["Authorization"] = `Bearer ${this.secret}`; + + try { + const res = await fetch(`${this.baseUrl}${path}`, { + method: "POST", + headers, + body: JSON.stringify(payload), + signal: AbortSignal.timeout(this.timeoutMs), + }); + if (!res.ok) { + if (this.fallbackOnError) return null; + const body = await res.text().catch(() => ""); + throw new Error( + `agentmemory POST ${path} failed: ${res.status} ${res.statusText}${body ? ` — ${body.slice(0, 200)}` : ""}`, + ); + } + return await res.json(); + } catch (err) { + if (!this.fallbackOnError) throw err; + return null; + } + } + + async onSessionStart(ctx) { + if (!this.enabled) return; + const result = await this.postJson("/agentmemory/session/start", { + sessionId: ctx.sessionId, + project: ctx.project || ctx.cwd, + cwd: ctx.cwd, + }); + if (result?.context) ctx.injectContext(result.context); + } + + async onPreLlmCall(ctx) { + if (!this.enabled) return; + const result = await this.postJson("/agentmemory/context", { + sessionId: ctx.sessionId, + project: ctx.project || ctx.cwd, + budget: this.tokenBudget, + }); + if (result?.context) ctx.injectContext(result.context); + } + + async onPostToolUse(ctx) { + if (!this.enabled) return; + await this.postJson("/agentmemory/observe", { + hookType: "post_tool_use", + sessionId: ctx.sessionId, + timestamp: new Date().toISOString(), + data: { + tool_name: ctx.toolName, + tool_input: ctx.toolInput, + tool_output: ctx.toolOutput, + }, + }); + } + + async onSessionEnd(ctx) { + if (!this.enabled) return; + await this.postJson("/agentmemory/session/end", { sessionId: ctx.sessionId }); + } +} + +export default AgentmemoryPlugin; diff --git a/integrations/openclaw/plugin.yaml b/integrations/openclaw/plugin.yaml new file mode 100644 index 0000000..f991323 --- /dev/null +++ b/integrations/openclaw/plugin.yaml @@ -0,0 +1,27 @@ +name: agentmemory +version: 0.8.1 +description: "Persistent cross-session memory for OpenClaw via agentmemory. 95.2% retrieval accuracy on LongMemEval-S." +author: "Rohit Ghumare" +homepage: "https://github.com/rohitg00/agentmemory" +license: Apache-2.0 + +category: memory +tags: + - memory + - persistence + - mcp + - context + +hooks: + - on_session_start + - on_pre_llm_call + - on_post_tool_use + - on_session_end + +config: + enabled: true + base_url: http://localhost:3111 + token_budget: 2000 + min_confidence: 0.5 + fallback_on_error: true + timeout_ms: 5000 diff --git a/src/cli.ts b/src/cli.ts index 000621c..807f966 100644 --- a/src/cli.ts +++ b/src/cli.ts @@ -5,6 +5,7 @@ import { existsSync } from "node:fs"; import { join, dirname } from "node:path"; import { fileURLToPath } from "node:url"; import * as p from "@clack/prompts"; +import { generateId } from "./state/schema.js"; const __dirname = dirname(fileURLToPath(import.meta.url)); const args = process.argv.slice(2); @@ -18,6 +19,7 @@ Usage: agentmemory [command] [options] Commands: (default) Start agentmemory worker status Show connection status, memory count, and health + demo Seed sample sessions and show recall in action Options: --help, -h Show this help @@ -28,6 +30,7 @@ Options: Quick start: npx @agentmemory/agentmemory # start with local iii-engine or Docker npx @agentmemory/agentmemory status # check health + npx @agentmemory/agentmemory demo # try it in 30 seconds (needs server running) npx agentmemory-mcp # standalone MCP server (no engine) `); process.exit(0); @@ -267,14 +270,249 @@ async function runStatus() { } } -if (args[0] === "status") { - runStatus().catch((err) => { - p.log.error(err instanceof Error ? err.message : String(err)); - process.exit(1); +type DemoObservation = { + toolName: string; + toolInput: Record; + toolOutput: string; +}; + +type DemoSession = { + id: string; + title: string; + observations: DemoObservation[]; +}; + +type SearchResult = { query: string; hits: number; topTitle: string }; + +function buildDemoSessions(): DemoSession[] { + return [ + { + id: generateId("demo"), + title: "Session 1: JWT auth setup", + observations: [ + { + toolName: "Write", + toolInput: { file_path: "src/middleware/auth.ts" }, + toolOutput: + "Created JWT middleware using jose library. Tokens expire after 30 days. Chose jose over jsonwebtoken for Edge compatibility.", + }, + { + toolName: "Write", + toolInput: { file_path: "test/auth.test.ts" }, + toolOutput: + "Added token validation tests covering expired, malformed, and valid cases.", + }, + { + toolName: "Bash", + toolInput: { command: "npm test" }, + toolOutput: "All 12 auth tests passing.", + }, + ], + }, + { + id: generateId("demo"), + title: "Session 2: Database migration debugging", + observations: [ + { + toolName: "Read", + toolInput: { file_path: "prisma/schema.prisma" }, + toolOutput: + "Found N+1 query issue in user relations. Need to add include on posts query.", + }, + { + toolName: "Edit", + toolInput: { file_path: "src/api/users.ts" }, + toolOutput: + "Fixed N+1 by adding Prisma include. Query time dropped from 450ms to 28ms.", + }, + ], + }, + { + id: generateId("demo"), + title: "Session 3: Rate limiting", + observations: [ + { + toolName: "Write", + toolInput: { file_path: "src/middleware/ratelimit.ts" }, + toolOutput: + "Added rate limiting middleware with 100 req/min default. Uses in-memory store for dev, Redis for prod.", + }, + ], + }, + ]; +} + +async function postJson( + url: string, + body: unknown, + timeoutMs = 5000, +): Promise { + try { + const res = await fetch(url, { + method: "POST", + headers: { "Content-Type": "application/json" }, + body: JSON.stringify(body), + signal: AbortSignal.timeout(timeoutMs), + }); + if (!res.ok) return null; + return (await res.json().catch(() => null)) as T | null; + } catch { + return null; + } +} + +async function postJsonStrict( + url: string, + body: unknown, + timeoutMs = 5000, +): Promise { + const res = await fetch(url, { + method: "POST", + headers: { "Content-Type": "application/json" }, + body: JSON.stringify(body), + signal: AbortSignal.timeout(timeoutMs), }); -} else { - main().catch((err) => { - p.log.error(err instanceof Error ? err.message : String(err)); - process.exit(1); + if (!res.ok) { + const errBody = await res.text().catch(() => ""); + const suffix = errBody ? ` — ${errBody.slice(0, 200)}` : ""; + throw new Error(`POST ${url} failed: ${res.status} ${res.statusText}${suffix}`); + } + return (await res.json().catch(() => null)) as T | null; +} + +async function seedDemoSession( + base: string, + project: string, + session: DemoSession, +): Promise { + await postJsonStrict(`${base}/agentmemory/session/start`, { + sessionId: session.id, + project, + cwd: project, }); + + let stored = 0; + for (const obs of session.observations) { + const url = `${base}/agentmemory/observe`; + const payload = { + hookType: "post_tool_use", + sessionId: session.id, + timestamp: new Date().toISOString(), + data: { + tool_name: obs.toolName, + tool_input: obs.toolInput, + tool_output: obs.toolOutput, + }, + }; + + try { + const res = await fetch(url, { + method: "POST", + headers: { "Content-Type": "application/json" }, + body: JSON.stringify(payload), + signal: AbortSignal.timeout(5000), + }); + if (res.ok) { + stored++; + } else { + const body = await res.text().catch(() => ""); + p.log.warn( + `observe failed for ${obs.toolName}: ${res.status} ${res.statusText}${body ? ` — ${body.slice(0, 160)}` : ""}`, + ); + } + } catch (err) { + p.log.warn( + `observe request failed for ${obs.toolName}: ${err instanceof Error ? err.message : String(err)}`, + ); + } + } + + await postJsonStrict(`${base}/agentmemory/session/end`, { sessionId: session.id }); + return stored; +} + +async function runDemoSearch(base: string, query: string): Promise { + const data = await postJson<{ results?: Array<{ title?: string }> }>( + `${base}/agentmemory/smart-search`, + { query, limit: 5 }, + 10000, + ); + const items = data?.results ?? []; + return { + query, + hits: items.length, + topTitle: items[0]?.title ?? "(no results)", + }; } + +async function runDemo() { + const port = getRestPort(); + const base = `http://localhost:${port}`; + p.intro("agentmemory demo"); + + if (!(await isEngineRunning())) { + p.log.error(`Not running — no response on port ${port}`); + p.log.info("Start the server first: npx @agentmemory/agentmemory"); + process.exit(1); + } + + const demoProject = "/tmp/agentmemory-demo"; + const sessions = buildDemoSessions(); + + const sSeed = p.spinner(); + sSeed.start("Seeding 3 demo sessions with realistic observations..."); + + let totalObs = 0; + for (const session of sessions) { + totalObs += await seedDemoSession(base, demoProject, session); + } + + sSeed.stop(`Seeded ${totalObs} observations across ${sessions.length} sessions`); + + const queries = [ + "jwt auth middleware", + "database performance optimization", + "rate limiting", + ]; + + const sQuery = p.spinner(); + sQuery.start(`Running ${queries.length} smart-search queries...`); + + const results: SearchResult[] = []; + for (const query of queries) { + results.push(await runDemoSearch(base, query)); + } + + sQuery.stop("Search complete"); + + const lines = [ + `Project: ${demoProject}`, + `Sessions: ${sessions.length} seeded (${totalObs} observations)`, + "", + "Search results:", + ...results.flatMap((r) => [ + ` "${r.query}"`, + ` → ${r.hits} hit(s), top: ${r.topTitle.slice(0, 60)}`, + ]), + "", + `Notice: searching "database performance optimization"`, + `found the N+1 query fix — keyword matching can't do that.`, + "", + `Viewer: http://localhost:${port + 2}`, + `Clean up with: curl -X DELETE "${base}/agentmemory/sessions?project=${demoProject}"`, + ]; + + p.note(lines.join("\n"), "demo complete"); + p.log.success("agentmemory is working. Point your agent at it and get back to coding."); +} + +const commands: Record Promise> = { + status: runStatus, + demo: runDemo, +}; + +const handler = commands[args[0] ?? ""] ?? main; +handler().catch((err) => { + p.log.error(err instanceof Error ? err.message : String(err)); + process.exit(1); +}); diff --git a/src/viewer/index.html b/src/viewer/index.html index 99868da..9039d8d 100644 --- a/src/viewer/index.html +++ b/src/viewer/index.html @@ -1026,7 +1026,12 @@

agentmemory

var estInjected = d.sessions.length * tokenBudget; var savings = estFull > 0 ? Math.round((1 - estInjected / Math.max(estFull, 1)) * 100) : 0; if (savings < 0) savings = 0; - html += '
Token Savings
' + savings + '%
~' + estInjected.toLocaleString() + ' vs ~' + estFull.toLocaleString() + ' full (budget: ' + tokenBudget + ')
'; + var tokensSaved = Math.max(0, estFull - estInjected); + // Rate: $0.30 per 1K tokens (mid-tier model baseline) + var costDollars = tokensSaved / 1000 * 0.3; + var costCents = Math.round(costDollars * 100); + var costStr = costCents >= 100 ? '$' + (costCents / 100).toFixed(2) : costCents + 'ct'; + html += '
Token Savings
' + savings + '%
~' + tokensSaved.toLocaleString() + ' tokens · ' + costStr + ' saved
'; html += ''; if (snap.memory || snap.cpu) {