fix: SessionStart context gate (#143) + retention-evict semantic leak (#124) by rohitg00 · Pull Request #145 · rohitg00/agentmemory

rohitg00 · 2026-04-15T09:13:57Z

Closes #143 and closes #124. Two independent bugs bundled into 0.8.10.

Honest framing up front: the #143 diagnosis in the original PR description was wrong — PreToolUse stdout is debug-log only per the Claude Code hook docs, so it never reached the model and never burned user tokens. I corrected the record on the #143 thread and rewrote the CHANGELOG entry. This PR still ships the gates, but only the SessionStart gate is a real token saving (~1-2K per session). The PreToolUse gate is a resource cleanup (skips ~100-300ms REST round-trip per tool call, zero token impact) kept for forward-compat and hot-path hygiene.

The underlying "Claude Pro allocation burned in 4 messages" symptom from #143 is dominated by Anthropic's own acknowledged Claude Pro quota tightness, not by agentmemory. This release doesn't and can't fix that.

Fix 1: #143 — SessionStart context injection gate

src/hooks/session-start.ts was writing ~1-2K chars of project context to stdout at every session start. Per the Claude Code hook docs, SessionStart stdout is injected into the model's context (one of only two exceptions alongside UserPromptSubmit). So every new session the user started in Claude Code was getting an extra ~1-2K input tokens prepended to its first turn, silently.

Fix: new env var AGENTMEMORY_INJECT_CONTEXT, default false. When disabled:

session-start.ts still POSTs /agentmemory/session/start so observation tracking knows about the session, but skips the process.stdout.write(result.context) line
pre-tool-use.ts exits immediately — no stdin read, no /agentmemory/enrich fetch, no stdout write. Pure resource cleanup (20x-per-message Node+HTTP hot path is now a ~15ms no-op Node startup)

Startup banner prints Context injection: OFF (default, #143) on normal startup and a loud WARNING when AGENTMEMORY_INJECT_CONTEXT=true is set — same pattern as AGENTMEMORY_AUTO_COMPRESS from 0.8.8.

Tests: 5 new subprocess tests in test/context-injection.test.ts spawn the compiled pre-tool-use.mjs and session-start.mjs hooks with real JSON stdin and assert:

Unset env var → stdout is empty
Explicit AGENTMEMORY_INJECT_CONTEXT=false → stdout is empty
Disabled PreToolUse path exits under 1 second (no stdin consumption, no fetch)
Opt-in path with unreachable backend exits cleanly (doesn't break Claude Code startup)

Fix 2: #124 — `mem::retention-evict` leaks semantic memories

Pre-existing bug I filed myself during the #119 adversarial review. mem::retention-evict was unconditionally calling kv.delete(KV.memories, id) for every below-threshold candidate, but retention scores are computed for both episodic (KV.memories) and semantic (KV.semantic) memories. When a candidate came from KV.semantic, the delete silently became a no-op and the semantic row stayed alive forever with a sub-threshold score. Semantic memories could not be evicted by the retention path at all.

Fix:

Add a source: "episodic" | "semantic" discriminator to RetentionScore
Tag it at score creation in both loops of mem::retention-score
Branch the delete in mem::retention-evict on candidate.source, routing to KV.memories or KV.semantic accordingly
Pre-0.8.10 rows with no source field are treated as episodic for backwards-compat so upgraded stores continue to evict their old rows without re-scoring first
Response now includes evictedEpisodic and evictedSemantic counts for observability

Tests: 3 new cases in test/retention.test.ts:

Scoring tags rows with the correct source for both loops
Evicting a mixed set of below-threshold episodic + semantic candidates removes both from their respective scopes (the core regression)
Legacy-shape score rows with missing source still evict to mem:memories (backwards-compat)

Release

Bumps to 0.8.10 (main package + @agentmemory/mcp shim).

Test plan

npm run build clean
npm test — 727 passing (was 719 + 5 new from bursting token allocation (re-open) #143 + 3 new from mem::retention-evict only deletes KV.memories rows, leaking semantic memories #124)
bursting token allocation (re-open) #143 subprocess tests confirm stdout is empty in all off/default paths
mem::retention-evict only deletes KV.memories rows, leaking semantic memories #124 mixed-scope eviction test confirms both episodic and semantic rows are removed
mem::retention-evict only deletes KV.memories rows, leaking semantic memories #124 legacy-shape test confirms backwards-compat for pre-0.8.10 rows
After release: @adrianricardo re-tests bursting token allocation (re-open) #143 and confirms SessionStart token injection stops (won't restore Pro allocation — that's Anthropic-side)
After release: anyone with a store containing evictable semantic memories re-runs mem::retention-evict and sees evictedSemantic > 0

Summary by CodeRabbit

New Features
- Context injection is OFF by default; enable with AGENTMEMORY_INJECT_CONTEXT=true. Startup now shows injection status and a warning when enabled. Package/version updated to 0.8.10.
Bug Fixes
- Eviction now reliably distinguishes and deletes semantic vs episodic memories, returns separate evicted counts, and emits an audit record for non-zero sweeps.
- Retention scoring now records memory type to improve eviction accuracy.
Tests
- Added regression tests for context-injection behavior, fast shutdown, and retention/eviction routing.
Documentation
- README and changelog updated with flag details and migration note.

@adrianricardo

0.8.8 fixed the agentmemory-side Claude API burn (where the engine called Claude via the user's ANTHROPIC_API_KEY for per-observation compression). That addressed #138 for users with API keys, but it missed the second and much larger token-burn path: the PreToolUse hook writing context to stdout. Claude Code reads PreToolUse stdout and prepends it to the model's next turn. src/hooks/pre-tool-use.ts was POSTing /agentmemory/enrich on every Edit/Write/Read/Glob/Grep tool call and piping up to 4000 chars of response context into stdout. At ~20 tool calls per user message this silently injected ~20K tokens per message into Claude Code's input window — all charged against the user's Claude Pro allocation because Claude Code was the one sending them to Anthropic. 4 messages drained the cap, which matches @adrianricardo's report. session-start.ts had the same pattern (injected once per session, smaller blast radius). Fix: gate both hooks on AGENTMEMORY_INJECT_CONTEXT, default false. - pre-tool-use.ts: when disabled, exit immediately — no stdin read, no fetch, no stdout write. The hot path (~20x per message) becomes a no-op Node startup. - session-start.ts: when disabled, still POST /agentmemory/session/start so the session gets registered for observation tracking, but never write context to stdout. The session registration is cheap and doesn't touch Claude Code's input window. - src/config.ts: new isContextInjectionEnabled() helper. - src/index.ts: startup banner prints 'Context injection: OFF (default, #143)' on normal startup and a loud WARNING when opt-in is enabled. - test/context-injection.test.ts: 5 subprocess tests that spawn the compiled pre-tool-use.mjs and session-start.mjs hooks, feed real JSON payloads via stdin, and assert stdout is empty in all the off/default paths. Also asserts the disabled path exits under 1s and the opt-in path with an unreachable backend still exits cleanly. - README .env section: new AGENTMEMORY_INJECT_CONTEXT entry. - CHANGELOG [0.8.10] with prominent 'Behavior change' banner. Observations are still captured via PostToolUse regardless of the flag — the memory store and MCP search tools are completely unaffected by this change. The fix only severs the path where agentmemory silently shoves memory context into the user's Claude Code conversation. Bumps to 0.8.10 (main + @agentmemory/mcp shim). Test count: 724 passing (was 719 + 5 new).

coderabbitai · 2026-04-15T09:15:33Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Release 0.8.10: context injection for PreToolUse and SessionStart is opt‑in via AGENTMEMORY_INJECT_CONTEXT (default off). Retention now records memory source and evicts from the correct KV namespace with split eviction counts and audit writes. Docs, tests, and version metadata updated.

Changes

Cohort / File(s)	Summary
Version & metadata `package.json`, `packages/mcp/package.json`, `plugin/.claude-plugin/plugin.json`, `src/version.ts`, `src/types.ts`	Bumped release to `0.8.10`; updated `VERSION` and `ExportData.version` unions, package/plugin version fields, and added `retention_score` audit op and `RetentionScore.source` type.
Context injection config & docs `src/config.ts`, `README.md`, `CHANGELOG.md`	Added `isContextInjectionEnabled()` and documented new env var `AGENTMEMORY_INJECT_CONTEXT` (default `false`) with opt‑in behavior and startup/migration messaging.
Plugin hook scripts `plugin/scripts/pre-tool-use.mjs`, `plugin/scripts/session-start.mjs`	Hook scripts read `AGENTMEMORY_INJECT_CONTEXT` and short‑circuit when disabled to avoid stdin parsing, `/agentmemory/enrich` calls, and stdout context emission.
Hook implementations (src) `src/hooks/pre-tool-use.ts`, `src/hooks/session-start.ts`	Gated enrichment and stdout context emission on `AGENTMEMORY_INJECT_CONTEXT === "true"`; session registration/observation capture still runs but project context is not written to stdout when disabled.
Startup & export-import `src/index.ts`, `src/functions/export-import.ts`	Startup logs show context injection status and warning when enabled; `mem::import` supportedVersions updated to include `0.8.10`.
Retention scoring & eviction `src/functions/retention.ts`, `src/types.ts`	`RetentionScore` gets optional `source?: "episodic"
Tests `test/context-injection.test.ts`, `test/retention.test.ts`, `test/export-import.test.ts`	Added subprocess-based tests for hook stdout behavior and gating; expanded retention tests for `source` tagging, correct eviction routing, legacy-row handling, and audit emission; updated export-import test to expect `0.8.10`.

Sequence Diagram(s)

(omitted)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

fix: retention scoring (#119) + npx agentmemory-mcp 404 (#120), release 0.8.3 #123 — directly related: modifies retention scoring/eviction logic and types; overlaps with the new source field and eviction routing.
chore: release 0.8.5 #133 — related: updates version/export-import compatibility checks and version literals.
feat: agentmemory v0.4.0 — MCP resources, prompts, enrichment, confidence #7 — related: prior changes to pre-tool-use/session-start enrichment/injection behavior; overlaps with hook/stdout behavior adjustments.

Poem

🐰 I hopped through logs at break of dawn, so spry,
Quiet now by default — I’ll only speak if you try.
Flip the switch, I’ll whisper context in each turn,
Or stay muted while safe memories calmly churn.
A carrot for tests, and a thump of delight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 30.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly summarizes both main changes: SessionStart context gating fix for `#143` and retention-evict semantic memory leak fix for `#124`.
Linked Issues check	✅ Passed	All coding requirements from `#143` and `#124` are met: AGENTMEMORY_INJECT_CONTEXT gating, PreToolUse no-op by default, SessionStart stdout gating, source discriminator on RetentionScore, branched eviction by source, backwards compatibility, split eviction counts, and comprehensive tests added.
Out of Scope Changes check	✅ Passed	All changes are in scope: version bumps (0.8.9→0.8.10), documentation updates (CHANGELOG/README), context injection gating implementation, retention semantic leak fix, test coverage additions, and audit logging for retention operations.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/stop-claude-code-token-burn-143

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

PreToolUse stdout is NOT injected into the model context — per the Claude Code hook docs, only UserPromptSubmit and SessionStart stdout are injected. My initial #143 PR description and CHANGELOG claimed PreToolUse was the smoking gun behind 'Pro allocation burned in 4 messages', which is wrong. What's actually true: - SessionStart stdout injection IS real (~1-2K tokens per session) - PreToolUse stdout goes to debug log only — no tokens - Claude Pro's Claude Code quotas are tight by design (Anthropic has publicly acknowledged this); 4 messages to burn is plausible with or without agentmemory installed The gate on pre-tool-use.ts is still worth keeping as a resource cleanup (skips a 20x-per-message Node+HTTP hot path) and as forward-compat protection in case Claude Code ever changes PreToolUse hook contract. But the CHANGELOG entry has to stop claiming it saves tokens when it doesn't.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (4)

src/config.ts (1)

207-215: Shorten the long hook-injection comment block.

isContextInjectionEnabled() is clear on its own; the multi-line narrative at Lines 207-215 is high-maintenance and likely to drift. Keep a compact intent-focused note and rely on changelog/docs for details.

🧹 Suggested cleanup

-// Hook-level context injection into Claude Code's conversation is OFF by
-// default as of 0.8.10 (see `#143`). When disabled, pre-tool-use and
-// session-start hooks still POST observations for background capture, but
-// never write context to stdout — so Claude Code doesn't inject an extra
-// ~4000-char blob into every tool turn. 0.8.8 stopped the agentmemory-side
-// Claude calls (via ANTHROPIC_API_KEY); this stops the Claude Code-side
-// token burn where every tool call silently grew the model input window.
-// Users who want the in-conversation context injection explicitly opt in
-// with AGENTMEMORY_INJECT_CONTEXT=true and get a loud startup warning.
+// Hook-level context injection is opt-in since 0.8.10.
+// Set AGENTMEMORY_INJECT_CONTEXT=true to re-enable stdout context injection.
 export function isContextInjectionEnabled(): boolean {
   return getMergedEnv()["AGENTMEMORY_INJECT_CONTEXT"] === "true";
 }

As per coding guidelines: "Use clear, self-documenting variable and function names instead of code comments explaining WHAT".

Also applies to: 216-218

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/config.ts` around lines 207 - 215, Replace the long multi-line
explanation about hook-level context injection with a short, intent-focused
comment near isContextInjectionEnabled(): e.g., one-line summary stating that
context injection is disabled by default and can be enabled with
AGENTMEMORY_INJECT_CONTEXT, and remove the verbose narrative (also shorten the
adjacent block at the 216-218 area) so the code relies on the self-explanatory
function name and external docs/changelog for details.

CHANGELOG.md (1)

22-23: Avoid hardcoded source line numbers in changelog entries.

src/index.ts:179 is brittle and can drift; referencing just the file/function is more durable.

📝 Suggested change

-- **`isContextInjectionEnabled()`** helper in `src/config.ts` — single source of truth for the flag. The hooks read the env var directly (they're spawned as standalone `.mjs` files by Claude Code and don't bootstrap through `src/index.ts`), so the helper is there for the startup banner in `src/index.ts:179` and future code paths that want the same gate.
+- **`isContextInjectionEnabled()`** helper in `src/config.ts` — single source of truth for the flag. The hooks read the env var directly (they're spawned as standalone `.mjs` files by Claude Code and don't bootstrap through `src/index.ts`), so the helper is there for the startup banner in `src/index.ts` and future code paths that want the same gate.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@CHANGELOG.md` around lines 22 - 23, Update the CHANGELOG entry to remove the
hardcoded source line number "src/index.ts:179" and instead reference the file
and symbol only (e.g., "src/index.ts" and the isContextInjectionEnabled()
helper); edit the sentence that currently reads with the line number so it says
something like "the startup banner in src/index.ts and future code paths" or
explicitly names the function isContextInjectionEnabled() without a line number,
ensuring the changelog remains durable as source shifts.

test/context-injection.test.ts (2)

1-4: Add the standard iii-sdk mock to align with test conventions.

This suite doesn’t currently include vi.mock("iii-sdk", ...), which diverges from the repo’s required test pattern.
🧪 Suggested alignment patch
-import { describe, it, expect } from "vitest";
+import { describe, it, expect, vi } from "vitest";
 import { spawn } from "node:child_process";
 import { join } from "node:path";
+
+vi.mock("iii-sdk", () => ({}));
As per coding guidelines, `test/**/*.test.ts`: "Test files must use vitest with .test.ts extension, mock iii-sdk with vi.mock(), and follow patterns from test/crystallize.test.ts".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/context-injection.test.ts` around lines 1 - 4, The test file
context-injection.test.ts is missing the standard iii-sdk mock; add a top-level
vitest mock call vi.mock("iii-sdk", ...) before imports or test definitions to
match the repo pattern used in test/crystallize.test.ts; ensure the mock
provides the same exported hooks used in this suite (the iii-sdk API your tests
call) so tests import describe/it/expect as normal and the mock replaces iii-sdk
behavior consistently across tests.
86-90: Timing budget comment and assertion are inconsistent.

Line 86 describes a 250ms budget, but Line 90 enforces < 1000ms. Please align text and threshold to avoid confusion.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/context-injection.test.ts` around lines 86 - 90, Comment states a 250ms
budget but the assertion checks <1000ms; make them consistent by either lowering
the assertion or updating the comment. Locate the test using
runHook("pre-tool-use.mjs", "", {}) and change
expect(result.tookMs).toBeLessThan(1000) to
expect(result.tookMs).toBeLessThan(250) if you want the 250ms target, or update
the preceding comment to mention a 1000ms budget if that is the intended
threshold.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@README.md`:
- Around line 735-737: Clarify that SessionStart and PreToolUse have different
frequencies: update the wording around the "PreToolUse and SessionStart hooks"
so it states that PreToolUse injects up to ~4000 chars of memory into every tool
turn, while SessionStart injects memory only once at the start of a session (not
on every tool turn); mention the token impact accordingly and keep references to
the hook names SessionStart and PreToolUse so readers can find their
implementations.

In `@src/functions/export-import.ts`:
- Line 178: The hard-coded supportedVersions Set in export-import.ts is drifting
from the canonical ExportData.version/type and allows unsafe casts; replace this
inline list by importing or deriving the allowed export-version values from the
single source of truth (the ExportData.version enum/type or a shared constant)
and use that to build the Set (reference the supportedVersions constant and
ExportData type/enum to locate code). Ensure the central definition includes
"0.7.7" (or add it there) so callers like validate/parse logic use the same
canonical versions and remove any duplicate lists to restore type safety.

---

Nitpick comments:
In `@CHANGELOG.md`:
- Around line 22-23: Update the CHANGELOG entry to remove the hardcoded source
line number "src/index.ts:179" and instead reference the file and symbol only
(e.g., "src/index.ts" and the isContextInjectionEnabled() helper); edit the
sentence that currently reads with the line number so it says something like
"the startup banner in src/index.ts and future code paths" or explicitly names
the function isContextInjectionEnabled() without a line number, ensuring the
changelog remains durable as source shifts.

In `@src/config.ts`:
- Around line 207-215: Replace the long multi-line explanation about hook-level
context injection with a short, intent-focused comment near
isContextInjectionEnabled(): e.g., one-line summary stating that context
injection is disabled by default and can be enabled with
AGENTMEMORY_INJECT_CONTEXT, and remove the verbose narrative (also shorten the
adjacent block at the 216-218 area) so the code relies on the self-explanatory
function name and external docs/changelog for details.

In `@test/context-injection.test.ts`:
- Around line 1-4: The test file context-injection.test.ts is missing the
standard iii-sdk mock; add a top-level vitest mock call vi.mock("iii-sdk", ...)
before imports or test definitions to match the repo pattern used in
test/crystallize.test.ts; ensure the mock provides the same exported hooks used
in this suite (the iii-sdk API your tests call) so tests import
describe/it/expect as normal and the mock replaces iii-sdk behavior consistently
across tests.
- Around line 86-90: Comment states a 250ms budget but the assertion checks
<1000ms; make them consistent by either lowering the assertion or updating the
comment. Locate the test using runHook("pre-tool-use.mjs", "", {}) and change
expect(result.tookMs).toBeLessThan(1000) to
expect(result.tookMs).toBeLessThan(250) if you want the 250ms target, or update
the preceding comment to mention a 1000ms budget if that is the intended
threshold.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 48060a7d-0576-4146-acb7-4cc9e6aecde7

📥 Commits

Reviewing files that changed from the base of the PR and between 9cef253 and fbb32b0.

⛔ Files ignored due to path filters (1)

package-lock.json is excluded by !**/package-lock.json

📒 Files selected for processing (16)

CHANGELOG.md
README.md
package.json
packages/mcp/package.json
plugin/.claude-plugin/plugin.json
plugin/scripts/pre-tool-use.mjs
plugin/scripts/session-start.mjs
src/config.ts
src/functions/export-import.ts
src/hooks/pre-tool-use.ts
src/hooks/session-start.ts
src/index.ts
src/types.ts
src/version.ts
test/context-injection.test.ts
test/export-import.test.ts

coderabbitai · 2026-04-15T09:21:55Z

      const importData = data.exportData;

-      const supportedVersions = new Set(["0.3.0", "0.4.0", "0.5.0", "0.6.0", "0.6.1", "0.7.0", "0.7.2", "0.7.3", "0.7.4", "0.7.5", "0.7.6", "0.7.7", "0.7.9", "0.8.0", "0.8.1", "0.8.2", "0.8.3", "0.8.4", "0.8.5", "0.8.6", "0.8.7", "0.8.8", "0.8.9"]);
+      const supportedVersions = new Set(["0.3.0", "0.4.0", "0.5.0", "0.6.0", "0.6.1", "0.7.0", "0.7.2", "0.7.3", "0.7.4", "0.7.5", "0.7.6", "0.7.7", "0.7.9", "0.8.0", "0.8.1", "0.8.2", "0.8.3", "0.8.4", "0.8.5", "0.8.6", "0.8.7", "0.8.8", "0.8.9", "0.8.10"]);


⚠️ Potential issue | 🟡 Minor

Centralize supported export versions to prevent drift.

Line 178 maintains a hard-coded list that already diverges from type definitions ("0.7.7" is accepted here but not represented in ExportData.version). This can force casts and weakens type safety.

♻️ Proposed refactor

- const supportedVersions = new Set(["0.3.0", "0.4.0", "0.5.0", "0.6.0", "0.6.1", "0.7.0", "0.7.2", "0.7.3", "0.7.4", "0.7.5", "0.7.6", "0.7.7", "0.7.9", "0.8.0", "0.8.1", "0.8.2", "0.8.3", "0.8.4", "0.8.5", "0.8.6", "0.8.7", "0.8.8", "0.8.9", "0.8.10"]); + const supportedVersions = new Set(SUPPORTED_EXPORT_VERSIONS);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/functions/export-import.ts` at line 178, The hard-coded supportedVersions Set in export-import.ts is drifting from the canonical ExportData.version/type and allows unsafe casts; replace this inline list by importing or deriving the allowed export-version values from the single source of truth (the ExportData.version enum/type or a shared constant) and use that to build the Set (reference the supportedVersions constant and ExportData type/enum to locate code). Ensure the central definition includes "0.7.7" (or add it there) so callers like validate/parse logic use the same canonical versions and remove any duplicate lists to restore type safety.

The eviction loop was unconditionally calling kv.delete(KV.memories, id) for every below-threshold candidate, but retention scores are computed for both episodic (KV.memories) and semantic (KV.semantic) memories. When a candidate came from KV.semantic, the delete silently became a no-op (key wasn't in mem:memories to begin with) and the semantic row stayed alive forever with a sub-threshold score. Semantic memories could not be evicted by this path at all. Fix: - Add a source: "episodic" | "semantic" discriminator to RetentionScore - Tag it at score creation in both loops of mem::retention-score - Branch the delete in mem::retention-evict on candidate.source, routing to KV.memories or KV.semantic accordingly - Pre-0.8.10 retention rows with no source field are treated as episodic for backwards-compat so upgraded stores continue to evict their old rows without re-scoring first - Response now includes evictedEpisodic and evictedSemantic counts so callers can see what was removed from each scope Adds 3 regression tests to test/retention.test.ts: - Scoring tags rows with the correct source - Evicting a mixed set of below-threshold episodic + semantic candidates removes both from their respective scopes - Legacy-shape score rows with no source field still evict to mem:memories (backwards-compat) Full suite: 727 passing (was 724 + 3 new).

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

test/retention.test.ts (1)
246-263: Assert the persisted mem:retention rows here too.

This only verifies the returned scores array. The eviction path reads back from stored retention rows, so a regression in kv.set/serialization would still pass. Please also fetch mem:retention after scoring and assert that both stored rows carry the expected source.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/retention.test.ts` around lines 246 - 263, After calling
registerRetentionFunctions and triggering "mem::retention-score" in the test,
also read back the persisted retention rows (the "mem:retention" entry) from the
mock KV and assert that the stored rows for memoryId "mem_ep" and "sem_sem" have
source set to "episodic" and "semantic" respectively; update the test that uses
mockKV, mockSdk, and registerRetentionFunctions to fetch the stored
"mem:retention" payload (via the mock KV read method used in tests, e.g., kv.get
or equivalent) and add expect assertions mirroring the in-memory checks to
verify serialization/persistence correctness.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/functions/retention.ts`:
- Around line 262-271: The eviction loop is deleting persisted state (kv.delete
and deleteAccessLog) but not emitting audits; add a call to recordAudit(...) for
each structural deletion so evictions are auditable. Inside the for (const
candidate of candidates) block, after deleting KV.semantic or KV.memories,
KV.retentionScores and after deleteAccessLog(kv, candidate.memoryId), invoke
recordAudit with the memoryId, an action like "evict" or "delete", the source
(candidate.source), and any relevant metadata (reason/score) so the audit
records the state change; ensure this is done for both semantic and episodic
branches and increment counters as before.
- Around line 255-271: The loop assumes missing candidate.source means episodic
and deletes KV.memories, which can leave semantic records intact for legacy
rows; instead, before deleting, probe both namespaces (KV.semantic and
KV.memories) for candidate.memoryId to determine which actually contains the
record, delete only that namespace entry, increment evictedSemantic or
evictedEpisodic accordingly, then delete the retention score
(KV.retentionScores) and call deleteAccessLog(kv, candidate.memoryId) as before;
update the code around the candidates loop and uses of KV.semantic / KV.memories
/ candidate.memoryId to perform the existence check and correct counting.

---

Nitpick comments:
In `@test/retention.test.ts`:
- Around line 246-263: After calling registerRetentionFunctions and triggering
"mem::retention-score" in the test, also read back the persisted retention rows
(the "mem:retention" entry) from the mock KV and assert that the stored rows for
memoryId "mem_ep" and "sem_sem" have source set to "episodic" and "semantic"
respectively; update the test that uses mockKV, mockSdk, and
registerRetentionFunctions to fetch the stored "mem:retention" payload (via the
mock KV read method used in tests, e.g., kv.get or equivalent) and add expect
assertions mirroring the in-memory checks to verify serialization/persistence
correctness.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 39b871ef-643e-4923-a275-873928b90c93

📥 Commits

Reviewing files that changed from the base of the PR and between fbb32b0 and fa89a42.

📒 Files selected for processing (4)

CHANGELOG.md
src/functions/retention.ts
src/types.ts
test/retention.test.ts

✅ Files skipped from review due to trivial changes (1)

CHANGELOG.md

🚧 Files skipped from review as they are similar to previous changes (1)

src/types.ts

CodeRabbit caught a real backwards-compat hole in the #124 fix: pre-0.8.10 stores already contain semantic retention rows with no source field (because the old mem::retention-score scored KV.semantic before the discriminator existed). My first fix defaulted missing source to episodic, which meant those legacy semantic rows still got delete-routed to KV.memories — the exact no-op that stranded them in the first place. Fix: when candidate.source is undefined, probe KV.memories first for the memoryId; if it's there, route to episodic, otherwise route to semantic. Count the resolved source in the response. Adds one new test case: a pre-0.8.10 semantic memory with a legacy-shape retention row (no source field) gets evicted from mem:semantic, not silently no-op'd. Existing 'defaults to episodic' test is kept and retargeted to the genuinely-episodic legacy case. Also fixes a README nit: the AGENTMEMORY_INJECT_CONTEXT comment previously implied SessionStart fires on every tool turn. It's once per session. Now broken out into two bullets explaining what each hook does differently, with the note that only SessionStart actually reaches the model (PreToolUse stdout is debug-log only per Claude Code docs). Full suite: 728 passing (was 727 + 1 new).

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

src/functions/retention.ts (1)
286-288: ⚠️ Potential issue | 🟠 Major

Emit an audit record for each successful eviction.

Lines 286-288 delete the memory row, retention row, and access log, but the branch still never calls recordAudit(). That makes mem::retention-evict invisible to audit consumers even though this is a structural mutation path, not read-side bookkeeping. Please record the eviction after the deletes succeed, including at least candidate.memoryId, resolvedSource, and the eviction score/threshold metadata.

As per coding guidelines, "Use recordAudit() for all state-changing operations in TypeScript functions".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/retention.ts` around lines 286 - 288, After successfully
removing the memory and access records (after the existing calls to
kv.delete(scope, candidate.memoryId), kv.delete(KV.retentionScores,
candidate.memoryId), and deleteAccessLog(kv, candidate.memoryId)), call
recordAudit() to emit an audit record for this structural mutation; include
candidate.memoryId, the resolvedSource value used to resolve the memory, and the
eviction metadata (the computed eviction score and the threshold or reason) in
the audit payload so the mem::retention-evict event is visible to audit
consumers. Ensure the recordAudit invocation only runs after the deletes
complete without throwing so it reflects a successful eviction.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/retention.test.ts`:
- Around line 258-262: The test currently only asserts the transient SDK
response from sdk.trigger("mem::retention-score", {}) but must also verify the
persisted rows so KV.retentionScores (the stored "mem:retention" entries) retain
the source discriminator; after obtaining result and locating ep and sem, fetch
the persisted retention entries from the same storage used by KV.retentionScores
(e.g., read the "mem:retention" keys or call the helper that reads persisted
retention rows) and assert that the stored entries for memoryId "mem_ep" and
"sem_sem" have source "episodic" and "semantic" respectively, mirroring the
expectations on result.scores.

---

Duplicate comments:
In `@src/functions/retention.ts`:
- Around line 286-288: After successfully removing the memory and access records
(after the existing calls to kv.delete(scope, candidate.memoryId),
kv.delete(KV.retentionScores, candidate.memoryId), and deleteAccessLog(kv,
candidate.memoryId)), call recordAudit() to emit an audit record for this
structural mutation; include candidate.memoryId, the resolvedSource value used
to resolve the memory, and the eviction metadata (the computed eviction score
and the threshold or reason) in the audit payload so the mem::retention-evict
event is visible to audit consumers. Ensure the recordAudit invocation only runs
after the deletes complete without throwing so it reflects a successful
eviction.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5723daf7-cae5-4830-8435-a75e73f4fad4

📥 Commits

Reviewing files that changed from the base of the PR and between fa89a42 and ea45144.

📒 Files selected for processing (3)

README.md
src/functions/retention.ts
test/retention.test.ts

✅ Files skipped from review due to trivial changes (1)

README.md

…nd 3) CodeRabbit round 3 findings, both real: 1. retention-evict performs structural deletes (memories / semantic / retention scores / access logs) but was not calling recordAudit(). Repo learnings say state-changing functions must be auditable except for read-path bookkeeping. Now emits one batched audit row per non-zero eviction sweep: operation: 'delete' functionId: 'mem::retention-evict' targetIds: every evicted memoryId details: { threshold, evicted, evictedEpisodic, evictedSemantic, reason: 'retention score below threshold' } Zero-eviction sweeps intentionally do NOT write an audit row (no state change, no need to flood the audit log during health checks). 2. The #124 scoring test only checked result.scores (transient response) but not the persisted mem:retention rows. Eviction reads back from stored rows, so a regression in kv.set/serialization would have still passed the old assertion. Now also does kv.get('mem:retention', id) and asserts { source: ... }. Two new tests: - Retention evict with a mixed set of 2 episodic + 1 semantic candidates writes exactly one audit row with all 3 ids in targetIds and the correct evictedEpisodic/Semantic breakdown in details. - Retention evict with zero candidates writes zero audit rows. Full suite: 730 passing (was 728 + 2 new).

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/functions/retention.ts (1)
135-147: ⚠️ Potential issue | 🟠 Major

Audit the re-score/backfill writes.

Lines 147 and 191 now persist schema-relevant source tags into KV.retentionScores, but mem::retention-score still exits without any recordAudit() call. Please emit one batched audit after the score rows are written so this backfill is visible to audit consumers.

As per coding guidelines "Use recordAudit() for all state-changing operations in TypeScript functions".

Also applies to: 179-191
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/retention.ts` around lines 135 - 147, The backfill currently
writes RetentionScore entries (using KV.retentionScores and pushing into scores)
and persists new `source` tags but does not emit any audit records for the
`mem::retention-score` state change; add a batched call to recordAudit()
immediately after the block that writes the score rows (after the loop that
calls kv.set for retention entries and after the other write block that persists
scores) so all created/updated RetentionScore rows are recorded as a single
audit event; use recordAudit() with a payload that identifies the affected
memory IDs (mem.id), the operation (e.g., "mem::retention-score.backfill"), and
the score entries created (scores array) so audit consumers can see the
backfill.

🧹 Nitpick comments (1)

src/functions/retention.ts (1)
121-147: Batch the retention-score writes instead of awaiting them one by one.

Each kv.set(...) here is independent, so the inline await turns a full rescore into one KV round-trip per memory. Collect the writes and flush them with Promise.all (or bounded batches) after building the entries.

As per coding guidelines "Perform parallel operations where possible using Promise.all for independent kv writes/reads in TypeScript".

Also applies to: 150-191
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/retention.ts` around lines 121 - 147, The per-memory inline
await on kv.set inside the loop causes sequential KV writes; instead, collect
each RetentionScore entry (created using computeSalience, computeDecay,
computeReinforcementBoost) and its pending write (kv.set(KV.retentionScores,
mem.id, entry)) into an array of Promises after pushing to scores, then flush
them with Promise.all (or split into bounded batches) once the loop completes;
apply the same batching change to the similar block around lines 150-191 so all
independent kv.set calls run in parallel rather than one-by-one.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/functions/retention.ts`:
- Around line 135-147: The backfill currently writes RetentionScore entries
(using KV.retentionScores and pushing into scores) and persists new `source`
tags but does not emit any audit records for the `mem::retention-score` state
change; add a batched call to recordAudit() immediately after the block that
writes the score rows (after the loop that calls kv.set for retention entries
and after the other write block that persists scores) so all created/updated
RetentionScore rows are recorded as a single audit event; use recordAudit() with
a payload that identifies the affected memory IDs (mem.id), the operation (e.g.,
"mem::retention-score.backfill"), and the score entries created (scores array)
so audit consumers can see the backfill.

---

Nitpick comments:
In `@src/functions/retention.ts`:
- Around line 121-147: The per-memory inline await on kv.set inside the loop
causes sequential KV writes; instead, collect each RetentionScore entry (created
using computeSalience, computeDecay, computeReinforcementBoost) and its pending
write (kv.set(KV.retentionScores, mem.id, entry)) into an array of Promises
after pushing to scores, then flush them with Promise.all (or split into bounded
batches) once the loop completes; apply the same batching change to the similar
block around lines 150-191 so all independent kv.set calls run in parallel
rather than one-by-one.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 27bbc7a8-13d4-4989-a69e-2dd7bfa22c80

📥 Commits

Reviewing files that changed from the base of the PR and between ea45144 and 04581cf.

📒 Files selected for processing (3)

CHANGELOG.md
src/functions/retention.ts
test/retention.test.ts

🚧 Files skipped from review as they are similar to previous changes (2)

test/retention.test.ts
CHANGELOG.md

CodeRabbit round 3 outside-diff findings, both addressed: 1. mem::retention-score was persisting schema-relevant writes to KV.retentionScores (1000+ rows in a mature store) but never called recordAudit(). Per the repo audit-coverage policy, state-changing functions need an audit row. Added a single batched audit event per rescore: operation: 'retention_score' (new audit op — added to the AuditEntry union in types.ts) functionId: 'mem::retention-score' targetIds: [] (intentionally empty — a mature store can have 1000+ ids per sweep; flooding the audit log with every memoryId on every cron tick is worse than recording just the summary counts) details: { total, episodic, semantic, tiers, config } Zero-memory stores intentionally skip the audit call. 2. The per-memory kv.set inside the score loop was O(n) sequential round-trips. Refactored to collect pendingWrites: [id, entry][] while iterating, then flush with Promise.all at the end. On a mature store with 1000+ memories this is ~10x faster (depends on backend pipelining). Test updates: - Added 'mem::retention-score emits audit row per rescore' covering the new audit call, targetIds=[], and details.episodic/semantic. - Existing '#124 audit evict' and 'zero-evict skip audit' tests now filter the audit log by functionId === 'mem::retention-evict' because retention-score also writes one row per sweep now. Full suite: 731 passing (was 730 + 1 new, existing tests retargeted).

coderabbitai

♻️ Duplicate comments (1)

src/functions/retention.ts (1)

314-329: ⚠️ Potential issue | 🟠 Major

Legacy-source fallback can overcount evictions and should probe both namespaces.

At Line 314, only KV.memories is checked; Line 319 then assumes semantic. If the ID exists in neither namespace, Line 326-Line 329 still increments eviction counters after a no-op delete. This can over-report evicted / evictedSemantic.

Suggested fix

-          } else {
-            const episodic = await kv.get(KV.memories, candidate.memoryId);
-            if (episodic !== null) {
-              scope = KV.memories;
-              resolvedSource = "episodic";
-            } else {
-              scope = KV.semantic;
-              resolvedSource = "semantic";
-            }
-          }
+          } else {
+            const [episodic, semantic] = await Promise.all([
+              kv.get(KV.memories, candidate.memoryId),
+              kv.get(KV.semantic, candidate.memoryId),
+            ]);
+            if (episodic !== null && semantic === null) {
+              scope = KV.memories;
+              resolvedSource = "episodic";
+            } else if (semantic !== null && episodic === null) {
+              scope = KV.semantic;
+              resolvedSource = "semantic";
+            } else if (semantic !== null) {
+              // ambiguous legacy collision: prefer semantic to avoid semantic leak
+              scope = KV.semantic;
+              resolvedSource = "semantic";
+            } else {
+              // stale retention row: clean metadata only, do not count as evicted memory
+              await Promise.all([
+                kv.delete(KV.retentionScores, candidate.memoryId),
+                deleteAccessLog(kv, candidate.memoryId),
+              ]);
+              continue;
+            }
+          }
           await kv.delete(scope, candidate.memoryId);
-          await kv.delete(KV.retentionScores, candidate.memoryId);
-          await deleteAccessLog(kv, candidate.memoryId);
+          await Promise.all([
+            kv.delete(KV.retentionScores, candidate.memoryId),
+            deleteAccessLog(kv, candidate.memoryId),
+          ]);

As per coding guidelines Perform parallel operations where possible using Promise.all for independent kv writes/reads in TypeScript.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/functions/retention.ts` around lines 314 - 329, Probe both namespaces for
candidate.memoryId (via kv.get for KV.memories and KV.semantic in parallel)
instead of only checking KV.memories and assuming semantic; set scope and
resolvedSource based on which namespace actually contains the id (or skip
deletion entirely if neither exists), then only perform kv.delete(scope,
candidate.memoryId), kv.delete(KV.retentionScores, candidate.memoryId) and
deleteAccessLog(kv, candidate.memoryId) when a namespace was found and increment
evicted/evictedSemantic/evictedEpisodic accordingly; perform the independent
reads/deletes using Promise.all to parallelize operations (referencing
candidate.memoryId, kv.get/kv.delete, KV.memories, KV.semantic,
KV.retentionScores, deleteAccessLog, evicted, evictedSemantic, evictedEpisodic,
resolvedSource).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/functions/retention.ts`:
- Around line 314-329: Probe both namespaces for candidate.memoryId (via kv.get
for KV.memories and KV.semantic in parallel) instead of only checking
KV.memories and assuming semantic; set scope and resolvedSource based on which
namespace actually contains the id (or skip deletion entirely if neither
exists), then only perform kv.delete(scope, candidate.memoryId),
kv.delete(KV.retentionScores, candidate.memoryId) and deleteAccessLog(kv,
candidate.memoryId) when a namespace was found and increment
evicted/evictedSemantic/evictedEpisodic accordingly; perform the independent
reads/deletes using Promise.all to parallelize operations (referencing
candidate.memoryId, kv.get/kv.delete, KV.memories, KV.semantic,
KV.retentionScores, deleteAccessLog, evicted, evictedSemantic, evictedEpisodic,
resolvedSource).

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 93203ffd-1462-4bc2-8238-c3abf7f5418f

📥 Commits

Reviewing files that changed from the base of the PR and between 04581cf and 46b0a16.

📒 Files selected for processing (3)

src/functions/retention.ts
src/types.ts
test/retention.test.ts

✅ Files skipped from review due to trivial changes (1)

test/retention.test.ts

🚧 Files skipped from review as they are similar to previous changes (1)

src/types.ts

rohitg00 marked this pull request as draft April 15, 2026 09:19

rohitg00 changed the title ~~fix: stop burning Claude Pro tokens on every tool call (#143, real root cause of #138)~~ fix: gate SessionStart context injection + pre-tool-use hot path (#143) Apr 15, 2026

coderabbitai Bot reviewed Apr 15, 2026

View reviewed changes

rohitg00 changed the title ~~fix: gate SessionStart context injection + pre-tool-use hot path (#143)~~ fix: SessionStart context gate (#143) + retention-evict semantic leak (#124) Apr 15, 2026

rohitg00 marked this pull request as ready for review April 15, 2026 10:33

coderabbitai Bot reviewed Apr 15, 2026

View reviewed changes

Comment thread src/functions/retention.ts Outdated

Comment thread src/functions/retention.ts

coderabbitai Bot reviewed Apr 15, 2026

View reviewed changes

Comment thread test/retention.test.ts

coderabbitai Bot reviewed Apr 15, 2026

View reviewed changes

rohitg00 merged commit 3bad543 into main Apr 15, 2026
3 checks passed

rohitg00 deleted the fix/stop-claude-code-token-burn-143 branch April 15, 2026 11:06

rohitg00 mentioned this pull request Apr 15, 2026

mem::retention-evict only deletes KV.memories rows, leaking semantic memories #124

Closed

Conversation

rohitg00 commented Apr 15, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Fix 1: #143 — SessionStart context injection gate

Fix 2: #124 — mem::retention-evict leaks semantic memories

Release

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

rohitg00 commented Apr 15, 2026 •

edited by coderabbitai Bot

Loading

Fix 2: #124 — `mem::retention-evict` leaks semantic memories

coderabbitai Bot commented Apr 15, 2026 •

edited

Loading