Skip to content

feat: implement multimodal image memory#111

Merged
rohitg00 merged 7 commits intorohitg00:mainfrom
Tanmay-008:feat/multimodal-image-memory
Apr 21, 2026
Merged

feat: implement multimodal image memory#111
rohitg00 merged 7 commits intorohitg00:mainfrom
Tanmay-008:feat/multimodal-image-memory

Conversation

@Tanmay-008
Copy link
Copy Markdown
Contributor

@Tanmay-008 Tanmay-008 commented Apr 11, 2026

This PR establishes the foundation for Multimodal Memory.below we describe the technical changes

Technical Changes:

  1. Data Model Updates (src/types.ts):
    Added modality, imageRef, and imageDescription to CompressedObservation and RawObservation.

2 .Safe Extraction (src/hooks/post-tool-use.ts):
Updated the hook to safely extract base64 images before text truncation occurs, preventing payload corruption.

  1. Content-Addressed Storage (src/utils/image-store.ts):
    Built a storage utility to extract base64 images from JSON payloads and save them to ~/.agentmemory/images/.
    Uses SHA-256 hashing to guarantee deduplication.

  2. Vision Integration Prep (src/functions/compress.ts):
    Added src/prompts/vision.ts and set up the pipeline to pass images to the AI vision model for text description.

  3. Garbage Collection Hooks:
    Wired up auto-forget.ts, evict.ts, and retention.ts to automatically purge physical image files from the disk when their corresponding observation is removed from KV.

6.write the test file for test this all features.

Summary by CodeRabbit

  • New Features

    • Multimodal support: extracted image data from observations, image-aware compression/description, and provider image-description integration.
    • Disk-usage and image-quota management with automatic cleanup and image reference counting.
  • Improvements

    • Images persisted to a managed store with safe lifecycle handling during eviction/forget workflows.
    • Observation types and memory schemas extended to represent image/mixed modality and image metadata.
  • Tests

    • End-to-end and unit tests for multimodal observe→compress pipeline, storage, and ref-counting.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Detects/extracts base64/image payloads from tool outputs, persists images to a managed disk store, annotates observations/memories with modality/image refs/descriptions, calls provider vision hooks for image descriptions, and decrements/deletes stored images during eviction/retention/auto-forget. Includes tests and disk-quota/delta managers. (50 words)

Changes

Cohort / File(s) Summary
Image storage & refs
src/utils/image-store.ts, src/functions/image-refs.ts, src/functions/disk-size-manager.ts, src/functions/image-quota-cleanup.ts
Add filesystem-backed image store (IMAGES_DIR, saveImageToDisk, deleteImage, touchImage, isManagedImagePath, getMaxBytes) and KV-backed image reference counters with increment/decrement logic and disk-size tracking/cleanup.
Observe & ingestion
src/functions/observe.ts, src/hooks/post-tool-use.ts, plugin/scripts/post-tool-use.mjs
Detect/extract base64/image payloads via extractImage/extractImageData/isBase64Image, save images to disk for supported formats, set raw.imageData and modality, and rollback saved images on persistence errors. Hooks/post-hook updated to send truncated cleanOutput and conditional image_data.
Compression & vision
src/functions/compress.ts, src/prompts/vision.ts, src/providers/anthropic.ts
When imageData present and provider supports describeImage, read/normalize image bytes (support managed paths), call provider describeImage with VISION_DESCRIPTION_PROMPT, include imageDescription, imageRef, and modality in compressed observations; add prompt constant and provider method.
Lifecycle cleanup (evict/retention/auto-forget)
src/functions/evict.ts, src/functions/retention.ts, src/functions/auto-forget.ts
Lazy-import and call decrementImageRef / delete image files when deleting observations or memories (dry-run gated); decrement image refs for expired/evicted items before KV deletion.
Types & registration
src/types.ts, src/index.ts, src/state/schema.ts
Extend RawObservation/CompressedObservation/Memory with modality, imageData, imageRef, imageDescription; add ObservationType "image"; add MemoryProvider.describeImage?; register disk-size manager and image-quota cleanup at startup; add KV.imageRefs and KV.state.
API & SDK minor
src/triggers/api.ts, src/providers/agent-sdk.ts
routine-create handler now validates name and steps and constructs explicit payload; SDK loop assigns result from (msg as any).result with nullish default.
Tests
test/multimodal.test.ts
New Vitest tests covering observe→save image to disk, compress with mocked describeImage, disk-size manager, image ref counting, and image file utilities.
Scripts & small hooks
plugin/scripts/post-tool-use.mjs, src/hooks/post-tool-use.ts
Added isBase64Image() and extractImageData() helpers; split/replace image content in tool output and send extracted data separately when present.

Sequence Diagram(s)

sequenceDiagram
    participant Tool as Tool Execution
    participant PostHook as post-tool-use Hook
    participant ObserveFn as Observe Function
    participant ImgStore as Image Store
    participant KV as KV Storage
    participant CompressFn as Compress Function
    participant Vision as Vision Provider

    Tool->>PostHook: tool_output (may contain base64/data URL)
    PostHook->>PostHook: extractImageData() -> (cleanOutput, imageData)
    PostHook->>ObserveFn: POST observe (tool_output=cleanOutput, image_data?)
    ObserveFn->>ImgStore: saveImageToDisk(imageData) [if base64-like]
    ImgStore-->>ObserveFn: storedImagePath
    ObserveFn->>KV: store RawObservation (modality, imageData=path)
    CompressFn->>KV: fetch RawObservation
    CompressFn->>ImgStore: read image file -> base64 bytes (if imageData is path)
    CompressFn->>Vision: describeImage(base64, mimeType, prompt)
    Vision-->>CompressFn: imageDescription
    CompressFn->>KV: store CompressedObservation (imageRef, imageDescription, modality)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I nibble code and chase a byte,
I stash each image safe by night,
Models tell me what they see,
I count and clean what used to be,
A hopping helper, keeping memory light.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 3.57% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'feat: implement multimodal image memory' clearly and accurately summarizes the primary objective of the changeset, which introduces foundational multimodal (image) memory support across multiple modules.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/functions/compress.ts (1)

25-40: ⚠️ Potential issue | 🟡 Minor

Keep VALID_TYPES aligned with ObservationType.

Line 39 adds "image", but ObservationType in src/types.ts still excludes it. The cast in parseCompressionXml() hides that contract mismatch, so downstream exhaustive logic will treat a now-valid runtime value as impossible.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/compress.ts` around lines 25 - 40, VALID_TYPES now contains
"image" but the union type ObservationType doesn't include it and
parseCompressionXml() currently masks this by casting; update the type system to
match runtime values: either add "image" to the ObservationType union in the
type definition or remove "image" from VALID_TYPES so they stay consistent, and
remove the unsafe cast in parseCompressionXml() so TypeScript enforces
exhaustiveness (handle unknown/invalid types explicitly instead of casting).
🧹 Nitpick comments (4)
src/functions/auto-forget.ts (2)

156-161: Hoist the dynamic import outside the nested loops.

The deleteImage import is performed inside the nested observation loop. For projects with many sessions and observations, this causes significant overhead.

♻️ Proposed refactor
+      const { deleteImage } = await import("../utils/image-store.js");
+
       for (let i = 0; i < sessions.length; i++) {
         for (const obs of obsPerSession[i]) {
           if (!obs.timestamp) continue;
           const age = now - new Date(obs.timestamp).getTime();
           if (age > 180 * MS_PER_DAY && (obs.importance ?? 5) <= 2) {
             result.lowValueObs.push(obs.id);
             if (!dryRun) {
-              const { deleteImage } = await import("../utils/image-store.js");
               if ((obs as any).imageData) deleteImage((obs as any).imageData);
               if ((obs as any).imageRef) deleteImage((obs as any).imageRef);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/auto-forget.ts` around lines 156 - 161, The dynamic import of
deleteImage inside the nested observation loop is causing repeated module loads;
move the import out of the loops by awaiting import("../utils/image-store.js")
once before iterating sessions/observations and destructuring deleteImage from
the result, then call deleteImage((obs as any).imageData) and deleteImage((obs
as any).imageRef) as before; ensure the import is awaited and keep existing
error handling around kv.delete and image deletion calls in the same function
(auto-forget) to preserve behavior.

49-52: Consider hoisting the import for TTL-expired memories as well.

The dynamic import inside the loop is fine for a small number of TTL-expired memories, but for consistency with the recommended refactor for observations, consider hoisting it to the top of the function.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/auto-forget.ts` around lines 49 - 52, Hoist the dynamic
image-store import out of the per-memory loop in autoForget: before iterating
TTL-expired memories, import deleteImage once (either via a single await
import("../utils/image-store.js") assigned to const { deleteImage } or by
switching to a static top-level import) and then call deleteImage(mem.imageRef)
inside the loop when imageRef exists; this removes the repeated dynamic import
and keeps behavior for (mem as any).imageRef identical.
src/functions/evict.ts (1)

97-99: Hoist the deleteImage import to reduce redundant dynamic imports.

The deleteImage function is dynamically imported in four separate locations within the eviction loop. Hoisting the import to the start of the function body improves performance and reduces code duplication.

♻️ Proposed refactor
     async (data: { dryRun?: boolean }): Promise<EvictionStats> => {
       const ctx = getContext();
       const dryRun = data?.dryRun ?? false;
+      const { deleteImage } = await import("../utils/image-store.js");

       const configOverride = await kv
         .get<Partial<EvictionConfig>>(KV.config, "eviction")

Then remove the individual imports at lines 97, 125, 146, and 164, keeping only the deleteImage(...) calls.

Also applies to: 125-127, 145-148, 163-166

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/evict.ts` around lines 97 - 99, Hoist the dynamic import of
deleteImage to the start of the eviction function so it’s only loaded once: at
the top of the function body (since the function is async) add const {
deleteImage } = await import("../utils/image-store.js"); then delete the four
redundant dynamic imports that currently appear next to the deleteImage calls
inside the eviction loop (the imports currently at the sites where
deleteImage((o as any).imageData) and deleteImage((o as any).imageRef) are
invoked), leaving only the deleteImage(...) calls intact.
src/functions/retention.ts (1)

219-223: Move the dynamic import outside the loop for efficiency.

The deleteImage import is performed inside the eviction loop. When evicting many memories, this causes repeated dynamic import overhead.

♻️ Proposed refactor
+      const { deleteImage } = await import("../utils/image-store.js");
+
       let evicted = 0;
       for (const candidate of candidates) {
         try {
           const mem = await kv.get<Memory>(KV.memories, candidate.memoryId);
           if (mem && (mem as any).imageRef) {
-            const { deleteImage } = await import("../utils/image-store.js");
             deleteImage((mem as any).imageRef);
           }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/retention.ts` around lines 219 - 223, The dynamic import of
deleteImage inside the eviction loop causes repeated overhead; move the import
out of the loop by performing const { deleteImage } = await
import("../utils/image-store.js") before iterating over candidates (the loop
where kv.get<Memory>(KV.memories, candidate.memoryId) and (mem as any).imageRef
are checked) and then call deleteImage(imageRef) inside the loop; ensure the
import is awaited once and deleteImage is referenced directly so the loop only
performs image deletion, not repeated imports.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/functions/compress.ts`:
- Around line 88-99: The code is currently calling readFileSync on
data.raw.imageData allowing arbitrary local paths; update the logic in the
hasImage branch (where data.raw.imageData is inspected and
provider.describeImage is called) to only read files that originate from our
managed image store: resolve the path (using path.resolve) and verify it is
inside the configured image-store directory (compare with
path.resolve(IMAGE_STORE_DIR) and ensure the resolved path startsWith that dir)
before calling readFileSync; if the check fails, do not read the file and
instead either treat imageData as invalid/untrusted or require a typed
image-store reference (persist an image ID/ref rather than raw paths) and return
an error so provider.describeImage is not passed arbitrary filesystem paths.
Ensure the checks are applied to the branch that currently transforms file paths
into base64 and reference the symbols data.raw.imageData, hasImage,
provider.describeImage, and readFileSync.

In `@src/functions/observe.ts`:
- Around line 112-118: The code currently calls saveImageToDisk() as soon as
extractImage(sanitizedRaw) yields hiddenImage, which may create orphan files if
the observation later fails session-cap checks or KV persistence; move the
saveImageToDisk(hiddenImage) call (and assignment to raw.imageData) out of the
early path and into the same locked/persisted section that performs the
session-cap check and KV write (the block that admits/persists the observation),
or alternatively ensure any file created is deleted on every early
return/failure; update references around extractImage, hiddenImage,
saveImageToDisk, and raw.imageData so the image is only written when the
observation is successfully admitted/persisted and not before.

In `@src/triggers/api.ts`:
- Around line 1214-1218: The handler currently forwards the raw casted body to
sdk.trigger("mem::routine-create") after only checking body?.name; instead,
perform full boundary validation and whitelisting: reject if name or steps are
missing, ensure name is a string, ensure steps is a non-empty array and each
step is an object with only the allowed step fields (whitelist keys you expect),
validate types for those fields, and return 400 with a clear error when
validation fails; then build a sanitizedPayload object containing only the
whitelisted top-level fields (e.g., name, description, steps) and pass
sanitizedPayload to sdk.trigger("mem::routine-create", sanitizedPayload) instead
of the raw body (do not rely on a broad cast like body as Record<string,
unknown>).

In `@src/utils/image-store.ts`:
- Around line 28-29: Replace the ad-hoc SHA-256 filename derivation using
createHash with the canonical fingerprintId-based identity: call
fingerprintId(cleanBase64) and use its result to build the content-addressable
filename instead of computing a new hash; update the filePath construction
(currently using createHash(...).digest("hex")) to use
fingerprintId(cleanBase64) combined with ext and IMAGES_DIR so deduplication
uses the single source of truth.
- Around line 3-4: Import existsSync and unlinkSync from "node:fs" at the top
alongside writeFileSync/mkdirSync and remove any runtime require() calls; then
harden deleteImage(imagePath) by resolving and normalizing the target (use
path.resolve or equivalent) and comparing it against the resolved IMAGES_DIR
(ensure the resolved target begins with the resolved IMAGES_DIR + path.sep)
before calling unlinkSync, returning/logging a safe error if the path is outside
IMAGES_DIR so only files inside IMAGES_DIR can be deleted.

In `@test/multimodal.test.ts`:
- Around line 81-127: The test never calls the registered mem::compress handler
so it bypasses the real compression path; call the captured compressCallback
with the proper payload (e.g. { observationId: raw.id, sessionId: raw.sessionId,
raw }) after registering via registerCompressFunction, make
mockProvider.compress return the XML string shape that parseCompressionXml
expects (not the fabricated JSON), then read the stored compressed observation
from the KV key list("mem:obs:test-session") and assert its fields
(imageRef/imageDescription/etc.) to validate the real MemoryProvider.compress ->
registerCompressFunction path executed.

---

Outside diff comments:
In `@src/functions/compress.ts`:
- Around line 25-40: VALID_TYPES now contains "image" but the union type
ObservationType doesn't include it and parseCompressionXml() currently masks
this by casting; update the type system to match runtime values: either add
"image" to the ObservationType union in the type definition or remove "image"
from VALID_TYPES so they stay consistent, and remove the unsafe cast in
parseCompressionXml() so TypeScript enforces exhaustiveness (handle
unknown/invalid types explicitly instead of casting).

---

Nitpick comments:
In `@src/functions/auto-forget.ts`:
- Around line 156-161: The dynamic import of deleteImage inside the nested
observation loop is causing repeated module loads; move the import out of the
loops by awaiting import("../utils/image-store.js") once before iterating
sessions/observations and destructuring deleteImage from the result, then call
deleteImage((obs as any).imageData) and deleteImage((obs as any).imageRef) as
before; ensure the import is awaited and keep existing error handling around
kv.delete and image deletion calls in the same function (auto-forget) to
preserve behavior.
- Around line 49-52: Hoist the dynamic image-store import out of the per-memory
loop in autoForget: before iterating TTL-expired memories, import deleteImage
once (either via a single await import("../utils/image-store.js") assigned to
const { deleteImage } or by switching to a static top-level import) and then
call deleteImage(mem.imageRef) inside the loop when imageRef exists; this
removes the repeated dynamic import and keeps behavior for (mem as any).imageRef
identical.

In `@src/functions/evict.ts`:
- Around line 97-99: Hoist the dynamic import of deleteImage to the start of the
eviction function so it’s only loaded once: at the top of the function body
(since the function is async) add const { deleteImage } = await
import("../utils/image-store.js"); then delete the four redundant dynamic
imports that currently appear next to the deleteImage calls inside the eviction
loop (the imports currently at the sites where deleteImage((o as any).imageData)
and deleteImage((o as any).imageRef) are invoked), leaving only the
deleteImage(...) calls intact.

In `@src/functions/retention.ts`:
- Around line 219-223: The dynamic import of deleteImage inside the eviction
loop causes repeated overhead; move the import out of the loop by performing
const { deleteImage } = await import("../utils/image-store.js") before iterating
over candidates (the loop where kv.get<Memory>(KV.memories, candidate.memoryId)
and (mem as any).imageRef are checked) and then call deleteImage(imageRef)
inside the loop; ensure the import is awaited once and deleteImage is referenced
directly so the loop only performs image deletion, not repeated imports.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a9dd1ee8-6d11-4c04-9d86-054d4d49452d

📥 Commits

Reviewing files that changed from the base of the PR and between 7af05ec and e692cf0.

📒 Files selected for processing (14)
  • plugin/scripts/post-tool-use.mjs
  • src/functions/auto-forget.ts
  • src/functions/compress.ts
  • src/functions/evict.ts
  • src/functions/observe.ts
  • src/functions/retention.ts
  • src/hooks/post-tool-use.ts
  • src/prompts/vision.ts
  • src/providers/agent-sdk.ts
  • src/providers/anthropic.ts
  • src/triggers/api.ts
  • src/types.ts
  • src/utils/image-store.ts
  • test/multimodal.test.ts

Comment thread src/functions/compress.ts
Comment thread src/functions/observe.ts Outdated
Comment thread src/triggers/api.ts Outdated
Comment thread src/utils/image-store.ts Outdated
Comment thread src/utils/image-store.ts Outdated
Comment on lines +28 to +29
const hash = createHash("sha256").update(cleanBase64).digest("hex");
const filePath = join(IMAGES_DIR, `${hash}.${ext}`);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Use fingerprintId() instead of rolling a second content hash.

This is a content-addressable storage path, so using a local SHA-256 here creates another dedup identity scheme to keep in sync. Please derive the filename from fingerprintId(cleanBase64) instead.

As per coding guidelines, "Use fingerprintId() for content-addressable deduplication and generateId() for unique IDs in TypeScript code".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/image-store.ts` around lines 28 - 29, Replace the ad-hoc SHA-256
filename derivation using createHash with the canonical fingerprintId-based
identity: call fingerprintId(cleanBase64) and use its result to build the
content-addressable filename instead of computing a new hash; update the
filePath construction (currently using createHash(...).digest("hex")) to use
fingerprintId(cleanBase64) combined with ext and IMAGES_DIR so deduplication
uses the single source of truth.

Comment thread test/multimodal.test.ts Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
src/functions/observe.ts (2)

112-120: Redundant extractImage call—reuse hiddenImage instead.

extractImage(sanitizedRaw) is invoked at line 112 to set hiddenImage, then called again at lines 118-120 to set pendingImageData with the same input. Reuse the already-computed value to avoid redundant traversal.

♻️ Proposed fix
         const hiddenImage = extractImage(sanitizedRaw);
         if (hiddenImage) {
           raw.modality = (raw.toolInput || raw.toolOutput || raw.userPrompt) ? "mixed" : "image";
         }
       }
 
-      const pendingImageData = (raw.modality === "image" || raw.modality === "mixed")
-        ? extractImage(typeof sanitizedRaw === "object" ? sanitizedRaw : undefined)
-        : undefined;
+      const pendingImageData = (raw.modality === "image" || raw.modality === "mixed")
+        ? hiddenImage
+        : undefined;

Note: hiddenImage is already in scope and holds the same extracted value.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/observe.ts` around lines 112 - 120, The code redundantly calls
extractImage(sanitizedRaw) twice; reuse the already-computed hiddenImage instead
of re-traversing sanitizedRaw. Update the pendingImageData assignment so when
raw.modality is "image" or "mixed" it returns hiddenImage (which was computed
from sanitizedRaw) and otherwise undefined; keep the existing modality check
(raw.modality === "image" || raw.modality === "mixed") and reference
hiddenImage, pendingImageData, extractImage, and sanitizedRaw when making the
change.

133-138: Consider wrapping the persist step in try/catch to avoid orphan files on KV failure.

If kv.set at line 138 throws after saveImageToDisk succeeds at line 135, the image file becomes orphaned since no observation references it. The eviction/retention hooks won't see it.

🛡️ Optional defensive pattern
         if (pendingImageData && (pendingImageData.startsWith("data:image/") || pendingImageData.startsWith("iVBORw0KGgo") || pendingImageData.startsWith("/9j/"))) {
           const { saveImageToDisk } = await import("../utils/image-store.js");
           raw.imageData = saveImageToDisk(pendingImageData);
         }
 
-        await kv.set(KV.observations(payload.sessionId), obsId, raw);
+        try {
+          await kv.set(KV.observations(payload.sessionId), obsId, raw);
+        } catch (err) {
+          if (raw.imageData) {
+            const { unlinkSync } = await import("node:fs");
+            try { unlinkSync(raw.imageData); } catch { /* best effort */ }
+          }
+          throw err;
+        }

This is a low-probability edge case, so marking as optional.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/observe.ts` around lines 133 - 138, The persist step can fail
after the image is written, leaving orphan files; change the flow around
pendingImageData/saveImageToDisk and kv.set so you await
saveImageToDisk(pendingImageData) and wrap the subsequent
kv.set(KV.observations(payload.sessionId), obsId, raw) in a try/catch; on catch
call a cleanup helper (e.g., deleteImageFromDisk or removeImageFromDisk from
../utils/image-store.js) using the path returned in raw.imageData, rethrow or
log the original error, and ensure no orphan file remains if kv.set fails.
src/functions/compress.ts (1)

107-113: Consider distinguishing security refusals from transient failures in logging.

The catch block logs all failures uniformly as warnings. When line 96 throws for an unmanaged path (a potential security probe), it's logged identically to a transient API timeout. Differentiating these could help with debugging and security monitoring.

🔍 Optional: differentiate error types
         } catch (err) {
           const msg = err instanceof Error ? err.message : String(err);
-          ctx.logger.warn("Vision model call failed, falling back to text-only compression", {
+          const level = msg.includes("outside managed store") ? "error" : "warn";
+          ctx.logger[level](
+            msg.includes("outside managed store")
+              ? "Refused to read image outside managed store"
+              : "Vision model call failed, falling back to text-only compression",
+            {
             obsId: data.observationId,
             error: msg,
-          });
+            }
+          );
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/compress.ts` around lines 107 - 113, The catch currently lumps
all failures together; update the catch in src/functions/compress.ts to inspect
err (e.g., err.name === 'SecurityError' or err.message.includes('unmanaged
path') or other provider-specific sentinel) and log security refusals with a
distinct level/marker versus transient failures: for security refusals call
ctx.logger.warn or ctx.logger.error with a security flag (e.g., { obsId:
data.observationId, security: true, error: msg }) and for transient/API errors
keep the existing warning but include error details; ensure you reference the
same ctx.logger.warn/ctx.logger.error call sites and preserve err -> msg
extraction logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/functions/compress.ts`:
- Around line 107-113: The catch currently lumps all failures together; update
the catch in src/functions/compress.ts to inspect err (e.g., err.name ===
'SecurityError' or err.message.includes('unmanaged path') or other
provider-specific sentinel) and log security refusals with a distinct
level/marker versus transient failures: for security refusals call
ctx.logger.warn or ctx.logger.error with a security flag (e.g., { obsId:
data.observationId, security: true, error: msg }) and for transient/API errors
keep the existing warning but include error details; ensure you reference the
same ctx.logger.warn/ctx.logger.error call sites and preserve err -> msg
extraction logic.

In `@src/functions/observe.ts`:
- Around line 112-120: The code redundantly calls extractImage(sanitizedRaw)
twice; reuse the already-computed hiddenImage instead of re-traversing
sanitizedRaw. Update the pendingImageData assignment so when raw.modality is
"image" or "mixed" it returns hiddenImage (which was computed from sanitizedRaw)
and otherwise undefined; keep the existing modality check (raw.modality ===
"image" || raw.modality === "mixed") and reference hiddenImage,
pendingImageData, extractImage, and sanitizedRaw when making the change.
- Around line 133-138: The persist step can fail after the image is written,
leaving orphan files; change the flow around pendingImageData/saveImageToDisk
and kv.set so you await saveImageToDisk(pendingImageData) and wrap the
subsequent kv.set(KV.observations(payload.sessionId), obsId, raw) in a
try/catch; on catch call a cleanup helper (e.g., deleteImageFromDisk or
removeImageFromDisk from ../utils/image-store.js) using the path returned in
raw.imageData, rethrow or log the original error, and ensure no orphan file
remains if kv.set fails.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c29c35a8-425a-4407-be8d-31626671ddc9

📥 Commits

Reviewing files that changed from the base of the PR and between e692cf0 and 3486288.

📒 Files selected for processing (5)
  • src/functions/compress.ts
  • src/functions/observe.ts
  • src/triggers/api.ts
  • src/utils/image-store.ts
  • test/multimodal.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/utils/image-store.ts
  • test/multimodal.test.ts

Copy link
Copy Markdown
Owner

@rohitg00 rohitg00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Tanmay-008 for this PR! Solid foundation — clean architecture with hook → observe → compress → describe pipeline, content-addressed storage, path traversal protection, and GC hooks across all 3 eviction paths. Tests cover the full flow.

Here are issues I'd like to see addressed before merge:

🐛 Bugs

1. ObservationType missing "image"

VALID_TYPES in compress.ts adds "image" but the ObservationType union in src/types.ts doesn't include it. This means TypeScript exhaustive checks won't catch it — a runtime value the type system says is impossible.

// src/types.ts
export type ObservationType =
  | "file_edit"
  | "command"
  ...
+ | "image"
  | "other";

2. Orphan image files on KV write failure

If saveImageToDisk succeeds but kv.set throws right after, the image is orphaned on disk with no KV reference. Wrap in try/catch:

  if (pendingImageData && ...) {
    const { saveImageToDisk } = await import("../utils/image-store.js");
    raw.imageData = saveImageToDisk(pendingImageData);
  }

- await kv.set(KV.observations(payload.sessionId), obsId, raw);
+ try {
+   await kv.set(KV.observations(payload.sessionId), obsId, raw);
+ } catch (err) {
+   if (raw.imageData) {
+     const { deleteImage } = await import("../utils/image-store.js");
+     deleteImage(raw.imageData);
+   }
+   throw err;
+ }

⚡ Performance

3. Dynamic import() inside nested loops

auto-forget.ts and evict.ts both call await import("../utils/image-store.js") inside nested loops — potentially hundreds of dynamic imports per eviction run. In evict.ts this is duplicated in 4 separate places within the same function. Hoist once at the top:

+ const { deleteImage } = await import("../utils/image-store.js");
  for (let i = 0; i < sessions.length; i++) {
    for (const obs of ...) {
-     const { deleteImage } = await import("../utils/image-store.js");
      if ((obs as any).imageData) deleteImage((obs as any).imageData);

🧹 Code Quality

4. Redundant extractImage call in observe.ts

extractImage(sanitizedRaw) is called twice — once for hiddenImage and again for pendingImageData. Reuse the computed value:

  const hiddenImage = extractImage(sanitizedRaw);
  if (hiddenImage) {
    raw.modality = (raw.toolInput || raw.toolOutput || raw.userPrompt) ? \"mixed\" : \"image\";
  }

- const pendingImageData = (raw.modality === \"image\" || raw.modality === \"mixed\")
-   ? extractImage(typeof sanitizedRaw === \"object\" ? sanitizedRaw : undefined)
-   : undefined;
+ const pendingImageData = (raw.modality === \"image\" || raw.modality === \"mixed\")
+   ? hiddenImage
+   : undefined;

5. (mem as any).imageRef casts everywhere

The types in types.ts were updated to include imageRef/imageData on CompressedObservation and RawObservation, but the eviction code still uses as any casts instead of typed access. Should use proper typed properties since the fields now exist on the interfaces.

6. imageThumbBase64 declared but never used

CompressedObservation adds imageThumbBase64?: string but nothing writes to it. Either remove it or add a TODO comment for thumbnail generation in a follow-up.

📦 Out-of-Scope Changes

7. src/triggers/api.ts — routine-create validation

The added steps validation and stricter body parsing in routine-create isn't related to multimodal memory. Please split into a separate commit or PR.

8. src/providers/agent-sdk.ts — payload extraction refactor

The (msg as any).result refactor is cosmetic and unrelated to images. Same — separate commit/PR.

🔮 Missing (OK as follow-up)

9. Other providers don't have describeImage

Only AnthropicProvider implements describeImage(). The fallback in compress.ts handles this gracefully (logs warning, falls back to text-only), but Gemini and OpenRouter both support vision. Worth a follow-up PR — could add to the description as a TODO.

10. No MCP tool for image-aware search

Users can't query "show me screenshots from last session" through MCP. Not a blocker for this PR but a natural follow-up to expose the new capability.


Summary

Category Count
Bugs 2 (type union, orphan files)
Perf 1 (dynamic imports in loops)
Code quality 3 (redundant call, as any, unused field)
Out-of-scope 2 (api.ts, agent-sdk.ts)
Follow-ups 2 (other providers, MCP tool)

Fix items 1-6 and split out 7-8 and this is ready to merge. Items 9-10 can be follow-ups. Great work understanding the whole system — the GC integration across all 3 eviction paths shows real attention to detail. 🙌

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/triggers/api.ts (1)

1211-1237: ⚠️ Potential issue | 🟠 Major

Tighten runtime validation before calling mem::routine-create.

Line 1223 still only checks truthiness, so values like name: 123, name: " ", steps: [], steps: {}, or step objects without a valid title can still cross the HTTP boundary. The downstream routine contract and tests already treat empty or malformed steps as invalid (src/functions/routines.ts:7-20, test/routines.test.ts:58-99), so this handler should reject and sanitize them here instead of forwarding loosely typed data.

Proposed fix
-      if (!req.body?.name || !req.body?.steps) {
+      const body = (req.body ?? {}) as Record<string, unknown>;
+      const name = typeof body.name === "string" ? body.name.trim() : "";
+      const steps =
+        Array.isArray(body.steps) &&
+        body.steps.length > 0 &&
+        body.steps.every((step) => {
+          if (!step || typeof step !== "object") return false;
+          const rawStep = step as Record<string, unknown>;
+          return (
+            typeof rawStep.title === "string" && rawStep.title.trim() !== ""
+          );
+        })
+          ? body.steps.map((step) => {
+              const rawStep = step as Record<string, unknown>;
+              return {
+                title: (rawStep.title as string).trim(),
+                description:
+                  typeof rawStep.description === "string"
+                    ? rawStep.description
+                    : "",
+                actionTemplate:
+                  rawStep.actionTemplate &&
+                  typeof rawStep.actionTemplate === "object"
+                    ? (rawStep.actionTemplate as Record<string, unknown>)
+                    : {},
+                dependsOn: Array.isArray(rawStep.dependsOn)
+                  ? rawStep.dependsOn.filter(
+                      (value): value is number => typeof value === "number",
+                    )
+                  : [],
+              };
+            })
+          : null;
+
+      if (!name || !steps) {
         return {
           status_code: 400,
           body: { error: "name and steps are required" },
         };
       }
       const payload = {
-        name: req.body.name,
-        description: req.body.description,
-        steps: req.body.steps,
-        tags: req.body.tags,
-        frozen: req.body.frozen,
-        sourceProceduralIds: req.body.sourceProceduralIds,
+        name,
+        description:
+          typeof body.description === "string" ? body.description : undefined,
+        steps,
+        tags: Array.isArray(body.tags)
+          ? body.tags.filter((tag): tag is string => typeof tag === "string")
+          : undefined,
+        frozen: typeof body.frozen === "boolean" ? body.frozen : undefined,
+        sourceProceduralIds: Array.isArray(body.sourceProceduralIds)
+          ? body.sourceProceduralIds.filter(
+              (id): id is string => typeof id === "string",
+            )
+          : undefined,
       };

Based on learnings: src/triggers/api.ts REST endpoint handlers must validate inputs, whitelist request body fields, and never pass raw request body directly to sdk.trigger(), and input validation must occur at system boundaries.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/triggers/api.ts` around lines 1211 - 1237, The handler currently only
checks truthiness and forwards raw req.body to
sdk.trigger("mem::routine-create"); instead, tighten validation: ensure name is
a non-empty trimmed string, description (if present) is a string, tags (if
present) is an array of strings, frozen (if present) is boolean, and
sourceProceduralIds (if present) is an array of strings. Validate steps is a
non-empty array and each step is an object with a non-empty trimmed title (and
any other required step fields per src/functions/routines.ts), reject with a 400
and clear message on failure, and build a sanitized payload object (whitelist
fields) to pass to sdk.trigger rather than forwarding req.body directly; keep
checkAuth(req, secret) usage and return early on auth error.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/providers/agent-sdk.ts`:
- Line 29: The assignment to result from (msg as any).result can be null while
the function promises a Promise<string>; update the assignment to guard against
null by coercing to an empty string (or another non-null default) so result is
always a string; locate the assignment to result in src/providers/agent-sdk.ts
(the line using (msg as any).result) and replace it with a null-safe expression
(e.g., result = (msg as any).result ?? '') ensuring the function still returns
Promise<string>.

---

Duplicate comments:
In `@src/triggers/api.ts`:
- Around line 1211-1237: The handler currently only checks truthiness and
forwards raw req.body to sdk.trigger("mem::routine-create"); instead, tighten
validation: ensure name is a non-empty trimmed string, description (if present)
is a string, tags (if present) is an array of strings, frozen (if present) is
boolean, and sourceProceduralIds (if present) is an array of strings. Validate
steps is a non-empty array and each step is an object with a non-empty trimmed
title (and any other required step fields per src/functions/routines.ts), reject
with a 400 and clear message on failure, and build a sanitized payload object
(whitelist fields) to pass to sdk.trigger rather than forwarding req.body
directly; keep checkAuth(req, secret) usage and return early on auth error.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 93c1f873-59b5-4a9f-bb73-fe0af7ceb143

📥 Commits

Reviewing files that changed from the base of the PR and between 3486288 and 59f7611.

📒 Files selected for processing (7)
  • src/functions/auto-forget.ts
  • src/functions/evict.ts
  • src/functions/observe.ts
  • src/functions/retention.ts
  • src/providers/agent-sdk.ts
  • src/triggers/api.ts
  • src/types.ts
🚧 Files skipped from review as they are similar to previous changes (5)
  • src/functions/retention.ts
  • src/functions/evict.ts
  • src/functions/auto-forget.ts
  • src/functions/observe.ts
  • src/types.ts

Comment thread src/providers/agent-sdk.ts Outdated
@Qodo-Free-For-OSS
Copy link
Copy Markdown

Hi, mem::forget deletes KV records but doesn’t delete corresponding on-disk images, so a user-initiated forget can leave sensitive screenshots behind in ~/.agentmemory/images.

Severity: action required | Category: security

How to fix: Delete/GC images on forget

Agent prompt to fix - you can give this to your LLM of choice:

Issue description

User-triggered forget removes KV data but can leave persisted image files on disk.

Issue Context

  • Images are written to ~/.agentmemory/images.
  • mem::forget deletes sessions/observations/memories without cleaning up corresponding image files.
  • Because images are deduped, deletion must also avoid removing shared images still referenced elsewhere.

Fix Focus Areas

  • src/functions/observe.ts[138-151]
  • src/functions/remember.ts[115-160]
  • src/utils/image-store.ts[13-53]

Suggested fix

  • When forgetting:
    • Load the targeted observation(s)/memory first, collect their image refs.
    • Use the same safe deletion strategy as eviction (refcount or mark-and-sweep) to remove files only when unreferenced.
  • Add tests covering:
    • Forgetting an observation deletes its image when unique.
    • Forgetting one of two observations sharing an image does not delete the shared file.

We noticed a couple of other issues in this PR as well — happy to share if helpful.


Found by Qodo code review

@rohitg00
Copy link
Copy Markdown
Owner

Can you resolve conflicts and share demo of what's working?

@Tanmay-008
Copy link
Copy Markdown
Contributor Author

ok @rohitg00

@rohitg00
Copy link
Copy Markdown
Owner

Hey @Tanmay-008, thanks for this — #64 is the tracking issue for multimodal and your foundation looks like the right shape (content-addressed SHA-256 store, safe base64 extraction before truncation, vision prompt wired through the compress path).

Branch currently shows as conflicting with main. A few recent merges touched the same files:

Could you rebase / merge main and push? Once it's CLEAN I'll line up review.

While you're at it, two things I'll want covered before merge:

  1. Disk quota on ~/.agentmemory/images/. A single session can capture hundreds of screenshots. Needs a configurable cap (env var, e.g. AGENTMEMORY_IMAGE_STORE_MAX_BYTES, default 500 MB) and an LRU-by-lastReferenced eviction path, otherwise this can silently fill a user's disk.
  2. Deletion parity. When a memory is evicted (retention / auto-forget / governance-delete), its imageRef should get refcount-decremented; if no remaining observation references the image, the blob is removed. Content-addressed dedup is great for writes but needs the mirror on the delete side or we leak forever.

Happy to pair on either after the rebase. Appreciate the work on this.

@Qodo-Free-For-OSS
Copy link
Copy Markdown

Hi, Eviction/retention/auto-forget delete image files by path without considering that images are content-addressed and deduplicated, so deleting one observation can remove the shared file still referenced by other observations/memories.

Severity: action required | Category: reliability

How to fix: GC images by reachability

Agent prompt to fix - you can give this to your LLM of choice:

Issue description

Image files are content-addressed (deduped by SHA-256). Current deletion unlinks the file whenever any referencing record is removed, which breaks other records that still reference the same image path.

Issue Context

  • saveImageToDisk() derives the filename from sha256(cleanBase64).
  • deleteImage() deletes by path with no reference counting.
  • Multiple records can share the same image path.

Fix Focus Areas

  • src/utils/image-store.ts[13-53]
  • src/functions/evict.ts[93-103]
  • src/functions/auto-forget.ts[49-55]
  • src/functions/retention.ts[197-226]

Suggested fix

Implement one of:

  1. Reference counting in KV:
    • On save: increment mem:imageRefCount:<hash>.
    • On delete of a record: decrement; unlink only when reaches 0.
  2. Mark-and-sweep GC:
    • Periodically scan KV for all imageRef/imageData paths in observations/memories.
    • Delete only unreferenced files in IMAGES_DIR.

Also ensure all deletion sites use the same mechanism (no direct unlink unless safe).

We noticed a couple of other issues in this PR as well — happy to share if helpful.


Found by Qodo code review

@vercel
Copy link
Copy Markdown

vercel Bot commented Apr 19, 2026

@Tanmay-008 is attempting to deploy a commit to the rohitg00's projects Team on Vercel.

A member of the Team first needs to authorize it.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

🧹 Nitpick comments (1)
test/multimodal.test.ts (1)

56-160: Cross-describe state coupling via savedImagePath is fragile.

Step 2 depends on Step 1's module-scoped savedImagePath and on the KV contents persisted in Step 1. If Vitest runs these in isolation, in parallel, or if Step 1 fails, Step 2 will cascade-fail with a confusing error (or silently read undefined). Consider consolidating both steps into a single describe with beforeAll seeding the image + raw observation, or make Step 2 self-contained by saving its own fixture.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/multimodal.test.ts` around lines 56 - 160, Tests are coupling state
across its two it blocks via the module-scoped savedImagePath and KV contents
(used by observeCallback/registerObserveFunction and registerCompressFunction),
which makes Step 2 fragile if Step 1 doesn’t run first; fix by making the
compress test self-contained: either move both steps into a single test or add a
beforeAll that uses registerObserveFunction/observeCallback (or directly writes
a RawObservation into kv) to seed savedImagePath and the "mem:obs:test-session"
entry before registerCompressFunction is called, ensuring compressCallback sees
a valid raw observation (references: savedImagePath, observeCallback,
registerObserveFunction, registerCompressFunction, compressCallback, kv).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/functions/auto-forget.ts`:
- Around line 156-157: The code may call decrementImageRef twice when
obs.imageData === obs.imageRef; update the call sites (in auto-forget.ts where
obs.imageData / obs.imageRef are decremented and both locations in evict.ts) to
de-duplicate the image reference before decrementing (e.g., build a small Set or
compare strings) so each unique image path is passed to decrementImageRef only
once; ensure you handle undefined/null and keep using the existing
decrementImageRef(kv, sdk, ...) function name so behavior is unchanged for
distinct values.

In `@src/functions/disk-size-manager.ts`:
- Around line 15-38: The RMW on KV.state[DISK_SIZE_KEY] must be guarded with the
project's keyed lock to prevent lost concurrent updates: wrap the read, delta
application, clamp-to-zero, and the kv.set call in withKeyedLock(DISK_SIZE_KEY,
async () => { ... }); keep the initial validation (typeof data?.deltaBytes ...)
outside or at the start, then move currentTotal = await kv.get<number>(KV.state,
DISK_SIZE_KEY), newTotal calculation, await kv.set(KV.state, DISK_SIZE_KEY,
newTotal) and the subsequent quota check
(sdk.triggerVoid("mem::image-quota-cleanup", {}) and ctx.logger.info(...))
inside the withKeyedLock block so the whole read-modify-write+cleanup sequence
is atomic.

In `@src/functions/image-quota-cleanup.ts`:
- Around line 25-29: The lock acquisition is racy: replace the current
read-then-write (kv.get(KV.state, LOCK_KEY) + kv.set(KV.state, LOCK_KEY, now))
with an atomic compare-and-set or the keyed-mutex helper so only one worker wins
the lock (use a unique token/timestamp as the value); when releasing the lock in
the finally block, read the current KV value and only delete/unset LOCK_KEY if
it still equals the exact token/timestamp this invocation wrote (so you never
remove a lock acquired by another worker), ensuring references to LOCK_KEY,
LOCK_TTL_MS, kv.get, kv.set, and the finally release logic are updated
accordingly.
- Around line 62-86: The cleanup loop in image-quota-cleanup.ts has a TOCTOU
between getImageRefCount and deleteImage; fix by acquiring the same per-path
lock used by image-refs.ts (e.g., the lock function you add/use like
acquireImageLock/releaseImageLock or imageRefLock(path)) before reading and
deleting: inside the loop, for each f.filePath call the per-path lock, then call
getImageRefCount(kv, f.filePath) while holding the lock, if refCount is still 0
call deleteImage(f.filePath) and emit the mem::disk-size-delta, then release the
lock in a finally block; ensure you re-check refCount under the lock so
concurrent incrementImageRef/observe calls serialize and cannot race with
deleteImage.

In `@src/functions/image-refs.ts`:
- Around line 11-28: incrementImageRef and decrementImageRef perform non-atomic
read-modify-write on KV.imageRefs causing race conditions and double-delete;
wrap the body of both functions in the per-path mutex by using the existing
withKeyedLock helper (keyed on filePath) to ensure serialized access, then
inside the lock call getImageRefCount, compute and perform kv.set/kv.delete and
call touchImage/deleteImage and sdk.triggerVoid as currently implemented;
reference functions: incrementImageRef, decrementImageRef, withKeyedLock,
getImageRefCount, KV.imageRefs, touchImage, deleteImage, and sdk.triggerVoid
when applying the change.

In `@src/functions/observe.ts`:
- Around line 10-31: extractImage currently returns any string-valued image_*
field without verifying it matches the same prefixes you persist later, and it
recurses unbounded; update extractImage to only accept strings that start with
the allowed image prefixes ("data:image/", "iVBORw0KGgo", "/9j/") (same check
used when saving) and add a recursion depth cap (e.g. pass an optional depth
param defaulting to 0 and bail after a small max like 5) so it stops traversing
deeply nested/ malicious objects; keep the function signature extractImage(d:
unknown) but implement an internal/overloaded depth parameter and ensure callers
(including the raw.modality gating logic that reads imageData later) rely on
this stricter validation so modality is only set when a valid-prefixed image is
found.
- Around line 138-160: Move the image save + ref-increment logic
(saveImageToDisk + incrementImageRef + sdk.triggerVoid) inside the try block so
any exception after saving is handled by the catch; in the catch handler, do not
call deleteImage directly — call decrementImageRef(kv, raw.imageData) (which
will handle deletion and disk-size compensation when refcount hits zero) and
only trigger negative disk-size delta if decrementImageRef indicates it deleted
bytes; ensure raw.imageData is set consistently (filePath) prior to
incrementImageRef so the catch can reference it, and replace the deleteImage
usage with decrementImageRef in the error path to avoid leaking KV imageRefs
entries when kv.set fails.

In `@src/utils/image-store.ts`:
- Around line 24-56: saveImageToDisk currently always writes and returns the
full file size even when a content-addressed file already exists, causing
duplicate saves to report incorrect disk deltas; fix by computing the target
filePath (using contentHash and IMAGES_DIR as done) and checking for its
existence (e.g., with existsSync or fs.stat/access) before calling writeFile—if
the file exists, skip writeFile and return bytesWritten: 0 and the existing
filePath, otherwise write the buffer and return the newly written size from
stat; update saveImageToDisk to perform that existence check and short-circuit
to avoid reporting full size for deduplicated writes.
- Around line 15-18: isManagedImagePath incorrectly compares resolve(filePath)
to a hardcoded IMAGES_DIR + "/" which fails on Windows and doesn't normalize
IMAGES_DIR; update isManagedImagePath to resolve and normalize IMAGES_DIR as
well (e.g. compute resolvedDir = resolve(IMAGES_DIR) or
path.normalize(resolve(IMAGES_DIR))) and then test membership using a
platform-safe approach (for example use path.relative(resolvedDir, resolvedPath)
to ensure file is inside resolvedDir, or compare against resolvedDir + path.sep)
so deleteImage and touchImage correctly detect managed images across platforms
and symlink/relative cases.

In `@test/multimodal.test.ts`:
- Around line 302-313: The test passes a possibly undefined variable to
deleteImage causing TypeScript strict-check failures; change the call to use the
non-null assertion so it becomes deleteImage(testFilePath!) (or add a runtime
guard) so the variable matches the earlier uses of testFilePath!; ensure
references to deleteImage and testFilePath are updated consistently in the test
block.
- Around line 259-276: Add assertions to cover deletion-parity: after using
incrementImageRef to bring "/fake/path/test.png" to 1 then call
decrementImageRef(localKv, localSdk, path) and assert localSdk.triggerVoid was
called once with that path and getImageRefCount returns 0; then call
decrementImageRef on the same path again and assert triggerVoid is not called a
second time and getImageRefCount remains 0 (no-op on already-zero/unknown ref);
finally, set up a shared image by incrementing twice (refcount 2), call
decrementImageRef once and assert getImageRefCount is 1 and localSdk.triggerVoid
was not called (shared image not deleted until refcount reaches 0).
- Around line 193-218: Wrap the test body that mutates
process.env.AGENTMEMORY_IMAGE_STORE_MAX_BYTES in a try/finally so restoration
always runs, and when restoring use delete
process.env.AGENTMEMORY_IMAGE_STORE_MAX_BYTES if originalEnv is undefined
(otherwise set it back to originalEnv); update the test around originalEnv, the
assignments to process.env.AGENTMEMORY_IMAGE_STORE_MAX_BYTES, and the final
restore in the block that calls registerDiskSizeManager and invokes
managerCallback so the env is correctly cleaned up after the assertions.

---

Nitpick comments:
In `@test/multimodal.test.ts`:
- Around line 56-160: Tests are coupling state across its two it blocks via the
module-scoped savedImagePath and KV contents (used by
observeCallback/registerObserveFunction and registerCompressFunction), which
makes Step 2 fragile if Step 1 doesn’t run first; fix by making the compress
test self-contained: either move both steps into a single test or add a
beforeAll that uses registerObserveFunction/observeCallback (or directly writes
a RawObservation into kv) to seed savedImagePath and the "mem:obs:test-session"
entry before registerCompressFunction is called, ensuring compressCallback sees
a valid raw observation (references: savedImagePath, observeCallback,
registerObserveFunction, registerCompressFunction, compressCallback, kv).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8173888a-1aa1-46e9-a453-e7e277fdfb6e

📥 Commits

Reviewing files that changed from the base of the PR and between 44b8f21 and 1d4fbce.

📒 Files selected for processing (11)
  • src/functions/auto-forget.ts
  • src/functions/disk-size-manager.ts
  • src/functions/evict.ts
  • src/functions/image-quota-cleanup.ts
  • src/functions/image-refs.ts
  • src/functions/observe.ts
  • src/functions/retention.ts
  • src/index.ts
  • src/state/schema.ts
  • src/utils/image-store.ts
  • test/multimodal.test.ts
✅ Files skipped from review due to trivial changes (1)
  • src/state/schema.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/functions/evict.ts

Comment thread src/functions/auto-forget.ts Outdated
Comment thread src/functions/disk-size-manager.ts
Comment thread src/functions/image-quota-cleanup.ts Outdated
Comment thread src/functions/image-quota-cleanup.ts Outdated
Comment thread src/functions/image-refs.ts
Comment thread src/utils/image-store.ts
Comment thread src/utils/image-store.ts
Comment thread test/multimodal.test.ts
Comment thread test/multimodal.test.ts
Comment thread test/multimodal.test.ts
@Tanmay-008
Copy link
Copy Markdown
Contributor Author

Hey @rohitg00 , add API key and please test this implementation

Resolves conflicts in:
- src/state/schema.ts (merged imageRefs + accessLog)
- src/functions/auto-forget.ts (kept audit, reordered ref decrement)
- src/functions/compress.ts (collapsed iii-sdk imports, s/ctx.logger/logger/)
- src/functions/evict.ts (kept audit + deleteAccessLog, deferred ref
  decrement until after delete succeeds)
- src/functions/retention.ts (kept source-aware delete routing, fetch mem
  from resolved scope, defer ref decrement)
- src/triggers/api.ts (kept main's registerFunction/trigger shape)
- src/types.ts (kept superset of AuditEntry.operation union)

qodo-ai action-required findings:
- mem::forget now decrements image refs for every deleted memory and
  observation (including the session-wipe branch), which stops sensitive
  screenshots leaking to ~/.agentmemory/images/ after user-initiated
  forget.
- Eviction / auto-forget now decrement refs only after the KV delete
  actually succeeds. The old order (decrement-then-delete) could
  desync ref counts when the delete threw.

Follow-on fixes:
- buildSyntheticCompression now carries modality and imageData through
  from raw to compressed, so the default zero-LLM path doesn't drop
  image metadata the viewer and governance paths rely on.
- Test harness updated for the current registerFunction (name, cb)
  signature and iii-sdk partial-mock pattern — 11/11 multimodal
  tests pass, full suite 788/788.

Closes rohitg00#64.
@rohitg00
Copy link
Copy Markdown
Owner

Conflicts resolved and pushed back to Tanmay-008:feat/multimodal-image-memory at 2edc02f. Full diff now builds clean and all 788 tests pass (11/11 new multimodal cases).

Conflict resolution notes

  • src/state/schema.ts — merged your imageRefs: "mem:image-refs" alongside main's accessLog: "mem:access". Dropped the ambiguous state: "mem:state" since nothing references it.
  • src/functions/evict.ts, retention.ts, auto-forget.ts — main added audit rows and try/catch around deletes. Kept those paths; reordered your decrementImageRef calls to run only after the KV delete succeeds (see qodo-ai finding below).
  • src/functions/compress.ts — collapsed the split iii-sdk imports into one line and replaced ctx.logger references with the module-level logger import that the rest of the file uses (ctx was never defined in scope, which was crashing the vision path the first time the try block entered).
  • src/triggers/api.ts — kept main's sdk.registerFunction("api::routine-create", async (req) => ...) shape and sdk.trigger({ function_id, payload }) call form, since every other trigger in the file uses it.
  • src/types.ts — kept main's superset of AuditEntry.operation.

qodo-ai findings addressed

  1. mem::forget leaves images on disk. src/functions/remember.ts now fetches the memory/observation before deleting and calls decrementImageRef for imageData and imageRef (guarding against the same path counted twice). Applies to the memoryId branch, the explicit observationIds branch, and the session-wipe branch. Sensitive screenshots no longer survive a user-initiated forget.
  2. Ref count desyncs when deletes fail. Reordered eviction/auto-forget: kv.delete runs first inside try/catch; only on success does decrementImageRef run. Retention does the same by resolving the scope, reading the memory from the resolved scope, then deleting.

Extra carry-through

  • buildSyntheticCompression now carries modality and imageData through from raw to compressed. Since auto-compress defaults OFF (Busting token allocation #138), this was dropping image metadata on the default path — viewer and governance expect those fields.
  • test/multimodal.test.ts updated for the current registerFunction(id, cb) signature and vi.mock("iii-sdk", async (importOriginal) => ...) partial-mock pattern, so TriggerAction stays exported.

Mergeable now reports MERGEABLE / UNSTABLE (Vercel preview needs your team auth, unrelated). Ready for review.

@rohitg00 rohitg00 merged commit bb60b7a into rohitg00:main Apr 21, 2026
3 of 4 checks passed
rohitg00 added a commit that referenced this pull request Apr 21, 2026
Builds on PR #111's image store. Adds a dedicated visual embedding
path so agents can search screenshots by natural language or by
similar image, without routing through VLM-describe → BM25-text.

## Additions

- `EmbeddingProvider.embedImage?(src)` slot (non-breaking, optional)
- `ClipEmbeddingProvider` — CLIP ViT-B/32 via `@xenova/transformers`
  (already a dep). 512d, normalized. Supports base64, data URL, and
  file paths. Lazy model load; same pattern as the existing local
  text provider.
- `createImageEmbeddingProvider()` gated on
  `AGENTMEMORY_IMAGE_EMBEDDINGS=true` — opt-in, matches the
  auto-compress/context-injection opt-in pattern from #138 and #143.
- `KV.imageEmbeddings` scope keyed by imageRef path.
- `mem::vision-embed` + `mem::vision-search` functions. Cosine
  similarity, brute-force over the stored set (fine at image-memory
  scales; ANN can come later).
- `POST /agentmemory/vision-embed` + `POST /agentmemory/vision-search`
  HTTP triggers with auth + 400/503 error shapes.
- MCP tool `memory_vision_search` exposed via the running-server
  handler. Accepts queryText, queryImageRef, queryImageBase64, topK,
  sessionId.
- Observe pipeline: when an image is stored and the flag is on,
  fires `mem::vision-embed` via triggerVoid so embedding happens off
  the critical path.
- `decrementImageRef` sweeps the embedding too when refcount hits
  zero, so forget/evict/retention stop orphaning vectors.

## Why

Mem0/Zep/Letta are text-only in 2026. AgenticVision ships CLIP but
needs manual capture. agentmemory already auto-captures via hooks —
bolting CLIP onto that pipeline makes it the only memory system
where the agent sees what it sees and can cross-modal retrieve
without manual invocation.

Full plan: local spec at ~/specs-backup/agentmemory-multimodal-plan.md.

## Tests

- test/vision-search.test.ts: 6 new cases covering embed→store, text
  query ranking, image-to-image query, sessionId filter, provider-
  absent error, missing-query rejection.
- Full suite: 794/794 pass (+6 new).
pull Bot pushed a commit to bhardwajRahul/agentmemory that referenced this pull request Apr 22, 2026
… onnx, restore KV.state

The v0.9.1 dist was dead on arrival. Three independent bugs layered:

1) `getContext` was re-imported during PR rohitg00#111 conflict resolution in
   three files (compress.ts, disk-size-manager.ts, image-quota-cleanup.ts).
   iii-sdk v0.11 dropped that export — the shim in src/logger.ts
   documents the fix. Tests mocked iii-sdk so `npm test` passed, but
   `node dist/cli.mjs` crashed on first import. Removed the imports,
   switched to the module-level logger.

2) The two PR rohitg00#111 files also used the
   `registerFunction({ id, description }, handler)` shape. iii-sdk
   0.11.2 only accepts `registerFunction(id, handler, options?)`. The
   dist imports succeeded but the runtime registration threw "not a
   function" on the test mock. Collapsed to the flat form.

3) `KV.state` was dropped from src/state/schema.ts during the same
   conflict resolution, which broke the disk-size-manager's persistence
   key. Restored `state: "mem:state"` — used only by this manager.

4) tsdown was inlining onnxruntime-node and @xenova/transformers into
   dist/. That rewrites the relative require() inside
   `onnxruntime-node/dist/binding.js` so it can't find
   `../bin/napi-v3/darwin/arm64/onnxruntime_binding.node` from dist/.
   Even without AGENTMEMORY_IMAGE_EMBEDDINGS the module graph still
   evaluated the bundled binding.js on startup and blew up. Added both
   packages (plus onnxruntime-web and the two Anthropic SDKs) to
   `external:` in tsdown.config.ts. Bundle shrank from 6.1 MB to
   1.9 MB. The CLIP / local embedding providers lazy-load them from
   node_modules where relative paths work.

5) Bumped iii-sdk 0.11.0 → 0.11.2 to match the API currently shipped
   (Logger / durable:subscriber / durable:publisher / TriggerAction.void).

6) test/multimodal.test.ts used the old `{ id, description }` mock
   shape — rewrote the four registerFunction mocks to match the real
   `(id, cb)` signature. 812/812 pass.

End-to-end smoke test with AGENTMEMORY_SLOTS=true AGENTMEMORY_REFLECT=true:
- livez → 200 ok
- GET /slots → all 8 defaults seeded into correct scopes (persona /
  user_preferences / tool_guidelines in global; rest in project)
- POST /slot (invalid sizeLimit / unknown scope) → 400 with specific
  error
- POST /slot/append overflow → 413 with currentSize + sizeLimit
- POST /slot/reflect on empty session → no-op
- GET /audit → every slot write + reflect emits an audit row
- POST /vision-search without flag → 503 disabled
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants