Skip to content

fix: address 38 system audit findings#67

Closed
rohitg00 wants to merge 2 commits intomainfrom
fix/system-audit
Closed

fix: address 38 system audit findings#67
rohitg00 wants to merge 2 commits intomainfrom
fix/system-audit

Conversation

@rohitg00
Copy link
Copy Markdown
Owner

@rohitg00 rohitg00 commented Mar 4, 2026

Summary

Fixes 38 findings from the system audit (1 critical, 12 high, 16 medium, 9 low) across security, race conditions, performance, correctness, and dead code categories.

New Files (2)

  • src/version.ts — Single VERSION constant replacing 11 hardcoded strings across 8 files
  • src/auth.ts — Shared timingSafeCompare() using crypto.timingSafeEqual + VIEWER_CSP constant

Security Fixes

  • Timing-safe auth comparison in all 3 auth implementations (api.ts, viewer/server.ts, mcp/server.ts)
  • Snapshot commitHash validation with /^[0-9a-f]{7,40}$/i regex to prevent command injection
  • Content-Type enforcement on POST/DELETE routes in viewer server
  • CSP header on viewer HTML response
  • Added npm_, glpat-, dop_v1_ token patterns to privacy scanner

Input Validation

  • Governance bulk delete requires at least one filter for non-dryRun operations
  • Team itemType validation against Set(["memory", "pattern"]) whitelist
  • maxObservationsPerSession enforcement in observe function

Race Condition Fixes

  • In-memory mutex around remember read-find-supersede-write cycle
  • In-memory mutex around relations read-modify-write for relatedIds

Bug Fixes

  • Schema version validation updated to accept both 0.3.0 and 0.4.0
  • GraphNode.properties typed as Record<string, unknown> (was Record<string, string>)
  • Standalone MCP tools/list filtered to only 5 implemented tools
  • Export version uses VERSION constant

Config & Cleanup

  • Circuit breaker thresholds configurable via constructor opts (with current values as defaults)
  • Health monitor interval.unref() to prevent blocking process shutdown
  • Removed unused getCounters()/getHistograms() telemetry exports
  • Centralized version string via VERSION import in 7 files

Modified Files (19)

src/triggers/api.ts, src/viewer/server.ts, src/mcp/server.ts, src/mcp/standalone.ts, src/index.ts, src/functions/observe.ts, src/functions/compress.ts (no change needed), src/functions/remember.ts, src/functions/relations.ts, src/functions/snapshot.ts, src/functions/governance.ts, src/functions/team.ts, src/functions/privacy.ts, src/functions/export-import.ts, src/eval/schemas.ts, src/types.ts, src/providers/circuit-breaker.ts, src/health/monitor.ts, src/telemetry/setup.ts

Test plan

  • npm run build — succeeds with no errors
  • npm test — 216 tests pass (29 test files)
  • Updated snapshot test to use valid 7-char commit hash
  • Manual: verify timing-safe auth rejects invalid tokens
  • Manual: verify bulk delete with no filters returns error

Summary by CodeRabbit

  • New Features

    • Session observation limits and a keyed locking mechanism to serialize related operations
    • Configurable circuit breaker options; dynamic service/version reporting
  • Security

    • Timing-safe authorization checks and a stricter viewer Content Security Policy
    • Extended secret detection for npm, GitHub, and Doppler tokens
  • Bug Fixes

    • Added validations (governance filters, commit hash format, item types) and relaxed export version acceptance
  • Tests

    • Updated snapshot test expectations and mock outputs

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 4, 2026

📝 Walkthrough

Walkthrough

Adds timing-safe auth utilities and CSP, centralizes VERSION usage, introduces a keyed mutex utility and applies it to memory operations (remember, relate, observe), expands secret-detection regexes, relaxes export schema for versions, adds several input validations and limits, makes circuit breaker configurable, and widens GraphNode property types.

Changes

Cohort / File(s) Summary
Auth & CSP
src/auth.ts, src/mcp/server.ts, src/triggers/api.ts
New timing-safe compare and VIEWER_CSP; replace direct header equality with timingSafeCompare and stricter auth handling.
Keyed Locking / Concurrency
src/state/keyed-mutex.ts, src/functions/remember.ts, src/functions/relations.ts, src/functions/observe.ts
Adds withKeyedLock implementation and applies per-key locks to mem::remember, mem::relate, and observation handling to serialize critical sections.
Versioning & Snapshots
src/version.ts, src/index.ts, src/functions/export-import.ts, src/functions/snapshot.ts, src/mcp/standalone.ts, src/triggers/api.ts
Centralizes VERSION constant, replaces hardcoded "0.4.0" with VERSION, validates commitHash format, and aligns export payload versions.
Validation & Limits
src/functions/observe.ts, src/functions/governance.ts, src/functions/team.ts
Adds maxObservationsPerSession limit, requires filters for governance bulk (non-dryRun), and validates itemType against allowed set.
Privacy / Secrets
src/functions/privacy.ts
Expands secret regexes to detect npm_..., glpat-..., and dop_v1_... token formats.
Circuit Breaker & Telemetry
src/providers/circuit-breaker.ts, src/telemetry/setup.ts
Makes CircuitBreaker thresholds/timeouts configurable via options; replaces hardcoded telemetry serviceVersion with VERSION and removes two metric accessors.
Health & Tests
src/health/monitor.ts, test/snapshot.test.ts
Calls interval.unref() on health monitor; updates test mocks/expectations for commitHash changes and compacted mocked JSON.
Types & Graph
src/types.ts, src/functions/graph.ts
Widen GraphNode.properties to Record<string, unknown> and guard graph property filtering to only operate on string values.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant ObserveFn as ObserveFunction
  participant LockMgr as withKeyedLock
  participant KV
  participant Stream
  participant MemCompress as mem::compress

  Client->>ObserveFn: submit observation (sessionId, data)
  ObserveFn->>LockMgr: withKeyedLock("obs:<sessionId>", fn)
  LockMgr->>KV: read session observation count
  alt limit reached
    KV-->>ObserveFn: count >= maxObservationsPerSession
    ObserveFn-->>Client: return { success:false, error }
  else allowed
    ObserveFn->>KV: kv.set(raw observation)
    ObserveFn->>Stream: stream.set(new observation)
    ObserveFn->>KV: increment session observationCount
    ObserveFn->>MemCompress: trigger mem::compress
    MemCompress-->>ObserveFn: done
    ObserveFn-->>Client: return { observationId }
  end
  Note over LockMgr,KV: operations serialized per session key
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Possibly related PRs

  • PR #6: Shares authentication and CSP-related edits (timing-safe comparisons and VIEWER_CSP) affecting the same files.
  • PR #7: Related to mem::relate changes; this PR adds mutex locking around relation creation while the other PR touches relation logic.
  • PR #4: Overlaps on export/import and VERSION handling changes that this PR centralizes and relaxes.

Poem

🐰 A tiny hop, a careful bind,

Locks in place to ease the mind,
Tokens hidden, versions one,
Timing-safe auth—nice and done.
Hooray for tidy, threaded fun! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: address 38 system audit findings' accurately describes the main objective of the PR, which is to remediate 38 audit findings across multiple areas of the codebase.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/system-audit

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (3)
src/functions/team.ts (1)

29-40: Redundant conditional after validation.

After the VALID_ITEM_TYPES check (lines 30-35), the conditional at line 38 is always true since only "memory" or "pattern" can reach that point. Consider simplifying.

♻️ Suggested simplification
       if (!VALID_ITEM_TYPES.has(data.itemType)) {
         return {
           success: false,
           error: "itemType must be 'memory' or 'pattern'",
         };
       }
 
       let content: unknown = null;
-      if (data.itemType === "memory" || data.itemType === "pattern") {
-        content = await kv.get<Memory>(KV.memories, data.itemId);
-      }
+      content = await kv.get<Memory>(KV.memories, data.itemId);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/team.ts` around lines 29 - 40, The conditional checking if
(data.itemType === "memory" || data.itemType === "pattern") is redundant because
VALID_ITEM_TYPES already guarantees that only those values reach this point;
simplify by removing that if and directly assign content = await
kv.get<Memory>(KV.memories, data.itemId). Update any surrounding logic that
depended on the branch accordingly (keep VALID_ITEM_TYPES, data.itemType,
content, and the kv.get<Memory>(KV.memories, data.itemId) call as the referenced
symbols).
iii-config.yaml (1)

39-44: Consider an environment-specific telemetry profile for deployed runs.

Lines 41-44 are good local defaults, but exporter: memory + full logs can become costly for long-running deployments. A production config variant (e.g., OTLP exporter + sampled logs) would be safer operationally.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iii-config.yaml` around lines 39 - 44, The current OtelModule block
(modules::observability::OtelModule) uses local defaults (exporter: memory,
logs_enabled: true) that are expensive for long-running deployments; add an
environment-specific telemetry profile and wiring so production uses a scalable
exporter and reduced log sampling—e.g., support an environment toggle or env
vars checked during config load that switch exporter to OTLP (or configurable
endpoint) and set logs_enabled to false or enable sampling instead of full logs;
ensure service_name remains set (service_name: agentmemory) and make exporter
and logs_enabled overrideable via ENV (or a separate production config profile)
so deployed runs don’t use the in-memory exporter or full logging by default.
src/functions/export-import.ts (1)

62-62: Avoid masking version compatibility with a type assertion.

Line 62 uses VERSION as ExportData["version"], which suppresses type checking instead of enforcing it. Prefer declaring VERSION with the export version type and using it directly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/export-import.ts` at line 62, The code is masking type issues
by casting VERSION at use-site; instead declare/define the VERSION constant with
the precise ExportData["version"] type and remove the cast in the export object.
Find the VERSION constant (and its initializer) and change its declaration to:
const VERSION: ExportData["version"] = <appropriate literal/value> (or
validate/convert its value to this type), then update the export where you see
version: VERSION as ExportData["version"] to simply version: VERSION so the
compiler enforces compatibility.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/auth.ts`:
- Around line 3-6: The timingSafeCompare function currently compares a.length
(string code units) but must check UTF-8 byte lengths before calling
timingSafeEqual to avoid ERR_CRYPTO_TIMING_SAFE_EQUAL_LENGTH; update
timingSafeCompare to compute byte lengths (e.g., via Buffer.byteLength(a,
'utf8') or by creating buffers with Buffer.from(a, 'utf8') and comparing their
.length) and return false if the byte lengths differ, then call timingSafeEqual
with those UTF-8 buffers (use the same Buffer.from(..., 'utf8') instances) to
perform the timing-safe comparison.

In `@src/functions/observe.ts`:
- Around line 57-68: The session observation limit check is racy: multiple
concurrent calls to the observe handler can read the same Session via
kv.get(KV.sessions, payload.sessionId) and all pass before incrementing
observationCount, exceeding maxObservationsPerSession; change the logic in the
observe handler (the block using maxObservationsPerSession, KV.sessions,
payload.sessionId, and session.observationCount) to perform the
check-and-increment atomically by using a session-scoped critical section or an
atomic KV operation (e.g., acquire a lock keyed by payload.sessionId or use KV
compare-and-set/transaction) so that the read, validation against
maxObservationsPerSession, increment of observationCount, and write back happen
as one atomic step; apply the same fix to the second occurrence that manipulates
the observationCount (lines ~118-123) so both places enforce the limit under
concurrency.

In `@src/functions/relations.ts`:
- Around line 7-17: The current createMutex in relations.ts only serializes
within that file so mem::relate and mem::remember can race; replace the per-file
mutex with a shared keyed lock (e.g., a simple in-memory Map<string,
PromiseChain> or exported keyedMutex helper) that returns a lock for a specific
memory id and use it in both relations.ts (replace createMutex usage) and
remember.ts (replace its mutex) so updates to the same KV.memories row are
serialized by memory id; ensure the API exposes a function like
acquireLockFor(memoryId) or lock(memoryId, fn) and update calls in the
mem::relate and mem::remember handlers to use that keyed lock.

In `@src/providers/circuit-breaker.ts`:
- Around line 18-22: The constructor for the CircuitBreakerOptions must validate
and normalize inputs before assigning: check opts?.failureThreshold,
opts?.failureWindowMs and opts?.recoveryTimeoutMs are finite numbers and >0 (and
coerce failureThreshold to an integer >=1), otherwise fall back to the existing
defaults (3, 60000, 30000). Implement these guards in the constructor (use
Number.isFinite and comparisons, and Math.max/Math.floor for the threshold) so
NaN, Infinity, zero or negative values cannot disable the breaker.

In `@src/types.ts`:
- Line 288: The change of properties to Record<string, unknown> makes the call
to .toLowerCase() on node properties unsafe; in the graph search where you
inspect n.properties (the Object.values(n.properties).some(...) check), add a
type guard so you only call string methods on strings (e.g., check typeof v ===
'string' before calling v.toLowerCase().includes(lower)) to avoid runtime errors
for numbers/booleans/objects/arrays.

In `@src/viewer/server.ts`:
- Around line 418-434: The handler currently calls readFileSync for each request
(using variables candidates and VIEWER_CSP); instead, load and cache the viewer
HTML once at module initialization: iterate the same candidates array at
top-level (before the request handler), try reading each path with readFileSync
until one succeeds, store the resulting string in a module-scoped constant like
cachedViewerHtml (or null/undefined if none found), and then in the request
handler return that cachedViewerHtml with the existing headers (including
VIEWER_CSP) rather than calling readFileSync per request; also ensure you handle
the case where no candidate was found by sending a 404 or appropriate error
response.

---

Nitpick comments:
In `@iii-config.yaml`:
- Around line 39-44: The current OtelModule block
(modules::observability::OtelModule) uses local defaults (exporter: memory,
logs_enabled: true) that are expensive for long-running deployments; add an
environment-specific telemetry profile and wiring so production uses a scalable
exporter and reduced log sampling—e.g., support an environment toggle or env
vars checked during config load that switch exporter to OTLP (or configurable
endpoint) and set logs_enabled to false or enable sampling instead of full logs;
ensure service_name remains set (service_name: agentmemory) and make exporter
and logs_enabled overrideable via ENV (or a separate production config profile)
so deployed runs don’t use the in-memory exporter or full logging by default.

In `@src/functions/export-import.ts`:
- Line 62: The code is masking type issues by casting VERSION at use-site;
instead declare/define the VERSION constant with the precise
ExportData["version"] type and remove the cast in the export object. Find the
VERSION constant (and its initializer) and change its declaration to: const
VERSION: ExportData["version"] = <appropriate literal/value> (or
validate/convert its value to this type), then update the export where you see
version: VERSION as ExportData["version"] to simply version: VERSION so the
compiler enforces compatibility.

In `@src/functions/team.ts`:
- Around line 29-40: The conditional checking if (data.itemType === "memory" ||
data.itemType === "pattern") is redundant because VALID_ITEM_TYPES already
guarantees that only those values reach this point; simplify by removing that if
and directly assign content = await kv.get<Memory>(KV.memories, data.itemId).
Update any surrounding logic that depended on the branch accordingly (keep
VALID_ITEM_TYPES, data.itemType, content, and the kv.get<Memory>(KV.memories,
data.itemId) call as the referenced symbols).

ℹ️ Review info
Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ea57b3b8-7020-43d6-ae74-e77f9c20c617

📥 Commits

Reviewing files that changed from the base of the PR and between e2d8e70 and 52deda9.

📒 Files selected for processing (25)
  • iii-config.yaml
  • src/auth.ts
  • src/eval/schemas.ts
  • src/functions/compress.ts
  • src/functions/export-import.ts
  • src/functions/governance.ts
  • src/functions/observe.ts
  • src/functions/privacy.ts
  • src/functions/relations.ts
  • src/functions/remember.ts
  • src/functions/snapshot.ts
  • src/functions/team.ts
  • src/health/monitor.ts
  • src/index.ts
  • src/mcp/server.ts
  • src/mcp/standalone.ts
  • src/providers/circuit-breaker.ts
  • src/state/schema.ts
  • src/telemetry/setup.ts
  • src/triggers/api.ts
  • src/types.ts
  • src/version.ts
  • src/viewer/index.html
  • src/viewer/server.ts
  • test/snapshot.test.ts

Comment thread src/auth.ts
Comment thread src/functions/observe.ts Outdated
Comment thread src/functions/relations.ts Outdated
Comment on lines +7 to +17
function createMutex() {
let pending = Promise.resolve();
return <T>(fn: () => Promise<T>): Promise<T> => {
const result = pending.then(fn, fn);
pending = result.then(
() => {},
() => {},
);
return result;
};
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Use a shared memory-write lock across handlers, not a per-file mutex.

This mutex only serializes mem::relate. src/functions/remember.ts has a separate lock, so concurrent mem::remember + mem::relate updates to the same KV.memories row can still overwrite each other (lost relatedIds/isLatest updates). Please move to a shared lock (ideally keyed by memory id) used by both handlers.

Also applies to: 49-50, 64-97

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/relations.ts` around lines 7 - 17, The current createMutex in
relations.ts only serializes within that file so mem::relate and mem::remember
can race; replace the per-file mutex with a shared keyed lock (e.g., a simple
in-memory Map<string, PromiseChain> or exported keyedMutex helper) that returns
a lock for a specific memory id and use it in both relations.ts (replace
createMutex usage) and remember.ts (replace its mutex) so updates to the same
KV.memories row are serialized by memory id; ensure the API exposes a function
like acquireLockFor(memoryId) or lock(memoryId, fn) and update calls in the
mem::relate and mem::remember handlers to use that keyed lock.

Comment thread src/providers/circuit-breaker.ts
Comment thread src/types.ts
Comment thread src/viewer/server.ts Outdated
Comment on lines +418 to +434
const base = dirname(fileURLToPath(import.meta.url));
const candidates = [
join(base, "..", "src", "viewer", "index.html"),
join(base, "..", "viewer", "index.html"),
join(base, "viewer", "index.html"),
];
for (const p of candidates) {
try {
const html = readFileSync(p, "utf-8");
res.writeHead(200, {
"Content-Type": "text/html; charset=utf-8",
"Content-Security-Policy": VIEWER_CSP,
"Cache-Control": "no-cache",
});
res.end(html);
return;
} catch {}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify synchronous fs usage in HTTP-serving paths.
rg -n --type ts 'readFileSync\(' src/viewer/server.ts src/triggers/api.ts

Repository: rohitg00/agentmemory

Length of output: 209


🏁 Script executed:

#!/bin/bash
# Check context around readFileSync calls
echo "=== server.ts context (lines 410-440) ==="
sed -n '410,440p' src/viewer/server.ts

echo ""
echo "=== api.ts context (lines 1000-1015) ==="
sed -n '1000,1015p' src/triggers/api.ts

Repository: rohitg00/agentmemory

Length of output: 1537


🏁 Script executed:

#!/bin/bash
# Verify the broader scope - check if there are other readFileSync calls in hot paths
echo "=== All readFileSync calls in codebase ==="
rg -n 'readFileSync\(' --type ts | head -20

Repository: rohitg00/agentmemory

Length of output: 630


Cache the viewer HTML instead of reading it synchronously on every request.

readFileSync at line 426 blocks the event loop for every GET request to /viewer. Load the HTML once at module initialization and reuse it.

Suggested approach
+let cachedViewerHtml: string | null | undefined;
+
+function getViewerHtml(base: string): string | null {
+  if (cachedViewerHtml !== undefined) return cachedViewerHtml;
+  const candidates = [
+    join(base, "..", "src", "viewer", "index.html"),
+    join(base, "..", "viewer", "index.html"),
+    join(base, "viewer", "index.html"),
+  ];
+  for (const p of candidates) {
+    try {
+      cachedViewerHtml = readFileSync(p, "utf-8");
+      return cachedViewerHtml;
+    } catch {}
+  }
+  cachedViewerHtml = null;
+  return null;
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/viewer/server.ts` around lines 418 - 434, The handler currently calls
readFileSync for each request (using variables candidates and VIEWER_CSP);
instead, load and cache the viewer HTML once at module initialization: iterate
the same candidates array at top-level (before the request handler), try reading
each path with readFileSync until one succeeds, store the resulting string in a
module-scoped constant like cachedViewerHtml (or null/undefined if none found),
and then in the request handler return that cachedViewerHtml with the existing
headers (including VIEWER_CSP) rather than calling readFileSync per request;
also ensure you handle the case where no candidate was found by sending a 404 or
appropriate error response.

…nd reliability

Security:
- Timing-safe auth comparison (api.ts, mcp/server.ts, viewer auth)
- Commit hash format validation in snapshot restore
- 3 new secret patterns (npm_, glpat-, dop_v1_) in privacy scanner
- CSP header shared constant

Correctness:
- Single VERSION constant replacing 11 hardcoded strings
- GraphNode.properties typed as Record<string, unknown>
- Schema version accepts both 0.3.0 and 0.4.0
- Standalone MCP filters tools/list to only implemented tools

Race conditions:
- In-memory mutex for remember (read-find-supersede-write)
- In-memory mutex for relations (read-modify-write relatedIds)

Input validation:
- Governance bulk delete requires at least one filter
- Team share validates itemType against whitelist
- Observe enforces maxObservationsPerSession limit
- Content-Type enforcement reference

Reliability:
- Circuit breaker thresholds configurable via constructor
- Health monitor interval.unref() prevents process hang
- Removed dead getCounters/getHistograms exports from telemetry
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/triggers/api.ts (1)

74-74: ⚠️ Potential issue | 🟡 Minor

Use VERSION constant instead of hardcoded string.

VERSION is imported (line 11) but not used here. The health endpoint should use the centralized constant for consistency.

Proposed fix
       return {
         status_code: statusCode,
         body: {
           status,
           service: "agentmemory",
-          version: "0.4.0",
+          version: VERSION,
           health: health || null,
           functionMetrics,
           circuitBreaker,
         },
       };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/triggers/api.ts` at line 74, Replace the hardcoded version string in the
health endpoint response ("0.4.0") with the imported VERSION constant; locate
the object that currently contains version: "0.4.0" in src/triggers/api.ts (the
health response/handler) and change it to version: VERSION so the endpoint uses
the centralized constant.
src/functions/remember.ts (1)

109-155: ⚠️ Potential issue | 🟡 Minor

mem::forget operates without lock protection.

While mem::remember is now locked, mem::forget deletes memories and sessions without acquiring the mutex. A concurrent mem::remember could create a memory referencing a session being deleted, or mem::forget could delete a memory mid-update from mem::relate.

Consider wrapping destructive operations in withLock:

🔒 Proposed fix
   sdk.registerFunction(
     { id: "mem::forget" },
     async (data: {
       sessionId?: string;
       observationIds?: string[];
       memoryId?: string;
     }) => {
       const ctx = getContext();
-      let deleted = 0;
-
-      if (data.memoryId) {
-        await kv.delete(KV.memories, data.memoryId);
-        deleted++;
-      }
-      // ... rest of implementation
-      return { success: true, deleted };
+      return withLock(async () => {
+        let deleted = 0;
+
+        if (data.memoryId) {
+          await kv.delete(KV.memories, data.memoryId);
+          deleted++;
+        }
+        // ... rest of implementation
+        return { success: true, deleted };
+      });
     },
   );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/remember.ts` around lines 109 - 155, The mem::forget function
performs deletions without acquiring the same mutex used by
mem::remember/relate; wrap its destructive sections in the existing withLock
call to prevent races: acquire the global memory lock (the same lock used by
mem::remember/mem::relate) before any kv.delete or kv.list operations and
release after the deletions complete, ensuring the branches that delete by
memoryId, by sessionId+observationIds, and the full-session cleanup all run
under the lock; reference mem::forget, withLock, mem::remember, mem::relate, and
the kv.delete/kv.list/KV.* keys when making the change.
src/functions/relations.ts (1)

110-159: ⚠️ Potential issue | 🟠 Major

mem::evolve performs read-modify-write without lock protection.

This function reads existing (Line 119), modifies isLatest (Line 140), and writes back (Line 141). Concurrent mem::evolve, mem::relate, or mem::remember calls targeting the same memory can overwrite each other's changes (lost updates to relatedIds, isLatest, etc.).

Wrap the critical section in withLock for consistency with mem::relate:

🔒 Proposed fix
     async (data: {
       memoryId: string;
       newContent: string;
       newTitle?: string;
     }) => {
       const ctx = getContext();

-      const existing = await kv.get<Memory>(KV.memories, data.memoryId);
-      if (!existing) {
-        return { success: false, error: "memory not found" };
-      }
-
-      const now = new Date().toISOString();
-      // ... rest of implementation ...
-
-      return { success: true, memory: evolved, previousId: existing.id };
+      return withLock(async () => {
+        const existing = await kv.get<Memory>(KV.memories, data.memoryId);
+        if (!existing) {
+          return { success: false, error: "memory not found" };
+        }
+
+        const now = new Date().toISOString();
+        // ... rest of implementation ...
+
+        return { success: true, memory: evolved, previousId: existing.id };
+      });
     },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/relations.ts` around lines 110 - 159, The mem::evolve handler
performs a read-modify-write on the same Memory record (reads existing, sets
existing.isLatest = false, writes both existing and evolved) without
synchronization, causing lost updates under concurrent
mem::evolve/mem::relate/mem::remember calls; to fix, wrap the critical section
that reads existing, writes evolved and updates existing (the calls to
kv.get(KV.memories, ...), kv.set(KV.memories, evolved.id, ...), updating
existing.isLatest and kv.set(KV.memories, existing.id, ...), and creating the
relation via kv.set(KV.relations, ...)) inside the same withLock call used
elsewhere (use the memory id as the lock key), ensuring only one mutation runs
at a time and preserve existing logging and returned values from the locked
block.
🧹 Nitpick comments (2)
src/functions/remember.ts (1)

7-17: Duplicate implementation - see comment in relations.ts.

Same createMutex() function duplicated here. Extract to shared utility.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/remember.ts` around lines 7 - 17, The createMutex
implementation in remember.ts is duplicated from relations.ts; extract it into a
shared utility (e.g., utils/mutex or a new exported helper) by moving the
function createMutex (and its inner withLock type/semantics) into that module,
export it, then replace the local implementation in remember.ts and relations.ts
with an import of createMutex and use the exported function (ensuring any type
generics/return types for withLock are preserved); update imports where
createMutex/withLock were used and remove the duplicated code from both files.
src/functions/relations.ts (1)

7-17: Duplicate createMutex() implementation across modules.

This exact implementation is duplicated in src/functions/remember.ts. Extract to a shared utility module (e.g., src/utils/mutex.ts) and import in both places to avoid drift.

♻️ Suggested shared mutex utility

Create src/utils/mutex.ts:

export function createMutex() {
  let chain = Promise.resolve();
  return function withLock<T>(fn: () => Promise<T>): Promise<T> {
    const next = chain.then(fn, fn);
    chain = next.then(
      () => {},
      () => {},
    );
    return next;
  };
}

Then import in both files:

+import { createMutex } from "../utils/mutex.js";
-function createMutex() { ... }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/relations.ts` around lines 7 - 17, The createMutex
implementation is duplicated; extract it into a shared exported utility (e.g.,
createMutex in src/utils/mutex.ts) and replace the local implementations in both
modules by importing that exported createMutex; update
src/functions/relations.ts and src/functions/remember.ts to import { createMutex
} from the new utils module and remove the duplicated function bodies so both
modules reuse the single exported createMutex.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/functions/relations.ts`:
- Around line 110-159: The mem::evolve handler performs a read-modify-write on
the same Memory record (reads existing, sets existing.isLatest = false, writes
both existing and evolved) without synchronization, causing lost updates under
concurrent mem::evolve/mem::relate/mem::remember calls; to fix, wrap the
critical section that reads existing, writes evolved and updates existing (the
calls to kv.get(KV.memories, ...), kv.set(KV.memories, evolved.id, ...),
updating existing.isLatest and kv.set(KV.memories, existing.id, ...), and
creating the relation via kv.set(KV.relations, ...)) inside the same withLock
call used elsewhere (use the memory id as the lock key), ensuring only one
mutation runs at a time and preserve existing logging and returned values from
the locked block.

In `@src/functions/remember.ts`:
- Around line 109-155: The mem::forget function performs deletions without
acquiring the same mutex used by mem::remember/relate; wrap its destructive
sections in the existing withLock call to prevent races: acquire the global
memory lock (the same lock used by mem::remember/mem::relate) before any
kv.delete or kv.list operations and release after the deletions complete,
ensuring the branches that delete by memoryId, by sessionId+observationIds, and
the full-session cleanup all run under the lock; reference mem::forget,
withLock, mem::remember, mem::relate, and the kv.delete/kv.list/KV.* keys when
making the change.

In `@src/triggers/api.ts`:
- Line 74: Replace the hardcoded version string in the health endpoint response
("0.4.0") with the imported VERSION constant; locate the object that currently
contains version: "0.4.0" in src/triggers/api.ts (the health response/handler)
and change it to version: VERSION so the endpoint uses the centralized constant.

---

Nitpick comments:
In `@src/functions/relations.ts`:
- Around line 7-17: The createMutex implementation is duplicated; extract it
into a shared exported utility (e.g., createMutex in src/utils/mutex.ts) and
replace the local implementations in both modules by importing that exported
createMutex; update src/functions/relations.ts and src/functions/remember.ts to
import { createMutex } from the new utils module and remove the duplicated
function bodies so both modules reuse the single exported createMutex.

In `@src/functions/remember.ts`:
- Around line 7-17: The createMutex implementation in remember.ts is duplicated
from relations.ts; extract it into a shared utility (e.g., utils/mutex or a new
exported helper) by moving the function createMutex (and its inner withLock
type/semantics) into that module, export it, then replace the local
implementation in remember.ts and relations.ts with an import of createMutex and
use the exported function (ensuring any type generics/return types for withLock
are preserved); update imports where createMutex/withLock were used and remove
the duplicated code from both files.

ℹ️ Review info
Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f12e7a93-1bac-4efb-a95e-c87849678d06

📥 Commits

Reviewing files that changed from the base of the PR and between 52deda9 and bf2594f.

📒 Files selected for processing (20)
  • src/auth.ts
  • src/eval/schemas.ts
  • src/functions/export-import.ts
  • src/functions/governance.ts
  • src/functions/observe.ts
  • src/functions/privacy.ts
  • src/functions/relations.ts
  • src/functions/remember.ts
  • src/functions/snapshot.ts
  • src/functions/team.ts
  • src/health/monitor.ts
  • src/index.ts
  • src/mcp/server.ts
  • src/mcp/standalone.ts
  • src/providers/circuit-breaker.ts
  • src/telemetry/setup.ts
  • src/triggers/api.ts
  • src/types.ts
  • src/version.ts
  • test/snapshot.test.ts
🚧 Files skipped from review as they are similar to previous changes (8)
  • src/mcp/server.ts
  • src/functions/team.ts
  • src/providers/circuit-breaker.ts
  • test/snapshot.test.ts
  • src/functions/governance.ts
  • src/functions/export-import.ts
  • src/functions/observe.ts
  • src/health/monitor.ts

- auth: compare UTF-8 byte lengths via Buffer.from(…, 'utf8') to prevent
  ERR_CRYPTO_TIMING_SAFE_EQUAL_LENGTH on multi-byte chars
- observe: wrap limit check + write + count increment in session-scoped
  keyed lock to prevent concurrent observe calls exceeding the limit
- relations + remember: replace per-file mutexes with shared keyed-mutex
  so cross-function updates to the same KV.memories row are serialized
- circuit-breaker: validate constructor opts (Number.isFinite, >0, floor
  threshold to integer >=1) to prevent NaN/Infinity/negative from
  disabling the breaker
- graph: add typeof === 'string' guard on Object.values(n.properties)
  after changing GraphNode.properties to Record<string, unknown>
- version: type VERSION as literal union '0.3.0' | '0.4.0' to remove
  the cast in export-import.ts
- team: remove redundant itemType branch after VALID_ITEM_TYPES guard
- api: replace last hardcoded '0.4.0' with VERSION
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
src/functions/relations.ts (1)

50-52: ⚠️ Potential issue | 🔴 Critical

Lock key granularity still allows lost updates on shared memories.

Line 50 uses a pair lock (sourceId:targetId). Concurrent relates like A-B and A-C take different locks, both mutate KV.memories[A], and can overwrite each other’s relatedIds. Use per-memory lock keys in deterministic order so any operation touching the same memory serializes; align mem::remember to the same key domain for existing-memory writes.

🔧 Suggested direction
-      const lockKey = [data.sourceId, data.targetId].sort().join(":");
-
-      return withKeyedLock(lockKey, async () => {
+      const ids = [...new Set([data.sourceId, data.targetId])].sort();
+      const run = async () => {
         // existing relate body unchanged
-      });
+      };
+
+      if (ids.length === 1) {
+        return withKeyedLock(`mem:${ids[0]}`, run);
+      }
+      return withKeyedLock(`mem:${ids[0]}`, () =>
+        withKeyedLock(`mem:${ids[1]}`, run),
+      );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/relations.ts` around lines 50 - 52, The current pair lock
(lockKey built from [data.sourceId,data.targetId]) allows concurrent ops
touching the same memory to race; change to acquire per-memory locks in
deterministic order: build an array of the distinct memory ids involved (e.g.,
[data.sourceId, data.targetId].filter(unique).sort()), map them to a stable
domain key (prefix each with a token like "mem:"), and call withKeyedLock using
that ordered set (or a joined key) so any operation that touches a memory
serializes on that memory id; also ensure the mem::remember codepath uses the
same "mem:<id>" lock key domain when writing existing KV.memories entries so
both relate() and mem::remember serialize on the same lock names.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/functions/observe.ts`:
- Around line 90-99: The dedupMap.record(hash) is being called before write
admission and lock checks, causing rejected observations to be marked as
duplicates; move both the dedup check (dedupMap.has/hash check) and
dedupMap.record(hash) inside the withKeyedLock callback (the block guarded by
withKeyedLock(`obs:${payload.sessionId}`)) and only call dedupMap.record(hash)
after a successful persistence (after kv.set in the observation write path), so
that rejected or non-persisted observations are not recorded as deduplicated;
reference the withKeyedLock call, payload.sessionId, dedupMap.record/hash
checks, and kv.set to locate the changes.

In `@src/providers/circuit-breaker.ts`:
- Around line 35-36: The timestamp checks use truthiness which treats 0 as
falsy; update the conditions to explicit null checks: replace uses like
"this.openedAt && Date.now() - this.openedAt >= this.recoveryTimeoutMs" with
"this.openedAt !== null && Date.now() - this.openedAt >=
this.recoveryTimeoutMs", and similarly change the other check (e.g.
"this.windowStart") to "this.windowStart !== null" so epoch-zero or fake timers
won't skip timeout/window logic in the CircuitBreaker methods.

---

Duplicate comments:
In `@src/functions/relations.ts`:
- Around line 50-52: The current pair lock (lockKey built from
[data.sourceId,data.targetId]) allows concurrent ops touching the same memory to
race; change to acquire per-memory locks in deterministic order: build an array
of the distinct memory ids involved (e.g., [data.sourceId,
data.targetId].filter(unique).sort()), map them to a stable domain key (prefix
each with a token like "mem:"), and call withKeyedLock using that ordered set
(or a joined key) so any operation that touches a memory serializes on that
memory id; also ensure the mem::remember codepath uses the same "mem:<id>" lock
key domain when writing existing KV.memories entries so both relate() and
mem::remember serialize on the same lock names.

ℹ️ Review info
Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 38d43309-f8bf-46d8-9a67-4a6e4f6afc83

📥 Commits

Reviewing files that changed from the base of the PR and between bf2594f and bf3a03a.

📒 Files selected for processing (11)
  • src/auth.ts
  • src/functions/export-import.ts
  • src/functions/graph.ts
  • src/functions/observe.ts
  • src/functions/relations.ts
  • src/functions/remember.ts
  • src/functions/team.ts
  • src/providers/circuit-breaker.ts
  • src/state/keyed-mutex.ts
  • src/triggers/api.ts
  • src/version.ts
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/functions/export-import.ts
  • src/auth.ts
  • src/functions/team.ts

Comment thread src/functions/observe.ts
Comment on lines +90 to +99
return withKeyedLock(`obs:${payload.sessionId}`, async () => {
if (maxObservationsPerSession && maxObservationsPerSession > 0) {
const existing = await kv.list(KV.observations(payload.sessionId));
if (existing.length >= maxObservationsPerSession) {
return {
success: false,
error: `Session observation limit reached (${maxObservationsPerSession})`,
};
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Dedup state is recorded before write admission completes.

dedupMap.record(hash) happens before the Line 93 limit rejection path. If the observation is rejected, later retries can be treated as duplicates even though nothing was persisted. Move dedup check/record into the locked block and record only after successful kv.set.

🔧 Proposed adjustment
-      if (dedupMap) {
+      let dedupHash: string | undefined;
+      if (dedupMap) {
         const d =
           typeof payload.data === "object" && payload.data !== null
             ? (payload.data as Record<string, unknown>)
             : {};
         const toolName = (d["tool_name"] as string) || payload.hookType;
-        const hash = dedupMap.computeHash(
+        dedupHash = dedupMap.computeHash(
           payload.sessionId,
           toolName,
           d["tool_input"],
         );
-        if (dedupMap.isDuplicate(hash)) {
-          return { deduplicated: true, sessionId: payload.sessionId };
-        }
-        dedupMap.record(hash);
       }
@@
       return withKeyedLock(`obs:${payload.sessionId}`, async () => {
         if (maxObservationsPerSession && maxObservationsPerSession > 0) {
           const existing = await kv.list(KV.observations(payload.sessionId));
           if (existing.length >= maxObservationsPerSession) {
             return {
               success: false,
               error: `Session observation limit reached (${maxObservationsPerSession})`,
             };
           }
         }
+        if (dedupMap && dedupHash && dedupMap.isDuplicate(dedupHash)) {
+          return { deduplicated: true, sessionId: payload.sessionId };
+        }
 
         await kv.set(KV.observations(payload.sessionId), obsId, raw);
+        if (dedupMap && dedupHash) dedupMap.record(dedupHash);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/observe.ts` around lines 90 - 99, The dedupMap.record(hash) is
being called before write admission and lock checks, causing rejected
observations to be marked as duplicates; move both the dedup check
(dedupMap.has/hash check) and dedupMap.record(hash) inside the withKeyedLock
callback (the block guarded by withKeyedLock(`obs:${payload.sessionId}`)) and
only call dedupMap.record(hash) after a successful persistence (after kv.set in
the observation write path), so that rejected or non-persisted observations are
not recorded as deduplicated; reference the withKeyedLock call,
payload.sessionId, dedupMap.record/hash checks, and kv.set to locate the
changes.

Comment on lines +35 to +36
this.openedAt &&
Date.now() - this.openedAt >= this.recoveryTimeoutMs
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use explicit null checks for timestamps instead of truthiness.

At Line 35 and Line 62, 0 is treated as falsy, so timeout/window logic can be skipped under fake timers (or epoch-based test clocks). Use !== null checks to avoid state bugs.

🔧 Proposed fix
-      if (
-        this.openedAt &&
-        Date.now() - this.openedAt >= this.recoveryTimeoutMs
-      ) {
+      if (
+        this.openedAt !== null &&
+        Date.now() - this.openedAt >= this.recoveryTimeoutMs
+      ) {
         this.state = "half-open";
         return true;
       }
@@
-    if (this.lastFailureAt && now - this.lastFailureAt > this.failureWindowMs) {
+    if (
+      this.lastFailureAt !== null &&
+      now - this.lastFailureAt > this.failureWindowMs
+    ) {
       this.failures = 0;
     }

Also applies to: 62-62

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/providers/circuit-breaker.ts` around lines 35 - 36, The timestamp checks
use truthiness which treats 0 as falsy; update the conditions to explicit null
checks: replace uses like "this.openedAt && Date.now() - this.openedAt >=
this.recoveryTimeoutMs" with "this.openedAt !== null && Date.now() -
this.openedAt >= this.recoveryTimeoutMs", and similarly change the other check
(e.g. "this.windowStart") to "this.windowStart !== null" so epoch-zero or fake
timers won't skip timeout/window logic in the CircuitBreaker methods.

@rohitg00
Copy link
Copy Markdown
Owner Author

rohitg00 commented Mar 4, 2026

Replaced by #68 (clean branch from main)

@rohitg00 rohitg00 closed this Mar 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant