fix(boot): self-explanatory + one-shot recovery for vector-dim mismatch (closes #469, #455)#575

Open: rohitg00 wants to merge 1 commit into main from fix/vector-dim-recovery

@rohitg00 (Owner) commented May 20, 2026

Summary

Closes #469. Closes #455.

Recovering from a persisted vector-index dimension mismatch was unnecessarily hard. The mismatch appears after the user changes EMBEDDING_PROVIDER and the new provider declares a different vector dimension. Two real problems were behind the reports:

1. Users couldn't discover the right .env file

The error pointed at AGENTMEMORY_DROP_STALE_INDEX=true but never said where that flag belongs. Under LaunchAgent / systemd / Docker the running process's HOME can differ from the user's interactive shell, so editing the shell's ~/.agentmemory/.env doesn't reach the process.

Fix: export the resolved paths from config.ts and embed them in the error message, including a ready-to-paste echo … >> $envFile recovery line and an explicit HOME-resolution note.
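A minimal sketch of what such an export could look like, assuming the data dir is `<home>/.agentmemory` as in the example error further down. The `describeResolvedPaths` helper is hypothetical and only illustrates how the paths could be rendered into the message; the actual code in src/config.ts may differ.

```typescript
import { existsSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Assumption: the data dir lives directly under the process's home directory.
// On POSIX, os.homedir() honours $HOME when it is set, which is why a service
// manager with a different HOME resolves a different .env file.
const dataDir = join(homedir(), ".agentmemory");

export const RESOLVED_PATHS = {
  dataDir,
  envFile: join(dataDir, ".env"),
  // Evaluated on each call so the message reflects the file system at failure time.
  envFileExists(): boolean {
    return existsSync(this.envFile);
  },
};

// Hypothetical helper: renders the "Resolved paths" part of the error message.
export function describeResolvedPaths(paths = RESOLVED_PATHS): string {
  return [
    "Resolved paths:",
    `  data dir: ${paths.dataDir}`,
    `  env file: ${paths.envFile} (exists: ${paths.envFileExists()})`,
  ].join("\n");
}
```

Because the paths come from the running process rather than from the user's shell, the printed env file is the one the process actually reads.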

2. Dropping was not one-shot

The dropStale branch only skipped restoring the bad vectors in memory — the stale payload stayed in KV. Removing the flag on the next boot would re-trip the guard.

Fix: in the dropStale branch, persist the cleared vector index back via indexPersistence.save(). The recovery is now one-shot: the flag drops the stale payload AND clears the on-disk KV, so the next boot is clean even after the flag is removed.
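The one-shot behaviour can be sketched as follows. The `VectorIndex` and `IndexPersistence` shapes and the `restoreIndex` function are hypothetical stand-ins, since the PR text only names `indexPersistence.save()`; the sketch also assumes the whole index is cleared rather than only the mismatched vectors.

```typescript
// Hypothetical shapes; the real index and persistence types are not shown in the PR.
interface VectorIndex {
  vectors: Map<string, number[]>;
}

interface IndexPersistence {
  load(): Promise<VectorIndex>;
  save(index: VectorIndex): Promise<void>;
}

async function restoreIndex(
  persistence: IndexPersistence,
  expectedDim: number,
  dropStale: boolean,
): Promise<VectorIndex> {
  const persisted = await persistence.load();
  const mismatched = Array.from(persisted.vectors.values()).filter(
    (v) => v.length !== expectedDim,
  );
  if (mismatched.length === 0) return persisted;

  if (!dropStale) {
    throw new Error(
      `persisted vector index has ${mismatched.length} of ` +
        `${persisted.vectors.size} vectors with the wrong dimension`,
    );
  }

  // Assumption: drop everything and start from an empty index.
  const cleared: VectorIndex = { vectors: new Map() };
  // Writing the cleared index back is what makes the recovery one-shot:
  // without it, the next boot without the flag would load the stale payload again.
  await persistence.save(cleared);
  return cleared;
}
```

With the save in place, a second boot with `dropStale` set to false loads the empty index and passes the guard.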

Diff

  • src/config.ts (+12): exports RESOLVED_PATHS for callers that need the paths actually read
  • src/index.ts (+33/-5): better error message + post-drop persist

40 insertions, 5 deletions across 2 files.

What it does NOT do

  • No new CLI subcommand (the --drop-stale flow is already the documented escape hatch)
  • No auto-migrate / re-embed (larger feature, out of scope for a minimal bug fix)
  • No dotenv-style global env injection (the runtime reads .env per call already; the issue was discoverability not load order)

Validation

  • npm test → 97/97 test files, 1081/1081 tests pass
  • npm run build → bundle clean
  • tsc --noEmit clean on touched files

Example error after the fix

[agentmemory] Refusing to start: persisted vector index has 19 of 19 vectors with the wrong dimension. Active provider (local) declares 384; dimensions seen on disk: 2048. First mismatched obsIds: mem_… (dim=2048), ... Loading would silently corrupt search (cross-dimension cosine returns 0).

Resolved paths:
  data dir: /Users/foo/.agentmemory
  env file: /Users/foo/.agentmemory/.env (exists: true)

Recovery — pick ONE:
  1. One-shot drop + rebuild (recommended):
       echo 'AGENTMEMORY_DROP_STALE_INDEX=true' >> /Users/foo/.agentmemory/.env
       # restart agentmemory; the flag can be removed after the next clean boot.
  2. Re-embed the existing index against the new provider, then start.
  3. Switch the embedding provider back to the one that wrote the index.

If running under a service manager (LaunchAgent, systemd, Docker), confirm
HOME points at the user account that owns /Users/foo/.agentmemory —
the .env file above is what the running process actually reads.
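To illustrate why the guard refuses to load such an index, here is a sketch under the assumption stated in the error text, namely that the cosine function returns 0 when the two vectors differ in length. Both `cosine` and `summarizeDims` are illustrative names, not the project's actual functions.

```typescript
// Assumed behaviour, per the error message: a length mismatch scores 0
// instead of throwing, so every stale vector silently drops out of search.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) return 0;
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Hypothetical summary used to build the "N of M vectors" part of the message.
function summarizeDims(vectors: Map<string, number[]>, expectedDim: number) {
  const dimsSeen = new Set<number>();
  let mismatched = 0;
  vectors.forEach((v) => {
    if (v.length !== expectedDim) {
      mismatched += 1;
      dimsSeen.add(v.length);
    }
  });
  return { mismatched, total: vectors.size, dimsSeen: Array.from(dimsSeen) };
}
```

Since a mismatched pair never raises an error, the only safe place to catch the problem is at load time, which is where the boot guard sits.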

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced error handling for vector index dimension mismatches, with automatic cleanup of stale indexes when configured.
    • Improved startup error messages with clear recovery steps and troubleshooting guidance.
    • Prevented startup crash-loop scenarios during vector index recovery.

…sistent

Recovery from a persisted vector-index dimension mismatch (after the user
changes EMBEDDING_PROVIDER) was unnecessarily hard:

  1. The error pointed users at `AGENTMEMORY_DROP_STALE_INDEX=true` but
     never told them which `.env` file the running process actually reads,
     which matters under LaunchAgent / systemd / Docker contexts where
     HOME can differ from the expected user directory.

  2. Even after setting the flag, the dropStale branch only skipped
     restoring the bad vectors in memory — the stale payload stayed in KV.
     Removing the flag on the next boot would crash-loop the server again
     because the persisted index was still bad.

Two minimal changes:

  - Export the resolved data dir + env-file paths from config and surface
    them in the error message, with a ready-to-paste `echo … >> $envFile`
    recovery line and an explicit HOME-resolution note for service-managed
    deployments.

  - In the dropStale branch, persist the now-cleared index back via
    indexPersistence.save() so the recovery is one-shot: the flag drops
    the stale payload AND clears the on-disk KV, so the next boot is
    clean even after the flag is removed.

Tests (1081) + build pass.
@vercel

vercel Bot commented May 20, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| agentmemory | Ready | Preview, Comment | May 20, 2026 3:43pm |

@coderabbitai
Contributor

coderabbitai Bot commented May 20, 2026

Walkthrough

This PR improves recovery from persisted vector index dimension mismatches during startup. It centralizes filesystem paths in a RESOLVED_PATHS config export, then uses those paths in enhanced error messaging and automatic recovery logic when embedding dimensions conflict. When the AGENTMEMORY_DROP_STALE_INDEX flag is enabled, the stale index is cleared and persisted immediately; otherwise, a detailed error with recovery steps and path information is reported.

Changes

Vector Index Dimension Mismatch Handling

| Layer / File(s) | Summary |
| --- | --- |
| Config path resolution (src/config.ts) | RESOLVED_PATHS constant exports dataDir and envFile paths resolved from the user home directory, with an envFileExists() helper using existsSync(). |
| Dimension mismatch handling and recovery (src/index.ts) | Import RESOLVED_PATHS for use in startup validation. When persisted vectors have wrong embedding dimensions and AGENTMEMORY_DROP_STALE_INDEX=true, immediately clear and save the index via indexPersistence.save() to prevent crash-loop restarts. When the flag is not set, throw an expanded error message that includes resolved data/env paths, env file existence status, numbered recovery steps with explicit echo instructions, and HOME/data-dir ownership guidance. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • rohitg00/agentmemory#248: Both PRs modify the persisted vector-dimension mismatch startup path in src/index.ts regarding validation and persistence of embeddings.
  • rohitg00/agentmemory#461: Both PRs address the AGENTMEMORY_DROP_STALE_INDEX startup guard, reading the flag from config and clearing/persisting the mismatched index in src/index.ts.

Poem

🐰 A path through the chaos, resolved and clear,
No more dimension ghosts to fear!
Drop stale vectors, save with grace—
The rabbit hops to startup's place.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main changes: fixing boot-time recovery for vector-dimension mismatch and making error messages self-explanatory, directly addressing issues #469 and #455. |
| Linked Issues check | ✅ Passed | The PR successfully addresses all primary coding objectives: exporting RESOLVED_PATHS to surface actual env/data paths, enhancing error messages with actionable recovery steps, and persisting cleared index via indexPersistence.save() to ensure one-shot recovery flow. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to the stated objectives: centralizing paths in config.ts, improving error messaging and recovery flow in index.ts. No CLI subcommands, auto-migration features, or global dotenv injection were introduced. |


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/index.ts`:
- Around line 416-418: The printed recovery command inserts
RESOLVED_PATHS.envFile raw into the shell snippet which breaks if the path
contains spaces or shell metacharacters; update the template in src/index.ts to
shell-quote or escape RESOLVED_PATHS.envFile before interpolation (e.g., call an
escape helper like escapeShellArg or wrap the path in single quotes while
escaping any embedded single quotes) so the printed command is safe to
copy-paste; ensure you replace the direct interpolation of
RESOLVED_PATHS.envFile in the multiline string with the escaped/quoted value.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e5fea2ac-ea27-45a2-ada7-412e894c4175

📥 Commits

Reviewing files that changed from the base of the PR and between edd1ceb and e2c44aa.

📒 Files selected for processing (2)
  • src/config.ts
  • src/index.ts

Comment thread src/index.ts
Comment on lines +416 to +418
` 1. One-shot drop + rebuild (recommended):\n` +
` echo 'AGENTMEMORY_DROP_STALE_INDEX=true' >> ${RESOLVED_PATHS.envFile}\n` +
` # restart agentmemory; the flag can be removed after the next clean boot.\n` +

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Quote the env-file path in the recovery command.

Line 417 inserts the raw path into shell syntax. If the resolved HOME path contains spaces or shell metacharacters, the suggested copy-paste recovery step breaks, which undercuts the main operator guidance in this error path. Please shell-quote or escape RESOLVED_PATHS.envFile before printing it.
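One way to address this, sketched under the assumption that the recovery line targets a POSIX shell: wrap the path in single quotes and rewrite each embedded single quote as `'\''`. The `shellQuote` and `recoveryLine` names are hypothetical, not helpers that exist in the repository.

```typescript
// POSIX single-quote escaping: inside '...' nothing is special except the
// quote itself, so close the quote, emit an escaped quote, and reopen.
function shellQuote(arg: string): string {
  return "'" + arg.split("'").join("'\\''") + "'";
}

// Hypothetical builder for the printed recovery step.
function recoveryLine(envFile: string): string {
  return `echo 'AGENTMEMORY_DROP_STALE_INDEX=true' >> ${shellQuote(envFile)}`;
}

console.log(recoveryLine("/Users/foo bar/.agentmemory/.env"));
// prints: echo 'AGENTMEMORY_DROP_STALE_INDEX=true' >> '/Users/foo bar/.agentmemory/.env'
```

Quoting unconditionally keeps the output predictable: paths without special characters simply gain a pair of quotes, and the command stays safe to paste either way.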
