fix(rebuild): batch embed calls in rebuildIndex (25h → 3h on large corpora) by efenex · Pull Request #504 · rohitg00/agentmemory

efenex · 2026-05-18T11:04:02Z

Summary

rebuildIndex called await vectorIndexAddGuarded(...) per memory and per observation, and each call is one HTTP round-trip that embeds a single input. On a real bulk-imported corpus with any non-zero network latency, this per-item serialization dominates wall-clock time.

Add a batched helper vectorIndexAddBatchGuarded() that calls provider.embedBatch() once for a buffered group of items, and refactor rebuildIndex to accumulate + flush in batches of REBUILD_EMBED_BATCH_SIZE (default 32).
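The accumulate + flush shape described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `createBatcher`, `FlushFn`, and the `EmbedJob` fields shown here are hypothetical names, and the real helper in src/functions/search.ts carries more context per item.

```typescript
// Hypothetical job shape; the real EmbedJob in the PR has more fields.
interface EmbedJob {
  id: string;
  text: string;
}

type FlushFn = (batch: EmbedJob[]) => Promise<void>;

// Buffers jobs and hands them to `flush` in groups of at most `batchSize`.
function createBatcher(batchSize: number, flush: FlushFn) {
  let pending: EmbedJob[] = [];

  async function flushPending(): Promise<void> {
    if (pending.length === 0) return;
    const batch = pending;
    pending = []; // reset before awaiting so a failed flush cannot resend items
    await flush(batch);
  }

  async function enqueue(job: EmbedJob): Promise<void> {
    pending.push(job);
    if (pending.length >= batchSize) await flushPending();
  }

  return { enqueue, flushPending };
}

// Usage: 70 jobs with a batch size of 32 produce flushes of 32, 32, and 6.
async function demo(): Promise<number[]> {
  const sizes: number[] = [];
  const batcher = createBatcher(32, async (batch) => {
    sizes.push(batch.length);
  });
  for (let i = 0; i < 70; i++) {
    await batcher.enqueue({ id: `obs_${i}`, text: `observation ${i}` });
  }
  await batcher.flushPending(); // drain the final partial batch
  return sizes;
}
```

The final `flushPending()` call matters: without it, any remainder smaller than the batch size would never be embedded.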

The numbers

Measured against a self-hosted vLLM (Qwen3-Embedding-8B) endpoint:

call shape	latency	per-item
single embed	175 ms	175 ms
batch of 32	737 ms	23 ms

That's ~7.6× speedup per item. For a 500k-observation corpus, projected rebuild time drops from ~25 hours to ~3 hours.
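For reference, the arithmetic behind those figures, as a small sketch. It assumes the rebuild is strictly serial and counts embedding latency only (no BM25 or index-write time), which is a simplification of a real rebuild.

```typescript
// Measured values from the table above.
const ITEMS = 500000;
const SINGLE_MS = 175; // one item per request
const BATCH_MS = 737; // 32 items per request
const BATCH_SIZE = 32;

// Converts a per-item latency into total hours for `items` serial embeds.
function projectHours(items: number, msPerItem: number): number {
  return (items * msPerItem) / 3_600_000;
}

const perItemBatchedMs = BATCH_MS / BATCH_SIZE; // about 23 ms
const speedup = SINGLE_MS / perItemBatchedMs; // about 7.6x
const serialHours = projectHours(ITEMS, SINGLE_MS); // about 24.3 h
const batchedHours = projectHours(ITEMS, perItemBatchedMs); // about 3.2 h
```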

The same benefit should apply to any batchable OpenAI-compatible endpoint: vLLM, Triton, OpenAI's own /v1/embeddings, LM Studio, llama.cpp server, etc. All accept an input array, so network and GPU setup costs are amortized across the batch.
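As an illustration of the `input` array shape, here is a sketch of building a batched request body and reading vectors back in input order. The field names (`model`, `input`, `data[].index`, `data[].embedding`) follow the OpenAI embeddings API; that other servers return the same shape is an assumption, and `buildBatchRequest` / `vectorsInInputOrder` are hypothetical helper names, not functions from this repository.

```typescript
interface EmbeddingsRequest {
  model: string;
  input: string[];
}

interface EmbeddingsResponse {
  data: { index: number; embedding: number[] }[];
}

// One request body carrying every text in the batch.
function buildBatchRequest(model: string, texts: string[]): EmbeddingsRequest {
  return { model, input: texts };
}

// Sorts by `index` so vector i corresponds to input i, whatever order the server used.
function vectorsInInputOrder(res: EmbeddingsResponse): number[][] {
  return [...res.data]
    .sort((a, b) => a.index - b.index)
    .map((entry) => entry.embedding);
}
```

Sorting by `index` is defensive: it keeps each vector paired with its input even if a server returns entries out of order.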

Companion to #500

#500 made rebuildIndex non-blocking so the viewer + later boot steps run immediately. But the rebuild itself still took the same wallclock. This PR cuts the rebuild itself. After both, boot is fast AND the index hot-loads in a small multiple of the time it takes the provider to physically embed the corpus.

Failure semantics

failure	before	after
network/provider error on the call	per-item warn × N	single warn for the batch, N items marked failed
dimension mismatch on one record	per-item warn	per-item warn (item skipped, others in batch continue)
`rebuildIndex` return value	count of attempted items	unchanged

The soft-fail surface (call always returns, embed errors don't abort the rebuild) is preserved item-for-item.
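The two embed-side failure rows can be sketched like this. It is a simplified illustration under assumed interfaces (`Provider`, `IndexLike`, and the `warn` callback are hypothetical), not the body of `vectorIndexAddBatchGuarded` itself.

```typescript
interface Item {
  id: string;
  text: string;
}

interface Provider {
  name: string;
  dimensions: number;
  embedBatch(texts: string[]): Promise<number[][]>;
}

interface IndexLike {
  add(id: string, vector: number[]): void;
}

async function addBatchGuarded(
  items: Item[],
  provider: Provider,
  index: IndexLike,
  warn: (msg: string) => void,
): Promise<{ ok: number; fail: number }> {
  if (items.length === 0) return { ok: 0, fail: 0 };

  let vectors: number[][];
  try {
    vectors = await provider.embedBatch(items.map((item) => item.text));
  } catch (err) {
    // Whole-batch failure: one warning, every item counted as failed.
    warn(`${provider.name}: embedBatch failed for ${items.length} items: ${String(err)}`);
    return { ok: 0, fail: items.length };
  }

  if (vectors.length !== items.length) {
    warn(`${provider.name}: expected ${items.length} vectors, got ${vectors.length}`);
    return { ok: 0, fail: items.length };
  }

  let ok = 0;
  let fail = 0;
  items.forEach((item, i) => {
    const vector = vectors[i];
    if (!vector || vector.length !== provider.dimensions) {
      // Per-item failure: skip this item, keep going with the rest.
      warn(`${provider.name}: dimension mismatch for ${item.id}`);
      fail++;
      return;
    }
    index.add(item.id, vector);
    ok++;
  });
  return { ok, fail };
}
```

The function never throws on embed errors; callers only see the `{ ok, fail }` counts, which is what keeps the rebuild soft-failing.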

Configuration

# default
REBUILD_EMBED_BATCH_SIZE=32

# small endpoints that prefer fewer items per call
REBUILD_EMBED_BATCH_SIZE=8

# fall back to per-item path (effectively pre-PR behavior)
REBUILD_EMBED_BATCH_SIZE=1
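One way such a knob might be read is sketched below. `parseRebuildBatchSize` is a hypothetical name, and falling back to the default on missing or invalid values is an assumption for illustration; the PR's actual parsing may differ. The environment is passed in as a plain object (for example `process.env`) so the function stays easy to test.

```typescript
const DEFAULT_REBUILD_EMBED_BATCH = 32;

// Returns a positive integer batch size, using the default for anything else.
function parseRebuildBatchSize(env: Record<string, string | undefined>): number {
  const raw = env["REBUILD_EMBED_BATCH_SIZE"];
  if (raw === undefined) return DEFAULT_REBUILD_EMBED_BATCH;
  const parsed = Number(raw);
  // Rejects NaN, fractions, zero, and negatives (assumed policy).
  return Number.isInteger(parsed) && parsed >= 1 ? parsed : DEFAULT_REBUILD_EMBED_BATCH;
}
```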

Test plan

npx vitest run test/search-index.test.ts test/vector-index*.test.ts test/remember-bm25-index.test.ts — 39/39 pass unchanged.

No new test added for the batching path itself (would need mocking the provider's embedBatch and asserting buffered flush at boundaries — possible but not done in this PR; the existing tests cover that rebuildIndex produces the same final index, and the batched helper is a transparent optimization). Happy to add if reviewer prefers.

Verified live against a real corpus + vLLM endpoint by counting vLLM's vllm:request_success_total before/after a rebuild — request count drops by a factor of ~32 as expected.

Files

src/functions/search.ts — new vectorIndexAddBatchGuarded, refactor rebuildIndex to use it; +112 lines, -9 lines

Summary by CodeRabbit

New Features
- Batched embedding processing to speed up index rebuilds.
Improvements
- Granular success/failure tracking that skips problematic items without failing entire batches.
- Configurable rebuild batch size (default 32) for optimized memory/observation indexing.
- Faster handling when no sessions/data to process, avoiding unnecessary work.

…rpora)

rebuildIndex called `await vectorIndexAddGuarded(...)` per memory and per observation. Each call is one HTTP round-trip to the embedding provider for a single input. On a 500k-observation imported corpus against an embedding endpoint with even modest latency, that's serial 100-200ms per call = 14-28 hours of wallclock. The new non-blocking rebuild path (rohitg00#500) made this no longer block boot, but the rebuild itself still takes the same wallclock.

Add `vectorIndexAddBatchGuarded()` next to the existing per-item helper, accepting an array of items and calling `provider.embedBatch()` once. For batchable endpoints (vLLM, Triton, OpenAI's `/v1/embeddings` all accept an `input` array), latency for N items is roughly the latency of a single embed because network + GPU setup amortize.

Refactor `rebuildIndex` to accumulate items into a buffer and flush every REBUILD_EMBED_BATCH_SIZE (default 32). BM25 add stays per-item-synchronous; only the vector path is batched.

Validated against a vLLM Qwen3-Embedding-8B endpoint:
- single embed: 175ms
- batch-of-32: 737ms (= 23ms/item amortized, ~7.6× speedup)
- projected backfill time for 500k obs: 25h → 3h

Per-item failure shape is preserved:
- whole-batch network/provider error → all skipped, single warn line (vs N warns previously when the same error hit every item)
- per-item dimension mismatch → that item skipped, others continue
- rebuildIndex return value unchanged (count of attempted items)

Override knob:
- REBUILD_EMBED_BATCH_SIZE (default 32): set lower for endpoints with small per-request input limits, higher for endpoints that prefer larger batches. Set to 1 to fall back to the per-item path.

39/39 existing tests in search-index/vector-index/remember-bm25-index pass unchanged.

Related: rohitg00#500 (non-blocking rebuildIndex), rohitg00#503 (separate embedding base URL).

vercel · 2026-05-18T11:04:07Z

@efenex is attempting to deploy a commit to the rohitg00's projects Team on Vercel.

A member of the Team first needs to authorize it.

coderabbitai · 2026-05-18T11:04:16Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5d800c34-03c3-4339-bfde-5dcad3ab9f6f

📥 Commits

Reviewing files that changed from the base of the PR and between 5613ce7 and 81c387a.

📒 Files selected for processing (1)

src/functions/search.ts

🚧 Files skipped from review as they are similar to previous changes (1)

src/functions/search.ts

📝 Walkthrough

Walkthrough

rebuildIndex now buffers memory and observation embed jobs into configurable batches and writes vectors using a new vectorIndexAddBatchGuarded helper that calls EmbeddingProvider.embedBatch once per batch, validates counts and per-item dimensions, soft-fails on errors, and returns aggregated { ok, fail }.

Changes

Batched Embedding for Index Rebuild

Layer / File(s)	Summary
Batched embedding helper and batch sizing `src/functions/search.ts`	Exports `vectorIndexAddBatchGuarded` that embeds a batch in one `EmbeddingProvider.embedBatch` call, validates returned vector counts and per-item dimensions, attempts per-item index writes, logs/skips failures, and returns `{ ok, fail }`. Adds `DEFAULT_REBUILD_EMBED_BATCH` and `getRebuildEmbedBatchSize()` reading `REBUILD_EMBED_BATCH_SIZE`.
rebuildIndex batch queuing and flush infrastructure `src/functions/search.ts`	`rebuildIndex` defines `EmbedJob`, maintains a `pending` queue, and adds `enqueue()`/`flush()` helpers that call `vectorIndexAddBatchGuarded(pending)` when the queue reaches the configured batch size. Memory repopulation enqueues `title + content` jobs with `context.kind: "memory"`, observation repopulation enqueues `title + narrative` jobs with `context.kind: "observation"`, and remaining buffered jobs are flushed before returning (including an early "no sessions" flush).

Sequence Diagram

sequenceDiagram
  participant rebuildIndex
  participant vectorIndexAddBatchGuarded
  participant EmbeddingProvider
  participant vectorIndex
  rebuildIndex->>rebuildIndex: enqueue EmbedJob items
  rebuildIndex->>rebuildIndex: flush when batch size reached
  rebuildIndex->>vectorIndexAddBatchGuarded: call with batch items
  vectorIndexAddBatchGuarded->>EmbeddingProvider: embedBatch all items once
  EmbeddingProvider-->>vectorIndexAddBatchGuarded: vectors for each item
  vectorIndexAddBatchGuarded->>vectorIndex: write vectors per item
  vectorIndexAddBatchGuarded-->>rebuildIndex: { ok, fail } counts

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly Related PRs

rohitg00/agentmemory#327: Extends the vector-index live-write infrastructure; this PR replaces per-item writes with batched embedding writes and rebuildIndex batching.

Poem

🐰
I hop through queued embeds at last,
Batching memories held fast,
One call to embed, per-item care,
Softly skipping errors there,
Count the wins — the rest I past.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title clearly and specifically describes the main change: batching embed calls in rebuildIndex with a concrete performance improvement metric (25h → 3h on large corpora).
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

src/functions/search.ts (1)
89-99: ⚡ Quick win

Trim added WHAT-style comments in changed blocks.

These new comments describe mechanics/intent in detail; prefer clearer naming and keep comments minimal to avoid drift.

As per coding guidelines, "Avoid code comments explaining WHAT — use clear naming instead."

Also applies to: 157-163, 184-185, 269-270
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/functions/search.ts` around lines 89 - 99, The block comment above the
function vectorIndexAddBatchGuarded (and the other nearby changed comment
blocks) is WHAT-style and too verbose; replace it with a short, intent-focused
comment or remove it entirely and instead rely on clear naming/signature (e.g.,
vectorIndexAddBatchGuarded) and well-chosen variable names to convey behavior.
Trim the long multi-line description to a one-line summary like "Batch-embed
items and write resulting vectors; skip items on per-item dimension mismatch;
log on whole-batch errors." and remove implementation mechanics/details so
comments do not duplicate code.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/functions/search.ts`:
- Around line 137-153: The loop that calls vi.add for each item currently lets
any exception abort the entire batch; wrap the per-item call to vi.add(item.id,
item.sessionId, embedding) in a try/catch so individual write-time failures are
swallowed like the old guarded path: on catch, logger.warn with context (use the
same fields: item.context.kind, item.context.logId, provider ep.name) and
increment fail, then continue; only increment ok when vi.add succeeds. Ensure
the existing dimension-mismatch branch remains unchanged.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0e055530-d3a9-4b4e-a4b5-854ba5fb4229

📥 Commits

Reviewing files that changed from the base of the PR and between caa9f52 and 5613ce7.

📒 Files selected for processing (1)

src/functions/search.ts

efenex · 2026-05-18T15:09:52Z

Thanks for the catch — pushed 81c387a wrapping the per-item vi.add(...) in try/catch inside the batch loop, mirroring your suggested diff. A single failing write now logs + increments fail and the loop continues, matching the pre-batch soft-fail semantics. The existing dimension-mismatch branch is untouched.
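The per-item write guard described in this exchange can be sketched as follows. It is an illustration only: `WriteJob`, `VectorIndexLike`, `writeAll`, and the `warn` callback are hypothetical, and only the `vi.add(id, sessionId, embedding)` call shape is taken from the review comment.

```typescript
interface WriteJob {
  id: string;
  sessionId: string;
  embedding: number[];
}

interface VectorIndexLike {
  add(id: string, sessionId: string, embedding: number[]): void;
}

// Writes each vector; a throwing add() is logged and counted, never rethrown.
function writeAll(
  jobs: WriteJob[],
  vi: VectorIndexLike,
  warn: (msg: string) => void,
): { ok: number; fail: number } {
  let ok = 0;
  let fail = 0;
  for (const job of jobs) {
    try {
      vi.add(job.id, job.sessionId, job.embedding);
      ok++; // only counted once the write has succeeded
    } catch (err) {
      warn(`vector write failed for ${job.id}: ${String(err)}`);
      fail++;
    }
  }
  return { ok, fail };
}
```

Keeping the try/catch inside the loop, rather than around it, is what lets the remaining items in the batch still be written after one failure.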

rohitg00 · 2026-05-19T18:28:48Z

Merged + shipping in v0.9.21. Thanks @efenex — 25h→3h is huge for users with large corpora. The batch-flush + REBUILD_EMBED_BATCH_SIZE env knob shape is clean.

@cl0ckt0wer

Quality + integration wave. Bundles 11 PRs since v0.9.20:

Contributor feature:
- #237 OpenCode plugin with 22 auto-capture hooks (@cl0ckt0wer)

Bug fixes (9):
- #516 memory_recall endpoint + format/token_budget (@serhiizghama, closes #507/#440)
- #461 env-file AGENTMEMORY_DROP_STALE_INDEX flag honored (@honor2030, closes #456)
- #487 Windows hook path quoting (@honor2030, closes #477)
- #517 viewer IME composition guard (@jonathanzhan1975)
- #472 chunk large sessions for LLM context window (@efenex)
- #473 surface lessons in smart-search + diagnose tally (@efenex)
- #486 declare all Hermes plugin hooks (@honor2030)
- #500 rebuildIndex non-blocking on boot (@efenex)
- #504 batched embed in rebuildIndex (25h -> 3h) (@efenex)
- #491 cli skip onboarding without tty (@honor2030)

Upstream-installer revert:
- #546 drop --next workaround now that iii-hq/iii#1660 shipped

1067/1067 tests pass across 95 files.

coderabbitai Bot reviewed May 18, 2026

Comment thread src/functions/search.ts

fix(rebuild): per-item vi.add try/catch to preserve soft-fail

81c387a

Restores the pre-batch soft-fail behavior — a single failing vi.add() no longer aborts the entire rebuild batch. Failures are logged and counted toward fail, just like dimension mismatches above.

rohitg00 mentioned this pull request May 19, 2026

chore(release): v0.9.21 #550

Closed

rohitg00 merged commit 6c2a689 into rohitg00:main May 19, 2026
1 of 2 checks passed

rohitg00 mentioned this pull request May 19, 2026

chore(release): v0.9.21 #551

Merged

rohitg00 merged 2 commits into rohitg00:main from efenex:fix/batch-embed-rebuild
