
feat: add batch embeddings #50

Merged

0xbbjoker merged 1 commit into 1.x from feat/batch-embeddings on Dec 26, 2025

Conversation


@0xbbjoker 0xbbjoker commented Dec 26, 2025

add batch embeddings


Note

Batch embeddings pipeline

  • Adds batch embedding flow in document-processor.ts with EMBEDDING_BATCH_SIZE=100, shouldUseBatchEmbeddings, generateEmbeddingsBatch, and generateBatchEmbeddingsViaRuntime (uses runtime.useModel(ModelType.TEXT_EMBEDDING, { texts })), with automatic fallback to per-chunk embedding.
  • Embedding result handling standardized (zero-vector checks) and improved logging; failed chunks are pre-populated in results.
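The batching flow described above can be sketched roughly as follows. Only the names EMBEDDING_BATCH_SIZE and shouldUseBatchEmbeddings come from the PR; the threshold heuristic and the toBatches helper are assumptions for illustration, not the plugin's actual code.

```typescript
// Rough sketch of the batch routing described above (assumed logic).
const EMBEDDING_BATCH_SIZE = 100;

// Assumed heuristic: batching pays off whenever there is more than one chunk.
function shouldUseBatchEmbeddings(chunkCount: number): boolean {
  return chunkCount > 1;
}

// Split an array into batches of at most EMBEDDING_BATCH_SIZE items.
function toBatches<T>(items: T[], size: number = EMBEDDING_BATCH_SIZE): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Each batch would then be sent through `runtime.useModel(ModelType.TEXT_EMBEDDING, { texts })`, falling back to per-chunk embedding when the batch call fails.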

Rate limiting and config tweaks

  • Updates defaults in config.ts for batch mode: MAX_CONCURRENT_REQUESTS=100, REQUESTS_PER_MINUTE=500, TOKENS_PER_MINUTE=1000000, and clarifies comments; retains BATCH_DELAY_MS.
  • Simplifies client-side rate limiter (actual limits handled by API headers) and adds clearer wait logging.
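In config form, the new defaults would look roughly like this. The values come from the PR summary; the object shape is illustrative, and the BATCH_DELAY_MS value is not shown in the summary, so the number below is a placeholder assumption.

```typescript
// Batch-oriented rate-limit defaults from the PR summary (shape assumed).
const RATE_LIMIT_DEFAULTS = {
  MAX_CONCURRENT_REQUESTS: 100, // lowered from 150: fewer, larger batch requests
  REQUESTS_PER_MINUTE: 500,     // raised from 300
  TOKENS_PER_MINUTE: 1_000_000, // raised from 750000
  BATCH_DELAY_MS: 1000,         // retained by the PR; value here is a placeholder
};
```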

Misc

  • Version bump to 1.6.1 in package.json.

Written by Cursor Bugbot for commit cbf3e1a.

Summary by CodeRabbit

  • Chores

    • Released version 1.6.1.
  • Performance

    • Updated rate-limiting configuration thresholds to optimize batch processing workflows.
    • Implemented batch-based embedding generation to reduce API call overhead and improve throughput.



coderabbitai bot commented Dec 26, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

Version bumped to 1.6.1 with rate-limiting configuration adjusted for batch optimization (reduced concurrent requests, increased throughput caps). Document processor refactored to support batch-based embedding generation alongside individual fallbacks, with runtime integration for batch API calls.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Version & Dependency Management**<br>`package.json` | Version incremented from 1.6.0 to 1.6.1. |
| **Rate-Limiting Configuration**<br>`src/config.ts` | Default rate-limit values updated for batch-oriented workloads: MAX_CONCURRENT_REQUESTS (150 → 100), REQUESTS_PER_MINUTE (300 → 500), TOKENS_PER_MINUTE (750000 → 1000000). Comments revised to emphasize batch embedding optimization. |
| **Batch Embedding Feature & Refactoring**<br>`src/document-processor.ts` | Introduced batch embedding support: new EMBEDDING_BATCH_SIZE, shouldUseBatchEmbeddings, generateEmbeddingsBatch, and generateBatchEmbeddingsViaRuntime functions. Refactored generateEmbeddingsForChunks to route between batch and individual paths, add token estimation, and synchronize rate limiting. Added generateEmbeddingsIndividual as the fallback linear path. Updated rate-limiter commentary; enhanced error handling and logging for batch operations. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Processor as Document Processor
    participant Batcher as Batch Router
    participant Runtime as Runtime/Model Service
    participant Embedder as Embedding Generator

    Processor->>Batcher: generateEmbeddingsForChunks(chunks)

    rect rgb(200, 220, 255)
    Note over Batcher: Check Config
    Batcher->>Batcher: shouldUseBatchEmbeddings?
    end

    alt Batch Mode Enabled
        rect rgb(220, 240, 220)
        Batcher->>Runtime: generateEmbeddingsBatch(textArray)
        Runtime->>Runtime: useModel(batch embeddings path)
        Runtime-->>Batcher: embeddings[] or fallback
        Batcher->>Embedder: generateEmbeddingsIndividual(failed chunks)
        Embedder-->>Batcher: individual embeddings
        end
    else Batch Mode Disabled
        rect rgb(240, 220, 220)
        Batcher->>Embedder: generateEmbeddingsIndividual(chunks)
        Embedder->>Runtime: generateEmbeddingWithValidation per chunk
        Runtime-->>Embedder: embedding
        Embedder-->>Batcher: embeddings[]
        end
    end

    Batcher-->>Processor: results with tokens estimated & rate-limited
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Batches bundled, speeds align,
A hundred texts in one design,
From chunks to streams, embeddings flow,
Where batch and fallback softly go!


📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 4d9f87d and cbf3e1a.

📒 Files selected for processing (3)
  • package.json
  • src/config.ts
  • src/document-processor.ts


@0xbbjoker 0xbbjoker merged commit ab5660a into 1.x Dec 26, 2025
1 of 2 checks passed
```typescript
const chunk = batch[i];
const embedding = embeddings[i];

if (embedding && embedding.length > 0 && embedding[0] !== 0) {
```

Incorrect embedding validation rejects valid embeddings with zero first element

The validation check embedding[0] !== 0 incorrectly rejects valid embeddings where the first element happens to be zero. This is inconsistent with generateEmbeddingWithValidation which only checks !embedding || embedding.length === 0. A valid embedding vector can legitimately have zero as its first component. If the intent is to detect a true zero vector, all elements would need to be checked, not just the first one.
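A check in the spirit of this comment would inspect every element rather than only the first. A minimal sketch, with `isValidEmbedding` as a hypothetical helper name:

```typescript
// Hypothetical helper implementing the fix this comment suggests:
// reject missing/empty vectors and true zero vectors, but accept any
// embedding with at least one non-zero component.
function isValidEmbedding(embedding: number[] | undefined | null): boolean {
  if (!embedding || embedding.length === 0) return false;
  return embedding.some((v) => v !== 0); // only an all-zero vector fails
}
```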


```typescript
text: chunk.contextualizedText,
});
}
}
```

Missing rate limiting in batch embedding error fallback path

When batch embedding fails in generateEmbeddingsBatch, the fallback loop processes chunks individually by calling generateEmbeddingWithValidation without invoking the rateLimiter. The rate limiter was only called once for the entire batch before the try block. This could lead to API rate limit errors or service disruption when the batch fails and all individual requests fire rapidly.
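The fix this comment points toward is throttling each per-chunk request in the fallback loop, not just the batch as a whole. A minimal sketch, where `wait` stands in for the plugin's rate limiter and `embedOne` for generateEmbeddingWithValidation (both real signatures may differ):

```typescript
type Chunk = { contextualizedText: string };

// Sketch of a rate-limited fallback loop: every individual request
// passes through the limiter before firing.
async function fallbackWithRateLimit(
  chunks: Chunk[],
  wait: () => Promise<void>,
  embedOne: (text: string) => Promise<number[]>
): Promise<number[][]> {
  const results: number[][] = [];
  for (const chunk of chunks) {
    await wait(); // throttle each individual request, not just the batch
    results.push(await embedOne(chunk.contextualizedText));
  }
  return results;
}
```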


```typescript
}
return (result as { embedding: number[] })?.embedding || [];
})
);
```

Concurrent fallback requests bypass rate limiting entirely

In generateBatchEmbeddingsViaRuntime, when the handler returns a single embedding instead of batch results, the fallback uses Promise.all to process all texts concurrently without any rate limiting. This sends all individual embedding requests simultaneously, which could overwhelm the API and trigger rate limit errors, especially for large batches of up to 100 texts.
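One way to avoid the unbounded Promise.all is to process the fallback texts sequentially through the limiter, so at most one request is in flight at a time. A sketch with illustrative names:

```typescript
// Sequential alternative to the unbounded Promise.all fallback flagged
// above: each request awaits the rate limiter and the previous request
// before firing. `wait` and `embedOne` are stand-in names.
async function embedSequentially(
  texts: string[],
  wait: () => Promise<void>,
  embedOne: (text: string) => Promise<number[]>
): Promise<number[][]> {
  const embeddings: number[][] = [];
  for (const text of texts) {
    await wait();
    embeddings.push(await embedOne(text));
  }
  return embeddings;
}
```

A bounded-concurrency pool would also work; the key property is that fallback requests no longer all fire simultaneously.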


```typescript
// Fall back to individual processing for this batch
for (const chunk of batch) {
try {
const result = await generateEmbeddingWithValidation(runtime, chunk.contextualizedText);
```

Batch fallback lacks retry logic for rate limit errors

The fallback path in generateEmbeddingsBatch calls generateEmbeddingWithValidation directly without wrapping it in withRateLimitRetry. This is inconsistent with generateEmbeddingsIndividual which uses withRateLimitRetry to handle 429 errors with automatic retry. When batch processing fails and falls back to individual calls, any rate limit errors will immediately fail rather than being retried, leading to unnecessary chunk failures.
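For reference, a 429-aware retry wrapper in the spirit of the withRateLimitRetry this comment mentions might look like the sketch below. The plugin's actual implementation is not shown on this page, so the backoff parameters and error shape are assumptions.

```typescript
// Sketch of a 429-aware retry wrapper (assumed backoff and error shape).
async function withRateLimitRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 10
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const status = (err as { status?: number })?.status;
      // Retry only rate-limit errors, up to maxRetries extra attempts.
      if (status !== 429 || attempt >= maxRetries) throw err;
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
}
```

Wrapping the fallback's generateEmbeddingWithValidation calls in such a helper would make the batch and individual paths behave consistently under 429s.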


