
feat: optimize knowledge graph#35

Merged
0xbbjoker merged 9 commits into 1.x from feat/optimize-knowledge-graph on Oct 7, 2025

Conversation


0xbbjoker commented Jul 24, 2025

Summary

Optimizes knowledge graph performance and memory usage in the ElizaOS plugin-knowledge system.

Changes

  • Enhanced knowledge retrieval algorithms
  • Improved memory management for large knowledge bases
  • Optimized database operations and query processing
  • Better error handling and type safety

Impact

  • Faster knowledge retrieval and indexing
  • Reduced memory footprint
  • Better scalability for large knowledge graphs

Summary by CodeRabbit

  • New Features
    • Interactive canvas knowledge graph with pagination, per-node loading indicators, legend and "Load more"; node selection fetches and shows detailed memory or document views.
    • New lightweight API endpoints to list graph nodes, fetch node details, and expand a document into its fragment subgraph.
    • Service support for paginated memory retrieval and counting.
  • Refactor
    • Knowledge tab now uses an optimized graph and unified, scroll-into-view details panel with improved content rendering and caching.
  • Chores
    • Bumped package version to 1.5.11.

Note

Introduces a new lightweight graph API and an optimized frontend graph component with pagination, node detail fetching/caching, and UI refinements, plus service pagination support and exports.

  • Frontend
    • Optimized Graph: Replace MemoryGraph with MemoryGraphOptimized in src/frontend/ui/knowledge-tab.tsx; adds memoized node click handler, auto-scroll to details, improved document/fragment panels, and better content rendering.
    • New Component: Add src/frontend/ui/memory-graph-optimized.tsx with paginated loading, node detail fetch/caching, legend, and load-more.
    • Exports: Introduce src/frontend/ui/index.ts to export UI components.
  • Backend/API
    • Graph Endpoints: Add GET /graph/nodes, GET /graph/node/:nodeId, GET /graph/expand/:documentId in src/routes.ts for lightweight graph data and details.
  • Service
    • Pagination Support: Extend getMemories to accept offset; add countMemories for totals; minor cleanup in fragment ID creation.
  • Package
    • Bump version in package.json to 1.5.11.
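The pagination support described in the Service section above can be sketched with the arithmetic below. The option names (count, offset) follow the PR summary, but the exact KnowledgeService signatures are assumptions:

```typescript
// Sketch of the pagination arithmetic behind the new offset support.
// Not the actual plugin API; option names are taken from the summary.
function pageToOffset(page: number, limit: number): number {
  // Pages are 1-based, so page 1 starts at row 0.
  return (page - 1) * limit;
}

function totalPages(totalCount: number, limit: number): number {
  // At least one page, even for an empty table.
  return Math.max(1, Math.ceil(totalCount / limit));
}

// Example: page 3 at 20 nodes per page skips the first 40 rows.
console.log(pageToOffset(3, 20)); // 40
console.log(totalPages(45, 20)); // 3
```

A countMemories-style total combined with this arithmetic is what lets the frontend render a "Load More" control only while more pages remain.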

Written by Cursor Bugbot for commit c5feaf1.


coderabbitai bot commented Jul 24, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds a paginated, canvas-based MemoryGraphOptimized UI and an export barrel; updates KnowledgeTab to use the optimized graph with agent-scoped behavior and auto-scrolling details panel; introduces lightweight backend graph REST endpoints and document expansion; extends KnowledgeService with pagination/counting; bumps package version.

Changes

Cohort / File(s) Summary of changes
UI Export Barrel
src/frontend/ui/index.ts
New barrel file re-exporting UI components: MemoryGraph, MemoryGraphOptimized, KnowledgeTab, Badge, Button, Card, and Input.
Knowledge Tab Integration
src/frontend/ui/knowledge-tab.tsx
Replaced MemoryGraph with MemoryGraphOptimized; wired onNodeClick to a memoized handler; added agentId prop passthrough; introduced detailsPanelRef and auto-scroll effect for selected memory; differentiated document vs fragment views; refined layout, overflow, and action controls.
Optimized Graph Component
src/frontend/ui/memory-graph-optimized.tsx
New MemoryGraphOptimized React component: paginated node fetching, canvas force-directed rendering (react-force-graph-2d), per-node loading and detail caching, selection highlighting, "Load More" flow, resize handling, and onNodeClick(memory) callback; reads API base from window.ELIZA_CONFIG (default /api). Exports public interfaces: GraphNode, GraphLink, GraphData, PaginationInfo, MemoryGraphOptimizedProps.
Backend Graph Routes
src/routes.ts
Added handlers and GET routes under knowledgeRoutes: /graph/nodes (paginated lightweight nodes/links), /graph/node/:nodeId (node details), /graph/expand/:documentId (document→fragments subgraph); includes validation, pagination calculations, logging, and error handling. Also reduced fragment fetch limit in getKnowledgeChunksHandler for single-document requests.
KnowledgeService Pagination & Counting
src/service.ts
getMemories extended to accept and forward offset for pagination; new countMemories method added to return numeric counts scoped by table/room/unique.
Version Bump
package.json
Package version increment from 1.5.10 to 1.5.11.
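The window.ELIZA_CONFIG lookup mentioned for MemoryGraphOptimized can be sketched like this; the config shape is an assumption based on the walkthrough, not the plugin's actual type:

```typescript
// Sketch only: resolve the API base from a global ELIZA_CONFIG when
// present, falling back to '/api'. The exact config shape used by the
// plugin frontend is an assumption.
interface ElizaConfig {
  apiBase?: string;
}

function resolveApiBase(globalLike: { ELIZA_CONFIG?: ElizaConfig }): string {
  return globalLike.ELIZA_CONFIG?.apiBase ?? '/api';
}

// In the browser this would be called as resolveApiBase(window).
console.log(resolveApiBase({})); // '/api'
console.log(resolveApiBase({ ELIZA_CONFIG: { apiBase: '/custom' } })); // '/custom'
```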

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor U as User
  participant KT as KnowledgeTab
  participant MGO as MemoryGraphOptimized (UI)
  participant API as Backend /graph
  participant SVC as KnowledgeService

  U->>KT: Open Knowledge tab (agentId)
  KT->>MGO: Mount with agentId & handlers
  MGO->>API: GET /graph/nodes?page=1&limit=20&agentId=...
  API->>SVC: query paginated documents/fragments
  SVC-->>API: lightweight nodes/links + pagination
  API-->>MGO: GraphData {nodes, links, pagination}
  MGO-->>KT: Render graph

  U->>MGO: Click node
  alt node details cached?
    MGO-->>KT: onNodeClick(cachedMemory)
  else
    MGO->>API: GET /graph/node/:nodeId?agentId=...
    API->>SVC: fetch document or fragment details
    SVC-->>API: memory details
    API-->>MGO: Memory details
    MGO-->>KT: onNodeClick(memory)
  end

  alt More pages
    U->>MGO: Click "Load More"
    MGO->>API: GET /graph/nodes?page=N&agentId=...
    API-->>MGO: Next page nodes/links
    MGO-->>KT: Append nodes/links
  end

  opt Expand document subgraph
    U->>MGO: Request expand document
    MGO->>API: GET /graph/expand/:documentId?agentId=...
    API->>SVC: fetch fragments for document
    SVC-->>API: fragments subgraph
    API-->>MGO: Document↔fragments subgraph
    MGO-->>KT: Update graph
  end

  rect rgb(255,245,230)
  Note over MGO,API: Error handling paths (network/500/404)
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

I hop through nodes where memories glow,
I fetch, I cache, then forward a click—
Documents open, fragments show,
A scroll, a view, the details stick.
Carrots for code, the graph grows quick. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33%, below the required threshold of 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title “feat: optimize knowledge graph” concisely summarizes the primary change: the knowledge graph implementation has been optimized for performance and memory usage, in line with the pull request's main objectives.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between c5feaf1 and bc3e1ed.

📒 Files selected for processing (1)
  • src/routes.ts (3 hunks)
🔇 Additional comments (1)
src/routes.ts (1)

1270-1285: LGTM: Route registrations are properly structured.

The new graph route registrations follow the existing pattern and use appropriate HTTP methods and path structures.



0xbbjoker marked this pull request as ready for review on July 24, 2025 at 14:38

coderabbitai bot left a comment


Actionable comments posted: 10

🧹 Nitpick comments (4)
src/frontend/ui/memory-graph-optimized.tsx (4)

3-4: Consider adding type definitions instead of @ts-ignore

Using @ts-ignore bypasses TypeScript's type checking. Consider installing type definitions if available.

First, check if types are available:

npm search @types/react-force-graph-2d

If not available, create a local type definition file types/react-force-graph-2d.d.ts:

declare module 'react-force-graph-2d' {
  import { FC, RefObject } from 'react';
  
  export interface NodeObject {
    id?: string | number;
    x?: number;
    y?: number;
    [key: string]: any;
  }
  
  export interface LinkObject {
    source?: string | number | NodeObject;
    target?: string | number | NodeObject;
    [key: string]: any;
  }
  
  export interface ForceGraphMethods {
    zoomToFit: (duration?: number) => void;
    // Add other methods as needed
  }
  
  // Add other type definitions as needed
  const ForceGraph2D: FC<any>;
  export default ForceGraph2D;
}

199-207: Remove debug logging for production

Debug console.log statements should be removed or gated behind a debug flag.

- // Debug effect to monitor graph data changes
- useEffect(() => {
-   console.log('Graph data changed:', {
-     nodeCount: graphData.nodes.length,
-     linkCount: graphData.links.length,
-     nodes: graphData.nodes,
-     links: graphData.links
-   });
- }, [graphData]);
+ // Debug effect to monitor graph data changes
+ useEffect(() => {
+   if (process.env.NODE_ENV === 'development') {
+     console.log('Graph data changed:', {
+       nodeCount: graphData.nodes.length,
+       linkCount: graphData.links.length,
+     });
+   }
+ }, [graphData]);

277-277: Consider preserving graph state on data updates

Using a key that changes on every data update causes the graph to fully re-render, losing zoom and pan state. This might frustrate users when loading more nodes.

Instead of using a key, you could update the graph data without re-mounting:

- <ForceGraph2D
-   key={`graph-${graphVersion}`}
+ <ForceGraph2D

Then handle data updates more gracefully in the ForceGraph2D component. If the library doesn't support this well, consider storing and restoring the zoom/pan state.


188-188: Remove debug console.log

- console.log('Node clicked:', node);
+ // Node click handled
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between e3db460 and 856c34b.

📒 Files selected for processing (4)
  • src/frontend/ui/index.ts (1 hunks)
  • src/frontend/ui/knowledge-tab.tsx (8 hunks)
  • src/frontend/ui/memory-graph-optimized.tsx (1 hunks)
  • src/routes.ts (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/frontend/ui/knowledge-tab.tsx (2)
src/frontend/ui/index.ts (1)
  • MemoryGraphOptimized (3-3)
src/frontend/ui/memory-graph-optimized.tsx (1)
  • MemoryGraphOptimized (37-339)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Cursor Bugbot
🔇 Additional comments (5)
src/frontend/ui/index.ts (1)

1-9: LGTM!

Clean module structure that provides a centralized export point for UI components.

src/frontend/ui/knowledge-tab.tsx (2)

1419-1431: LGTM! Proper integration of MemoryGraphOptimized

The integration correctly:

  • Passes the required agentId prop
  • Maintains the existing callback handlers
  • Preserves the document expansion functionality

1050-1113: Nice CSS improvements for better UX

The scroll styling and layout improvements enhance readability and user experience:

  • Proper flex layout with overflow handling
  • Custom scrollbar styling for better appearance
  • Correct text wrapping for long content
src/frontend/ui/memory-graph-optimized.tsx (2)

37-339: Well-implemented graph visualization component

This is a well-structured React component with:

  • Proper state management and error handling
  • Efficient caching of node details
  • Good use of React hooks and memoization
  • Clean separation of concerns
  • Responsive design with resize handling

82-84: API endpoint path is correctly mapped

The fetch call to /api/graph/nodes aligns with the route defined in src/routes.ts (path: '/graph/nodes') and the plugin’s mounted base (/api). No changes are needed.

Comment on lines +896 to +900
const allFragments = await service.getMemories({
tableName: 'knowledge',
roomId: agentId,
count: 100000,
});

⚠️ Potential issue

Critical performance issue: Fetching 100,000 fragments is dangerous

Loading 100,000 fragments into memory could easily crash the server or cause severe performance degradation. This is a critical scalability issue.

The fragments should be fetched only for the paginated documents:

- // Get all fragments for the paginated documents
- const allFragments = await service.getMemories({
-   tableName: 'knowledge',
-   roomId: agentId,
-   count: 100000,
- });
+ // Get fragments only for the paginated documents
+ const documentIds = paginatedDocuments.map(doc => doc.id);
+ const allFragments = await service.getMemoriesByDocumentIds({
+   tableName: 'knowledge',
+   roomId: agentId,
+   documentIds: documentIds,
+ });

Note: This requires adding a method to fetch fragments by document IDs.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/routes.ts around lines 896 to 900, fetching 100,000 fragments at once
causes a critical performance and scalability issue. To fix this, implement a
method in the service to fetch fragments by a limited set of document IDs
corresponding to the current page. Then modify the code to first fetch only the
paginated documents, extract their IDs, and fetch fragments only for those IDs
instead of all fragments.

Comment on lines +905 to +913
if (allFragments.length > 0) {
logger.debug(
`[Graph API] 📊 Sample fragment metadata:`,
allFragments.slice(0, 3).map((f) => ({
id: f.id,
metadata: f.metadata,
}))
);
}

⚠️ Potential issue

Remove debug logging that could expose sensitive data

Debug logging of fragment metadata in production could expose sensitive information and should be removed or gated behind a debug flag.

- logger.debug(
-   `[Graph API] 📊 Sample fragment metadata:`,
-   allFragments.slice(0, 3).map((f) => ({
-     id: f.id,
-     metadata: f.metadata,
-   }))
- );
+ if (runtime.getSetting('DEBUG_MODE') === 'true') {
+   logger.debug(
+     `[Graph API] 📊 Sample fragment metadata:`,
+     allFragments.slice(0, 3).map((f) => ({
+       id: f.id,
+       metadataKeys: Object.keys(f.metadata || {}),
+     }))
+   );
+ }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (allFragments.length > 0) {
logger.debug(
`[Graph API] 📊 Sample fragment metadata:`,
allFragments.slice(0, 3).map((f) => ({
id: f.id,
metadata: f.metadata,
}))
);
}
if (allFragments.length > 0) {
if (runtime.getSetting('DEBUG_MODE') === 'true') {
logger.debug(
`[Graph API] 📊 Sample fragment metadata:`,
allFragments.slice(0, 3).map((f) => ({
id: f.id,
metadataKeys: Object.keys(f.metadata || {}),
}))
);
}
}
🤖 Prompt for AI Agents
In src/routes.ts around lines 905 to 913, the debug logging outputs fragment
metadata which may contain sensitive data. Remove this debug logging or wrap it
in a conditional check that only enables it when a debug flag or environment
variable is set, ensuring sensitive information is not exposed in production
logs.

Comment on lines +970 to +972
if (!nodeId || nodeId.length < 36) {
return sendError(res, 400, 'INVALID_ID', 'Invalid node ID format');
}

🛠️ Refactor suggestion

Improve UUID validation

The current validation only checks length, not the proper UUID format.

- if (!nodeId || nodeId.length < 36) {
-   return sendError(res, 400, 'INVALID_ID', 'Invalid node ID format');
- }
+ const uuidRegex = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
+ if (!nodeId || !uuidRegex.test(nodeId)) {
+   return sendError(res, 400, 'INVALID_ID', 'Invalid node ID format');
+ }
📝 Committable suggestion


Suggested change
if (!nodeId || nodeId.length < 36) {
return sendError(res, 400, 'INVALID_ID', 'Invalid node ID format');
}
const uuidRegex = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
if (!nodeId || !uuidRegex.test(nodeId)) {
return sendError(res, 400, 'INVALID_ID', 'Invalid node ID format');
}
🤖 Prompt for AI Agents
In src/routes.ts around lines 970 to 972, the current validation for nodeId only
checks if it exists and its length is at least 36 characters, which is
insufficient for proper UUID validation. Replace this length check with a regex
or a UUID validation function that verifies the nodeId matches the standard UUID
format (e.g., using a regex like
/^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i).
Update the if condition to use this validation and keep the error response the
same if validation fails.

src/routes.ts (outdated diff)
Comment on lines +977 to +996
// Try to find in documents first
const documents = await service.getMemories({
tableName: 'documents',
roomId: agentId, // Filter by agent
count: 10000,
});

const document = documents.find((doc) => doc.id === nodeId);

if (document) {
// Return document details without embedding
sendSuccess(res, {
id: document.id,
type: 'document',
content: document.content,
metadata: document.metadata,
createdAt: document.createdAt,
});
return;
}

⚠️ Potential issue

Performance issue: Fetching all documents to find one

Fetching 10,000 documents to find a single one by ID is extremely inefficient.

Use a direct lookup method instead:

- // Try to find in documents first
- const documents = await service.getMemories({
-   tableName: 'documents',
-   roomId: agentId,
-   count: 10000,
- });
- 
- const document = documents.find((doc) => doc.id === nodeId);
+ // Direct lookup in documents table
+ const document = await service.getMemoryById(nodeId, {
+   tableName: 'documents',
+   roomId: agentId,
+ });

Note: This requires the service to expose a filtered getMemoryById method.

📝 Committable suggestion


Suggested change
// Try to find in documents first
const documents = await service.getMemories({
tableName: 'documents',
roomId: agentId, // Filter by agent
count: 10000,
});
const document = documents.find((doc) => doc.id === nodeId);
if (document) {
// Return document details without embedding
sendSuccess(res, {
id: document.id,
type: 'document',
content: document.content,
metadata: document.metadata,
createdAt: document.createdAt,
});
return;
}
// Direct lookup in documents table
const document = await service.getMemoryById(nodeId, {
tableName: 'documents',
roomId: agentId,
});
if (document) {
// Return document details without embedding
sendSuccess(res, {
id: document.id,
type: 'document',
content: document.content,
metadata: document.metadata,
createdAt: document.createdAt,
});
return;
}
🤖 Prompt for AI Agents
In src/routes.ts around lines 977 to 996, the current code fetches up to 10,000
documents and then searches for one by ID, which is inefficient. To fix this,
implement and use a direct lookup method like getMemoryById in the service that
fetches a single document by its ID and agentId. Replace the bulk fetch and find
logic with a call to this new method to retrieve the document directly.

src/routes.ts (outdated diff)
Comment on lines +999 to +1015
const fragments = await service.getMemories({
tableName: 'knowledge',
roomId: agentId, // Filter by agent
count: 100000, // High limit
});

const fragment = fragments.find((frag) => frag.id === nodeId);

if (fragment) {
sendSuccess(res, {
id: fragment.id,
type: 'fragment',
content: fragment.content,
metadata: fragment.metadata,
createdAt: fragment.createdAt,
});
return;

⚠️ Potential issue

Performance issue: Fetching all fragments to find one

Fetching 100,000 fragments to find a single one is a critical performance issue.

- // If not found in documents, try fragments
- const fragments = await service.getMemories({
-   tableName: 'knowledge',
-   roomId: agentId,
-   count: 100000,
- });
- 
- const fragment = fragments.find((frag) => frag.id === nodeId);
+ // Direct lookup in fragments table
+ const fragment = await service.getMemoryById(nodeId, {
+   tableName: 'knowledge',
+   roomId: agentId,
+ });

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/routes.ts around lines 999 to 1015, the code fetches up to 100,000
fragments to find a single fragment by ID, causing a major performance issue.
Modify the service.getMemories call or use a more specific query method to fetch
only the fragment with the matching nodeId instead of retrieving all fragments.
This will reduce data transfer and improve response time significantly.

Comment on lines +1038 to +1039
try {
logger.debug(`[Graph API] 📊 Expanding document: ${documentId}, agent: ${agentId}`);

🛠️ Refactor suggestion

Add document existence check

The function should verify that the document exists before trying to expand it.

After the documentId validation, add:

// Verify document exists
const documentExists = await service.getMemoryById(documentId, {
  tableName: 'documents',
  roomId: agentId,
});

if (!documentExists) {
  return sendError(res, 404, 'DOCUMENT_NOT_FOUND', `Document ${documentId} not found`);
}
🤖 Prompt for AI Agents
In src/routes.ts around lines 1038 to 1039, the code attempts to expand a
document without verifying its existence. After validating documentId, add an
asynchronous check using service.getMemoryById with the documentId, specifying
tableName as 'documents' and roomId as agentId. If the document does not exist,
immediately return a 404 error response using sendError with the code
'DOCUMENT_NOT_FOUND' and a message indicating the missing document.

Comment on lines +1042 to +1046
const allFragments = await service.getMemories({
tableName: 'knowledge',
roomId: agentId, // Filter by agent
count: 100000, // High limit
});

⚠️ Potential issue

Critical performance issue: Fetching 100,000 fragments

This approach doesn't scale and could crash the server with large knowledge bases.

Fetch only fragments for the specific document:

- // Get all fragments for this document
- const allFragments = await service.getMemories({
-   tableName: 'knowledge',
-   roomId: agentId,
-   count: 100000,
- });
+ // Get fragments for this specific document
+ const documentFragments = await service.getMemoriesByMetadata({
+   tableName: 'knowledge',
+   roomId: agentId,
+   metadata: { documentId: documentId },
+ });

Note: This requires adding a method to query by metadata fields.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/routes.ts around lines 1042 to 1046, fetching 100,000 fragments at once
causes a critical performance issue and risks crashing the server. Modify the
code to fetch only the fragments related to the specific document by adding a
query method that filters by metadata fields, such as document ID or relevant
tags, instead of fetching all fragments. Implement this new method in the
service and replace the current call with it to ensure efficient and scalable
data retrieval.

src/routes.ts (outdated diff)
Comment on lines +1051 to +1074
if (allFragments.length > 0) {
logger.debug(`[Graph API] 📊 Sample fragment metadata:`, allFragments[0].metadata);

// Log all unique metadata types found
const uniqueTypes = new Set(allFragments.map((f) => (f.metadata as any)?.type));
logger.debug(
`[Graph API] 📊 Unique metadata types found in knowledge table:`,
Array.from(uniqueTypes)
);

// Log metadata of all fragments for this specific document
const relevantFragments = allFragments.filter((fragment) => {
const metadata = fragment.metadata as any;
const hasDocumentId = metadata?.documentId === documentId;
if (hasDocumentId) {
logger.debug(`[Graph API] 📊 Fragment ${fragment.id} metadata:`, metadata);
}
return hasDocumentId;
});

logger.debug(
`[Graph API] 📊 Fragments with matching documentId: ${relevantFragments.length}`
);
}

🛠️ Refactor suggestion

Remove or gate debug logging

Extensive debug logging of metadata could expose sensitive information in production.

- logger.debug(`[Graph API] 📊 Total fragments in knowledge table: ${allFragments.length}`);
- 
- // Log a sample fragment to see its structure
- if (allFragments.length > 0) {
-   logger.debug(`[Graph API] 📊 Sample fragment metadata:`, allFragments[0].metadata);
-   // ... more debug logs
- }
+ if (runtime.getSetting('DEBUG_MODE') === 'true') {
+   logger.debug(`[Graph API] 📊 Fragment count: ${allFragments.length}`);
+ }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/routes.ts around lines 1051 to 1074, the extensive debug logging of
fragment metadata may expose sensitive information in production. To fix this,
wrap all debug logging statements inside a conditional check that ensures they
only run in a development or debug environment, such as checking a NODE_ENV
variable or a dedicated debug flag. This will prevent sensitive data from being
logged in production while preserving the logs for debugging purposes.

0xbbjoker force-pushed the feat/optimize-knowledge-graph branch from 856c34b to ef7b4db on July 28, 2025 at 20:40


coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (16)
src/routes.ts (15)

991-993: Improve UUID validation with proper format check

The current validation only checks length, not proper UUID format.

Apply this diff to use proper UUID validation:

-if (!nodeId || nodeId.length < 36) {
+const uuidRegex = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
+if (!nodeId || !uuidRegex.test(nodeId)) {
   return sendError(res, 400, 'INVALID_ID', 'Invalid node ID format');
 }

1055-1057: Improve UUID validation with proper format check

The current validation only checks length, not proper UUID format.

Apply this diff:

-if (!documentId || documentId.length < 36) {
+const uuidRegex = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
+if (!documentId || !uuidRegex.test(documentId)) {
   return sendError(res, 400, 'INVALID_ID', 'Invalid document ID format');
 }

1059-1061: Add document existence verification

The function should verify that the document exists before attempting to expand it.

After validating documentId, add:

// Verify document exists
const documentExists = await service.getMemoryById(documentId, {
  tableName: 'documents',
  roomId: agentId,
});

if (!documentExists) {
  return sendError(res, 404, 'DOCUMENT_NOT_FOUND', `Document ${documentId} not found`);
}

1072-1095: Gate debug logging to prevent data exposure

Extensive debug logging may expose sensitive metadata in production.

Wrap all debug logging in a conditional:

-logger.debug(`[Graph API] 📊 Total fragments in knowledge table: ${allFragments.length}`);
-
-if (allFragments.length > 0) {
-  logger.debug(`[Graph API] 📊 Sample fragment metadata:`, allFragments[0].metadata);
-  // ... more debug logs
-}
+if (runtime.getSetting('DEBUG_MODE') === 'true') {
+  logger.debug(`[Graph API] 📊 Fragment count: ${allFragments.length}`);
+}

880-880: Add UUID validation for agentId

The agentId parameter should be validated to ensure it's a proper UUID format before using it in queries.

Apply this diff to add validation:

 const agentId = (req.query.agentId as UUID) || runtime.agentId;
+
+// Validate UUID format
+const uuidRegex = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
+if (!uuidRegex.test(agentId)) {
+  return sendError(res, 400, 'INVALID_AGENT_ID', 'Invalid agent ID format');
+}

891-895: Critical: Replace in-memory pagination with database-level pagination

Fetching 10,000 documents into memory to paginate is inefficient and negates pagination benefits. This can cause memory issues with large datasets.

Consider implementing database-level pagination by modifying the service to accept offset and limit parameters and adding a separate count query.
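One way to push the pagination into the database, as this comment suggests, is a bounded page query plus a separate count query. The SQL, table/column names, and the Db interface below are illustrative assumptions, not the plugin's actual adapter API or schema:

```typescript
// Illustrative sketch: database-level pagination with LIMIT/OFFSET and a
// separate COUNT query, instead of loading 10,000 rows and slicing in JS.
interface Row {
  id: string;
}

interface Db {
  all(sql: string, params: unknown[]): Promise<Row[]>;
  get(sql: string, params: unknown[]): Promise<{ n: number }>;
}

async function getDocumentsPage(db: Db, roomId: string, page: number, limit: number) {
  const offset = (page - 1) * limit;
  // One bounded query returning just this page of documents...
  const rows = await db.all(
    'SELECT id FROM memories WHERE type = ? AND room_id = ? ORDER BY created_at DESC LIMIT ? OFFSET ?',
    ['documents', roomId, limit, offset]
  );
  // ...and a separate count query for the pagination metadata.
  const { n: total } = await db.get(
    'SELECT COUNT(*) AS n FROM memories WHERE type = ? AND room_id = ?',
    ['documents', roomId]
  );
  return { rows, total, totalPages: Math.ceil(total / limit) };
}
```

Memory use then stays proportional to the page size rather than the table size, whatever the actual adapter looks like.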


917-921: Critical: Fetching 100,000 fragments risks server crash

Loading 100,000 fragments into memory is dangerous and could crash the server with large knowledge bases.

Fetch only fragments for the paginated documents by filtering at the database level using document IDs.
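Filtering fragments to the current page's documents could look like the sketch below, which builds a parameterized IN clause over the page's document IDs. The table name, columns, and json_extract usage are assumptions about the underlying store:

```typescript
// Sketch: construct SQL that returns only fragments belonging to the
// paginated documents, rather than every fragment in the table.
function fragmentsByDocumentIdsSql(documentIds: string[]): { sql: string; params: string[] } {
  if (documentIds.length === 0) {
    // No documents on this page: avoid an invalid empty IN () clause.
    return { sql: 'SELECT id, content, metadata FROM memories WHERE 1 = 0', params: [] };
  }
  const placeholders = documentIds.map(() => '?').join(', ');
  return {
    sql:
      'SELECT id, content, metadata FROM memories ' +
      "WHERE type = 'knowledge' " +
      `AND json_extract(metadata, '$.documentId') IN (${placeholders})`,
    params: documentIds,
  };
}
```

Since pages are small (20 documents by default), the IN list stays well under typical placeholder limits.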


926-934: Remove debug logging to prevent data exposure

Debug logging of fragment metadata may expose sensitive information in production.

Apply this diff to gate debug logging:

-logger.debug(
-  `[Graph API] 📊 Sample fragment metadata:`,
-  allFragments.slice(0, 3).map((f) => ({
-    id: f.id,
-    metadata: f.metadata,
-  }))
-);
+if (runtime.getSetting('DEBUG_MODE') === 'true') {
+  logger.debug(
+    `[Graph API] 📊 Sample fragment metadata:`,
+    allFragments.slice(0, 3).map((f) => ({
+      id: f.id,
+      metadataKeys: Object.keys(f.metadata || {}),
+    }))
+  );
+}

911-911: Fix null safety: Add runtime checks for memory IDs

Using non-null assertions (doc.id!, frag.id!) without runtime validation can lead to undefined values being used.

Apply this diff to add validation:

 paginatedDocuments.forEach((doc) => {
+  if (!doc.id) {
+    logger.warn('[Graph API] Document missing ID, skipping');
+    return;
+  }
-  nodes.push({ id: doc.id!, type: 'document' });
+  nodes.push({ id: doc.id, type: 'document' });
 });
 docFragments.forEach((frag) => {
+  if (!frag.id || !doc.id) {
+    logger.warn('[Graph API] Fragment or document missing ID, skipping');
+    return;
+  }
-  nodes.push({ id: frag.id!, type: 'fragment' });
-  links.push({ source: doc.id!, target: frag.id! });
+  nodes.push({ id: frag.id, type: 'fragment' });
+  links.push({ source: doc.id, target: frag.id });
 });

(The guards narrow the types, so the non-null assertions become unnecessary.)

Also applies to: 955-956


941-942: Fix type safety: Validate metadata.type before toLowerCase()

Calling .toLowerCase() on metadata?.type without checking if it's a string will cause runtime errors if metadata.type is not a string.

Apply this diff to add type checking:

 const isFragment =
-  metadata?.type?.toLowerCase() === 'fragment' ||
+  (typeof metadata?.type === 'string' && metadata.type.toLowerCase() === 'fragment') ||
   metadata?.type === MemoryType.FRAGMENT ||
   (!metadata?.type && metadata?.documentId);

999-1005: Critical: Use direct lookup instead of fetching all documents

Fetching 10,000 documents to find one by ID is extremely inefficient.

Implement a direct lookup method in the service:

-const documents = await service.getMemories({
-  tableName: 'documents',
-  roomId: agentId,
-  count: 10000,
-});
-
-const document = documents.find((doc) => doc.id === nodeId);
+const document = await service.getMemoryById(nodeId, {
+  tableName: 'documents',
+  roomId: agentId,
+});

Note: This requires the service to expose a getMemoryById method with filtering.


1020-1026: Critical: Use direct lookup instead of fetching all fragments

Fetching 100,000 fragments to find one is a severe performance issue.

Use a direct lookup:

-const fragments = await service.getMemories({
-  tableName: 'knowledge',
-  roomId: agentId,
-  count: 100000,
-});
-
-const fragment = fragments.find((frag) => frag.id === nodeId);
+const fragment = await service.getMemoryById(nodeId, {
+  tableName: 'knowledge',
+  roomId: agentId,
+});

1063-1067: Critical: Fetch only relevant fragments, not all 100,000

Fetching all fragments is inefficient and risks server crashes.

Modify the service to query fragments by metadata:

-const allFragments = await service.getMemories({
-  tableName: 'knowledge',
-  roomId: agentId,
-  count: 100000,
-});
+const documentFragments = await service.getMemoriesByMetadata({
+  tableName: 'knowledge',
+  roomId: agentId,
+  metadata: { documentId: documentId },
+});

Note: This requires adding a getMemoriesByMetadata method to the service.


1110-1110: Fix null safety: Validate fragment IDs before use

Non-null assertions without runtime checks can cause undefined values to be used.

Apply this diff:

-const nodes = documentFragments.map((frag) => ({
-  id: frag.id!,
-  type: 'fragment' as const,
-}));
+const validFragments = documentFragments.filter((frag) => frag.id);
+
+const nodes = validFragments.map((frag) => ({
+  id: frag.id as UUID,
+  type: 'fragment' as const,
+}));

-const links = documentFragments.map((frag) => ({
-  source: documentId,
-  target: frag.id!,
-}));
+const links = validFragments.map((frag) => ({
+  source: documentId,
+  target: frag.id as UUID,
+}));

Filtering before mapping avoids emitting placeholder objects (note that .filter(Boolean) on an array of objects removes nothing, since objects are always truthy) and keeps nodes and links consistent with each other.

Also applies to: 1116-1116


1100-1102: Fix type safety: Validate metadata.type is a string

Calling .toLowerCase() without type checking will cause runtime errors if metadata.type is not a string.

Apply this diff:

 const isFragment =
-  metadata?.type?.toLowerCase() === 'fragment' ||
+  (typeof metadata?.type === 'string' && metadata.type.toLowerCase() === 'fragment') ||
   metadata?.type === MemoryType.FRAGMENT ||
   (!metadata?.type && metadata?.documentId);
src/frontend/ui/knowledge-tab.tsx (1)

1445-1457: Verify: Graph component may ignore document selection state

MemoryGraphOptimized fetches data independently and doesn't receive selectedDocumentForGraph as a prop. When a document is selected (line 1452), the parent component's useKnowledgeChunks hook responds to load fragments, but the graph component continues showing its own independently-fetched data.

This may be intentional if the new design is that the graph always shows an overview. However, if the graph should zoom to a specific document's fragments when selectedDocumentForGraph is set, the component needs to be modified to accept and respond to this prop.

Review the intended behavior:

  • Should the graph update when a document is focused?
  • If yes, pass selectedDocumentForGraph to MemoryGraphOptimized and have it fetch/filter accordingly
  • If no, consider removing the unused selectedDocumentForGraph state from KnowledgeTab
🧹 Nitpick comments (5)
src/frontend/ui/memory-graph-optimized.tsx (5)

3-4: Note: @ts-ignore suppresses type checking for react-force-graph-2d

The @ts-ignore directive suppresses TypeScript errors for the react-force-graph-2d import. While this may be necessary if the library lacks proper type definitions, be aware it prevents compile-time type checking for this import.

If type definitions become available for react-force-graph-2d, consider installing @types/react-force-graph-2d and removing the @ts-ignore.


111-134: Add defensive checks for API response structure

The code destructures result.data without verifying its structure, which could cause runtime errors if the API response doesn't match expectations.

Apply this diff to add validation:

 if (result.success && result.data) {
-  const { nodes, links, pagination } = result.data;
+  const { nodes = [], links = [], pagination = null } = result.data;
+  
+  if (!Array.isArray(nodes) || !Array.isArray(links)) {
+    throw new Error('Invalid response format: nodes and links must be arrays');
+  }

   const graphNodes: GraphNode[] = nodes.map((node: any) => ({
     id: node.id,
     type: node.type,
     loading: false,
     val: node.type === 'document' ? 8 : 4,
   }));

176-194: Add validation for node details response structure

The code constructs a Memory object from result.data without validating the response structure, which could cause runtime errors if required fields are missing.

Apply this diff to add validation:

 if (result.success && result.data) {
+  const { id, content, metadata, createdAt } = result.data;
+  
+  if (!id) {
+    throw new Error('Invalid response: missing node id');
+  }
+
   const memory: Memory = {
-    id: result.data.id,
-    content: result.data.content,
-    metadata: result.data.metadata,
-    createdAt: result.data.createdAt,
+    id,
+    content: content || { text: '' },
+    metadata: metadata || {},
+    createdAt: createdAt || Date.now(),
     entityId: agentId,
     roomId: agentId,
   };

220-227: Remove debug console.log from production code

The debug effect that logs graph data changes on every update will clutter production console logs and should be removed or gated behind a debug flag.

Apply this diff to remove or gate the debug logging:

-// Debug effect to monitor graph data changes
-useEffect(() => {
-  console.log('Graph data changed:', {
-    nodeCount: graphData.nodes.length,
-    linkCount: graphData.links.length,
-    nodes: graphData.nodes,
-    links: graphData.links
-  });
-}, [graphData]);
+// Debug effect to monitor graph data changes (development only).
+// Keep the condition inside the effect body: wrapping a hook call in an
+// `if` statement would violate the Rules of Hooks.
+useEffect(() => {
+  if (process.env.NODE_ENV !== 'development') return;
+  console.log('Graph data changed:', {
+    nodeCount: graphData.nodes.length,
+    linkCount: graphData.links.length,
+  });
+}, [graphData]);

296-298: Consider performance impact of key-based re-rendering

Using key={`graph-${graphVersion}`} forces a complete remount of the ForceGraph2D component whenever graphVersion changes. This can hurt performance if graphVersion updates frequently (e.g., line 131 increments it on every page load).

Consider using the graph's graphData prop reactivity instead of forcing remounts:

-<ForceGraph2D
-  key={`graph-${graphVersion}`}
-  ref={graphRef as any}
+<ForceGraph2D
+  ref={graphRef as any}
   graphData={graphData}

ForceGraph2D should automatically update when graphData changes. If you're experiencing issues with incremental updates, investigate the graph's update mechanism rather than forcing full remounts.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between ef7b4db and decc153.

📒 Files selected for processing (3)
  • src/frontend/ui/knowledge-tab.tsx (7 hunks)
  • src/frontend/ui/memory-graph-optimized.tsx (1 hunks)
  • src/routes.ts (2 hunks)
🔇 Additional comments (3)
src/frontend/ui/knowledge-tab.tsx (2)

1076-1150: LGTM! Layout improvements enhance scrolling behavior

The CSS changes to MemoryDetails improve flex behavior, overflow handling, and text wrapping. The additions of min-h-0, proper scrollbar styling, and break-words prevent layout issues in flex containers.


1645-1649: LGTM! Improved text rendering in document viewer

The addition of overflow-y-auto and break-words ensures proper text wrapping and scrolling in the document content viewer.

src/frontend/ui/memory-graph-optimized.tsx (1)

55-358: LGTM! Well-structured graph visualization component

The component implements proper:

  • Pagination and incremental loading
  • Cache-first data fetching
  • Loading and error states
  • Proper React hooks usage
  • Responsive dimension handling

The architecture separates concerns well: data fetching, caching, UI state, and rendering.

@0xbbjoker 0xbbjoker force-pushed the feat/optimize-knowledge-graph branch from ac800d2 to c529d7b Compare October 5, 2025 09:30
cursor[bot]

This comment was marked as outdated.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (5)
src/frontend/ui/memory-graph-optimized.tsx (1)

69-352: Keep ForceGraph mounted between page loads

Incrementing graphVersion and feeding it into the component key forces a remount on every “Load More”, wiping zoom/pan (exactly the UX regression previously reported). You can rely on graphData updates alone—ForceGraph mutates in place just fine.

-    const [graphVersion, setGraphVersion] = useState(0);
+    const [, setGraphVersion] = useState(0); // remove if no longer needed
@@
-                if (page === 1) {
-                    setGraphData({ nodes: graphNodes, links });
-                    setGraphVersion(1); // Reset version for initial load
-                } else {
-                    setGraphData(prev => ({
-                        nodes: [...prev.nodes, ...graphNodes],
-                        links: [...prev.links, ...links]
-                    }));
-                    setGraphVersion(prev => prev + 1); // Increment for additions
-                }
+                if (page === 1) {
+                    setGraphData({ nodes: graphNodes, links });
+                } else {
+                    setGraphData(prev => ({
+                        nodes: [...prev.nodes, ...graphNodes],
+                        links: [...prev.links, ...links]
+                    }));
+                }
@@
-            <ForceGraph2D
-                key={`graph-${graphVersion}`}
+            <ForceGraph2D

(You can remove the graphVersion state entirely once it’s no longer used.)

src/routes.ts (4)

880-885: Validate agentId before querying

agentId is accepted straight from req.query and fed into getMemories. A crafted value (empty string, random text, or someone else’s ID) will still be sent to the service, inviting both noisy queries and cross-tenant probing. This gap was already flagged earlier. Please reject non-UUID values up front.

-    const agentId = (req.query.agentId as UUID) || runtime.agentId;
+    const agentId = (req.query.agentId as UUID) || runtime.agentId;
+    const uuidRegex =
+      /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
+    if (!uuidRegex.test(agentId)) {
+      return sendError(res, 400, 'INVALID_AGENT_ID', 'Invalid agent ID format');
+    }

927-939: Remove fragment metadata dumps from default logs

Dumping fragment metadata (often containing raw text) on every call risks leaking sensitive content into production logs. Gate the log behind an explicit debug flag or drop it entirely.

-      if (allFragments.length > 0) {
-        logger.debug(
-          `[Graph API] 📊 Sample fragment metadata: ${JSON.stringify(
-            allFragments.slice(0, 3).map((f) => ({
-              id: f.id,
-              metadata: f.metadata,
-            }))
-          )}`
-        );
-      }
+      if (
+        allFragments.length > 0 &&
+        runtime.getSetting('DEBUG_MODE') === 'true'
+      ) {
+        logger.debug(
+          `[Graph API] 📊 Sample fragment metadata keys: ${JSON.stringify(
+            allFragments.slice(0, 3).map((f) => ({
+              id: f.id,
+              metadataKeys: Object.keys(f.metadata || {}),
+            }))
+          )}`
+        );
+      }

1011-1013: Tighten nodeId validation

A length check isn’t enough—non-UUID strings still pass and flow to the DB. Use a proper UUID validation (regex or helper) and reject anything else, as previously requested.

-  if (!nodeId || nodeId.length < 36) {
+  const uuidRegex =
+    /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
+  if (!nodeId || !uuidRegex.test(nodeId)) {
     return sendError(res, 400, 'INVALID_ID', 'Invalid node ID format');
   }

1116-1188: Avoid reading 100 000 fragments to expand one document

expandDocumentGraphHandler grabs count: 100000 fragments for the whole agent and then filters by documentId. That’s the same scalability problem as above and can generate massive responses under load. Please query only the fragments tied to the requested document (e.g. add a getMemoriesByMetadata({ documentId }) helper) and, ideally, return 404 when the document itself doesn’t exist (per the previous review).
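A sketch of the suggested flow: direct document lookup scoped to the agent, a 404 when the document is absent, then a targeted fragment query. getMemoriesByMetadata is the hypothetical helper named above; the maps below stand in for real database queries:

```typescript
// Hypothetical expand flow: look up the one document, 404 if missing,
// then fetch only its fragments. The in-memory collections stand in
// for the documents and knowledge tables.
type Doc = { id: string; roomId: string };
type Frag = { id: string; documentId: string };

const docs = new Map<string, Doc>([['doc-1', { id: 'doc-1', roomId: 'agent-1' }]]);
const frags: Frag[] = [
  { id: 'f1', documentId: 'doc-1' },
  { id: 'f2', documentId: 'doc-1' },
];

function expandDocument(documentId: string, agentId: string) {
  const doc = docs.get(documentId); // real version: direct ID lookup in the DB
  if (!doc || doc.roomId !== agentId) {
    return { status: 404, body: { error: 'DOCUMENT_NOT_FOUND' } };
  }
  // Real version: service.getMemoriesByMetadata({ documentId }) or equivalent
  const documentFragments = frags.filter((f) => f.documentId === documentId);
  return {
    status: 200,
    body: {
      nodes: documentFragments.map((f) => ({ id: f.id, type: 'fragment' })),
      links: documentFragments.map((f) => ({ source: documentId, target: f.id })),
    },
  };
}

console.log(expandDocument('doc-1', 'agent-1').status); // 200
console.log(expandDocument('missing', 'agent-1').status); // 404
```

The handler touches only the requested document's fragments, and missing or foreign documents fail fast with a 404 instead of an empty subgraph.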

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between decc153 and c529d7b.

📒 Files selected for processing (4)
  • package.json (1 hunks)
  • src/frontend/ui/knowledge-tab.tsx (7 hunks)
  • src/frontend/ui/memory-graph-optimized.tsx (1 hunks)
  • src/routes.ts (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • package.json

Comment on lines +921 to +978
const allFragments = await service.getMemories({
  tableName: 'knowledge',
  roomId: agentId,
  count: 100000,
});

logger.debug(`[Graph API] 📊 Total fragments found: ${allFragments.length}`);

// Debug: Log the first few fragment metadata to understand structure
if (allFragments.length > 0) {
  logger.debug(
    `[Graph API] 📊 Sample fragment metadata: ${JSON.stringify(
      allFragments.slice(0, 3).map((f) => ({
        id: f.id,
        metadata: f.metadata,
      }))
    )}`
  );
}

// For each document, find its fragments and add them
paginatedDocuments.forEach((doc) => {
  // Skip documents without IDs
  if (!doc.id) {
    return;
  }

  const docFragments = allFragments.filter((fragment) => {
    const metadata = fragment.metadata as any;
    // Check both documentId match and that it's a fragment type (case-insensitive)
    // Safely handle type checking with string validation
    const typeString = typeof metadata?.type === 'string' ? metadata.type : null;
    const isFragment =
      (typeString && typeString.toLowerCase() === 'fragment') ||
      metadata?.type === MemoryType.FRAGMENT ||
      // If no type but has documentId, assume it's a fragment
      (!metadata?.type && metadata?.documentId);
    return metadata?.documentId === doc.id && isFragment;
  });

  if (docFragments.length > 0) {
    logger.debug(`[Graph API] 📊 Document ${doc.id} has ${docFragments.length} fragments`);
  }

  // Add fragment nodes and links (filter out any without IDs)
  docFragments.forEach((frag) => {
    // TypeScript guard: doc.id is already checked above, but reassign for type safety
    const docId = doc.id;
    if (!frag.id || !docId) {
      logger.warn(
        `[Graph API] ⚠️ Skipping fragment without ID for document ${docId || 'unknown'}`
      );
      return;
    }
    nodes.push({ id: frag.id, type: 'fragment' });
    links.push({ source: docId, target: frag.id });
  });
});


⚠️ Potential issue | 🟠 Major

Stop loading 100 000 fragments per paginated request

Every call to /graph/nodes reads count: 100000 fragments for the entire agent and then filters in memory. For bigger knowledge bases this is O(N·page) work, spikes memory, and defeats the “lightweight graph” goal. Please fetch only the fragments that belong to the documents in the current page (e.g. gather their IDs and issue a targeted query or extend the service with getFragmentsByDocumentIds). Once the service exists, the loop collapses to just the fragments you need instead of hauling 100 000 rows on every request.

cursor[bot]

This comment was marked as outdated.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
src/frontend/ui/memory-graph-optimized.tsx (1)

292-292: Graph remount on pagination loses zoom/pan state.

Changing the key prop forces a full remount of ForceGraph2D, discarding user zoom and pan positions each time new nodes are loaded. This degrades UX significantly during pagination.

Remove the key prop to preserve interaction state:

-            key={`graph-${graphVersion}`}

If you need to trigger re-initialization for other reasons, use the imperative graphRef.current.refresh() method instead of remounting.

🧹 Nitpick comments (3)
src/frontend/ui/memory-graph-optimized.tsx (3)

66-66: Unbounded cache growth for node details.

The nodeDetails Map grows indefinitely as users click nodes. For large knowledge graphs with hundreds or thousands of nodes, this can consume significant memory over time.

Consider implementing an LRU cache or size limit:

const MAX_CACHE_SIZE = 100;

// In fetchNodeDetails, after caching:
setNodeDetails(prev => {
    const newMap = new Map(prev).set(nodeId, memory);
    
    // Evict oldest entries if cache exceeds limit
    if (newMap.size > MAX_CACHE_SIZE) {
        const firstKey = newMap.keys().next().value;
        newMap.delete(firstKey);
    }
    
    return newMap;
});

Alternatively, use a library like lru-cache for more sophisticated eviction policies.


318-344: Explicitly set nodeCanvasObjectMode for custom rendering.

When using nodeCanvasObject, the nodeCanvasObjectMode prop controls how the custom drawing integrates with default rendering. Without it, the default mode may overlay or underlay your custom draw, which can cause unexpected visual results.

Based on learnings

Set the mode explicitly to 'replace' for full control:

             nodeCanvasObject={(node: GraphNode, ctx, globalScale) => {
                 const size = (node.val || 4);
                 const isSelected = selectedMemoryId === node.id;
                 const isLoading = loadingNodes.has(node.id);

                 // Draw node circle
                 ctx.beginPath();
                 ctx.arc(node.x!, node.y!, size, 0, 2 * Math.PI);
                 ctx.fillStyle = getNodeColor(node);
                 ctx.fill();

                 // Border
                 ctx.strokeStyle = isSelected ? 'hsl(var(--primary))' : 'hsl(var(--border))';
                 ctx.lineWidth = isSelected ? 2 : 1;
                 ctx.stroke();

                 // Loading indicator
                 if (isLoading) {
                     ctx.beginPath();
                     ctx.arc(node.x!, node.y!, size * 1.5, 0, Math.PI * 2 * 0.3);
                     ctx.strokeStyle = 'hsl(var(--primary))';
                     ctx.lineWidth = 2;
                     ctx.stroke();
                 }
             }}
+            nodeCanvasObjectMode={() => 'replace'}

211-211: Consider removing nodeDetails from dependencies.

Including the nodeDetails Map in useCallback dependencies causes the callback to be recreated every time the cache changes. Since nodeDetails is only used for synchronous reads (cache lookup at line 154), you can use a ref instead to avoid unnecessary recreations.

Use a ref for the cache:

-const [nodeDetails, setNodeDetails] = useState<Map<UUID, Memory>>(new Map());
+const nodeDetailsRef = useRef<Map<UUID, Memory>>(new Map());

 const fetchNodeDetails = useCallback(async (nodeId: UUID) => {
     // Check cache first
-    if (nodeDetails.has(nodeId)) {
-        const memory = nodeDetails.get(nodeId)!;
+    if (nodeDetailsRef.current.has(nodeId)) {
+        const memory = nodeDetailsRef.current.get(nodeId)!;
         onNodeClick(memory);
         return;
     }

     // ... fetch logic ...

     // Cache the details
-    setNodeDetails(prev => new Map(prev).set(nodeId, memory));
+    nodeDetailsRef.current.set(nodeId, memory);

     // Trigger the callback
     onNodeClick(memory);
-}, [agentId, nodeDetails, onNodeClick]);
+}, [agentId, onNodeClick]);

This reduces the callback's recreation frequency and slightly improves performance.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between c529d7b and 152d1c6.

📒 Files selected for processing (1)
  • src/frontend/ui/memory-graph-optimized.tsx (1 hunks)
🔇 Additional comments (1)
src/frontend/ui/memory-graph-optimized.tsx (1)

187-189: Memory fields are correctly assigned from API response.

The previous review flagged these fields as incorrectly hardcoded, but the current code correctly uses result.data.entityId, result.data.roomId, and result.data.agentId.

Comment on lines +89 to +142
const loadGraphNodes = useCallback(async (page = 1) => {
  setIsLoading(true);
  setError(null);

  try {
    const params = new URLSearchParams();
    params.append('agentId', agentId);
    params.append('page', page.toString());
    params.append('limit', '20');
    // Don't specify type to get documents with fragments

    const apiBase = getApiBase();
    const response = await fetch(
      `${apiBase}/graph/nodes?${params.toString()}`
    );

    if (!response.ok) {
      throw new Error(`Failed to load graph nodes: ${response.statusText}`);
    }

    const result = await response.json();

    if (result.success && result.data) {
      const { nodes, links, pagination } = result.data;

      // Convert to graph nodes with initial properties
      const graphNodes: GraphNode[] = nodes.map((node: any) => ({
        id: node.id,
        type: node.type,
        loading: false,
        val: node.type === 'document' ? 8 : 4,
      }));

      if (page === 1) {
        setGraphData({ nodes: graphNodes, links });
        setGraphVersion(1); // Reset version for initial load
      } else {
        // Append to existing nodes
        setGraphData(prev => ({
          nodes: [...prev.nodes, ...graphNodes],
          links: [...prev.links, ...links]
        }));
        setGraphVersion(prev => prev + 1); // Increment for additions
      }

      setPagination(pagination);
    }
  } catch (err) {
    console.error('Error loading graph nodes:', err);
    setError(err instanceof Error ? err.message : 'Failed to load graph');
  } finally {
    setIsLoading(false);
  }
}, [agentId]);


⚠️ Potential issue | 🟠 Major

Request not cancelled on unmount or navigation.

The fetch call lacks an AbortController, so it can't be cancelled if the component unmounts or agentId changes mid-request. This can cause memory leaks, state updates on unmounted components, and race conditions if multiple requests overlap.

Add cancellation:

-const loadGraphNodes = useCallback(async (page = 1) => {
+const loadGraphNodes = useCallback(async (page = 1, signal?: AbortSignal) => {
     setIsLoading(true);
     setError(null);

     try {
         const params = new URLSearchParams();
         params.append('agentId', agentId);
         params.append('page', page.toString());
         params.append('limit', '20');

         const apiBase = getApiBase();
-        const response = await fetch(
-            `${apiBase}/graph/nodes?${params.toString()}`
-        );
+        const response = await fetch(
+            `${apiBase}/graph/nodes?${params.toString()}`,
+            { signal }
+        );

         if (!response.ok) {
             throw new Error(`Failed to load graph nodes: ${response.statusText}`);
         }

         const result = await response.json();
         // ... rest of logic
     } catch (err) {
+        if (err instanceof Error && err.name === 'AbortError') {
+            return; // Request was cancelled, ignore
+        }
         console.error('Error loading graph nodes:', err);
         setError(err instanceof Error ? err.message : 'Failed to load graph');
     } finally {
         setIsLoading(false);
     }
 }, [agentId]);

Then create the controller in the effect so its cleanup can abort the request (an async function cannot return a useEffect cleanup directly, since it returns a Promise):

 useEffect(() => {
-    loadGraphNodes(1);
+    const controller = new AbortController();
+    loadGraphNodes(1, controller.signal);
+    return () => controller.abort();
 }, [loadGraphNodes]);

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/frontend/ui/memory-graph-optimized.tsx around lines 89 to 142, the fetch
in loadGraphNodes isn't cancellable which can cause state updates after unmount
or when agentId changes; create an AbortController at the start of
loadGraphNodes, pass its signal into fetch, and on component cleanup (or before
starting a new load when agentId changes) call controller.abort(); in the catch
block ignore errors caused by abort (check for error.name === 'AbortError' or
similar) to avoid setting error/loading state for cancelled requests, and ensure
any setState calls only run when the request wasn't aborted.

Comment on lines +152 to +211
const fetchNodeDetails = useCallback(async (nodeId: UUID) => {
  // Check cache first
  if (nodeDetails.has(nodeId)) {
    const memory = nodeDetails.get(nodeId)!;
    onNodeClick(memory);
    return;
  }

  // Mark as loading
  setLoadingNodes(prev => new Set(prev).add(nodeId));

  try {
    const params = new URLSearchParams();
    params.append('agentId', agentId);

    const apiBase = getApiBase();
    const url = `${apiBase}/graph/node/${nodeId}?${params.toString()}`;

    const response = await fetch(url);

    if (!response.ok) {
      const errorText = await response.text();
      console.error('API error response:', errorText);
      throw new Error(`Failed to fetch node details: ${response.statusText}`);
    }

    const result = await response.json();

    if (result.success && result.data) {
      // Convert to Memory format with all required fields
      const memory: Memory = {
        id: result.data.id,
        content: result.data.content,
        metadata: result.data.metadata,
        createdAt: result.data.createdAt,
        entityId: result.data.entityId,
        roomId: result.data.roomId,
        agentId: result.data.agentId,
      };

      // Cache the details
      setNodeDetails(prev => new Map(prev).set(nodeId, memory));

      // Trigger the callback
      onNodeClick(memory);
    } else {
      console.error('Invalid API response format:', result);
      throw new Error('Invalid response format from API');
    }
  } catch (err) {
    console.error('Error fetching node details:', err);
    alert(`Failed to load node details: ${err instanceof Error ? err.message : 'Unknown error'}`);
  } finally {
    setLoadingNodes(prev => {
      const newSet = new Set(prev);
      newSet.delete(nodeId);
      return newSet;
    });
  }
}, [agentId, nodeDetails, onNodeClick]);


⚠️ Potential issue | 🟠 Major

Request not cancelled on unmount or rapid clicks.

Similar to loadGraphNodes, fetchNodeDetails lacks an AbortController. If a user clicks multiple nodes rapidly or navigates away, pending requests can update state after unmount or cause race conditions where responses arrive out of order.

Apply the same pattern as suggested for loadGraphNodes, keeping the controller in a ref so a newer click or unmount can abort the in-flight request (a controller that is created but never aborted accomplishes nothing):

+const detailsAbortRef = useRef<AbortController | null>(null);
+
 const fetchNodeDetails = useCallback(async (nodeId: UUID) => {
     if (nodeDetails.has(nodeId)) {
         const memory = nodeDetails.get(nodeId)!;
         onNodeClick(memory);
         return;
     }

+    // Abort any previous detail request before starting a new one
+    detailsAbortRef.current?.abort();
+    const abortController = new AbortController();
+    detailsAbortRef.current = abortController;
     setLoadingNodes(prev => new Set(prev).add(nodeId));

     try {
         const params = new URLSearchParams();
         params.append('agentId', agentId);

         const apiBase = getApiBase();
         const url = `${apiBase}/graph/node/${nodeId}?${params.toString()}`;

-        const response = await fetch(url);
+        const response = await fetch(url, { signal: abortController.signal });

         if (!response.ok) {
             const errorText = await response.text();
             console.error('API error response:', errorText);
             throw new Error(`Failed to fetch node details: ${response.statusText}`);
         }

         const result = await response.json();
         // ... rest of logic
     } catch (err) {
+        if (err instanceof Error && err.name === 'AbortError') {
+            return; // Request was cancelled, ignore
+        }
         console.error('Error fetching node details:', err);
         alert(`Failed to load node details: ${err instanceof Error ? err.message : 'Unknown error'}`);
     } finally {
         setLoadingNodes(prev => {
             const newSet = new Set(prev);
             newSet.delete(nodeId);
             return newSet;
         });
     }
 }, [agentId, nodeDetails, onNodeClick]);

Also abort the ref's controller in an unmount effect, e.g. return () => detailsAbortRef.current?.abort().

Committable suggestion skipped: line range outside the PR's diff.

cursor[bot]

This comment was marked as outdated.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (7)
src/frontend/ui/memory-graph-optimized.tsx (3)

89-155: Missing AbortController causes memory leaks and race conditions.

The fetch call lacks an AbortController, preventing cancellation when the component unmounts or agentId changes. This can cause memory leaks, state updates on unmounted components, and race conditions if multiple requests overlap.

Note: The link deduplication logic (lines 126-144) is good and addresses potential duplicate link issues. However, the graphVersion state updates (lines 124, 144) that are used as keys on the ForceGraph2D component (line 305) will cause the graph to remount on every load, losing user interaction state like zoom and pan position.


165-224: Missing AbortController causes memory leaks and race conditions.

Similar to loadGraphNodes, fetchNodeDetails lacks an AbortController. If a user clicks multiple nodes rapidly or navigates away, pending requests can update state after unmount or cause race conditions where responses arrive out of order.


304-305: Graph remount on pagination loses user interaction state.

The graphVersion key causes the ForceGraph2D component to fully remount when loading more nodes, resetting zoom and pan position. This negatively impacts user experience during pagination.

Consider using React Force Graph's built-in graph data mutation methods to add nodes incrementally without remounting, or preserve and restore zoom/pan state across remounts.

src/routes.ts (4)

925-987: Critical: Fetching 100,000 fragments per request defeats pagination.

Every call to /graph/nodes fetches up to 100,000 fragments (line 928) and filters them in memory. For large knowledge bases, this causes severe performance issues, spikes memory, and defeats the "lightweight graph" optimization goal.

Fetch only the fragments belonging to the paginated documents. Consider adding a method like getFragmentsByDocumentIds(documentIds, roomId) to the service, then:

- const allFragments = await service.getMemories({
-   tableName: 'knowledge',
-   roomId: agentId,
-   count: 100000,
- });
+ // Get document IDs from the paginated documents
+ const documentIds = paginatedDocuments.map(doc => doc.id).filter(Boolean);
+ 
+ // Fetch only fragments for these documents
+ const allFragments = await service.getFragmentsByDocumentIds({
+   documentIds,
+   roomId: agentId,
+ });

This would reduce the query from O(all fragments) to O(fragments for current page).
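The contract of the suggested (hypothetical) getFragmentsByDocumentIds method can be illustrated with a pure helper; FragmentLike and the function name are illustrative, not plugin APIs:

```typescript
interface FragmentLike {
  id: string;
  roomId: string;
  metadata?: { documentId?: string };
}

// Keep only fragments whose parent document is on the current page,
// scoped to the requesting agent's room. A Set gives O(1) membership checks.
function filterFragmentsByDocumentIds(
  fragments: FragmentLike[],
  documentIds: string[],
  roomId: string
): FragmentLike[] {
  const ids = new Set(documentIds);
  return fragments.filter((f) => {
    const docId = f.metadata?.documentId;
    return f.roomId === roomId && docId !== undefined && ids.has(docId);
  });
}
```

In practice the win comes from pushing this predicate into the database query (a `WHERE document_id IN (...)` clause) rather than filtering in memory; the sketch above only shows the intended contract.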


1022-1096: Critical: Cross-agent data leak via fallback lookup.

The handler fetches all documents/fragments without roomId filtering (lines 1023-1026, 1062-1065), then falls back to unfiltered lookup if the scoped search fails (lines 1034-1042, 1073-1081). This allows any caller to retrieve another agent's knowledge by guessing a UUID—a critical security vulnerability.

Remove the fallback logic and scope all queries to the requesting agent:

- const allDocuments = await service.getMemories({
-   tableName: 'documents',
-   count: 10000,
- });
-
- let document = allDocuments.find((doc) => doc.id === nodeId && doc.roomId === agentId);
-
- if (!document) {
-   logger.debug(`[Graph API] 📊 Document not found with roomId filter, trying without filter`);
-   document = allDocuments.find((doc) => doc.id === nodeId);
-   if (document) {
-     logger.warn(`[Graph API] ⚠️ Document ${nodeId} found but has different roomId: ${document.roomId} vs ${agentId}`);
-   }
- }
+ const allDocuments = await service.getMemories({
+   tableName: 'documents',
+   roomId: agentId,
+   count: 1000,
+ });
+
+ const document = allDocuments.find((doc) => doc.id === nodeId);

Apply the same pattern to fragment lookup. Additionally, use a direct ID lookup method instead of fetching thousands of records to find one.
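The no-fallback rule reduces to a one-line predicate; findScopedMemory and MemoryLike are illustrative names, not APIs from the plugin:

```typescript
interface MemoryLike {
  id: string;
  roomId: string;
}

// Scoped lookup: a node ID only resolves within the requesting agent's room.
// There is deliberately no unscoped fallback, so guessing another agent's
// UUID yields undefined instead of leaking the record.
function findScopedMemory(
  memories: MemoryLike[],
  nodeId: string,
  agentId: string
): MemoryLike | undefined {
  return memories.find((m) => m.id === nodeId && m.roomId === agentId);
}
```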


1123-1127: Critical: Fetching 100,000 fragments is a scalability issue.

The handler fetches up to 100,000 fragments on every expand request, which is inefficient and risks crashing the server with large knowledge bases.

Add a service method to fetch fragments by specific document ID and metadata, avoiding the full table scan:

- const allFragments = await service.getMemories({
-   tableName: 'knowledge',
-   roomId: agentId,
-   count: 100000,
- });
+ const documentFragments = await service.getFragmentsByDocumentId({
+   documentId,
+   roomId: agentId,
+ });

1132-1158: Debug logging may expose sensitive data.

Extensive debug logging of fragment metadata (lines 1132-1158) could expose sensitive information in production logs. This logging should be removed or gated behind a debug flag.

Gate the logging with an environment check:

- if (allFragments.length > 0 && process.env.NODE_ENV !== 'production') {
+ if (process.env.DEBUG_GRAPH_API === 'true') {
    logger.debug(
      `[Graph API] 📊 Sample fragment metadata: ${JSON.stringify(allFragments[0].metadata)}`
    );
    // ... rest of debug logging
  }
🧹 Nitpick comments (1)
src/routes.ts (1)

1119-1121: Add document existence check.

The handler should verify the document exists and belongs to the requesting agent before attempting to expand it. This prevents unnecessary work and provides better error messages.

After parameter validation, add:

// Verify document exists and belongs to agent
const document = await runtime.getMemoryById(documentId);
if (!document) {
  return sendError(res, 404, 'DOCUMENT_NOT_FOUND', `Document ${documentId} not found`);
}
if (document.roomId !== agentId) {
  return sendError(res, 403, 'FORBIDDEN', 'Document belongs to a different agent');
}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 152d1c6 and d598a7d.

📒 Files selected for processing (3)
  • src/frontend/ui/memory-graph-optimized.tsx (1 hunks)
  • src/routes.ts (2 hunks)
  • src/service.ts (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Cursor Bugbot
🔇 Additional comments (10)
src/service.ts (2)

873-884: LGTM! Pagination support added correctly.

The offset parameter is properly typed as optional and propagated to the runtime's getMemories call via the spread operator. This enables database-level pagination for the new graph endpoints without breaking existing callers.


886-901: LGTM! Count method implemented correctly.

The countMemories method properly wraps runtime.countMemories with sensible defaults (roomId defaults to agentId, unique defaults to false). This provides the pagination metadata needed by the graph endpoints.

src/frontend/ui/memory-graph-optimized.tsx (4)

1-86: LGTM! Component setup is well-structured.

The component properly imports dependencies, declares types, and initializes state. The resize effect correctly cleans up the event listener on unmount. The getApiBase helper provides a sensible fallback to '/api'.


226-271: LGTM! Rendering logic is sound.

The node click handler, color computation, and loading/error state rendering are all implemented correctly. The useEffect properly initializes the graph data on mount.

Note: The useEffect at line 233-235 does not clean up the loadGraphNodes request, which relates to the earlier AbortController issue.


273-366: Graph rendering configuration is well-implemented.

The legend, pagination button, and ForceGraph2D configuration are properly set up. Custom node rendering, interaction handlers, and the zoomToFit on engine stop provide good UX.

Note: The graphVersion remount issue (line 305) is a separate concern flagged above.


195-203: No action needed: API returns required fields. The getGraphNodeDetailsHandler’s sendSuccess payload includes entityId, roomId, and agentId for both documents and fragments.

src/routes.ts (4)

891-907: LGTM! Database-level pagination implemented correctly.

The handler now uses countMemories for total count and getMemories with offset/limit for paginated document retrieval. This is a significant improvement over in-memory pagination.


956-962: LGTM! Safe type checking prevents runtime errors.

The code safely validates that metadata.type is a string before calling toLowerCase(), preventing runtime errors if the type field is non-string or undefined.


1163-1170: LGTM! Safe type checking implemented.

The code safely validates metadata.type is a string before calling toLowerCase(), preventing runtime errors with non-string types.


1264-1279: LGTM! Route definitions are correctly configured.

The three new graph endpoints are properly defined with appropriate HTTP methods (GET) and path parameters. The route-to-handler mappings are correct.

@0xbbjoker
Author

Depends on elizaOS/eliza#6032

cursor[bot]

This comment was marked as outdated.

metadata?.type === MemoryType.FRAGMENT ||
// If no type but has documentId, assume it's a fragment
(!metadata?.type && metadata?.documentId);
return metadata?.documentId === doc.id && isFragment;

Bug: Missing Enum Import Causes ReferenceError

The MemoryType.FRAGMENT enum is used in getGraphNodesHandler and expandDocumentGraphHandler for fragment type checking. However, MemoryType is not imported in this file, which will cause a ReferenceError when the code attempts to access it.
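Beyond adding the missing import (presumably from '@elizaos/core', where the rest of the plugin gets its types), the quoted check can be isolated into a predicate. The enum below is a local stand-in with assumed string values, used only so the sketch is self-contained:

```typescript
// Stand-in for the real MemoryType enum; the actual values in
// '@elizaos/core' may differ.
enum MemoryType {
  FRAGMENT = 'fragment',
  DOCUMENT = 'document',
}

interface FragmentMetadata {
  type?: string;
  documentId?: string;
}

// Mirrors the quoted check: an explicit FRAGMENT type, or no type at all
// but a documentId pointing at the parent document.
function isFragmentOf(
  metadata: FragmentMetadata | undefined,
  docId: string
): boolean {
  const looksLikeFragment =
    metadata?.type === MemoryType.FRAGMENT ||
    (!metadata?.type && Boolean(metadata?.documentId));
  return metadata?.documentId === docId && looksLikeFragment;
}
```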



// Calculate pagination
const totalPages = Math.ceil(totalDocuments / limit);
const hasMore = page < totalPages;

Bug: Incorrect Parameter Handling in countMemories

The countMemories method is called with an object containing tableName, roomId, and unique. If countMemories expects these parameters positionally, this will result in incorrect document counts and pagination.
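Whichever call shape countMemories actually uses, the downstream page math is only correct when totalDocuments is a real count. A sanity check of the quoted arithmetic (the helper name is illustrative; page is 1-based):

```typescript
// Pagination math as quoted from the handler.
function paginate(totalDocuments: number, page: number, limit: number) {
  const totalPages = Math.ceil(totalDocuments / limit);
  const hasMore = page < totalPages;
  return { totalPages, hasMore };
}
```

If countMemories silently ignored an object argument and returned 0, this would report totalPages of 0 and hasMore of false, hiding every document past the first page.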


@0xbbjoker 0xbbjoker merged commit 33680de into 1.x Oct 7, 2025
2 checks passed
@0xbbjoker 0xbbjoker deleted the feat/optimize-knowledge-graph branch October 7, 2025 07:06