
feat: support openai embedding and add llm base_urls in client #182

Merged

vishnurk6247 merged 4 commits into develop from fix_migrating_changes on Dec 14, 2025

Conversation

rootflo-hardik (Contributor) commented Dec 14, 2025

  • Fixed the base_url defaults issue in the frontend
  • Switched the RAG service to use OpenAI embeddings

Summary by CodeRabbit

  • New Features

    • Added support for configurable base URLs for LLM providers with intelligent defaults.
  • Improvements

    • Enhanced error handling and logging for datasource operations.
    • Improved API response formatting for knowledge base retrieval.
    • Updated embedding model selection with environment-based configuration.


coderabbitai Bot commented Dec 14, 2025

Walkthrough

The PR adds baseUrl support to LLM provider configuration and integrates it into dialog/edit components. Server-side embedding services now authenticate via OpenAI API key and support configurable embedding models. Response formatting is refactored in the knowledge base controller. Minor type and initialization improvements across client components.

Changes

Cohort / File(s) — Change Summary

TypeScript type refinement
wavefront/client/src/components/DashboardLayout.tsx
Updated the timeoutRef type from number | null to ReturnType<typeof setTimeout> | null; removed the ts-ignore directive.

LLM provider configuration
wavefront/client/src/config/llm-providers.ts, wavefront/server/background_jobs/rag_ingestion/rag_ingestion/env.py
Added an optional baseUrl?: string field to the ProviderConfig interface and populated entries for openai, anthropic, gemini, and groq. Created a getDefaultBaseUrl() utility function. Added environment variables OPENAI_API_KEY and EMBEDDING_MODEL with defaults.

LLM dialog & edit components
wavefront/client/src/pages/apps/[appId]/llm-inference/CreateLLMInferenceDialog.tsx, wavefront/client/src/pages/apps/[appId]/llm-inference/[configId].tsx
Integrated getDefaultBaseUrl() for automatic base URL initialization on provider type changes. Extended supportsBaseUrl to include openai, anthropic, gemini, and groq. Updated form reset and type-change flows to set default base URLs.

Embedding service integration
wavefront/server/background_jobs/rag_ingestion/rag_ingestion/embeddings/embed.py, wavefront/server/modules/knowledge_base_module/knowledge_base_module/embeddings/embed.py
Added OpenAI API key authentication via the Authorization header. Switched the embedding model from BAAI/bge-m3 to text-embedding-3-small. Updated URL handling and response parsing. Added timeout and error status checking.

Knowledge base controller response formatting
wavefront/server/modules/knowledge_base_module/knowledge_base_module/controllers/knowledge_base_controller.py
Refactored the get_knowledge_bases_id() endpoint: removed the response_model decorator parameter, injected a ResponseFormatter dependency, changed the return type from KnowledgeBaseResponse to JSONResponse, and used response_formatter.buildSuccessResponse() for response construction.

Minor variable & logic updates
wavefront/client/src/pages/apps/[appId]/agents/[id].tsx, wavefront/client/src/pages/apps/[appId]/datasources/[datasourceId].tsx
Changed finalTextInput from let to const. Restructured the conditional logic for YAML validation and added error logging on datasource deletion failure.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Knowledge base controller signature change (knowledge_base_controller.py): Verify that ResponseFormatter is correctly injected, response structure matches expected API contract, and no downstream consumers break due to the return type change from KnowledgeBaseResponse to JSONResponse.
  • Embedding service authentication (both embed.py files): Confirm OPENAI_API_KEY is properly sourced and that the Authorization header format aligns with OpenAI API requirements. Verify model compatibility with text-embedding-3-small.
  • Dialog/form initialization logic (CreateLLMInferenceDialog.tsx, [configId].tsx): Ensure default base URL initialization works correctly across all provider type changes, especially during form reset and type switching.

Possibly related PRs

Poem

🐰 A baseUrl blooms, embedding models sing,
Providers now speak with default voices bright,
OpenAI keys unlock the door,
Forms remember their settings right,
One hop closer to inference delight! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning — Docstring coverage is 33.33%, below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve coverage.

✅ Passed checks (2 passed)
  • Description Check ✅ Passed — Check skipped: CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed — The title accurately describes the main changes: adding OpenAI embedding support and implementing LLM base_url defaults in the client, which aligns with both the PR objectives and the file-level changes.
✨ Finishing touches
  • 📝 Generate docstrings

🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix_migrating_changes


coderabbitai Bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
wavefront/server/background_jobs/rag_ingestion/rag_ingestion/embeddings/embed.py (1)

20-20: Critical bug: Only the first batch is processed and batching is broken.

batches[0] is the first batch (a list of strings), so iterating over it processes individual strings instead of batches. This:

  1. Ignores all batches after the first
  2. Defeats the purpose of batching by calling bgm_embedding per-string
-        embeddings = [self.bgm_embedding(batch) for batch in batches[0]]
+        embeddings = []
+        for batch in batches:
+            batch_embeddings = self.bgm_embedding(batch)
+            embeddings.extend(batch_embeddings)

Note: This fix also requires updating bgm_embedding to return all embeddings from the batch (see related comment on line 49).
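For illustration, a minimal sketch of the corrected batching flow as a standalone function (the name embed_in_batches and the batch_size parameter are illustrative; it assumes the per-batch embedder returns one embedding per input string, per the related comment on lines 48-49):

def embed_in_batches(embed_batch, texts, batch_size=32):
    # Split the inputs into fixed-size batches.
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]
    embeddings = []
    for batch in batches:
        # embed_batch is assumed to return a list of embeddings,
        # one per input string in the batch.
        embeddings.extend(embed_batch(batch))
    return embeddings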

wavefront/server/modules/knowledge_base_module/knowledge_base_module/embeddings/embed.py (1)

19-39: Critical bug: incorrect embedding assignment.

Line 34 assigns the entire flat_embeddings list to every KnowledgeBaseEmbeddingObject, but it should assign the individual embedding at index i. This causes all chunks to share the same embedding reference instead of having unique embeddings.

Apply this diff to fix the indexing:

         for i, (k, v) in enumerate(chunks.items()):
             data_list.append(
                 KnowledgeBaseEmbeddingObject(
-                    embedding_vector=flat_embeddings,
+                    embedding_vector=flat_embeddings[i],
                     chunk_text=v['content'],
                     chunk_index=k,
                 )
             )
                 )
🧹 Nitpick comments (2)
wavefront/client/src/components/DashboardLayout.tsx (1)

30-31: Consider using ReturnType<typeof setTimeout> for proper typing.

The @ts-expect-error directive is an improvement over @ts-ignore, but the comment "ts-expect error" doesn't explain the reason for suppression. The root cause is that timeoutRef is typed as number | null while setTimeout returns NodeJS.Timeout in Node environments.

A cleaner fix would be to type the ref correctly:

-  const timeoutRef = useRef<number | null>(null);
+  const timeoutRef = useRef<ReturnType<typeof setTimeout> | null>(null);

Then remove the @ts-expect-error directive entirely:

-    // @ts-expect-error ts-expect error
     timeoutRef.current = setTimeout(
wavefront/client/src/pages/apps/[appId]/llm-inference/CreateLLMInferenceDialog.tsx (1)

119-124: Consider removing redundant baseUrl setting.

The baseUrl is set to the default value both here in the useEffect and in the Select.onValueChange handler (lines 344-347). When the user changes the type via the dropdown, onValueChange fires first and sets baseUrl, then this effect runs and sets it again to the same value.

You could remove the baseUrl setting from either location. Keeping it only in onValueChange would be more direct:

   useEffect(() => {
     if (isOpen && watchedType) {
       const newType = watchedType as InferenceEngineType;
       setType(newType);
       const defaultParams = initializeParameters(newType);
       setParameters(defaultParams);
-
-      // Set default base URL for the provider
-      const defaultBaseUrl = getDefaultBaseUrl(newType);
-      form.setValue('baseUrl', defaultBaseUrl);
     }
   }, [watchedType, isOpen, form]);

However, keeping both is also harmless since they set identical values.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 22fdbf5 and 7e89835.

📒 Files selected for processing (10)
  • wavefront/client/eslint.config.js (1 hunks)
  • wavefront/client/src/components/DashboardLayout.tsx (1 hunks)
  • wavefront/client/src/config/llm-providers.ts (6 hunks)
  • wavefront/client/src/pages/apps/[appId]/agents/[id].tsx (1 hunks)
  • wavefront/client/src/pages/apps/[appId]/datasources/[datasourceId].tsx (2 hunks)
  • wavefront/client/src/pages/apps/[appId]/llm-inference/CreateLLMInferenceDialog.tsx (6 hunks)
  • wavefront/client/src/pages/apps/[appId]/llm-inference/[configId].tsx (3 hunks)
  • wavefront/server/background_jobs/rag_ingestion/rag_ingestion/embeddings/embed.py (3 hunks)
  • wavefront/server/modules/knowledge_base_module/knowledge_base_module/controllers/knowledge_base_controller.py (2 hunks)
  • wavefront/server/modules/knowledge_base_module/knowledge_base_module/embeddings/embed.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
wavefront/client/src/pages/apps/[appId]/llm-inference/[configId].tsx (1)
wavefront/client/src/config/llm-providers.ts (2)
  • mergeParameters (513-530)
  • getDefaultBaseUrl (548-551)
wavefront/client/src/pages/apps/[appId]/llm-inference/CreateLLMInferenceDialog.tsx (1)
wavefront/client/src/config/llm-providers.ts (2)
  • getDefaultBaseUrl (548-551)
  • initializeParameters (494-507)
wavefront/server/modules/knowledge_base_module/knowledge_base_module/controllers/knowledge_base_controller.py (1)
flo_ai/flo_ai/arium/memory.py (1)
  • to_dict (28-33)
🪛 Ruff (0.14.8)
wavefront/server/background_jobs/rag_ingestion/rag_ingestion/embeddings/embed.py

38-38: Probable use of requests call without timeout

(S113)

wavefront/server/modules/knowledge_base_module/knowledge_base_module/embeddings/embed.py

47-47: Probable use of requests call without timeout

(S113)

wavefront/server/modules/knowledge_base_module/knowledge_base_module/controllers/knowledge_base_controller.py

89-91: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


92-94: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

🔇 Additional comments (19)
wavefront/server/background_jobs/rag_ingestion/rag_ingestion/embeddings/embed.py (2)

1-1: LGTM!

The os import is correctly added for the os.getenv call, and using the EMBEDDING_SERVICE_URL directly aligns with the updated embedding service configuration.

Also applies to: 11-12



wavefront/client/src/pages/apps/[appId]/agents/[id].tsx (1)

406-406: LGTM!

Using const is appropriate here since finalTextInput is assigned once and never reassigned.

wavefront/client/src/pages/apps/[appId]/datasources/[datasourceId].tsx (1)

120-123: LGTM!

The refactored validation check is cleaner: it returns early when validation fails rather than using an empty block for the valid case.

wavefront/client/src/config/llm-providers.ts (3)

23-23: LGTM!

Adding the optional baseUrl property to ProviderConfig enables consistent default URL handling across providers.


544-551: LGTM!

The getDefaultBaseUrl function correctly returns the provider's configured baseUrl or an empty string for providers that require user-provided URLs (ollama, vllm, azure_openai).


40-40: OpenAI base URL is correct.

The URL https://api.openai.com/v1 matches the official OpenAI API documentation.

wavefront/client/src/pages/apps/[appId]/llm-inference/[configId].tsx (3)

17-23: LGTM!

The import additions correctly include getDefaultBaseUrl for the new default URL handling functionality.


101-115: LGTM!

The effect correctly updates both parameters and base URL when the provider type changes during editing. The condition watchedType !== config.type ensures the default URL is only applied when switching to a different provider, preserving the user's custom URL when staying on the same provider.


193-195: LGTM!

Extending supportsBaseUrl to include all providers with default URLs defined (openai, anthropic, gemini, groq) ensures consistent UI behavior for base URL configuration.

wavefront/client/src/pages/apps/[appId]/llm-inference/CreateLLMInferenceDialog.tsx (6)

25-31: LGTM!

The import additions correctly include the required utilities for default URL handling.


64-68: LGTM!

Adding placeholder URLs for the cloud providers helps guide users with the expected URL format.


107-107: LGTM!

Using getDefaultBaseUrl('openai') for the initial default value ensures consistency with the centralized provider configuration.


129-138: LGTM!

The reset logic correctly initializes both baseUrl and parameters to their defaults for the default provider type when the dialog closes.


146-148: LGTM!

Extending supportsBaseUrl to include all providers with default URLs ensures consistent behavior with the edit page.


344-348: LGTM!

Setting the default base URL in the onValueChange handler provides immediate feedback when the user changes the provider type.

wavefront/server/modules/knowledge_base_module/knowledge_base_module/embeddings/embed.py (2)

1-1: LGTM!

The os import is necessary for accessing the OPENAI_API_KEY environment variable.


17-17: LGTM!

Simplifying the URL construction to use the embedding URL directly is cleaner and aligns with the PR objective to fix base URL handling.

wavefront/server/modules/knowledge_base_module/knowledge_base_module/controllers/knowledge_base_controller.py (1)

85-109: LGTM! Response formatting is now consistent.

The changes unify the response format with other endpoints in this controller by:

  • Removing the Pydantic response_model in favor of JSONResponse
  • Using response_formatter for consistent success/error formatting
  • Injecting dependencies via FastAPI's Depends()

The static analysis warnings (B008) about Depends() in default arguments are false positives—this is the standard FastAPI dependency injection pattern. The to_dict() method on the KnowledgeBase model properly serializes UUID and datetime fields to JSON-compatible formats.
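
For illustration, a minimal sketch of the refactored endpoint shape (the ResponseFormatter stub and the placeholder lookup are assumptions; the module's real formatter and service calls will differ):

from fastapi import APIRouter, Depends
from fastapi.responses import JSONResponse


class ResponseFormatter:
    # Hypothetical stand-in for the module's formatter dependency.
    def buildSuccessResponse(self, data: dict) -> JSONResponse:
        return JSONResponse(content={'success': True, 'data': data})


router = APIRouter()


@router.get('/knowledge-bases/{kb_id}')
async def get_knowledge_bases_id(
    kb_id: str,
    response_formatter: ResponseFormatter = Depends(ResponseFormatter),
) -> JSONResponse:
    # Placeholder for the real model lookup; to_dict() on the actual
    # KnowledgeBase model serializes UUID and datetime fields.
    kb = {'id': kb_id, 'name': 'example'}
    return response_formatter.buildSuccessResponse(data=kb)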

Comment thread wavefront/client/eslint.config.js (Outdated)

wavefront/client/src/pages/apps/[appId]/datasources/[datasourceId].tsx:
      notifySuccess('Datasource deleted successfully');
      navigate(`/apps/${appId}/datasources`);
    } catch (error) {
      console.error('Error deleting datasource');

⚠️ Potential issue | 🟡 Minor

Include the error object in the log for debugging.

Logging a static string without the actual error loses valuable debugging information when troubleshooting production issues.

-      console.error('Error deleting datasource');
+      console.error('Error deleting datasource:', error);

Additionally, consider notifying the user about the failure since the error is currently swallowed silently:

     } catch (error) {
       console.error('Error deleting datasource:', error);
+      notifyError('Failed to delete datasource');
     } finally {
🤖 Prompt for AI Agents
In wavefront/client/src/pages/apps/[appId]/datasources/[datasourceId].tsx around
line 192, the catch block logs only a static string "Error deleting datasource"
which omits the caught error details and swallows the failure; update the catch
to log both the message and the actual error object (e.g., process the caught
error variable in the log call) and add user-facing feedback (toast, alert, or
set an error state) so the user is notified of the deletion failure.

        return embeddings

    def bgm_embedding(self, texts):
        openai_api_key = os.getenv('OPENAI_API_KEY') or ''

⚠️ Potential issue | 🟠 Major

Missing API key validation leads to unclear authentication failures.

If OPENAI_API_KEY is not set, the code silently uses an empty string, causing opaque authentication errors at runtime instead of failing fast with a clear message.

-        openai_api_key = os.getenv('OPENAI_API_KEY') or ''
+        openai_api_key = os.getenv('OPENAI_API_KEY')
+        if not openai_api_key:
+            raise ValueError('OPENAI_API_KEY environment variable is not set')

Alternatively, validate this once in __init__ rather than on every embedding call.
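
A minimal sketch of that fail-fast alternative, with validation moved to construction time (class and attribute names are illustrative):

import os


class EmbeddingClient:
    def __init__(self, bgm_url: str):
        self.bgm_url = bgm_url
        self.openai_api_key = os.getenv('OPENAI_API_KEY')
        if not self.openai_api_key:
            # Fail at startup with a clear message instead of sending a
            # malformed 'Bearer ' header on every embedding request.
            raise ValueError('OPENAI_API_KEY environment variable is not set')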


Comment on lines 38 to 47
         response = requests.post(
             self.bgm_url,
             headers={'Authorization': 'Bearer ' + openai_api_key},
             json={
-                'model': 'BAAI/bge-m3',
+                # 'model': 'BAAI/bge-m3',
+                'model': 'text-embedding-3-small',
                 'input': texts,
                 'encoding_format': 'float',
             },
         )

⚠️ Potential issue | 🟡 Minor

Add timeout to prevent indefinite hangs.

The requests.post call lacks a timeout, which can cause the background job to hang indefinitely if the embedding service is unresponsive.

         response = requests.post(
             self.bgm_url,
             headers={'Authorization': 'Bearer ' + openai_api_key},
             json={
                 # 'model': 'BAAI/bge-m3',
                 'model': 'text-embedding-3-small',
                 'input': texts,
                 'encoding_format': 'float',
             },
+            timeout=60,
         )
🧰 Tools
🪛 Ruff (0.14.8)

38-38: Probable use of requests call without timeout

(S113)

🤖 Prompt for AI Agents
In
wavefront/server/background_jobs/rag_ingestion/rag_ingestion/embeddings/embed.py
around lines 38 to 47, the requests.post call has no timeout which can cause the
job to hang indefinitely; modify the call to pass a reasonable timeout parameter
(e.g., timeout=10 or configurable via an environment variable) and handle
timeout-related exceptions (requests.Timeout and requests.RequestException) so
the job can retry or fail gracefully with a logged error.
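
A hedged sketch of that timeout-and-retry handling, with the timeout read from an assumed EMBEDDING_TIMEOUT_SECONDS environment variable (the variable and function names are illustrative):

import os

import requests

EMBEDDING_TIMEOUT = float(os.getenv('EMBEDDING_TIMEOUT_SECONDS', '60'))


def post_embeddings(url, api_key, texts):
    try:
        response = requests.post(
            url,
            headers={'Authorization': f'Bearer {api_key}'},
            json={
                'model': 'text-embedding-3-small',
                'input': texts,
                'encoding_format': 'float',
            },
            timeout=EMBEDDING_TIMEOUT,
        )
        response.raise_for_status()
    except requests.Timeout as exc:
        # Surface a clear error so the background job can retry.
        raise RuntimeError(
            f'Embedding request timed out after {EMBEDDING_TIMEOUT}s'
        ) from exc
    except requests.RequestException as exc:
        raise RuntimeError(f'Embedding request failed: {exc}') from exc
    return response.json()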

Comment on lines +48 to +49
res = response.json()
return res['data'][0]['embedding']

⚠️ Potential issue | 🔴 Critical

Returns only the first embedding; batch processing is broken.

When input contains multiple texts, the OpenAI API returns embeddings for all of them in res['data']. This code discards all but the first, breaking batch processing.

Additionally, there's no error handling for HTTP failures or unexpected responses.

-        res = response.json()
-        return res['data'][0]['embedding']
+        response.raise_for_status()
+        res = response.json()
+        return [item['embedding'] for item in res['data']]

This returns all embeddings in order, which pairs with the fix needed at line 20.

Comment on lines 45 to +58
     def bgm_embedding(self, texts):
         openai_api_key = os.getenv('OPENAI_API_KEY') or ''
         response = requests.post(
             self.bgm_url,
             headers={'Authorization': 'Bearer ' + openai_api_key},
             json={
-                'model': 'BAAI/bge-m3',
+                # 'model': 'BAAI/bge-m3',
+                'model': 'text-embedding-3-small',
                 'input': texts,
                 'encoding_format': 'float',
             },
         )
-        return response.json()['data'][0]['embedding']
+        res = response.json()
+        return res['data'][0]['embedding']

⚠️ Potential issue | 🔴 Critical

Add timeout, error handling, and validate API key.

The current implementation has several issues:

  1. Missing timeout (line 47): The requests.post call lacks a timeout parameter, which can cause indefinite hangs if the API is unresponsive.
  2. No error handling: API failures (network errors, 4xx/5xx responses) will raise unhandled exceptions.
  3. Empty string fallback (line 46): Using an empty string when OPENAI_API_KEY is missing will cause authentication failures without clear error messages.

Apply this diff to address these issues:

 def bgm_embedding(self, texts):
     openai_api_key = os.getenv('OPENAI_API_KEY') or ''
+    if not openai_api_key:
+        raise ValueError('OPENAI_API_KEY environment variable is required')
+    
     response = requests.post(
         self.bgm_url,
         headers={'Authorization': 'Bearer ' + openai_api_key},
         json={
             # 'model': 'BAAI/bge-m3',
             'model': 'text-embedding-3-small',
             'input': texts,
             'encoding_format': 'float',
         },
+        timeout=30,
     )
+    response.raise_for_status()
     res = response.json()
     return res['data'][0]['embedding']
🧰 Tools
🪛 Ruff (0.14.8)

47-47: Probable use of requests call without timeout

(S113)

🤖 Prompt for AI Agents
In
wavefront/server/modules/knowledge_base_module/knowledge_base_module/embeddings/embed.py
around lines 45 to 58, the bgm_embedding method lacks API key validation,
timeout, and robust error handling; update it to (1) validate that
OPENAI_API_KEY exists and raise/log a clear error if missing instead of using an
empty string, (2) call requests.post with a sensible timeout (e.g., 10s) to
avoid hangs, (3) wrap the request in a try/except to catch requests.exceptions
(e.g., ConnectionError, Timeout, HTTPError) and raise or return a clear error,
and (4) check response.status_code and handle non-2xx responses by parsing any
error message from the response and raising/logging a descriptive exception
before attempting to access res['data'][0]['embedding'] so you never assume a
successful JSON shape.

Comment on lines +51 to +52
# 'model': 'BAAI/bge-m3',
'model': 'text-embedding-3-small',

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Search for references to the old BAAI/bge-m3 model or its typical vector size
rg -n "BAAI/bge-m3|bge-m3" --type py

# Search for vector_size configurations that might need updating
rg -n "vector_size" --type py -C 2

Repository: rootflo/wavefront

Length of output: 25282


🏁 Script executed:

# Read the embed.py file at the target lines to see context
head -70 wavefront/server/modules/knowledge_base_module/knowledge_base_module/embeddings/embed.py | tail -30

# Check if both embed.py files have similar changes
head -60 wavefront/server/background_jobs/rag_ingestion/rag_ingestion/embeddings/embed.py | tail -20

Repository: rootflo/wavefront

Length of output: 1727


🏁 Script executed:

# Search for bgm_url definition
rg -n "bgm_url" --type py -B 3 -A 3

# Search for any migration-related code or documentation
rg -n "migration|migrate|regenerate" --type py -i | head -30

# Check for configuration or setup documentation
fd -e "README|SETUP|CONFIG|\.md" -i | head -20

Repository: rootflo/wavefront

Length of output: 7156


🏁 Script executed:

# Search for embedding deletion or regeneration logic
rg -n "delete.*embedding|regenerate|re-embed" --type py -i

# Check if embeddings are generated on-demand or stored
rg -n "generate.*embedding|store.*embedding" --type py -A 2 | head -40

# Look for any data migration scripts
fd -e "py" -path "*/alembic/versions/*" -exec basename {} \; | grep -i embed

Repository: rootflo/wavefront

Length of output: 5787


🌐 Web query:

BAAI bge-m3 embedding dimensions output size

💡 Result:

BAAI's BGE‑M3 produces 1024‑dimensional embeddings (output vector length = 1024). [1][2][3]

Sources:

  • BGE docs (BAAI/bge-m3) — specifies model and embedding specs. [1]
  • Hugging Face discussion for BAAI/bge-m3 — confirms embedding dimension = 1024. [2]
  • Model summaries (e.g., PromptLayer/IONOS) — lists 1024 output dimensionality. [3]

Verify embedding migration strategy before deploying the model change.

The switch from BAAI/bge-m3 (1024 dimensions) to text-embedding-3-small (1536 dimensions) introduces a dimension incompatibility. Vector size validation in rag_retreival_controller.py (lines 449-452) will reject any existing embeddings generated with the old model. Ensure:

  1. Existing embeddings are deleted or regenerated before enabling the new model.
  2. The embedding service endpoint (bgm_url / EMBEDDING_SERVICE_URL) is configured to serve text-embedding-3-small.
  3. Document ingestion workflows trigger re-embedding of all knowledge bases post-deployment.
🤖 Prompt for AI Agents
In
wavefront/server/modules/knowledge_base_module/knowledge_base_module/embeddings/embed.py
around lines 51-52, the model change from 'BAAI/bge-m3' (1024d) to
'text-embedding-3-small' (1536d) will break vector-size validation; before
deploying, implement a migration plan: add or run a script to detect and delete
or regenerate all existing embeddings (or mark KBs as needing re-embed), ensure
EMBEDDING_SERVICE_URL/bgm_url is configured to serve text-embedding-3-small,
update deployment docs and ingestion workflows to trigger re-embedding of every
knowledge base after rollout, and (optionally) add a runtime safety check that
prevents switching models unless a forced migration flag is supplied.
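
One possible shape for that runtime safety check, using the model dimensions cited above (the FORCE_EMBEDDING_MIGRATION flag and function name are hypothetical):

import os

# Output dimensions for the two models involved in the switch:
# BAAI/bge-m3 produces 1024-d vectors, text-embedding-3-small 1536-d.
MODEL_DIMENSIONS = {
    'BAAI/bge-m3': 1024,
    'text-embedding-3-small': 1536,
}


def assert_embedding_compatible(stored_vector_size: int) -> None:
    model = os.getenv('EMBEDDING_MODEL', 'text-embedding-3-small')
    expected = MODEL_DIMENSIONS.get(model)
    force = os.getenv('FORCE_EMBEDDING_MIGRATION') == '1'
    if expected is not None and stored_vector_size != expected and not force:
        raise RuntimeError(
            f'Stored embeddings are {stored_vector_size}-d but {model} '
            f'produces {expected}-d vectors; re-embed the knowledge base '
            'or set FORCE_EMBEDDING_MIGRATION=1 to override.'
        )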

@vishnurk6247 changed the title from "Fixing some bugs" to "feat: support openai embedding and add llm base_urls in client" on Dec 14, 2025
@vishnurk6247 merged commit aaaf27a into develop on Dec 14, 2025 (7 of 8 checks passed)
@vishnurk6247 deleted the fix_migrating_changes branch on December 14, 2025 at 18:46
coderabbitai Bot left a comment

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7e89835 and 3d676d9.

📒 Files selected for processing (4)
  • wavefront/client/src/components/DashboardLayout.tsx (2 hunks)
  • wavefront/client/src/config/llm-providers.ts (6 hunks)
  • wavefront/server/background_jobs/rag_ingestion/rag_ingestion/embeddings/embed.py (2 hunks)
  • wavefront/server/background_jobs/rag_ingestion/rag_ingestion/env.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • wavefront/server/background_jobs/rag_ingestion/rag_ingestion/embeddings/embed.py
🔇 Additional comments (3)
wavefront/client/src/components/DashboardLayout.tsx (1)

17-17: LGTM! Type safety improvement.

Using ReturnType<typeof setTimeout> is more type-safe and environment-agnostic than number | null, and enables removal of the ts-ignore directive.

wavefront/client/src/config/llm-providers.ts (2)

23-23: LGTM! Interface extension is appropriate.

Adding baseUrl as an optional field allows provider-specific base URLs while remaining backward-compatible with providers that don't require them (Azure, Ollama, vLLM).


518-524: LGTM! Helper function is clear and correct.

The getDefaultBaseUrl function correctly retrieves the provider's base URL with a safe fallback to an empty string when not configured.

export const LLM_PROVIDERS_CONFIG: Record<InferenceEngineType, ProviderConfig> = {
  openai: {
    name: 'OpenAI GPT',
    baseUrl: 'https://api.openai.com/v1',

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Anthropic Claude API base URL official documentation

💡 Result:

The official Anthropic Claude API base URL is: https://api.anthropic.com/v1/. [1][2]

Sources:

  • Anthropic API docs / release notes. [1]
  • Anthropic Help Center (API docs link and examples). [2]

Add /v1 to the Anthropic base URL.

The Anthropic base URL should be https://api.anthropic.com/v1/ according to official Anthropic API documentation. Update line 113 to include the /v1 path segment.

Also applies to: 113-113

🤖 Prompt for AI Agents
In wavefront/client/src/config/llm-providers.ts around line 113, the Anthropic
provider baseUrl is missing the required /v1 path segment; update the Anthropic
baseUrl value to include the API version (e.g. "https://api.anthropic.com/v1/"
or "https://api.anthropic.com/v1") so requests target the correct endpoint and
match the official documentation.


CLOUD_PROVIDER = os.getenv('CLOUD_PROVIDER', 'gcp')
RETRY_COUNT = os.getenv('RETRY_COUNT', 3)
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find all usages of OPENAI_API_KEY in the codebase
echo "=== Searching for OPENAI_API_KEY usages ==="
rg -n "OPENAI_API_KEY" --type=py -C3

echo ""
echo "=== Searching for OpenAI client initialization ==="
rg -n "openai\.(Client|OpenAI|AzureOpenAI)" --type=py -C3

echo ""
echo "=== Searching for from_env or similar patterns ==="
rg -n "from_env|api_key.*=|Client\(" --type=py | head -50

Repository: rootflo/wavefront

Length of output: 22707


🏁 Script executed:

cat -n wavefront/server/background_jobs/rag_ingestion/rag_ingestion/embeddings/embed.py

Repository: rootflo/wavefront

Length of output: 2262


Add validation to ensure OPENAI_API_KEY is set when required.

When EMBEDDING_MODEL is 'text-embedding-3-small' (the default), the code interpolates OPENAI_API_KEY directly into the Authorization header without checking if it's None. If the environment variable is not set, this will create a malformed header with 'Bearer None' that will fail at runtime. Either validate that OPENAI_API_KEY is not None before using it in the header, or raise a clear error on initialization if it's required but missing.

🤖 Prompt for AI Agents
In wavefront/server/background_jobs/rag_ingestion/rag_ingestion/env.py around
line 8, OPENAI_API_KEY is read but not validated and may be None when
EMBEDDING_MODEL == 'text-embedding-3-small', producing a malformed "Bearer None"
header; update initialization to explicitly check if EMBEDDING_MODEL requires an
API key and if so validate that OPENAI_API_KEY is set (raise a clear
RuntimeError or ValueError with a descriptive message at startup), or
alternatively only include the Authorization header when OPENAI_API_KEY is
non-empty and fail fast with a helpful error if the model requires it.
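
A minimal sketch of that startup validation, assuming env.py keeps its current variable names (the 'text-embedding-' prefix check is a simplifying assumption about which models need an OpenAI key):

import os

EMBEDDING_MODEL = os.getenv('EMBEDDING_MODEL', 'text-embedding-3-small')
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')

# Fail fast at import time rather than sending 'Bearer None' at request time.
if EMBEDDING_MODEL.startswith('text-embedding-') and not OPENAI_API_KEY:
    raise RuntimeError(
        'OPENAI_API_KEY must be set when EMBEDDING_MODEL is an OpenAI model '
        f'(got EMBEDDING_MODEL={EMBEDDING_MODEL!r})'
    )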

thomastomy5 pushed a commit that referenced this pull request Apr 27, 2026
* llm_provider default base url fix

* rag openai embeddings

* refactor: fix review comments

---------

Co-authored-by: vishnu r kumar <rkumar.vishnu28@gmail.com>
