feat: Ollama and WatsonX embedding model support #10356

Walkthrough

Adds multi-provider embedding support to the EmbeddingModelComponent by extending it beyond OpenAI to include Ollama and WatsonX providers. Updates the Nvidia Remix starter project, introduces WatsonX embedding model constants, and implements provider-specific build logic with dynamic imports and configuration handling.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Component as EmbeddingModelComponent
    participant Config as update_build_config
    participant Builders as Provider Builders
    User->>Component: Select provider (OpenAI/Ollama/WatsonX)
    alt Provider = OpenAI
        Component->>Config: Pass OpenAI config
        Config->>Builders: Use OpenAI models
    else Provider = Ollama
        Component->>Config: Pass Ollama config
        Config->>Builders: Import OllamaEmbeddings
        Builders->>Builders: Set default base URL
    else Provider = WatsonX
        Component->>Config: Pass WatsonX config + project_id
        Config->>Config: Validate API key & project_id
        Config->>Builders: Import WatsonxEmbeddings
        Builders->>Builders: Assemble model params
    end
    Builders->>Component: Return embeddings instance
    Component->>User: Embeddings ready
```
Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

The changes involve logic branching across multiple providers with conditional import handling and configuration logic spread across three files. While the pattern is repetitive across providers, each provider path requires separate verification of import fallbacks, parameter handling, and visibility/default logic. Configuration file and constants additions are straightforward, but the embedding component changes demand careful review of the conditional logic and dynamic imports.
Pre-merge checks and finishing touches

❌ Failed checks (1 error, 3 warnings)
✅ Passed checks (3 passed)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/backend/base/langflow/initial_setup/starter_projects/Nvidia Remix.json (1)
1955-1972: Reset OpenAI base URL when switching providers

When the user switches from Ollama/WatsonX back to OpenAI, we keep the previous provider's `api_base` value (http://localhost:11434, the IBM URL, etc.). The OpenAI client expects this field to be blank (it defaults to https://api.openai.com/v1) unless the user explicitly overrides it. (reference.langchain.com) Leaving the old value in place routes OpenAI calls to the wrong host and fails the flow.

Please clear `api_base["value"]` (or set it to the OpenAI default) inside the OpenAI branch so we restore a valid endpoint after provider changes.

```diff
 if field_value == "OpenAI":
     build_config["model"]["options"] = OPENAI_EMBEDDING_MODEL_NAMES
     build_config["model"]["value"] = OPENAI_EMBEDDING_MODEL_NAMES[0]
     build_config["api_key"]["display_name"] = "OpenAI API Key"
     build_config["api_key"]["required"] = True
     build_config["api_key"]["show"] = True
     build_config["api_base"]["display_name"] = "OpenAI API Base URL"
+    build_config["api_base"]["value"] = ""
     build_config["project_id"]["show"] = False
```
🧹 Nitpick comments (1)
src/lfx/src/lfx/components/models/embedding_model.py (1)
135-161: Consider parameter consistency across providers.

Similar to Ollama, the WatsonX provider only uses a subset of the available parameters (model_id, url, apikey, project_id) while ignoring others like dimensions, chunk_size, max_retries, timeout, and show_progress_bar. This could confuse users who set these parameters expecting them to work.
Consider one of the following approaches:
- Document in the component description or input info which parameters are supported by which providers
- Add a validation warning when unsupported parameters are set for the selected provider
- Make certain inputs conditionally visible based on the selected provider (similar to how project_id is handled)
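The third option (conditional visibility) could be sketched as a small helper that toggles `show` flags in the build config, mirroring how project_id is already handled. Everything here is illustrative: the `SUPPORTED_FIELDS` map, field names, and dict shape are assumptions for the sketch, not the PR's actual code.

```python
# Hypothetical sketch: hide inputs a provider does not consume.
# Provider/field names are illustrative, not taken from the PR.
SUPPORTED_FIELDS = {
    "OpenAI": {"api_key", "api_base", "dimensions", "chunk_size", "max_retries"},
    "Ollama": {"api_base"},
    "WatsonX": {"api_key", "api_base", "project_id"},
}

ALL_FIELDS = set().union(*SUPPORTED_FIELDS.values())


def apply_visibility(build_config: dict, provider: str) -> dict:
    """Show only the inputs the selected provider actually uses."""
    visible = SUPPORTED_FIELDS[provider]
    for field in ALL_FIELDS:
        if field in build_config:
            build_config[field]["show"] = field in visible
    return build_config
```

This keeps the UI honest: a parameter that would be silently ignored is simply not shown.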
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- src/backend/base/langflow/initial_setup/starter_projects/Nvidia Remix.json (5 hunks)
- src/lfx/src/lfx/base/models/watsonx_constants.py (1 hunks)
- src/lfx/src/lfx/components/models/embedding_model.py (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/lfx/src/lfx/components/models/embedding_model.py (1)
src/lfx/src/lfx/inputs/inputs.py (1)
MessageTextInput (206-257)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
- GitHub Check: Lint Backend / Run Mypy (3.10)
- GitHub Check: Lint Backend / Run Mypy (3.12)
- GitHub Check: Lint Backend / Run Mypy (3.11)
- GitHub Check: Run Frontend Tests / Determine Test Suites and Shard Distribution
- GitHub Check: Lint Backend / Run Mypy (3.13)
- GitHub Check: Test Docker Images / Test docker images
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 5
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 1
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 4
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 3
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 2
- GitHub Check: Run Backend Tests / LFX Tests - Python 3.10
- GitHub Check: Run Backend Tests / Integration Tests - Python 3.10
- GitHub Check: Test Starter Templates
- GitHub Check: test-starter-projects
- GitHub Check: Update Component Index
- GitHub Check: Optimize new Python code in this PR
🔇 Additional comments (7)
src/lfx/src/lfx/base/models/watsonx_constants.py (1)
1-8: LGTM! Clean constant definition.

The WatsonX embedding model names are well-organized and follow a consistent naming convention. The constant structure aligns with similar constants used for other providers.
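For context, such a provider constants module is typically just a flat list of model IDs. The entries below are illustrative placeholders, not necessarily the IDs actually added in `watsonx_constants.py`:

```python
# Illustrative sketch of a provider constants module; the model IDs are
# placeholders and may not match the ones shipped in the PR.
WATSONX_EMBEDDING_MODEL_NAMES = [
    "ibm/slate-125m-english-rtrvr",
    "ibm/slate-30m-english-rtrvr",
    "intfloat/multilingual-e5-large",
]
```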
src/lfx/src/lfx/components/models/embedding_model.py (6)
6-8: LGTM! Imports follow existing patterns.

The new imports for Ollama and WatsonX embedding model constants are consistent with the existing OpenAI import pattern.
34-42: LGTM! Provider options properly configured.

The provider dropdown and icon metadata are correctly configured to support the three embedding providers.
65-71: LGTM! WatsonX-specific input properly defined.

The project_id input is appropriately configured as hidden by default and will be revealed when WatsonX is selected through the `update_build_config` logic.
119-133: LGTM! Ollama integration with appropriate fallback.

The dynamic import with fallback from `langchain_ollama` to `langchain_community` provides good compatibility. The default base URL for localhost is appropriate for typical Ollama deployments.

However, note that several parameters defined in the component (dimensions, chunk_size, max_retries, timeout, show_progress_bar) are silently ignored for Ollama. Consider documenting which parameters are supported by each provider or adding warnings when unsupported parameters are set.
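The fallback pattern itself can be expressed generically with `importlib`. The candidate module names in the usage comment are the ones the review mentions; the helper is a sketch of the pattern, not the PR's actual code:

```python
from importlib import import_module


def import_first(*candidates: str):
    """Return the first module from candidates that imports successfully."""
    errors = []
    for name in candidates:
        try:
            return import_module(name)
        except ImportError as exc:  # missing dependency: try the next candidate
            errors.append(f"{name}: {exc}")
    raise ImportError("no candidate module could be imported: " + "; ".join(errors))


# e.g. embeddings_mod = import_first("langchain_ollama", "langchain_community")
```

Centralizing the try/except keeps each provider branch short and makes the error message list every package that was attempted.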
142-150: LGTM! Proper validation for required WatsonX parameters.

The validation correctly ensures both api_key and project_id are provided before attempting to create the WatsonX embeddings instance.
167-196: LGTM! Provider-specific configuration properly handled.

The update_build_config method correctly manages provider-specific settings:
- Model options are updated based on the selected provider's supported models
- API key requirements and visibility are appropriate for each provider (required for OpenAI/WatsonX, optional for Ollama)
- WatsonX-specific project_id field is shown only for WatsonX and hidden for other providers
- Default URLs are set appropriately for Ollama and WatsonX
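The branching described above can be sketched as a plain function over a build-config dict. Provider names follow the review; everything else (field names, model IDs, defaults) is illustrative, not the PR's actual code:

```python
# Illustrative sketch of per-provider build-config updates.
OPENAI_MODELS = ["text-embedding-3-small", "text-embedding-3-large"]  # placeholder list
OLLAMA_DEFAULT_URL = "http://localhost:11434"


def update_build_config(build_config: dict, provider: str) -> dict:
    """Toggle per-provider requirements, mirroring the logic the review describes."""
    if provider == "OpenAI":
        build_config["model"]["options"] = OPENAI_MODELS
        build_config["api_key"] = {"required": True, "show": True}
        build_config["project_id"] = {"show": False}
    elif provider == "Ollama":
        build_config["api_key"] = {"required": False, "show": False}
        build_config["api_base"] = {"value": OLLAMA_DEFAULT_URL}
        build_config["project_id"] = {"show": False}
    elif provider == "WatsonX":
        build_config["api_key"] = {"required": True, "show": True}
        build_config["project_id"] = {"show": True}
    return build_config
```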
Codecov Report

❌ Patch coverage is 0.00%.
❌ Your patch status has failed because the patch coverage (0.00%) is below the target coverage (40.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main   #10356      +/-   ##
==========================================
- Coverage   29.99%   29.99%   -0.01%
==========================================
  Files        1316     1317       +1
  Lines       59657    59667      +10
  Branches     8921     8923       +2
==========================================
  Hits        17896    17896
- Misses      40944    40952       +8
- Partials      817      819       +2
```

Flags with carried forward coverage won't be shown.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
This pull request extends the embedding model component to support multiple providers—specifically, Ollama and IBM WatsonX—in addition to OpenAI. It introduces dynamic configuration of provider-specific options, adds new dependencies, and updates both backend logic and starter project configuration to accommodate these enhancements.
Provider support and backend logic:

- `EmbeddingModelComponent` in `embedding_model.py` now supports three providers: OpenAI, Ollama, and WatsonX, with dynamic input fields and logic for each provider, including provider-specific model lists and required fields. [1] [2] [3] [4]
- Added `watsonx_constants.py` to provide available WatsonX embedding model names.

Starter project and configuration updates:

- The Nvidia Remix starter project is updated to add new dependencies (`langchain_ollama`, `langchain_community`, `langchain_ibm`), extend the provider and model options, and include a new `project_id` input for WatsonX. [1] [2] [3] [4] [5]

These changes make the embedding model component more flexible and ready for a wider range of use cases involving different embedding providers.