
feat: Ollama and WatsonX embedding model support#10356

Merged
erichare merged 31 commits into main from feat-ollama-watsonx-embed
Oct 23, 2025

Conversation

@erichare
Collaborator

@erichare erichare commented Oct 21, 2025

This pull request extends the embedding model component to support multiple providers—specifically, Ollama and IBM WatsonX—in addition to OpenAI. It introduces dynamic configuration of provider-specific options, adds new dependencies, and updates both backend logic and starter project configuration to accommodate these enhancements.

Provider support and backend logic:

  • The EmbeddingModelComponent in embedding_model.py now supports three providers: OpenAI, Ollama, and WatsonX, with dynamic input fields and logic for each provider, including provider-specific model lists and required fields.
  • Added WatsonX model constants in watsonx_constants.py to provide available WatsonX embedding model names.
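The provider branching described above can be sketched roughly as follows. This is a simplified stand-in, not the component's actual code: every attribute name, default value, and model name here is an assumption based on this summary, and the real component resolves these values from Langflow input fields.

```python
# Simplified sketch of the provider dispatch in build_embeddings().
# Attribute names and defaults are assumptions, not the real component API.
class EmbeddingModelSketch:
    def __init__(self, provider, model, api_key=None, base_url=None, project_id=None):
        self.provider = provider
        self.model = model
        self.api_key = api_key
        self.base_url = base_url
        self.project_id = project_id

    def build_embeddings(self):
        if self.provider == "OpenAI":
            if not self.api_key:
                raise ValueError("OpenAI API key is required")
            return {"provider": "openai", "model": self.model}
        if self.provider == "Ollama":
            # Ollama runs locally, so no API key is required by default.
            return {
                "provider": "ollama",
                "model": self.model,
                "base_url": self.base_url or "http://localhost:11434",
            }
        if self.provider == "WatsonX":
            if not self.api_key or not self.project_id:
                raise ValueError("WatsonX requires an API key and a project_id")
            return {"provider": "watsonx", "model": self.model}
        raise ValueError(f"Unknown provider: {self.provider}")
```

For example, selecting Ollama without any credentials succeeds with the localhost default, while selecting WatsonX without a project_id raises an error.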

Starter project and configuration updates:

  • Updated the Nvidia Remix starter project JSON to add new dependencies (langchain_ollama, langchain_community, langchain_ibm), extend provider and model options, and include a new project_id input for WatsonX.

These changes make the embedding model component more flexible and ready for a wider range of use cases involving different embedding providers.

Summary by CodeRabbit

  • New Features
    • Added support for Ollama and WatsonX embedding providers alongside existing OpenAI support.
    • Users can now select from multiple embedding model providers with provider-specific configurations.
    • Introduced WatsonX Project ID field for secure authentication and configuration.

@coderabbitai
Contributor

coderabbitai Bot commented Oct 21, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Adds multi-provider embedding support to the EmbeddingModelComponent by extending it beyond OpenAI to include Ollama and WatsonX providers. Updates the Nvidia Remix starter project, introduces WatsonX embedding model constants, and implements provider-specific build logic with dynamic imports and configuration handling.

Changes

Cohort / File(s) Change Summary
Multi-provider Starter Project Configuration
src/backend/base/langflow/initial_setup/starter_projects/Nvidia Remix.json
Updated EmbeddingModelComponent to support three providers (OpenAI, Ollama, WatsonX) with provider-specific inputs, dynamic imports for each provider, and conditional configuration logic. Added WatsonX project_id input field and updated UI metadata including icons for new providers. Increased dependency count from 2 to 5.
WatsonX Embedding Models Constants
src/lfx/src/lfx/base/models/watsonx_constants.py
Added new constant WATSONX_EMBEDDING_MODEL_NAMES containing the available IBM Granite embedding model identifiers (125m English, 278m multilingual, 30m English, 107m multilingual, and 30m sparse variants).
Embedding Component Multi-Provider Support
src/lfx/src/lfx/components/models/embedding_model.py
Extended EmbeddingModelComponent with imports for Ollama and WatsonX constants. Expanded provider dropdown to include all three providers with icons. Added WatsonX project_id input field. Implemented provider-specific build paths: OpenAI (existing), Ollama (with fallback imports and default URL), and WatsonX (with API key and project_id validation). Enhanced update_build_config to handle provider-specific model options, visibility flags, and parameter assembly.
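The constants file described above would have roughly the following shape. The identifiers below are guesses that follow IBM's granite-embedding naming pattern for the sizes listed in the change summary; the exact strings in the repository's watsonx_constants.py may differ.

```python
# Illustrative shape of watsonx_constants.py; these identifiers are
# guesses following IBM's granite-embedding naming pattern and may not
# match the repository file exactly.
WATSONX_EMBEDDING_MODEL_NAMES = [
    "ibm/granite-embedding-125m-english",
    "ibm/granite-embedding-278m-multilingual",
    "ibm/granite-embedding-30m-english",
    "ibm/granite-embedding-107m-multilingual",
    "ibm/granite-embedding-30m-sparse",
]
```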

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Component as EmbeddingModelComponent
    participant Config as update_build_config
    participant Builders as Provider Builders

    User->>Component: Select provider (OpenAI/Ollama/WatsonX)
    alt Provider = OpenAI
        Component->>Config: Pass OpenAI config
        Config->>Builders: Use OpenAI models
    else Provider = Ollama
        Component->>Config: Pass Ollama config
        Config->>Builders: Import OllamaEmbeddings
        Builders->>Builders: Set default base URL
    else Provider = WatsonX
        Component->>Config: Pass WatsonX config + project_id
        Config->>Config: Validate API key & project_id
        Config->>Builders: Import WatsonxEmbeddings
        Builders->>Builders: Assemble model params
    end
    Builders->>Component: Return embeddings instance
    Component->>User: Embeddings ready

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

The changes involve logic branching across multiple providers with conditional import handling and configuration logic spread across three files. While the pattern is repetitive across providers, each provider path requires separate verification of import fallbacks, parameter handling, and visibility/default logic. Configuration file and constants additions are straightforward, but the embedding component changes demand careful review of the conditional logic and dynamic imports.

Suggested labels

enhancement, size:XL, lgtm

Suggested reviewers

  • ogabrielluiz
  • edwinjosechittilappilly

Pre-merge checks and finishing touches

❌ Failed checks (1 error, 3 warnings)
Test Coverage For New Implementations: ❌ Error

The PR adds significant new functionality by introducing support for two additional embedding providers (Ollama and WatsonX) in the EmbeddingModelComponent, along with new model constants in watsonx_constants.py. However, the PR does not include any new or updated test files to cover this new functionality. The existing test file test_embedding_model_component.py contains tests only for the OpenAI provider and was not modified in this PR. There are no tests for the Ollama provider branch, the WatsonX provider branch, the WatsonX constants, the dynamic import fallback mechanism for OllamaEmbeddings, error handling for missing WatsonX credentials, or the updated update_build_config() method logic for the new providers.

Resolution: To pass this check, the PR should be updated to include comprehensive test coverage for the new Ollama and WatsonX providers. This should include unit tests for both the build_embeddings() and update_build_config() methods covering each new provider, tests for the dynamic import fallback mechanism, tests for error cases (missing API keys, missing project_id for WatsonX), and optionally tests for the new WATSONX_EMBEDDING_MODEL_NAMES constant. These tests should follow the project's naming conventions and be added to the existing test file or a new test file in the same directory as other component tests.
Docstring Coverage: ⚠️ Warning

Docstring coverage is 0.00%, which is insufficient; the required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Test Quality And Coverage: ⚠️ Warning

The test file src/backend/tests/unit/components/models/test_embedding_model_component.py exists and contains four test methods, but it only covers the OpenAI provider path with tests for successful initialization, missing API key error, and unknown provider handling. The PR introduces significant new functionality—Ollama and WatsonX provider support with dynamic imports, fallback mechanisms, provider-specific configuration, and validation logic—but the test file was not modified in this PR commit (git diff returns 0 lines changed). The tests lack coverage for the critical new code paths: Ollama embeddings initialization with base URL defaults and import fallbacks (both langchain-ollama and langchain_community), WatsonX embeddings with project_id and API key validation, import error handling for missing WatsonX dependencies, and update_build_config behavior for all three providers including field visibility changes and model option updates. This represents a significant gap in test quality since the implementation introduces new error conditions and complex dynamic import logic that require verification.

Resolution: Add test coverage for the new Ollama and WatsonX providers by extending src/backend/tests/unit/components/models/test_embedding_model_component.py. Add async test methods following the existing pattern: test_build_embeddings_ollama_success, test_build_embeddings_ollama_import_fallback (testing langchain_community fallback), test_update_build_config_ollama, test_build_embeddings_watsonx_success, test_build_embeddings_watsonx_missing_project_id, test_build_embeddings_watsonx_missing_api_key, test_update_build_config_watsonx. Mock the dynamic imports and embeddings classes using @patch decorators. For Ollama, verify the base URL defaults to "http://localhost:11434" and test the import fallback chain. For WatsonX, verify project_id is required, URL defaults to "https://us-south.ml.cloud.ibm.com", and all required parameters are passed correctly. Test that update_build_config correctly updates model options, display names, required flags, and field visibility for each provider.
Test File Naming And Structure: ⚠️ Warning

The existing test file at ./src/backend/tests/unit/components/models/test_embedding_model_component.py follows proper pytest naming conventions and structure with test_.py naming and test_ prefixed functions using the repository's ComponentTestBaseWithClient pattern. However, the test file contains only 4 test functions that exclusively cover the OpenAI provider, providing no coverage for the newly added Ollama and WatsonX providers. The PR's implementation adds substantial new code paths in build_embeddings() including Ollama with import fallback logic, and WatsonX with API key and project_id validation. Additionally, the code contains a redundant project_id check at line 158-159 (acknowledged in review comments) where project_id is validated as non-empty earlier but checked again before assignment. The existing tests do not validate these new provider branches, error conditions, or the configuration update logic for Ollama and WatsonX in the update_build_config method.

Resolution: Expand the test file with additional test functions following the repository's established patterns to provide complete coverage of the new functionality. Add tests for: (1) test_update_build_config_ollama and test_update_build_config_watsonx to verify provider-specific model options, display names, and field visibility, (2) test_build_embeddings_ollama with mocking of OllamaEmbeddings to validate successful initialization with default base URL, (3) test_build_embeddings_ollama_import_fallback to verify the fallback import from langchain_community when langchain_ollama is unavailable, (4) test_build_embeddings_watsonx to validate WatsonxEmbeddings initialization with all required parameters, (5) test_build_embeddings_watsonx_missing_project_id and test_build_embeddings_watsonx_missing_api_key for error handling. Use the test patterns established in the codebase with fixtures, async functions, mocking, and assertions following the language_model test examples.
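One of the suggested cases could be sketched along these lines. This is a hypothetical test: the real test would exercise the actual EmbeddingModelComponent, which is stubbed here so the example stays self-contained, and the stub's behavior is only an assumption based on the check's description.

```python
import pytest

# Hypothetical sketch of the suggested
# test_build_embeddings_watsonx_missing_project_id. The component is
# stubbed; the real test would import EmbeddingModelComponent instead.
class EmbeddingModelComponentStub:
    def __init__(self, provider, api_key=None, project_id=None):
        self.provider = provider
        self.api_key = api_key
        self.project_id = project_id

    def build_embeddings(self):
        if self.provider == "WatsonX" and not self.project_id:
            raise ValueError("project_id is required for WatsonX")
        return object()

def test_build_embeddings_watsonx_missing_project_id():
    component = EmbeddingModelComponentStub("WatsonX", api_key="key", project_id="")
    with pytest.raises(ValueError, match="project_id"):
        component.build_embeddings()
```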
✅ Passed checks (3 passed)
Description Check: ✅ Passed

Check skipped - CodeRabbit’s high-level summary is enabled.

Title Check: ✅ Passed

The pull request title "feat: Ollama and WatsonX embedding model support" directly and clearly reflects the primary change in the changeset. The title identifies the two new embedding providers (Ollama and WatsonX) being added to the system, which is the core purpose of this PR as evidenced by the changes to EmbeddingModelComponent, the addition of WatsonX constants, and updates to the starter project configuration. The title is concise, specific, and avoids vague language or noise. A developer scanning the commit history would immediately understand that this PR adds support for these two new embedding model providers.

Excessive Mock Usage Warning: ✅ Passed

The test file demonstrates minimal and appropriate mock usage that does not indicate poor test design. Out of 4 test functions, 3 (75%) test real logic without any mocks, including critical error-handling paths. The single mock present is correctly applied only to an external dependency (OpenAIEmbeddings from LangChain), not to the component's core business logic. The mock is used properly to verify correct parameter passing to the external library. This represents good test design where mocks are appropriately scoped to external dependencies rather than mocking implementation details or core logic. The mock import and usage patterns align with best practices for unit testing.


@github-actions github-actions Bot added the enhancement New feature or request label Oct 21, 2025
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/backend/base/langflow/initial_setup/starter_projects/Nvidia Remix.json (1)

1955-1972: Reset OpenAI base URL when switching providers

When the user switches from Ollama/WatsonX back to OpenAI, we keep the previous provider’s api_base value (http://localhost:11434, IBM URL, etc.). The OpenAI client expects this field to be blank (it defaults to https://api.openai.com/v1) unless the user explicitly overrides it. Leaving the old value in place routes OpenAI calls to the wrong host and fails the flow. (reference.langchain.com)

Please clear api_base["value"] (or set it to the OpenAI default) inside the OpenAI branch so we restore a valid endpoint after provider changes.

             if field_value == "OpenAI":
                 build_config["model"]["options"] = OPENAI_EMBEDDING_MODEL_NAMES
                 build_config["model"]["value"] = OPENAI_EMBEDDING_MODEL_NAMES[0]
                 build_config["api_key"]["display_name"] = "OpenAI API Key"
                 build_config["api_key"]["required"] = True
                 build_config["api_key"]["show"] = True
                 build_config["api_base"]["display_name"] = "OpenAI API Base URL"
+                build_config["api_base"]["value"] = ""
                 build_config["project_id"]["show"] = False
🧹 Nitpick comments (1)
src/lfx/src/lfx/components/models/embedding_model.py (1)

135-161: Consider parameter consistency across providers.

Similar to Ollama, the WatsonX provider only uses a subset of the available parameters (model_id, url, apikey, project_id) while ignoring others like dimensions, chunk_size, max_retries, timeout, and show_progress_bar. This could confuse users who set these parameters expecting them to work.

Consider one of the following approaches:

  1. Document in the component description or input info which parameters are supported by which providers
  2. Add a validation warning when unsupported parameters are set for the selected provider
  3. Make certain inputs conditionally visible based on the selected provider (similar to how project_id is handled)
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 743a8d8 and 444d922.

📒 Files selected for processing (3)
  • src/backend/base/langflow/initial_setup/starter_projects/Nvidia Remix.json (5 hunks)
  • src/lfx/src/lfx/base/models/watsonx_constants.py (1 hunks)
  • src/lfx/src/lfx/components/models/embedding_model.py (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/lfx/src/lfx/components/models/embedding_model.py (1)
src/lfx/src/lfx/inputs/inputs.py (1)
  • MessageTextInput (206-257)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: Lint Backend / Run Mypy (3.10)
  • GitHub Check: Lint Backend / Run Mypy (3.12)
  • GitHub Check: Lint Backend / Run Mypy (3.11)
  • GitHub Check: Run Frontend Tests / Determine Test Suites and Shard Distribution
  • GitHub Check: Lint Backend / Run Mypy (3.13)
  • GitHub Check: Test Docker Images / Test docker images
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 5
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 1
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 4
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 3
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 2
  • GitHub Check: Run Backend Tests / LFX Tests - Python 3.10
  • GitHub Check: Run Backend Tests / Integration Tests - Python 3.10
  • GitHub Check: Test Starter Templates
  • GitHub Check: test-starter-projects
  • GitHub Check: Update Component Index
  • GitHub Check: Optimize new Python code in this PR
🔇 Additional comments (7)
src/lfx/src/lfx/base/models/watsonx_constants.py (1)

1-8: LGTM! Clean constant definition.

The WatsonX embedding model names are well-organized and follow a consistent naming convention. The constant structure aligns with similar constants used for other providers.

src/lfx/src/lfx/components/models/embedding_model.py (6)

6-8: LGTM! Imports follow existing patterns.

The new imports for Ollama and WatsonX embedding model constants are consistent with the existing OpenAI import pattern.


34-42: LGTM! Provider options properly configured.

The provider dropdown and icon metadata are correctly configured to support the three embedding providers.


65-71: LGTM! WatsonX-specific input properly defined.

The project_id input is appropriately configured as hidden by default and will be revealed when WatsonX is selected through the update_build_config logic.


119-133: LGTM! Ollama integration with appropriate fallback.

The dynamic import with fallback from langchain_ollama to langchain_community provides good compatibility. The default base URL for localhost is appropriate for typical Ollama deployments.

However, note that several parameters defined in the component (dimensions, chunk_size, max_retries, timeout, show_progress_bar) are silently ignored for Ollama. Consider documenting which parameters are supported by each provider or adding warnings when unsupported parameters are set.
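The fallback import pattern praised here might look roughly like the following. This is a sketch of the general technique, not the component's actual code; the helper function name is made up, and only the two import paths mentioned in the review are assumed.

```python
# Sketch of the dynamic import fallback: prefer the dedicated
# langchain-ollama package, fall back to langchain_community, and raise
# a clear error when neither is installed.
def get_ollama_embeddings_class():
    try:
        from langchain_ollama import OllamaEmbeddings
    except ImportError:
        try:
            from langchain_community.embeddings import OllamaEmbeddings
        except ImportError as e:
            raise ImportError(
                "Install langchain-ollama or langchain-community to use Ollama embeddings"
            ) from e
    return OllamaEmbeddings
```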


142-150: LGTM! Proper validation for required WatsonX parameters.

The validation correctly ensures both api_key and project_id are provided before attempting to create the WatsonX embeddings instance.
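The validation and parameter assembly described here could be sketched as follows. The function name is hypothetical and the default URL is taken from the review notes above; the real component passes these values directly to WatsonxEmbeddings.

```python
# Sketch of WatsonX credential validation and parameter assembly.
# Both credentials are mandatory; the URL falls back to the us-south
# endpoint mentioned in the review notes.
def validate_watsonx_params(api_key, project_id, url=None):
    if not api_key:
        raise ValueError("WatsonX API key is required")
    if not project_id:
        raise ValueError("WatsonX project_id is required")
    return {
        "apikey": api_key,
        "project_id": project_id,
        "url": url or "https://us-south.ml.cloud.ibm.com",
    }
```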


167-196: LGTM! Provider-specific configuration properly handled.

The update_build_config method correctly manages provider-specific settings:

  • Model options are updated based on the selected provider's supported models
  • API key requirements and visibility are appropriate for each provider (required for OpenAI/WatsonX, optional for Ollama)
  • WatsonX-specific project_id field is shown only for WatsonX and hidden for other providers
  • Default URLs are set appropriately for Ollama and WatsonX
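The branching summarized in these bullets might be sketched like this. Here build_config is treated as a plain dict of field dicts, which is an approximation of Langflow's real structure; model-option updates are omitted to keep the sketch self-contained, and clearing api_base in the OpenAI branch reflects the fix suggested earlier in this review.

```python
# Rough sketch of provider-specific configuration updates.
# build_config is a plain dict of field dicts here, an approximation of
# Langflow's real build-config structure.
def update_build_config(build_config, field_value):
    if field_value == "OpenAI":
        build_config["api_key"]["required"] = True
        build_config["project_id"]["show"] = False
        # Restore a valid endpoint after switching providers.
        build_config["api_base"]["value"] = ""
    elif field_value == "Ollama":
        build_config["api_key"]["required"] = False
        build_config["project_id"]["show"] = False
        build_config["api_base"]["value"] = "http://localhost:11434"
    elif field_value == "WatsonX":
        build_config["api_key"]["required"] = True
        build_config["project_id"]["show"] = True
        build_config["api_base"]["value"] = "https://us-south.ml.cloud.ibm.com"
    return build_config
```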

Comment thread on src/lfx/src/lfx/components/models/embedding_model.py (outdated)
@codecov

codecov Bot commented Oct 21, 2025

Codecov Report

❌ Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 29.99%. Comparing base (312ef9c) to head (8f13267).
⚠️ Report is 9 commits behind head on main.

Files with missing lines Patch % Lines
src/lfx/src/lfx/base/models/watsonx_constants.py 0.00% 3 Missing ⚠️

❌ Your patch status has failed because the patch coverage (0.00%) is below the target coverage (40.00%). You can increase the patch coverage or adjust the target coverage.
❌ Your project status has failed because the head coverage (39.41%) is below the target coverage (60.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main   #10356      +/-   ##
==========================================
- Coverage   29.99%   29.99%   -0.01%     
==========================================
  Files        1316     1317       +1     
  Lines       59657    59667      +10     
  Branches     8921     8923       +2     
==========================================
  Hits        17896    17896              
- Misses      40944    40952       +8     
- Partials      817      819       +2     
Flag Coverage Δ
backend 50.91% <ø> (ø)
frontend 10.02% <ø> (+<0.01%) ⬆️
lfx 39.41% <0.00%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
src/lfx/src/lfx/base/models/watsonx_constants.py 0.00% <0.00%> (ø)

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions
Contributor

github-actions Bot commented Oct 21, 2025

Frontend Unit Test Coverage Report

Coverage Summary

  • Lines: 11%
  • Statements: 10.85% (2894/26660)
  • Branches: 4.49% (916/20360)
  • Functions: 6.38% (373/5844)

Unit Test Results

  • Tests: 1207
  • Skipped: 0 💤
  • Failures: 0 ❌
  • Errors: 0 🔥
  • Time: 14.056s ⏱️


Labels

enhancement New feature or request
