
feat: Ollama and WatsonX embedding model support#10356

Merged
erichare merged 31 commits into main from feat-ollama-watsonx-embed
Oct 23, 2025

Conversation

@erichare
Collaborator

@erichare erichare commented Oct 21, 2025

This pull request extends the embedding model component to support multiple providers—specifically, Ollama and IBM WatsonX—in addition to OpenAI. It introduces dynamic configuration of provider-specific options, adds new dependencies, and updates both backend logic and starter project configuration to accommodate these enhancements.

Provider support and backend logic:

  • The EmbeddingModelComponent in embedding_model.py now supports three providers: OpenAI, Ollama, and WatsonX, with dynamic input fields and logic for each provider, including provider-specific model lists and required fields.
  • Added WatsonX model constants in watsonx_constants.py to provide available WatsonX embedding model names.
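The provider branching described above can be sketched roughly as follows. This is a simplified stand-in, not the component's actual code: every attribute name, default value, and model name here is an assumption based on this summary, and the real component resolves these values from Langflow input fields.

```python
# Simplified sketch of the provider dispatch in build_embeddings().
# Attribute names and defaults are assumptions, not the real component API.
class EmbeddingModelSketch:
    def __init__(self, provider, model, api_key=None, base_url=None, project_id=None):
        self.provider = provider
        self.model = model
        self.api_key = api_key
        self.base_url = base_url
        self.project_id = project_id

    def build_embeddings(self):
        if self.provider == "OpenAI":
            if not self.api_key:
                raise ValueError("OpenAI API key is required")
            return {"provider": "openai", "model": self.model}
        if self.provider == "Ollama":
            # Ollama runs locally, so no API key is required by default.
            return {
                "provider": "ollama",
                "model": self.model,
                "base_url": self.base_url or "http://localhost:11434",
            }
        if self.provider == "WatsonX":
            if not self.api_key or not self.project_id:
                raise ValueError("WatsonX requires an API key and a project_id")
            return {"provider": "watsonx", "model": self.model}
        raise ValueError(f"Unknown provider: {self.provider}")
```

For example, selecting Ollama without any credentials succeeds with the localhost default, while selecting WatsonX without a project_id raises an error.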

Starter project and configuration updates:

  • Updated the Nvidia Remix starter project JSON to add new dependencies (langchain_ollama, langchain_community, langchain_ibm), extend provider and model options, and include a new project_id input for WatsonX.

These changes make the embedding model component more flexible and ready for a wider range of use cases involving different embedding providers.

Summary by CodeRabbit

  • New Features
    • Added support for Ollama and WatsonX embedding providers alongside existing OpenAI support.
    • Users can now select from multiple embedding model providers with provider-specific configurations.
    • Introduced WatsonX Project ID field for secure authentication and configuration.

@coderabbitai
Contributor

coderabbitai Bot commented Oct 21, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Adds multi-provider embedding support to the EmbeddingModelComponent by extending it beyond OpenAI to include Ollama and WatsonX providers. Updates the Nvidia Remix starter project, introduces WatsonX embedding model constants, and implements provider-specific build logic with dynamic imports and configuration handling.

Changes

Cohort / File(s) Change Summary
Multi-provider Starter Project Configuration
src/backend/base/langflow/initial_setup/starter_projects/Nvidia Remix.json
Updated EmbeddingModelComponent to support three providers (OpenAI, Ollama, WatsonX) with provider-specific inputs, dynamic imports for each provider, and conditional configuration logic. Added WatsonX project_id input field and updated UI metadata including icons for new providers. Increased dependency count from 2 to 5.
WatsonX Embedding Models Constants
src/lfx/src/lfx/base/models/watsonx_constants.py
Added new constant WATSONX_EMBEDDING_MODEL_NAMES containing the available IBM Granite embedding model identifiers (125m English, 278m multilingual, 30m English, 107m multilingual, and 30m sparse variants).
Embedding Component Multi-Provider Support
src/lfx/src/lfx/components/models/embedding_model.py
Extended EmbeddingModelComponent with imports for Ollama and WatsonX constants. Expanded provider dropdown to include all three providers with icons. Added WatsonX project_id input field. Implemented provider-specific build paths: OpenAI (existing), Ollama (with fallback imports and default URL), and WatsonX (with API key and project_id validation). Enhanced update_build_config to handle provider-specific model options, visibility flags, and parameter assembly.
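The constants file described above would have roughly the following shape. The identifiers below are guesses that follow IBM's granite-embedding naming pattern for the sizes listed in the change summary; the exact strings in the repository's watsonx_constants.py may differ.

```python
# Illustrative shape of watsonx_constants.py; these identifiers are
# guesses following IBM's granite-embedding naming pattern and may not
# match the repository file exactly.
WATSONX_EMBEDDING_MODEL_NAMES = [
    "ibm/granite-embedding-125m-english",
    "ibm/granite-embedding-278m-multilingual",
    "ibm/granite-embedding-30m-english",
    "ibm/granite-embedding-107m-multilingual",
    "ibm/granite-embedding-30m-sparse",
]
```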

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Component as EmbeddingModelComponent
    participant Config as update_build_config
    participant Builders as Provider Builders

    User->>Component: Select provider (OpenAI/Ollama/WatsonX)
    alt Provider = OpenAI
        Component->>Config: Pass OpenAI config
        Config->>Builders: Use OpenAI models
    else Provider = Ollama
        Component->>Config: Pass Ollama config
        Config->>Builders: Import OllamaEmbeddings
        Builders->>Builders: Set default base URL
    else Provider = WatsonX
        Component->>Config: Pass WatsonX config + project_id
        Config->>Config: Validate API key & project_id
        Config->>Builders: Import WatsonxEmbeddings
        Builders->>Builders: Assemble model params
    end
    Builders->>Component: Return embeddings instance
    Component->>User: Embeddings ready

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

The changes involve logic branching across multiple providers with conditional import handling and configuration logic spread across three files. While the pattern is repetitive across providers, each provider path requires separate verification of import fallbacks, parameter handling, and visibility/default logic. Configuration file and constants additions are straightforward, but the embedding component changes demand careful review of the conditional logic and dynamic imports.

Suggested labels

enhancement, size:XL, lgtm

Suggested reviewers

  • ogabrielluiz
  • edwinjosechittilappilly

Pre-merge checks and finishing touches

❌ Failed checks (1 error, 3 warnings)
Test Coverage For New Implementations: ❌ Error

The PR adds significant new functionality by introducing support for two additional embedding providers (Ollama and WatsonX) in the EmbeddingModelComponent, along with new model constants in watsonx_constants.py. However, the PR does not include any new or updated test files to cover this new functionality. The existing test file test_embedding_model_component.py contains tests only for the OpenAI provider and was not modified in this PR. There are no tests for the Ollama provider branch, the WatsonX provider branch, the WatsonX constants, the dynamic import fallback mechanism for OllamaEmbeddings, error handling for missing WatsonX credentials, or the updated update_build_config() method logic for the new providers.

Resolution: To pass this check, the PR should be updated to include comprehensive test coverage for the new Ollama and WatsonX providers. This should include unit tests for both the build_embeddings() and update_build_config() methods covering each new provider, tests for the dynamic import fallback mechanism, tests for error cases (missing API keys, missing project_id for WatsonX), and optionally tests for the new WATSONX_EMBEDDING_MODEL_NAMES constant. These tests should follow the project's naming conventions and be added to the existing test file or a new test file in the same directory as other component tests.
Docstring Coverage: ⚠️ Warning

Docstring coverage is 0.00%, which is insufficient; the required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Test Quality And Coverage: ⚠️ Warning

The test file src/backend/tests/unit/components/models/test_embedding_model_component.py exists and contains four test methods, but it only covers the OpenAI provider path with tests for successful initialization, missing API key error, and unknown provider handling. The PR introduces significant new functionality—Ollama and WatsonX provider support with dynamic imports, fallback mechanisms, provider-specific configuration, and validation logic—but the test file was not modified in this PR commit (git diff returns 0 lines changed). The tests lack coverage for the critical new code paths: Ollama embeddings initialization with base URL defaults and import fallbacks (both langchain-ollama and langchain_community), WatsonX embeddings with project_id and API key validation, import error handling for missing WatsonX dependencies, and update_build_config behavior for all three providers including field visibility changes and model option updates. This represents a significant gap in test quality since the implementation introduces new error conditions and complex dynamic import logic that require verification.

Resolution: Add test coverage for the new Ollama and WatsonX providers by extending src/backend/tests/unit/components/models/test_embedding_model_component.py. Add async test methods following the existing pattern: test_build_embeddings_ollama_success, test_build_embeddings_ollama_import_fallback (testing langchain_community fallback), test_update_build_config_ollama, test_build_embeddings_watsonx_success, test_build_embeddings_watsonx_missing_project_id, test_build_embeddings_watsonx_missing_api_key, test_update_build_config_watsonx. Mock the dynamic imports and embeddings classes using @patch decorators. For Ollama, verify the base URL defaults to "http://localhost:11434" and test the import fallback chain. For WatsonX, verify project_id is required, URL defaults to "https://us-south.ml.cloud.ibm.com", and all required parameters are passed correctly. Test that update_build_config correctly updates model options, display names, required flags, and field visibility for each provider.
Test File Naming And Structure: ⚠️ Warning

The existing test file at ./src/backend/tests/unit/components/models/test_embedding_model_component.py follows proper pytest naming conventions and structure with test_.py naming and test_ prefixed functions using the repository's ComponentTestBaseWithClient pattern. However, the test file contains only 4 test functions that exclusively cover the OpenAI provider, providing no coverage for the newly added Ollama and WatsonX providers. The PR's implementation adds substantial new code paths in build_embeddings() including Ollama with import fallback logic, and WatsonX with API key and project_id validation. Additionally, the code contains a redundant project_id check at line 158-159 (acknowledged in review comments) where project_id is validated as non-empty earlier but checked again before assignment. The existing tests do not validate these new provider branches, error conditions, or the configuration update logic for Ollama and WatsonX in the update_build_config method.

Resolution: Expand the test file with additional test functions following the repository's established patterns to provide complete coverage of the new functionality. Add tests for: (1) test_update_build_config_ollama and test_update_build_config_watsonx to verify provider-specific model options, display names, and field visibility, (2) test_build_embeddings_ollama with mocking of OllamaEmbeddings to validate successful initialization with default base URL, (3) test_build_embeddings_ollama_import_fallback to verify the fallback import from langchain_community when langchain_ollama is unavailable, (4) test_build_embeddings_watsonx to validate WatsonxEmbeddings initialization with all required parameters, (5) test_build_embeddings_watsonx_missing_project_id and test_build_embeddings_watsonx_missing_api_key for error handling. Use the test patterns established in the codebase with fixtures, async functions, mocking, and assertions following the language_model test examples.
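One of the suggested cases could be sketched along these lines. This is a hypothetical test: the real test would exercise the actual EmbeddingModelComponent, which is stubbed here so the example stays self-contained, and the stub's behavior is only an assumption based on the check's description.

```python
import pytest

# Hypothetical sketch of the suggested
# test_build_embeddings_watsonx_missing_project_id. The component is
# stubbed; the real test would import EmbeddingModelComponent instead.
class EmbeddingModelComponentStub:
    def __init__(self, provider, api_key=None, project_id=None):
        self.provider = provider
        self.api_key = api_key
        self.project_id = project_id

    def build_embeddings(self):
        if self.provider == "WatsonX" and not self.project_id:
            raise ValueError("project_id is required for WatsonX")
        return object()

def test_build_embeddings_watsonx_missing_project_id():
    component = EmbeddingModelComponentStub("WatsonX", api_key="key", project_id="")
    with pytest.raises(ValueError, match="project_id"):
        component.build_embeddings()
```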
✅ Passed checks (3 passed)
Description Check: ✅ Passed

Check skipped - CodeRabbit’s high-level summary is enabled.

Title Check: ✅ Passed

The pull request title "feat: Ollama and WatsonX embedding model support" directly and clearly reflects the primary change in the changeset. The title identifies the two new embedding providers (Ollama and WatsonX) being added to the system, which is the core purpose of this PR as evidenced by the changes to EmbeddingModelComponent, the addition of WatsonX constants, and updates to the starter project configuration. The title is concise, specific, and avoids vague language or noise. A developer scanning the commit history would immediately understand that this PR adds support for these two new embedding model providers.

Excessive Mock Usage Warning: ✅ Passed

The test file demonstrates minimal and appropriate mock usage that does not indicate poor test design. Out of 4 test functions, 3 (75%) test real logic without any mocks, including critical error-handling paths. The single mock present is correctly applied only to an external dependency (OpenAIEmbeddings from LangChain), not to the component's core business logic. The mock is used properly to verify correct parameter passing to the external library. This represents good test design where mocks are appropriately scoped to external dependencies rather than mocking implementation details or core logic. The mock import and usage patterns align with best practices for unit testing.


@github-actions github-actions Bot added the enhancement New feature or request label Oct 21, 2025
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/backend/base/langflow/initial_setup/starter_projects/Nvidia Remix.json (1)

1955-1972: Reset OpenAI base URL when switching providers

When the user switches from Ollama/WatsonX back to OpenAI, we keep the previous provider’s api_base value (http://localhost:11434, IBM URL, etc.). The OpenAI client expects this field to be blank (it defaults to https://api.openai.com/v1) unless the user explicitly overrides it. Leaving the old value in place routes OpenAI calls to the wrong host and fails the flow. (reference.langchain.com)

Please clear api_base["value"] (or set it to the OpenAI default) inside the OpenAI branch so we restore a valid endpoint after provider changes.

             if field_value == "OpenAI":
                 build_config["model"]["options"] = OPENAI_EMBEDDING_MODEL_NAMES
                 build_config["model"]["value"] = OPENAI_EMBEDDING_MODEL_NAMES[0]
                 build_config["api_key"]["display_name"] = "OpenAI API Key"
                 build_config["api_key"]["required"] = True
                 build_config["api_key"]["show"] = True
                 build_config["api_base"]["display_name"] = "OpenAI API Base URL"
+                build_config["api_base"]["value"] = ""
                 build_config["project_id"]["show"] = False
🧹 Nitpick comments (1)
src/lfx/src/lfx/components/models/embedding_model.py (1)

135-161: Consider parameter consistency across providers.

Similar to Ollama, the WatsonX provider only uses a subset of the available parameters (model_id, url, apikey, project_id) while ignoring others like dimensions, chunk_size, max_retries, timeout, and show_progress_bar. This could confuse users who set these parameters expecting them to work.

Consider one of the following approaches:

  1. Document in the component description or input info which parameters are supported by which providers
  2. Add a validation warning when unsupported parameters are set for the selected provider
  3. Make certain inputs conditionally visible based on the selected provider (similar to how project_id is handled)
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 743a8d8 and 444d922.

📒 Files selected for processing (3)
  • src/backend/base/langflow/initial_setup/starter_projects/Nvidia Remix.json (5 hunks)
  • src/lfx/src/lfx/base/models/watsonx_constants.py (1 hunks)
  • src/lfx/src/lfx/components/models/embedding_model.py (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/lfx/src/lfx/components/models/embedding_model.py (1)
src/lfx/src/lfx/inputs/inputs.py (1)
  • MessageTextInput (206-257)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: Lint Backend / Run Mypy (3.10)
  • GitHub Check: Lint Backend / Run Mypy (3.12)
  • GitHub Check: Lint Backend / Run Mypy (3.11)
  • GitHub Check: Run Frontend Tests / Determine Test Suites and Shard Distribution
  • GitHub Check: Lint Backend / Run Mypy (3.13)
  • GitHub Check: Test Docker Images / Test docker images
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 5
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 1
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 4
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 3
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 2
  • GitHub Check: Run Backend Tests / LFX Tests - Python 3.10
  • GitHub Check: Run Backend Tests / Integration Tests - Python 3.10
  • GitHub Check: Test Starter Templates
  • GitHub Check: test-starter-projects
  • GitHub Check: Update Component Index
  • GitHub Check: Optimize new Python code in this PR
🔇 Additional comments (7)
src/lfx/src/lfx/base/models/watsonx_constants.py (1)

1-8: LGTM! Clean constant definition.

The WatsonX embedding model names are well-organized and follow a consistent naming convention. The constant structure aligns with similar constants used for other providers.

src/lfx/src/lfx/components/models/embedding_model.py (6)

6-8: LGTM! Imports follow existing patterns.

The new imports for Ollama and WatsonX embedding model constants are consistent with the existing OpenAI import pattern.


34-42: LGTM! Provider options properly configured.

The provider dropdown and icon metadata are correctly configured to support the three embedding providers.


65-71: LGTM! WatsonX-specific input properly defined.

The project_id input is appropriately configured as hidden by default and will be revealed when WatsonX is selected through the update_build_config logic.


119-133: LGTM! Ollama integration with appropriate fallback.

The dynamic import with fallback from langchain_ollama to langchain_community provides good compatibility. The default base URL for localhost is appropriate for typical Ollama deployments.

However, note that several parameters defined in the component (dimensions, chunk_size, max_retries, timeout, show_progress_bar) are silently ignored for Ollama. Consider documenting which parameters are supported by each provider or adding warnings when unsupported parameters are set.
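The fallback import pattern praised here might look roughly like the following. This is a sketch of the general technique, not the component's actual code; the helper function name is made up, and only the two import paths mentioned in the review are assumed.

```python
# Sketch of the dynamic import fallback: prefer the dedicated
# langchain-ollama package, fall back to langchain_community, and raise
# a clear error when neither is installed.
def get_ollama_embeddings_class():
    try:
        from langchain_ollama import OllamaEmbeddings
    except ImportError:
        try:
            from langchain_community.embeddings import OllamaEmbeddings
        except ImportError as e:
            raise ImportError(
                "Install langchain-ollama or langchain-community to use Ollama embeddings"
            ) from e
    return OllamaEmbeddings
```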


142-150: LGTM! Proper validation for required WatsonX parameters.

The validation correctly ensures both api_key and project_id are provided before attempting to create the WatsonX embeddings instance.
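The validation and parameter assembly described here could be sketched as follows. The function name is hypothetical and the default URL is taken from the review notes above; the real component passes these values directly to WatsonxEmbeddings.

```python
# Sketch of WatsonX credential validation and parameter assembly.
# Both credentials are mandatory; the URL falls back to the us-south
# endpoint mentioned in the review notes.
def validate_watsonx_params(api_key, project_id, url=None):
    if not api_key:
        raise ValueError("WatsonX API key is required")
    if not project_id:
        raise ValueError("WatsonX project_id is required")
    return {
        "apikey": api_key,
        "project_id": project_id,
        "url": url or "https://us-south.ml.cloud.ibm.com",
    }
```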


167-196: LGTM! Provider-specific configuration properly handled.

The update_build_config method correctly manages provider-specific settings:

  • Model options are updated based on the selected provider's supported models
  • API key requirements and visibility are appropriate for each provider (required for OpenAI/WatsonX, optional for Ollama)
  • WatsonX-specific project_id field is shown only for WatsonX and hidden for other providers
  • Default URLs are set appropriately for Ollama and WatsonX
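The branching summarized in these bullets might be sketched like this. Here build_config is treated as a plain dict of field dicts, which is an approximation of Langflow's real structure; model-option updates are omitted to keep the sketch self-contained, and clearing api_base in the OpenAI branch reflects the fix suggested earlier in this review.

```python
# Rough sketch of provider-specific configuration updates.
# build_config is a plain dict of field dicts here, an approximation of
# Langflow's real build-config structure.
def update_build_config(build_config, field_value):
    if field_value == "OpenAI":
        build_config["api_key"]["required"] = True
        build_config["project_id"]["show"] = False
        # Restore a valid endpoint after switching providers.
        build_config["api_base"]["value"] = ""
    elif field_value == "Ollama":
        build_config["api_key"]["required"] = False
        build_config["project_id"]["show"] = False
        build_config["api_base"]["value"] = "http://localhost:11434"
    elif field_value == "WatsonX":
        build_config["api_key"]["required"] = True
        build_config["project_id"]["show"] = True
        build_config["api_base"]["value"] = "https://us-south.ml.cloud.ibm.com"
    return build_config
```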

Comment thread on src/lfx/src/lfx/components/models/embedding_model.py (outdated)
@codecov

codecov Bot commented Oct 21, 2025

Codecov Report

❌ Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 29.99%. Comparing base (312ef9c) to head (8f13267).
⚠️ Report is 9 commits behind head on main.

Files with missing lines Patch % Lines
src/lfx/src/lfx/base/models/watsonx_constants.py 0.00% 3 Missing ⚠️

❌ Your patch status has failed because the patch coverage (0.00%) is below the target coverage (40.00%). You can increase the patch coverage or adjust the target coverage.
❌ Your project status has failed because the head coverage (39.41%) is below the target coverage (60.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main   #10356      +/-   ##
==========================================
- Coverage   29.99%   29.99%   -0.01%     
==========================================
  Files        1316     1317       +1     
  Lines       59657    59667      +10     
  Branches     8921     8923       +2     
==========================================
  Hits        17896    17896              
- Misses      40944    40952       +8     
- Partials      817      819       +2     
Flag Coverage Δ
backend 50.91% <ø> (ø)
frontend 10.02% <ø> (+<0.01%) ⬆️
lfx 39.41% <0.00%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
src/lfx/src/lfx/base/models/watsonx_constants.py 0.00% <0.00%> (ø)

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions
Contributor

github-actions Bot commented Oct 21, 2025

Frontend Unit Test Coverage Report

Coverage Summary

  • Lines: 11%
  • Statements: 10.85% (2894/26660)
  • Branches: 4.49% (916/20360)
  • Functions: 6.38% (373/5844)

Unit Test Results

  • Tests: 1207
  • Skipped: 0 💤
  • Failures: 0 ❌
  • Errors: 0 🔥
  • Time: 14.056s ⏱️


Labels

enhancement New feature or request
