feat: Get live provider and model data from model.dev by deon-sanchez · Pull Request #11007 · langflow-ai/langflow

deon-sanchez · 2025-12-12T22:20:35Z

This pull request introduces improvements to model selection flexibility and configuration, as well as an environment variable for enabling live model data fetching. The most significant changes include allowing multiple model types to be filtered in the API, updating starter project components to specify model type, and adding a new environment variable for live model data.

API and Model Selection Enhancements:

The list_models API endpoint now accepts multiple model types (such as llms, embeddings, audio, video) via a repeated query parameter, allowing for more flexible model filtering.
In the Basic Prompt Chaining.json starter project, the ModelInput for the language model component now explicitly sets model_type=["llm"], ensuring that only language models are selectable in the UI.
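
As a rough illustration of the repeated query parameter described above, the sketch below encodes and decodes several model types using only the Python standard library. The route `/api/v1/models`, the host and port, and the parameter name `model_type` are assumptions made for the example; the PR description does not spell out the exact URL shape.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical route and parameter name, used only for illustration.
BASE_URL = "http://localhost:7860/api/v1/models"


def build_models_url(model_types: list) -> str:
    """Encode each model type as its own `model_type=...` pair."""
    query = urlencode([("model_type", t) for t in model_types])
    return f"{BASE_URL}?{query}" if query else BASE_URL


def read_model_types(url: str) -> list:
    """Recover the repeated values the way a server-side parser would."""
    return parse_qs(urlsplit(url).query).get("model_type", [])


url = build_models_url(["llm", "embeddings"])
print(url)
print(read_model_types(url))
```

Repeating the key, rather than sending one comma-separated value, lets standard query-string parsers return a list without custom splitting.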

Configuration and Environment:

A new environment variable, LFX_USE_LIVE_MODEL_DATA, is added to .env.example to optionally enable fetching live model data from the models.dev API instead of using static constants. This provides more up-to-date model options if enabled.

coderabbitai · 2025-12-12T22:20:52Z

Walkthrough

This PR adds live model data fetching from the models.dev API with caching support, controlled by the LFX_USE_LIVE_MODEL_DATA environment variable. It expands the model metadata type system with cost, limits, and modalities structures, introduces a new models_dev_client module with cache management and API integration, and updates frontend provider icon mappings. The system maintains backward-compatible static defaults with fallback behavior when live data is unavailable.

Changes

Cohort / File(s)	Change Summary
Environment Configuration `.env.example`	Added `LFX_USE_LIVE_MODEL_DATA` environment variable with documentation describing purpose (fetch model data from models.dev API when enabled), supported providers, fallback behavior, and default value; minor formatting alignment in existing comments.
Frontend Provider Mappings `src/frontend/src/controllers/API/queries/models/use-get-model-providers.ts`	Updated provider-to-icon mappings with renamed entries (e.g., "Google Generative AI" → "GoogleGenerativeAI"), new provider entries (e.g., "Google Vertex AI", "Ollama Cloud", "IBM Watsonx", "SambaNova", "Together AI", "Fireworks AI", "DeepSeek", "xAI", "Alibaba", "Cerebras", Azure/AWS variants), and preserved default "Bot" fallback.
Frontend Modal Styling `src/frontend/src/modals/modelProviderModal/index.tsx`	Added `overflow-y-auto` to left panel container to enable vertical scrolling on content overflow.
Backend Model Metadata Types `src/lfx/src/lfx/base/models/model_metadata.py`	Introduced three new TypedDicts (`ModelCost`, `ModelLimits`, `ModelModalities`) for structured pricing, token limits, and input/output modalities; extended `ModelMetadata` with 16 new optional fields (provider_id, display_name, structured_output, temperature, attachment, open_weights, cost, limits, modalities, knowledge_cutoff, release_date, last_updated, api_base, env_vars, documentation_url, model_type); updated `create_model_metadata()` to accept and conditionally include these fields.
Backend Models API Client (New) `src/lfx/src/lfx/base/models/models_dev_client.py`	New module providing models.dev API integration with 30-second HTTP timeout, local disk caching (1-hour TTL at `~/.cache/langflow/.models_dev_cache.json`), error fallback to stale cache, data transformation via `transform_api_model_to_metadata()`, and public APIs: `fetch_models_dev_data()`, `get_live_models_detailed()`, `get_models_by_provider()`, `search_models()`, `get_provider_metadata_from_api()`, `clear_cache()`.
Backend Model Package API `src/lfx/src/lfx/base/models/__init__.py`	Expanded public exports to include model metadata types (`ModelCost`, `ModelLimits`, `ModelMetadata`, `ModelModalities`, `create_model_metadata`), live-model utilities (`clear_cache`, `fetch_models_dev_data`, `get_live_models_detailed`, `get_models_by_provider`, `get_provider_metadata_from_api`, `search_models`), and `refresh_live_model_data`; added grouping comments for organized API surface.
Backend Unified Models `src/lfx/src/lfx/base/models/unified_models.py`	Introduced `USE_LIVE_MODEL_DATA` toggle (environment-driven), added `get_static_model_provider_metadata()`, `get_live_models_as_groups()`, and `refresh_live_model_data()` functions; updated `get_models_detailed()` to conditionally fetch live data from API with fallback to static defaults; imported cache/live-fetch utilities from `models_dev_client`.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client Code
    participant UM as unified_models
    participant MDC as models_dev_client
    participant Cache as Local Cache File
    participant API as models.dev API
    
    Client->>UM: get_models_detailed()
    alt USE_LIVE_MODEL_DATA enabled
        UM->>MDC: get_live_models_detailed()
        MDC->>Cache: _load_cache()
        alt Cache valid (within 1h TTL)
            Cache-->>MDC: cached data
        else Cache missing or expired
            MDC->>API: fetch https://models.dev/api.json
            API-->>MDC: provider & model data
            MDC->>Cache: _save_cache(data)
        end
        loop For each provider & model
            MDC->>MDC: transform_api_model_to_metadata()
            Note over MDC: Map provider IDs,<br/>determine model type,<br/>structure cost/limits
        end
        MDC-->>UM: list[ModelMetadata]
    else Live data unavailable
        UM->>UM: get_static_models_detailed()
        Note over UM: Fallback to static<br/>model definitions
    end
    UM-->>Client: list[ModelMetadata]
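
The cache branch of the diagram (fresh cache, otherwise fetch and save, otherwise fall back to stale data) can be sketched roughly as follows. This is not the code in `models_dev_client.py`: the function names, the on-disk layout (`timestamp` plus `data`), and the injected `fetch` callable are all assumptions chosen so the example runs without network access. Only the 1-hour TTL comes from the walkthrough.

```python
import json
import time
from pathlib import Path
from typing import Any, Callable, Optional

CACHE_TTL_SECONDS = 3600  # 1-hour TTL, as stated in the walkthrough


def load_cache(path: Path, *, allow_stale: bool = False) -> Optional[dict]:
    """Return cached data, or None if it is missing, unreadable, or expired."""
    try:
        payload = json.loads(path.read_text())
    except (OSError, ValueError):
        return None
    if not isinstance(payload, dict) or "data" not in payload:
        return None
    age = time.time() - payload.get("timestamp", 0)
    if allow_stale or age < CACHE_TTL_SECONDS:
        return payload["data"]
    return None


def save_cache(path: Path, data: dict) -> None:
    """Write the data together with the time it was fetched."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"timestamp": time.time(), "data": data}))


def get_data(path: Path, fetch: Callable[[], Any]) -> dict:
    """Prefer a fresh cache, then a live fetch, then stale cache, then {}."""
    cached = load_cache(path)
    if cached is not None:
        return cached
    try:
        data = fetch()  # hypothetical stand-in for the HTTP request
    except (OSError, ValueError):
        stale = load_cache(path, allow_stale=True)
        return stale if stale is not None else {}
    save_cache(path, data)
    return data
```

Passing the fetcher in as a callable keeps the TTL and fallback logic testable on its own, which is relevant given the missing test coverage flagged later in this review.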

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

models_dev_client.py: New 250+ line module with API integration, caching logic, data transformation, and multiple public/internal functions—requires verification of error handling, cache TTL logic, API contract mapping, and provider/model type determination.
model_metadata.py: Introduces three new TypedDicts and significantly extends ModelMetadata structure with 16 optional fields—requires careful review of type annotations, field initialization logic, and backward compatibility.
unified_models.py: Adds conditional live/static resolution with caching imports and new public functions—requires verification of environment variable handling, fallback paths, and interaction with models_dev_client.
Frontend mappings (use-get-model-providers.ts): Provider name/icon mappings are numerous and should be cross-checked for accuracy and consistency with backend provider IDs.

Possibly related PRs

refactor: Update AgentComponent to utilize MODEL_OPTIONS_METADATA from constants #9969: Modifies model provider metadata handling and provider-to-icon/name mappings in the same domain; closely related to this PR's provider metadata expansion and icon mapping updates.

Suggested labels

enhancement, size:L

Suggested reviewers

jordanrfrazier
edwinjosechittilappilly
lucaseduoli

Pre-merge checks and finishing touches

Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Test Coverage For New Implementations	❌ Error	PR introduces significant new backend functionality and frontend changes without corresponding test coverage for API caching, error handling, live data resolution, and UI updates.	Add test files for models_dev_client.py, model_metadata.py, update unified_models.py tests, and add frontend tests for provider icons and scrolling features.
Test Quality And Coverage	⚠️ Warning	New backend functionality in models_dev_client.py and unified_models.py lacks test coverage despite extensive testing infrastructure.	Create comprehensive pytest tests for models_dev_client.py, model_metadata.py, and unified_models.py covering API interactions, caching, error handling, and frontend TypeScript tests for use-get-model-providers.ts.
Test File Naming And Structure	❓ Inconclusive	PR adds significant backend functionality (models_dev_client.py with caching and API logic) and frontend changes, but no test files are included in the modified files list.	Verify whether test files exist in separate locations, confirm the project's testing conventions and directory structure, and determine if new tests should accompany the substantial backend changes.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'feat: Get live provider and model data from model.dev' accurately describes the main change: integrating live data fetching from models.dev API for providers and models throughout the codebase.
Docstring Coverage	✅ Passed	Docstring coverage is 95.65% which is sufficient. The required threshold is 80.00%.
Excessive Mock Usage Warning	✅ Passed	The custom check for excessive mock usage is not applicable to this pull request. The PR introduces new production code modules but does not include any test files containing mock objects.

github-actions · 2025-12-12T22:22:31Z

Frontend Unit Test Coverage Report

Coverage Summary

Lines	Statements	Branches	Functions
	16.63% (4717/28350)	9.99% (2201/22012)	10.96% (682/6220)

Unit Test Results

Tests	Skipped	Failures	Errors	Time
1830	0 💤	0 ❌	0 🔥	23.68s ⏱️

codecov · 2025-12-12T22:23:47Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 32.77%. Comparing base (9ce7d84) to head (9f5561f).
⚠️ Report is 232 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #11007      +/-   ##
==========================================
- Coverage   33.23%   32.77%   -0.46%     
==========================================
  Files        1394     1396       +2     
  Lines       66068    66504     +436     
  Branches     9778     9883     +105     
==========================================
- Hits        21956    21795     -161     
- Misses      42986    43563     +577     
- Partials     1126     1146      +20

Flag	Coverage Δ
backend	`51.06% <ø> (-1.38%)`	⬇️
frontend	`15.30% <ø> (-0.08%)`	⬇️
lfx	`39.32% <ø> (-0.17%)`	⬇️

Files with missing lines	Coverage Δ
src/backend/base/langflow/api/v1/models.py	`29.61% <ø> (ø)`
...nents/common/modelProviderCountComponent/index.tsx	`95.00% <ø> (ø)`
...Component/components/modelInputComponent/index.tsx	`57.64% <ø> (-2.85%)`	⬇️
src/frontend/src/constants/providerConstants.ts	`0.00% <ø> (ø)`
...lers/API/queries/models/use-get-model-providers.ts	`82.60% <ø> (-2.01%)`	⬇️
...ueries/models/use-get-provider-variable-mapping.ts	`0.00% <ø> (ø)`
...odelProviderModal/components/ModelProviderEdit.tsx	`94.11% <ø> (ø)`
...s/modelProviderModal/components/ModelSelection.tsx	`80.00% <ø> (-2.86%)`	⬇️
...als/modelProviderModal/components/ProviderList.tsx	`81.48% <ø> (ø)`
...c/frontend/src/modals/modelProviderModal/index.tsx	`0.00% <ø> (ø)`
... and 5 more

... and 38 files with indirect coverage changes

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (4)

src/lfx/src/lfx/base/models/unified_models.py (2)
100-113: Consider narrowing the exception type (static analysis warning).

The blind Exception catch at line 109 triggers Ruff BLE001. While the fallback behavior is correct for resilience, consider either:

Catching more specific exceptions (e.g., OSError, ValueError, TimeoutError)

Adding # noqa: BLE001 with a comment explaining why broad catching is intentional here
-    except Exception as e:
+    except Exception as e:  # noqa: BLE001 - Intentional broad catch for graceful fallback
         logger.debug(f"Failed to get live provider metadata: {e}")
133-155: Same BLE001 warning applies here.

Line 153 has the same blind exception catch. Add # noqa: BLE001 with justification for consistency.
-    except Exception as e:
+    except Exception as e:  # noqa: BLE001 - Intentional broad catch for graceful fallback
         logger.debug(f"Failed to fetch live models: {e}")
         return []
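
As a sketch of the first option (catching specific exceptions instead of adding `# noqa`), the helper below falls back only on an explicit set of error types. The helper name and the choice of exception types are assumptions for illustration; the real call sites may raise other errors, for example from an HTTP client library.

```python
import logging

logger = logging.getLogger(__name__)

# Assumed failure modes for a network-plus-JSON fetch. TimeoutError is already
# a subclass of OSError and is listed only to make the intent explicit.
RECOVERABLE_ERRORS = (OSError, ValueError, TimeoutError)


def fetch_with_fallback(fetch, fallback):
    """Call `fetch`; on a recoverable error, log it and return `fallback`."""
    try:
        result = fetch()
    except RECOVERABLE_ERRORS as exc:
        logger.debug("Failed to fetch live models: %s", exc)
        return fallback
    else:
        return result
```

Unlike a blanket `except Exception`, programming errors such as `ZeroDivisionError` or `AttributeError` still propagate, so genuine bugs are not hidden behind the fallback.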
src/lfx/src/lfx/base/models/models_dev_client.py (2)
174-189: Use idiomatic string containment check.

Line 186 uses .find() != -1 which is not idiomatic Python. Use the in operator instead.

Apply this diff:
-    if model_data.get("id", "").lower().find("embed") != -1:
+    if "embed" in model_data.get("id", "").lower():
191-227: Consider omitting None values from TypedDict construction.

The transform functions explicitly include optional fields with None values (e.g., lines 199-203). Since these TypedDicts have total=False, it's cleaner to omit None values rather than include them, which also prevents potential serialization issues.

Example for _transform_cost:
 def _transform_cost(cost_data: dict[str, Any] | None) -> ModelCost | None:
     """Transform API cost data to ModelCost format."""
     if not cost_data:
         return None
 
-    return ModelCost(
-        input=cost_data.get("input", 0),
-        output=cost_data.get("output", 0),
-        reasoning=cost_data.get("reasoning"),
-        cache_read=cost_data.get("cache_read"),
-        cache_write=cost_data.get("cache_write"),
-        input_audio=cost_data.get("input_audio"),
-        output_audio=cost_data.get("output_audio"),
-    )
+    result = ModelCost(
+        input=cost_data.get("input", 0),
+        output=cost_data.get("output", 0),
+    )
+    # Only add optional fields that are present and not None
+    for key in ["reasoning", "cache_read", "cache_write", "input_audio", "output_audio"]:
+        if cost_data.get(key) is not None:
+            result[key] = cost_data[key]
+    return result
Apply similar logic to _transform_limits and _transform_modalities.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a54e508 and 841b365.

📒 Files selected for processing (7)

.env.example (2 hunks)
src/frontend/src/controllers/API/queries/models/use-get-model-providers.ts (1 hunks)
src/frontend/src/modals/modelProviderModal/index.tsx (1 hunks)
src/lfx/src/lfx/base/models/__init__.py (1 hunks)
src/lfx/src/lfx/base/models/model_metadata.py (1 hunks)
src/lfx/src/lfx/base/models/models_dev_client.py (1 hunks)
src/lfx/src/lfx/base/models/unified_models.py (5 hunks)

🧰 Additional context used

📓 Path-based instructions (3)

src/frontend/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/frontend_development.mdc)

src/frontend/src/**/*.{ts,tsx}: Use React 18 with TypeScript for frontend development
Use Zustand for state management

Files:

src/frontend/src/modals/modelProviderModal/index.tsx
src/frontend/src/controllers/API/queries/models/use-get-model-providers.ts

src/frontend/src/**/*.{tsx,jsx,css,scss}

📄 CodeRabbit inference engine (.cursor/rules/frontend_development.mdc)

Use Tailwind CSS for styling

Files:

src/frontend/src/modals/modelProviderModal/index.tsx

src/frontend/src/**/*.{tsx,jsx}

📄 CodeRabbit inference engine (.cursor/rules/frontend_development.mdc)

src/frontend/src/**/*.{tsx,jsx}: Implement dark mode support using the useDarkMode hook and dark store
Use Lucide React for icon components in the application

Files:

src/frontend/src/modals/modelProviderModal/index.tsx

🧠 Learnings (5)

📚 Learning: 2025-07-11T22:12:46.255Z

Learnt from: namastex888
Repo: langflow-ai/langflow PR: 9018
File: src/frontend/src/modals/apiModal/codeTabs/code-tabs.tsx:244-244
Timestamp: 2025-07-11T22:12:46.255Z
Learning: In src/frontend/src/modals/apiModal/codeTabs/code-tabs.tsx, the inconsistent showLineNumbers setting between Step 1 (false) and Step 2 (true) in the API modal is intentional to prevent breaking the modal height. Step 1 uses showLineNumbers={false} to save vertical space while Step 2 uses showLineNumbers={true} for better readability of longer code.

Applied to files:

src/frontend/src/modals/modelProviderModal/index.tsx

📚 Learning: 2025-07-23T21:19:22.567Z

Learnt from: deon-sanchez
Repo: langflow-ai/langflow PR: 9158
File: src/backend/base/langflow/api/v1/mcp_projects.py:404-404
Timestamp: 2025-07-23T21:19:22.567Z
Learning: In langflow MCP projects configuration, prefer using dynamically computed URLs (like the `sse_url` variable) over hardcoded localhost URLs to ensure compatibility across different deployment environments.

Applied to files:

.env.example

📚 Learning: 2025-11-24T19:46:57.920Z

Learnt from: CR
Repo: langflow-ai/langflow PR: 0
File: .cursor/rules/icons.mdc:0-0
Timestamp: 2025-11-24T19:46:57.920Z
Learning: Applies to src/frontend/src/icons/lazyIconImports.ts : Add icon entries to the `lazyIconsMapping` object in `src/frontend/src/icons/lazyIconImports.ts`. The key must match the backend icon name exactly (case-sensitive) and use dynamic imports: `IconName: () => import("@/icons/IconName").then((mod) => ({ default: mod.IconNameIcon }))`.

Applied to files:

src/frontend/src/controllers/API/queries/models/use-get-model-providers.ts

📚 Learning: 2025-11-24T19:46:57.920Z

Learnt from: CR
Repo: langflow-ai/langflow PR: 0
File: .cursor/rules/icons.mdc:0-0
Timestamp: 2025-11-24T19:46:57.920Z
Learning: Use clear, recognizable icon names (e.g., `"AstraDB"`, `"Postgres"`, `"OpenAI"`). Always use the same icon name for the same service across backend and frontend.

Applied to files:

src/frontend/src/controllers/API/queries/models/use-get-model-providers.ts

📚 Learning: 2025-06-23T12:46:52.420Z

Learnt from: CR
Repo: langflow-ai/langflow PR: 0
File: .cursor/rules/icons.mdc:0-0
Timestamp: 2025-06-23T12:46:52.420Z
Learning: The frontend icon mapping key (in 'lazyIconsMapping') must match the backend 'icon' attribute string exactly, including case sensitivity, to ensure correct icon rendering.

Applied to files:

src/frontend/src/controllers/API/queries/models/use-get-model-providers.ts

🧬 Code graph analysis (2)

src/lfx/src/lfx/base/models/models_dev_client.py (1)

src/lfx/src/lfx/base/models/model_metadata.py (4)

ModelCost (4-13)

ModelLimits (16-20)

ModelMetadata (30-69)

ModelModalities (23-27)

src/lfx/src/lfx/base/models/unified_models.py (1)

src/lfx/src/lfx/base/models/models_dev_client.py (3)

clear_cache (386-394)

get_live_models_detailed (307-350)

get_provider_metadata_from_api (354-383)

🪛 GitHub Actions: Ruff Style Check

src/lfx/src/lfx/base/models/__init__.py

[error] 20-20: RUF022 __all__ is not sorted.

🪛 GitHub Check: Ruff Style Check (3.13)

src/lfx/src/lfx/base/models/models_dev_client.py

[failure] 169-169: Ruff (BLE001)
src/lfx/src/lfx/base/models/models_dev_client.py:169:12: BLE001 Do not catch blind exception: Exception

[failure] 159-159: Ruff (TRY300)
src/lfx/src/lfx/base/models/models_dev_client.py:159:9: TRY300 Consider moving this statement to an else block

src/lfx/src/lfx/base/models/unified_models.py

[failure] 109-109: Ruff (BLE001)
src/lfx/src/lfx/base/models/unified_models.py:109:12: BLE001 Do not catch blind exception: Exception

[failure] 189-189: Ruff (PLW0602)
src/lfx/src/lfx/base/models/unified_models.py:189:12: PLW0602 Using global for MODELS_DETAILED but no assignment is done

[failure] 153-153: Ruff (BLE001)
src/lfx/src/lfx/base/models/unified_models.py:153:12: BLE001 Do not catch blind exception: Exception
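
For readers unfamiliar with the two non-BLE001 rules above, here is a small sketch of code shapes that satisfy them. The names (`_CACHE`, `parse_models`, and so on) are hypothetical and unrelated to the actual module contents.

```python
import json
from typing import Any

_CACHE: list = []  # hypothetical module-level state


def parse_models(text: str) -> Any:
    """TRY300: keep only the risky call in `try` and return from `else`."""
    try:
        data = json.loads(text)
    except ValueError:
        return None
    else:
        return data


def remember(model: dict) -> None:
    """PLW0602: mutating a module-level object in place needs no `global`."""
    _CACHE.append(model)


def reset() -> None:
    """`global` is only needed when the name itself is rebound."""
    global _CACHE
    _CACHE = []


def cached() -> list:
    """Return a copy of the current cache contents."""
    return list(_CACHE)
```

In short, PLW0602 fires when a function declares `global X` but never assigns to `X`; deleting the declaration is usually the whole fix.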

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)

GitHub Check: Test Docker Images / Test docker images
GitHub Check: Run Frontend Tests / Determine Test Suites and Shard Distribution
GitHub Check: Run Frontend Unit Tests / Frontend Jest Unit Tests
GitHub Check: Test Starter Templates
GitHub Check: Update Component Index
GitHub Check: Run Ruff Check and Format
GitHub Check: Update Starter Projects

🔇 Additional comments (13)

src/frontend/src/modals/modelProviderModal/index.tsx (1)

255-255: LGTM! Enables scrolling for overflowing provider list.

The addition of overflow-y-auto correctly handles scenarios where the provider list exceeds the fixed height of 513px, allowing users to scroll through all available providers.

.env.example (1)

143-148: Documentation looks good.

The environment variable documentation is clear and follows the established pattern in this file. It properly documents the purpose, allowed values, default, and fallback behavior.

src/lfx/src/lfx/base/models/model_metadata.py (2)

4-28: Well-structured type definitions.

The new ModelCost, ModelLimits, and ModelModalities TypedDicts are well-designed with total=False for optional fields and clear inline documentation of units and purposes.

72-144: Clean implementation of the factory function.

The conditional assignment pattern (only adding fields when not None) is a good approach to keep the metadata dictionaries lean. The function signature is getting long but remains manageable for a configuration factory.
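
The conditional assignment pattern mentioned here can be sketched as below. The type and function names (`SketchCost`, `SketchMetadata`, `make_metadata`) and the reduced field set are hypothetical; they stand in for `ModelCost`, `ModelMetadata`, and `create_model_metadata()`, whose full definitions are not reproduced in this thread.

```python
from typing import Optional, TypedDict


class SketchCost(TypedDict, total=False):
    """Hypothetical stand-in for a pricing structure."""

    input: float
    output: float


class SketchMetadata(TypedDict, total=False):
    """Hypothetical stand-in; total=False makes every key optional."""

    provider: str
    name: str
    display_name: str
    cost: SketchCost
    knowledge_cutoff: str


def make_metadata(
    provider: str,
    name: str,
    *,
    display_name: Optional[str] = None,
    cost: Optional[SketchCost] = None,
    knowledge_cutoff: Optional[str] = None,
) -> SketchMetadata:
    """Build metadata, adding optional keys only when a value was given."""
    metadata: SketchMetadata = {"provider": provider, "name": name}
    if display_name is not None:
        metadata["display_name"] = display_name
    if cost is not None:
        metadata["cost"] = cost
    if knowledge_cutoff is not None:
        metadata["knowledge_cutoff"] = knowledge_cutoff
    return metadata
```

Because absent keys are left out rather than set to None, existing consumers that only read `provider` and `name` see the same dictionaries as before.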

src/lfx/src/lfx/base/models/unified_models.py (3)

33-35: Environment variable parsing looks good.

The pattern of parsing the env var with a lowercase comparison and default to "false" is correct and defensive.
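
A minimal sketch of that parsing pattern follows. The function name and the decision to accept only the string "true" (case-insensitively) are assumptions; the thread confirms the variable name, the lowercase comparison, and the "false" default, but not the full set of accepted values.

```python
import os
from typing import Mapping, Optional


def use_live_model_data(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when LFX_USE_LIVE_MODEL_DATA is set to "true"."""
    env = os.environ if environ is None else environ
    # Unset falls back to "false"; the comparison ignores letter case.
    return env.get("LFX_USE_LIVE_MODEL_DATA", "false").lower() == "true"


print(use_live_model_data({"LFX_USE_LIVE_MODEL_DATA": "True"}))
print(use_live_model_data({}))
```

Under this assumption, values such as "1" or "yes" leave the feature disabled, which keeps the static model list as the default behaviour.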

158-175: Good fallback pattern with appropriate logging.

The live-to-static fallback with a warning log is a solid resilience pattern. The function cleanly encapsulates the toggle behavior.

181-200: > Likely an incorrect or invalid review comment.

src/frontend/src/controllers/API/queries/models/use-get-model-providers.ts (1)

78-100: Verify icon keys match lazyIconImports.ts before merging.

The provider-to-icon mappings reference several custom icons (GoogleGenerativeAI, VertexAI, Mistral, DeepSeek, xAI, WatsonxAI, SambaNova, AWS, Azure, NVIDIA, Ollama) that must be registered in src/frontend/src/icons/lazyIconImports.ts. Icon keys are case-sensitive and must match exactly. Ensure each referenced icon key has a corresponding dynamic import entry in lazyIconImports.ts. The "Bot" fallback is acceptable for unsupported providers (Together AI, Fireworks AI, Alibaba, Cerebras).

src/lfx/src/lfx/base/models/models_dev_client.py (5)

1-93: LGTM: Well-structured module with comprehensive provider mappings.

The imports, constants, and provider mappings are well-organized and comprehensive.

96-120: LGTM: Cache loading logic is sound.

The cache path construction and loading with TTL validation are implemented correctly.

122-131: LGTM: Cache saving is implemented correctly.

Directory creation and error handling are appropriate.

307-351: LGTM: Query function is well-implemented.

The filtering logic and parameter handling are correct.

397-439: LGTM: Convenience and search functions are well-implemented.

The wrapper and search functions provide a clean API surface.

…foss-3056

…into lfoss-3056

…foss-3056

…or output limit
- Added MAX_EMBEDDING_OUTPUT_LIMIT constant (4096) for embedding dimension threshold
- Simplified _is_embedding_model return statement by combining conditions
- Improved code readability with named constant instead of magic number

…into lfoss-3056

…foss-3056

backend updated

841b365

deon-sanchez self-assigned this Dec 12, 2025

github-actions Bot added the enhancement New feature or request label Dec 12, 2025

coderabbitai Bot reviewed Dec 12, 2025

Comment thread src/lfx/src/lfx/base/models/__init__.py

Comment thread src/lfx/src/lfx/base/models/models_dev_client.py

Comment thread src/lfx/src/lfx/base/models/models_dev_client.py

Comment thread src/lfx/src/lfx/base/models/models_dev_client.py

Merge branch 'main' of https://github.com/langflow-ai/langflow into l…

76e271b

…foss-3056