feat: generate requirements.txt from dependencies (#11810) #12087
Conversation
* Base script to generate requirements. Dynamically picks dependencies for the Language Model component. Requires separate change to remove eager loading.
* Lazy load imports for language model component. Ensures that only the necessary dependencies are required. For example, if the OpenAI provider is used, it will now only import langchain_openai, rather than requiring langchain_anthropic, langchain_ibm, etc.
* Add backwards-compat functions
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* Add exception handling
* Add CLI command to create reqs
* Correctly exclude langchain imports
* Add versions to reqs
* Dynamically resolve provider imports for language model comp
* Lazy load imports for reqs, some ruff fixes
* Add dynamic resolves for embedding model comp
* Add install hints
* Add missing provider tests; add warnings in reqs script
* Add a few warnings and fix install hint
* Update comments, add logging
* Package hints, warnings, comments, tests
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* [autofix.ci] apply automated fixes (attempt 3/3)
* Add alias for watsonx
* Fix anthropic for basic prompt, azure mapping
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* ruff
* [autofix.ci] apply automated fixes
* Test formatting
* ruff
* [autofix.ci] apply automated fixes

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Walkthrough

The changes introduce a new CLI command and supporting infrastructure to generate requirements.txt files for Langflow flows. This includes a requirements generation module with AST-based import extraction and dynamic provider resolution, updates to model/embedding mappings, lazy-loading utilities, and comprehensive test coverage.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant CLI as CLI Command
    participant FlowFile as Flow File
    participant StaticAnalysis as Static Analysis<br/>(AST)
    participant DynamicAnalysis as Dynamic Analysis<br/>(Provider Registry)
    participant MetadataResolver as Metadata Resolver<br/>(importlib.metadata)
    participant DepResolver as Dependency Resolver<br/>(Transitive)
    participant Output as Output Handler
    User->>CLI: requirements --flow-path file.json --output reqs.txt
    CLI->>FlowFile: Load & Parse JSON
    FlowFile-->>CLI: Flow structure
    CLI->>StaticAnalysis: Extract imports from component source
    StaticAnalysis-->>CLI: List of import names
    CLI->>DynamicAnalysis: Resolve providers (LLM, Embedding)
    DynamicAnalysis->>DynamicAnalysis: Query provider registries
    DynamicAnalysis-->>CLI: Provider package names
    CLI->>MetadataResolver: Map imports to PyPI packages
    MetadataResolver->>MetadataResolver: Resolve distribution metadata
    MetadataResolver-->>CLI: Distribution names & versions
    CLI->>DepResolver: Filter transitive LFX deps
    DepResolver->>DepResolver: Compute lfx dependency tree
    DepResolver-->>CLI: Final external package list
    CLI->>Output: Generate requirements text
    Output-->>Output: Pin versions & format
    Output-->>CLI: Formatted requirements
    CLI->>FlowFile: Write to output file (optional)
    FlowFile-->>User: requirements.txt or stdout
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

Important: Pre-merge checks failed. Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 2 warnings)
✅ Passed checks (4 passed)
Codecov Report

❌ Patch coverage is
❌ Your project status has failed because the head coverage (43.22%) is below the target coverage (60.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main   #12087      +/-  ##
=========================================
+ Coverage   37.58%   37.72%   +0.13%
=========================================
  Files        1623     1624       +1
  Lines       79603    79862     +259
  Branches    11971    12020      +49
=========================================
+ Hits        29917    30124     +207
- Misses      48027    48067      +40
- Partials     1659     1671      +12
```
Flags with carried forward coverage won't be shown.
Actionable comments posted: 3
🧹 Nitpick comments (1)
src/lfx/tests/unit/test_flow_requirements.py (1)
955-998: Cover the `-o` write-failure path too. The CLI now has a dedicated `OSError` branch for unwritable output paths, but the suite only exercises parse/file-not-found failures. One case against a missing directory would keep that exit path from regressing. As per coding guidelines, "Verify tests cover both positive and negative scenarios where appropriate".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lfx/tests/unit/test_flow_requirements.py` around lines 955 - 998, Add a new negative test (e.g., test_output_write_failure) in src/lfx/tests/unit/test_flow_requirements.py that invokes the CLI via runner.invoke(app, ["requirements", str(flow_file), "-o", str(unwritable_path)]) where unwritable_path points to a path that cannot be written (for example a file inside a non-existent directory or a path in a read-only dir created via tmp_path), then assert result.exit_code == 1 and that result.output contains an error message (similar to test_file_not_found and test_invalid_json). This will exercise the CLI's OSError branch for unwritable output paths and prevent regressions of the -o write-failure path.
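The missing negative case boils down to exercising an `OSError` on write. A self-contained sketch of the pattern (the `write_requirements` helper below is a stand-in for illustration, not the PR's CLI):

```python
import tempfile
from pathlib import Path


def write_requirements(lines: list[str], output_path: Path) -> int:
    """Write requirements to a file; return a CLI-style exit code."""
    try:
        Path(output_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    except OSError as exc:
        # FileNotFoundError (missing parent dir) and PermissionError
        # are both OSError subclasses, so one branch covers both.
        print(f"Error writing {output_path}: {exc}")
        return 1
    return 0


with tempfile.TemporaryDirectory() as tmp:
    ok_path = Path(tmp) / "requirements.txt"
    bad_path = Path(tmp) / "does-not-exist" / "requirements.txt"
    assert write_requirements(["lfx==1.0.0"], ok_path) == 0
    assert write_requirements(["lfx==1.0.0"], bad_path) == 1  # OSError branch
```

In the real suite, the same shape would go through `runner.invoke(app, [...])` and assert on `result.exit_code` and `result.output`, as the prompt above describes.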
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b24d050b-c6c6-4bdd-969a-3c6228045e25
📒 Files selected for processing (5)
src/lfx/src/lfx/__main__.py
src/lfx/src/lfx/base/models/unified_models.py
src/lfx/src/lfx/utils/__init__.py
src/lfx/src/lfx/utils/flow_requirements.py
src/lfx/tests/unit/test_flow_requirements.py
```python
def _extract_imports(source: str) -> set[str]:
    """Extract top-level package names from all imports in Python source via AST.

    Walks the entire AST (including function bodies and try/except blocks) so
    that lazy imports inside ``build_model()`` etc. are captured. Returns only
    the first segment of each dotted import (e.g. ``foo`` from ``import foo.bar``).
    """
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        warnings.warn(
            f"Could not parse component source (SyntaxError: {exc}). "
            "Imports from this component will not be included in requirements.",
            stacklevel=2,
        )
        return set()

    imports: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                imports.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            if node.level > 0:
                # Relative import - skip (internal to the component)
                continue
            if node.module:
                imports.add(node.module.split(".")[0])
    return imports
```
Skip TYPE_CHECKING branches during import extraction.
Walking the full AST with `ast.walk()` makes typing-only imports look like runtime requirements. A component that does `if TYPE_CHECKING: import pandas` will currently emit `pandas` into requirements.txt, which breaks the "minimal requirements" goal.
🩹 Suggested direction

```diff
+class _ImportCollector(ast.NodeVisitor):
+    def __init__(self) -> None:
+        self.imports: set[str] = set()
+
+    def visit_If(self, node: ast.If) -> None:
+        test = node.test
+        is_type_checking = (
+            isinstance(test, ast.Name)
+            and test.id == "TYPE_CHECKING"
+        ) or (
+            isinstance(test, ast.Attribute)
+            and isinstance(test.value, ast.Name)
+            and test.value.id == "typing"
+            and test.attr == "TYPE_CHECKING"
+        )
+        if not is_type_checking:
+            self.generic_visit(node)
+
+    def visit_Import(self, node: ast.Import) -> None:
+        for alias in node.names:
+            self.imports.add(alias.name.split(".")[0])
+
+    def visit_ImportFrom(self, node: ast.ImportFrom) -> None:
+        if node.level == 0 and node.module:
+            self.imports.add(node.module.split(".")[0])
+
 def _extract_imports(source: str) -> set[str]:
     ...
-    imports: set[str] = set()
-    for node in ast.walk(tree):
-        ...
-    return imports
+    collector = _ImportCollector()
+    collector.visit(tree)
+    return collector.imports
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lfx/src/lfx/utils/flow_requirements.py` around lines 222 - 250, The
_extract_imports function currently walks the entire AST and picks up imports
inside typing-only blocks; change it to skip any imports inside "if
TYPE_CHECKING" branches by making the AST traversal aware of those conditional
blocks: implement a small ast.NodeVisitor (or modify the walk) that when
visiting ast.If checks whether the test is a Name('TYPE_CHECKING') or an
Attribute ending with 'TYPE_CHECKING' (e.g., typing.TYPE_CHECKING) and, if so,
does not traverse its body/orelse, while still visiting other nodes to collect
ast.Import and ast.ImportFrom. Ensure the detection covers both Name and
Attribute forms and keep the returned set behavior in _extract_imports
unchanged.
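For reference, the suggested visitor runs as written. Below is a self-contained version (illustrative, not the code merged in the PR) showing that a `TYPE_CHECKING`-guarded import is excluded while lazy imports inside function bodies are still collected:

```python
import ast


class ImportCollector(ast.NodeVisitor):
    """Collect top-level import names, skipping `if TYPE_CHECKING:` blocks."""

    def __init__(self) -> None:
        self.imports: set[str] = set()

    def visit_If(self, node: ast.If) -> None:
        test = node.test
        is_type_checking = (
            isinstance(test, ast.Name) and test.id == "TYPE_CHECKING"
        ) or (
            isinstance(test, ast.Attribute)
            and isinstance(test.value, ast.Name)
            and test.value.id == "typing"
            and test.attr == "TYPE_CHECKING"
        )
        # Do not descend into typing-only branches; visit everything else.
        if not is_type_checking:
            self.generic_visit(node)

    def visit_Import(self, node: ast.Import) -> None:
        for alias in node.names:
            self.imports.add(alias.name.split(".")[0])

    def visit_ImportFrom(self, node: ast.ImportFrom) -> None:
        if node.level == 0 and node.module:
            self.imports.add(node.module.split(".")[0])


source = """
from typing import TYPE_CHECKING
import requests

if TYPE_CHECKING:
    import pandas

def build():
    import numpy
"""
collector = ImportCollector()
collector.visit(ast.parse(source))
print(sorted(collector.imports))  # → ['numpy', 'requests', 'typing']
```

Note that this skips the `else` branch of a `TYPE_CHECKING` conditional too, which matches the suggested diff's behavior.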
```python
    provider_info = MODEL_PROVIDERS_DICT.get(provider_name)
    if not provider_info:
        fallback = _PROVIDER_PACKAGE_FALLBACKS.get(provider_name)
        if fallback:
            return set(fallback)
        warnings.warn(
            f"Provider '{provider_name}' was detected in the flow but is not "
            "registered in MODEL_PROVIDERS_DICT (its package may not be installed). "
            "Its dependencies will not be included in requirements.",
            stacklevel=2,
        )
        return set()
```
Don’t return an empty package set for unloaded providers.
This resolver depends on MODEL_PROVIDERS_DICT, but the new test helper in src/lfx/tests/unit/test_flow_requirements.py:767-778 already treats that registry as incomplete when optional provider packages are not installed. In that state, a flow can select a supported provider and still get no provider requirement at all. This path needs a provider→package source that does not depend on the provider package already being importable, similar to _resolve_embedding_provider_packages().
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lfx/src/lfx/utils/flow_requirements.py` around lines 283 - 294, The
current branch returns an empty set when provider_info is missing from
MODEL_PROVIDERS_DICT, which causes valid but-unimportable providers to yield no
requirements; change the logic in the function that queries MODEL_PROVIDERS_DICT
(the block referencing provider_info and _PROVIDER_PACKAGE_FALLBACKS) to mirror
the approach used in _resolve_embedding_provider_packages(): when provider_info
is None, consult _PROVIDER_PACKAGE_FALLBACKS (or the same fallback source used
by _resolve_embedding_provider_packages) and return that package set instead of
an empty set; ensure the code returns set(fallback) when a fallback exists and
only warns/returns empty when no fallback is available.
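The shape the comment asks for can be sketched as follows. The map contents and registry structure are hypothetical stand-ins; the point is that the static fallback works even when a provider's package is not importable:

```python
# Hypothetical static map: provider display name -> pip packages.
# Unlike the runtime registry, it does not require the provider's
# package to be installed in the current environment.
PROVIDER_PACKAGE_FALLBACKS: dict[str, set[str]] = {
    "OpenAI": {"langchain-openai"},
    "Anthropic": {"langchain-anthropic"},
    "IBM watsonx.ai": {"langchain-ibm"},
}


def resolve_provider_packages(provider_name: str, registry: dict) -> set[str]:
    info = registry.get(provider_name)
    if info is not None:
        return set(info["packages"])
    # Registry entry missing (provider extras not installed):
    # consult the static fallback instead of returning an empty set.
    return set(PROVIDER_PACKAGE_FALLBACKS.get(provider_name, set()))


# Registry is incomplete because the anthropic extras are not installed,
# yet the flow can still select Anthropic and get a requirement back.
registry = {"OpenAI": {"packages": {"langchain-openai"}}}
print(resolve_provider_packages("Anthropic", registry))  # → {'langchain-anthropic'}
```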
```python
def generate_requirements_from_flow(
    flow: dict,
    *,
    lfx_package: str = "lfx",
    include_lfx: bool = True,
    pin_versions: bool = True,
) -> list[str]:
    """Generate a requirements list from a Langflow flow JSON.

    Args:
        flow: Parsed Langflow flow JSON (dict).
        lfx_package: Name of the LFX package to include (e.g. ``"lfx"`` or
            ``"lfx-nightly"``).
        include_lfx: Whether to include the LFX package itself.
        pin_versions: If True, pin each package to the version currently
            installed in this environment (``pkg==X.Y.Z``). Falls back to
            an unpinned name when the package is not installed.

    Returns:
        Sorted list of PyPI package specifiers needed to run this flow.
    """
    all_packages: set[str] = set()
    all_providers: set[str] = set()

    data = flow.get("data", {})
    nodes = data.get("nodes", [])

    for node in nodes:
```
Validate the top-level flow shape before walking it.
`generate_requirements_from_file()` and the new CLI can hand this function any syntactically valid JSON value. If the input is `[]` or `"foo"`, `flow.get(...)` raises `AttributeError` and callers get an unhandled traceback instead of a controlled failure.
🩹 Proposed fix

```diff
 def generate_requirements_from_flow(
     flow: dict,
     *,
     lfx_package: str = "lfx",
     include_lfx: bool = True,
     pin_versions: bool = True,
 ) -> list[str]:
 @@
+    if not isinstance(flow, dict):
+        msg = "Flow JSON must be an object"
+        raise ValueError(msg)
+
     all_packages: set[str] = set()
     all_providers: set[str] = set()
     data = flow.get("data", {})
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lfx/src/lfx/utils/flow_requirements.py` around lines 486 - 513, The
function generate_requirements_from_flow should validate that its flow argument
is a mapping before accessing flow.get: add an early check at the top of
generate_requirements_from_flow that verifies isinstance(flow, dict) (or
collections.abc.Mapping) and raise a clear TypeError/ValueError with a helpful
message if not; callers like generate_requirements_from_file and the CLI will
then get a controlled error instead of an AttributeError. Ensure the check
happens before any use of flow.get or attribute access so malformed JSON values
(e.g., list or string) are rejected early with an explicit error message.
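The failure mode is easy to reproduce: any JSON scalar or array parses cleanly but has no `.get()`. A self-contained sketch of the guard (the `load_flow_nodes` helper name is hypothetical, standing in for the real function):

```python
import json


def load_flow_nodes(flow):
    """Reject non-object flows before any attribute access, as proposed."""
    if not isinstance(flow, dict):
        msg = "Flow JSON must be an object"
        raise ValueError(msg)
    return flow.get("data", {}).get("nodes", [])


# "[]" and '"foo"' are syntactically valid JSON but not flow objects.
for raw in ("[]", '"foo"', '{"data": {"nodes": []}}'):
    try:
        print(raw, "->", load_flow_nodes(json.loads(raw)))
    except ValueError as exc:
        print(raw, "-> error:", exc)
```

With the guard, malformed inputs produce a controlled `ValueError` that the CLI can catch and report, instead of an `AttributeError` traceback.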
Re-add of #11810, which was unceremoniously removed in #11490.
Summary by CodeRabbit
Release Notes
New Features
`requirements` CLI command to generate requirements.txt files for Langflow flows

Tests