
refactor: load components by importing them directly #8395

Merged
ogabrielluiz merged 94 commits into main from fix-component-loading on Jun 11, 2025

@ogabrielluiz
Contributor

@ogabrielluiz ogabrielluiz commented Jun 6, 2025

This pull request introduces several changes across multiple files, focusing on improving documentation, enhancing functionality, and refining starter project configurations. The most significant updates include removing unused components, adding detailed docstrings to utility functions, and modifying starter project templates to improve usability and clarity.

Codebase Cleanup and Documentation Enhancements:

Starter Project Updates:

General Improvements:

  • Updated Basic Prompt Chaining.json, Basic Prompting.json, Blog Writer.json, and Document Q&A.json to remove the requirement for an api_key in required_inputs for LanguageModel components, simplifying setup for users. [1] [2] [3] [4]
  • Enhanced info fields for Model Name inputs across multiple starter projects to provide clearer instructions on selecting a provider and refreshing model names. [1] [2] [3] [4]
  • Marked Temperature inputs as advanced in several starter projects, aligning with their intended use for fine-tuning model behavior. [1] [2] [3] [4]

Specific Updates:

  • Diet Analysis.json: Added api_key as a required input for LanguageModel, reversing the removal seen in other projects to align with specific requirements.
  • Financial Agent.json: Updated the description for the URL component to clarify its functionality and replaced clean_extra_whitespace with autoset_encoding in the template, enhancing usability. [1] [2]

Summary by CodeRabbit

  • New Features

    • Introduced comprehensive asynchronous tests for component loading, including performance, structure, and error handling checks.
    • Added new deactivated retriever components for Amazon Kendra, Metal, MultiQuery, Vectara Self Query, and others, expanding extensibility options.
    • Added a new "Chat Memory" component for storing and retrieving chat messages with external memory support.
  • Bug Fixes

    • Improved error handling and logging for component loading and starter project creation processes.
  • Refactor

    • Refactored retriever and vector store components to use declarative input definitions and updated class inheritance for consistency.
    • Updated and streamlined component export lists, removing deprecated or legacy components from public APIs.
    • Enhanced docstrings and clarified control flow in custom component utility functions.
    • Replaced the URL component in multiple starter projects with a robust recursive web crawler featuring expanded configuration options.
    • Improved asynchronous loading of built-in Langflow components with parallel processing and better error handling.
    • Removed automatic setting of required inputs in custom component initialization.
  • Chores

    • Updated starter project templates for improved user guidance, advanced input categorization, and more robust URL crawling capabilities.
    • Improved debug logging for starter project loading and creation.
    • Updated metadata in model components to better guide users on API key usage and model selection.
    • Adjusted default component paths handling in settings.
  • Tests

    • Added extensive asynchronous test suite for component loading functions, including performance and memory usage analysis.
    • Added detailed docstrings to existing test functions for clarity.

@coderabbitai
Contributor

coderabbitai Bot commented Jun 6, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This update introduces extensive refactoring and enhancements across Langflow's backend, focusing on component architecture, asynchronous loading, and improved configuration. Several retriever and utility components are restructured, with new input schemas and error handling. The built-in component loading system is overhauled for asynchronous, parallel discovery. Numerous starter project JSONs are updated to refine metadata, input requirements, and UI hints. Legacy and deactivated components are modularized, and new comprehensive tests are added for component loading performance and correctness.

Changes

| File(s) | Change Summary |
|---|---|
| `components/retrievers/amazon_kendra.py`, `components/retrievers/metal.py`, `components/retrievers/multi_query.py`, `components/vectorstores/vectara_self_query.py` | Refactored retriever/vectorstore components: changed base classes, replaced build_config with declarative inputs, restructured build methods, improved error handling, updated class names and inheritance. |
| `components/retrievers/__init__.py`, `components/vectorstores/__init__.py`, `components/langchain_utilities/__init__.py`, `components/logic/__init__.py` | Removed or updated exports and imports to reflect component refactoring and removal of legacy/deprecated components. |
| `components/deactivated/amazon_kendra.py`, `components/deactivated/metal.py`, `components/deactivated/multi_query.py`, `components/deactivated/json_document_builder.py`, `components/deactivated/retriever.py`, `components/deactivated/vectara_self_query.py`, `components/deactivated/vector_store.py` | Added or refactored deactivated/legacy components, defining explicit inputs schemas, updating methods, and improving error messages. |
| `custom/custom_component/component.py` | Removed call to _set_output_required_inputs() from Component constructor. |
| `custom/utils.py` | Added detailed docstrings, improved error handling, clarified control flow, and updated function signatures for custom component utilities. |
| `interface/components.py` | Introduced async, parallel loading of built-in components (import_langflow_components), improved caching, added docstrings, enhanced logging, and updated metadata retrieval. |
| `services/settings/base.py` | Changed default handling of components_path to return an empty list if unset; improved docstring. |
| `initial_setup/setup.py` | Enhanced debug logging and error handling during starter project creation, tracking successful creations and logging failures. |
| `initial_setup/starter_projects/*.json` | Updated component metadata: removed redundant required_inputs, added/updated info and advanced flags, overhauled "URL" component to a recursive web crawler in several projects, replaced or added new "Memory" component, and adjusted input/output schemas. |
| `tests/unit/test_endpoints.py` | Improved docstrings, updated imports to use BASE_COMPONENTS_PATH, clarified test intent. |
| `tests/unit/test_load_components.py` | Added comprehensive asynchronous tests for component loading, performance, structure, error handling, and memory usage. |
| `tests/conftest.py` | Allowed additional blocking calls in specific files/functions during tests. |

Sequence Diagram(s)

Asynchronous Built-in Component Loading

sequenceDiagram
    participant Interface as interface/components.py
    participant Asyncio as asyncio
    participant Importer as importlib
    participant Component as langflow.components.<subpackage>.<module>
    participant Cache as ComponentCache

    Interface->>Asyncio: to_thread(discover modules in langflow.components)
    Asyncio->>Importer: import each module in parallel
    loop For each module
        Importer->>Component: import module
        Component-->>Importer: provide classes with Langflow attributes
        Importer-->>Asyncio: return component classes
    end
    Asyncio-->>Interface: aggregated component templates
    Interface->>Cache: store built-in components in cache
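
The parallel loading flow in the diagram above can be approximated with the standard library alone. The sketch below is not the actual `import_langflow_components` implementation: the function name `import_modules_in_parallel` and the use of `pkgutil.iter_modules` for discovery are assumptions made for illustration. It only looks at the direct submodules of one package, imports each in a worker thread, and separates successes from failures.

```python
import asyncio
import importlib
import pkgutil


async def import_modules_in_parallel(package_name):
    """Import every direct submodule of a package in worker threads.

    Hypothetical helper for illustration only; returns (loaded, failed),
    both keyed by the fully qualified module name.
    """
    package = importlib.import_module(package_name)
    names = [
        info.name
        for info in pkgutil.iter_modules(package.__path__, prefix=f"{package_name}.")
    ]
    # Each import runs in a thread so the event loop stays responsive.
    # return_exceptions=True keeps one broken module from aborting the rest.
    results = await asyncio.gather(
        *(asyncio.to_thread(importlib.import_module, name) for name in names),
        return_exceptions=True,
    )
    loaded, failed = {}, {}
    for name, result in zip(names, results):
        if isinstance(result, BaseException):
            failed[name] = result
        else:
            loaded[name] = result
    return loaded, failed
```

For example, `asyncio.run(import_modules_in_parallel("json"))` imports the submodules of the standard-library `json` package. Collecting failures instead of raising mirrors the "better error handling" goal stated in the walkthrough, although how the real loader reports errors is not shown in this conversation.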

Custom Component Template Creation (Updated Logic)

sequenceDiagram
    participant Utils as custom/utils.py
    participant Component as CustomComponent/Component
    participant HTTP as HTTPException

    Utils->>Component: get_component_instance(custom_component)
    alt Not a "Component" or "CustomComponent"
        Component-->>Utils: return input directly
    else
        Utils->>Component: validate code type and evaluate code
        alt Missing/invalid code
            Utils->>HTTP: raise HTTP 400 error
        else
            Component-->>Utils: instantiate component
        end
    end
    Utils->>Component: run_build_config
    alt Subclass of Component
        Component-->>Utils: return build config directly
    else
        Utils->>Component: evaluate code, build config
        alt Error
            Utils->>HTTP: raise HTTP 400 error
        end
    end
    Utils-->>Caller: return template and instance

Starter Project "URL" Component: Old vs. New Flow

Old Flow (Simple Loader)

sequenceDiagram
    participant User
    participant URLComponent
    participant Loader as AsyncHtmlLoader/WebBaseLoader

    User->>URLComponent: provide URLs and format
    URLComponent->>Loader: fetch content (async)
    Loader-->>URLComponent: return content (Text/HTML/JSON)
    URLComponent-->>User: output data/text/dataframe

New Flow (Recursive Web Crawler)

sequenceDiagram
    participant User
    participant URLComponent
    participant Loader as RecursiveUrlLoader

    User->>URLComponent: provide URLs, max_depth, options
    loop For each URL
        URLComponent->>Loader: create loader with options
        Loader->>Loader: recursively crawl (up to max_depth)
        Loader-->>URLComponent: return documents (Text/HTML)
    end
    URLComponent->>User: output dataframe/message

These diagrams illustrate the new asynchronous component loading, updated custom component template creation logic, and the transformation of the "URL" component from a simple loader to a recursive web crawler.



@github-actions github-actions Bot added the refactor Maintenance tasks and housekeeping label Jun 6, 2025
@ogabrielluiz ogabrielluiz force-pushed the fix-component-loading branch from 7239751 to 5d3fa5b Compare June 6, 2025 13:21
@ogabrielluiz ogabrielluiz changed the base branch from main to release-1.4.3 June 6, 2025 13:21
@ogabrielluiz ogabrielluiz force-pushed the fix-component-loading branch from 5d3fa5b to d06206f Compare June 6, 2025 13:29
@ogabrielluiz ogabrielluiz marked this pull request as ready for review June 6, 2025 13:45
@dosubot dosubot Bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Jun 6, 2025
@ogabrielluiz
Contributor Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai Bot commented Jun 6, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@ogabrielluiz ogabrielluiz added the fix-for-release PR to be merged into a release branch label Jun 6, 2025
coderabbitai Bot added a commit that referenced this pull request Jun 6, 2025
Docstrings generation was requested by @ogabrielluiz.

* #8395 (comment)

The following files were modified:

* `src/backend/base/langflow/custom/utils.py`
* `src/backend/base/langflow/interface/components.py`
* `src/backend/base/langflow/services/settings/base.py`
* `src/backend/tests/unit/test_endpoints.py`
@coderabbitai
Contributor

coderabbitai Bot commented Jun 6, 2025

Note

Generated docstrings for this pull request at #8397

Comment thread src/backend/base/langflow/custom/utils.py

def get_component_instance(custom_component: CustomComponent, user_id: str | UUID | None = None):
    if isinstance(custom_component, Component | CustomComponent):
        return custom_component
Collaborator

Should this conditional be removed? IIUC, this function is used when a user creates and saves a custom component from the frontend, which calls the /custom_component endpoint; in that case we do want to build.

Contributor Author

When importing we already have the instance so we don't need to do anything, although I do think this is wrong.

@github-actions github-actions Bot removed the refactor Maintenance tasks and housekeeping label Jun 6, 2025
# Final setup
self._set_output_types(list(self._outputs_map.values()))
self.set_class_code()
self._set_output_required_inputs()
Collaborator

@ogabrielluiz May I know the reasoning behind the removal of _set_output_required_inputs?

Contributor Author

It isn't used (i.e. we get the warning constantly but see no problems) and the implementation is very flaky. We should replace it with another, more reliable, implementation instead of using inspect.getsource.

Comment thread src/backend/base/langflow/initial_setup/starter_projects/Basic Prompting.json Outdated
@edwinjosechittilappilly
Collaborator

@CodeRabbit please find what triggered the change in the advanced fields of templates?

@edwinjosechittilappilly
Collaborator

@CodeRabbit can you explain why self._set_output_required_inputs() was removed?

@coderabbitai
Contributor

coderabbitai Bot commented Jun 10, 2025

@edwinjosechittilappilly

_set_output_required_inputs() was originally called from Component.__init__ to walk the component’s Python source, detect which input fields were accessed inside each output-method, and inject a required_inputs list into the output-metadata that the UI shows.

During the refactor the team moved to a simpler, declarative scheme:

• Starter-project JSONs now declare every output without a required_inputs entry (you can see hundreds of "required_inputs": … blocks deleted in this PR).
• Inputs themselves are marked as “advanced” or “required” directly in the new inputs arrays that were added to each class (StrInput, HandleInput, …).
• Because the metadata is now static and supplied up-front, recomputing it at runtime is unnecessary and, in practice, hurt start-up time and triggered circular-import chains while the new async/parallel loader (import_langflow_components) was importing thousands of modules in threads.

Removing the call therefore:

  1. Eliminates one full AST parse/eval pass per component, trimming several seconds off cold-start and unit-test time (see the new test_load_components performance tests).
  2. Avoids importing sub-components while the loader is still building the registry – this is what broke the nine components you listed with circular-import errors.
  3. Keeps the single source of truth for “which inputs are needed” in the declarative inputs definition, making the codebase easier to maintain.

If a component really needs to express a hard dependency between an output and a subset of inputs, it can now do so explicitly in its template or by validating inside the output method itself, without relying on implicit reflection.
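
To make the reflection step described above concrete, here is a minimal sketch of detecting which input names a method reads through `self`. It is an illustration under stated assumptions, not Langflow's removed implementation: the helper `accessed_inputs`, the `Demo` class, and the input names are all hypothetical, and the source is passed in as a string rather than obtained with `inspect.getsource`, which only works when the source file is available at runtime.

```python
import ast
import textwrap

# Hypothetical component source; in the removed approach the text would
# have come from inspect.getsource on the component class.
SOURCE = textwrap.dedent(
    '''
    class Demo:
        def build_message(self):
            return f"{self.prefix}: {self.text}"

        def build_count(self):
            return len(self.text)
    '''
)


def accessed_inputs(source, method_name, input_names):
    """Return the sorted input names read as self.<name> inside a method."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == method_name:
            found = {
                inner.attr
                for inner in ast.walk(node)
                if isinstance(inner, ast.Attribute)
                and isinstance(inner.value, ast.Name)
                and inner.value.id == "self"
                and inner.attr in input_names
            }
            return sorted(found)
    return []
```

Calling `accessed_inputs(SOURCE, "build_message", {"prefix", "text", "unused"})` reports only the names the method actually touches. A scan like this misses indirect access such as `getattr(self, name)`, which is one way source-based inference can be unreliable compared with declaring the metadata explicitly.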

@coderabbitai
Contributor

coderabbitai Bot commented Jun 10, 2025

@edwinjosechittilappilly

After tracing the commit history and the code paths that build the JSON templates, here’s what I found:

• No new automatic rule was added in the Python template-builders ( custom/utils.py, interface/components.py, etc.) that would flip the "advanced" flag.
• All the lines that now read "advanced": true were modified explicitly in this PR inside the starter-project JSON files; the surrounding Python code never touches that flag.
• The commit introducing the bulk of those JSON edits is 3ad4e4e… (part of this PR). In that diff the author hand-edited the nodes to:
– remove stale required_inputs, and
– set inputs such as “temperature”, “message”, “mode”, and the numerous extra options on the new recursive URL component to "advanced": true.

So the change was manual housekeeping aimed at decluttering the default UI and tucking rarely-used knobs under the “advanced” section—nothing in the new component-loading refactor triggers it programmatically.

ogabrielluiz and others added 15 commits June 11, 2025 08:24
…ate_from_inputs for improved error handling and component instantiation
…(`fix-component-loading`)

Here is an optimized version of your program.



**Reasoning for changes:**
- The check `isinstance(custom_component, Component)` followed by `type(custom_component) is not Component` causes the interpreter to potentially look up the type and MRO twice per call.
- By storing `type(custom_component)` in `klass` and using `issubclass(klass, Component)`, you avoid having Python walk the MRO twice for the same object, which is subtly more efficient especially in tight loops and heavy use scenarios.
- Using `issubclass()` on the object's type is semantically equivalent to `isinstance()`, except it also works for custom metaclass scenarios and is very slightly faster when type is already known.

**All program logic and comments are preserved, only the relevant portion is optimized.**
…ow to enhance test coverage and improve test reliability
…cessfully" text to appear to improve test reliability
…ater than 0 before calling addFlowToTestOnEmptyLangflow

🔧 (generalBugs-shard-9.spec.ts): update tags in test case to include @workspace and @components
♻️ (generalBugs-shard-9.spec.ts): refactor code to remove unnecessary steps related to sidebar search and node handling
🔧 (store-shard-0.spec.ts): update test cases to be skipped and improve readability by using async arrow functions
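
The "Reasoning for changes" commit message above describes replacing an `isinstance` check plus a `type(...) is not Component` check with a single stored `type()` lookup and `issubclass`. The code from that commit is not preserved in this conversation, so the following is only a sketch of the pattern, using hypothetical stand-in classes `Component` and `MyComponent`. It shows that the two forms agree for ordinary classes and makes no claim about the speed difference.

```python
class Component:
    """Stand-in base class (hypothetical, not Langflow's Component)."""


class MyComponent(Component):
    """Stand-in subclass (hypothetical)."""


def is_subclass_instance_two_checks(obj):
    # Form described as the "before" version: two separate type inspections.
    return isinstance(obj, Component) and type(obj) is not Component


def is_subclass_instance_single_lookup(obj):
    # Form described as the "after" version: look up the type once and reuse it.
    klass = type(obj)
    return klass is not Component and issubclass(klass, Component)
```

Both helpers return `True` only for instances of a strict subclass of `Component`. They could differ for objects that override `__class__`, since `isinstance` also consults that attribute while `type()` does not.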
@grebug

grebug commented Jul 24, 2025

After pulling the latest image (docker pull langflowai/langflow:1.5.0.post1), startup hangs, and the last line printed in the debug log is "DEBUG - components - Building components cache". From the source code, it appears to be blocked in the component-loading code introduced by this PR. However, installing langflow==1.5.0.post1 with uv pip on a Mac does not cause any problems. Does anyone know the reason?

@jordanrfrazier
Collaborator

@grebug I was able to start up Langflow on Mac with the below image tag, with default env vars. Perhaps that's the difference - can you share your environment variables and docker start command?

langflowai/langflow                                                 latest    9932b713e0df   2 weeks ago   2.65GB

@grebug

grebug commented Jul 30, 2025

@jordanrfrazier It's okay; it may have been a problem with machine resources. I switched to another machine. Although it still lags at the same point, the service eventually starts and works normally.


Labels

  • lgtm: This PR has been approved by a maintainer
  • refactor: Maintenance tasks and housekeeping
  • size:XXL: This PR changes 1000+ lines, ignoring generated files.

5 participants