Skip to content

fix: update chat components#10946

Closed
HimavarshaVS wants to merge 1 commit into
release-1.6.0from
fix/chat-components-and-starter-projects
Closed

fix: update chat components#10946
HimavarshaVS wants to merge 1 commit into
release-1.6.0from
fix/chat-components-and-starter-projects

Conversation

@HimavarshaVS
Copy link
Copy Markdown
Collaborator

@HimavarshaVS HimavarshaVS commented Dec 10, 2025

Summary by CodeRabbit

Release Notes

  • New Features

    • Added MCP Composer integration with OAuth support and project registration capabilities.
    • Introduced Knowledge Base ingestion and retrieval components for structured data management.
    • Enhanced document processing with advanced OCR and VLM pipeline options for file parsing.
    • Exposed voice mode availability as a configurable feature.
  • Improvements

    • Updated API key input labels across integrations for improved clarity.
    • Marked Claude 3 Sonnet as deprecated; updated default model selections.
    • Improved file constraint handling and data conversion logic.
    • Updated dependencies and version to 1.6.0.
  • Chores

    • Revised environment variable naming for feature flags.
    • Updated GitHub workflow configuration for branch management.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Dec 10, 2025

Walkthrough

This PR introduces MCP Composer feature flag renaming, adds branch parameter propagation to CI workflows, bumps project version and dependencies, refactors knowledge base and file components for Docling support, adds auth settings handling for MCP projects, restructures agent provider logic, renames dozens of component display labels, adds replacement attributes to components, and modifies database constraints.

Changes

Cohort / File(s) Summary
Environment & Configuration
.env.example, pyproject.toml
Renamed MCP Composer env var (LANGFLOW_FEATURE_MCP_COMPOSER → LANGFLOW_MCP_COMPOSER_ENABLED), default from true to false; bumped version 1.5.0.post2 → 1.6.0; updated langflow-base ~0.5.0 → ~0.6.0, chromadb, composio dependencies; added docling and platform-specific conditional dependencies
CI/CD Workflows
.github/workflows/ci.yml, docs_test.yml, lint-py.yml, release.yml
Added optional branch input to workflow_call triggers; propagated branch ref to nested workflow calls and checkout steps; release.yml added required release_branch input with updated version/commit reference logic
Build & Ignore Configuration
.gitignore, docker/build_and_push.Dockerfile, scripts/windows/*
Restored *.mcp.json ignore pattern, added member_servers.json; added trailing newline in Dockerfile; refactored Windows build scripts to use USE_ENV_FILE flag instead of ENV_FILE_PARAM for .env handling
Database Migration
src/backend/base/langflow/alembic/versions/d37bc4322900_drop_single_constraint_on_files_name_.py
New migration to drop single-column UNIQUE constraint on file.name while preserving composite UNIQUE(name, user_id); includes dialect-specific paths for SQLite and PostgreSQL with downgrade support
Authentication & MCP Composer Integration
src/backend/base/langflow/api/v1/auth_helpers.py, src/backend/base/langflow/api/v1/mcp_projects.py, src/backend/base/langflow/api/v1/projects.py
Added handle_auth_settings_update helper for OAuth/API-key lifecycle; introduced verify_project_auth, should_use_mcp_composer, register_project_with_composer, get_project_composer_url flows; added AsyncSession-based auth verification; integrated MCP Composer start/stop in project lifecycle
File Management
src/backend/base/langflow/api/v2/files.py, src/backend/base/langflow/api/v2/mcp.py
Added get_mcp_file helper for per-user MCP filename generation; updated upload and list logic to use per-user filenames; added backward compatibility migration from old MCP_SERVERS_FILE naming; integrated cache invalidation on server updates
Base Data & File Component Refactoring
src/backend/base/langflow/base/data/*, src/backend/base/langflow/components/data/file.py
Replaced JSON with orjson in load_files_message; introduced DoclingDependencyError exception; reordered TEXT_FILE_TYPES; refactored FileComponent with Docling subprocess support, pipeline selection (standard/vlm), OCR options, renamed outputs (load_files_advanced → load_files_dataframe), added load_files_dataframe method
Knowledge Base Components
src/backend/base/langflow/components/knowledge_bases/*, src/backend/base/langflow/components/data/__init__.py
Renamed KBIngestionComponent → KnowledgeIngestionComponent (icon/name updates), changed input from DataFrameInput to HandleInput, renamed output dataframe → dataframe_output; renamed KBRetrievalComponent → KnowledgeRetrievalComponent, changed icon/name, updated method get_chroma_kb_data → retrieve_data, adjusted embedding/metadata handling; removed KB components from data package exports
Agent Component Restructuring
src/backend/base/langflow/components/agents/agent.py
Introduced MODEL_PROVIDERS_LIST constant with reduced provider set; removed "Custom" provider option; removed structured_response output; added external_options flow for connect_other_models workflow; updated update_build_config to handle new provider connection paths
MCP & Conditional Router Logic
src/backend/base/langflow/components/logic/conditional_router.py, src/backend/base/langflow/components/agents/mcp_component.py
Enhanced conditional routing with iteration max-limit handling, cycle-breaking logic, and persistent conditional exclusions; added cache-based short-circuit in MCP component to skip tool-option updates when server/tools unchanged
Component Replacement Attributes & Removals
src/backend/base/langflow/components/*/..., src/backend/base/langflow/components/__init__.py, src/backend/base/langflow/components/deactivated/*
Added replacement attributes to 30+ components (e.g., CSVToDataComponent, EmbeddingSimilarityComponent, OutputParserComponent, etc.) mapping to newer component versions; removed prototypes from components init; deleted deprecated MCPSse and MCPStdio components
Component Display Name Updates
Multiple API key inputs across 40+ components
Standardized display names for API keys (e.g., "API Key" → "Firecrawl API Key", "Azure Chat OpenAI API Key", etc.) in agentql, azure, firecrawl, google, huggingface, ibm, confluence, milvus, olivya, and other integrations
Model Metadata & Provider Changes
src/backend/base/langflow/base/models/*
Marked claude-3-sonnet-20240229 as deprecated in ANTHROPIC_MODELS_DETAILED; disabled Groq provider (is_active: False) in MODEL_PROVIDERS_DICT; made Vector Store Connection output non-hidden
Data Component Utilities
src/backend/base/langflow/components/data/csv_to_data.py, json_to_data.py, api_request.py, url.py
Added replacement attributes to CSV/JSON-to-data components; updated api_request to unwrap Data objects from list/body; updated url to filter None headers
Docling & Remote Integration
src/backend/base/langflow/components/docling/*, docs/docs/Integrations/*/...json
Enhanced docling_inline error handling with DoclingDependencyError; updated docling_remote to Docling Serve v1 API (base URL, sources payload structure); updated default OpenAI models in integration templates from gpt-4o to gpt-4.1
DataStore & Minor Updates
src/backend/base/langflow/components/chroma/chroma.py, src/backend/base/langflow/api/v1/schemas.py, src/backend/base/langflow/base/agents/events.py, src/backend/base/langflow/components/datastax/*, src/backend/base/langflow/components/*/...py
Normalized limit parameter in chroma before use; added voice_mode_available field to ConfigResponse; removed explicit "Input" label in agent input text; moved parse_api_endpoint import to lazy/guarded paths in AstraDB components; updated AstraVectorizeComponent deprecation status; adjusted embedding model visibility in AstraDB vectorstore

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant ProjectAPI as Project API
    participant AuthHelper as Auth Helper
    participant MCPComposer as MCP Composer
    participant DB as Database
    
    User->>ProjectAPI: update_project(auth_settings={"type":"oauth"})
    ProjectAPI->>AuthHelper: handle_auth_settings_update(existing_project, new_auth_settings)
    AuthHelper->>DB: decrypt current auth_settings
    AuthHelper->>AuthHelper: merge/preserve masked secrets
    AuthHelper->>DB: encrypt and save updated auth_settings
    AuthHelper-->>ProjectAPI: {should_start_composer: true, should_stop_composer: false}
    
    ProjectAPI->>MCPComposer: register_project_with_composer(project)
    MCPComposer-->>ProjectAPI: registration complete
    ProjectAPI-->>User: project updated, composer started
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Areas requiring extra attention:

  • MCP Composer Integration (src/backend/base/langflow/api/v1/mcp_projects.py): Extensive new auth verification flows, SSE URL handling with OS-aware logic, and project lifecycle management require careful review of auth decision trees and composer registration paths
  • FileComponent Docling Refactoring (src/backend/base/langflow/components/data/file.py): Subprocess invocation, pipeline selection (standard/vlm), OCR gating, and output naming changes need verification of backward compatibility and Docling API integration
  • ConditionalRouter Control Flow (src/backend/base/langflow/components/logic/conditional_router.py): Iteration handling, cycle-breaking logic, and conditional exclusion state management introduce complex control-flow semantics that need tracing through multiple scenarios
  • Agent Provider Restructuring (src/backend/base/langflow/components/agents/agent.py): Removal of "Custom" option and introduction of connect_other_models workflow path requires validation that existing flows don't break and new provider-connection UX is sound
  • Database Migration (src/backend/base/langflow/alembic/versions/d37bc4322900_*.py): Dialect-specific constraint handling (SQLite table recreation vs PostgreSQL catalog queries) and downgrade path need careful SQL review
  • Knowledge Base Component Renaming & Input Changes (src/backend/base/langflow/components/knowledge_bases/*): Changes to input types (DataFrameInput→HandleInput), output names, and utility imports (kb_utils path) should be checked for cascading downstream impacts

Possibly related PRs

Suggested labels

bug, lgtm

Suggested reviewers

  • edwinjosechittilappilly
  • ogabrielluiz

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 68.92% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Title check ❓ Inconclusive The PR title 'fix: update chat components' is vague and overly broad, lacking specific details about the actual changes made throughout the extensive changeset. Consider a more specific title that captures the primary focus, such as 'fix: refactor chat components and update environment/workflow configurations' or break into multiple PRs with focused titles.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/chat-components-and-starter-projects

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions Bot added the bug Something isn't working label Dec 10, 2025
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 17

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (9)
src/backend/base/langflow/components/icosacomputing/combinatorial_reasoner.py (2)

81-84: Potential AttributeError if called before build_prompt().

The method accesses self.reasons (line 83), which is only set in build_prompt() (line 78). If build_reasons() is called independently or before build_prompt(), it will raise an AttributeError.

Consider adding defensive initialization:

+    def __init__(self, **kwargs):
+        super().__init__(**kwargs)
+        self.reasons = []
+
     def build_reasons(self) -> Data:
         # list of selected reasons
+        if not hasattr(self, "reasons"):
+            return Data(value=[])
         final_reasons = [reason[0] for reason in self.reasons]
         return Data(value=final_reasons)

60-79: Return type mismatch: method declares Message but returns string.

Line 79 returns a plain string (prompt), but the method signature declares -> Message. This violates the return type contract. All similar components in the codebase wrap string returns with Message(text=...).

Fix:

 def build_prompt(self) -> Message:
     params = {
         "prompt": self.prompt,
         "apiKey": self.openai_api_key,
         "model": self.model_name,
     }

     creds = HTTPBasicAuth(self.username, password=self.password)
     response = requests.post(
         "https://cr-api.icosacomputing.com/cr/langflow",
         json=params,
         auth=creds,
         timeout=100,
     )
     response.raise_for_status()

     prompt = response.json()["prompt"]

     self.reasons = response.json()["finalReasons"]
-    return prompt
+    return Message(text=prompt)
src/backend/base/langflow/components/input_output/chat.py (1)

32-77: This removal breaks active test code and will cause test failures.

The removal of background_color, chat_icon, and text_color inputs is a breaking change. Test code in src/backend/tests/unit/custom/component/test_component_instance_attributes.py (lines 47, 55-56) explicitly calls build() with these parameters and asserts these attributes exist on ChatInput instances. With these inputs removed from the component's inputs list, the build() method will no longer accept these parameters, causing test failures.

Any saved workflows or serialized ChatInput configurations using these styling inputs will also fail to deserialize correctly. Before removing these inputs, either update the affected tests and migrate existing saved workflows, or add backwards compatibility handling to accept and silently ignore these parameters.

src/backend/base/langflow/components/input_output/chat_output.py (1)

35-78: Fix incomplete removal of styling and clean_data inputs—code at line 174 still references clean_data.

The inputs background_color, chat_icon, text_color, and clean_data were removed from the inputs list (lines 35–78), but the code at line 174 still attempts to access self.clean_data via getattr(self, "clean_data", False). Additionally, the test file (test_chat_output_component.py:26–29) still expects these parameters, meaning component instantiation will silently ignore them. Either restore these inputs to the list, or remove the reference to clean_data at line 174 if the feature is intentionally deprecated.

docs/docs/Integrations/Notion/Meeting_Notes_Agent.json (4)

2097-2097: Network calls lack timeouts (risk of hangs) and retries.

All Notion components use requests without timeouts. Add a reasonable timeout and a simple retry/backoff to avoid UI hangs and flaky failures.

Example adjustment (apply similarly across components):

- response = requests.post(url, headers=headers, json=data)
+ response = requests.post(url, headers=headers, json=data, timeout=30)
  response.raise_for_status()

Optional lightweight retry:

import time
for attempt in range(3):
    try:
        resp = requests.get(url, headers=headers, timeout=30)
        resp.raise_for_status()
        break
    except requests.exceptions.RequestException as e:
        if attempt == 2: raise
        time.sleep(2 * (attempt + 1))

Also applies to: 2229-2229, 2378-2378, 2504-2504, 2653-2653, 120-120, 229-229


2504-2504: Redact sensitive data from logs in NotionPageUpdate.

You log the full request payload and response body. This can leak PII and internal data.

- logger.info(f"Sending request to Notion API: URL: {url}, Data: {json.dumps(data)}")
+ logger.info(f"Sending request to Notion API: URL: {url}, Data: <redacted>")
...
- logger.info(f"Successfully updated Notion page. Response: {json.dumps(updated_page)}")
+ logger.info("Successfully updated Notion page.")

1081-1104: Dropdown default points to a value not present in options ("gpt-4.1").

This breaks the default selection and may block flow import/use in the UI. Add "gpt-4.1" to the options array (or enable combobox) in both OpenAIModel blocks at lines 1096 and 1881.

   "model_name": {
     "options": [
+      "gpt-4.1",
       "gpt-4o-mini",
       "gpt-4o",
       "gpt-4-turbo",
       "gpt-4-turbo-preview",
       "gpt-4",
       "gpt-3.5-turbo",
       "gpt-3.5-turbo-0125"
     ],
     "combobox": false,
     "name": "model_name",
     "value": "gpt-4.1"

2378-2424: Update Notion API version to latest stable and centralize the constant.

The version "2022-06-28" is hardcoded in multiple code blocks embedded in this file (lines 120, 228, 2228, 2377, 2503, 2652, 3156). As of December 2025, the latest stable Notion API version is 2025-09-03. Extract the version to a single constant to prevent drift and enable easier updates across all components using it.

src/backend/base/langflow/components/docling/docling_remote.py (1)

171-192: Fix misalignment between file_list and processed_data when some files have no path

Right now processed_data is appended to in submission order while you ignore the stored index from futures. If any file.path is None, the positions in processed_data no longer line up with file_list, which can cause rollup_data to associate results with the wrong files.

Pre‑allocate processed_data and assign by index to keep everything aligned:

-        processed_data: list[Data | None] = []
+        processed_data: list[Data | None] = [None] * len(file_list)
         with (
             httpx.Client(headers=self.api_headers) as client,
             ThreadPoolExecutor(max_workers=self.max_concurrency) as executor,
         ):
             futures: list[tuple[int, Future]] = []
             for i, file in enumerate(file_list):
                 if file.path is None:
-                    processed_data.append(None)
-                    continue
-
-                futures.append((i, executor.submit(_convert_document, client, file.path, docling_options)))
-
-            for _index, future in futures:
+                    continue
+
+                futures.append((i, executor.submit(_convert_document, client, file.path, docling_options)))
+
+            for index, future in futures:
                 try:
                     result_data = future.result()
-                    processed_data.append(result_data)
+                    processed_data[index] = result_data
                 except (httpx.HTTPStatusError, httpx.RequestError, KeyError, ValueError) as exc:
                     self.log(f"Docling remote processing failed: {exc}")
                     raise
♻️ Duplicate comments (2)
src/backend/base/langflow/components/logic/flow_tool.py (1)

17-17: Inconsistency: Display name no longer indicates deprecation while legacy flag remains true.

Same issue as in sub_flow.py: removing "[Deprecated]" from the display name while keeping legacy: bool = True creates confusion. Users won't realize this component is deprecated and should migrate to logic.RunFlow.

Apply the same resolution approach as suggested for the SubFlowComponent.

Also applies to: 22-23

src/backend/base/langflow/components/input_output/chat_output.py (1)

87-89: Consider simplifying the getattr usage.

Same as the comment for ChatInput.get_artifacts(): since _artifacts is always initialized in __init__, you could simplify to return self._artifacts.

🟡 Minor comments (5)
src/backend/base/langflow/alembic/versions/d37bc4322900_drop_single_constraint_on_files_name_.py-128-154 (1)

128-154: Downgrade may fail silently if duplicate names exist across users.

After upgrade, users can have files with the same name. Downgrade will fail on the INSERT if duplicate names exist (violating the restored UNIQUE(name) constraint). Consider adding a pre-flight check:

         if db_dialect == "sqlite":
             # SQLite: Recreate table with both constraints
             print("SQLite: Recreating table with both constraints")
+            
+            # Check for duplicate names that would violate the single UNIQUE constraint
+            result = conn.execute(sa.text(
+                "SELECT name, COUNT(*) as cnt FROM file GROUP BY name HAVING COUNT(*) > 1"
+            ))
+            duplicates = result.fetchall()
+            if duplicates:
+                raise ValueError(
+                    f"Cannot downgrade: {len(duplicates)} file names have duplicates across users. "
+                    "Resolve duplicates before downgrading."
+                )

Committable suggestion skipped: line range outside the PR's diff.

src/backend/base/langflow/api/v1/auth_helpers.py-13-23 (1)

13-23: Docstring is incomplete—missing should_handle_composer return key.

The docstring lists only should_start_composer and should_stop_composer, but the function also returns should_handle_composer (line 74).

     """Handle auth settings update including encryption/decryption and MCP Composer logic.

     Args:
         existing_project: The project being updated (modified in-place)
         new_auth_settings: New auth settings (could be dict, Pydantic model, or None)

     Returns:
         Dict containing:
         - should_start_composer: bool
         - should_stop_composer: bool
+        - should_handle_composer: bool
     """
src/backend/base/langflow/api/v1/mcp_projects.py-560-561 (1)

560-561: Bug: generated_api_key is never assigned, so the success message omits API key info.

generated_api_key is initialized to None (line 560) but never reassigned. The actual API key is stored in langflow_api_key (line 635). As a result, the condition at line 714 is always False and the message about API key authentication is never included.

Apply this diff to fix:

             if should_generate_api_key:
                 async with session_scope() as api_key_session:
                     api_key_create = ApiKeyCreate(name=f"MCP Server {project.name}")
                     api_key_response = await create_api_key(api_key_session, api_key_create, current_user.id)
                     langflow_api_key = api_key_response.api_key
+                    generated_api_key = langflow_api_key
                     args.extend(["--headers", "x-api-key", langflow_api_key])

Also applies to: 631-636, 714-716

src/backend/base/langflow/api/v1/mcp_projects.py-1152-1156 (1)

1152-1156: Inconsistent encryption: Auth settings not encrypted when resetting OAuth projects.

At line 1155, when resetting OAuth projects, clean_auth.model_dump() is used directly without encryption. However, at line 1138, encrypt_auth_settings() is called. This inconsistency could cause issues if sensitive fields are added later or for consistency in how auth_settings are stored.

Consider encrypting for consistency:

                         fallback_auth_type = "apikey" if not settings_service.auth_settings.AUTO_LOGIN else "none"
                         clean_auth = AuthSettings(auth_type=fallback_auth_type)
-                        project.auth_settings = clean_auth.model_dump(exclude_none=True)
+                        project.auth_settings = encrypt_auth_settings(clean_auth.model_dump(exclude_none=True))
src/backend/base/langflow/components/huggingface/huggingface.py-107-109 (1)

107-109: Minor formatting issue: Missing space in display name.

The display_name value "HuggingFace HubAPI Token" appears to be missing a space between "Hub" and "API". For consistency and readability, consider:

 SecretStrInput(
-    name="huggingfacehub_api_token", display_name="HuggingFace HubAPI Token", password=True, required=True
+    name="huggingfacehub_api_token", display_name="HuggingFace Hub API Token", password=True, required=True
 ),
🧹 Nitpick comments (36)
src/backend/base/langflow/components/agentql/agentql_api.py (1)

127-127: Consider refactoring to use async HTTP client.

The synchronous httpx.post() call blocks the event loop. Per coding guidelines, component execution methods should use async patterns with await for I/O operations.

Consider refactoring to:

+    async def build_output(self) -> Data:
         endpoint = "https://api.agentql.com/v1/query-data"
         headers = {
             "X-API-Key": self.api_key,
             "Content-Type": "application/json",
             "X-TF-Request-Origin": "langflow",
         }
         
         payload = {
             "url": self.url,
             "query": self.query,
             "prompt": self.prompt,
             "params": {
                 "mode": self.mode,
                 "wait_for": self.wait_for,
                 "is_scroll_to_bottom_enabled": self.is_scroll_to_bottom_enabled,
                 "is_screenshot_enabled": self.is_screenshot_enabled,
             },
             "metadata": {
                 "experimental_stealth_mode_enabled": self.is_stealth_mode_enabled,
             },
         }
         
         if not self.prompt and not self.query:
             self.status = "Either Query or Prompt must be provided."
             raise ValueError(self.status)
         if self.prompt and self.query:
             self.status = "Both Query and Prompt can't be provided at the same time."
             raise ValueError(self.status)
         
         try:
+            async with httpx.AsyncClient() as client:
+                response = await client.post(endpoint, headers=headers, json=payload, timeout=self.timeout)
-            response = httpx.post(endpoint, headers=headers, json=payload, timeout=self.timeout)
                 response.raise_for_status()

Based on coding guidelines: "Use FastAPI async patterns with await for async operations in component execution methods".

src/backend/base/langflow/components/milvus/milvus.py (1)

43-46: Fix typo in consistency level label

Minor UX nit: display_name="Consistencey Level" has a typo. Consider correcting it to display_name="Consistency Level" for a more polished UI.

src/backend/base/langflow/components/clickhouse/clickhouse.py (1)

29-29: Improved clarity in the display name.

The new display name "Clickhouse Password" is clearer and more concise than the previous text.

Optional: Consider brand name consistency.

For consistency with line 28 which uses "ClickHouse" (capital H), consider using the official brand capitalization:

-        SecretStrInput(name="password", display_name="Clickhouse Password", required=True),
+        SecretStrInput(name="password", display_name="ClickHouse Password", required=True),
src/backend/base/langflow/components/data/url.py (1)

215-215: Good improvement to filter out None header values.

The change correctly prevents None values from being passed as HTTP headers, which improves request robustness.

Consider also checking that header["key"] is not None for symmetry:

-        headers_dict = {header["key"]: header["value"] for header in self.headers if header["value"] is not None}
+        headers_dict = {header["key"]: header["value"] for header in self.headers if header["key"] is not None and header["value"] is not None}
src/backend/base/langflow/components/chroma/chroma.py (1)

124-125: DRY violation: duplicate limit normalization logic.

The same limit normalization expression appears twice. Consider extracting it to a cached property or helper to avoid duplication and ensure consistent behavior.

+    @property
+    def _normalized_limit(self) -> int | None:
+        """Normalize limit to int or None."""
+        if self.limit is not None and str(self.limit).strip():
+            return int(self.limit)
+        return None
+
     @override
     @check_cached_vector_store
     def build_vector_store(self) -> Chroma:

Then replace both occurrences:

-        limit = int(self.limit) if self.limit is not None and str(self.limit).strip() else None
-        self.status = chroma_collection_to_data(chroma.get(limit=limit))
+        self.status = chroma_collection_to_data(chroma.get(limit=self._normalized_limit))
-            limit = int(self.limit) if self.limit is not None and str(self.limit).strip() else None
-            stored_data = chroma_collection_to_data(vector_store.get(limit=limit))
+            stored_data = chroma_collection_to_data(vector_store.get(limit=self._normalized_limit))

Also applies to: 142-143

src/backend/base/langflow/base/agents/events.py (1)

100-119: Consider removing duplicate code.

Lines 100-106 and 107-113 contain identical logic for handling dictionary items with "text" keys and "tool_use" types. The duplication can be eliminated by removing one of these blocks.

Apply this diff to remove the duplication:

     if isinstance(item, dict):
         if "text" in item:
             return item["text"]
         # If the item's type is "tool_use", return an empty string.
         # This likely indicates that "tool_use" outputs are not meant to be displayed as text.
         if item.get("type") == "tool_use":
             return ""
-    if isinstance(item, dict):
-        if "text" in item:
-            return item["text"]
-        # If the item's type is "tool_use", return an empty string.
-        # This likely indicates that "tool_use" outputs are not meant to be displayed as text.
-        if item.get("type") == "tool_use":
-            return ""
         # This is a workaround to deal with function calling by Anthropic
         # since the same data comes in the tool_output we don't need to stream it here
         # although it would be nice to
         if "partial_json" in item:
             return ""
     msg = f"Output is not a string or list of dictionaries with 'text' key: {output}"
     raise TypeError(msg)
src/backend/base/langflow/alembic/versions/d37bc4322900_drop_single_constraint_on_files_name_.py (1)

96-98: Consider adding a comment explaining the f-string usage for DDL.

The constraint name comes from PostgreSQL system catalogs (trusted), and DDL statements don't support parameterized identifiers. While this is acceptable, a brief comment would clarify the safety rationale for future maintainers.

             if constraints_to_drop:
                 for constraint_name in constraints_to_drop:
+                    # constraint_name comes from pg_constraint system table (trusted source)
+                    # DDL statements don't support parameterized identifiers
                     op.execute(f'ALTER TABLE "file" DROP CONSTRAINT "{constraint_name}"')
                     print(f"PostgreSQL: Dropped constraint {constraint_name}")
src/backend/base/langflow/components/logic/conditional_router.py (1)

146-159: Consider encapsulating conditional exclusion clearing in a graph method.

Lines 148-151 directly manipulate self.graph.conditional_exclusion_sources and self.graph.conditionally_excluded_vertices, duplicating the clearing logic already present in exclude_branch_conditionally(). This creates coupling to the graph's internal structure.

Consider adding a dedicated method like clear_conditional_exclusions(vertex_id) to the graph class, or reusing the existing clearing logic in exclude_branch_conditionally() which already clears previous exclusions from the same source vertex before applying new ones.

src/backend/base/langflow/__main__.py (1)

669-701: Safer handling of default superuser password, but consider type‑safety and help text accuracy

Using DEFAULT_SUPERUSER_PASSWORD.get_secret_value() and avoiding logging the default password is a solid security improvement. Two small follow‑ups to consider:

  • This assumes DEFAULT_SUPERUSER_PASSWORD is a SecretStr‑like object. If there’s any chance it can be a plain string (now or in future refactors), a defensive pattern would avoid an AttributeError, e.g.:

  •    username = DEFAULT_SUPERUSER
    
  •    password = DEFAULT_SUPERUSER_PASSWORD.get_secret_value()
    
  •    username = DEFAULT_SUPERUSER
    
  •    default_pw = DEFAULT_SUPERUSER_PASSWORD
    
  •    password = (
    
  •        default_pw.get_secret_value()
    
  •        if hasattr(default_pw, "get_secret_value")
    
  •        else str(default_pw)
    
  •    )
    
    
    
  • The CLI option help still says the password defaults to 'langflow' when AUTO_LOGIN is enabled, which may now be inaccurate if the default comes from configuration/secret settings. You might want to reword it to something like “Defaults to the configured AUTO_LOGIN password” so it stays correct if the secret changes.
src/backend/base/langflow/components/crewai/sequential_crew.py (1)

32-32: Pre-existing: Return type annotation mismatch.

The method declares -> Message but returns a Crew object. Consider fixing the return type annotation in a follow-up.

-    def build_crew(self) -> Message:
+    def build_crew(self) -> "Crew":
src/backend/base/langflow/components/agents/agent.py (3)

45-45: Consider deriving provider list from central source.

This hardcoded list could drift from MODEL_PROVIDERS_DICT. If the subset is intentional, consider adding a comment explaining which providers are intentionally included/excluded and why.


530-530: Remove stray comment.

Line 530 has a double comment marker (# #) that appears to be leftover from development.

-                # # Update with custom component
+                # Update with custom component

541-551: Consider extracting duplicated external_options structure.

This external_options configuration is duplicated from lines 76-86. Extract to a module-level constant to maintain consistency and reduce duplication.

EXTERNAL_OPTIONS_CONFIG = {
    "fields": {
        "data": {
            "node": {
                "name": "connect_other_models",
                "display_name": "Connect other models",
                "icon": "CornerDownLeft",
            }
        }
    },
}
src/backend/base/langflow/components/input_output/chat.py (1)

82-84: Consider simplifying the getattr usage.

Since _artifacts is always initialized in __init__, the getattr default value will never be used. You could simplify to return self._artifacts, though the current defensive pattern is acceptable if you want to guard against edge cases.

Apply this diff if you prefer the simpler approach:

 def get_artifacts(self) -> dict:
     """Return artifacts dictionary for framework compatibility."""
-    return getattr(self, "_artifacts", {})
+    return self._artifacts
src/backend/base/langflow/components/data/api_request.py (2)

188-201: Top-level Data unwrapping in _process_body fixes dropped-body issue

Unwrapping any body with a .data attribute before type checks is a good way to support Data-wrapped bodies (previously these would fall through and become {}).

If you don’t expect arbitrary objects with a .data attribute here, consider tightening this to isinstance(body, Data) for slightly clearer intent and to avoid surprising unwrapping of unrelated types in the future.


213-235: Clarify/adjust early-return semantics when list items unwrap to non key/value dicts

The per-item Data unwrap and use of current_item look good, but this block has two subtle behavior changes:

  • return unwrapped_data inside the loop means the first item whose .data is a dict but not key/value format will short‑circuit the whole list, discarding any entries already accumulated in processed_dict.
  • That returned dict bypasses _process_dict_body, so its values won’t go through _parse_json_value, unlike other dict bodies.

If the intent is “the body is actually a single dict wrapped in a one‑element list of Data”, consider tightening this to something like:

  • Only doing the direct‑dict return when len(body) == 1, and/or
  • Returning self._process_dict_body(unwrapped_data) for consistency.

Otherwise, add a brief comment documenting the assumption so future changes don’t accidentally rely on the current short‑circuit behavior.

src/backend/base/langflow/components/firecrawl/firecrawl_extract_api.py (1)

107-115: Verify optional extraction flags referenced but not defined as inputs

In params you access self.ignore_sitemap, self.include_subdomains, and self.show_sources, but the corresponding BoolInput entries are commented out in the inputs list. Unless Component guarantees these attributes exist with defaults, calling extract could raise AttributeError.

Consider either restoring those BoolInputs (with appropriate defaults) or guarding access via getattr(self, "ignore_sitemap", False) / similar for the others.

src/backend/base/langflow/components/helpers/memory.py (1)

193-224: Unify “last N messages” semantics between external and internal memory

The external-memory branch now clearly implements “last N messages, then reverse if order is DESC”, which is easy to reason about:

  • aget_messages() (ascending),
  • stored = stored[-n_messages:],
  • optionally stored = stored[::-1] when order == "DESC".

For internal memory, you currently:

  • pass order through to aget_messages(..., order=order), and
  • then do stored = stored[-n_messages:] without an additional reverse.

If aget_messages honors order, then when order == "DESC" you’ll get newest-first from the DB, and stored[-n_messages:] will give you the oldest N messages rather than the latest N, which differs from the external-memory behavior and from the “Get last N messages” comment.

To make semantics consistent (last N messages window, with order only affecting presentation), consider always fetching internal messages in ascending order and then mirroring the external pattern:

-        else:
-            # For internal memory, we always fetch the last N messages by ordering by DESC
-            stored = await aget_messages(
-                sender=sender_type,
-                sender_name=sender_name,
-                session_id=session_id,
-                limit=10000,
-                order=order,
-            )
-            if n_messages:
-                stored = stored[-n_messages:]  # Get last N messages
+        else:
+            # For internal memory, fetch messages in ascending order, then
+            # apply the same "last N then optional reverse" pattern as external memory.
+            stored = await aget_messages(
+                sender=sender_type,
+                sender_name=sender_name,
+                session_id=session_id,
+                limit=10000,
+                order="ASC",
+            )
+            if n_messages:
+                stored = stored[-n_messages:]
+            if order == "DESC":
+                stored = stored[::-1]

This keeps “Number of Messages” semantics identical across internal/external memory and avoids the window depending on the chosen sort direction.

src/backend/base/langflow/api/v2/files.py (2)

30-33: Remove unnecessary async from function.

get_mcp_file doesn't perform any await operations; it's pure string manipulation. Making it a regular function avoids the overhead of creating a coroutine.

-async def get_mcp_file(current_user: CurrentActiveUser, *, extension: bool = False) -> str:
+def get_mcp_file(current_user: CurrentActiveUser, *, extension: bool = False) -> str:
     # Create a unique MCP servers file with the user id appended
     return f"{MCP_SERVERS_FILE}_{current_user.id!s}" + (".json" if extension else "")

Note: This would require updating all call sites to remove await, so this may be deferred if the scope is too large.


123-128: Consider consolidating duplicate calls to get_mcp_file.

get_mcp_file is called twice with different extension values. You could simplify by computing the base name once:

-        mcp_file = await get_mcp_file(current_user)
-        mcp_file_ext = await get_mcp_file(current_user, extension=True)
+        mcp_file = await get_mcp_file(current_user)
+        mcp_file_ext = f"{mcp_file}.json"

This reduces one async call and makes the relationship explicit.

src/backend/base/langflow/api/v2/mcp.py (1)

88-91: Redundant get_mcp_file call.

mcp_file was already computed at line 55 and should still be valid here. The variable can be reused instead of recomputing:

         # Fetch and download again
-        mcp_file = await get_mcp_file(current_user)
         server_config_file = await get_file_by_name(mcp_file, current_user, session)
src/backend/base/langflow/api/v1/auth_helpers.py (1)

47-51: Potential KeyError if Pydantic model lacks expected fields.

model_dump may not include fields that don't exist on the model. Accessing auth_dict[field] after getattr check could fail if the field was excluded. Consider using .get() for safer access:

         secret_fields = ["api_key", "oauth_client_secret"]
         for field in secret_fields:
             field_val = getattr(new_auth_settings, field, None)
             if isinstance(field_val, SecretStr):
-                auth_dict[field] = field_val.get_secret_value()
+                if field in auth_dict or field_val is not None:
+                    auth_dict[field] = field_val.get_secret_value()

However, since you're checking isinstance(field_val, SecretStr), this should be safe in practice. Consider this a minor defensive coding suggestion.

src/backend/base/langflow/api/v1/projects.py (2)

86-89: new_project.id may be None in the log message before commit.

At this point, new_project.id hasn't been assigned by the database yet (happens after session.commit()). The log message will show None for the project ID.

Consider moving this log after the commit/refresh, or log only the project name:

-        await logger.adebug(
-            f"Auto-enabled API key authentication for project {new_project.name} "
-            f"({new_project.id}) due to AUTO_LOGIN=false"
-        )
+        # Log after commit when ID is available, or just log the name here
+        await logger.adebug(
+            f"Auto-enabled API key authentication for project {new_project.name} "
+            "due to AUTO_LOGIN=false"
+        )

420-423: Same issue: new_project.id will be None in log message.

Same as in create_project—the ID isn't assigned until after commit. Consider logging only the name or moving the log after commit.

src/backend/base/langflow/api/v1/mcp_projects.py (3)

82-84: Authentication logic is clear but condition could use clarification.

The condition handles two cases:

  1. No auth_settings AND AUTO_LOGIN=false → require API key
  2. auth_settings with auth_type="apikey" → require API key

This is correct, but consider adding a brief comment to clarify the intent:

+    # Require API key authentication when:
+    # 1. No project auth settings but AUTO_LOGIN is disabled globally
+    # 2. Project explicitly requires API key authentication
     if (not auth_settings and not settings_service.auth_settings.AUTO_LOGIN) or (
         auth_settings and auth_settings.auth_type == "apikey"
     ):

473-475: Unnecessary None check: get_project_sse_url always returns a string.

Looking at get_project_sse_url (line 858), it always returns a string URL. This None check will never trigger.

                     # Provide the direct SSE URL since we're no longer using composer
                     sse_url = await get_project_sse_url(project_id)
-                    if not sse_url:
-                        raise HTTPException(status_code=500, detail="Failed to get direct SSE URL")

If you want to keep defensive coding, consider checking for empty string instead, but the function won't return that either.


776-784: Duplicate project fetch: remove the first query.

The project is fetched twice: once at lines 777-782, then again via verify_project_access at line 784. The first query is now redundant.

Apply this diff to remove the duplicate:

     try:
-        # Verify project exists and user has access
-        async with session_scope() as session:
-            project = (
-                await session.exec(select(Folder).where(Folder.id == project_id, Folder.user_id == current_user.id))
-            ).first()
-
-            if not project:
-                raise HTTPException(status_code=404, detail="Project not found")
-
         project = await verify_project_access(project_id, current_user)
docs/docs/Integrations/Notion/Meeting_Notes_Agent.json (3)

116-176: Handle Notion pagination (users, search).

Current implementations return only the first page. Iterate on has_more/next_cursor to avoid missing results.

Sketch:

results, cursor = [], None
while True:
    payload = {"query": query, "start_cursor": cursor} if cursor else {"query": query}
    r = requests.post(url, headers=headers, json=payload, timeout=30)
    r.raise_for_status()
    data = r.json()
    results.extend(data.get("results", []))
    if not data.get("has_more"): break
    cursor = data.get("next_cursor")

Also applies to: 2641-2749, 110-160, 3145-3224


151-159: Trim trailing spaces in display_name values.

Minor UX polish; trailing spaces can affect search/filter in the UI.

Targets: “List Users ”, “Search ”, “Add Content to Page ”, “Create Page ”, “List Database Properties ”, “Update Page Property ”.

Also applies to: 322-324, 2154-2161, 2303-2306, 2431-2434, 2579-2582


111-119: Remove unused variables.

combined_text is built but never used in NotionUserList.run_model and NotionSearch.run_model.

- combined_text = ""
...
- combined_text += text

Also applies to: 216-224

src/backend/base/langflow/components/docling/docling_remote.py (1)

105-115: Docling v1 base URL and new sources payload shape look fine; consider normalizing api_url

Using /v1 and the new "sources" structure matches the updated Docling Serve API, but if api_url is configured with a trailing /, you’ll end up with a double slash in the final URL. It’s harmless for most servers but easy to avoid:

-        base_url = f"{self.api_url}/v1"
+        base_url = f"{self.api_url.rstrip('/')}/v1"
src/backend/base/langflow/base/data/docling_utils.py (2)

13-19: New DoclingDependencyError is reasonable; ensure it’s wired into callers

The custom exception shape (with dependency_name and install_command) is clear and useful for surfacing actionable messages. Just make sure it’s actually raised somewhere in the codebase (e.g., when mapping queue messages or validating environment) so it doesn’t become dead code.


207-238: Dependency error classification is helpful; consider centralizing/error‑string robustness

Classifying file‑processing errors into structured dependency_error messages with a dependency_name is a good improvement for UX. However, the detection currently relies on matching specific substrings in error_msg, which can be brittle if upstream exceptions change wording.

If this starts to drift, consider either:

  • Centralizing the mapping of known error messages → dependency metadata in one helper, or
  • Raising a dedicated dependency exception closer to where the errors originate and catching that here, instead of relying on string inspection.

For now this is acceptable, but worth keeping in mind for long‑term maintainability.

src/backend/base/langflow/components/docling/docling_inline.py (1)

220-247: Handle worker import_error separately so callers see an ImportError instead of a generic RuntimeError

The new dependency_error handling and the graceful‑shutdown path look good. However, the worker can also send {"error_type": "import_error", ...} for import‑related failures, and that currently falls through to the generic RuntimeError branch.

To keep error semantics consistent, consider treating import_error similarly to the other ImportError cases:

-        # Enhanced error checking with dependency-specific handling
+        # Enhanced error checking with dependency-specific handling
         if isinstance(result, dict) and "error" in result:
             error_msg = result["error"]

             # Handle dependency errors specifically
             if result.get("error_type") == "dependency_error":
@@
-                raise ImportError(user_message)
+                raise ImportError(user_message)
+
+            # Handle import errors from the worker
+            if result.get("error_type") == "import_error":
+                raise ImportError(error_msg)

             # Handle other specific errors
             if error_msg.startswith("Docling is not installed"):
                 raise ImportError(error_msg)
@@
             else:
                 raise RuntimeError(error_msg)

This way, all import‑related failures surface as ImportError, while true processing issues remain RuntimeError.

src/backend/base/langflow/components/data/file.py (2)

509-511: Enhance file path validation for subprocess security.

The current validation checks for shell metacharacters but could be more robust. Consider using pathlib.Path to resolve and validate the path, and check for path traversal attempts.

Apply this diff to improve path validation:

-        # Validate file_path to avoid command injection or unsafe input
-        if not isinstance(args["file_path"], str) or any(c in args["file_path"] for c in [";", "|", "&", "$", "`"]):
-            return Data(data={"error": "Unsafe file path detected.", "file_path": args["file_path"]})
+        # Validate file_path to avoid command injection or unsafe input
+        from pathlib import Path
+        if not isinstance(args["file_path"], str):
+            return Data(data={"error": "Invalid file path type.", "file_path": str(args["file_path"])})
+        try:
+            resolved_path = Path(args["file_path"]).resolve()
+            # Ensure the path exists and is a file
+            if not resolved_path.is_file():
+                return Data(data={"error": "File path must be a valid file.", "file_path": args["file_path"]})
+        except (ValueError, OSError) as e:
+            return Data(data={"error": f"Invalid file path: {e}", "file_path": args["file_path"]})

309-312: Optional: Simplify OCR engine conditional logic.

The condition can be simplified for better readability.

Apply this diff:

             "pipeline": str(self.pipeline),
-            "ocr_engine": (
-                self.ocr_engine if self.ocr_engine and self.ocr_engine != "None" and self.pipeline != "vlm" else None
-            ),
+            "ocr_engine": (
+                self.ocr_engine if self.ocr_engine not in (None, "None") and self.pipeline == "standard" else None
+            ),

Comment thread .env.example
Comment on lines 104 to 108
# Should enable the MCP composer feature in MCP projects
# Values: true, false
# Default: true
LANGFLOW_FEATURE_MCP_COMPOSER=
# Default: false
LANGFLOW_MCP_COMPOSER_ENABLED=

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the new env var name is used in the codebase and old name is not referenced

echo "=== Checking for new env var usage ==="
rg -n "MCP_COMPOSER_ENABLED" --type py

echo ""
echo "=== Checking for old env var references ==="
rg -n "FEATURE_MCP_COMPOSER" --type py

echo ""
echo "=== Checking for both in all file types ==="
echo "New var (all types):"
rg -n "MCP_COMPOSER_ENABLED" -t yaml -t json -t py -t md 2>/dev/null | head -20

echo ""
echo "Old var (all types):"
rg -n "FEATURE_MCP_COMPOSER" -t yaml -t json -t py -t md 2>/dev/null | head -20

Repository: langflow-ai/langflow

Length of output: 235


🏁 Script executed:

#!/bin/bash
# Search more broadly for MCP references

echo "=== Searching for MCP references in Python files ==="
rg -i "mcp" --type py | head -30

echo ""
echo "=== Searching for COMPOSER references ==="
rg -i "composer" --type py | head -20

echo ""
echo "=== Checking .env.example for all LANGFLOW_ variables ==="
grep -n "LANGFLOW_" .env.example | head -20

echo ""
echo "=== Checking how environment variables are loaded ==="
rg -n "os.getenv|os.environ" --type py | grep -i "langflow" | head -15

Repository: langflow-ai/langflow

Length of output: 8835


🏁 Script executed:

#!/bin/bash
# Check MCP Composer service implementation and how it's conditionally started

echo "=== Checking MCPComposerService implementation ==="
cat -n src/backend/base/langflow/services/mcp_composer/service.py | head -60

echo ""
echo "=== Checking where MCPComposerService is started ==="
rg -A 5 -B 5 "mcp_composer_service.start" src/backend/base/langflow/main.py

echo ""
echo "=== Checking settings/config files for feature flags ==="
fd -e py "settings" | grep -E "(settings|config)" | head -10

echo ""
echo "=== Checking for pydantic settings configuration ==="
rg -n "class Settings" --type py | head -5

Repository: langflow-ai/langflow

Length of output: 3792


🏁 Script executed:

#!/bin/bash
# Check Settings class for mcp_composer_enabled field

echo "=== Checking Settings class for mcp_composer_enabled ==="
rg -n "mcp_composer_enabled" src/backend/base/langflow/services/settings/base.py -B 2 -A 2

echo ""
echo "=== Full Settings class to see field definitions ==="
head -200 src/backend/base/langflow/services/settings/base.py | tail -150

echo ""
echo "=== Search for any FEATURE_MCP_COMPOSER in settings ==="
rg -n "FEATURE_MCP_COMPOSER" src/backend/base/langflow/services/settings/base.py

Repository: langflow-ai/langflow

Length of output: 7119


Environment variable naming is correct, but default value in .env.example conflicts with backend code.

The variable rename from LANGFLOW_FEATURE_MCP_COMPOSER to LANGFLOW_MCP_COMPOSER_ENABLED is correct—the backend already uses mcp_composer_enabled in the Settings class (line 281 in src/backend/base/langflow/services/settings/base.py). However, the default value has a critical inconsistency: the Settings class defaults to True, while the .env.example change sets it to empty (which would default to False). Align the .env.example default with the backend implementation, or update the Settings class default if the intent is to disable MCP Composer by default.

🤖 Prompt for AI Agents
In .env.example around lines 104 to 108, the LANGFLOW_MCP_COMPOSER_ENABLED
variable is present but its example/default value is empty, which conflicts with
the backend Settings class that defaults mcp_composer_enabled to True; update
the .env.example to set LANGFLOW_MCP_COMPOSER_ENABLED=true to match the backend
default (or alternatively change the Settings class default to False if the
intended behavior is to disable it), ensuring both the example file and the
Settings default are aligned.

Comment thread .github/workflows/ci.yml
if: ${{ needs.path-filter.outputs.python == 'true' && needs.set-ci-condition.outputs.should-run-tests == 'true' }}
uses: ./.github/workflows/python_test.yml
with:
ref: ${{ inputs.branch }}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find and inspect all referenced workflow files

echo "=== Locating workflow files ==="
find .github/workflows -name "python_test.yml" -o -name "jest_test.yml" -o -name "typescript_test.yml" -o -name "lint-py.yml" -o -name "docs_test.yml" | sort

echo ""
echo "=== python_test.yml inputs ==="
cat .github/workflows/python_test.yml 2>/dev/null | grep -A 20 "on:" | grep -A 20 "workflow_call:" || echo "File or section not found"

echo ""
echo "=== jest_test.yml inputs ==="
cat .github/workflows/jest_test.yml 2>/dev/null | grep -A 20 "on:" | grep -A 20 "workflow_call:" || echo "File or section not found"

echo ""
echo "=== typescript_test.yml inputs ==="
cat .github/workflows/typescript_test.yml 2>/dev/null | grep -A 20 "on:" | grep -A 20 "workflow_call:" || echo "File or section not found"

echo ""
echo "=== lint-py.yml inputs ==="
cat .github/workflows/lint-py.yml 2>/dev/null | grep -A 20 "on:" | grep -A 20 "workflow_call:" || echo "File or section not found"

echo ""
echo "=== docs_test.yml inputs ==="
cat .github/workflows/docs_test.yml 2>/dev/null | grep -A 20 "on:" | grep -A 20 "workflow_call:" || echo "File or section not found"

Repository: langflow-ai/langflow

Length of output: 2650


Fix typescript_test.yml to accept the ref parameter.

The CI workflow passes ref: ${{ inputs.branch }} to typescript_test.yml (line 195), but typescript_test.yml does not define a ref input parameter. This will cause the parameter to be silently ignored. Add ref as an input to typescript_test.yml's workflow_call trigger to match the other test workflows (python_test.yml and jest_test.yml both define ref), or update the CI workflow to use a different parameter name if appropriate.

Also note: lint-py.yml and docs_test.yml correctly define branch as their input parameter and are called with branch:, which is consistent.

🤖 Prompt for AI Agents
In .github/workflows/ci.yml around line 172 the CI calls typescript_test.yml
with ref: ${{ inputs.branch }} but .github/workflows/typescript_test.yml does
not declare a ref input; add a ref input to typescript_test.yml's workflow_call
inputs (matching the other test workflows) so the passed value is accepted, or
alternatively change the CI call to use branch: instead of ref if you prefer to
keep typescript_test.yml unchanged — ensure the inputs names match between
caller and workflow_call.

tag: ${{ needs.publish-main.needs.build-main.outputs.version }}
commit: ${{ github.ref }}
tag: ${{ needs.build-main.outputs.version }}
commit: ${{ inputs.release_branch }}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

CRITICAL: commit parameter should be a commit SHA, not a branch name.

The ncipollo/release-action@v1 action expects the commit parameter to be a git commit SHA (or full commit hash). Using a branch name will either fail or create an incorrect release pointing to the wrong commit. You need to use the actual commit SHA from the release branch.

Consider using github.sha or extracting the commit from the build step output:

- commit: ${{ inputs.release_branch }}
+ commit: ${{ github.sha }}

Alternatively, if you need the commit from the specific branch, extract it in a previous step:

+ - name: Get release commit SHA
+   id: get-commit
+   run: |
+     COMMIT=$(git rev-parse ${{ inputs.release_branch }})
+     echo "commit=$COMMIT" >> $GITHUB_OUTPUT
  - name: Create Release
    uses: ncipollo/release-action@v1
    with:
      artifacts: "dist/*"
      token: ${{ secrets.GITHUB_TOKEN }}
      draft: false
      generateReleaseNotes: true
      prerelease: ${{ inputs.pre_release }}
      tag: ${{ needs.build-main.outputs.version }}
-     commit: ${{ inputs.release_branch }}
+     commit: ${{ steps.get-commit.outputs.commit }}
🤖 Prompt for AI Agents
.github/workflows/release.yml around line 346: the release action is passing a
branch name to the commit parameter which must be a commit SHA; replace the
branch input with a real commit SHA — for example set commit: ${{ github.sha }}
or add a prior step that fetches the target branch and outputs its latest commit
SHA (git fetch origin <branch> && git rev-parse origin/<branch>) and use that
step output as the commit value.

Comment on lines +42 to +67
op.execute("""
CREATE TABLE file_new (
id CHAR(32) NOT NULL,
user_id CHAR(32) NOT NULL,
name VARCHAR NOT NULL,
path VARCHAR NOT NULL,
size INTEGER NOT NULL,
provider VARCHAR,
created_at DATETIME NOT NULL,
updated_at DATETIME NOT NULL,
PRIMARY KEY (id),
CONSTRAINT file_name_user_id_key UNIQUE (name, user_id),
FOREIGN KEY(user_id) REFERENCES user (id)
)
""")

# Copy data from old table to new table
op.execute("""
INSERT INTO file_new (id, user_id, name, path, size, provider, created_at, updated_at)
SELECT id, user_id, name, path, size, provider, created_at, updated_at
FROM file
""")

# Drop old table and rename new table
op.execute("DROP TABLE file")
op.execute("ALTER TABLE file_new RENAME TO file")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

SQLite table recreation may lose existing indexes.

When recreating the table for SQLite, any existing indexes (other than those implied by constraints) will be lost. Consider querying and recreating indexes to avoid potential performance degradation.

+            # Get existing indexes before dropping table
+            existing_indexes = inspector.get_indexes("file")
+            
             # Create the new table without the single UNIQUE(name) constraint
             op.execute("""
                 CREATE TABLE file_new (
                     ...
                 )
             """)
             
             # Copy data from old table to new table
             op.execute("""
                 INSERT INTO file_new (...)
                 SELECT ...
                 FROM file
             """)
             
             # Drop old table and rename new table
             op.execute("DROP TABLE file")
             op.execute("ALTER TABLE file_new RENAME TO file")
+            
+            # Recreate indexes
+            for idx in existing_indexes:
+                if not idx.get("unique"):  # Skip unique constraints (already handled)
+                    columns = ", ".join(idx["column_names"])
+                    idx_name = idx["name"]
+                    op.execute(f'CREATE INDEX "{idx_name}" ON file ({columns})')

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In
src/backend/base/langflow/alembic/versions/d37bc4322900_drop_single_constraint_on_files_name_.py
around lines 42 to 67, the migration recreates the SQLite table but does not
preserve any non-constraint indexes; to fix, query sqlite_master (or PRAGMA
index_list) for existing indexes on "file" before dropping it, filter out
implicit sqlite_autoindex entries and constraint-backed indexes, store their
CREATE INDEX SQL (or reconstruct them from index info), then after
creating/renaming the new table execute those CREATE INDEX statements so the
same indexes are re-created on the new table; implement this
capture-and-recreate logic in the migration around the table copy/drop/rename
steps.

Comment on lines +166 to +169
except Exception as e:
print(f"Error during downgrade: {e}")
# Don't raise in downgrade - log and continue
pass
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Silent failure in downgrade can corrupt migration state.

Swallowing exceptions during downgrade leaves Alembic's migration history inconsistent with actual database state. If the constraint restoration fails, the database won't match what Alembic expects after downgrade.

Consider re-raising the exception or at minimum logging at error level:

     except Exception as e:
-        print(f"Error during downgrade: {e}")
-        # Don't raise in downgrade - log and continue
-        pass
+        print(f"Error during downgrade: {e}")
+        raise  # Let Alembic handle the failure and maintain consistent state

If there's a specific reason to allow partial downgrades, document it explicitly and consider using proper logging instead of print.

Comment on lines +393 to +410
# --- Vision-Language Model (VLM) pipeline ---
if pipeline == "vlm":
try:
from docling.pipeline.vlm_pipeline import VlmPipeline
from docling.document_converter import PdfFormatOption # type: ignore

vl_pipe = VlmPipelineOptions()

# VLM paths generally don't need OCR; keep OCR off by default here.
fmt = {}
if hasattr(input_format, "PDF"):
fmt[getattr(input_format, "PDF")] = PdfFormatOption(pipeline_cls=VlmPipeline)
if hasattr(input_format, "IMAGE"):
fmt[getattr(input_format, "IMAGE")] = PdfFormatOption(pipeline_cls=VlmPipeline)

return DocumentConverter(format_options=fmt)
except Exception:
return DocumentConverter()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Docling VLM pipeline VlmPipelineOptions class documentation

💡 Result:

VlmPipelineOptions — summary of key fields and usage

  • Purpose: pipeline configuration for the VlmPipeline (vision–language model based document conversion). [1][2]

  • Main fields (selection of the most-used):

    • vlm_options: InlineVlmOptions or ApiVlmOptions (local or OpenAI‑compatible remote model). [1][2]
    • enable_remote_services: allow API/offload to remote inference endpoints. [2]
    • force_backend_text: prefer backend/extracted PDF text over VLM output. [1][2]
    • generate_page_images, generate_picture_images: whether to create full‑page and cropped picture images. [1][2]
    • images_scale: float image scaling factor. [1][2]
    • do_picture_classification, do_picture_description and picture_description_options. [1]
    • accelerator_options (GPU/accelerator settings). [1]
    • document_timeout, allow_external_plugins, artifacts_path, kind. [1]
  • Example usage: create a VlmPipelineOptions with InlineVlmOptions (local model) or with ApiVlmOptions and set enable_remote_services=True to call hosted/OpenAI‑compatible endpoints; then pass to DocumentConverter / PdfFormatOption when using VlmPipeline. [2][3]

Sources

  • Docling API reference — VlmPipelineOptions attribute list. [1]
  • Docling usage pages and VLM pipeline examples (inline and remote configurations). [2][3]

🏁 Script executed:

# First, let's find and examine the actual file to see the full context
find . -type f -name "file.py" -path "*/components/data/*" | head -5

Repository: langflow-ai/langflow

Length of output: 115


🏁 Script executed:

# Check the file around the problematic lines
sed -n '380,420p' ./src/backend/base/langflow/components/data/file.py

Repository: langflow-ai/langflow

Length of output: 2124


🏁 Script executed:

# Check imports at the top of the file
head -50 ./src/backend/base/langflow/components/data/file.py

Repository: langflow-ai/langflow

Length of output: 1538


🏁 Script executed:

# Get the full create_converter function to see complete context
sed -n '310,430p' ./src/backend/base/langflow/components/data/file.py

Repository: langflow-ai/langflow

Length of output: 6148


Critical: Missing import for VlmPipelineOptions and unused variable assignment.

Line 399 references VlmPipelineOptions() but this class is not imported in the try block (lines 396–397 only import VlmPipeline and PdfFormatOption). This will cause a NameError at runtime. Additionally, the vl_pipe variable is assigned but never used—the VLM pipeline is configured via pipeline_cls=VlmPipeline in PdfFormatOption, not through pipeline_options.

Remove the unused line 399 entirely, as the VLM pipeline handles its own configuration automatically without requiring explicit pipeline options:

                 if pipeline == "vlm":
                     try:
                         from docling.pipeline.vlm_pipeline import VlmPipeline
                         from docling.document_converter import PdfFormatOption  # type: ignore

-                        vl_pipe = VlmPipelineOptions()
-
                         # VLM paths generally don't need OCR; keep OCR off by default here.
                         fmt = {}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# --- Vision-Language Model (VLM) pipeline ---
if pipeline == "vlm":
try:
from docling.pipeline.vlm_pipeline import VlmPipeline
from docling.document_converter import PdfFormatOption # type: ignore
vl_pipe = VlmPipelineOptions()
# VLM paths generally don't need OCR; keep OCR off by default here.
fmt = {}
if hasattr(input_format, "PDF"):
fmt[getattr(input_format, "PDF")] = PdfFormatOption(pipeline_cls=VlmPipeline)
if hasattr(input_format, "IMAGE"):
fmt[getattr(input_format, "IMAGE")] = PdfFormatOption(pipeline_cls=VlmPipeline)
return DocumentConverter(format_options=fmt)
except Exception:
return DocumentConverter()
# --- Vision-Language Model (VLM) pipeline ---
if pipeline == "vlm":
try:
from docling.pipeline.vlm_pipeline import VlmPipeline
from docling.document_converter import PdfFormatOption # type: ignore
# VLM paths generally don't need OCR; keep OCR off by default here.
fmt = {}
if hasattr(input_format, "PDF"):
fmt[getattr(input_format, "PDF")] = PdfFormatOption(pipeline_cls=VlmPipeline)
if hasattr(input_format, "IMAGE"):
fmt[getattr(input_format, "IMAGE")] = PdfFormatOption(pipeline_cls=VlmPipeline)
return DocumentConverter(format_options=fmt)
except Exception:
return DocumentConverter()
🤖 Prompt for AI Agents
In src/backend/base/langflow/components/data/file.py around lines 393 to 410,
the try block for the "vlm" pipeline references VlmPipelineOptions() on line 399
which is not imported and the assigned variable vl_pipe is unused; remove the
unused vl_pipe assignment (delete the VlmPipelineOptions() line) and ensure no
other references to VlmPipelineOptions remain in this block so the code relies
on passing pipeline_cls=VlmPipeline to PdfFormatOption as intended, leaving the
except branch unchanged.

Comment on lines +611 to +621
def load_files_helper(self) -> DataFrame:
result = self.load_files()

# Error condition - raise error if no text and an error is present
if not hasattr(result, "text"):
if hasattr(result, "error"):
raise ValueError(result.error[0])
msg = "No content generated."
raise ValueError(msg)

return result
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Review error handling logic for DataFrame results.

The error checking at lines 615-619 uses hasattr(result, "text") and hasattr(result, "error") on a DataFrame object. DataFrames have columns (accessed via indexing) rather than direct attributes for data fields. This logic may not correctly detect errors or missing content.

Consider checking for column existence or DataFrame emptiness instead:

     def load_files_helper(self) -> DataFrame:
         result = self.load_files()
 
-        # Error condition - raise error if no text and an error is present
-        if not hasattr(result, "text"):
-            if hasattr(result, "error"):
-                raise ValueError(result.error[0])
-            msg = "No content generated."
-            raise ValueError(msg)
+        # Error condition - raise error if DataFrame is empty or contains errors
+        if result.empty:
+            msg = "No content generated."
+            raise ValueError(msg)
+        
+        if "error" in result.columns and not result["error"].isna().all():
+            raise ValueError(result["error"].iloc[0])
 
         return result
🤖 Prompt for AI Agents
In src/backend/base/langflow/components/data/file.py around lines 611 to 621,
the code incorrectly uses hasattr(result, "text") and hasattr(result, "error")
on a pandas DataFrame; instead check DataFrame internals: first ensure result is
a DataFrame and not empty (raise ValueError("No content generated.") if
result.empty), then check for an "error" column (if "error" in result.columns
and result["error"].notna().any() raise ValueError with the first non-null error
string), otherwise check for a "text" column (if "text" not in result.columns or
result["text"].dropna().eq("").all() raise ValueError("No content generated."));
return result only when the DataFrame contains valid text rows.

Comment on lines 231 to +232
collection = chroma._client.get_collection(name=self.knowledge_base)
embeddings_result = collection.get(where={"_id": {"$in": doc_ids}}, include=["embeddings", "metadatas"])
embeddings_result = collection.get(where={"_id": {"$in": doc_ids}}, include=["metadatas", "embeddings"])
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

langchain_chroma Chroma API public methods documentation

💡 Result:

Here are official docs/API references for the langchain_chroma Chroma vector-store and its public methods:

  • LangChain Python Chroma integration (API reference page listing class and methods such as add_documents/add_vectors/get/update/upsert/delete/peek/query, from_documents/from_texts/from_existing_collection, etc.). [1]

  • LangChain docs (overview + how-to with link to API reference). [2]

  • LangChain JS/Community Chroma class reference (method index and signatures). [3]

Links:
[1] API reference — langchain_chroma.vectorstores.Chroma. [python.langchain.com API].
[2] LangChain Docs — Chroma integration (Python).
[3] LangChain.js / langchain-community — Chroma class reference.

If you want, I can fetch and paste the full list of public method names and short signatures from the API page. Which format do you prefer (plain list, code-style signatures, or downloadable link)?


🏁 Script executed:

# Let's examine the actual code to understand the context
fd "retrieval.py" src/backend/base/langflow/components/knowledge_bases/ | head -5

Repository: langflow-ai/langflow

Length of output: 129


🏁 Script executed:

# Read the retrieval.py file around the flagged lines
sed -n '220,240p' src/backend/base/langflow/components/knowledge_bases/retrieval.py | cat -n

Repository: langflow-ai/langflow

Length of output: 1363


🏁 Script executed:

# Let's see more context around this code to understand what's being done
sed -n '200,250p' src/backend/base/langflow/components/knowledge_bases/retrieval.py | cat -n

Repository: langflow-ai/langflow

Length of output: 2694


🌐 Web query:

langchain_chroma Chroma class get method embeddings by id filter

💡 Result:

Use Chroma.get(...) and request embeddings via the include parameter. Example (Python):

from langchain.vectorstores import Chroma

assume chroma = Chroma(...) already initialized

res = chroma.get(
ids=["id1", "id2"], # or None to use where only
where={"category": "invoice"}, # optional metadata filter
include=["embeddings", "metadatas"] # ask for embeddings to be returned
)
embeddings = dict(zip(res["ids"], res["embeddings"])) # map id -> embedding

Notes:

  • get returns a dict with keys "ids", "embeddings", "metadatas", "documents". IDs are always returned; include controls the other fields.
  • The where filter is a Chroma/Where dict (e.g. {"color":"red"} or nested $and/$or forms).
    Sources: LangChain Chroma API docs and Chroma collection reference. [1][2]

References
[1] LangChain Chroma get API.
[2] Chroma Collection get (includes include/where semantics).


🌐 Web query:

chroma python client API get_collection method

💡 Result:

The Chroma (chromadb) Python Client provides client.get_collection(...) to fetch an existing collection. Key points:

  • Purpose: return a Collection object by name or id; raises ValueError if the collection does not exist. [1][2]
  • Typical signature / arguments: name (or id), optional embedding_function, optional data_loader. Example:
    client.get_collection("my_collection") or client.get_collection(name="my_collection", embedding_function=emb_fn). [1][3]
  • Alternative helper: client.get_or_create_collection(...) will create the collection if missing. [1][3]
  • Usage notes: list_collections()/count_collections() are available; metadata can be provided when creating/getting via get_or_create_collection (existing metadata may be overwritten). [1][2]

Sources: Chroma cookbook/docs and PyPI package docs. [1][2][3]

References
[1] Chroma Cookbook — Collections (get_collection / get_or_create_collection).
[2] chromadb package page (PyPI) — client methods.
[3] chromadb client reference / docs and examples (docstring / community Q&A).


Replace private _client access with public Chroma.get() method.

The code accesses chroma._client.get_collection() to retrieve embeddings, but the LangChain Chroma class provides a public get() method that supports the same functionality. Replace the private API calls at lines 231-232 with:

embeddings_result = chroma.get(where={"_id": {"$in": doc_ids}}, include=["metadatas", "embeddings"])

This eliminates reliance on internal APIs and ensures compatibility with future langchain_chroma updates.

🤖 Prompt for AI Agents
In src/backend/base/langflow/components/knowledge_bases/retrieval.py around
lines 231 to 232, the code is directly accessing the private
chroma._client.get_collection to fetch embeddings; replace that private API
usage with the public Chroma.get() call by removing the _client.get_collection
call and using chroma.get(where={"_id": {"$in": doc_ids}}, include=["metadatas",
"embeddings"]) so the code relies on the public LangChain Chroma API and remains
compatible with future updates.

Comment on lines +177 to 181
if result or force_output:
self.status = self.true_case_message
self.iterate_and_stop_once("false_result")
if not force_output: # Only stop the other branch if not forcing due to max iterations
self.iterate_and_stop_once("false_result")
return self.true_case_message
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Potential issue: Missing branch stop when force_output is True.

When force_output is True, iterate_and_stop_once("false_result") is not called (line 179-180). This means the false_result branch is not stopped and not excluded.

Since false_response() will also execute (due to group_outputs=True), it will evaluate the condition independently and may also attempt to produce output or stop branches, leading to inconsistent behavior.

Should the forced output case also stop the alternate branch to maintain consistent routing semantics?

         if result or force_output:
             self.status = self.true_case_message
-            if not force_output:  # Only stop the other branch if not forcing due to max iterations
-                self.iterate_and_stop_once("false_result")
+            self.iterate_and_stop_once("false_result")
             return self.true_case_message

Or if the intent is to avoid iteration increment during force_output, consider a flag parameter to iterate_and_stop_once to skip the iteration update while still stopping the branch.

🤖 Prompt for AI Agents
In src/backend/base/langflow/components/logic/conditional_router.py around lines
177 to 181, the code currently skips calling
iterate_and_stop_once("false_result") when force_output is True which leaves the
false branch running; call iterate_and_stop_once("false_result") in the
force_output path as well to ensure the alternate branch is stopped, or modify
iterate_and_stop_once to accept a flag (e.g., skip_iteration=True) and call
iterate_and_stop_once("false_result", skip_iteration=True) when force_output is
True so the branch is stopped without incrementing iteration counters.


class SubFlowComponent(Component):
display_name = "Sub Flow [Deprecated]"
display_name = "Sub Flow"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Inconsistency: Display name no longer indicates deprecation while legacy flag remains true.

Removing "[Deprecated]" from the display name while keeping legacy: bool = True creates a disconnect between the internal status and what users see. Users won't realize this component is deprecated and that they should migrate to the replacement. This could hinder adoption of the logic.RunFlow component.

Consider either:

  1. Keeping "[Deprecated]" in the display name for clarity, or
  2. Adding a user-facing deprecation warning/notice that references the replacement attribute, or
  3. If the component is no longer deprecated, remove the legacy: bool = True flag

Also applies to: 18-19

🤖 Prompt for AI Agents
In src/backend/base/langflow/components/logic/sub_flow.py around lines 15 and
18-19, the display_name no longer indicates deprecation while legacy: bool =
True remains, causing a mismatch; update the code so UI and internal state match
by either (A) restoring "[Deprecated]" in display_name (e.g., "Sub Flow
[Deprecated]") and optionally emitting a user-facing deprecation warning
referencing the replacement, or (B) if the component is no longer deprecated,
remove or set legacy: bool = False and clear any deprecation messaging; pick one
of these two consistent actions and apply it across the mentioned lines.

@HimavarshaVS HimavarshaVS changed the base branch from main to release-1.6.0 December 10, 2025 15:40
@github-actions github-actions Bot added bug Something isn't working and removed bug Something isn't working labels Dec 10, 2025
@HimavarshaVS
Copy link
Copy Markdown
Collaborator Author

@coderabbitai can you refactor the description?

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Dec 10, 2025

Tip

For best results, initiate chat on the files or code changes.

@HimavarshaVS I'd be happy to help refactor a description! However, I need a bit more context to provide the most useful response. Could you please clarify which description you'd like me to refactor?

Here are some possibilities:

  1. PR Description: Would you like me to create or improve the description for this pull request based on the changes made?

  2. Component Description: Are you referring to a specific component's description in one of the modified files (e.g., ChatInput, ChatOutput, FileComponent, etc.)?

  3. AI Summary: Would you like me to refine one of the AI-generated summaries shown above?

  4. Specific Code Comment: Is there a particular docstring or description in a specific file that needs refactoring?

Please let me know which description you're referring to, and I'll be glad to help refactor it! If you can point me to the specific file or context, that would be even more helpful.

@HimavarshaVS
Copy link
Copy Markdown
Collaborator Author

Tip

For best results, initiate chat on the files or code changes.

@HimavarshaVS I'd be happy to help refactor a description! However, I need a bit more context to provide the most useful response. Could you please clarify which description you'd like me to refactor?

Here are some possibilities:

  1. PR Description: Would you like me to create or improve the description for this pull request based on the changes made?
  2. Component Description: Are you referring to a specific component's description in one of the modified files (e.g., ChatInput, ChatOutput, FileComponent, etc.)?
  3. AI Summary: Would you like me to refine one of the AI-generated summaries shown above?
  4. Specific Code Comment: Is there a particular docstring or description in a specific file that needs refactoring?

Please let me know which description you're referring to, and I'll be glad to help refactor it! If you can point me to the specific file or context, that would be even more helpful.

I meant the PR description

@HimavarshaVS HimavarshaVS added the lgtm This PR has been approved by a maintainer label Dec 10, 2025
@github-actions
Copy link
Copy Markdown
Contributor

Frontend Unit Test Coverage Report

Coverage Summary

Lines Statements Branches Functions
Coverage: 9%
9.61% (2522/26241) 3.92% (793/20187) 4.05% (229/5648)

Unit Test Results

Tests Skipped Failures Errors Time
831 0 💤 0 ❌ 0 🔥 15.427s ⏱️

@codecov
Copy link
Copy Markdown

codecov Bot commented Dec 10, 2025

Codecov Report

❌ Patch coverage is 80.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 32.68%. Comparing base (53366d4) to head (ebe0561).
⚠️ Report is 2 commits behind head on release-1.6.0.

Files with missing lines Patch % Lines
...kend/base/langflow/components/input_output/chat.py 80.00% 1 Missing ⚠️
...se/langflow/components/input_output/chat_output.py 80.00% 1 Missing ⚠️

❌ Your project status has failed because the head coverage (8.83%) is below the target coverage (10.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

Impacted file tree graph

@@                Coverage Diff                @@
##           release-1.6.0   #10946      +/-   ##
=================================================
- Coverage          35.74%   32.68%   -3.07%     
=================================================
  Files               1220     1220              
  Lines              58150    58212      +62     
  Branches            5530     5530              
=================================================
- Hits               20787    19027    -1760     
- Misses             37202    39024    +1822     
  Partials             161      161              
Flag Coverage Δ
backend 50.30% <80.00%> (-5.36%) ⬇️
frontend 8.83% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...kend/base/langflow/components/input_output/chat.py 93.75% <80.00%> (-6.25%) ⬇️
...se/langflow/components/input_output/chat_output.py 81.70% <80.00%> (-6.61%) ⬇️

... and 128 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@HzaRashid
Copy link
Copy Markdown
Collaborator

This backend test is broken (not because of the PR)

FAILED src/backend/tests/unit/components/processing/test_split_text_component.py::TestSplitTextComponent::test_with_url_loader - AssertionError: Expected DataFrame with more than 2 rows, got 2

I believe @erichare fixed this test in main.

As for the frontend test I am not sure, again seems like something is broken in the test suite.

Really confused by @coderabbitai, it looks like you only made a couple of small changes in chat/chat_output.py

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Dec 11, 2025

Tip

For best results, initiate chat on the files or code changes.

@HzaRashid, thank you for the context on the test failures!

Regarding the backend test failure in test_split_text_component.py, I understand this appears to be a pre-existing issue in the test suite rather than something introduced by this PR. If @erichare has already fixed it in main, merging the latest changes should resolve it.

About the chat component changes - I can see why there might be some confusion! While the changes may look small at first glance, there are actually several significant modifications to both chat.py (ChatInput) and chat_output.py (ChatOutput):

Changes made to both components:

  1. Added __init__ method to initialize a new _artifacts dictionary
  2. Added get_artifacts() method to expose the artifacts
  3. Removed multiple input definitions:
    • background_color
    • chat_icon (or chat_icon reference)
    • text_color
    • (ChatOutput also removed clean_data)
  4. Removed all icon and color-related properties handling from message creation
  5. Modified message handling logic to work with the simplified input structure

So while the diff might appear compact in some views, these changes represent a meaningful refactoring that:

  • Introduces artifact support framework
  • Simplifies the component interface by removing styling inputs
  • Streamlines message creation logic

The changes align with the PR's goal to "fix: update chat components" by cleaning up and modernizing the chat component architecture. Does this help clarify the scope of the modifications?

Copy link
Copy Markdown
Collaborator

@HzaRashid HzaRashid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as long as we can get ci passing and coderabbit calms down lgtm haha

@HimavarshaVS
Copy link
Copy Markdown
Collaborator Author

closing this as the fix is already on 1.7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working lgtm This PR has been approved by a maintainer

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants