
ref: URL and File components with Dataframe output #8117

Merged
erichare merged 94 commits into main from native-components-clean
May 30, 2025

Conversation

@edwinjosechittilappilly
Collaborator

@edwinjosechittilappilly edwinjosechittilappilly commented May 19, 2025

This pull request introduces several updates across multiple components to enhance functionality, improve code maintainability, and simplify data handling. Key changes include the removal of legacy methods and outputs, the addition of new configurable options, and the refactoring of data conversion logic to use a centralized utility function.

Updates to Data Handling and Outputs:

  • Removed the load_message method and its associated output from the BaseFile class, consolidating the focus on DataFrame handling (src/backend/base/langflow/base/data/base_file.py). [1] [2]
  • Updated the FileComponent description to reflect its focus on loading content as a DataFrame (src/backend/base/langflow/components/data/file.py).

Enhancements to the URL Component:

  • Added new configurable options like filter_text_html, continue_on_failure, check_response_status, and autoset_encoding to the URLComponent for more granular control over web scraping behavior (src/backend/base/langflow/components/data/url.py).
  • Replaced IntInput for crawl depth with SliderInput and introduced constants for default values to improve usability and maintainability (src/backend/base/langflow/components/data/url.py). [1] [2]
  • Refactored URL validation and loading logic for better error handling and modularity (src/backend/base/langflow/components/data/url.py).
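The URL validation step described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in url.py: the function name `ensure_url` and the constants `DEFAULT_SCHEME` and `ALLOWED_SCHEMES` are assumptions made for the example.

```python
from urllib.parse import urlparse

# Hypothetical module-level defaults, in the spirit of "constants for default values".
DEFAULT_SCHEME = "https"
ALLOWED_SCHEMES = ("http", "https")


def ensure_url(raw: str) -> str:
    """Normalize a user-supplied URL and raise ValueError if it is unusable."""
    url = raw.strip()
    if not url:
        msg = "URL must not be empty"
        raise ValueError(msg)
    # Add a scheme when the user typed a bare host such as "example.com".
    if "://" not in url:
        url = f"{DEFAULT_SCHEME}://{url}"
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        msg = f"Unsupported URL scheme: {parsed.scheme!r}"
        raise ValueError(msg)
    # A usable URL needs a host and must not contain whitespace.
    if not parsed.hostname or any(ch.isspace() for ch in url):
        msg = f"Invalid URL: {raw!r}"
        raise ValueError(msg)
    return url
```

Raising a single exception type from one helper keeps the loading code free of ad hoc checks, which is one plausible reading of "better error handling and modularity".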

Refactoring for Code Simplification:

  • Replaced custom _safe_convert methods in multiple components with a centralized safe_convert utility function to standardize data conversion (src/backend/base/langflow/components/outputs/chat.py, src/backend/base/langflow/components/processing/parser.py). [1] [2]
  • Removed unnecessary imports and legacy code from several files, streamlining the codebase (src/backend/base/langflow/components/processing/save_to_file.py, src/backend/base/langflow/components/processing/parser.py). [1] [2]
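To illustrate the idea of one shared conversion helper replacing per-component `_safe_convert` methods, here is a minimal sketch. The `Message` and `Data` classes are hypothetical stand-ins for the langflow schema types, and the body is a simplification (no DataFrame branch), not the actual `safe_convert` in langflow/helpers/data.py.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Message:
    """Hypothetical stand-in for langflow.schema.message.Message."""

    text: str


@dataclass
class Data:
    """Hypothetical stand-in for langflow.schema.data.Data."""

    data: dict[str, Any] = field(default_factory=dict)


def safe_convert(value: Any) -> str:
    """Convert a supported input to a string, wrapping failures in ValueError."""
    try:
        if isinstance(value, str):
            return value
        if isinstance(value, Message):
            return value.text
        if isinstance(value, Data):
            return json.dumps(value.data, indent=2)
        # Fallback for anything else that has a string form.
        return str(value)
    except (ValueError, TypeError, AttributeError) as exc:
        msg = f"Error converting data: {exc!s}"
        raise ValueError(msg) from exc
```

With a module-level function like this, each component calls `safe_convert(...)` instead of carrying its own copy of the dispatch logic.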

Marking Legacy Components:

  • Marked DataToDataFrameComponent and MessageToDataComponent as legacy to indicate their deprecated status (src/backend/base/langflow/components/processing/data_to_dataframe.py, src/backend/base/langflow/components/processing/message_to_data.py). [1] [2]

Summary by CodeRabbit

  • Refactor

    • Centralized data-to-string conversion across components by delegating to a shared helper function, simplifying internal logic and improving consistency.
    • Updated file and URL processing components to clarify descriptions, streamline outputs, and improve input handling.
    • Introduced and marked legacy status for certain processing components.
    • Enhanced test coverage and updated test selectors to align with UI and component changes.
  • New Features

    • Added a component to save data or messages to files in various formats, with asynchronous saving and uploading capabilities.
    • Improved URL loading with enhanced validation, recursive crawling, and more detailed output formatting.
  • Bug Fixes

    • Improved error handling and validation in file saving and URL processing components.
    • Updated tests to better reflect expected behaviors and handle edge cases.
  • Chores

    • Reformatted and updated starter project configurations for clarity and consistency.
    • Adjusted test scripts to match updated component outputs and UI changes.

@github-actions github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels May 19, 2025
@github-actions github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels May 20, 2025
@github-actions github-actions Bot removed the enhancement New feature or request label May 20, 2025
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 86d142a and 3e42c5c.

📒 Files selected for processing (10)
  • src/backend/base/langflow/components/input_output/chat_output.py (3 hunks)
  • src/backend/base/langflow/initial_setup/starter_projects/blog_writer.py (3 hunks)
  • src/backend/base/langflow/initial_setup/starter_projects/document_qa.py (2 hunks)
  • src/backend/base/langflow/initial_setup/starter_projects/vector_store_rag.py (3 hunks)
  • src/backend/tests/unit/initial_setup/starter_projects/test_vector_store_rag.py (1 hunks)
  • src/frontend/tests/core/features/freeze.spec.ts (1 hunks)
  • src/frontend/tests/core/features/stop-building.spec.ts (1 hunks)
  • src/frontend/tests/core/unit/fileUploadComponent.spec.ts (1 hunks)
  • src/frontend/tests/extended/features/loop-component.spec.ts (3 hunks)
  • src/frontend/tests/extended/integrations/chatInputOutputUser-shard-1.spec.ts (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • src/frontend/tests/core/features/stop-building.spec.ts
  • src/frontend/tests/core/features/freeze.spec.ts
  • src/backend/base/langflow/initial_setup/starter_projects/blog_writer.py
  • src/backend/tests/unit/initial_setup/starter_projects/test_vector_store_rag.py
  • src/backend/base/langflow/initial_setup/starter_projects/document_qa.py
  • src/backend/base/langflow/initial_setup/starter_projects/vector_store_rag.py
  • src/frontend/tests/extended/integrations/chatInputOutputUser-shard-1.spec.ts
  • src/frontend/tests/core/unit/fileUploadComponent.spec.ts
🧰 Additional context used
🧬 Code Graph Analysis (2)
src/frontend/tests/extended/features/loop-component.spec.ts (1)
src/frontend/tests/utils/upload-file.ts (1)
  • uploadFile (34-83)
src/backend/base/langflow/components/input_output/chat_output.py (2)
src/backend/base/langflow/helpers/data.py (1)
  • safe_convert (165-191)
src/backend/base/langflow/schema/data.py (1)
  • Data (17-249)
⏰ Context from checks skipped due to timeout of 90000ms (19)
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 10/10
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 9/10
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 3/10
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 7/10
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 4/10
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 6/10
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 5/10
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 2/10
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 8/10
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 1/10
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 5
  • GitHub Check: Lint Backend / Run Mypy (3.13)
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 3
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 2
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 1
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 4
  • GitHub Check: Optimize new Python code in this PR
  • GitHub Check: Run benchmarks (3.12)
  • GitHub Check: Update Starter Projects
🔇 Additional comments (6)
src/backend/base/langflow/components/input_output/chat_output.py (2)

8-8: LGTM: Proper import of centralized utility function.

The import of safe_convert from langflow.helpers.data aligns with the refactoring to centralize data conversion logic across components.


161-168: LGTM: Well-implemented data serialization method.

The _serialize_data method properly handles JSON serialization with appropriate error handling and markdown formatting. The use of orjson with pretty printing and proper encoding conversion is good practice.
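The serialization pattern discussed here (pretty-printed JSON wrapped in a Markdown code block) can be sketched with the standard library alone. This is an approximation: the component uses orjson and FastAPI's `jsonable_encoder`, while the example below substitutes `json.dumps` with `default=str`, and the name `serialize_data` is an assumption.

```python
import json
from typing import Any

# Built programmatically so this example does not embed a literal fence.
FENCE = "`" * 3


def serialize_data(payload: dict[str, Any]) -> str:
    """Render a dict as indented JSON inside a Markdown json code block."""
    # default=str is a crude substitute for jsonable_encoder: values that are
    # not JSON-serializable are rendered through str() instead of failing.
    body = json.dumps(payload, indent=2, default=str)
    return f"{FENCE}json\n{body}\n{FENCE}"
```

The fenced output renders as a formatted code block in a Markdown-aware chat view rather than as one long line.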

src/frontend/tests/extended/features/loop-component.spec.ts (4)

4-4: LGTM: Proper import of utility function.

The import of uploadFile utility function is correctly added and will be used later in the test.


131-131: LGTM: Updated selector reflects backend changes.

The selector change from "handle-urlcomponent-shownode-data-right" to "handle-urlcomponent-shownode-page results-right" correctly reflects the backend URL component output renaming mentioned in the PR objectives.


199-199: LGTM: File upload step enhances test coverage.

The addition of the file upload step using uploadFile(page, "test_file.txt") enhances the test flow and aligns with the backend file handling improvements.


204-204: LGTM: Updated expectation reflects improved behavior.

The change from expecting an error to expecting a success message ("built successfully") correctly reflects the backend improvements that now allow the loop component flow to complete successfully rather than fail.

Comment on lines +197 to +200
+            return "\n".join([safe_convert(item, clean_data=self.clean_data) for item in self.input_value])
         if isinstance(self.input_value, Generator):
             return self.input_value
-        return self._safe_convert(self.input_value)
+        return safe_convert(self.input_value)
Contributor


⚠️ Potential issue

Fix inconsistent parameter usage in safe_convert calls.

There's an inconsistency in how safe_convert is called:

  • Line 197: Passes clean_data=self.clean_data parameter
  • Line 200: Does not pass the clean_data parameter

This inconsistency could lead to different data cleaning behavior for list vs non-list inputs.

Apply this fix to ensure consistent behavior:

-        return safe_convert(self.input_value)
+        return safe_convert(self.input_value, clean_data=self.clean_data)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
             return "\n".join([safe_convert(item, clean_data=self.clean_data) for item in self.input_value])
         if isinstance(self.input_value, Generator):
             return self.input_value
-        return safe_convert(self.input_value)
+        return safe_convert(self.input_value, clean_data=self.clean_data)
🤖 Prompt for AI Agents
In src/backend/base/langflow/components/input_output/chat_output.py lines 197 to
200, the calls to safe_convert are inconsistent regarding the clean_data
parameter; line 197 passes clean_data=self.clean_data while line 200 does not.
To fix this, modify the call on line 200 to also include
clean_data=self.clean_data so that both list and non-list inputs are processed
with consistent data cleaning behavior.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
.github/workflows/typescript_test.yml (1)

223-229: Optional: Extract shard parameters into configurable variables
For better maintainability, consider defining TESTS_PER_SHARD (5) and MAX_SHARD_COUNT (40) as env variables at the top of the workflow. Then reference them in your shard calculation and bounds check:

env:
  TESTS_PER_SHARD: 5
  MAX_SHARD_COUNT: 40
#
          SHARD_COUNT=$(( (TEST_COUNT + TESTS_PER_SHARD - 1) / TESTS_PER_SHARD ))
          if [ $SHARD_COUNT -lt 1 ]; then
            SHARD_COUNT=1
          elif [ $SHARD_COUNT -gt $MAX_SHARD_COUNT ]; then
            SHARD_COUNT=$MAX_SHARD_COUNT
          fi

This makes future adjustments clearer and avoids hard-coded numbers in the shell logic.
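The shard arithmetic in the shell snippet above is a ceiling division followed by a clamp. A short Python sketch of the same calculation, using the values 5 and 40 from the comment (the function name `shard_count` is an assumption), makes the boundary cases easy to check:

```python
TESTS_PER_SHARD = 5
MAX_SHARD_COUNT = 40


def shard_count(
    test_count: int,
    tests_per_shard: int = TESTS_PER_SHARD,
    max_shards: int = MAX_SHARD_COUNT,
) -> int:
    """Return the number of shards: ceil(test_count / tests_per_shard), clamped to [1, max_shards]."""
    # Integer ceiling division, the same trick as (a + b - 1) / b in the shell version.
    count = (test_count + tests_per_shard - 1) // tests_per_shard
    return max(1, min(count, max_shards))
```

For example, 6 tests need 2 shards, 0 tests still yield 1 shard, and anything above 200 tests is capped at 40.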

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3e42c5c and a40c928.

📒 Files selected for processing (4)
  • .github/workflows/typescript_test.yml (1 hunks)
  • src/backend/base/langflow/initial_setup/starter_projects/Custom Component Maker.json (1 hunks)
  • src/backend/base/langflow/initial_setup/starter_projects/Portfolio Website Code Generator.json (5 hunks)
  • src/backend/base/langflow/initial_setup/starter_projects/Research Translation Loop.json (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/backend/base/langflow/initial_setup/starter_projects/Custom Component Maker.json
  • src/backend/base/langflow/initial_setup/starter_projects/Portfolio Website Code Generator.json
⏰ Context from checks skipped due to timeout of 90000ms (49)
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 38/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 37/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 39/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 40/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 36/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 34/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 35/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 32/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 30/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 28/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 31/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 33/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 23/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 29/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 26/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 27/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 19/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 21/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 24/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 25/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 22/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 18/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 17/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 20/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 16/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 12/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 8/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 15/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 14/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 13/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 10/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 11/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 2/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 5/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 9/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 4/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 6/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 7/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 1/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 3/40
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 5
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 3
  • GitHub Check: Lint Backend / Run Mypy (3.13)
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 4
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 2
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 1
  • GitHub Check: Run Backend Tests / Integration Tests - Python 3.10
  • GitHub Check: Optimize new Python code in this PR
  • GitHub Check: Run benchmarks (3.12)
🔇 Additional comments (3)
.github/workflows/typescript_test.yml (1)

223-223: Max shard count increased from 10 to 40
The change correctly updates both the comment and the conditional logic to cap shards at 40, supporting larger test suites without overwhelming the runner.

Also applies to: 227-228

src/backend/base/langflow/initial_setup/starter_projects/Research Translation Loop.json (2)

715-715: Marking the MessageToData component as legacy
Adding "legacy": true to the MessagetoData node correctly flags this component as deprecated in starter projects.


755-755: Legacy component code remains unchanged
The convert_message_to_data code block appears identical to its previous implementation, which is expected since this is now a legacy component. No issues detected.
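A quick way to audit which nodes in a starter project carry the flag is to walk the parsed JSON and collect every object with "legacy": true. The sketch below deliberately avoids assuming the exact layout of the project file by searching recursively; the function name `find_legacy_nodes` and the sample structure in the usage note are hypothetical.

```python
from typing import Any


def find_legacy_nodes(tree: Any) -> list[dict[str, Any]]:
    """Collect every dict in a nested JSON-like structure whose "legacy" value is True."""
    found: list[dict[str, Any]] = []
    stack = [tree]
    while stack:
        current = stack.pop()
        if isinstance(current, dict):
            if current.get("legacy") is True:
                found.append(current)
            # Keep descending: flagged nodes may themselves contain nested objects.
            stack.extend(current.values())
        elif isinstance(current, list):
            stack.extend(current)
    return found
```

Given a parsed file such as `{"data": {"nodes": [{"data": {"node": {"display_name": "Message to Data", "legacy": True}}}]}}`, the function returns the inner node dict, from which a display name can be read if one is present.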

"title_case": false,
"type": "code",
"value": "from collections.abc import Generator\nfrom typing import Any\n\nimport orjson\nfrom fastapi.encoders import jsonable_encoder\n\nfrom langflow.base.io.chat import ChatComponent\nfrom langflow.inputs import BoolInput\nfrom langflow.inputs.inputs import HandleInput\nfrom langflow.io import DropdownInput, MessageTextInput, Output\nfrom langflow.schema.data import Data\nfrom langflow.schema.dataframe import DataFrame\nfrom langflow.schema.message import Message\nfrom langflow.schema.properties import Source\nfrom langflow.utils.constants import (\n MESSAGE_SENDER_AI,\n MESSAGE_SENDER_NAME_AI,\n MESSAGE_SENDER_USER,\n)\n\n\nclass ChatOutput(ChatComponent):\n display_name = \"Chat Output\"\n description = \"Display a chat message in the Playground.\"\n icon = \"MessagesSquare\"\n name = \"ChatOutput\"\n minimized = True\n\n inputs = [\n HandleInput(\n name=\"input_value\",\n display_name=\"Text\",\n info=\"Message to be passed as output.\",\n input_types=[\"Data\", \"DataFrame\", \"Message\"],\n required=True,\n ),\n BoolInput(\n name=\"should_store_message\",\n display_name=\"Store Messages\",\n info=\"Store the message in the history.\",\n value=True,\n advanced=True,\n ),\n DropdownInput(\n name=\"sender\",\n display_name=\"Sender Type\",\n options=[MESSAGE_SENDER_AI, MESSAGE_SENDER_USER],\n value=MESSAGE_SENDER_AI,\n advanced=True,\n info=\"Type of sender.\",\n ),\n MessageTextInput(\n name=\"sender_name\",\n display_name=\"Sender Name\",\n info=\"Name of the sender.\",\n value=MESSAGE_SENDER_NAME_AI,\n advanced=True,\n ),\n MessageTextInput(\n name=\"session_id\",\n display_name=\"Session ID\",\n info=\"The session ID of the chat. If empty, the current session ID parameter will be used.\",\n advanced=True,\n ),\n MessageTextInput(\n name=\"data_template\",\n display_name=\"Data Template\",\n value=\"{text}\",\n advanced=True,\n info=\"Template to convert Data to Text. 
If left empty, it will be dynamically set to the Data's text key.\",\n ),\n MessageTextInput(\n name=\"background_color\",\n display_name=\"Background Color\",\n info=\"The background color of the icon.\",\n advanced=True,\n ),\n MessageTextInput(\n name=\"chat_icon\",\n display_name=\"Icon\",\n info=\"The icon of the message.\",\n advanced=True,\n ),\n MessageTextInput(\n name=\"text_color\",\n display_name=\"Text Color\",\n info=\"The text color of the name\",\n advanced=True,\n ),\n BoolInput(\n name=\"clean_data\",\n display_name=\"Basic Clean Data\",\n value=True,\n info=\"Whether to clean the data\",\n advanced=True,\n ),\n ]\n outputs = [\n Output(\n display_name=\"Message\",\n name=\"message\",\n method=\"message_response\",\n ),\n ]\n\n def _build_source(self, id_: str | None, display_name: str | None, source: str | None) -> Source:\n source_dict = {}\n if id_:\n source_dict[\"id\"] = id_\n if display_name:\n source_dict[\"display_name\"] = display_name\n if source:\n # Handle case where source is a ChatOpenAI object\n if hasattr(source, \"model_name\"):\n source_dict[\"source\"] = source.model_name\n elif hasattr(source, \"model\"):\n source_dict[\"source\"] = str(source.model)\n else:\n source_dict[\"source\"] = str(source)\n return Source(**source_dict)\n\n async def message_response(self) -> Message:\n # First convert the input to string if needed\n text = self.convert_to_string()\n\n # Get source properties\n source, icon, display_name, source_id = self.get_properties_from_source_component()\n background_color = self.background_color\n text_color = self.text_color\n if self.chat_icon:\n icon = self.chat_icon\n\n # Create or use existing Message object\n if isinstance(self.input_value, Message):\n message = self.input_value\n # Update message properties\n message.text = text\n else:\n message = Message(text=text)\n\n # Set message properties\n message.sender = self.sender\n message.sender_name = self.sender_name\n message.session_id = self.session_id\n 
message.flow_id = self.graph.flow_id if hasattr(self, \"graph\") else None\n message.properties.source = self._build_source(source_id, display_name, source)\n message.properties.icon = icon\n message.properties.background_color = background_color\n message.properties.text_color = text_color\n\n # Store message if needed\n if self.session_id and self.should_store_message:\n stored_message = await self.send_message(message)\n self.message.value = stored_message\n message = stored_message\n\n self.status = message\n return message\n\n def _validate_input(self) -> None:\n \"\"\"Validate the input data and raise ValueError if invalid.\"\"\"\n if self.input_value is None:\n msg = \"Input data cannot be None\"\n raise ValueError(msg)\n if isinstance(self.input_value, list) and not all(\n isinstance(item, Message | Data | DataFrame | str) for item in self.input_value\n ):\n invalid_types = [\n type(item).__name__\n for item in self.input_value\n if not isinstance(item, Message | Data | DataFrame | str)\n ]\n msg = f\"Expected Data or DataFrame or Message or str, got {invalid_types}\"\n raise TypeError(msg)\n if not isinstance(\n self.input_value,\n Message | Data | DataFrame | str | list | Generator | type(None),\n ):\n type_name = type(self.input_value).__name__\n msg = f\"Expected Data or DataFrame or Message or str, Generator or None, got {type_name}\"\n raise TypeError(msg)\n\n def _serialize_data(self, data: Data) -> str:\n \"\"\"Serialize Data object to JSON string.\"\"\"\n # Convert data.data to JSON-serializable format\n serializable_data = jsonable_encoder(data.data)\n # Serialize with orjson, enabling pretty printing with indentation\n json_bytes = orjson.dumps(serializable_data, option=orjson.OPT_INDENT_2)\n # Convert bytes to string and wrap in Markdown code blocks\n return \"```json\\n\" + json_bytes.decode(\"utf-8\") + \"\\n```\"\n\n def _safe_convert(self, data: Any) -> str:\n \"\"\"Safely convert input data to string.\"\"\"\n try:\n if isinstance(data, 
str):\n return data\n if isinstance(data, Message):\n return data.get_text()\n if isinstance(data, Data):\n return self._serialize_data(data)\n if isinstance(data, DataFrame):\n if self.clean_data:\n # Remove empty rows\n data = data.dropna(how=\"all\")\n # Remove empty lines in each cell\n data = data.replace(r\"^\\s*$\", \"\", regex=True)\n # Replace multiple newlines with a single newline\n data = data.replace(r\"\\n+\", \"\\n\", regex=True)\n\n # Replace pipe characters to avoid markdown table issues\n processed_data = data.replace(r\"\\|\", r\"\\\\|\", regex=True)\n\n processed_data = processed_data.map(\n lambda x: str(x).replace(\"\\n\", \"<br/>\") if isinstance(x, str) else x\n )\n\n return processed_data.to_markdown(index=False)\n return str(data)\n except (ValueError, TypeError, AttributeError) as e:\n msg = f\"Error converting data: {e!s}\"\n raise ValueError(msg) from e\n\n def convert_to_string(self) -> str | Generator[Any, None, None]:\n \"\"\"Convert input data to string with proper error handling.\"\"\"\n self._validate_input()\n if isinstance(self.input_value, list):\n return \"\\n\".join([self._safe_convert(item) for item in self.input_value])\n if isinstance(self.input_value, Generator):\n return self.input_value\n return self._safe_convert(self.input_value)\n"
"value": "from collections.abc import Generator\nfrom typing import Any\n\nimport orjson\nfrom fastapi.encoders import jsonable_encoder\n\nfrom langflow.base.io.chat import ChatComponent\nfrom langflow.helpers.data import safe_convert\nfrom langflow.inputs import BoolInput\nfrom langflow.inputs.inputs import HandleInput\nfrom langflow.io import DropdownInput, MessageTextInput, Output\nfrom langflow.schema.data import Data\nfrom langflow.schema.dataframe import DataFrame\nfrom langflow.schema.message import Message\nfrom langflow.schema.properties import Source\nfrom langflow.utils.constants import (\n MESSAGE_SENDER_AI,\n MESSAGE_SENDER_NAME_AI,\n MESSAGE_SENDER_USER,\n)\n\n\nclass ChatOutput(ChatComponent):\n display_name = \"Chat Output\"\n description = \"Display a chat message in the Playground.\"\n icon = \"MessagesSquare\"\n name = \"ChatOutput\"\n minimized = True\n\n inputs = [\n HandleInput(\n name=\"input_value\",\n display_name=\"Text\",\n info=\"Message to be passed as output.\",\n input_types=[\"Data\", \"DataFrame\", \"Message\"],\n required=True,\n ),\n BoolInput(\n name=\"should_store_message\",\n display_name=\"Store Messages\",\n info=\"Store the message in the history.\",\n value=True,\n advanced=True,\n ),\n DropdownInput(\n name=\"sender\",\n display_name=\"Sender Type\",\n options=[MESSAGE_SENDER_AI, MESSAGE_SENDER_USER],\n value=MESSAGE_SENDER_AI,\n advanced=True,\n info=\"Type of sender.\",\n ),\n MessageTextInput(\n name=\"sender_name\",\n display_name=\"Sender Name\",\n info=\"Name of the sender.\",\n value=MESSAGE_SENDER_NAME_AI,\n advanced=True,\n ),\n MessageTextInput(\n name=\"session_id\",\n display_name=\"Session ID\",\n info=\"The session ID of the chat. If empty, the current session ID parameter will be used.\",\n advanced=True,\n ),\n MessageTextInput(\n name=\"data_template\",\n display_name=\"Data Template\",\n value=\"{text}\",\n advanced=True,\n info=\"Template to convert Data to Text. 
If left empty, it will be dynamically set to the Data's text key.\",\n ),\n MessageTextInput(\n name=\"background_color\",\n display_name=\"Background Color\",\n info=\"The background color of the icon.\",\n advanced=True,\n ),\n MessageTextInput(\n name=\"chat_icon\",\n display_name=\"Icon\",\n info=\"The icon of the message.\",\n advanced=True,\n ),\n MessageTextInput(\n name=\"text_color\",\n display_name=\"Text Color\",\n info=\"The text color of the name\",\n advanced=True,\n ),\n BoolInput(\n name=\"clean_data\",\n display_name=\"Basic Clean Data\",\n value=True,\n info=\"Whether to clean the data\",\n advanced=True,\n ),\n ]\n outputs = [\n Output(\n display_name=\"Message\",\n name=\"message\",\n method=\"message_response\",\n ),\n ]\n\n def _build_source(self, id_: str | None, display_name: str | None, source: str | None) -> Source:\n source_dict = {}\n if id_:\n source_dict[\"id\"] = id_\n if display_name:\n source_dict[\"display_name\"] = display_name\n if source:\n # Handle case where source is a ChatOpenAI object\n if hasattr(source, \"model_name\"):\n source_dict[\"source\"] = source.model_name\n elif hasattr(source, \"model\"):\n source_dict[\"source\"] = str(source.model)\n else:\n source_dict[\"source\"] = str(source)\n return Source(**source_dict)\n\n async def message_response(self) -> Message:\n # First convert the input to string if needed\n text = self.convert_to_string()\n\n # Get source properties\n source, icon, display_name, source_id = self.get_properties_from_source_component()\n background_color = self.background_color\n text_color = self.text_color\n if self.chat_icon:\n icon = self.chat_icon\n\n # Create or use existing Message object\n if isinstance(self.input_value, Message):\n message = self.input_value\n # Update message properties\n message.text = text\n else:\n message = Message(text=text)\n\n # Set message properties\n message.sender = self.sender\n message.sender_name = self.sender_name\n message.session_id = self.session_id\n 
message.flow_id = self.graph.flow_id if hasattr(self, \"graph\") else None\n message.properties.source = self._build_source(source_id, display_name, source)\n message.properties.icon = icon\n message.properties.background_color = background_color\n message.properties.text_color = text_color\n\n # Store message if needed\n if self.session_id and self.should_store_message:\n stored_message = await self.send_message(message)\n self.message.value = stored_message\n message = stored_message\n\n self.status = message\n return message\n\n def _serialize_data(self, data: Data) -> str:\n \"\"\"Serialize Data object to JSON string.\"\"\"\n # Convert data.data to JSON-serializable format\n serializable_data = jsonable_encoder(data.data)\n # Serialize with orjson, enabling pretty printing with indentation\n json_bytes = orjson.dumps(serializable_data, option=orjson.OPT_INDENT_2)\n # Convert bytes to string and wrap in Markdown code blocks\n return \"```json\\n\" + json_bytes.decode(\"utf-8\") + \"\\n```\"\n\n def _validate_input(self) -> None:\n \"\"\"Validate the input data and raise ValueError if invalid.\"\"\"\n if self.input_value is None:\n msg = \"Input data cannot be None\"\n raise ValueError(msg)\n if isinstance(self.input_value, list) and not all(\n isinstance(item, Message | Data | DataFrame | str) for item in self.input_value\n ):\n invalid_types = [\n type(item).__name__\n for item in self.input_value\n if not isinstance(item, Message | Data | DataFrame | str)\n ]\n msg = f\"Expected Data or DataFrame or Message or str, got {invalid_types}\"\n raise TypeError(msg)\n if not isinstance(\n self.input_value,\n Message | Data | DataFrame | str | list | Generator | type(None),\n ):\n type_name = type(self.input_value).__name__\n msg = f\"Expected Data or DataFrame or Message or str, Generator or None, got {type_name}\"\n raise TypeError(msg)\n\n def convert_to_string(self) -> str | Generator[Any, None, None]:\n \"\"\"Convert input data to string with proper error 
handling.\"\"\"\n self._validate_input()\n if isinstance(self.input_value, list):\n return \"\\n\".join([safe_convert(item, clean_data=self.clean_data) for item in self.input_value])\n if isinstance(self.input_value, Generator):\n return self.input_value\n return safe_convert(self.input_value)\n"
Contributor


⚠️ Potential issue

Ensure clean_data is applied for single values in ChatOutput
In convert_to_string, the list branch passes clean_data=self.clean_data, but the single‐value branch calls safe_convert(self.input_value) without it. This can lead to inconsistent output cleaning. Consider:

-    return safe_convert(self.input_value)
+    return safe_convert(self.input_value, clean_data=self.clean_data)

Suggested change
def convert_to_string(self) -> str | Generator[Any, None, None]:
    """Convert input data to string with proper error handling."""
    self._validate_input()
    if isinstance(self.input_value, list):
        return "\n".join(
            [safe_convert(item, clean_data=self.clean_data) for item in self.input_value]
        )
    if isinstance(self.input_value, Generator):
        return self.input_value
    return safe_convert(self.input_value, clean_data=self.clean_data)
🤖 Prompt for AI Agents
In src/backend/base/langflow/initial_setup/starter_projects/Research Translation
Loop.json at line 929, the convert_to_string method applies the clean_data flag
when input_value is a list but omits it for single values. To fix this, modify
the single-value branch to call safe_convert with clean_data=self.clean_data to
ensure consistent data cleaning across all input types.
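
To make the ChatOutput inconsistency concrete, here is a minimal sketch of the two branch behaviors. It assumes a simplified stand-in for `safe_convert` that only drops blank lines when `clean_data` is set; the real helper in `langflow.helpers.data` may clean differently, and `convert_before` / `convert_after` are hypothetical names used only for this illustration.

```python
from types import SimpleNamespace


def safe_convert(value, *, clean_data=False):
    """Hypothetical stand-in, not langflow's helper: optionally drops blank lines."""
    text = str(value)
    if clean_data:
        text = "\n".join(line for line in text.splitlines() if line.strip())
    return text


def convert_before(component):
    """Mirrors the original branches: only list items receive the flag."""
    value = component.input_value
    if isinstance(value, list):
        return "\n".join(safe_convert(item, clean_data=component.clean_data) for item in value)
    return safe_convert(value)


def convert_after(component):
    """Mirrors the suggested change: both branches receive the flag."""
    value = component.input_value
    if isinstance(value, list):
        return "\n".join(safe_convert(item, clean_data=component.clean_data) for item in value)
    return safe_convert(value, clean_data=component.clean_data)


# The same text, passed once as a single value and once wrapped in a list.
single = SimpleNamespace(input_value="a\n\n\nb", clean_data=True)
wrapped = SimpleNamespace(input_value=["a\n\n\nb"], clean_data=True)

before_single = convert_before(single)   # blank lines survive
before_list = convert_before(wrapped)    # blank lines removed
after_single = convert_after(single)     # now matches the list branch
after_list = convert_after(wrapped)
```

Under these assumptions, the original code returns different text for the same content depending on whether it arrives as a list, while the suggested change makes both paths agree.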

"title_case": false,
"type": "code",
"value": "import json\nfrom typing import Any\n\nfrom langflow.custom import Component\nfrom langflow.io import (\n BoolInput,\n HandleInput,\n MessageTextInput,\n MultilineInput,\n Output,\n TabInput,\n)\nfrom langflow.schema import Data, DataFrame\nfrom langflow.schema.message import Message\n\n\nclass ParserComponent(Component):\n display_name = \"Parser\"\n description = (\n \"Format a DataFrame or Data object into text using a template. \"\n \"Enable 'Stringify' to convert input into a readable string instead.\"\n )\n icon = \"braces\"\n\n inputs = [\n TabInput(\n name=\"mode\",\n display_name=\"Mode\",\n options=[\"Parser\", \"Stringify\"],\n value=\"Parser\",\n info=\"Convert into raw string instead of using a template.\",\n real_time_refresh=True,\n ),\n MultilineInput(\n name=\"pattern\",\n display_name=\"Template\",\n info=(\n \"Use variables within curly brackets to extract column values for DataFrames \"\n \"or key values for Data.\"\n \"For example: `Name: {Name}, Age: {Age}, Country: {Country}`\"\n ),\n value=\"Text: {text}\", # Example default\n dynamic=True,\n show=True,\n required=True,\n ),\n HandleInput(\n name=\"input_data\",\n display_name=\"Data or DataFrame\",\n input_types=[\"DataFrame\", \"Data\"],\n info=\"Accepts either a DataFrame or a Data object.\",\n required=True,\n ),\n MessageTextInput(\n name=\"sep\",\n display_name=\"Separator\",\n advanced=True,\n value=\"\\n\",\n info=\"String used to separate rows/items.\",\n ),\n ]\n\n outputs = [\n Output(\n display_name=\"Parsed Text\",\n name=\"parsed_text\",\n info=\"Formatted text output.\",\n method=\"parse_combined_text\",\n ),\n ]\n\n def update_build_config(self, build_config, field_value, field_name=None):\n \"\"\"Dynamically hide/show `template` and enforce requirement based on `stringify`.\"\"\"\n if field_name == \"mode\":\n build_config[\"pattern\"][\"show\"] = self.mode == \"Parser\"\n build_config[\"pattern\"][\"required\"] = self.mode == \"Parser\"\n if field_value:\n 
clean_data = BoolInput(\n name=\"clean_data\",\n display_name=\"Clean Data\",\n info=(\n \"Enable to clean the data by removing empty rows and lines \"\n \"in each cell of the DataFrame/ Data object.\"\n ),\n value=True,\n advanced=True,\n required=False,\n )\n build_config[\"clean_data\"] = clean_data.to_dict()\n else:\n build_config.pop(\"clean_data\", None)\n\n return build_config\n\n def _clean_args(self):\n \"\"\"Prepare arguments based on input type.\"\"\"\n input_data = self.input_data\n\n match input_data:\n case list() if all(isinstance(item, Data) for item in input_data):\n msg = \"List of Data objects is not supported.\"\n raise ValueError(msg)\n case DataFrame():\n return input_data, None\n case Data():\n return None, input_data\n case dict() if \"data\" in input_data:\n try:\n if \"columns\" in input_data: # Likely a DataFrame\n return DataFrame.from_dict(input_data), None\n # Likely a Data object\n return None, Data(**input_data)\n except (TypeError, ValueError, KeyError) as e:\n msg = f\"Invalid structured input provided: {e!s}\"\n raise ValueError(msg) from e\n case _:\n msg = f\"Unsupported input type: {type(input_data)}. 
Expected DataFrame or Data.\"\n raise ValueError(msg)\n\n def parse_combined_text(self) -> Message:\n \"\"\"Parse all rows/items into a single text or convert input to string if `stringify` is enabled.\"\"\"\n # Early return for stringify option\n if self.mode == \"Stringify\":\n return self.convert_to_string()\n\n df, data = self._clean_args()\n\n lines = []\n if df is not None:\n for _, row in df.iterrows():\n formatted_text = self.pattern.format(**row.to_dict())\n lines.append(formatted_text)\n elif data is not None:\n formatted_text = self.pattern.format(**data.data)\n lines.append(formatted_text)\n\n combined_text = self.sep.join(lines)\n self.status = combined_text\n return Message(text=combined_text)\n\n def _safe_convert(self, data: Any) -> str:\n \"\"\"Safely convert input data to string.\"\"\"\n try:\n if isinstance(data, str):\n return data\n if isinstance(data, Message):\n return data.get_text()\n if isinstance(data, Data):\n return json.dumps(data.data)\n if isinstance(data, DataFrame):\n if hasattr(self, \"clean_data\") and self.clean_data:\n # Remove empty rows\n data = data.dropna(how=\"all\")\n # Remove empty lines in each cell\n data = data.replace(r\"^\\s*$\", \"\", regex=True)\n # Replace multiple newlines with a single newline\n data = data.replace(r\"\\n+\", \"\\n\", regex=True)\n return data.to_markdown(index=False)\n return str(data)\n except (ValueError, TypeError, AttributeError) as e:\n msg = f\"Error converting data: {e!s}\"\n raise ValueError(msg) from e\n\n def convert_to_string(self) -> Message:\n \"\"\"Convert input data to string with proper error handling.\"\"\"\n result = \"\"\n if isinstance(self.input_data, list):\n result = \"\\n\".join([self._safe_convert(item) for item in self.input_data])\n else:\n result = self._safe_convert(self.input_data)\n self.log(f\"Converted to string with length: {len(result)}\")\n\n message = Message(text=result)\n self.status = message\n return message\n"
"value": "from langflow.custom import Component\nfrom langflow.helpers.data import safe_convert\nfrom langflow.io import (\n BoolInput,\n HandleInput,\n MessageTextInput,\n MultilineInput,\n Output,\n TabInput,\n)\nfrom langflow.schema import Data, DataFrame\nfrom langflow.schema.message import Message\n\n\nclass ParserComponent(Component):\n display_name = \"Parser\"\n description = (\n \"Format a DataFrame or Data object into text using a template. \"\n \"Enable 'Stringify' to convert input into a readable string instead.\"\n )\n icon = \"braces\"\n\n inputs = [\n TabInput(\n name=\"mode\",\n display_name=\"Mode\",\n options=[\"Parser\", \"Stringify\"],\n value=\"Parser\",\n info=\"Convert into raw string instead of using a template.\",\n real_time_refresh=True,\n ),\n MultilineInput(\n name=\"pattern\",\n display_name=\"Template\",\n info=(\n \"Use variables within curly brackets to extract column values for DataFrames \"\n \"or key values for Data.\"\n \"For example: `Name: {Name}, Age: {Age}, Country: {Country}`\"\n ),\n value=\"Text: {text}\", # Example default\n dynamic=True,\n show=True,\n required=True,\n ),\n HandleInput(\n name=\"input_data\",\n display_name=\"Data or DataFrame\",\n input_types=[\"DataFrame\", \"Data\"],\n info=\"Accepts either a DataFrame or a Data object.\",\n required=True,\n ),\n MessageTextInput(\n name=\"sep\",\n display_name=\"Separator\",\n advanced=True,\n value=\"\\n\",\n info=\"String used to separate rows/items.\",\n ),\n ]\n\n outputs = [\n Output(\n display_name=\"Parsed Text\",\n name=\"parsed_text\",\n info=\"Formatted text output.\",\n method=\"parse_combined_text\",\n ),\n ]\n\n def update_build_config(self, build_config, field_value, field_name=None):\n \"\"\"Dynamically hide/show `template` and enforce requirement based on `stringify`.\"\"\"\n if field_name == \"mode\":\n build_config[\"pattern\"][\"show\"] = self.mode == \"Parser\"\n build_config[\"pattern\"][\"required\"] = self.mode == \"Parser\"\n if 
field_value:\n clean_data = BoolInput(\n name=\"clean_data\",\n display_name=\"Clean Data\",\n info=(\n \"Enable to clean the data by removing empty rows and lines \"\n \"in each cell of the DataFrame/ Data object.\"\n ),\n value=True,\n advanced=True,\n required=False,\n )\n build_config[\"clean_data\"] = clean_data.to_dict()\n else:\n build_config.pop(\"clean_data\", None)\n\n return build_config\n\n def _clean_args(self):\n \"\"\"Prepare arguments based on input type.\"\"\"\n input_data = self.input_data\n\n match input_data:\n case list() if all(isinstance(item, Data) for item in input_data):\n msg = \"List of Data objects is not supported.\"\n raise ValueError(msg)\n case DataFrame():\n return input_data, None\n case Data():\n return None, input_data\n case dict() if \"data\" in input_data:\n try:\n if \"columns\" in input_data: # Likely a DataFrame\n return DataFrame.from_dict(input_data), None\n # Likely a Data object\n return None, Data(**input_data)\n except (TypeError, ValueError, KeyError) as e:\n msg = f\"Invalid structured input provided: {e!s}\"\n raise ValueError(msg) from e\n case _:\n msg = f\"Unsupported input type: {type(input_data)}. 
Expected DataFrame or Data.\"\n raise ValueError(msg)\n\n def parse_combined_text(self) -> Message:\n \"\"\"Parse all rows/items into a single text or convert input to string if `stringify` is enabled.\"\"\"\n # Early return for stringify option\n if self.mode == \"Stringify\":\n return self.convert_to_string()\n\n df, data = self._clean_args()\n\n lines = []\n if df is not None:\n for _, row in df.iterrows():\n formatted_text = self.pattern.format(**row.to_dict())\n lines.append(formatted_text)\n elif data is not None:\n formatted_text = self.pattern.format(**data.data)\n lines.append(formatted_text)\n\n combined_text = self.sep.join(lines)\n self.status = combined_text\n return Message(text=combined_text)\n\n def convert_to_string(self) -> Message:\n \"\"\"Convert input data to string with proper error handling.\"\"\"\n result = \"\"\n if isinstance(self.input_data, list):\n result = \"\\n\".join([safe_convert(item, clean_data=self.clean_data or False) for item in self.input_data])\n else:\n result = safe_convert(self.input_data or False)\n self.log(f\"Converted to string with length: {len(result)}\")\n\n message = Message(text=result)\n self.status = message\n return message\n"

⚠️ Potential issue

Consistent clean_data handling in ParserComponent’s converter
The non‐list branch of convert_to_string omits the clean_data flag, while the list branch uses self.clean_data. For uniform behavior, update:

-        else:
-            result = safe_convert(self.input_data or False)
+        else:
+            result = safe_convert(self.input_data or False, clean_data=self.clean_data or False)

Suggested change
def convert_to_string(self) -> Message:
    """Convert input data to string with proper error handling."""
    result = ""
    if isinstance(self.input_data, list):
        result = "\n".join([safe_convert(item, clean_data=self.clean_data or False)
                            for item in self.input_data])
    else:
        result = safe_convert(self.input_data or False,
                              clean_data=self.clean_data or False)
    self.log(f"Converted to string with length: {len(result)}")
    message = Message(text=result)
    self.status = message
    return message
🤖 Prompt for AI Agents
In src/backend/base/langflow/initial_setup/starter_projects/Research Translation
Loop.json at line 1536, the convert_to_string method inconsistently handles the
clean_data flag: the list branch uses self.clean_data but the non-list branch
does not. To fix this, modify the non-list branch to also pass
clean_data=self.clean_data or False to the safe_convert call, ensuring
consistent behavior for both branches.
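
The `or False` fallback in the Parser suggestion can be sketched as follows. This assumes `clean_data` may be unset (`None`) when the field has not been added to the build config, and it uses a hypothetical strict stand-in for `safe_convert` that insists on a real boolean; neither assumption is confirmed by the diff.

```python
from types import SimpleNamespace


def safe_convert(value, *, clean_data=False):
    """Hypothetical stand-in, not langflow's helper: requires a real bool flag."""
    if not isinstance(clean_data, bool):
        raise TypeError("clean_data must be a bool")
    text = str(value)
    if clean_data:
        text = "\n".join(line for line in text.splitlines() if line.strip())
    return text


def convert_to_string(component):
    """Normalize the flag once, then pass it on both branches."""
    flag = component.clean_data or False  # an unset (None) flag becomes False
    data = component.input_data
    if isinstance(data, list):
        return "\n".join(safe_convert(item, clean_data=flag) for item in data)
    return safe_convert(data, clean_data=flag)


unset = SimpleNamespace(input_data="x\n\ny", clean_data=None)
enabled = SimpleNamespace(input_data="x\n\ny", clean_data=True)
enabled_list = SimpleNamespace(input_data=["x\n\ny"], clean_data=True)

unset_result = convert_to_string(unset)            # flag falls back to False
enabled_result = convert_to_string(enabled)        # single value is cleaned
enabled_list_result = convert_to_string(enabled_list)
```

Computing the fallback once and reusing it keeps the list and non-list branches from drifting apart again.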

Cristhianzl and others added 9 commits May 30, 2025 16:15
✨ (stop-building.spec.ts): update test to use correct testid for element
✨ (loop-component.spec.ts): update test to use correct testid for element
✨ (chatInputOutputUser-shard-1.spec.ts): update tests to use correct testid for element
…tInputOutputUser-shard-1.spec.ts): update test selectors to match changes in the frontend UI, improving test reliability and maintainability.
…king element

✨ (loop-component.spec.ts): update test to use correct testId for clicking element
✨ (chatInputOutputUser-shard-1.spec.ts): update multiple tests to use correct testId for clicking element
…sure a maximum of 10 shards for test execution

🔧 (chatInputOutputUser-shard-1.spec.ts): update test selectors to match changes in the frontend output structure for integration tests
…tter clarity and consistency in the integration tests.
@erichare erichare added this pull request to the merge queue May 30, 2025
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 30, 2025
@erichare erichare added this pull request to the merge queue May 30, 2025
Merged via the queue into main with commit fd73cdc May 30, 2025
38 checks passed
@erichare erichare deleted the native-components-clean branch May 30, 2025 22:26
ogabrielluiz pushed a commit to bkatya2001/langflow that referenced this pull request Jun 24, 2025
* url component update.

* update to url component and tests

* Make directory component legacy

* Only output dataframe from file component

* Update base_file.py

* Update description and output

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes (attempt 2/3)

* Deprecate Processing Components.

* Move Tool and CQL Astra to bundle

* Comprehensive improvements to Save to File

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes (attempt 2/3)

* Clean up description, dont unlink file

* Remove print statement

* fix: Clean up the text output of the URL component (langflow-ai#8158)

* Clean text output from url component

* [autofix.ci] apply automated fixes

* Update data.py

* Make a visible function

* URL component cleaning refactor

* Update data.py

* [autofix.ci] apply automated fixes

* Update with chat output fixes and template updates

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes

* Fix linting issues

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>

* revert datastax component bundle

* Restore the two tools as well

* Two more template updates

* Update Vector Store RAG.json

* Update Vector Store RAG.json

* Update __init__.py

* Update directory.py

* Update url.py

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes (attempt 2/3)

* Update test_basic_prompting.py

* Unit test updates

* Fix unit tests one more time

* Fix conversion in safe convert

* Update chat.py

* Temporary disabling of save to file tests

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes (attempt 2/3)

* Fix some more unit tests

* Update test_split_text_component.py

* [autofix.ci] apply automated fixes

* Update test_url_component.py

* Update file component outputs in tests

* Fix starter projects with old data to message

* Update test_split_text_component.py

* fix slider inputs

* Update data.py

* [autofix.ci] apply automated fixes

* Update data.py

* 🐛 (typescript_test.yml): increase the maximum shard count to 40 to improve test distribution and performance

* Rename safe file component

* [autofix.ci] apply automated fixes

* Make sure we import the right save to file

* 🔧 (freeze.spec.ts): update test description to match the changed element's test ID
🔧 (Blog Writer.spec.ts): add click event to test file input element
🔧 (edit-tools.spec.ts): update assertion to check if rowsCount is greater than 2 instead of 3
🔧 (loop-component.spec.ts): add import statement for uploadFile function
🔧 (tool-mode.spec.ts): update targetPosition coordinates for dragTo action
🔧 (chatInputOutputUser-shard-1.spec.ts): update test description to match the changed element's test ID

* ✨ (stop-building.spec.ts): update click target for better test coverage and accuracy
✨ (fileUploadComponent.spec.ts): adjust drag target position and update click targets for improved testing flow and coverage

* 🐛 (typescript_test.yml): adjust the maximum shard count to 10 to prevent excessive parallelization and improve test performance

* Two url component types

* Update ruff formatting

* [autofix.ci] apply automated fixes

* Revert name of method

* 🐛 (typescript_test.yml): increase the maximum shard count to 40 to improve test distribution and performance

* ✨ (freeze.spec.ts): update test to use correct testid for element
✨ (stop-building.spec.ts): update test to use correct testid for element
✨ (loop-component.spec.ts): update test to use correct testid for element
✨ (chatInputOutputUser-shard-1.spec.ts): update tests to use correct testid for element

* ✨ (freeze.spec.ts, stop-building.spec.ts, loop-component.spec.ts, chatInputOutputUser-shard-1.spec.ts): update test selectors to match changes in the frontend UI, improving test reliability and maintainability.

* ✨ (stop-building.spec.ts): update test to use correct testId for clicking element
✨ (loop-component.spec.ts): update test to use correct testId for clicking element
✨ (chatInputOutputUser-shard-1.spec.ts): update multiple tests to use correct testId for clicking element

* 📝 (freeze.spec.ts): update test selector to match the correct element on the page for better test accuracy

* 🔧 (typescript_test.yml): adjust optimal shard count calculation to ensure a maximum of 10 shards for test execution
🔧 (chatInputOutputUser-shard-1.spec.ts): update test selectors to match changes in the frontend output structure for integration tests

* ✨ (chatInputOutputUser-shard-1.spec.ts): update test selectors for better clarity and consistency in the integration tests.

---------

Co-authored-by: Eric Hare <ericrhare@gmail.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: cristhianzl <cristhian.lousa@gmail.com>

Labels

enhancement New feature or request lgtm This PR has been approved by a maintainer size:XXL This PR changes 1000+ lines, ignoring generated files.
