refactor: update build_structured_output to return only last output element#8467
Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThe changes update the Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant StructuredOutputComponent
participant build_structured_output_base
participant Data
User->>StructuredOutputComponent: build_structured_output()
StructuredOutputComponent->>build_structured_output_base: call()
build_structured_output_base-->>StructuredOutputComponent: output (list)
StructuredOutputComponent->>Data: Data(data=output[-1])
Data-->>StructuredOutputComponent: Data object
StructuredOutputComponent-->>User: Data object
✨ Finishing Touches🧪 Generate Unit Tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Pull Request Overview
This PR updates the StructuredOutputComponent return value to unwrap the last element of the model output directly, and aligns all starter project templates with this new behavior.
- Changed
build_structured_outputto returnData(data=output[-1])instead of wrapping with a"results"key. - Propagated the return signature change across all JSON starter project code templates.
- Updated the actual Python component (
structured_output.py) to match the new return format.
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/backend/base/langflow/initial_setup/starter_projects/Portfolio Website Code Generator.json | Updated build_structured_output return to Data(data=output[-1]) |
| src/backend/base/langflow/initial_setup/starter_projects/Market Research.json | Same return signature adjustment in JSON template |
| src/backend/base/langflow/initial_setup/starter_projects/Image Sentiment Analysis.json | Same return signature adjustment in JSON template |
| src/backend/base/langflow/initial_setup/starter_projects/Hybrid Search RAG.json | Same return signature adjustment in JSON template |
| src/backend/base/langflow/initial_setup/starter_projects/Financial Report Parser.json | Same return signature adjustment in JSON template |
| src/backend/base/langflow/components/processing/structured_output.py | Changed build_structured_output to return Data(data=output[-1]) |
Comments suppressed due to low confidence (1)
src/backend/base/langflow/components/processing/structured_output.py:165
- Add unit tests for
build_structured_outputto cover scenarios with single, multiple, and zero elements in the output list to prevent regressions or IndexError.
def build_structured_output(self) -> Data:
| @@ -165,4 +165,4 @@ def build_structured_output_base(self) -> Data: | |||
| def build_structured_output(self) -> Data: | |||
| output = self.build_structured_output_base() | |||
|
|
|||
There was a problem hiding this comment.
Accessing output[-1] without checking if the list is empty can raise an IndexError. Consider handling the empty case explicitly (e.g., return an empty Data object or raise a clearer exception).
| if not output: | |
| return Data(data=[]) |
| "title_case": false, | ||
| "type": "code", | ||
| "value": "from pydantic import BaseModel, Field, create_model\nfrom trustcall import create_extractor\n\nfrom langflow.base.models.chat_result import get_chat_result\nfrom langflow.custom.custom_component.component import Component\nfrom langflow.helpers.base_model import build_model_from_schema\nfrom langflow.io import (\n HandleInput,\n MessageTextInput,\n MultilineInput,\n Output,\n TableInput,\n)\nfrom langflow.schema.data import Data\nfrom langflow.schema.table import EditMode\n\n\nclass StructuredOutputComponent(Component):\n display_name = \"Structured Output\"\n description = \"Uses an LLM to generate structured data. Ideal for extraction and consistency.\"\n name = \"StructuredOutput\"\n icon = \"braces\"\n\n inputs = [\n HandleInput(\n name=\"llm\",\n display_name=\"Language Model\",\n info=\"The language model to use to generate the structured output.\",\n input_types=[\"LanguageModel\"],\n required=True,\n ),\n MessageTextInput(\n name=\"input_value\",\n display_name=\"Input Message\",\n info=\"The input message to the language model.\",\n tool_mode=True,\n required=True,\n ),\n MultilineInput(\n name=\"system_prompt\",\n display_name=\"Format Instructions\",\n info=\"The instructions to the language model for formatting the output.\",\n value=(\n \"You are an AI system designed to extract structured information from unstructured text.\"\n \"Given the input_text, return a JSON object with predefined keys based on the expected structure.\"\n \"Extract values accurately and format them according to the specified type \"\n \"(e.g., string, integer, float, date).\"\n \"If a value is missing or cannot be determined, return a default \"\n \"(e.g., null, 0, or 'N/A').\"\n \"If multiple instances of the expected structure exist within the input_text, \"\n \"stream each as a separate JSON object.\"\n ),\n required=True,\n advanced=True,\n ),\n MessageTextInput(\n name=\"schema_name\",\n display_name=\"Schema Name\",\n info=\"Provide a name for the output data schema.\",\n advanced=True,\n ),\n TableInput(\n name=\"output_schema\",\n display_name=\"Output Schema\",\n info=\"Define the structure and data types for the model's output.\",\n required=True,\n # TODO: remove deault value\n table_schema=[\n {\n \"name\": \"name\",\n \"display_name\": \"Name\",\n \"type\": \"str\",\n \"description\": \"Specify the name of the output field.\",\n \"default\": \"field\",\n \"edit_mode\": EditMode.INLINE,\n },\n {\n \"name\": \"description\",\n \"display_name\": \"Description\",\n \"type\": \"str\",\n \"description\": \"Describe the purpose of the output field.\",\n \"default\": \"description of field\",\n \"edit_mode\": EditMode.POPOVER,\n },\n {\n \"name\": \"type\",\n \"display_name\": \"Type\",\n \"type\": \"str\",\n \"edit_mode\": EditMode.INLINE,\n \"description\": (\"Indicate the data type of the output field (e.g., str, int, float, bool, dict).\"),\n \"options\": [\"str\", \"int\", \"float\", \"bool\", \"dict\"],\n \"default\": \"str\",\n },\n {\n \"name\": \"multiple\",\n \"display_name\": \"As List\",\n \"type\": \"boolean\",\n \"description\": \"Set to True if this output field should be a list of the specified type.\",\n \"default\": \"False\",\n \"edit_mode\": EditMode.INLINE,\n },\n ],\n value=[\n {\n \"name\": \"field\",\n \"description\": \"description of field\",\n \"type\": \"str\",\n \"multiple\": \"False\",\n }\n ],\n ),\n ]\n\n outputs = [\n Output(\n name=\"structured_output\",\n display_name=\"Structured Output\",\n method=\"build_structured_output\",\n ),\n ]\n\n def build_structured_output_base(self) -> Data:\n schema_name = self.schema_name or \"OutputModel\"\n\n if not hasattr(self.llm, \"with_structured_output\"):\n msg = \"Language model does not support structured output.\"\n raise TypeError(msg)\n if not self.output_schema:\n msg = \"Output schema cannot be empty\"\n raise ValueError(msg)\n\n output_model_ = build_model_from_schema(self.output_schema)\n\n output_model = create_model(\n schema_name,\n __doc__=f\"A list of {schema_name}.\",\n objects=(list[output_model_], Field(description=f\"A list of {schema_name}.\")), # type: ignore[valid-type]\n )\n\n try:\n llm_with_structured_output = create_extractor(self.llm, tools=[output_model])\n except NotImplementedError as exc:\n msg = f\"{self.llm.__class__.__name__} does not support structured output.\"\n raise TypeError(msg) from exc\n config_dict = {\n \"run_name\": self.display_name,\n \"project_name\": self.get_project_name(),\n \"callbacks\": self.get_langchain_callbacks(),\n }\n result = get_chat_result(\n runnable=llm_with_structured_output,\n system_message=self.system_prompt,\n input_value=self.input_value,\n config=config_dict,\n )\n if isinstance(result, BaseModel):\n result = result.model_dump()\n if responses := result.get(\"responses\"):\n result = responses[0].model_dump()\n if result and \"objects\" in result:\n return result[\"objects\"]\n\n return result\n\n def build_structured_output(self) -> Data:\n output = self.build_structured_output_base()\n\n return Data(text_key=\"results\", data={\"results\": output})\n" | ||
| "value": "from pydantic import BaseModel, Field, create_model\nfrom trustcall import create_extractor\n\nfrom langflow.base.models.chat_result import get_chat_result\nfrom langflow.custom.custom_component.component import Component\nfrom langflow.helpers.base_model import build_model_from_schema\nfrom langflow.io import (\n HandleInput,\n MessageTextInput,\n MultilineInput,\n Output,\n TableInput,\n)\nfrom langflow.schema.data import Data\nfrom langflow.schema.table import EditMode\n\n\nclass StructuredOutputComponent(Component):\n display_name = \"Structured Output\"\n description = \"Uses an LLM to generate structured data. Ideal for extraction and consistency.\"\n name = \"StructuredOutput\"\n icon = \"braces\"\n\n inputs = [\n HandleInput(\n name=\"llm\",\n display_name=\"Language Model\",\n info=\"The language model to use to generate the structured output.\",\n input_types=[\"LanguageModel\"],\n required=True,\n ),\n MessageTextInput(\n name=\"input_value\",\n display_name=\"Input Message\",\n info=\"The input message to the language model.\",\n tool_mode=True,\n required=True,\n ),\n MultilineInput(\n name=\"system_prompt\",\n display_name=\"Format Instructions\",\n info=\"The instructions to the language model for formatting the output.\",\n value=(\n \"You are an AI system designed to extract structured information from unstructured text.\"\n \"Given the input_text, return a JSON object with predefined keys based on the expected structure.\"\n \"Extract values accurately and format them according to the specified type \"\n \"(e.g., string, integer, float, date).\"\n \"If a value is missing or cannot be determined, return a default \"\n \"(e.g., null, 0, or 'N/A').\"\n \"If multiple instances of the expected structure exist within the input_text, \"\n \"stream each as a separate JSON object.\"\n ),\n required=True,\n advanced=True,\n ),\n MessageTextInput(\n name=\"schema_name\",\n display_name=\"Schema Name\",\n info=\"Provide a name for the output data schema.\",\n advanced=True,\n ),\n TableInput(\n name=\"output_schema\",\n display_name=\"Output Schema\",\n info=\"Define the structure and data types for the model's output.\",\n required=True,\n # TODO: remove deault value\n table_schema=[\n {\n \"name\": \"name\",\n \"display_name\": \"Name\",\n \"type\": \"str\",\n \"description\": \"Specify the name of the output field.\",\n \"default\": \"field\",\n \"edit_mode\": EditMode.INLINE,\n },\n {\n \"name\": \"description\",\n \"display_name\": \"Description\",\n \"type\": \"str\",\n \"description\": \"Describe the purpose of the output field.\",\n \"default\": \"description of field\",\n \"edit_mode\": EditMode.POPOVER,\n },\n {\n \"name\": \"type\",\n \"display_name\": \"Type\",\n \"type\": \"str\",\n \"edit_mode\": EditMode.INLINE,\n \"description\": (\"Indicate the data type of the output field (e.g., str, int, float, bool, dict).\"),\n \"options\": [\"str\", \"int\", \"float\", \"bool\", \"dict\"],\n \"default\": \"str\",\n },\n {\n \"name\": \"multiple\",\n \"display_name\": \"As List\",\n \"type\": \"boolean\",\n \"description\": \"Set to True if this output field should be a list of the specified type.\",\n \"default\": \"False\",\n \"edit_mode\": EditMode.INLINE,\n },\n ],\n value=[\n {\n \"name\": \"field\",\n \"description\": \"description of field\",\n \"type\": \"str\",\n \"multiple\": \"False\",\n }\n ],\n ),\n ]\n\n outputs = [\n Output(\n name=\"structured_output\",\n display_name=\"Structured Output\",\n method=\"build_structured_output\",\n ),\n ]\n\n def build_structured_output_base(self) -> Data:\n schema_name = self.schema_name or \"OutputModel\"\n\n if not hasattr(self.llm, \"with_structured_output\"):\n msg = \"Language model does not support structured output.\"\n raise TypeError(msg)\n if not self.output_schema:\n msg = \"Output schema cannot be empty\"\n raise ValueError(msg)\n\n output_model_ = build_model_from_schema(self.output_schema)\n\n output_model = create_model(\n schema_name,\n __doc__=f\"A list of {schema_name}.\",\n objects=(list[output_model_], Field(description=f\"A list of {schema_name}.\")), # type: ignore[valid-type]\n )\n\n try:\n llm_with_structured_output = create_extractor(self.llm, tools=[output_model])\n except NotImplementedError as exc:\n msg = f\"{self.llm.__class__.__name__} does not support structured output.\"\n raise TypeError(msg) from exc\n config_dict = {\n \"run_name\": self.display_name,\n \"project_name\": self.get_project_name(),\n \"callbacks\": self.get_langchain_callbacks(),\n }\n result = get_chat_result(\n runnable=llm_with_structured_output,\n system_message=self.system_prompt,\n input_value=self.input_value,\n config=config_dict,\n )\n if isinstance(result, BaseModel):\n result = result.model_dump()\n if responses := result.get(\"responses\"):\n result = responses[0].model_dump()\n if result and \"objects\" in result:\n return result[\"objects\"]\n\n return result\n\n def build_structured_output(self) -> Data:\n output = self.build_structured_output_base()\n\n return Data(data=output[-1])\n" |
There was a problem hiding this comment.
Typo in comment: "deault" should be spelled "default".
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
src/backend/base/langflow/components/processing/structured_output.py (1)
165-169:⚠️ Potential issueGuard against empty
outputlist before subscripting
output[-1]will raise anIndexErrorwhen the extractor returns an empty list (e.g., LLM fails or is instructed to return nothing).
This was called out in a previous review and is still unaddressed.- return Data(data=output[-1]) + if not output: # graceful fallback + return Data(data=[]) + return Data(data=output[-1])Consider also whether callers might need the whole list; discarding everything but the last element changes the public contract.
src/backend/base/langflow/initial_setup/starter_projects/Image Sentiment Analysis.json (1)
1405-1411:⚠️ Potential issueSame unhandled empty-list risk inside starter-project component
The embedded code of
StructuredOutputComponentstill accessesoutput[-1]without checking for emptiness, propagating the same potentialIndexErrorto every flow generated from this template. Apply the same guard here.- return Data(data=output[-1]) + if not output: + return Data(data=[]) + return Data(data=output[-1])
🧹 Nitpick comments (1)
src/backend/base/langflow/initial_setup/starter_projects/Hybrid Search RAG.json (1)
611-618: Guard against empty or non-list outputs.The new
build_structured_outputimplementation unconditionally doesoutput[-1], which will raise anIndexErrorifoutputis empty or aTypeErrorif it’s not a sequence. Consider adding a pre-check:def build_structured_output(self) -> Data: output = self.build_structured_output_base() if not isinstance(output, (list, tuple)) or not output: raise ValueError("Structured output is empty or invalid.") return Data(data=output[-1])This ensures robustness and clearer error messages when no structured items are produced.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
src/backend/base/langflow/components/processing/structured_output.py(1 hunks)src/backend/base/langflow/initial_setup/starter_projects/Financial Report Parser.json(2 hunks)src/backend/base/langflow/initial_setup/starter_projects/Hybrid Search RAG.json(1 hunks)src/backend/base/langflow/initial_setup/starter_projects/Image Sentiment Analysis.json(1 hunks)src/backend/base/langflow/initial_setup/starter_projects/Market Research.json(1 hunks)src/backend/base/langflow/initial_setup/starter_projects/Portfolio Website Code Generator.json(1 hunks)
🔇 Additional comments (2)
src/backend/base/langflow/initial_setup/starter_projects/Financial Report Parser.json (2)
1259-1260: Mirror core output change in template
Thebuild_structured_outputmethod now returns only the last element (Data(data=output[-1])) instead of wrapping the full list under"results", aligning this starter configuration with the core component behavior.
1654-1654: Simplify project name
Removed the legacy “(1)” suffix from thenamefield to maintain consistent naming across all starter projects.
| "title_case": false, | ||
| "type": "code", | ||
| "value": "from pydantic import BaseModel, Field, create_model\nfrom trustcall import create_extractor\n\nfrom langflow.base.models.chat_result import get_chat_result\nfrom langflow.custom.custom_component.component import Component\nfrom langflow.helpers.base_model import build_model_from_schema\nfrom langflow.io import (\n HandleInput,\n MessageTextInput,\n MultilineInput,\n Output,\n TableInput,\n)\nfrom langflow.schema.data import Data\nfrom langflow.schema.table import EditMode\n\n\nclass StructuredOutputComponent(Component):\n display_name = \"Structured Output\"\n description = \"Uses an LLM to generate structured data. Ideal for extraction and consistency.\"\n name = \"StructuredOutput\"\n icon = \"braces\"\n\n inputs = [\n HandleInput(\n name=\"llm\",\n display_name=\"Language Model\",\n info=\"The language model to use to generate the structured output.\",\n input_types=[\"LanguageModel\"],\n required=True,\n ),\n MessageTextInput(\n name=\"input_value\",\n display_name=\"Input Message\",\n info=\"The input message to the language model.\",\n tool_mode=True,\n required=True,\n ),\n MultilineInput(\n name=\"system_prompt\",\n display_name=\"Format Instructions\",\n info=\"The instructions to the language model for formatting the output.\",\n value=(\n \"You are an AI system designed to extract structured information from unstructured text.\"\n \"Given the input_text, return a JSON object with predefined keys based on the expected structure.\"\n \"Extract values accurately and format them according to the specified type \"\n \"(e.g., string, integer, float, date).\"\n \"If a value is missing or cannot be determined, return a default \"\n \"(e.g., null, 0, or 'N/A').\"\n \"If multiple instances of the expected structure exist within the input_text, \"\n \"stream each as a separate JSON object.\"\n ),\n required=True,\n advanced=True,\n ),\n MessageTextInput(\n name=\"schema_name\",\n display_name=\"Schema Name\",\n info=\"Provide a name for the output data schema.\",\n advanced=True,\n ),\n TableInput(\n name=\"output_schema\",\n display_name=\"Output Schema\",\n info=\"Define the structure and data types for the model's output.\",\n required=True,\n # TODO: remove deault value\n table_schema=[\n {\n \"name\": \"name\",\n \"display_name\": \"Name\",\n \"type\": \"str\",\n \"description\": \"Specify the name of the output field.\",\n \"default\": \"field\",\n \"edit_mode\": EditMode.INLINE,\n },\n {\n \"name\": \"description\",\n \"display_name\": \"Description\",\n \"type\": \"str\",\n \"description\": \"Describe the purpose of the output field.\",\n \"default\": \"description of field\",\n \"edit_mode\": EditMode.POPOVER,\n },\n {\n \"name\": \"type\",\n \"display_name\": \"Type\",\n \"type\": \"str\",\n \"edit_mode\": EditMode.INLINE,\n \"description\": (\"Indicate the data type of the output field (e.g., str, int, float, bool, dict).\"),\n \"options\": [\"str\", \"int\", \"float\", \"bool\", \"dict\"],\n \"default\": \"str\",\n },\n {\n \"name\": \"multiple\",\n \"display_name\": \"As List\",\n \"type\": \"boolean\",\n \"description\": \"Set to True if this output field should be a list of the specified type.\",\n \"default\": \"False\",\n \"edit_mode\": EditMode.INLINE,\n },\n ],\n value=[\n {\n \"name\": \"field\",\n \"description\": \"description of field\",\n \"type\": \"str\",\n \"multiple\": \"False\",\n }\n ],\n ),\n ]\n\n outputs = [\n Output(\n name=\"structured_output\",\n display_name=\"Structured Output\",\n method=\"build_structured_output\",\n ),\n ]\n\n def build_structured_output_base(self) -> Data:\n schema_name = self.schema_name or \"OutputModel\"\n\n if not hasattr(self.llm, \"with_structured_output\"):\n msg = \"Language model does not support structured output.\"\n raise TypeError(msg)\n if not self.output_schema:\n msg = \"Output schema cannot be empty\"\n raise ValueError(msg)\n\n output_model_ = build_model_from_schema(self.output_schema)\n\n output_model = create_model(\n schema_name,\n __doc__=f\"A list of {schema_name}.\",\n objects=(list[output_model_], Field(description=f\"A list of {schema_name}.\")), # type: ignore[valid-type]\n )\n\n try:\n llm_with_structured_output = create_extractor(self.llm, tools=[output_model])\n except NotImplementedError as exc:\n msg = f\"{self.llm.__class__.__name__} does not support structured output.\"\n raise TypeError(msg) from exc\n config_dict = {\n \"run_name\": self.display_name,\n \"project_name\": self.get_project_name(),\n \"callbacks\": self.get_langchain_callbacks(),\n }\n result = get_chat_result(\n runnable=llm_with_structured_output,\n system_message=self.system_prompt,\n input_value=self.input_value,\n config=config_dict,\n )\n if isinstance(result, BaseModel):\n result = result.model_dump()\n if responses := result.get(\"responses\"):\n result = responses[0].model_dump()\n if result and \"objects\" in result:\n return result[\"objects\"]\n\n return result\n\n def build_structured_output(self) -> Data:\n output = self.build_structured_output_base()\n\n return Data(text_key=\"results\", data={\"results\": output})\n" | ||
| "value": "from pydantic import BaseModel, Field, create_model\nfrom trustcall import create_extractor\n\nfrom langflow.base.models.chat_result import get_chat_result\nfrom langflow.custom.custom_component.component import Component\nfrom langflow.helpers.base_model import build_model_from_schema\nfrom langflow.io import (\n HandleInput,\n MessageTextInput,\n MultilineInput,\n Output,\n TableInput,\n)\nfrom langflow.schema.data import Data\nfrom langflow.schema.table import EditMode\n\n\nclass StructuredOutputComponent(Component):\n display_name = \"Structured Output\"\n description = \"Uses an LLM to generate structured data. Ideal for extraction and consistency.\"\n name = \"StructuredOutput\"\n icon = \"braces\"\n\n inputs = [\n HandleInput(\n name=\"llm\",\n display_name=\"Language Model\",\n info=\"The language model to use to generate the structured output.\",\n input_types=[\"LanguageModel\"],\n required=True,\n ),\n MessageTextInput(\n name=\"input_value\",\n display_name=\"Input Message\",\n info=\"The input message to the language model.\",\n tool_mode=True,\n required=True,\n ),\n MultilineInput(\n name=\"system_prompt\",\n display_name=\"Format Instructions\",\n info=\"The instructions to the language model for formatting the output.\",\n value=(\n \"You are an AI system designed to extract structured information from unstructured text.\"\n \"Given the input_text, return a JSON object with predefined keys based on the expected structure.\"\n \"Extract values accurately and format them according to the specified type \"\n \"(e.g., string, integer, float, date).\"\n \"If a value is missing or cannot be determined, return a default \"\n \"(e.g., null, 0, or 'N/A').\"\n \"If multiple instances of the expected structure exist within the input_text, \"\n \"stream each as a separate JSON object.\"\n ),\n required=True,\n advanced=True,\n ),\n MessageTextInput(\n name=\"schema_name\",\n display_name=\"Schema Name\",\n info=\"Provide a name for the output data schema.\",\n advanced=True,\n ),\n TableInput(\n name=\"output_schema\",\n display_name=\"Output Schema\",\n info=\"Define the structure and data types for the model's output.\",\n required=True,\n # TODO: remove deault value\n table_schema=[\n {\n \"name\": \"name\",\n \"display_name\": \"Name\",\n \"type\": \"str\",\n \"description\": \"Specify the name of the output field.\",\n \"default\": \"field\",\n \"edit_mode\": EditMode.INLINE,\n },\n {\n \"name\": \"description\",\n \"display_name\": \"Description\",\n \"type\": \"str\",\n \"description\": \"Describe the purpose of the output field.\",\n \"default\": \"description of field\",\n \"edit_mode\": EditMode.POPOVER,\n },\n {\n \"name\": \"type\",\n \"display_name\": \"Type\",\n \"type\": \"str\",\n \"edit_mode\": EditMode.INLINE,\n \"description\": (\"Indicate the data type of the output field (e.g., str, int, float, bool, dict).\"),\n \"options\": [\"str\", \"int\", \"float\", \"bool\", \"dict\"],\n \"default\": \"str\",\n },\n {\n \"name\": \"multiple\",\n \"display_name\": \"As List\",\n \"type\": \"boolean\",\n \"description\": \"Set to True if this output field should be a list of the specified type.\",\n \"default\": \"False\",\n \"edit_mode\": EditMode.INLINE,\n },\n ],\n value=[\n {\n \"name\": \"field\",\n \"description\": \"description of field\",\n \"type\": \"str\",\n \"multiple\": \"False\",\n }\n ],\n ),\n ]\n\n outputs = [\n Output(\n name=\"structured_output\",\n display_name=\"Structured Output\",\n method=\"build_structured_output\",\n ),\n ]\n\n def build_structured_output_base(self) -> Data:\n schema_name = self.schema_name or \"OutputModel\"\n\n if not hasattr(self.llm, \"with_structured_output\"):\n msg = \"Language model does not support structured output.\"\n raise TypeError(msg)\n if not self.output_schema:\n msg = \"Output schema cannot be empty\"\n raise ValueError(msg)\n\n output_model_ = build_model_from_schema(self.output_schema)\n\n output_model = create_model(\n schema_name,\n __doc__=f\"A list of {schema_name}.\",\n objects=(list[output_model_], Field(description=f\"A list of {schema_name}.\")), # type: ignore[valid-type]\n )\n\n try:\n llm_with_structured_output = create_extractor(self.llm, tools=[output_model])\n except NotImplementedError as exc:\n msg = f\"{self.llm.__class__.__name__} does not support structured output.\"\n raise TypeError(msg) from exc\n config_dict = {\n \"run_name\": self.display_name,\n \"project_name\": self.get_project_name(),\n \"callbacks\": self.get_langchain_callbacks(),\n }\n result = get_chat_result(\n runnable=llm_with_structured_output,\n system_message=self.system_prompt,\n input_value=self.input_value,\n config=config_dict,\n )\n if isinstance(result, BaseModel):\n result = result.model_dump()\n if responses := result.get(\"responses\"):\n result = responses[0].model_dump()\n if result and \"objects\" in result:\n return result[\"objects\"]\n\n return result\n\n def build_structured_output(self) -> Data:\n output = self.build_structured_output_base()\n\n return Data(data=output[-1])\n" |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Ensure safe indexing of structured output list
The updated build_structured_output now returns Data(data=output[-1]) directly, which will raise an IndexError if output is empty. Consider adding a guard clause to validate that output is non-empty (e.g., if not output: raise ValueError("No structured output generated")) before accessing the last element.
🤖 Prompt for AI Agents
In src/backend/base/langflow/initial_setup/starter_projects/Portfolio Website
Code Generator.json at line 1516, the method build_structured_output accesses
the last element of output with output[-1] without checking if output is empty,
which can cause an IndexError. Add a guard clause before this access to check if
output is empty, and if so, raise a ValueError with a clear message like "No
structured output generated" to prevent runtime errors.
| "value": "from pydantic import BaseModel, Field, create_model\nfrom trustcall import create_extractor\n\nfrom langflow.base.models.chat_result import get_chat_result\nfrom langflow.custom.custom_component.component import Component\nfrom langflow.helpers.base_model import build_model_from_schema\nfrom langflow.io import (\n HandleInput,\n MessageTextInput,\n MultilineInput,\n Output,\n TableInput,\n)\nfrom langflow.schema.data import Data\nfrom langflow.schema.table import EditMode\n\n\nclass StructuredOutputComponent(Component):\n display_name = \"Structured Output\"\n description = \"Uses an LLM to generate structured data. Ideal for extraction and consistency.\"\n name = \"StructuredOutput\"\n icon = \"braces\"\n\n inputs = [\n HandleInput(\n name=\"llm\",\n display_name=\"Language Model\",\n info=\"The language model to use to generate the structured output.\",\n input_types=[\"LanguageModel\"],\n required=True,\n ),\n MessageTextInput(\n name=\"input_value\",\n display_name=\"Input Message\",\n info=\"The input message to the language model.\",\n tool_mode=True,\n required=True,\n ),\n MultilineInput(\n name=\"system_prompt\",\n display_name=\"Format Instructions\",\n info=\"The instructions to the language model for formatting the output.\",\n value=(\n \"You are an AI system designed to extract structured information from unstructured text.\"\n \"Given the input_text, return a JSON object with predefined keys based on the expected structure.\"\n \"Extract values accurately and format them according to the specified type \"\n \"(e.g., string, integer, float, date).\"\n \"If a value is missing or cannot be determined, return a default \"\n \"(e.g., null, 0, or 'N/A').\"\n \"If multiple instances of the expected structure exist within the input_text, \"\n \"stream each as a separate JSON object.\"\n ),\n required=True,\n advanced=True,\n ),\n MessageTextInput(\n name=\"schema_name\",\n display_name=\"Schema Name\",\n info=\"Provide a name for the output data schema.\",\n advanced=True,\n ),\n TableInput(\n name=\"output_schema\",\n display_name=\"Output Schema\",\n info=\"Define the structure and data types for the model's output.\",\n required=True,\n # TODO: remove deault value\n table_schema=[\n {\n \"name\": \"name\",\n \"display_name\": \"Name\",\n \"type\": \"str\",\n \"description\": \"Specify the name of the output field.\",\n \"default\": \"field\",\n \"edit_mode\": EditMode.INLINE,\n },\n {\n \"name\": \"description\",\n \"display_name\": \"Description\",\n \"type\": \"str\",\n \"description\": \"Describe the purpose of the output field.\",\n \"default\": \"description of field\",\n \"edit_mode\": EditMode.POPOVER,\n },\n {\n \"name\": \"type\",\n \"display_name\": \"Type\",\n \"type\": \"str\",\n \"edit_mode\": EditMode.INLINE,\n \"description\": (\"Indicate the data type of the output field (e.g., str, int, float, bool, dict).\"),\n \"options\": [\"str\", \"int\", \"float\", \"bool\", \"dict\"],\n \"default\": \"str\",\n },\n {\n \"name\": \"multiple\",\n \"display_name\": \"As List\",\n \"type\": \"boolean\",\n \"description\": \"Set to True if this output field should be a list of the specified type.\",\n \"default\": \"False\",\n \"edit_mode\": EditMode.INLINE,\n },\n ],\n value=[\n {\n \"name\": \"field\",\n \"description\": \"description of field\",\n \"type\": \"str\",\n \"multiple\": \"False\",\n }\n ],\n ),\n ]\n\n outputs = [\n Output(\n name=\"structured_output\",\n display_name=\"Structured Output\",\n method=\"build_structured_output\",\n ),\n ]\n\n def build_structured_output_base(self) -> Data:\n schema_name = self.schema_name or \"OutputModel\"\n\n if not hasattr(self.llm, \"with_structured_output\"):\n msg = \"Language model does not support structured output.\"\n raise TypeError(msg)\n if not self.output_schema:\n msg = \"Output schema cannot be empty\"\n raise ValueError(msg)\n\n output_model_ = build_model_from_schema(self.output_schema)\n\n output_model = create_model(\n schema_name,\n __doc__=f\"A list of {schema_name}.\",\n objects=(list[output_model_], Field(description=f\"A list of {schema_name}.\")), # type: ignore[valid-type]\n )\n\n try:\n llm_with_structured_output = create_extractor(self.llm, tools=[output_model])\n except NotImplementedError as exc:\n msg = f\"{self.llm.__class__.__name__} does not support structured output.\"\n raise TypeError(msg) from exc\n config_dict = {\n \"run_name\": self.display_name,\n \"project_name\": self.get_project_name(),\n \"callbacks\": self.get_langchain_callbacks(),\n }\n result = get_chat_result(\n runnable=llm_with_structured_output,\n system_message=self.system_prompt,\n input_value=self.input_value,\n config=config_dict,\n )\n if isinstance(result, BaseModel):\n result = result.model_dump()\n if responses := result.get(\"responses\"):\n result = responses[0].model_dump()\n if result and \"objects\" in result:\n return result[\"objects\"]\n\n return result\n\n def build_structured_output(self) -> Data:\n output = self.build_structured_output_base()\n\n return Data(data=output[-1])\n" | ||
| }, |
There was a problem hiding this comment.
Guard against empty or non‐list structured output
The new implementation does Data(data=output[-1]) without checking that output is a non-empty list. If build_structured_output_base() ever returns an empty list (or a dict), this will raise at runtime. Add a guard to:
output = self.build_structured_output_base()
if not isinstance(output, list) or not output:
# handle empty or unexpected type case
raise ValueError("No structured output returned")
return Data(data=output[-1])🤖 Prompt for AI Agents
In src/backend/base/langflow/initial_setup/starter_projects/Market Research.json
around lines 876 to 877, the method build_structured_output calls
build_structured_output_base and directly accesses output[-1] assuming output is
a non-empty list. To fix this, add a guard after calling
build_structured_output_base to check if output is a list and not empty; if not,
raise a ValueError indicating no structured output was returned. This prevents
runtime errors when output is empty or not a list.
…lement (#8467) * Update structured_output.py * template updates * Update Financial Report Parser.json
Summary by CodeRabbit