
fix: Add output item processing for tool results #10348

Merged
edwinjosechittilappilly merged 9 commits into main from fix-mcp-output
Oct 29, 2025

Conversation

@edwinjosechittilappilly
Collaborator

@edwinjosechittilappilly edwinjosechittilappilly commented Oct 21, 2025

Introduced process_output_item to handle tool output items, attempting to parse text-type items as JSON. This improves downstream handling of tool outputs by converting JSON strings to dictionaries when possible.

Summary by CodeRabbit

  • Bug Fixes
    • Improved processing of tool outputs with enhanced JSON text parsing
    • Optimized tool output data conversion for better handling of multiple output types
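
A minimal sketch of the behaviour described above, written as a standalone function rather than the actual MCPToolsComponent method. The body assumes the shape discussed later in this thread (return the result of json.loads directly, catch only json.JSONDecodeError); the sample items are hypothetical.

```python
import json


def process_output_item(item_dict):
    """Parse a text-type tool output item as JSON when possible."""
    if item_dict.get("type") == "text":
        text = item_dict.get("text")
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            pass  # not JSON: fall through and keep the original item
    return item_dict


# Hypothetical items, for illustration only.
json_item = {"type": "text", "text": '{"city": "Paris", "temp_c": 21}'}
plain_item = {"type": "text", "text": "not json"}
image_item = {"type": "image", "data": "..."}

parsed = process_output_item(json_item)           # the parsed JSON object
unchanged_text = process_output_item(plain_item)  # original item, untouched
unchanged_image = process_output_item(image_item) # non-text items pass through
```

Note that on a successful parse the wrapper keys of the original item (such as "type") are not carried over; only the parsed JSON value is returned.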

@coderabbitai
Contributor

coderabbitai Bot commented Oct 21, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Adds JSON parsing capability to MCPToolsComponent.build_output method. Text-type tool outputs are parsed as JSON when possible through a new process_output_item method. Includes early return path when all output items are dictionaries.

Changes

Cohort / File(s) | Summary
JSON parsing enhancement in MCPToolsComponent (src/lfx/src/lfx/components/agents/mcp_component.py) | Added json import. Introduced process_output_item(item_dict) method to process individual output items. Modified build_output to apply JSON parsing to text-type items, attempt to convert JSON strings to dicts, and return early when all items are dictionaries. Falls back to original item on JSON parse failure.
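
The flow summarised above can be sketched roughly as follows. This is an illustration only: collect_tool_content is a hypothetical helper, process_output_item is a stand-in with an assumed body, and the real build_output returns a DataFrame rather than a plain list.

```python
import json


def process_output_item(item_dict):
    """Stand-in for the method added in this PR (assumed shape)."""
    if item_dict.get("type") == "text":
        try:
            return json.loads(item_dict.get("text"))
        except json.JSONDecodeError:
            pass
    return item_dict


def collect_tool_content(raw_items):
    """Hypothetical helper: process each item, then check the all-dict condition."""
    tool_content = []
    for item_dict in raw_items:
        tool_content.append(process_output_item(item_dict))
    all_dicts = all(isinstance(x, dict) for x in tool_content)
    return tool_content, all_dicts


rows, all_dicts = collect_tool_content(
    [
        {"type": "text", "text": '{"id": 1, "name": "a"}'},
        {"type": "text", "text": '{"id": 2, "name": "b"}'},
    ]
)
```

Here rows holds the two parsed records and all_dicts is True, which corresponds to the early-return path in the walkthrough.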

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Possibly related PRs

Pre-merge checks and finishing touches

❌ Failed checks (1 error, 3 warnings)
Check name Status Explanation Resolution
Test Coverage For New Implementations ❌ Error The PR adds a new public method process_output_item to the MCPToolsComponent class and modifies the build_output method to parse JSON-encoded text items in tool outputs. However, the PR only modifies mcp_component.py—no test files are added or updated. While test files exist for MCPToolsComponent (including test_mcp_component.py, test_mcp_component_cache.py, and integration tests), none of them contain tests for the new process_output_item method or validate the modified build_output behavior. The existing tests focus on caching functionality, with a TODO comment noting that more tests are needed for MCPToolsComponent. Add unit tests to an existing test file (e.g., src/backend/tests/unit/components/data/test_mcp_component.py) covering: (1) process_output_item with valid JSON text successfully parsed to a dict, (2) invalid JSON text falling back to the original item_dict, (3) non-text items passing through unchanged, (4) the build_output method's early-return path when all items are dicts, and (5) the modified handling of output items through the processing pipeline. This ensures the bug fix is properly tested and prevents regressions in tool output handling.
Test Quality And Coverage ⚠️ Warning
Test File Naming And Structure ⚠️ Warning The repository establishes clear pytest conventions with test files following the test_*.py naming pattern located in centralized test directories such as ./src/backend/tests/unit/components/agents/. The existing test file test_mcp_component_cache.py demonstrates proper pytest structure with 13 descriptive test functions. However, the PR that introduces the new process_output_item method contains no corresponding test cases. A search across all test files reveals zero references to process_output_item, indicating the method has no test coverage. This violates the custom check requirement to include test files with descriptive names explaining what is being tested and covering both positive and negative scenarios. The new method processes JSON parsing logic that is critical to tool output handling but lacks dedicated test validation for success cases, failure cases, and edge conditions. Add test cases to ./src/backend/tests/unit/components/agents/test_mcp_component_cache.py or a dedicated test file using the test_*.py naming convention. Include comprehensive test functions with descriptive names prefixed with test_ covering: (1) test_process_output_item_parses_valid_json_text to validate JSON string parsing, (2) test_process_output_item_preserves_invalid_json to ensure non-JSON strings are returned unchanged, (3) test_process_output_item_returns_non_text_items_unchanged for items without type="text", (4) test_process_output_item_handles_missing_fields for edge cases, and (5) test_process_output_item_handles_empty_text for empty string inputs. Ensure each test follows pytest conventions with clear assertions validating expected behavior.
Excessive Mock Usage Warning ⚠️ Warning The PR introduces the new process_output_item method that parses JSON text in tool output items, and it is called directly within the build_output method's core loop. However, there are no dedicated unit tests for this new method, and the existing test_build_output_with_cached_tools test uses nine mock/patch instances to test build_output while obscuring the actual JSON parsing behavior. The test mocks the tool execution (exec_tool.coroutine), result objects (mock_result.content), and component methods (get_inputs_for_all_tools, update_tool_list), preventing validation that process_output_item correctly parses valid JSON, handles invalid JSON gracefully, and preserves non-text items. The test verifies only that mocks work as configured, not that the actual core logic introduced by this PR functions correctly, which is excessive mocking that violates the principle of testing real behavior. Add a dedicated parametrized unit test for process_output_item using a real MCPToolsComponent instance, testing: (1) text items with valid JSON strings (e.g., {"type": "text", "text": '{"key": "value"}'}) that should parse and return dicts, (2) text items with invalid JSON (e.g., {"type": "text", "text": 'not json'}) that should return the original item_dict, (3) non-text items that should pass through unchanged, and (4) edge cases like null text. Refactor test_build_output_with_cached_tools to replace deep mocking of core logic with minimal stubs for only external MCP server calls, and use realistic tool result objects to validate actual DataFrame construction and process_output_item integration.
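
A sketch of what the suggested test cases could look like, using the test names proposed above. It exercises a standalone stand-in for process_output_item (assumed body), not a real MCPToolsComponent instance, since constructing the component is outside what this thread shows.

```python
import json


def process_output_item(item_dict):
    """Stand-in for the method under test (assumed shape)."""
    if item_dict.get("type") == "text":
        try:
            return json.loads(item_dict.get("text"))
        except json.JSONDecodeError:
            pass
    return item_dict


def test_process_output_item_parses_valid_json_text():
    item = {"type": "text", "text": '{"key": "value"}'}
    assert process_output_item(item) == {"key": "value"}


def test_process_output_item_preserves_invalid_json():
    item = {"type": "text", "text": "not json"}
    assert process_output_item(item) is item


def test_process_output_item_returns_non_text_items_unchanged():
    item = {"type": "image", "data": "abc"}
    assert process_output_item(item) is item


def test_process_output_item_handles_missing_fields():
    # No "type" key: the item is not treated as text and passes through.
    item = {"text": '{"key": "value"}'}
    assert process_output_item(item) is item


def test_process_output_item_handles_empty_text():
    # json.loads("") raises JSONDecodeError, so the original item is kept.
    item = {"type": "text", "text": ""}
    assert process_output_item(item) is item
```

The functions follow the test_ naming convention, so pytest would collect them if placed in a test_*.py file.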
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The PR title "fix: Add output item processing for tool results" directly and accurately reflects the main change in the pull request: the introduction of the process_output_item method to handle tool output items. The title is concise, specific, and avoids vague terminology or noise. It clearly conveys that a new processing feature is being added for tool results, which aligns with the core objective of handling tool output items and parsing JSON text-type items. A teammate scanning the history would easily understand the primary change from this title.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.


@github-actions github-actions Bot added the bug Something isn't working label Oct 21, 2025
@github-actions
Contributor

github-actions Bot commented Oct 21, 2025

Frontend Unit Test Coverage Report

Coverage Summary

Lines | Statements | Branches | Functions
Coverage: 11% | 11.3% (3017/26676) | 5.16% (1052/20368) | 6.79% (397/5844)

Unit Test Results

Tests | Skipped | Failures | Errors | Time
1230 | 0 💤 | 0 ❌ | 0 🔥 | 14.179s ⏱️

@codecov

codecov Bot commented Oct 21, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 30.15%. Comparing base (722c6cc) to head (b6f8c31).
⚠️ Report is 3 commits behind head on main.

❌ Your project status has failed because the head coverage (39.40%) is below the target coverage (60.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main   #10348      +/-   ##
==========================================
- Coverage   30.16%   30.15%   -0.01%     
==========================================
  Files        1318     1318              
  Lines       59680    59680              
  Branches     8925     8925              
==========================================
- Hits        18003    17999       -4     
- Misses      40846    40849       +3     
- Partials      831      832       +1     
Flag | Coverage Δ
backend | 50.85% <ø> (-0.02%) ⬇️
frontend | 10.47% <ø> (ø)
lfx | 39.40% <ø> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.
see 2 files with indirect coverage changes


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cea4b42 and 13b9b0d.

📒 Files selected for processing (1)
  • src/lfx/src/lfx/components/agents/mcp_component.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/lfx/src/lfx/components/agents/mcp_component.py (1)
src/lfx/src/lfx/schema/message.py (1)
  • json (312-314)
🪛 GitHub Actions: Ruff Style Check
src/lfx/src/lfx/components/agents/mcp_component.py

[error] 544-544: Command: uv run --only-dev ruff check --output-format=github . | ruff: TRY300: Consider moving this statement to an 'else' block.

🪛 GitHub Check: Ruff Style Check (3.13)
src/lfx/src/lfx/components/agents/mcp_component.py

[failure] 545-545: Ruff (BLE001)
src/lfx/src/lfx/components/agents/mcp_component.py:545:20: BLE001 Do not catch blind exception: Exception


[failure] 544-544: Ruff (RET504)
src/lfx/src/lfx/components/agents/mcp_component.py:544:24: RET504 Unnecessary assignment to json_dict before return statement


[failure] 544-544: Ruff (TRY300)
src/lfx/src/lfx/components/agents/mcp_component.py:544:17: TRY300 Consider moving this statement to an else block

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Update Starter Projects
🔇 Additional comments (1)
src/lfx/src/lfx/components/agents/mcp_component.py (1)

4-4: LGTM!

The json import is necessary for the new JSON parsing functionality in process_output_item.

Comment on lines +525 to 530
                    item_dict = self.process_output_item(item_dict)
                    tool_content.append(item_dict)

                if isinstance(tool_content, list) and all(isinstance(x, dict) for x in tool_content):
                    return DataFrame(tool_content)
                return DataFrame(data=tool_content)
Contributor


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify the DataFrame constructor behavior and simplify the list check.

The early return logic attempts to handle cases where parsed JSON might not be a dict. However, there are a couple of concerns:

  1. Redundant list check: tool_content is always a list (built via append in the loop), so isinstance(tool_content, list) is always True. This check can be simplified.

  2. Unclear difference: What's the semantic difference between DataFrame(tool_content) (line 529) and DataFrame(data=tool_content) (line 530)? Without seeing the DataFrame implementation, it's unclear if these behave differently or are equivalent.
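
For context on why parsed items might not be dicts: json.loads accepts any valid JSON document, not only objects. The sample strings below are hypothetical tool outputs, chosen only to show the range of Python types that can come back.

```python
import json

# Hypothetical text payloads a tool might return; all are valid JSON.
samples = ['{"a": 1}', "[1, 2, 3]", "42", '"hello"', "null"]

# Record the Python type each payload decodes to.
parsed_types = [type(json.loads(s)).__name__ for s in samples]
```

Only the first sample decodes to a dict; the others become a list, an int, a str, and None respectively, so a list of processed items is not guaranteed to contain only dicts.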

Run the following script to check the DataFrame constructor signature:

Consider simplifying the check:

-                if isinstance(tool_content, list) and all(isinstance(x, dict) for x in tool_content):
+                if all(isinstance(x, dict) for x in tool_content):
                     return DataFrame(tool_content)
                 return DataFrame(data=tool_content)

🏁 Script executed:

#!/bin/bash
# Description: Examine DataFrame constructor to understand the difference between positional and keyword argument

# Search for DataFrame class definition and __init__ method
ast-grep --pattern $'class DataFrame:
  $$$
  def __init__($$$):
    $$$'

# Also search for DataFrame __init__ with different patterns
rg -n "class DataFrame" -A 30 src/lfx/src/lfx/schema/dataframe.py

Length of output: 1292


Remove the redundant list check; both DataFrame calls are equivalent.

The DataFrame constructor accepts data as its first parameter, so DataFrame(tool_content) and DataFrame(data=tool_content) behave identically. Since tool_content is always a list (built via append() in the loop), the isinstance(tool_content, list) check is redundant.

Simplify to:

-                if isinstance(tool_content, list) and all(isinstance(x, dict) for x in tool_content):
+                if all(isinstance(x, dict) for x in tool_content):
                     return DataFrame(tool_content)
                 return DataFrame(data=tool_content)

Or further simplify both lines to use consistent syntax:

-                if isinstance(tool_content, list) and all(isinstance(x, dict) for x in tool_content):
-                    return DataFrame(tool_content)
-                return DataFrame(data=tool_content)
+                if all(isinstance(x, dict) for x in tool_content):
+                    return DataFrame(data=tool_content)
+                return DataFrame(data=tool_content)

which reduces to:

-                if isinstance(tool_content, list) and all(isinstance(x, dict) for x in tool_content):
-                    return DataFrame(tool_content)
-                return DataFrame(data=tool_content)
+                return DataFrame(data=tool_content)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
                     item_dict = self.process_output_item(item_dict)
                     tool_content.append(item_dict)
-                if isinstance(tool_content, list) and all(isinstance(x, dict) for x in tool_content):
+                if all(isinstance(x, dict) for x in tool_content):
                     return DataFrame(tool_content)
                 return DataFrame(data=tool_content)
🤖 Prompt for AI Agents
In src/lfx/src/lfx/components/agents/mcp_component.py around lines 525 to 530,
the code redundantly checks isinstance(tool_content, list) before constructing a
DataFrame even though tool_content is always a list built via append(); replace
the conditional block with a single consistent DataFrame construction (e.g.,
return DataFrame(tool_content) or return DataFrame(data=tool_content)) and
remove the unnecessary isinstance/all checks so the function returns the
DataFrame directly.

Comment on lines +537 to +547
def process_output_item(self, item_dict):
    """Process the output of a tool."""
    if item_dict.get("type") == "text":
        text = item_dict.get("text")
        try:
            json_dict = json.loads(text)
            # convert it to dict
            return json_dict
        except Exception:
            return item_dict
    return item_dict
Contributor


⚠️ Potential issue | 🟠 Major

Fix static analysis issues and improve exception handling.

The method has several issues flagged by static analysis:

  1. Blind exception catch (BLE001): Catching Exception is too broad and can hide unexpected errors.
  2. Unnecessary assignment (RET504): The json_dict variable is assigned and immediately returned.
  3. Should use else block (TRY300): The successful parse should be in an else block.

Additionally, note that when JSON parsing succeeds, all metadata from item_dict (like the "type" field) is discarded. If this is intentional, consider documenting it.

Apply this diff to fix the static analysis issues:

 def process_output_item(self, item_dict):
-    """Process the output of a tool."""
+    """Process tool output item, parsing text-type items as JSON when possible.
+    
+    Returns the parsed JSON object for valid JSON text, otherwise returns the original item_dict.
+    Note: Successful JSON parsing discards the original item_dict metadata.
+    """
     if item_dict.get("type") == "text":
         text = item_dict.get("text")
         try:
-            json_dict = json.loads(text)
-            # convert it to dict
-            return json_dict
-        except Exception:
+            return json.loads(text)
+        except json.JSONDecodeError:
             return item_dict
+        else:
+            # This else is unreachable due to the return in try, but satisfies TRY300
+            pass
     return item_dict

Wait, the else block approach doesn't work with the return in try. Better approach:

 def process_output_item(self, item_dict):
-    """Process the output of a tool."""
+    """Process tool output item, parsing text-type items as JSON when possible.
+    
+    Returns the parsed JSON object for valid JSON text, otherwise returns the original item_dict.
+    Note: Successful JSON parsing discards the original item_dict metadata.
+    """
     if item_dict.get("type") == "text":
         text = item_dict.get("text")
         try:
-            json_dict = json.loads(text)
-            # convert it to dict
-            return json_dict
-        except Exception:
-            return item_dict
+            return json.loads(text)
+        except json.JSONDecodeError:
+            pass
-    return item_dict
+    return item_dict
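
The note above says the else block does not combine with a return inside try. One common alternative, sketched here as a standalone function under the assumption that the goal is only to move the success-path return out of the try body, is to assign inside try and return from else. Whether this form passes every Ruff rule configured in this repository has not been checked.

```python
import json


def process_output_item(item_dict):
    """Variant using try/except/else; same results as the diff above."""
    if item_dict.get("type") != "text":
        return item_dict
    try:
        parsed = json.loads(item_dict.get("text"))
    except json.JSONDecodeError:
        # Not valid JSON: keep the original item.
        return item_dict
    else:
        # Only reached when parsing succeeded.
        return parsed


list_result = process_output_item({"type": "text", "text": "[1, 2]"})
fallback_item = {"type": "text", "text": "oops"}
fallback_result = process_output_item(fallback_item)
```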
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 def process_output_item(self, item_dict):
-    """Process the output of a tool."""
+    """Process tool output item, parsing text-type items as JSON when possible.
+    Returns the parsed JSON object for valid JSON text, otherwise returns the original item_dict.
+    Note: Successful JSON parsing discards the original item_dict metadata.
+    """
     if item_dict.get("type") == "text":
         text = item_dict.get("text")
         try:
-            json_dict = json.loads(text)
-            # convert it to dict
-            return json_dict
-        except Exception:
-            return item_dict
+            return json.loads(text)
+        except json.JSONDecodeError:
+            pass
     return item_dict

Collaborator

@Adam-Aghili Adam-Aghili left a comment


Didn't go through the JSON changes, but other than the build failures, LGTM!

thank you @edwinjosechittilappilly

Simplifies JSON parsing by returning the result directly and catching only JSONDecodeError. This improves error handling and code clarity.
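
One consequence of narrowing the handler, offered as an observation rather than a confirmed bug in the merged code: json.loads raises TypeError, not json.JSONDecodeError, when given None, and None is what item_dict.get("text") returns for a text item that has no "text" key. Whether MCP tool results can actually produce such items is not established in this thread.

```python
import json

# Check which exception json.loads raises for a None input.
try:
    json.loads(None)
except json.JSONDecodeError:
    outcome = "JSONDecodeError"
except TypeError:
    outcome = "TypeError"
```

After this runs, outcome is "TypeError", so a handler that catches only json.JSONDecodeError would let that case propagate.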
@edwinjosechittilappilly edwinjosechittilappilly added this pull request to the merge queue Oct 29, 2025
Merged via the queue into main with commit aabc58c Oct 29, 2025
81 of 82 checks passed
@edwinjosechittilappilly edwinjosechittilappilly deleted the fix-mcp-output branch October 29, 2025 15:51
korenLazar pushed a commit to kiran-kate/langflow that referenced this pull request Nov 13, 2025
* Add output item processing for tool results

Introduced process_output_item to handle tool output items, attempting to parse text-type items as JSON. This improves downstream handling of tool outputs by converting JSON strings to dictionaries when possible.

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes (attempt 2/3)

* [autofix.ci] apply automated fixes (attempt 3/3)

* Refactor JSON parsing in MCPToolsComponent

Simplifies JSON parsing by returning the result directly and catching only JSONDecodeError. This improves error handling and code clarity.

* Update component_index.json

* [autofix.ci] apply automated fixes

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>

Labels

bug Something isn't working


2 participants