fix: improved better health check and stream URL check on MCP, improved JSON recognition by lucaseduoli · Pull Request #8982 · langflow-ai/langflow

lucaseduoli · 2025-07-10T13:11:04Z

This pull request introduces enhancements to error handling, session management, and validation logic in the src/backend/base/langflow/base/mcp/util.py file. Key changes include improved session health checks, retry logic for closed resources, and adjustments to HTTP validation for SSE endpoints.

This fixes the Figma MCP server for both the STDIO and SSE transports.

Session Management Improvements:

Added a comprehensive health check for session streams, ensuring both background tasks and streams are functional before reusing a session.
Implemented retry logic for ClosedResourceError during tool execution: the session is cleaned up automatically and the call is retried, for at most two attempts in total.

Error Handling Enhancements:

Improved error handling in run_tool by distinguishing between expected errors (e.g., ConnectionError, TimeoutError) and unexpected ones, with proper re-raising and logging.

HTTP Validation for SSE Endpoints:

Modified HTTP validation logic to use GET requests with SSE headers instead of HEAD, accommodating servers that don't support HEAD requests.
Added handling for specific HTTP status codes (e.g., 404, 400, 500) in SSE validation, allowing graceful fallback for unsupported endpoints.
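
The status-code handling described above can be sketched as a small pure function. This is an illustration only: the helper name `classify_sse_status` and its return labels are hypothetical, and the treatment of non-404 client errors as a rejection is an assumption rather than something this description states. The constant names are the ones this PR introduces.

```python
# Constants named in this PR; values are the standard HTTP status codes.
HTTP_BAD_REQUEST = 400
HTTP_NOT_FOUND = 404
HTTP_INTERNAL_SERVER_ERROR = 500


def classify_sse_status(status_code: int) -> str:
    """Hypothetical helper: map a validation response status to an outcome label."""
    if status_code == HTTP_NOT_FOUND:
        # Some SSE servers answer 404 to a plain probe, so do not reject outright.
        return "maybe_valid"
    if HTTP_BAD_REQUEST <= status_code < HTTP_INTERNAL_SERVER_ERROR:
        # Assumption: any other 4xx response means the URL is rejected.
        return "client_error"
    if status_code >= HTTP_INTERNAL_SERVER_ERROR:
        return "server_error"
    return "ok"
```

Keeping the decision in a pure function like this makes each branch easy to unit test without a live server.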

Miscellaneous:

Introduced constants for HTTP status codes (HTTP_NOT_FOUND, HTTP_BAD_REQUEST, HTTP_INTERNAL_SERVER_ERROR) to improve code readability and maintainability.

Summary by CodeRabbit

New Features
- Improved session health checks and error handling for more reliable client connections.
- Added automatic retries for tool operations on connection or timeout errors.
- Enhanced validation of streaming endpoints for better compatibility.
Bug Fixes
- Prevented hangs by adding timeouts to session creation and tool calls.
- Improved detection and handling of server switches during active sessions.

coderabbitai · 2025-07-10T13:11:12Z

Walkthrough

The update enhances session management and error handling for MCP client sessions, introducing improved health checks, session recreation on server switch, robust retry logic for tool execution, and more accurate SSE endpoint validation. New internal methods and modifications to existing ones ensure better detection of session issues and more reliable client-server interactions.

Changes

File(s)	Change Summary
src/backend/base/langflow/base/mcp/util.py	Added session connectivity validation, improved session health checks, forced session recreation on server switch, session creation with timeouts, enhanced session cleanup, retry logic for tool execution, and SSE validation.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant MCPSessionManager
    participant MCPClient (Stdio/SSE)
    participant Server

    User->>MCPSessionManager: get_session(context_id, params, transport_type)
    MCPSessionManager->>MCPClient: Check background task and stream health
    alt Session healthy
        MCPSessionManager->>MCPClient: _validate_session_connectivity()
        alt Connectivity OK
            MCPSessionManager-->>User: Return session
        else Connectivity failed or server switched
            MCPSessionManager->>MCPClient: disconnect()
            MCPSessionManager->>MCPSessionManager: Create new session
            MCPSessionManager-->>User: Return new session
        end
    else Session unhealthy
        MCPSessionManager->>MCPClient: disconnect()
        MCPSessionManager->>MCPSessionManager: Create new session
        MCPSessionManager-->>User: Return new session
    end

    User->>MCPClient: run_tool(tool_name, arguments)
    MCPClient->>Server: Call tool (with 30s timeout)
    alt Success
        MCPClient-->>User: Return result
    else Connection/Timeout error
        MCPClient->>MCPSessionManager: Clear session cache, reset state
        MCPClient->>Server: Retry call tool (max 2 attempts)
        alt Success
            MCPClient-->>User: Return result
        else Failure
            MCPClient-->>User: Raise error
        end
    end
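
The first half of the diagram (reuse a session only if it is healthy, reachable, and bound to the same server) can be sketched in Python as follows. Everything here is a stand-in: `FakeSession`, `SessionManagerSketch`, and the `ping` method are hypothetical names used for illustration, not the classes in util.py.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeSession:
    """Hypothetical stand-in for a client session."""

    server: str
    task_alive: bool = True    # background task still running
    stream_open: bool = True   # write stream not closed
    responds: bool = True      # answers a lightweight connectivity probe
    disconnected: bool = False

    async def ping(self) -> bool:
        return self.responds

    async def disconnect(self) -> None:
        self.disconnected = True


@dataclass
class SessionManagerSketch:
    sessions: dict[str, FakeSession] = field(default_factory=dict)

    async def get_session(self, context_id: str, server: str) -> FakeSession:
        session = self.sessions.get(context_id)
        if session is not None:
            healthy = session.task_alive and session.stream_open
            same_server = session.server == server
            # Only probe connectivity when the cheap checks already passed.
            if healthy and same_server and await session.ping():
                return session
            # Unhealthy, unreachable, or server switched: drop the old session.
            await session.disconnect()
        new_session = FakeSession(server=server)
        self.sessions[context_id] = new_session
        return new_session
```

Ordering the checks from cheapest to most expensive means the connectivity probe is only paid for sessions that already look reusable.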

Possibly related PRs

langflow-ai/langflow#8908: Refactors session health checks in get_session, but this PR adds more comprehensive checks and retry mechanisms.

Suggested labels

bug, size:L, lgtm

Suggested reviewers

phact

coderabbitai

Actionable comments posted: 0

🔭 Outside diff range comments (1)

src/backend/base/langflow/base/mcp/util.py (1)

776-887: Critical: Extract common retry logic to eliminate code duplication.

The run_tool implementations in both MCPStdioClient and MCPSseClient are nearly identical (100+ lines of duplicated code). This violates the DRY principle and makes maintenance difficult.

Additionally, the error detection using string matching (e.g., "ClosedResourceError" in str(type(e))) is fragile and could break with error message changes.

Create a base class or mixin with the common retry logic:

class MCPClientBase:
    """Base class for MCP clients with common retry logic."""
    
    async def _run_tool_with_retry(
        self, 
        tool_name: str, 
        arguments: dict[str, Any],
        get_session_func: Callable,
        session_context: str,
        cleanup_func: Callable
    ) -> Any:
        """Common retry logic for running tools."""
        max_retries = 2
        last_error_type = None
        
        for attempt in range(max_retries):
            try:
                logger.debug(f"Attempting to run tool '{tool_name}' (attempt {attempt + 1}/{max_retries})")
                session = await get_session_func()
                
                result = await asyncio.wait_for(
                    session.call_tool(tool_name, arguments=arguments),
                    timeout=30.0
                )
                logger.debug(f"Tool '{tool_name}' completed successfully")
                return result
                
            except Exception as e:
                current_error_type = type(e).__name__
                logger.warning(f"Tool '{tool_name}' failed on attempt {attempt + 1}: {current_error_type} - {e}")
                
                # Better error detection
                is_connection_error = self._is_connection_error(e)
                is_timeout_error = isinstance(e, (asyncio.TimeoutError, TimeoutError))
                
                if last_error_type == current_error_type and attempt > 0:
                    logger.error(f"Repeated {current_error_type} error for tool '{tool_name}', not retrying")
                    break
                    
                last_error_type = current_error_type
                
                if is_connection_error and attempt < max_retries - 1:
                    logger.warning(f"MCP session connection issue for tool '{tool_name}', retrying...")
                    await cleanup_func(session_context)
                    await asyncio.sleep(0.5)
                    continue
                    
                if is_timeout_error and attempt < max_retries - 1:
                    logger.warning(f"Tool '{tool_name}' timed out, retrying...")
                    await asyncio.sleep(1.0)
                    continue
                    
                # Handle final error
                if is_connection_error or is_timeout_error:
                    msg = f"Failed to run tool '{tool_name}' after {attempt + 1} attempts: {e}"
                    logger.error(msg)
                    self._connected = False
                    raise ValueError(msg) from e
                raise
                
        msg = f"Failed to run tool '{tool_name}': Maximum retries exceeded"
        logger.error(msg)
        raise ValueError(msg)
    
    def _is_connection_error(self, e: Exception) -> bool:
        """Check if exception is a connection-related error."""
        # Standard connection errors
        if isinstance(e, (ConnectionError, OSError, ValueError)):
            return True
            
        # MCP-specific errors - compare the exception class name instead of searching str(type(e))
        error_type_name = type(e).__name__
        if error_type_name in ("ClosedResourceError", "McpError"):
            return True
            
        # Check error message as fallback
        error_str = str(e)
        connection_keywords = [
            "Connection closed", "Connection lost", 
            "Transport closed", "Stream closed"
        ]
        return any(keyword in error_str for keyword in connection_keywords)

Then simplify the client implementations:

class MCPStdioClient(MCPClientBase):
    async def run_tool(self, tool_name: str, arguments: dict[str, Any]) -> Any:
        # ... validation code ...
        
        return await self._run_tool_with_retry(
            tool_name=tool_name,
            arguments=arguments,
            get_session_func=self._get_or_create_session,
            session_context=self._session_context,
            cleanup_func=lambda ctx: self._get_session_manager()._cleanup_session(ctx)
        )

Also applies to: 1063-1174
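
The fragility of the string-matching check mentioned in this comment can be seen in isolation. The sketch below uses a locally defined `ClosedResourceError` as a stand-in; in real code the class would be imported from the library that raises it (assumed here to be anyio), and the other class names are hypothetical.

```python
class ClosedResourceError(Exception):
    """Local stand-in for the stream-closed exception raised by the transport."""


class WrappedClosedError(ClosedResourceError):
    """Hypothetical subclass whose name does not contain the parent's name."""


class NotAClosedResourceErrorAtAll(Exception):
    """Hypothetical unrelated error whose name happens to contain the substring."""


def matches_by_string(exc: BaseException) -> bool:
    # The pattern criticised in the review: substring search on the type's repr.
    return "ClosedResourceError" in str(type(exc))


def matches_by_type(exc: BaseException) -> bool:
    # isinstance follows the class hierarchy and ignores naming coincidences.
    return isinstance(exc, ClosedResourceError)
```

The string check misses `WrappedClosedError` (a false negative) and accepts `NotAClosedResourceErrorAtAll` (a false positive), while the `isinstance` check gets both right.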

🧹 Nitpick comments (3)

src/backend/base/langflow/base/mcp/util.py (3)

26-30: Consider using httpx status code constants directly.

The file already imports httpx_codes (line 13). Instead of defining duplicate constants, use the existing httpx constants for consistency.

-# HTTP status codes used in validation
-HTTP_NOT_FOUND = 404
-HTTP_BAD_REQUEST = 400
-HTTP_INTERNAL_SERVER_ERROR = 500

Then update the usage in lines 950, 957-958 to:

-if response.status_code == HTTP_NOT_FOUND:
+if response.status_code == httpx_codes.NOT_FOUND:

-if (
-    HTTP_BAD_REQUEST <= response.status_code < HTTP_INTERNAL_SERVER_ERROR
-    and response.status_code != HTTP_NOT_FOUND
-):
+if (
+    httpx_codes.BAD_REQUEST <= response.status_code < httpx_codes.INTERNAL_SERVER_ERROR
+    and response.status_code != httpx_codes.NOT_FOUND
+):

461-489: Consider extracting stream health check logic for better maintainability.

The stream health check logic is comprehensive but complex with multiple conditional branches. Consider extracting this into a separate method for better readability and testability.

async def _check_stream_health(self, session) -> bool:
    """Check if the session's write stream is still healthy."""
    try:
        if not hasattr(session, "_write_stream"):
            return True  # Can't check, assume healthy
            
        write_stream = session._write_stream
        
        # Check for explicit closed state
        if hasattr(write_stream, "_closed") and write_stream._closed:
            return False
            
        # Check anyio stream state for send channels
        if hasattr(write_stream, "_state") and hasattr(write_stream._state, "open_send_channels"):
            return write_stream._state.open_send_channels > 0
            
        # Check for other stream closed indicators
        if hasattr(write_stream, "is_closing") and callable(write_stream.is_closing):
            return not write_stream.is_closing()
            
        # Default to healthy if we can't determine state
        return True
        
    except (AttributeError, TypeError) as e:
        logger.debug(f"Could not check stream health: {e}")
        return True

Then use it in the main method:

-# Additional check for stream health
-stream_is_healthy = True
-try:
-    # Check if the session's write stream is still open
-    if hasattr(session, "_write_stream"):
-        write_stream = session._write_stream
-        # ... (rest of the checks)
-except (AttributeError, TypeError) as e:
-    # ...
-    stream_is_healthy = True
+stream_is_healthy = await self._check_stream_health(session)

888-899: Consider including disconnect in the base class refactor.

The disconnect method is also duplicated between MCPStdioClient and MCPSseClient. When creating the base class suggested earlier, include this common cleanup logic.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 75e930d and e9c41cc.

📒 Files selected for processing (1)

src/backend/base/langflow/base/mcp/util.py (10 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

src/backend/**/*.py: Run make format_backend to format Python code early and often
Run make lint to check for linting issues in backend Python code

📄 Source: CodeRabbit Inference Engine (.cursor/rules/backend_development.mdc)

List of files the instruction was applied to:

src/backend/base/langflow/base/mcp/util.py

🧠 Learnings (1)

src/backend/base/langflow/base/mcp/util.py (1)

Learnt from: CR
PR: langflow-ai/langflow#0
File: .cursor/rules/backend_development.mdc:0-0
Timestamp: 2025-06-30T14:39:17.464Z
Learning: Applies to src/backend/base/langflow/components/**/*.py : Use asyncio.create_task for background work in async components and ensure proper cleanup on cancellation

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Update Starter Projects
GitHub Check: Run Ruff Check and Format
GitHub Check: Ruff Style Check (3.13)
GitHub Check: Optimize new Python code in this PR

🔇 Additional comments (4)

src/backend/base/langflow/base/mcp/util.py (4)

388-429: Well-implemented session connectivity validation.

The method provides a robust lightweight connectivity test with:

Appropriate 3-second timeout for fast failure

Comprehensive error handling for both standard and MCP-specific errors

Proper response validation

Clear debug logging
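
A lightweight connectivity test of this kind might look like the sketch below. The function name `validate_connectivity` and the choice of `list_tools()` as the probe call are assumptions made for illustration; only the 3-second timeout comes from the comment above.

```python
import asyncio


async def validate_connectivity(session, timeout: float = 3.0) -> bool:
    """Return True if a cheap request on the session completes within `timeout`."""
    try:
        # Assumed probe: any inexpensive request/response round trip would do.
        response = await asyncio.wait_for(session.list_tools(), timeout=timeout)
    except (asyncio.TimeoutError, ConnectionError, OSError):
        # Fail fast: a slow or broken session is reported as not connected.
        return False
    # Response validation: an empty answer is treated as a failed probe.
    return response is not None
```

Returning a boolean instead of raising lets the caller decide whether to recreate the session.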

580-598: Excellent timeout handling for session creation.

The 10-second timeout with proper cleanup ensures sessions don't hang indefinitely. The error handling and task cancellation logic is well-implemented in both STDIO and SSE session creation methods.

Also applies to: 643-661
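
The timeout-plus-cleanup pattern praised here can be sketched as follows. This is a minimal illustration under assumptions: the function name, the `run_session(ready)` callback shape, and the use of a future to signal readiness are all hypothetical, and only the 10-second default comes from the comment.

```python
import asyncio
import contextlib


async def create_session_with_timeout(run_session, timeout: float = 10.0):
    """Start a background session task and wait, with a bound, until it is ready.

    `run_session(ready)` is a hypothetical coroutine function that sets a result
    on the `ready` future once the session is usable, then keeps running.
    """
    ready = asyncio.get_running_loop().create_future()
    task = asyncio.create_task(run_session(ready))
    try:
        session = await asyncio.wait_for(ready, timeout=timeout)
    except asyncio.TimeoutError as exc:
        # Cancel the background task so it does not linger after the failure.
        task.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await task
        msg = f"Session creation timed out after {timeout} seconds"
        raise ValueError(msg) from exc
    return session, task
```

The caller receives the background task alongside the session so that it can cancel the task on disconnect.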

931-973: Excellent SSE endpoint validation improvements.

The changes properly handle SSE server compatibility:

Using GET with Accept: text/event-stream header instead of HEAD

Treating 404 as potentially valid (many SSE servers don't support non-SSE requests)

Interpreting timeouts as expected behavior for streaming endpoints

This should significantly improve compatibility with various SSE server implementations.

1-1174: Reminder: Run backend formatting and linting.

As per the coding guidelines for src/backend/**/*.py files, ensure you run:

make format_backend to format the code

make lint to check for any linting issues

github-actions · 2025-07-10T16:17:39Z

Frontend Unit Test Coverage Report

Coverage Summary

Lines	Statements	Branches	Functions
	73.58% (117/159)	57.46% (77/134)	59.52% (25/42)

Unit Test Results

Tests	Skipped	Failures	Errors	Time
45	0 💤	0 ❌	0 🔥	2.613s ⏱️

…ed JSON recognition (langflow-ai#8982) * Improved health check and stream URL check on MCP * Improved health check by validating session connectivity * Changed mcp servers from json checks * Fixed imports * Fixed mcp server tab test

lucaseduoli added 2 commits July 10, 2025 09:40

Improved health check and stream URL check on MCP

a8ec5db

Improved health check by validating session connectivity

e9c41cc

lucaseduoli requested a review from ogabrielluiz July 10, 2025 13:11

lucaseduoli self-assigned this Jul 10, 2025

dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. bug Something isn't working labels Jul 10, 2025

github-actions Bot added bug Something isn't working and removed bug Something isn't working labels Jul 10, 2025

coderabbitai Bot reviewed Jul 10, 2025

Changed mcp servers from json checks

7ca8256

dosubot Bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Jul 10, 2025

ogabrielluiz requested changes Jul 10, 2025

Comment thread src/backend/base/langflow/base/mcp/util.py Outdated

Comment thread src/backend/base/langflow/base/mcp/util.py Outdated

lucaseduoli changed the title ~~fix: improved better health check and stream URL check on MCP~~ fix: improved better health check and stream URL check on MCP, improved JSON recognition Jul 10, 2025

Fixed imports

afe88f6

lucaseduoli requested a review from ogabrielluiz July 10, 2025 13:24

Merge branch 'main' into fix/mcp_component_type

d634356

Merge branch 'main' into fix/mcp_component_type

a93fb67

github-actions Bot removed the bug Something isn't working label Jul 10, 2025

github-actions Bot added the bug Something isn't working label Jul 10, 2025

Fixed mcp server tab test

47b9eef

Merge branch 'main' into fix/mcp_component_type

ff50ed1

ogabrielluiz requested a review from mfortman11 July 10, 2025 15:16

ogabrielluiz approved these changes Jul 10, 2025

dosubot Bot added the lgtm This PR has been approved by a maintainer label Jul 10, 2025

ogabrielluiz added this pull request to the merge queue Jul 10, 2025

Merged via the queue into main with commit 8779593 Jul 10, 2025
77 of 78 checks passed

ogabrielluiz deleted the fix/mcp_component_type branch July 10, 2025 17:20

This was referenced Jul 16, 2025

refactor(session): migrate to server-based session management and add tests #9077

Merged

fix: store mcp sse headers and use them on connection #9148

Merged

coderabbitai Bot mentioned this pull request Aug 20, 2025

fix: MCP - changing fixed timeout values to environment variables #9455

Open

coderabbitai Bot mentioned this pull request Sep 17, 2025

refactor: Simplify MCP session management and consolidate utility fun… #9893

Open

coderabbitai Bot mentioned this pull request Oct 7, 2025

feat: adds streamable http instead of sse on mcp client #10157

Merged

ogabrielluiz merged 8 commits into main from fix/mcp_component_type
