
⚡️ Speed up function has_chat_output by 67% in PR #9069 (openai-compatibility) #9546

Closed
codeflash-ai[bot] wants to merge 57 commits into main from codeflash/optimize-pr9069-2025-08-26T16.41.19

Contributor

@codeflash-ai codeflash-ai Bot commented Aug 26, 2025

⚡️ This pull request contains optimizations for PR #9069

If you approve this dependent PR, these changes will be merged into the original PR branch openai-compatibility.

This PR will be automatically closed if the original PR is merged.


📄 67% (0.67x) speedup for has_chat_output in langflow/api/v1/openai_responses.py

⏱️ Runtime : 822 microseconds → 492 microseconds (best of 37 runs)
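
The headline figure can be reproduced from the two runtimes. The sketch below assumes the percentage is computed as (original - optimized) / optimized, which matches the reported numbers; the formula itself is not stated in this comment.

```python
# Runtimes reported above, in microseconds.
original_us = 822
optimized_us = 492

# Assumed definition: extra time the original needs relative to the optimized run.
speedup = (original_us - optimized_us) / optimized_us

# The same change expressed as a reduction in wall-clock time.
time_saved = (original_us - optimized_us) / original_us

print(f"speedup: {speedup:.0%}, time saved: {time_saved:.0%}")
```

Under this definition, a 67% speedup corresponds to a runtime that is roughly 40% shorter.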

📝 Explanation and details

The optimized code achieves a 67% speedup by making three key performance improvements:

  1. Set-based membership testing: Replaces the list ["ChatOutput", "Chat Output"] with a set {"ChatOutput", "Chat Output"}. Set membership testing is O(1) on average vs O(n) for lists, although with only two candidates the saving per type check is small.

  2. Early return with explicit loop: Replaces the any() generator expression with a direct for-loop that returns True immediately upon finding a match. any() also stops at the first match, so the saving comes from avoiding the generator's setup and per-item overhead rather than from skipping nodes.

  3. Reduced nested dictionary access: Extracts node.get("data") into a local variable so that the conditional check does not repeat the .get() call.
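
The three changes can be sketched side by side. The original source is not shown in this comment, so both functions below are reconstructions from the description above, and the names has_chat_output_original, has_chat_output_optimized, and _CHAT_OUTPUT_TYPES are hypothetical.

```python
from typing import Any, Dict, Optional

# Hypothetical module-level constant: the accepted type names as a frozen set.
_CHAT_OUTPUT_TYPES = frozenset({"ChatOutput", "Chat Output"})


def has_chat_output_original(flow_data: Optional[Dict[str, Any]]) -> bool:
    """Assumed original shape: any() over a generator, list membership."""
    if not flow_data:
        return False
    nodes = flow_data.get("nodes", [])
    return any(
        node.get("data", {}).get("type") in ["ChatOutput", "Chat Output"]
        for node in nodes
    )


def has_chat_output_optimized(flow_data: Optional[Dict[str, Any]]) -> bool:
    """Assumed optimized shape: explicit loop, set membership, one data lookup."""
    if not flow_data:
        return False
    for node in flow_data.get("nodes", []):
        data = node.get("data")  # fetched once per node
        if data and data.get("type") in _CHAT_OUTPUT_TYPES:
            return True  # stop at the first match
    return False


# Both variants should agree on representative inputs.
samples = [
    None,
    {},
    {"nodes": []},
    {"nodes": [{"foo": "bar"}]},
    {"nodes": [{"data": {"type": "TextInput"}}, {"data": {"type": "Chat Output"}}]},
]
for sample in samples:
    assert has_chat_output_original(sample) == has_chat_output_optimized(sample)
```

Keeping the accepted names in a module-level frozenset is one possible design choice; an inline set literal would behave the same way.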

The optimization is particularly effective for test cases with:

  • Large node lists with early matches (e.g., test_large_nodes_list_with_chatoutput_at_start) - likely benefits from skipping generator setup before the first match
  • Large node lists without matches (e.g., test_large_nodes_list_with_no_chatoutput) - benefits from lower per-node overhead across all 1000 nodes
  • Multiple ChatOutput nodes - the first match triggers an immediate return, as it does with any(), but with less overhead on the way to it
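
When weighing the early-match cases, it may help to keep in mind that any() itself stops consuming its input at the first truthy element. The sketch below uses a hypothetical counting generator, counting_nodes, to illustrate this; it is not code from the PR.

```python
CHAT_TYPES = {"ChatOutput", "Chat Output"}


def counting_nodes(total, match_index, visited):
    """Yield `total` nodes, recording each index that is actually consumed."""
    for i in range(total):
        visited.append(i)
        node_type = "ChatOutput" if i == match_index else "TextInput"
        yield {"data": {"type": node_type}}


visited = []
found = any(
    node["data"]["type"] in CHAT_TYPES
    for node in counting_nodes(1000, match_index=0, visited=visited)
)

# With the match in first position, only one node is pulled from the generator.
print(found, len(visited))
```

Because the match sits at index 0, any() consumes a single node, so any gain for early matches would have to come from lower fixed overhead rather than from fewer nodes visited.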

The line profiler shows the original code spent 97.7% of its time in the any() expression. The optimized version spreads that work across the explicit loop iterations and avoids the generator's per-item overhead.
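
The comparison can be cross-checked with a simple timing harness. This is a minimal sketch using timeit.repeat and taking the minimum, in the spirit of the "best of 37 runs" figure; the two variants are hypothetical reconstructions, and the helper best_of is not part of Codeflash.

```python
import timeit

CHAT_TYPES = frozenset({"ChatOutput", "Chat Output"})


def with_any(flow):
    # Hypothetical original shape: generator expression inside any().
    if not flow:
        return False
    return any(
        node.get("data", {}).get("type") in ["ChatOutput", "Chat Output"]
        for node in flow.get("nodes", [])
    )


def with_loop(flow):
    # Hypothetical optimized shape: explicit loop with an early return.
    if not flow:
        return False
    for node in flow.get("nodes", []):
        data = node.get("data")
        if data and data.get("type") in CHAT_TYPES:
            return True
    return False


def best_of(func, arg, repeat=5, number=100):
    """Return the best per-call time in seconds across `repeat` batches."""
    batches = timeit.repeat(lambda: func(arg), repeat=repeat, number=number)
    return min(batches) / number


# Worst case for both variants: 1000 nodes and no match.
flow = {"nodes": [{"data": {"type": "TextInput"}} for _ in range(1000)]}
t_any = best_of(with_any, flow)
t_loop = best_of(with_loop, flow)
print(f"any(): {t_any * 1e6:.1f} us  loop: {t_loop * 1e6:.1f} us")
```

Absolute numbers will vary by machine and Python version, so only the ratio between the two timings is meaningful.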

Correctness verification report:

Test | Status
⏪ Replay Tests | 🔘 None Found
⚙️ Existing Unit Tests | 🔘 None Found
🔎 Concolic Coverage Tests | 🔘 None Found
🌀 Generated Regression Tests | ✅ 44 Passed
📊 Tests Coverage | 100.0%
🌀 Generated Regression Tests and Runtime
import pytest  # used for our unit tests
from langflow.api.v1.openai_responses import has_chat_output

# unit tests

# ---------------- BASIC TEST CASES ----------------

def test_none_input():
    # None input should return False
    codeflash_output = has_chat_output(None)

def test_empty_dict():
    # Empty dict should return False
    codeflash_output = has_chat_output({})

def test_no_nodes_key():
    # Dict without 'nodes' key should return False
    codeflash_output = has_chat_output({'foo': []})

def test_empty_nodes_list():
    # 'nodes' key with empty list should return False
    codeflash_output = has_chat_output({'nodes': []})

def test_single_node_with_chatoutput():
    # Single node with 'ChatOutput' type should return True
    flow = {'nodes': [{'data': {'type': 'ChatOutput'}}]}
    codeflash_output = has_chat_output(flow)

def test_single_node_with_chat_output_spaced():
    # Single node with 'Chat Output' (with space) type should return True
    flow = {'nodes': [{'data': {'type': 'Chat Output'}}]}
    codeflash_output = has_chat_output(flow)

def test_single_node_with_other_type():
    # Single node with other type should return False
    flow = {'nodes': [{'data': {'type': 'TextInput'}}]}
    codeflash_output = has_chat_output(flow)

def test_multiple_nodes_one_chatoutput():
    # Multiple nodes, one with 'ChatOutput'
    flow = {
        'nodes': [
            {'data': {'type': 'TextInput'}},
            {'data': {'type': 'ChatOutput'}},
            {'data': {'type': 'ImageOutput'}}
        ]
    }
    codeflash_output = has_chat_output(flow)

def test_multiple_nodes_none_chatoutput():
    # Multiple nodes, none with 'ChatOutput' or 'Chat Output'
    flow = {
        'nodes': [
            {'data': {'type': 'TextInput'}},
            {'data': {'type': 'ImageOutput'}},
            {'data': {'type': 'NumberInput'}}
        ]
    }
    codeflash_output = has_chat_output(flow)

# ---------------- EDGE TEST CASES ----------------

def test_node_without_data_key():
    # Node missing 'data' key should not cause error, should be treated as not matching
    flow = {'nodes': [{'foo': 'bar'}]}
    codeflash_output = has_chat_output(flow)

def test_node_data_without_type_key():
    # Node with 'data' but no 'type' key
    flow = {'nodes': [{'data': {'foo': 'bar'}}]}
    codeflash_output = has_chat_output(flow)

def test_node_type_is_none():
    # Node with 'data' and 'type' is None
    flow = {'nodes': [{'data': {'type': None}}]}
    codeflash_output = has_chat_output(flow)

def test_node_type_is_empty_string():
    # Node with 'data' and 'type' is empty string
    flow = {'nodes': [{'data': {'type': ''}}]}
    codeflash_output = has_chat_output(flow)

def test_node_type_case_sensitivity():
    # Should be case sensitive: 'chatoutput' is not 'ChatOutput'
    flow = {'nodes': [{'data': {'type': 'chatoutput'}}]}
    codeflash_output = has_chat_output(flow)





def test_extra_keys_in_node():
    # Node has extra keys, but correct 'type'
    flow = {'nodes': [{'id': 1, 'data': {'type': 'ChatOutput'}, 'extra': 123}]}
    codeflash_output = has_chat_output(flow)



def test_large_number_of_nodes_none_chatoutput():
    # Large number of nodes, none with 'ChatOutput'
    flow = {'nodes': [{'data': {'type': 'TextInput'}} for _ in range(999)]}
    codeflash_output = has_chat_output(flow)

def test_large_number_of_nodes_last_is_chatoutput():
    # Large number of nodes, last one has 'ChatOutput'
    nodes = [{'data': {'type': 'TextInput'}} for _ in range(998)]
    nodes.append({'data': {'type': 'ChatOutput'}})
    flow = {'nodes': nodes}
    codeflash_output = has_chat_output(flow)

def test_large_number_of_nodes_first_is_chat_output_spaced():
    # Large number of nodes, first one has 'Chat Output' (with space)
    nodes = [{'data': {'type': 'Chat Output'}}] + [{'data': {'type': 'TextInput'}} for _ in range(999)]
    flow = {'nodes': nodes}
    codeflash_output = has_chat_output(flow)

def test_large_number_of_nodes_middle_is_chatoutput():
    # Large number of nodes, one in the middle has 'ChatOutput'
    nodes = [{'data': {'type': 'TextInput'}} for _ in range(499)]
    nodes.append({'data': {'type': 'ChatOutput'}})
    nodes += [{'data': {'type': 'TextInput'}} for _ in range(500)]
    flow = {'nodes': nodes}
    codeflash_output = has_chat_output(flow)

def test_large_number_of_nodes_mixed_types():
    # Large number of nodes, some malformed, one correct
    nodes = [{'foo': 'bar'} for _ in range(200)]
    nodes += [{'data': {'foo': 'bar'}} for _ in range(200)]
    nodes += [{'data': {'type': None}} for _ in range(200)]
    nodes += [{'data': {'type': 'ChatOutput'}}]
    nodes += [{'data': {'type': 'TextInput'}} for _ in range(399)]
    flow = {'nodes': nodes}
    codeflash_output = has_chat_output(flow)

def test_large_number_of_nodes_all_malformed():
    # Large number of nodes, all malformed (should return False)
    nodes = [{'foo': 'bar'} for _ in range(1000)]
    flow = {'nodes': nodes}
    codeflash_output = has_chat_output(flow)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import pytest  # used for our unit tests
from langflow.api.v1.openai_responses import has_chat_output

# unit tests

# -----------------------------
# Basic Test Cases
# -----------------------------

def test_none_input_returns_false():
    """Test that None input returns False."""
    codeflash_output = has_chat_output(None)

def test_empty_dict_returns_false():
    """Test that empty dict returns False."""
    codeflash_output = has_chat_output({})

def test_no_nodes_key_returns_false():
    """Test that dict without 'nodes' returns False."""
    codeflash_output = has_chat_output({"foo": "bar"})

def test_empty_nodes_list_returns_false():
    """Test that empty nodes list returns False."""
    codeflash_output = has_chat_output({"nodes": []})

def test_single_node_with_chatoutput_type_returns_true():
    """Test that a single node with type 'ChatOutput' returns True."""
    flow = {"nodes": [{"data": {"type": "ChatOutput"}}]}
    codeflash_output = has_chat_output(flow)

def test_single_node_with_chat_output_type_with_space_returns_true():
    """Test that a single node with type 'Chat Output' returns True."""
    flow = {"nodes": [{"data": {"type": "Chat Output"}}]}
    codeflash_output = has_chat_output(flow)

def test_single_node_with_other_type_returns_false():
    """Test that a single node with a non-chatoutput type returns False."""
    flow = {"nodes": [{"data": {"type": "TextInput"}}]}
    codeflash_output = has_chat_output(flow)

def test_multiple_nodes_one_with_chatoutput_returns_true():
    """Test multiple nodes, one with 'ChatOutput' type returns True."""
    flow = {
        "nodes": [
            {"data": {"type": "TextInput"}},
            {"data": {"type": "ChatOutput"}},
            {"data": {"type": "SomethingElse"}},
        ]
    }
    codeflash_output = has_chat_output(flow)

def test_multiple_nodes_none_with_chatoutput_returns_false():
    """Test multiple nodes, none with 'ChatOutput' type returns False."""
    flow = {
        "nodes": [
            {"data": {"type": "TextInput"}},
            {"data": {"type": "SomethingElse"}},
        ]
    }
    codeflash_output = has_chat_output(flow)

# -----------------------------
# Edge Test Cases
# -----------------------------

def test_node_with_missing_data_key():
    """Test node with missing 'data' key returns False."""
    flow = {"nodes": [{"foo": "bar"}]}
    codeflash_output = has_chat_output(flow)

def test_node_with_data_but_missing_type_key():
    """Test node with 'data' but missing 'type' key returns False."""
    flow = {"nodes": [{"data": {"foo": "bar"}}]}
    codeflash_output = has_chat_output(flow)

def test_node_with_type_none():
    """Test node with 'type' set to None returns False."""
    flow = {"nodes": [{"data": {"type": None}}]}
    codeflash_output = has_chat_output(flow)

def test_node_with_type_empty_string():
    """Test node with 'type' set to empty string returns False."""
    flow = {"nodes": [{"data": {"type": ""}}]}
    codeflash_output = has_chat_output(flow)

def test_node_with_type_case_sensitivity():
    """Test that 'chatoutput' (lowercase) is not accepted (case sensitive)."""
    flow = {"nodes": [{"data": {"type": "chatoutput"}}]}
    codeflash_output = has_chat_output(flow)

def test_node_with_type_with_extra_spaces():
    """Test that ' ChatOutput ' (with spaces) is not accepted."""
    flow = {"nodes": [{"data": {"type": " ChatOutput "}}]}
    codeflash_output = has_chat_output(flow)

def test_node_with_type_similar_but_not_exact():
    """Test that 'ChatOutputX' is not accepted."""
    flow = {"nodes": [{"data": {"type": "ChatOutputX"}}]}
    codeflash_output = has_chat_output(flow)






def test_nodes_list_with_dict_but_no_data_key():
    """Test nodes list with dicts missing 'data' key returns False."""
    flow = {"nodes": [{"foo": "bar"}, {"baz": 1}]}
    codeflash_output = has_chat_output(flow)

def test_nodes_list_with_data_type_is_falsey():
    """Test node with 'type' set to False returns False."""
    flow = {"nodes": [{"data": {"type": False}}]}
    codeflash_output = has_chat_output(flow)

# -----------------------------
# Large Scale Test Cases
# -----------------------------

def test_large_nodes_list_with_no_chatoutput():
    """Test large nodes list (1000) with no ChatOutput returns False."""
    flow = {"nodes": [{"data": {"type": "TextInput"}} for _ in range(1000)]}
    codeflash_output = has_chat_output(flow)

def test_large_nodes_list_with_chatoutput_at_start():
    """Test large nodes list (1000) with ChatOutput as first node returns True."""
    nodes = [{"data": {"type": "ChatOutput"}}] + [{"data": {"type": "TextInput"}} for _ in range(999)]
    flow = {"nodes": nodes}
    codeflash_output = has_chat_output(flow)

def test_large_nodes_list_with_chatoutput_at_end():
    """Test large nodes list (1000) with ChatOutput as last node returns True."""
    nodes = [{"data": {"type": "TextInput"}} for _ in range(999)] + [{"data": {"type": "ChatOutput"}}]
    flow = {"nodes": nodes}
    codeflash_output = has_chat_output(flow)

def test_large_nodes_list_with_multiple_chatoutput_types():
    """Test large nodes list with both 'ChatOutput' and 'Chat Output' types returns True."""
    nodes = (
        [{"data": {"type": "TextInput"}} for _ in range(500)] +
        [{"data": {"type": "Chat Output"}}] +
        [{"data": {"type": "TextInput"}} for _ in range(499)]
    )
    flow = {"nodes": nodes}
    codeflash_output = has_chat_output(flow)



def test_large_nodes_list_with_multiple_chatoutputs():
    """Test large nodes list with multiple ChatOutput nodes returns True."""
    nodes = [{"data": {"type": "ChatOutput"}} if i % 100 == 0 else {"data": {"type": "TextInput"}} for i in range(1000)]
    flow = {"nodes": nodes}
    codeflash_output = has_chat_output(flow)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
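
The generated tests above only capture return values so that the original and optimized code can be compared. If explicit expectations are wanted, the comments and docstrings suggest a table like the one below. This is a sketch: has_chat_output here is a local stand-in written from the PR description, not an import from langflow, and CASES is a hypothetical name.

```python
def has_chat_output(flow_data):
    """Local stand-in (assumption) for the function under test."""
    if not flow_data:
        return False
    for node in flow_data.get("nodes", []):
        data = node.get("data")
        if data and data.get("type") in {"ChatOutput", "Chat Output"}:
            return True
    return False


# (input, expected result) pairs taken from the test comments above.
CASES = [
    (None, False),
    ({}, False),
    ({"foo": "bar"}, False),
    ({"nodes": []}, False),
    ({"nodes": [{"data": {"type": "ChatOutput"}}]}, True),
    ({"nodes": [{"data": {"type": "Chat Output"}}]}, True),
    ({"nodes": [{"data": {"type": "TextInput"}}]}, False),
    ({"nodes": [{"foo": "bar"}]}, False),
    ({"nodes": [{"data": {"foo": "bar"}}]}, False),
    ({"nodes": [{"data": {"type": None}}]}, False),
    ({"nodes": [{"data": {"type": "chatoutput"}}]}, False),
    ({"nodes": [{"data": {"type": " ChatOutput "}}]}, False),
]

for flow, expected in CASES:
    assert has_chat_output(flow) is expected, flow
```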

To edit these changes, run git checkout codeflash/optimize-pr9069-2025-08-26T16.41.19, commit your edits, and push.

Codeflash

phact and others added 30 commits July 11, 2025 15:15
…ity and consistency. Replace print statements with appropriate logging levels, enhancing error handling and debugging capabilities.
…ility. Enhance error messaging in `run_flow_for_openai_responses` for clarity. Update response yielding format for better readability. Add noqa comments for linting compliance.
… handling

Extend integration test coverage with new test file containing validation for empty inputs, invalid models, tools parameter rejection, timeout scenarios, and concurrent request handling. Update existing integration tests to improve error handling and response validation.
phact and others added 21 commits August 12, 2025 00:46
fix: specify type for tool_calls in openai_responses.py

Updated the type annotation for the tool_calls variable to explicitly define it as a list of dictionaries with string keys and Any values, enhancing type safety and clarity in the code.

Co-authored-by: Sebastián Estévez <estevezsebastian@gmail.com>
…mpatibility`)

@codeflash-ai codeflash-ai Bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Aug 26, 2025
Contributor

coderabbitai Bot commented Aug 26, 2025

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.




Base automatically changed from openai-compatibility to main August 27, 2025 02:00
@codeflash-ai codeflash-ai Bot closed this Aug 27, 2025
Contributor Author

codeflash-ai Bot commented Aug 27, 2025

This PR has been automatically closed because the original PR #9069 by phact was closed.

@codeflash-ai codeflash-ai Bot deleted the codeflash/optimize-pr9069-2025-08-26T16.41.19 branch August 27, 2025 02:00