⚡️ Speed up function _args_reference_urls by 23% in PR #10934 (feat/http-stream-mcp-1.7.0) #10935

Closed
codeflash-ai[bot] wants to merge 3 commits into release-1.7.0 from codeflash/optimize-pr10934-2025-12-08T20.04.24

@codeflash-ai codeflash-ai Bot commented Dec 8, 2025

⚡️ This pull request contains optimizations for PR #10934

If you approve this dependent PR, these changes will be merged into the original PR branch feat/http-stream-mcp-1.7.0.

This PR will be automatically closed if the original PR is merged.


📄 23% (0.23x) speedup for _args_reference_urls in src/backend/base/langflow/api/v1/mcp_projects.py

⏱️ Runtime : 1.00 millisecond → 812 microseconds (best of 5 runs)

📝 Explanation and details

The optimization transforms a set-intersection approach into an early-exit loop that converts the urls list to a set for O(1) lookups. This yields a 23% speedup.

Key Changes:

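  1. Pre-convert URLs to set: urls_set = set(urls) gives O(1) average-case membership tests for each arg. The original set intersection was already hash-based, so this step keeps lookups cheap rather than replacing O(n) list scans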
  2. Early exit strategy: The loop returns True immediately upon finding the first match, avoiding unnecessary processing
  3. Eliminate set comprehension: Removes the overhead of building a complete set from filtered args before intersection
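
The two approaches can be sketched as follows. This is a reconstruction from the description above, not the actual langflow source: the function names `args_reference_urls_original` and `args_reference_urls_optimized`, the type hints, and the empty-input guard are all assumptions made for illustration.

```python
from __future__ import annotations

from collections.abc import Sequence
from typing import Any


def args_reference_urls_original(args: Sequence[Any] | None, urls: list[str]) -> bool:
    """Set-intersection form (hypothetical reconstruction)."""
    # Assumed guard: the generated tests pass args=None and empty lists.
    if not args or not urls:
        return False
    # Builds a complete set of the string args before comparing anything.
    string_args = {arg for arg in args if isinstance(arg, str)}
    return bool(string_args & set(urls))


def args_reference_urls_optimized(args: Sequence[Any] | None, urls: list[str]) -> bool:
    """Early-exit loop form (hypothetical reconstruction)."""
    if not args or not urls:
        return False
    urls_set = set(urls)  # built once per call
    for arg in args:
        # Non-string args are skipped; the first exact match ends the loop.
        if isinstance(arg, str) and arg in urls_set:
            return True
    return False
```

Both forms use exact, case-sensitive string equality, so they should agree on every input whose `urls` elements are hashable.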

Why This Is Faster:

  • Original approach: Creates a set comprehension of all string args, then performs set intersection - always processes all args regardless of early matches
  • Optimized approach: Stops at the first match and benefits from faster set lookups, especially effective when matches occur early in the sequence

Performance Characteristics:

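  • Best case (early match): Returns after the first matching arg, so the remaining args are never examined
  • Worst case (no matches): Every arg is still checked once against urls_set at O(1) average cost; the saving over the original is the intermediate set of string args that no longer has to be built
  • Large datasets: The set conversion (the line accounting for 2.7% of profiled time) is paid once per call and amortized across all subsequent lookups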
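
The best-case and worst-case behaviour can be made visible without timing anything, by counting how many args each approach reads. The sketch below is illustrative only: `CountingArgs`, `set_intersection_match`, and `early_exit_match` are hypothetical names, and the two functions are stand-ins written from the description above rather than the real implementation.

```python
from collections.abc import Sequence


class CountingArgs(Sequence):
    """Sequence wrapper that counts how many elements iteration hands out."""

    def __init__(self, items):
        self._items = list(items)
        self.reads = 0

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __iter__(self):
        for item in self._items:
            self.reads += 1
            yield item


def set_intersection_match(args, urls):
    # Stand-in for the original approach: consumes every arg up front.
    if not args or not urls:
        return False
    return bool({a for a in args if isinstance(a, str)} & set(urls))


def early_exit_match(args, urls):
    # Stand-in for the optimized approach: stops at the first match.
    if not args or not urls:
        return False
    urls_set = set(urls)
    for a in args:
        if isinstance(a, str) and a in urls_set:
            return True
    return False


items = ["http://a.com"] + ["x"] * 999

eager = CountingArgs(items)
set_intersection_match(eager, ["http://a.com"])
print(eager.reads)  # 1000: every arg is read even though the first one matches

lazy = CountingArgs(items)
early_exit_match(lazy, ["http://a.com"])
print(lazy.reads)  # 1: the loop returns on the first arg

miss = CountingArgs(["x"] * 1000)
early_exit_match(miss, ["http://a.com"])
print(miss.reads)  # 1000: with no match, every arg is still checked
```

Read counts only show where early exit can help; they say nothing about wall-clock time, which also depends on the cost of building each set.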

Test Case Analysis:
Early termination helps most in cases like test_multiple_args_one_match(), where the match is the second of three args and the loop stops before reaching the last one. In test_large_args_and_urls_with_match() the matching element is the last arg, so the loop scans the whole list and any gain there comes from skipping the intermediate set rather than from early exit. With no matches at all, lookups remain hash-based as in the original, and the difference is again the avoided set construction.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 60 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
from collections.abc import Sequence
from typing import Any

# imports
import pytest  # used for our unit tests
from langflow.api.v1.mcp_projects import _args_reference_urls

# unit tests

# ---------------------------
# 1. Basic Test Cases
# ---------------------------

def test_basic_match_single_url():
    # Args contains a single string that matches a URL
    args = ["https://example.com"]
    urls = ["https://example.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_basic_no_match_single_url():
    # Args contains a single string that does not match any URL
    args = ["https://notfound.com"]
    urls = ["https://example.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_basic_multiple_args_one_match():
    # Args contains multiple strings, one matches a URL
    args = ["foo", "bar", "https://example.com"]
    urls = ["https://example.com", "https://other.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_basic_multiple_args_no_match():
    # Args contains multiple strings, none match any URL
    args = ["foo", "bar", "baz"]
    urls = ["https://example.com", "https://other.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_basic_multiple_urls_one_match():
    # Args contains a string that matches one of several URLs
    args = ["https://other.com"]
    urls = ["https://example.com", "https://other.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_basic_multiple_matches():
    # Args contains multiple strings that match multiple URLs
    args = ["https://example.com", "https://other.com", "foo"]
    urls = ["https://example.com", "https://other.com"]
    codeflash_output = _args_reference_urls(args, urls)

# ---------------------------
# 2. Edge Test Cases
# ---------------------------

def test_edge_args_none():
    # Args is None
    args = None
    urls = ["https://example.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_edge_args_empty():
    # Args is empty list
    args = []
    urls = ["https://example.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_edge_urls_empty():
    # URLs is empty list
    args = ["https://example.com"]
    urls = []
    codeflash_output = _args_reference_urls(args, urls)

def test_edge_both_empty():
    # Both args and urls are empty
    args = []
    urls = []
    codeflash_output = _args_reference_urls(args, urls)

def test_edge_args_contains_non_str_types():
    # Args contains non-string types
    args = ["https://example.com", 123, None, {"url": "https://example.com"}, ["https://example.com"]]
    urls = ["https://example.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_edge_urls_contains_non_str_types():
    # URLs contains non-string types, should only match strings in urls
    args = ["https://example.com"]
    urls = ["https://example.com", 123, None]
    codeflash_output = _args_reference_urls(args, urls)

def test_edge_args_and_urls_contain_non_str_types():
    # Both args and urls contain non-string types
    args = ["https://example.com", 123, None]
    urls = ["https://example.com", 456, None]
    codeflash_output = _args_reference_urls(args, urls)

def test_edge_args_contains_duplicates():
    # Args contains duplicate matching strings
    args = ["https://example.com", "https://example.com", "foo"]
    urls = ["https://example.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_edge_urls_contains_duplicates():
    # URLs contains duplicate matching strings
    args = ["https://example.com"]
    urls = ["https://example.com", "https://example.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_edge_case_sensitive():
    # Matching should be case-sensitive
    args = ["https://EXAMPLE.com"]
    urls = ["https://example.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_edge_args_contains_empty_string():
    # Args contains empty string
    args = [""]
    urls = ["https://example.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_edge_urls_contains_empty_string():
    # URLs contains empty string
    args = ["https://example.com"]
    urls = [""]
    codeflash_output = _args_reference_urls(args, urls)

def test_edge_args_is_tuple():
    # Args is a tuple instead of a list
    args = ("https://example.com", "foo")
    urls = ["https://example.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_edge_urls_is_tuple():
    # URLs is a tuple instead of a list
    args = ["https://example.com"]
    urls = ("https://example.com",)
    codeflash_output = _args_reference_urls(args, list(urls))  # function expects list[str]


def test_edge_args_contains_whitespace_strings():
    # Args contains strings with only whitespace
    args = [" ", "\t", "\n"]
    urls = ["https://example.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_edge_urls_contains_whitespace_strings():
    # URLs contains strings with only whitespace
    args = ["https://example.com"]
    urls = [" ", "\t", "\n"]
    codeflash_output = _args_reference_urls(args, urls)

def test_edge_args_contains_url_substring():
    # Args contains substring of a URL, should not match
    args = ["example.com"]
    urls = ["https://example.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_edge_args_contains_url_with_trailing_slash():
    # Args contains URL with trailing slash, should not match if URLs does not have slash
    args = ["https://example.com/"]
    urls = ["https://example.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_edge_args_contains_url_with_leading_whitespace():
    # Args contains URL with leading/trailing whitespace, should not match
    args = [" https://example.com "]
    urls = ["https://example.com"]
    codeflash_output = _args_reference_urls(args, urls)

# ---------------------------
# 3. Large Scale Test Cases
# ---------------------------

def test_large_args_many_elements_one_match():
    # Args contains 1000 elements, only one matches
    args = ["foo"] * 999 + ["https://example.com"]
    urls = ["https://example.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_large_args_many_elements_no_match():
    # Args contains 1000 elements, none match
    args = ["foo"] * 1000
    urls = ["https://example.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_large_urls_many_elements_one_match():
    # URLs contains 1000 elements, only one matches
    args = ["https://special.com"]
    urls = ["https://example.com"] * 999 + ["https://special.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_large_urls_many_elements_no_match():
    # URLs contains 1000 elements, none match
    args = ["https://notfound.com"]
    urls = ["https://example.com"] * 1000
    codeflash_output = _args_reference_urls(args, urls)

def test_large_args_and_urls_many_elements_one_match():
    # Both args and urls contain 1000 elements, one match
    args = ["foo"] * 999 + ["https://example.com"]
    urls = ["bar"] * 999 + ["https://example.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_large_args_and_urls_many_elements_no_match():
    # Both args and urls contain 1000 elements, no match
    args = ["foo"] * 1000
    urls = ["bar"] * 1000
    codeflash_output = _args_reference_urls(args, urls)

def test_large_args_and_urls_all_match():
    # All args and urls are the same, all match
    args = ["https://example.com"] * 1000
    urls = ["https://example.com"] * 1000
    codeflash_output = _args_reference_urls(args, urls)



#------------------------------------------------
from collections.abc import Sequence
from typing import Any

# imports
import pytest  # used for our unit tests
from langflow.api.v1.mcp_projects import _args_reference_urls

# unit tests

# ---------------------------
# Basic Test Cases
# ---------------------------

def test_no_args_and_no_urls():
    # Both args and urls are empty
    codeflash_output = _args_reference_urls([], [])

def test_args_none_urls_empty():
    # Args is None, urls is empty
    codeflash_output = _args_reference_urls(None, [])

def test_args_empty_urls_nonempty():
    # Args is empty, urls has elements
    codeflash_output = _args_reference_urls([], ["http://a.com"])

def test_args_nonempty_urls_empty():
    # Args has elements, urls is empty
    codeflash_output = _args_reference_urls(["http://a.com"], [])

def test_single_match():
    # One arg matches one url
    codeflash_output = _args_reference_urls(["http://a.com"], ["http://a.com"])

def test_single_no_match():
    # One arg does not match url
    codeflash_output = _args_reference_urls(["http://b.com"], ["http://a.com"])

def test_multiple_args_one_match():
    # Multiple args, one matches
    codeflash_output = _args_reference_urls(["x", "http://a.com", "y"], ["http://a.com"])

def test_multiple_args_no_match():
    # Multiple args, none match
    codeflash_output = _args_reference_urls(["x", "y", "z"], ["http://a.com"])

def test_multiple_urls_one_match():
    # Multiple urls, one matches
    codeflash_output = _args_reference_urls(["http://b.com"], ["http://a.com", "http://b.com", "http://c.com"])

def test_multiple_urls_no_match():
    # Multiple urls, none match
    codeflash_output = _args_reference_urls(["x", "y"], ["http://a.com", "http://b.com"])

def test_multiple_args_multiple_urls_multiple_matches():
    # Multiple args and urls, multiple matches
    codeflash_output = _args_reference_urls(
        ["http://a.com", "http://b.com", "foo"],
        ["http://a.com", "http://b.com", "http://c.com"]
    )

# ---------------------------
# Edge Test Cases
# ---------------------------

def test_args_with_non_string_types():
    # Args contains non-string types; only strings should be considered
    codeflash_output = _args_reference_urls(
        ["http://a.com", 123, None, {"url": "http://b.com"}],
        ["http://a.com", "http://b.com"]
    )

def test_args_with_all_non_string_types():
    # All args are non-string types; should not match
    codeflash_output = _args_reference_urls(
        [123, 456, None, {"url": "http://a.com"}],
        ["http://a.com"]
    )

def test_urls_with_duplicates():
    # URLs list has duplicates; should not affect result
    codeflash_output = _args_reference_urls(
        ["http://a.com"],
        ["http://a.com", "http://a.com"]
    )

def test_args_with_duplicates():
    # Args list has duplicates; should not affect result
    codeflash_output = _args_reference_urls(
        ["http://a.com", "http://a.com"],
        ["http://a.com"]
    )

def test_case_sensitivity():
    # URLs are case sensitive
    codeflash_output = _args_reference_urls(
        ["HTTP://A.COM"],
        ["http://a.com"]
    )

def test_partial_match():
    # Partial string match should not count
    codeflash_output = _args_reference_urls(
        ["http://a.com/extra"],
        ["http://a.com"]
    )

def test_args_sequence_type_tuple():
    # Args as a tuple, not a list
    codeflash_output = _args_reference_urls(
        ("http://a.com", "http://b.com"),
        ["http://b.com"]
    )


def test_urls_with_empty_string():
    # URLs list contains an empty string
    codeflash_output = _args_reference_urls(
        ["", "foo"],
        [""]
    )

def test_args_with_empty_string():
    # Args list contains an empty string, urls does not
    codeflash_output = _args_reference_urls(
        [""],
        ["http://a.com"]
    )

def test_args_none_urls_nonempty():
    # Args is None, urls is non-empty
    codeflash_output = _args_reference_urls(
        None,
        ["http://a.com"]
    )

def test_args_with_whitespace_strings():
    # Args contains whitespace-only strings
    codeflash_output = _args_reference_urls(
        ["  ", "\t", "\n"],
        ["  "]
    )

def test_urls_with_whitespace_strings():
    # URLs contains whitespace-only strings
    codeflash_output = _args_reference_urls(
        ["foo", "bar"],
        ["  ", "\t", "\n"]
    )

# ---------------------------
# Large Scale Test Cases
# ---------------------------

def test_large_args_and_urls_with_match():
    # Large lists, with one matching element at the end
    args = [f"url_{i}" for i in range(999)] + ["http://special.com"]
    urls = [f"http://site{i}.com" for i in range(999)] + ["http://special.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_large_args_and_urls_no_match():
    # Large lists, no matches
    args = [f"url_{i}" for i in range(1000)]
    urls = [f"http://site{i}.com" for i in range(1000)]
    codeflash_output = _args_reference_urls(args, urls)

def test_large_args_some_non_strings_with_match():
    # Large args, some are not strings, one string matches
    args = [123] * 500 + ["http://match.com"] + [None] * 499
    urls = ["http://match.com"]
    codeflash_output = _args_reference_urls(args, urls)

def test_large_args_and_urls_all_match():
    # Every arg is in urls
    args = [f"http://site{i}.com" for i in range(1000)]
    urls = [f"http://site{i}.com" for i in range(1000)]
    codeflash_output = _args_reference_urls(args, urls)

def test_large_args_and_urls_with_duplicates():
    # Large lists with many duplicates, one match
    args = ["http://a.com"] * 500 + ["foo"] * 500
    urls = ["http://a.com"] * 999
    codeflash_output = _args_reference_urls(args, urls)

To edit these changes git checkout codeflash/optimize-pr10934-2025-12-08T20.04.24 and push.

HzaRashid and others added 3 commits December 8, 2025 19:36
@codeflash-ai codeflash-ai Bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Dec 8, 2025
@github-actions github-actions Bot added the community Pull Request from an external contributor label Dec 8, 2025
coderabbitai Bot commented Dec 8, 2025

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Comment @coderabbitai help to get the list of available commands and usage tips.

codecov Bot commented Dec 8, 2025

Codecov Report

❌ Patch coverage is 68.31169% with 122 lines in your changes missing coverage. Please review.
✅ Project coverage is 33.06%. Comparing base (1174a6a) to head (c9c0d18).
⚠️ Report is 53 commits behind head on release-1.7.0.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/backend/base/langflow/api/v1/mcp_projects.py | 69.43% | 59 Missing ⚠️ |
| ...ackend/base/langflow/api/utils/mcp/config_utils.py | 36.11% | 23 Missing ⚠️ |
| src/backend/base/langflow/api/v1/mcp.py | 83.56% | 12 Missing ⚠️ |
| ...frontend/src/customization/utils/custom-mcp-url.ts | 0.00% | 12 Missing ⚠️ |
| src/backend/base/langflow/api/v1/mcp_utils.py | 76.19% | 5 Missing ⚠️ |
| ...ages/MainPage/pages/homePage/hooks/useMcpServer.ts | 50.00% | 0 Missing and 5 partials ⚠️ |
| src/backend/base/langflow/main.py | 73.33% | 4 Missing ⚠️ |
| src/backend/base/langflow/api/v1/projects.py | 50.00% | 1 Missing ⚠️ |
| ...ontrollers/API/queries/mcp/use-get-composer-url.ts | 0.00% | 1 Missing ⚠️ |

❌ Your project status has failed because the head coverage (40.03%) is below the target coverage (60.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

@@                Coverage Diff                @@
##           release-1.7.0   #10935      +/-   ##
=================================================
+ Coverage          32.43%   33.06%   +0.62%     
=================================================
  Files               1367     1368       +1     
  Lines              63315    63807     +492     
  Branches            9357     9388      +31     
=================================================
+ Hits               20538    21098     +560     
+ Misses             41744    41666      -78     
- Partials            1033     1043      +10     
| Flag | Coverage Δ |
|---|---|
| backend | 52.80% <70.02%> (+1.56%) ⬆️ |
| lfx | 40.03% <100.00%> (-0.01%) ⬇️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
|---|---|
| src/backend/base/langflow/api/v1/schemas.py | 96.15% <100.00%> (+0.10%) ⬆️ |
| ...controllers/API/queries/mcp/use-patch-flows-mcp.ts | 0.00% <ø> (ø) |
| ...ntrollers/API/queries/mcp/use-patch-install-mcp.ts | 0.00% <ø> (ø) |
| ...ages/homePage/components/McpAutoInstallContent.tsx | 80.00% <100.00%> (ø) |
| ...nPage/pages/homePage/components/McpJsonContent.tsx | 84.44% <100.00%> (+1.11%) ⬆️ |
| ...ainPage/pages/homePage/components/McpServerTab.tsx | 91.80% <100.00%> (+0.27%) ⬆️ |
| ...s/MainPage/pages/homePage/utils/mcpServerUtils.tsx | 93.44% <100.00%> (+0.33%) ⬆️ |
| src/lfx/src/lfx/services/mcp_composer/service.py | 57.68% <100.00%> (+0.12%) ⬆️ |
| src/backend/base/langflow/api/v1/projects.py | 29.68% <50.00%> (ø) |
| ...ontrollers/API/queries/mcp/use-get-composer-url.ts | 0.00% <0.00%> (ø) |
... and 7 more

... and 37 files with indirect coverage changes

Base automatically changed from feat/http-stream-mcp-1.7.0 to release-1.7.0 December 8, 2025 21:02
@ogabrielluiz
Contributor

Closing automated codeflash PR.

@codeflash-ai codeflash-ai Bot deleted the codeflash/optimize-pr10934-2025-12-08T20.04.24 branch March 3, 2026 18:15