⚡️ Speed up function get_email_model by 106% in PR #10432 (mp_LFOSS-2632_telemetry)#10556

Closed
codeflash-ai[bot] wants to merge 9 commits into main from codeflash/optimize-pr10432-2025-11-11T03.05.05

@codeflash-ai codeflash-ai Bot commented Nov 11, 2025

⚡️ This pull request contains optimizations for PR #10432

If you approve this dependent PR, these changes will be merged into the original PR branch mp_LFOSS-2632_telemetry.

This PR will be automatically closed if the original PR is merged.


📄 106% (1.06x) speedup for get_email_model in src/backend/base/langflow/utils/registered_email_util.py

⏱️ Runtime: 339 microseconds → 164 microseconds (best of 39 runs)
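
As a sanity check, the headline figure appears to follow the usual relative-speedup formula, (old - new) / new. The sketch below applies it to the rounded runtimes reported above; the function name `speedup_percent` is illustrative only and is not part of Codeflash or Langflow. The small gap to the reported 106% presumably comes from rounding or from unrounded timings.

```python
def speedup_percent(old_runtime: float, new_runtime: float) -> float:
    """Relative speedup as a percentage: (old - new) / new * 100."""
    if new_runtime <= 0:
        raise ValueError("new_runtime must be positive")
    return (old_runtime - new_runtime) / new_runtime * 100.0


# Rounded runtimes taken from the report, in microseconds.
old_us, new_us = 339.0, 164.0

print(f"{speedup_percent(old_us, new_us):.1f}% faster")  # 106.7% faster
print(f"{old_us / new_us:.2f}x ratio of old to new runtime")  # 2.07x
```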

📝 Explanation and details

The optimization achieves a 106% speedup by eliminating expensive method call overhead in the hot path through direct attribute access.

Key optimizations applied:

  1. Explicit class variable initialization: Added _email_model = None and _resolved = False as class attributes to avoid Python's costly attribute resolution on first access.

  2. Direct attribute access optimization: Replaced expensive method calls _RegisteredEmailCache.get_email_model() and _RegisteredEmailCache.is_resolved() with direct class attribute reads (_cache._email_model, _cache._resolved) after storing the class reference in a local variable _cache.

  3. Streamlined control flow: Changed if email: to if email is not None: for more explicit null checking and removed intermediate variable assignment in _parse_email_registration.

Why this leads to speedup:

The line profiler shows that the original code spent most of its time in method calls: get_email_model() took 4.84 ms and is_resolved() took 4.64 ms out of 10.5 ms total. The optimized version reduces the main function time to 2.26 ms by eliminating this method call overhead. In Python, a method call involves attribute lookup, bound method creation, and function call overhead, which adds up when executed repeatedly.
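
One way to observe this kind of overhead locally is a small `timeit` comparison. The sketch below is an assumption-laden stand-in (the `_Cache` class and both helper functions are hypothetical), and absolute timings will vary by machine and Python version, so no expected numbers are given.

```python
import timeit


class _Cache:
    """Hypothetical cache with a single resolved flag."""

    _resolved = True

    @classmethod
    def is_resolved(cls) -> bool:
        return cls._resolved


def via_method() -> bool:
    # Attribute lookup + bound method creation + an extra function call.
    return _Cache.is_resolved()


def via_attribute() -> bool:
    # A single attribute lookup on the class.
    return _Cache._resolved


def compare(number: int = 200_000) -> dict:
    """Return total seconds spent in `number` calls of each access style."""
    return {
        "method": timeit.timeit(via_method, number=number),
        "attribute": timeit.timeit(via_attribute, number=number),
    }


if __name__ == "__main__":
    print(compare())
```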

Performance characteristics:

  • Cache hits (most common case after first call): ~4.4x faster due to direct attribute access
  • Cold path (first call): Similar performance as the file I/O and parsing dominate
  • Large-scale workloads: The test with 500 repeated calls shows this optimization is particularly effective for high-frequency access patterns, making it ideal for telemetry or configuration systems that check email status frequently.
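
The split between cold path and cache hits can be illustrated with a resolve-once cache that counts loader calls. This is a sketch under assumptions: `ResolveOnceCache` and its `loader_calls` counter are hypothetical names for the example, and the real utility's error handling is not reproduced here.

```python
from typing import Callable, Optional


class ResolveOnceCache:
    """Resolve a value at most once and cache the result, even when it is None."""

    def __init__(self, loader: Callable[[], Optional[str]]) -> None:
        self._loader = loader
        self._value: Optional[str] = None
        self._resolved = False
        self.loader_calls = 0  # instrumentation for this example only

    def get(self) -> Optional[str]:
        if self._resolved:
            # Cache hit: two attribute reads, no loader call.
            return self._value
        # Cold path: the loader (file I/O and parsing in the real code) dominates.
        self.loader_calls += 1
        self._value = self._loader()
        self._resolved = True
        return self._value


cache = ResolveOnceCache(lambda: "bulk@example.com")
cache.get()               # first call resolves the value
for _ in range(500):      # mirrors the 500 repeated calls in the generated tests
    cache.get()
print(cache.loader_calls)  # 1
```

Tracking a separate resolved flag, rather than testing the cached value for `None`, is what lets a missing or invalid registration be cached as well.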

The optimization preserves all functionality while dramatically improving performance for cached lookups.

Correctness verification report:

Test | Status
⚙️ Existing Unit Tests | 🔘 None Found
🌀 Generated Regression Tests | 1030 Passed
⏪ Replay Tests | 🔘 None Found
🔎 Concolic Coverage Tests | 🔘 None Found
📊 Tests Coverage | 100.0%
🌀 Generated Regression Tests and Runtime

import sys
import types

# imports
import pytest
from langflow.utils.registered_email_util import get_email_model

# --- Minimal fakes for dependencies ---


# Fake EmailPayload class with validation
class EmailPayload:
    def __init__(self, email):
        if not isinstance(email, str) or "@" not in email or email.startswith("@") or email.endswith("@") or email.count("@") != 1 or email.strip() != email or len(email) == 0:
            raise ValueError("Invalid email")
        self.email = email

    def __eq__(self, other):
        return isinstance(other, EmailPayload) and self.email == other.email

    def __repr__(self):
        return f"EmailPayload(email={self.email!r})"


def patch_load_registration(monkeypatch, return_value=None, side_effect=None):
    """Patch load_registration to return a value or raise an exception."""

    def _fake_load_registration():
        if side_effect:
            raise side_effect
        return return_value

    monkeypatch.setattr(
        "langflow.api.v2.registration.load_registration",
        _fake_load_registration,
        raising=True,
    )

# --- Basic Test Cases ---


def test_valid_email(monkeypatch):
    """Basic: Registration file returns a valid email."""
    patch_load_registration(monkeypatch, return_value={"email": "user@example.com"})
    codeflash_output = get_email_model(); result = codeflash_output


def test_valid_email_with_cache(monkeypatch):
    """Basic: Cached email is returned without calling load_registration again."""
    patch_load_registration(monkeypatch, return_value={"email": "cached@example.com"})
    # First call populates cache
    codeflash_output = get_email_model(); result1 = codeflash_output
    # Patch to return a different value, but cache should be used
    patch_load_registration(monkeypatch, return_value={"email": "other@example.com"})
    codeflash_output = get_email_model(); result2 = codeflash_output


def test_none_registration(monkeypatch):
    """Basic: Registration file returns None (no registration file)."""
    patch_load_registration(monkeypatch, return_value=None)
    codeflash_output = get_email_model(); result = codeflash_output


def test_empty_email(monkeypatch):
    """Basic: Registration file returns empty string for email."""
    patch_load_registration(monkeypatch, return_value={"email": ""})
    codeflash_output = get_email_model(); result = codeflash_output


def test_email_missing(monkeypatch):
    """Basic: Registration file returns dict without 'email' key."""
    patch_load_registration(monkeypatch, return_value={"username": "foo"})
    codeflash_output = get_email_model(); result = codeflash_output

# --- Edge Test Cases ---


@pytest.mark.parametrize("bad_email", [
    "noatsymbol",
    "foo@",
    "@bar.com",
    "foo@@bar.com",
    "foo@bar@baz.com",
    " foo@bar.com",
    "foo@bar.com ",
    "\nfoo@bar.com",
    "foo@bar.com\n",
    None,
    123,
    [],
    {},
])
def test_invalid_email_syntax(monkeypatch, bad_email):
    """Edge: Registration file returns syntactically invalid email."""
    patch_load_registration(monkeypatch, return_value={"email": bad_email})
    codeflash_output = get_email_model(); result = codeflash_output


def test_registration_not_dict(monkeypatch):
    """Edge: Registration file returns a non-dict (e.g., string)."""
    patch_load_registration(monkeypatch, return_value="not a dict")
    codeflash_output = get_email_model(); result = codeflash_output


def test_registration_dict_email_none(monkeypatch):
    """Edge: Registration file returns dict with email=None."""
    patch_load_registration(monkeypatch, return_value={"email": None})
    codeflash_output = get_email_model(); result = codeflash_output


def test_load_registration_oserror(monkeypatch):
    """Edge: load_registration raises OSError."""
    patch_load_registration(monkeypatch, side_effect=OSError("file error"))
    codeflash_output = get_email_model(); result = codeflash_output


def test_load_registration_unicode_error(monkeypatch):
    """Edge: load_registration raises UnicodeDecodeError."""
    patch_load_registration(monkeypatch, side_effect=UnicodeDecodeError("utf-8", b"x", 0, 1, "bad"))
    codeflash_output = get_email_model(); result = codeflash_output


def test_load_registration_attribute_error(monkeypatch):
    """Edge: load_registration raises AttributeError."""
    patch_load_registration(monkeypatch, side_effect=AttributeError("missing"))
    codeflash_output = get_email_model(); result = codeflash_output


def test_cache_resolved_no_email(monkeypatch):
    """Edge: Once resolved, get_email_model returns None without calling load_registration again."""
    patch_load_registration(monkeypatch, return_value=None)
    # First call sets resolved
    codeflash_output = get_email_model(); result1 = codeflash_output
    # Patch to return a valid email, but should not be called
    patch_load_registration(monkeypatch, return_value={"email": "shouldnotbeused@example.com"})
    codeflash_output = get_email_model(); result2 = codeflash_output


def test_cache_resolved_with_invalid_email(monkeypatch):
    """Edge: Once resolved with invalid email, get_email_model returns None from cache."""
    patch_load_registration(monkeypatch, return_value={"email": "noatsymbol"})
    codeflash_output = get_email_model(); result1 = codeflash_output
    # Patch to return a valid email, but should not be called
    patch_load_registration(monkeypatch, return_value={"email": "shouldnotbeused@example.com"})
    codeflash_output = get_email_model(); result2 = codeflash_output

# --- Large Scale Test Cases ---


def test_large_number_of_calls_same_email(monkeypatch):
    """Large Scale: Many calls to get_email_model with same valid email, cache is used."""
    patch_load_registration(monkeypatch, return_value={"email": "bulk@example.com"})
    codeflash_output = get_email_model(); first = codeflash_output
    for _ in range(500):
        codeflash_output = get_email_model(); result = codeflash_output


def test_large_number_of_calls_with_invalid(monkeypatch):
    """Large Scale: Many calls to get_email_model with invalid email, always returns None, cache is used."""
    patch_load_registration(monkeypatch, return_value={"email": "bademail"})
    codeflash_output = get_email_model(); first = codeflash_output
    for _ in range(500):
        codeflash_output = get_email_model(); result = codeflash_output

#------------------------------------------------
import sys
import types

# imports
import pytest
from langflow.utils.registered_email_util import get_email_model

# --- Minimal stubs for dependencies (EmailPayload, logger, load_registration) ---


class EmailPayload:
    """Minimal EmailPayload stub with validation."""

    def __init__(self, email):
        if not isinstance(email, str) or "@" not in email or "." not in email.split("@")[-1]:
            raise ValueError("Invalid email address")
        self.email = email

    def __eq__(self, other):
        return isinstance(other, EmailPayload) and self.email == other.email

    def __repr__(self):
        return f"EmailPayload(email={self.email!r})"


class DummyLogger:
    def __init__(self):
        self.errors = []

    def error(self, msg):
        self.errors.append(msg)


# Patch logger and load_registration for the test module
dummy_logger = DummyLogger()


def dummy_load_registration_func_factory(return_value=None, raise_exc=None):
    def _func():
        if raise_exc:
            raise raise_exc
        return return_value

    return _func


# --- Basic Test Cases ---

To edit these changes, run git checkout codeflash/optimize-pr10432-2025-11-11T03.05.05 and push.


mpawlow and others added 9 commits November 10, 2025 13:40
… email address

- Temporarily add hardcoded registered email address to common SCARF telemetry events
… email address

- Only send registered email address if it's defined and the context is Langflow Desktop
… email address

- Implement a utility to load and fetch the registered email address
- Bootstrap common telemetry fields with the registered email address (only in the Langflow Desktop context)
… email address

- Lazy load registered email address
- Cache loaded registered email address
- Adopt latest email registration storage format / schema
…ess #10432

- Create a new telemetry schema for the registered email address
- Implement utility methods to send new telemetry event for the registered email address
- Send new telemetry event for the registered email address on telemetry service start-up lifecycle method
- Code clean-up, commenting & refactoring
- Implement 4 new test scenarios for testing the new Email payload schema
- Fix all pre-existing style errors
@codeflash-ai codeflash-ai Bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Nov 11, 2025
@github-actions github-actions Bot added the community Pull Request from an external contributor label Nov 11, 2025

coderabbitai Bot commented Nov 11, 2025

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Comment @coderabbitai help to get the list of available commands and usage tips.


codecov Bot commented Nov 11, 2025

Codecov Report

❌ Patch coverage is 34.17722% with 52 lines in your changes missing coverage. Please review.
✅ Project coverage is 31.47%. Comparing base (91d73e7) to head (8817fd2).
⚠️ Report is 15 commits behind head on main.

Files with missing lines | Patch % | Lines
...ckend/base/langflow/utils/registered_email_util.py | 30.61% | 34 Missing ⚠️
...ackend/base/langflow/services/telemetry/service.py | 33.33% | 18 Missing ⚠️

❌ Your patch status has failed because the patch coverage (34.17%) is below the target coverage (40.00%). You can increase the patch coverage or adjust the target coverage.
❌ Your project status has failed because the head coverage (39.35%) is below the target coverage (60.00%). You can increase the head coverage or adjust the target coverage.
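
The patch figure can be cross-checked from the per-file rows above. The line totals computed below are not stated in the report; they are inferred from the reported missing-line counts and percentages, so treat them as assumptions rather than Codecov output. The helper name `total_lines` is illustrative.

```python
def total_lines(missing: int, coverage_pct: float) -> int:
    """Infer a patch line count from its missing lines and coverage percentage."""
    return round(missing / (1.0 - coverage_pct / 100.0))


util_total = total_lines(34, 30.61)      # registered_email_util.py
service_total = total_lines(18, 33.33)   # telemetry/service.py
patch_total = total_lines(52, 34.17722)  # whole patch

hits = patch_total - 52
print(util_total, service_total, patch_total)   # 49 27 79
print(f"{hits / patch_total * 100:.5f}%")       # 34.17722%
```

Under these inferred totals, the 3 patch lines not accounted for by the two listed files would be fully covered lines elsewhere in the patch.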

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main   #10556   +/-   ##
=======================================
  Coverage   31.47%   31.47%           
=======================================
  Files        1328     1329    +1     
  Lines       60091    60163   +72     
  Branches     8986     8986           
=======================================
+ Hits        18912    18935   +23     
- Misses      40272    40321   +49     
  Partials      907      907           
Flag | Coverage Δ
backend | 51.05% <34.17%> (-0.10%) ⬇️
lfx | 39.35% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines | Coverage Δ
...backend/base/langflow/services/telemetry/schema.py | 100.00% <100.00%> (ø)
...ackend/base/langflow/services/telemetry/service.py | 78.61% <33.33%> (-7.62%) ⬇️
...ckend/base/langflow/utils/registered_email_util.py | 30.61% <30.61%> (ø)

@mpawlow mpawlow force-pushed the mp_LFOSS-2632_telemetry branch 4 times, most recently from 6414bf5 to 5d32dc4 Compare November 11, 2025 23:45
Base automatically changed from mp_LFOSS-2632_telemetry to main November 12, 2025 14:03
@codeflash-ai codeflash-ai Bot closed this Nov 13, 2025

codeflash-ai Bot commented Nov 13, 2025

This PR has been automatically closed because the original PR #6923 by shangwenhe was closed.

@codeflash-ai codeflash-ai Bot deleted the codeflash/optimize-pr10432-2025-11-11T03.05.05 branch November 13, 2025 02:46