⚡️ Speed up function get_email_model by 106% in PR #10432 (mp_LFOSS-2632_telemetry) #10556
codeflash-ai[bot] wants to merge 9 commits
… email address
- Temporarily add hardcoded registered email address to common SCARF telemetry events

… email address
- Only send registered email address if it's defined and the context is Langflow Desktop

… email address
- Implement a utility to load and fetch the registered email address
- Bootstrap common telemetry fields with the registered email address (only in the Langflow Desktop context)

… email address
- Lazy load registered email address
- Cache loaded registered email address
- Adopt latest email registration storage format / schema

…ess #10432
- Create a new telemetry schema for the registered email address
- Implement utility methods to send new telemetry event for the registered email address
- Send new telemetry event for the registered email address on telemetry service start-up lifecycle method
- Code clean-up, commenting & refactoring

- Implement 4 new test scenarios for testing the new Email payload schema
- Fix all pre-existing style errors
The optimization achieves a **106% speedup** by eliminating expensive method call overhead in the hot path through direct attribute access.

**Key optimizations applied:**

1. **Explicit class variable initialization**: Added `_email_model = None` and `_resolved = False` as class attributes to avoid Python's costly attribute resolution on first access.
2. **Direct attribute access optimization**: Replaced the method calls `_RegisteredEmailCache.get_email_model()` and `_RegisteredEmailCache.is_resolved()` with direct class attribute reads (`_cache._email_model`, `_cache._resolved`) after storing the class reference in a local variable `_cache`.
3. **Streamlined control flow**: Changed `if email:` to `if email is not None:` for more explicit null checking and removed an intermediate variable assignment in `_parse_email_registration`.

**Why this leads to speedup:**

The line profiler shows the original code spent most of its time in method calls: `get_email_model()` took 4.84 ms and `is_resolved()` took 4.64 ms out of 10.5 ms total. The optimized version reduces the main function time to 2.26 ms by eliminating this overhead. In Python, a method call involves attribute lookup, bound method creation, and function call overhead, which adds up when executed repeatedly.

**Performance characteristics:**

- **Cache hits** (most common case after first call): ~4.4x faster due to direct attribute access
- **Cold path** (first call): Similar performance, since file I/O and parsing dominate
- **Large-scale workloads**: The test with 500 repeated calls shows the optimization is particularly effective for high-frequency access patterns, making it well suited to telemetry or configuration systems that check email status frequently.

The optimization preserves all functionality while substantially improving performance for cached lookups.
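Item 3 above swaps a truthiness check for an explicit `None` check. The two are not equivalent for every input, which is worth keeping in mind when reading the "preserves all functionality" claim. The sketch below uses hypothetical helper names (`passes_truthy_check`, `passes_none_check`) that do not appear in the PR; it only illustrates how the two conditions treat an empty string.

```python
def passes_truthy_check(email):
    # Mirrors `if email:`; rejects None and every falsy value, including "".
    return bool(email)


def passes_none_check(email):
    # Mirrors `if email is not None:`; rejects only None, so "" passes.
    return email is not None


for value in (None, "", "user@example.com"):
    print(repr(value), passes_truthy_check(value), passes_none_check(value))
```

Whether an empty string can actually reach this branch in the real utility depends on the surrounding validation, which is not shown in this conversation.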
Codecov Report

❌ Your patch status has failed because the patch coverage (34.17%) is below the target coverage (40.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@           Coverage Diff           @@
##             main   #10556   +/-   ##
=======================================
  Coverage   31.47%   31.47%
=======================================
  Files        1328     1329    +1
  Lines       60091    60163   +72
  Branches     8986     8986
=======================================
+ Hits        18912    18935   +23
- Misses      40272    40321   +49
  Partials      907      907
6414bf5 to 5d32dc4
⚡️ This pull request contains optimizations for PR #10432

If you approve this dependent PR, these changes will be merged into the original PR branch `mp_LFOSS-2632_telemetry`.

📄 106% (1.06x) speedup for `get_email_model` in `src/backend/base/langflow/utils/registered_email_util.py`

⏱️ Runtime: 339 microseconds → 164 microseconds (best of 39 runs)

📝 Explanation and details

The optimization achieves a 106% speedup by eliminating expensive method call overhead in the hot path through direct attribute access.

Key optimizations applied:

1. Explicit class variable initialization: Added `_email_model = None` and `_resolved = False` as class attributes to avoid Python's costly attribute resolution on first access.
2. Direct attribute access optimization: Replaced the method calls `_RegisteredEmailCache.get_email_model()` and `_RegisteredEmailCache.is_resolved()` with direct class attribute reads (`_cache._email_model`, `_cache._resolved`) after storing the class reference in a local variable `_cache`.
3. Streamlined control flow: Changed `if email:` to `if email is not None:` for more explicit null checking and removed an intermediate variable assignment in `_parse_email_registration`.

Why this leads to speedup:

The line profiler shows the original code spent most of its time in method calls: `get_email_model()` took 4.84 ms and `is_resolved()` took 4.64 ms out of 10.5 ms total. The optimized version reduces the main function time to 2.26 ms by eliminating this overhead. In Python, a method call involves attribute lookup, bound method creation, and function call overhead, which adds up when executed repeatedly.

Performance characteristics:

- Cache hits (most common case after first call): ~4.4x faster due to direct attribute access
- Cold path (first call): Similar performance, since file I/O and parsing dominate
- Large-scale workloads: The test with 500 repeated calls shows the optimization is particularly effective for high-frequency access patterns, making it well suited to telemetry or configuration systems that check email status frequently.

The optimization preserves all functionality while substantially improving performance for cached lookups.
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
import sys
import types

# imports
import pytest
from langflow.utils.registered_email_util import get_email_model

# --- Minimal fakes for dependencies ---


# Fake EmailPayload class with validation
class EmailPayload:
    def __init__(self, email):
        if (
            not isinstance(email, str)
            or "@" not in email
            or email.startswith("@")
            or email.endswith("@")
            or email.count("@") != 1
            or email.strip() != email
            or len(email) == 0
        ):
            raise ValueError("Invalid email")
        self.email = email


def patch_load_registration(monkeypatch, return_value=None, side_effect=None):
    """Patch load_registration to return a value or raise an exception."""

    def _fake_load_registration():
        if side_effect:
            raise side_effect
        return return_value

    monkeypatch.setattr(
        "langflow.api.v2.registration.load_registration",
        _fake_load_registration,
        raising=True,
    )
# --- Basic Test Cases ---


def test_valid_email(monkeypatch):
    """Basic: Registration file returns a valid email."""
    patch_load_registration(monkeypatch, return_value={"email": "user@example.com"})
    codeflash_output = get_email_model(); result = codeflash_output


def test_valid_email_with_cache(monkeypatch):
    """Basic: Cached email is returned without calling load_registration again."""
    patch_load_registration(monkeypatch, return_value={"email": "cached@example.com"})
    # First call populates cache
    codeflash_output = get_email_model(); result1 = codeflash_output
    # Patch to return a different value, but cache should be used
    patch_load_registration(monkeypatch, return_value={"email": "other@example.com"})
    codeflash_output = get_email_model(); result2 = codeflash_output


def test_none_registration(monkeypatch):
    """Basic: Registration file returns None (no registration file)."""
    patch_load_registration(monkeypatch, return_value=None)
    codeflash_output = get_email_model(); result = codeflash_output


def test_empty_email(monkeypatch):
    """Basic: Registration file returns empty string for email."""
    patch_load_registration(monkeypatch, return_value={"email": ""})
    codeflash_output = get_email_model(); result = codeflash_output


def test_email_missing(monkeypatch):
    """Basic: Registration file returns dict without 'email' key."""
    patch_load_registration(monkeypatch, return_value={"username": "foo"})
    codeflash_output = get_email_model(); result = codeflash_output
# --- Edge Test Cases ---


@pytest.mark.parametrize("bad_email", [
    "noatsymbol",
    "foo@",
    "@bar.com",
    "foo@@bar.com",
    "foo@bar@baz.com",
    " foo@bar.com",
    "foo@bar.com ",
    "\nfoo@bar.com",
    "foo@bar.com\n",
    None,
    123,
    [],
    {},
])
def test_invalid_email_syntax(monkeypatch, bad_email):
    """Edge: Registration file returns syntactically invalid email."""
    patch_load_registration(monkeypatch, return_value={"email": bad_email})
    codeflash_output = get_email_model(); result = codeflash_output


def test_registration_not_dict(monkeypatch):
    """Edge: Registration file returns a non-dict (e.g., string)."""
    patch_load_registration(monkeypatch, return_value="not a dict")
    codeflash_output = get_email_model(); result = codeflash_output


def test_registration_dict_email_none(monkeypatch):
    """Edge: Registration file returns dict with email=None."""
    patch_load_registration(monkeypatch, return_value={"email": None})
    codeflash_output = get_email_model(); result = codeflash_output


def test_load_registration_oserror(monkeypatch):
    """Edge: load_registration raises OSError."""
    patch_load_registration(monkeypatch, side_effect=OSError("file error"))
    codeflash_output = get_email_model(); result = codeflash_output


def test_load_registration_unicode_error(monkeypatch):
    """Edge: load_registration raises UnicodeDecodeError."""
    patch_load_registration(monkeypatch, side_effect=UnicodeDecodeError("utf-8", b"x", 0, 1, "bad"))
    codeflash_output = get_email_model(); result = codeflash_output


def test_load_registration_attribute_error(monkeypatch):
    """Edge: load_registration raises AttributeError."""
    patch_load_registration(monkeypatch, side_effect=AttributeError("missing"))
    codeflash_output = get_email_model(); result = codeflash_output


def test_cache_resolved_no_email(monkeypatch):
    """Edge: Once resolved, get_email_model returns None without calling load_registration again."""
    patch_load_registration(monkeypatch, return_value=None)
    # First call sets resolved
    codeflash_output = get_email_model(); result1 = codeflash_output
    # Patch to return a valid email, but should not be called
    patch_load_registration(monkeypatch, return_value={"email": "shouldnotbeused@example.com"})
    codeflash_output = get_email_model(); result2 = codeflash_output


def test_cache_resolved_with_invalid_email(monkeypatch):
    """Edge: Once resolved with invalid email, get_email_model returns None from cache."""
    patch_load_registration(monkeypatch, return_value={"email": "noatsymbol"})
    codeflash_output = get_email_model(); result1 = codeflash_output
    # Patch to return a valid email, but should not be called
    patch_load_registration(monkeypatch, return_value={"email": "shouldnotbeused@example.com"})
    codeflash_output = get_email_model(); result2 = codeflash_output
# --- Large Scale Test Cases ---


def test_large_number_of_calls_same_email(monkeypatch):
    """Large Scale: Many calls to get_email_model with same valid email, cache is used."""
    patch_load_registration(monkeypatch, return_value={"email": "bulk@example.com"})
    codeflash_output = get_email_model(); first = codeflash_output
    for _ in range(500):
        codeflash_output = get_email_model(); result = codeflash_output


def test_large_number_of_calls_with_invalid(monkeypatch):
    """Large Scale: Many calls to get_email_model with invalid email, always returns None, cache is used."""
    patch_load_registration(monkeypatch, return_value={"email": "bademail"})
    codeflash_output = get_email_model(); first = codeflash_output
    for _ in range(500):
        codeflash_output = get_email_model(); result = codeflash_output
#------------------------------------------------
import sys
import types

# imports
import pytest
from langflow.utils.registered_email_util import get_email_model

# --- Minimal stubs for dependencies (EmailPayload, logger, load_registration) ---


class EmailPayload:
    """Minimal EmailPayload stub with validation."""

    def __init__(self, email):
        if not isinstance(email, str) or "@" not in email or "." not in email.split("@")[-1]:
            raise ValueError("Invalid email address")
        self.email = email


class DummyLogger:
    def __init__(self):
        self.errors = []

    def error(self, msg):
        self.errors.append(msg)


# Patch logger and load_registration for the test module
dummy_logger = DummyLogger()


def dummy_load_registration_func_factory(return_value=None, raise_exc=None):
    def _func():
        if raise_exc:
            raise raise_exc
        return return_value

    return _func


# --- Basic Test Cases ---
To edit these changes, run `git checkout codeflash/optimize-pr10432-2025-11-11T03.05.05` and push.