⚡️ Speed up method `TransactionLogsResponse.serialize_inputs` by 14% in PR #10820 (`cz/add-logs-feature`) by codeflash-ai[bot] · Pull Request #11173 · langflow-ai/langflow

codeflash-ai · 2025-12-30T19:18:51Z

⚡️ This pull request contains optimizations for PR #10820

If you approve this dependent PR, these changes will be merged into the original PR branch cz/add-logs-feature.

This PR will be automatically closed if the original PR is merged.

📄 14% (0.14x) speedup for `TransactionLogsResponse.serialize_inputs` in `src/backend/base/langflow/services/database/models/transactions/model.py`

⏱️ Runtime : 4.58 milliseconds → 4.01 milliseconds (best of 51 runs)
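As a quick sanity check on the headline figure, the sketch below recomputes the speedup from the two reported runtimes. It assumes the percentage is defined as the original runtime divided by the optimized runtime, minus one, which matches the "0.14x" notation above; the function name `speedup_percent` is illustrative and not part of Codeflash.

```python
def speedup_percent(original_ms: float, optimized_ms: float) -> float:
    """Relative speedup, assuming the definition original / optimized - 1."""
    return (original_ms / optimized_ms - 1.0) * 100.0


# Runtimes reported above: 4.58 ms before, 4.01 ms after.
pct = speedup_percent(4.58, 4.01)
print(f"{pct:.1f}%")  # 14.2%
```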

📝 Explanation and details

The optimized code achieves a 14% speedup (from 4.58ms to 4.01ms) through strategic short-circuit optimizations in frequently-called serialization paths:

Key Optimizations

1. Fast-path for primitives in `serialize()`

The optimized version adds an early exit for common primitive types before expensive dispatcher logic:

if obj is None or isinstance(obj, (str, int, float, bool)):
    return obj

This avoids calling _serialize_dispatcher() for the most common data types. Because serialization typically walks nested dictionaries and lists whose leaves are mostly primitives, the check saves one dispatcher call per leaf value.
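To make the idea concrete, here is a minimal sketch of a primitive fast path placed in front of a type dispatcher. The body of `_serialize_dispatcher` is a hypothetical stand-in, not Langflow's implementation, and the sketch deliberately leaves out the text and list truncation limits that the generated tests suggest the real `serialize()` applies.

```python
from datetime import datetime, timezone
from typing import Any


def _serialize_dispatcher(obj: Any) -> Any:
    """Hypothetical slow path: inspect the type and convert accordingly."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    if isinstance(obj, dict):
        return {key: serialize(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [serialize(item) for item in obj]
    return str(obj)


def serialize(obj: Any) -> Any:
    # Fast path from the PR description: primitives are returned unchanged,
    # so the dispatcher is never entered for them.
    if obj is None or isinstance(obj, (str, int, float, bool)):
        return obj
    return _serialize_dispatcher(obj)


payload = {"name": "bob", "age": 30, "tags": [None, True, 2.5]}
print(serialize(payload))
```

In this sketch only the outer dict and the inner list reach the dispatcher; the five leaf values all return through the fast path.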

2. Reordered checks in `sanitize_data()`

The original checks if data is None first, then if not isinstance(data, dict). The optimized version reverses this:

if not isinstance(data, dict):
    return data
if data is None:
    return None

Because None is not a dict, the isinstance check already returns it unchanged, so a single check handles None and every other non-dict input. Note that this makes the `if data is None` branch shown above unreachable. The optimized version also adds an early return for empty dicts (if not data: return {}), which avoids an unnecessary call to _sanitize_dict({}).
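The reordered control flow can be sketched as follows. `_sanitize_dict` here is a simplified, hypothetical stand-in that only redacts top-level keys containing "password"; the real helper's key-matching and masking rules are not shown in this PR description.

```python
from typing import Any


def _sanitize_dict(data: dict) -> dict:
    """Hypothetical stand-in: redact values whose key looks sensitive."""
    return {
        key: "***REDACTED***" if "password" in key.lower() else value
        for key, value in data.items()
    }


def sanitize_data(data: Any) -> Any:
    # One isinstance check covers None and every other non-dict input.
    if not isinstance(data, dict):
        return data
    # Early return for empty dicts skips the helper call entirely.
    if not data:
        return {}
    return _sanitize_dict(data)


print(sanitize_data(None))
print(sanitize_data({}))
print(sanitize_data({"username": "bob", "password": "hunter2"}))
```

Since `isinstance(None, dict)` is `False`, a separate `data is None` test placed after the first check would never fire, which is why the sketch omits it.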

Why This Matters

Based on the test suite, the code frequently serializes:

Large nested structures with many primitive values (strings, ints, bools)
Lists of dictionaries containing both sensitive and non-sensitive data
Mixed-type data with primitives alongside complex objects

The primitive fast-path optimization is particularly effective here because:

Every nested dict/list traversal hits multiple primitive values
Tests like test_serialize_inputs_large_list_of_sensitive_dicts (100 items) and test_serialize_inputs_performance_large (500 users) show the multiplicative benefit of avoiding dispatcher overhead on each primitive

The sanitize_data() optimization helps in edge cases with empty dicts or None values, providing small but consistent gains across the test suite.
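To give a rough sense of how many values can take the fast path, the sketch below counts primitive leaves in the same input shape that `test_serialize_inputs_performance_large` builds. The helper `count_primitive_leaves` is illustrative only and is not part of Langflow or Codeflash; it counts dictionary values, not keys.

```python
from typing import Any

PRIMITIVES = (str, int, float, bool)


def count_primitive_leaves(obj: Any) -> int:
    """Count values that would hit a primitive fast path during traversal."""
    if obj is None or isinstance(obj, PRIMITIVES):
        return 1
    if isinstance(obj, dict):
        return sum(count_primitive_leaves(value) for value in obj.values())
    if isinstance(obj, (list, tuple)):
        return sum(count_primitive_leaves(item) for item in obj)
    return 0


# Same shape as the 500-user test input.
data = {
    f"user{i}": {"username": f"user{i}", "password": f"password{i}withlongtail"}
    for i in range(500)
}
print(count_primitive_leaves(data))  # 1000
```

With 500 users and two string fields each, 1000 values are primitives, against 501 container objects (the outer dict plus one dict per user).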

Impact Assessment

The 14% saving applies to every call, so it adds up when serialize_inputs() is called repeatedly in transaction logging workflows. Since this appears to be a database model for transaction logs, these functions likely execute in high-volume scenarios where small per-call savings translate into lower overall latency.

✅ Correctness verification report:

Test	Status
⚙️ Existing Unit Tests	🔘 None Found
🌀 Generated Regression Tests	✅ 44 Passed
⏪ Replay Tests	🔘 None Found
🔎 Concolic Coverage Tests	🔘 None Found
📊 Tests Coverage	100.0%

🌀 Generated Regression Tests

from datetime import datetime, timezone
from uuid import uuid4

import pytest
from langflow.services.database.models.transactions.model import TransactionLogsResponse

# ------------------- UNIT TESTS BELOW -------------------

@pytest.fixture
def basic_instance():
    return TransactionLogsResponse(
        id=uuid4(),
        timestamp=datetime.now(timezone.utc),
        vertex_id="vertex123",
        target_id="target456",
        inputs=None,
        outputs=None,
        status="success"
    )

# 1. BASIC TEST CASES



def test_serialize_inputs_no_sensitive_keys(basic_instance):
    # Should serialize a dict with no sensitive keys unchanged
    data = {"foo": "bar", "number": 123}
    codeflash_output = basic_instance.serialize_inputs(data)

def test_serialize_inputs_simple_sensitive_key(basic_instance):
    # Should mask sensitive keys (e.g., 'password')
    data = {"password": "mysecretpassword"}
    codeflash_output = basic_instance.serialize_inputs(data); result = codeflash_output

def test_serialize_inputs_short_sensitive_value(basic_instance):
    # Should fully mask short sensitive values (<12 chars)
    data = {"token": "short"}
    codeflash_output = basic_instance.serialize_inputs(data); result = codeflash_output



def test_serialize_inputs_list_of_dicts_with_sensitive(basic_instance):
    # Should mask sensitive keys in list of dicts
    data = {
        "users": [
            {"username": "bob", "password": "hunter2"},
            {"username": "alice", "password": "supersecretpassword"}
        ]
    }
    codeflash_output = basic_instance.serialize_inputs(data); result = codeflash_output
    pw = result["users"][1]["password"]

def test_serialize_inputs_mixed_types(basic_instance):
    # Should handle ints, floats, bools, None, etc.
    data = {
        "int": 1,
        "float": 2.5,
        "bool": True,
        "none": None,
        "list": [1, 2, 3]
    }
    codeflash_output = basic_instance.serialize_inputs(data); result = codeflash_output

# 2. EDGE TEST CASES

def test_serialize_inputs_sensitive_key_variants(basic_instance):
    # Should match keys like 'api_key', 'API_KEY', 'api-key', 'apiKey', 'my_api_key'
    data = {
        "API_KEY": "ABCDEF1234567890",
        "api-key": "1234567890ABCDEF",
        "my_api_key": "ZZZZZZZZZZZZZZZZ",
        "notakey": "should not mask"
    }
    codeflash_output = basic_instance.serialize_inputs(data); result = codeflash_output
    # All but 'notakey' should be masked
    for k in ["API_KEY", "api-key", "my_api_key"]:
        v = result[k]


def test_serialize_inputs_sensitive_key_with_empty_value(basic_instance):
    # Should mask empty sensitive values as "***REDACTED***"
    data = {"password": ""}
    codeflash_output = basic_instance.serialize_inputs(data); result = codeflash_output

def test_serialize_inputs_sensitive_key_with_none_value(basic_instance):
    # Should mask None sensitive values as "***REDACTED***"
    data = {"token": None}
    codeflash_output = basic_instance.serialize_inputs(data); result = codeflash_output

def test_serialize_inputs_sensitive_key_with_non_str_value(basic_instance):
    # Should mask non-string sensitive values as "***REDACTED***"
    data = {"api_key": 123456789}
    codeflash_output = basic_instance.serialize_inputs(data); result = codeflash_output

def test_serialize_inputs_truncates_long_strings(basic_instance):
    # Should truncate long strings according to max_text_length
    data = {"foo": "x" * 100}
    codeflash_output = basic_instance.serialize_inputs(data); result = codeflash_output

def test_serialize_inputs_truncates_long_lists(basic_instance):
    # Should truncate lists according to max_items_length
    data = {"lst": list(range(50))}
    codeflash_output = basic_instance.serialize_inputs(data); result = codeflash_output

def test_serialize_inputs_excludes_code_key_nested(basic_instance):
    # Should exclude 'code' key even if nested
    data = {"outer": {"code": "should not appear", "foo": "bar"}}
    codeflash_output = basic_instance.serialize_inputs(data); result = codeflash_output

# 3. LARGE SCALE TEST CASES


def test_serialize_inputs_large_list_of_sensitive_dicts(basic_instance):
    # Should mask all sensitive keys and truncate list
    data = {"users": [{"password": f"pass{i:02d}secret"} for i in range(100)]}
    codeflash_output = basic_instance.serialize_inputs(data); result = codeflash_output
    for user_dict in result["users"]:
        pw = user_dict["password"]

def test_serialize_inputs_large_nested_structure(basic_instance):
    # Deeply nested structure with sensitive keys
    data = {
        "level1": [
            {"level2": {"api_key": f"KEY{i:04d}SECRET"}} for i in range(20)
        ]
    }
    codeflash_output = basic_instance.serialize_inputs(data); result = codeflash_output
    for d in result["level1"]:
        masked = d["level2"]["api_key"]

def test_serialize_inputs_performance_large(basic_instance):
    # This is not a timing test, but ensures no crash and correct truncation/masking for near-limit size
    data = {
        f"user{i}": {
            "username": f"user{i}",
            "password": f"password{i}withlongtail"
        }
        for i in range(500)
    }
    codeflash_output = basic_instance.serialize_inputs(data); result = codeflash_output
    for v in result.values():
        pw = v["password"]
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes git checkout codeflash/optimize-pr10820-2025-12-30T19.18.44 and push.


coderabbitai · 2025-12-30T19:18:57Z

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands and usage tips.

codecov · 2025-12-30T19:22:11Z

Codecov Report

❌ Patch coverage is 61.86047% with 82 lines in your changes missing coverage. Please review.
✅ Project coverage is 33.33%. Comparing base (9ce7d84) to head (3880e92).
⚠️ Report is 7 commits behind head on main.

Files with missing lines	Patch %	Lines
src/frontend/src/modals/flowLogsModal/index.tsx	0.00%	25 Missing ⚠️
...rc/modals/flowLogsModal/config/flowLogsColumns.tsx	0.00%	21 Missing ⚠️
...odals/flowLogsModal/components/LogDetailViewer.tsx	0.00%	14 Missing ⚠️
...low/services/database/models/transactions/model.py	90.27%	7 Missing ⚠️
src/lfx/src/lfx/graph/utils.py	30.00%	5 Missing and 2 partials ⚠️
src/lfx/src/lfx/graph/vertex/base.py	57.14%	4 Missing and 2 partials ⚠️
src/lfx/src/lfx/services/transaction/service.py	77.77%	2 Missing ⚠️

❌ Your project status has failed because the head coverage (39.50%) is below the target coverage (60.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #11173      +/-   ##
==========================================
+ Coverage   33.23%   33.33%   +0.10%     
==========================================
  Files        1394     1399       +5     
  Lines       66068    66222     +154     
  Branches     9778     9785       +7     
==========================================
+ Hits        21956    22076     +120     
- Misses      42986    43021      +35     
+ Partials     1126     1125       -1

Flag	Coverage Δ
lfx	`39.50% <62.50%> (+0.01%)`	⬆️

Flags with carried forward coverage won't be shown.

Files with missing lines	Coverage Δ
src/backend/base/langflow/api/v1/monitor.py	`50.00% <100.00%> (ø)`
...ckend/base/langflow/serialization/serialization.py	`72.28% <ø> (-0.30%)`	⬇️
...flow/services/database/models/transactions/crud.py	`78.78% <100.00%> (+40.85%)`	⬆️
...kend/base/langflow/services/transaction/factory.py	`100.00% <100.00%> (ø)`
...kend/base/langflow/services/transaction/service.py	`100.00% <100.00%> (ø)`
src/backend/base/langflow/services/utils.py	`81.09% <100.00%> (-0.39%)`	⬇️
...s/API/queries/transactions/use-get-transactions.ts	`0.00% <ø> (ø)`
src/lfx/src/lfx/graph/vertex/vertex_types.py	`43.38% <ø> (-0.30%)`	⬇️
src/lfx/src/lfx/services/deps.py	`59.49% <100.00%> (-3.67%)`	⬇️
src/lfx/src/lfx/services/interfaces.py	`100.00% <100.00%> (ø)`
... and 8 more

... and 4 files with indirect coverage changes


codeflash-ai · 2026-01-05T17:44:21Z

This PR has been automatically closed because the original PR #10820 by Cristhianzl was closed.

Cristhianzl and others added 20 commits December 1, 2025 16:50

add logs feature back

5a16f8d

Merge branch 'main' into cz/add-logs-feature

268238f

[autofix.ci] apply automated fixes

12c743e

fix integration tests

ba595ff

Merge branch 'cz/add-logs-feature' of github.com:langflow-ai/langflow…

16d55c4

… into cz/add-logs-feature

github suggestions

2dd96cc

Merge branch 'main' into cz/add-logs-feature

797b579

[autofix.ci] apply automated fixes

21d806d

merge fix

a21a690

[autofix.ci] apply automated fixes

6add5c5

improve code quality

99d502b

Merge branch 'cz/add-logs-feature' of github.com:langflow-ai/langflow…

6554501

… into cz/add-logs-feature

add logs dialog viewer and sanitize result

8305be6

ruff checkers and format

0cc78d5

add output to logs

1304ba0

remove target column to avoid duplication

549c6c6

create lfx service to execute transactions standalone lfx

fc00489

[autofix.ci] apply automated fixes

e7d729a

[autofix.ci] apply automated fixes (attempt 2/3)

79c1e5c

codeflash-ai Bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Dec 30, 2025

github-actions Bot added the community Pull Request from an external contributor label Dec 30, 2025

Base automatically changed from cz/add-logs-feature to main January 5, 2026 17:44

codeflash-ai Bot closed this Jan 5, 2026

codeflash-ai Bot deleted the codeflash/optimize-pr10820-2025-12-30T19.18.44 branch January 5, 2026 17:44

codeflash-ai[bot] wants to merge 20 commits into main from codeflash/optimize-pr10820-2025-12-30T19.18.44
