⚡️ Speed up method `LangFuseTracer.end_trace` by 13% in PR #11114 (`feat/langchain-1.0`) #11832
Closed
codeflash-ai[bot] wants to merge 2 commits into
The optimized code achieves a **12% speedup** by adding a fast-path check in the `serialize()` function that bypasses expensive dispatcher logic for simple JSON-like structures (nested dicts/lists containing only primitives).

**Key Optimization:**
A new helper function `_is_simple_json_structure()` recursively validates whether an object contains only JSON-serializable primitives (None, str, int, float, bool) and simple containers (dict with string keys, list, tuple). When `serialize()` is called with `no_limits=True` and `to_str=False` on such structures, it immediately returns the object without invoking `_serialize_dispatcher()`.

**Why This Is Faster:**
The line profiler reveals the bottleneck: in the original code, `_serialize_dispatcher()` consumes **63.1%** of `serialize()`'s runtime (38.3ms out of 60.6ms). In the optimized version, the new fast-path check takes **96.7%** of the much smaller total time (19.4ms out of 20.1ms), but this is still a net win because we avoid the dispatcher entirely. The dispatcher involves expensive pattern matching across many type checks (datetime, Decimal, UUID, Document, pandas types, numpy types, etc.).

**Impact on `LangFuseTracer.end_trace()`:**
The `end_trace()` method calls `serialize(output)` where `output` is typically a simple dict containing strings, lists, and primitives (outputs, error messages, logs). The optimization reduces the time spent in `span.update(output=serialize(output))` from **53.5%** (153.9ms) to **42.5%** (96.5ms) of the method's runtime, a **37% reduction** in serialization overhead within the tracing context.

**Test Results:**
The annotated tests show the optimization excels for:
- **Simple nested structures**: Tests with dict/list of primitives (e.g., `test_end_trace_output_with_nested_dict`, `test_end_trace_with_list_in_outputs`) benefit most since they match the fast-path criteria
- **Large simple outputs**: Tests like `test_end_trace_with_large_outputs` (1000 key-value pairs) and `test_end_trace_with_many_logs` (1000 log entries) avoid repeated dispatcher overhead
- **High-frequency tracing**: `test_end_trace_multiple_spans_sequential` (100 spans) shows cumulative benefits when serialization is called repeatedly

**Workload Considerations:**
Since `end_trace()` is called at the completion of every traced operation in the LangFuse tracing system, and tracing outputs are predominantly simple JSON structures, this optimization provides consistent benefits in production workloads involving instrumentation and observability. The 12% overall speedup compounds across many trace operations.
Codecov Report
❌ Patch coverage is
❌ Your project status has failed because the head coverage (42.03%) is below the target coverage (60.00%). You can increase the head coverage or adjust the target coverage.
Additional details and impacted files
@@ Coverage Diff @@
## feat/langchain-1.0 #11832 +/- ##
======================================================
- Coverage 35.33% 35.33% -0.01%
======================================================
Files 1521 1521
Lines 73033 73053 +20
Branches 10951 10951
======================================================
+ Hits 25808 25813 +5
- Misses 45829 45845 +16
+ Partials 1396 1395 -1
Closing automated codeflash PR.
⚡️ This pull request contains optimizations for PR #11114

If you approve this dependent PR, these changes will be merged into the original PR branch `feat/langchain-1.0`.

📄 13% (0.13x) speedup for `LangFuseTracer.end_trace` in `src/backend/base/langflow/services/tracing/langfuse.py`

⏱️ Runtime: 65.5 milliseconds → 58.2 milliseconds (best of 33 runs)

✅ Correctness verification report:
🌀 Generated Regression Tests
import os
import uuid
from typing import Any, Sequence

import pytest  # used for our unit tests
from langflow.serialization.serialization import serialize
from langflow.services.tracing.langfuse import LangFuseTracer

# Note:
# The tests below exercise LangFuseTracer.end_trace under a variety of conditions.
# We avoid relying on an actual langfuse client by directly manipulating the tracer
# instance state (setting _ready and providing a span-like object). The span-like
# object below is a tiny, concrete helper class defined in this test module to
# capture calls to `update` and `end`. While the production code expects real
# LangfuseSpan instances, for unit testing the method behavior we assert that
# when a span object with the expected interface is present, end_trace will:
# - remove the span from tracer.spans,
# - call span.update with a serialized output,
# - call span.end().
# Each test includes comments explaining its purpose and the assertions made.


class _SpanRecorder:
    """
    Minimal concrete span-like object used for tests.
    """
def test_end_trace_when_not_ready_does_nothing():
    """
    If tracer._ready is False (default when no Langfuse config), end_trace should
    return immediately and not raise. Also, spans dict should remain unchanged.
    """
    trace_id = str(uuid.uuid4())
    # Create a LangFuseTracer with minimal, valid constructor args.
    tracer = LangFuseTracer(
        trace_name="flow - basic",
        trace_type="type",
        project_name="proj",
        trace_id=uuid.uuid4(),
        user_id=None,
        session_id=None,
    )


def test_end_trace_ready_but_no_matching_span_is_noop():
    """
    If tracer._ready is True but no span for the provided trace_id exists,
    end_trace should not raise and should leave tracer.spans unchanged.
    """
    trace_id = "nonexistent-id"
    tracer = LangFuseTracer(
        trace_name="flow - ready_no_span",
        trace_type="type",
        project_name="proj",
        trace_id=uuid.uuid4(),
    )


def test_end_trace_with_span_calls_update_and_end():
    """
    When tracer._ready is True and a span exists for the trace_id,
    end_trace must:
    - pop the span from tracer.spans,
    - call span.update with output=serialize(merged_output),
    - call span.end().
    Also verify that outputs, error, and logs are merged correctly before serialization.
    """
    trace_id = "trace-123"
    tracer = LangFuseTracer(
        trace_name="flow - update_end",
        trace_type="type",
        project_name="proj",
        trace_id=uuid.uuid4(),
    )
    tracer._ready = True


def test_end_trace_with_empty_and_none_values_serializes_correctly():
    """
    Test end_trace with outputs=None, error=None, and empty logs.
    When a span exists, update must be called with output serialized as {} (empty dict).
    """
    trace_id = "edge-empty-none"
    tracer = LangFuseTracer(
        trace_name="flow - edge",
        trace_type="type",
        project_name="proj",
        trace_id=uuid.uuid4(),
    )
    tracer._ready = True


def test_end_trace_with_special_characters_in_outputs():
    """
    Ensure strings with special characters are preserved through merging and serialization.
    """
    trace_id = "edge-special-chars"
    tracer = LangFuseTracer(
        trace_name="flow - special",
        trace_type="type",
        project_name="proj",
        trace_id=uuid.uuid4(),
    )
    tracer._ready = True


def test_end_trace_multiple_consecutive_calls_handle_state_correctly():
    """
    Verify that consecutive calls to end_trace correctly pop spans one by one
    and do not interfere with remaining spans in the tracer.
    """
    base_trace_id = "multi-call-"
    tracer = LangFuseTracer(
        trace_name="flow - multi",
        trace_type="type",
        project_name="proj",
        trace_id=uuid.uuid4(),
    )
    tracer._ready = True
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
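The behavior these tests check (pop the span, merge outputs, error, and logs, then call `update` and `end`) can be sketched with a small recorder object. Everything here is illustrative: `SpanRecorder`, `end_trace_sketch`, and the `"error"` and `"logs"` keys are assumptions for the example, not the production `LangFuseTracer.end_trace` or the helper class from the generated test module.

```python
from typing import Any, Dict, List, Optional, Sequence


class SpanRecorder:
    """Hypothetical span stand-in that records update() and end() calls."""

    def __init__(self) -> None:
        self.updates: List[Dict[str, Any]] = []
        self.ended = False

    def update(self, **kwargs: Any) -> None:
        self.updates.append(kwargs)

    def end(self) -> None:
        self.ended = True


def end_trace_sketch(
    spans: Dict[str, SpanRecorder],
    trace_id: str,
    outputs: Optional[Dict[str, Any]] = None,
    error: Optional[Exception] = None,
    logs: Sequence[Any] = (),
) -> None:
    """Pop the span for trace_id, merge the payload, then update and end it."""
    span = spans.pop(trace_id, None)
    if span is None:
        return  # unknown trace_id: nothing to do

    output: Dict[str, Any] = dict(outputs or {})
    if error is not None:
        output["error"] = str(error)  # key name is an assumption
    if logs:
        output["logs"] = list(logs)  # key name is an assumption

    span.update(output=output)
    span.end()


recorder = SpanRecorder()
spans = {"trace-123": recorder}
end_trace_sketch(
    spans,
    "trace-123",
    outputs={"answer": 42},
    error=ValueError("boom"),
    logs=["step 1"],
)
```

After the call, `spans` no longer contains `"trace-123"`, and the recorder holds exactly one `update` call with the merged payload.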
#------------------------------------------------
from collections import OrderedDict
from unittest.mock import MagicMock, PropertyMock, patch
from uuid import UUID

# imports
import pytest
from langflow.services.tracing.langfuse import LangFuseTracer
from langflow.services.tracing.schema import Log
def test_end_trace_basic_with_outputs():
    """Test end_trace updates span with output and ends it."""
    # Create a tracer instance with mocked langfuse setup
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_with_error():
    """Test end_trace adds error to output when exception is provided."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_with_logs():
    """Test end_trace adds logs to output when provided."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_not_ready():
    """Test end_trace returns early when tracer is not ready."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_nonexistent_span():
    """Test end_trace handles case when span_id doesn't exist in spans dict."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_with_all_parameters():
    """Test end_trace with outputs, error, and logs all provided."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )

def test_end_trace_with_none_outputs():
    """Test end_trace when outputs is None."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_with_empty_outputs():
    """Test end_trace with empty outputs dict."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_with_empty_logs():
    """Test end_trace with empty logs sequence."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_with_none_error():
    """Test end_trace when error is None (default)."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_with_special_characters_in_trace_id():
    """Test end_trace with special characters in trace_id."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_with_unicode_in_error_message():
    """Test end_trace with unicode characters in error message."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )

def test_end_trace_output_with_nested_dict():
    """Test end_trace with nested dictionary in outputs."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_with_list_in_outputs():
    """Test end_trace with list values in outputs."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_span_removal_idempotent():
    """Test that calling end_trace multiple times with same id doesn't break."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_with_very_long_trace_id():
    """Test end_trace with very long trace_id string."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_merge_outputs_and_error():
    """Test that outputs and error are properly merged in output dict."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_with_large_outputs():
    """Test end_trace with large dictionary in outputs."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_with_many_logs():
    """Test end_trace with large number of log entries."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )

def test_end_trace_with_deeply_nested_structure():
    """Test end_trace with deeply nested output structure."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_multiple_spans_sequential():
    """Test end_trace called multiple times on different span ids."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_spans_dict_maintains_order():
    """Test that spans dict maintains insertion order through end_trace calls."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_with_large_string_values():
    """Test end_trace with very large string values in outputs."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_error_with_large_traceback():
    """Test end_trace with exception that has large traceback."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )


def test_end_trace_with_mixed_data_types():
    """Test end_trace with outputs containing various Python types."""
    with patch.object(LangFuseTracer, '_setup_langfuse', return_value=True):
        tracer = LangFuseTracer(
            trace_name="test_trace - flow_1",
            trace_type="flow",
            project_name="test_project",
            trace_id=UUID('12345678-1234-5678-1234-567812345678'),
        )
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
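The second test module builds every tracer inside `patch.object(LangFuseTracer, '_setup_langfuse', return_value=True)`. The sketch below shows why that pattern lets a tracer be constructed without a real client. `DemoTracer` and its `_setup_client` method are hypothetical stand-ins invented for this example; only `unittest.mock.patch.object` is a real API.

```python
from unittest.mock import patch


class DemoTracer:
    """Hypothetical tracer whose constructor calls a setup method."""

    def __init__(self, trace_name: str) -> None:
        self.trace_name = trace_name
        self.spans: dict = {}
        # In this sketch, readiness is whatever the setup method returns.
        self._ready = self._setup_client()

    def _setup_client(self) -> bool:
        # Stands in for a setup step that needs credentials or network access.
        raise RuntimeError("no client configured")

    def end_trace(self, trace_id: str) -> None:
        if not self._ready:
            return
        self.spans.pop(trace_id, None)


def build_ready_tracer() -> DemoTracer:
    """Construct a tracer while the setup method is replaced by a mock."""
    with patch.object(DemoTracer, "_setup_client", return_value=True):
        tracer = DemoTracer("test_trace - flow_1")
    # Outside the with-block the original method is restored on the class,
    # but the instance keeps the _ready value computed during construction.
    return tracer


tracer = build_ready_tracer()
tracer.spans["span-1"] = object()
tracer.end_trace("span-1")
```

Because the patch is applied to the class only for the duration of the `with` block, later constructions outside the block still hit the real setup method.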
To edit these changes, run `git checkout codeflash/optimize-pr11114-2026-02-19T20.45.30` and push.