fix(langchain): detach orphaned context_api.attach() calls that corrupt OTel context#3807
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughThe PR adds tracking for a second OpenTelemetry context attachment token ("association_properties") during span creation, stores it in Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py (1)
368-385:⚠️ Potential issue | 🟠 MajorFix double-encoding of OpenAI-style tool call arguments.
Lines 373-385 incorrectly handle pre-serialized arguments from OpenAI API responses. When tool_args comes from
function.arguments(the fallback path), it's already a JSON string. Applyingjson.dumps()to a string wraps it in additional quotes/escapes, producing invalid span attributes. Check the type before serializing:Suggested fix
- _set_span_attribute( - span, - f"{tool_call_prefix}.arguments", - json.dumps(tool_args, cls=CallbackFilteredJSONEncoder), - ) + serialized_args = ( + tool_args + if isinstance(tool_args, str) or tool_args is None + else json.dumps(tool_args, cls=CallbackFilteredJSONEncoder) + ) + _set_span_attribute(span, f"{tool_call_prefix}.arguments", serialized_args)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py` around lines 368 - 385, _in _set_chat_tool_calls, pre-serialized OpenAI function.arguments (a JSON string) is being passed through json.dumps which double-encodes it; update the logic in _set_chat_tool_calls to check the type of tool_args and only call json.dumps(tool_args, cls=CallbackFilteredJSONEncoder) when tool_args is not a str (i.e., when it is a dict/list), otherwise pass the string verbatim to _set_span_attribute for f"{tool_call_prefix}.arguments"; use the existing CallbackFilteredJSONEncoder only for non-string objects and keep using _set_span_attribute to set the final attribute.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py`:
- Around line 459-465: The code is overwriting the SpanHolder in _create_span(),
losing previously saved span and metadata attachment tokens; instead of
replacing self.spans[run_id] with a new SpanHolder, add a suppression_token
field to the existing SpanHolder (or construct the holder with suppression_token
set) and set that field when _safe_attach_context() returns the suppression
token; update SpanHolder to include suppression_token and modify _end_span() to
detach tokens in reverse attach order: first suppression_token, then the span
attachment token, then association_properties_token so all context attachments
are properly unwound.
---
Outside diff comments:
In
`@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py`:
- Around line 368-385: _in _set_chat_tool_calls, pre-serialized OpenAI
function.arguments (a JSON string) is being passed through json.dumps which
double-encodes it; update the logic in _set_chat_tool_calls to check the type of
tool_args and only call json.dumps(tool_args, cls=CallbackFilteredJSONEncoder)
when tool_args is not a str (i.e., when it is a dict/list), otherwise pass the
string verbatim to _set_span_attribute for f"{tool_call_prefix}.arguments"; use
the existing CallbackFilteredJSONEncoder only for non-string objects and keep
using _set_span_attribute to set the final attribute.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f74a231e-4bad-4a06-864a-f4fd7074d97f
📒 Files selected for processing (2)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.pypackages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py
|
Thanks for the thorough analysis @coderabbitai. The point about The Sourcery review approved the changes. Happy to address any maintainer feedback. |
|
Tip For best results, initiate chat on the files or code changes.
Would you like me to open a separate GitHub issue to track the |
5df3c51 to
3c2855c
Compare
|
yeah that'd be helpful @CodeRabbit , go ahead and open one. thanks |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 76 🏁 Script executed: Length of output: 52
|
…_create_llm_span Closes traceloop#3957 When _create_llm_span() is called for a run_id that already has an entry in self.spans, the old holder's token was lost without being detached, leaving an orphaned context_api.attach() on the OTel context stack. This is the same class of bug as traceloop#3526 / traceloop#3807. Defensively detach the existing holder's token before replacing the entry.
When _create_llm_span() is called for a run_id that already has a SpanHolder, the existing entry is silently overwritten. The old holder's context token (and association_properties_token) are lost without being detached, which corrupts the OpenTelemetry context stack. This adds a check before the overwrite: if an existing SpanHolder is found, its tokens are safely detached before being replaced. Follows the same pattern used in _end_span() and aligns with the fixes in traceloop#3526 and traceloop#3807. Fixes traceloop#3957
…_create_llm_span Closes traceloop#3957 When _create_llm_span() is called for a run_id that already has an entry in self.spans, the old holder's token was lost without being detached, leaving an orphaned context_api.attach() on the OTel context stack. This is the same class of bug as traceloop#3526 / traceloop#3807. Defensively detach the existing holder's token before replacing the entry.
|
Closing this. The orphaned context_api.attach() cleanup landed on main via a different change that wraps the same logic in a _detach_holder_contexts helper and calls it from both _end_span (for the run and its children) and _create_span (when an existing holder is overwritten). Same cases handled, just with the helper-based pattern. Tracking issue #3957 is also addressed there. Thanks for the back-and-forth on this earlier. |
Fixes #3526
Description
The LangChain instrumentation has two code paths where
context_api.attach()is called without a correspondingcontext_api.detach(), leaving orphaned contexts on the OpenTelemetry stack. After LangChain execution completes,trace.get_current_span()returns an ended span instead of the parent span, breaking downstream trace context propagation and log correlation.Root cause
1.
_create_span()— association_properties token lostFix: Save the token and store it in a new
SpanHolder.association_properties_tokenfield. Detach it in_end_span()alongside the span token.2.
on_chain_end()— redundant attach without detachFix: Removed entirely. The suppression token is already properly detached by
_end_span()viaSpanHolder.token, so this second attach was both redundant and harmful.Changes
span_utils.py: Addedassociation_properties_tokenfield toSpanHolderdataclasscallback_handler.py: Save association_properties attach token and detach in_end_span()callback_handler.py: Remove redundant orphaned attach inon_chain_end()Testing
ruff checkandruff formatpass on all changed filesSummary by CodeRabbit