
fix(langchain): detach orphaned context_api.attach() calls that corrupt OTel context #3807

Closed
saivedant169 wants to merge 1 commit into traceloop:main from saivedant169:fix/langchain-orphaned-context-attach

Conversation

@saivedant169
Contributor

@saivedant169 saivedant169 commented Mar 16, 2026

Fixes #3526

Description

The LangChain instrumentation has two code paths where context_api.attach() is called without a corresponding context_api.detach(), leaving orphaned contexts on the OpenTelemetry stack. After LangChain execution completes, trace.get_current_span() returns an ended span instead of the parent span, breaking downstream trace context propagation and log correlation.

Root cause

1. _create_span() — association_properties token lost

# BEFORE: token never saved, never detached
context_api.attach(
    context_api.set_value("association_properties", {...})
)

Fix: Save the token and store it in a new SpanHolder.association_properties_token field. Detach it in _end_span() alongside the span token.
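
A minimal sketch of the save-then-detach pattern described above. It does not use the real instrumentation: a stdlib `contextvars.ContextVar` stands in for the OpenTelemetry context, and `set_value`, `attach`, `detach`, `create_span`, `end_span` and `leftover_after_run` are simplified, hypothetical versions written for this example. Only the `association_properties_token` field name is taken from the change itself.

```python
import contextvars
from dataclasses import dataclass
from typing import Any

# Stand-in for the OTel runtime context (assumption: one immutable dict per context).
_CTX = contextvars.ContextVar("ctx", default={})


def set_value(key, value):
    """Return a copy of the current context with key set; does not activate it."""
    return {**_CTX.get(), key: value}


def attach(ctx):
    """Make ctx current and return a token that can restore the previous one."""
    return _CTX.set(ctx)


def detach(token):
    """Restore the context that was current before the matching attach."""
    _CTX.reset(token)


@dataclass
class SpanHolder:
    token: Any = None
    association_properties_token: Any = None


def create_span(save_props_token):
    # Association properties are attached first, then the span itself.
    props_token = attach(set_value("association_properties", {"user_id": "demo"}))
    span_token = attach(set_value("current_span", "chain-span"))
    return SpanHolder(
        token=span_token,
        association_properties_token=props_token if save_props_token else None,
    )


def end_span(holder):
    # Detach in reverse attach order: span first, then association properties.
    detach(holder.token)
    if holder.association_properties_token is not None:
        detach(holder.association_properties_token)


def leftover_after_run(save_props_token):
    """Run one create/end cycle in a fresh context and report what is left."""
    def run():
        end_span(create_span(save_props_token))
        return dict(_CTX.get())
    return contextvars.copy_context().run(run)
```

In this sketch, `leftover_after_run(False)` still carries `association_properties` after the span has ended, while `leftover_after_run(True)` returns an empty context.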

2. on_chain_end() — redundant attach without detach

# BEFORE: orphaned attach to reset suppression key
context_api.attach(
    context_api.set_value(SUPPRESS_LANGUAGE_MODEL_INSTRUMENTATION_KEY, False)
)

Fix: Removed entirely. The suppression token is already properly detached by _end_span() via SpanHolder.token, so this second attach was both redundant and harmful.
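
To illustrate why the extra attach is harmful rather than neutral, here is a small sketch that again uses a stdlib `ContextVar` as a stand-in for the OpenTelemetry context. The key string and both functions are made up for the example; only the attach-without-detach shape mirrors the removed code.

```python
import contextvars

# Hypothetical stand-in for the OTel context; the key name is illustrative only.
_CTX = contextvars.ContextVar("ctx", default={})
SUPPRESS_KEY = "suppress_language_model_instrumentation"


def chain_run(extra_attach):
    token = _CTX.set({**_CTX.get(), SUPPRESS_KEY: True})  # attach at span start
    _CTX.reset(token)                                      # detach at span end
    if extra_attach:
        # "Reset" the flag by attaching again. The returned token is discarded,
        # so nothing can ever pop this context.
        _CTX.set({**_CTX.get(), SUPPRESS_KEY: False})
    return dict(_CTX.get())


def run_isolated(extra_attach):
    # copy_context() keeps each demonstration from leaking into the next.
    return contextvars.copy_context().run(chain_run, extra_attach)
```

Detaching the saved token already restores the pre-span context, so `run_isolated(False)` ends with an empty context. The extra attach in `run_isolated(True)` leaves a context with the flag set to `False` permanently current.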

Changes

  • span_utils.py: Added association_properties_token field to SpanHolder dataclass
  • callback_handler.py: Save association_properties attach token and detach in _end_span()
  • callback_handler.py: Remove redundant orphaned attach in on_chain_end()

Testing

  • ruff check and ruff format pass on all changed files
  • The first commit is a pre-existing formatting fix (ruff format applied to the two files); the second commit contains the actual bug fix

Summary by CodeRabbit

  • Refactor
    • Improved span lifecycle and context handling for more reliable instrumentation
    • Added management of association properties to better preserve span-associated state
    • Safer teardown/detachment of span-related context to reduce leaks or misattribution
    • Removed an unnecessary conditional re-attachment step to simplify end-of-chain behavior

@CLAassistant

CLAassistant commented Mar 16, 2026

CLA assistant check
All committers have signed the CLA.

@coderabbitai

coderabbitai Bot commented Mar 16, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e9145662-1aca-4aef-aa94-63a10132e2ef

📥 Commits

Reviewing files that changed from the base of the PR and between 5df3c51 and 3c2855c.

📒 Files selected for processing (2)
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py

📝 Walkthrough

The PR adds tracking for a second OpenTelemetry context attachment token ("association_properties") during span creation, stores it in SpanHolder, detaches it during span teardown, and removes an unsafe context re-attachment in on_chain_end() that previously caused orphaned attaches.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Context Token Management / packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py | Track association_properties attachment token when metadata is attached; store token on span holder; detach that token in _end_span() in addition to the main token; remove conditional re-attachment of suppression flag in on_chain_end(). |
| Public API Extension / packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py | Added association_properties_token: Any = None to the SpanHolder dataclass to carry the extra context attachment token through span lifecycle. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 I hopped in to mend a context thread,
Attached two tokens, so no leaks spread.
I detach with care when the spans depart —
Now traces stay tidy, right from the start. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title 'fix(langchain): detach orphaned context_api.attach() calls that corrupt OTel context' directly and clearly summarizes the main fix: addressing orphaned context_api.attach() calls that corrupt OpenTelemetry context. |
| Linked Issues check | ✅ Passed | The PR successfully addresses all coding objectives from issue #3526: it stores and detaches the association_properties token in _create_span/_end_span, removes the orphaned attach in on_chain_end, and updates SpanHolder to track the new token field. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to fixing the two orphaned context_api.attach() calls identified in #3526; no unrelated modifications or refactoring is present in the changeset. |



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py (1)

368-385: ⚠️ Potential issue | 🟠 Major

Fix double-encoding of OpenAI-style tool call arguments.

Lines 373-385 incorrectly handle pre-serialized arguments from OpenAI API responses. When tool_args comes from function.arguments (the fallback path), it's already a JSON string. Applying json.dumps() to a string wraps it in additional quotes/escapes, producing invalid span attributes. Check the type before serializing:

Suggested fix
-        _set_span_attribute(
-            span,
-            f"{tool_call_prefix}.arguments",
-            json.dumps(tool_args, cls=CallbackFilteredJSONEncoder),
-        )
+        serialized_args = (
+            tool_args
+            if isinstance(tool_args, str) or tool_args is None
+            else json.dumps(tool_args, cls=CallbackFilteredJSONEncoder)
+        )
+        _set_span_attribute(span, f"{tool_call_prefix}.arguments", serialized_args)
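
The double-encoding problem can be reproduced with the standard library alone. The sketch below is illustrative: `serialize_tool_args` is a hypothetical helper name, and plain `json.dumps` is used in place of the `CallbackFilteredJSONEncoder` from the suggested fix so that the example stays self-contained.

```python
import json


def serialize_tool_args(tool_args):
    """Return tool-call arguments as a JSON string without double-encoding."""
    if tool_args is None or isinstance(tool_args, str):
        # Already serialized (or absent): pass through unchanged.
        return tool_args
    return json.dumps(tool_args)


pre_serialized = '{"city": "Paris"}'

# Encoding a string that is already JSON wraps it in another layer of quotes,
# so decoding it once yields a string again instead of a dict.
double_encoded = json.dumps(pre_serialized)
```

Decoding `double_encoded` gives back the original string rather than an object, which is the symptom described above. Passing the same input through `serialize_tool_args` leaves it unchanged.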
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py`
around lines 368 - 385, in _set_chat_tool_calls, pre-serialized OpenAI
function.arguments (a JSON string) is being passed through json.dumps which
double-encodes it; update the logic in _set_chat_tool_calls to check the type of
tool_args and only call json.dumps(tool_args, cls=CallbackFilteredJSONEncoder)
when tool_args is not a str (i.e., when it is a dict/list), otherwise pass the
string verbatim to _set_span_attribute for f"{tool_call_prefix}.arguments"; use
the existing CallbackFilteredJSONEncoder only for non-string objects and keep
using _set_span_attribute to set the final attribute.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py`:
- Around line 459-465: The code is overwriting the SpanHolder in _create_span(),
losing previously saved span and metadata attachment tokens; instead of
replacing self.spans[run_id] with a new SpanHolder, add a suppression_token
field to the existing SpanHolder (or construct the holder with suppression_token
set) and set that field when _safe_attach_context() returns the suppression
token; update SpanHolder to include suppression_token and modify _end_span() to
detach tokens in reverse attach order: first suppression_token, then the span
attachment token, then association_properties_token so all context attachments
are properly unwound.
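
Why the detach order matters can be seen in a small sketch. It assumes a stdlib `ContextVar` as a stand-in for the OpenTelemetry context, and the three context names are placeholders for the attachments listed in the comment above. Each token restores whatever was current before its own attach, so unwinding oldest-first ends on a stale context.

```python
import contextvars

# Hypothetical stand-in for the OTel context; values are just labels.
_CTX = contextvars.ContextVar("ctx", default="root")


def unwind(reverse):
    """Attach three contexts, detach them, and return what is left current."""
    names = ("association_properties", "span", "suppression")
    tokens = [_CTX.set(name) for name in names]  # attach order
    order = reversed(tokens) if reverse else tokens
    for token in order:
        _CTX.reset(token)  # restores the value from before this token's set()
    return _CTX.get()


def run_isolated(reverse):
    # Run in a copied context so the demonstrations do not affect each other.
    return contextvars.copy_context().run(unwind, reverse)
```

Here `run_isolated(True)` ends back at `"root"`, whereas `run_isolated(False)` ends with `"span"` still current, because the last token detached was created while the span context was active.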

---

Outside diff comments:
In
`@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py`:
- Around line 368-385: in _set_chat_tool_calls, pre-serialized OpenAI
function.arguments (a JSON string) is being passed through json.dumps which
double-encodes it; update the logic in _set_chat_tool_calls to check the type of
tool_args and only call json.dumps(tool_args, cls=CallbackFilteredJSONEncoder)
when tool_args is not a str (i.e., when it is a dict/list), otherwise pass the
string verbatim to _set_span_attribute for f"{tool_call_prefix}.arguments"; use
the existing CallbackFilteredJSONEncoder only for non-string objects and keep
using _set_span_attribute to set the final attribute.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f74a231e-4bad-4a06-864a-f4fd7074d97f

📥 Commits

Reviewing files that changed from the base of the PR and between 93786d9 and 5df3c51.

📒 Files selected for processing (2)
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py
  • packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py

@saivedant169
Contributor Author

Thanks for the thorough analysis @coderabbitai.

The point about _create_llm_span() overwriting the SpanHolder (line 464) is valid, but it's pre-existing behavior that was in the code before this PR — my changes only fix the two orphaned context_api.attach() calls reported in #3526. The _create_llm_span overwrite issue is a separate concern and should be tracked in its own issue if maintainers want it addressed.

The Sourcery review approved the changes. Happy to address any maintainer feedback.

@coderabbitai

coderabbitai Bot commented Mar 16, 2026


@saivedant169 Thank you for the clarification — that's a fair point. The _create_llm_span() SpanHolder overwrite is indeed pre-existing behavior and falls outside the scope of this PR, which is correctly focused on fixing the two orphaned context_api.attach() calls from #3526.

Would you like me to open a separate GitHub issue to track the _create_llm_span() SpanHolder overwrite concern so it doesn't get lost? I can include a backlink to this PR and the relevant context.

@saivedant169 saivedant169 force-pushed the fix/langchain-orphaned-context-attach branch from 5df3c51 to 3c2855c on April 9, 2026 00:53
@saivedant169
Contributor Author

saivedant169 commented Apr 9, 2026

yeah that'd be helpful @CodeRabbit, go ahead and open one. thanks

@coderabbitai

coderabbitai Bot commented Apr 9, 2026


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
gh issue create \
  --repo traceloop/openllmetry \
  --title "fix(langchain): _create_llm_span() overwrites SpanHolder causing potential context token loss" \
  --assignee saivedant169 \
  --body "## Summary

In \`packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py\`, the \`_create_llm_span()\` method overwrites an existing \`SpanHolder\` entry in the span registry without first detaching the context tokens stored in the previous holder. This is pre-existing behavior that predates PR `#3807`.

## Problem

When \`_create_llm_span()\` is called for a run ID that already has a \`SpanHolder\` in the registry, it replaces that entry directly. The old holder's \`token\` (and now \`association_properties_token\` introduced in `#3807`) are lost without being detached, which could silently corrupt the OpenTelemetry context stack in certain execution flows.

## Impact

- Orphaned context attachments similar to the ones fixed in `#3526` / `#3807` may still occur via this code path.
- Downstream spans or logs may inherit incorrect or stale context.

## Related

- Identified during review of PR `#3807` (https://github.com/traceloop/openllmetry/pull/3807), which fixed two other orphaned \`context_api.attach()\` calls.
- Reported by `@coderabbitai`, requested by `@saivedant169` to be tracked separately.

## Suggested Fix

Before overwriting a \`SpanHolder\` entry in \`_create_llm_span()\`, check if an existing holder is present and detach its context tokens (\`token\` and \`association_properties_token\`) before replacing it."

Length of output: 76


🏁 Script executed:

#!/bin/bash
gh issue create \
  --repo traceloop/openllmetry \
  --title "fix(langchain): _create_llm_span() overwrites SpanHolder causing potential context token loss" \
  --body "## Summary

In \`packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py\`, the \`_create_llm_span()\` method overwrites an existing \`SpanHolder\` entry in the span registry without first detaching the context tokens stored in the previous holder. This is pre-existing behavior that predates PR `#3807`.

## Problem

When \`_create_llm_span()\` is called for a run ID that already has a \`SpanHolder\` in the registry, it replaces that entry directly. The old holder's \`token\` (and now \`association_properties_token\` introduced in `#3807`) are lost without being detached, which could silently corrupt the OpenTelemetry context stack in certain execution flows.

## Impact

- Orphaned context attachments similar to the ones fixed in `#3526` / `#3807` may still occur via this code path.
- Downstream spans or logs may inherit incorrect or stale context.

## Related

- Identified during review of PR `#3807` (https://github.com/traceloop/openllmetry/pull/3807), which fixed two other orphaned \`context_api.attach()\` calls.
- Reported by `@coderabbitai`, requested by `@saivedant169` to be tracked separately.

## Suggested Fix

Before overwriting a \`SpanHolder\` entry in \`_create_llm_span()\`, check if an existing holder is present and detach its context tokens (\`token\` and \`association_properties_token\`) before replacing it."

Length of output: 52


@saivedant169 Done! I've opened #3957, "fix(langchain): _create_llm_span() overwrites SpanHolder causing potential context token loss", with a full description of the problem, its impact, a backlink to this PR (#3807), and a suggested fix. I wasn't able to assign it automatically due to integration permissions, but you can assign it yourself directly on the issue page.

saivedant169 added a commit to saivedant169/openllmetry that referenced this pull request Apr 9, 2026
…_create_llm_span

Closes traceloop#3957

When _create_llm_span() is called for a run_id that already has an entry
in self.spans, the old holder's token was lost without being detached,
leaving an orphaned context_api.attach() on the OTel context stack. This
is the same class of bug as traceloop#3526 / traceloop#3807.

Defensively detach the existing holder's token before replacing the entry.
RRocaP added a commit to RRocaP/openllmetry that referenced this pull request Apr 11, 2026
When _create_llm_span() is called for a run_id that already has a
SpanHolder, the existing entry is silently overwritten. The old
holder's context token (and association_properties_token) are lost
without being detached, which corrupts the OpenTelemetry context
stack.

This adds a check before the overwrite: if an existing SpanHolder
is found, its tokens are safely detached before being replaced.

Follows the same pattern used in _end_span() and aligns with the
fixes in traceloop#3526 and traceloop#3807.

Fixes traceloop#3957
saivedant169 added a commit to saivedant169/openllmetry that referenced this pull request Apr 12, 2026
…_create_llm_span

Closes traceloop#3957

When _create_llm_span() is called for a run_id that already has an entry
in self.spans, the old holder's token was lost without being detached,
leaving an orphaned context_api.attach() on the OTel context stack. This
is the same class of bug as traceloop#3526 / traceloop#3807.

Defensively detach the existing holder's token before replacing the entry.
@saivedant169
Contributor Author

Closing this. The orphaned context_api.attach() cleanup landed on main via a different change that wraps the same logic in a _detach_holder_contexts helper and calls it from both _end_span (for the run and its children) and _create_span (when an existing holder is overwritten). Same cases handled, just with the helper-based pattern. Tracking issue #3957 is also addressed there. Thanks for the back-and-forth on this earlier.
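
For readers who have not seen the change on main, the helper-based pattern mentioned above might look roughly like the following. This is a guess at the shape, not the merged code: the body of the real `_detach_holder_contexts` is not shown in this thread, a stdlib `ContextVar` stands in for the OpenTelemetry context, and `detach_holder_contexts`, `register_span` and `demo` are hypothetical names.

```python
import contextvars
from dataclasses import dataclass
from typing import Any, Dict

# Hypothetical stand-in for the OTel context: a tuple of attached labels.
_CTX = contextvars.ContextVar("ctx", default=())


@dataclass
class SpanHolder:
    token: Any = None
    association_properties_token: Any = None


def detach_holder_contexts(holder):
    """Detach every token the holder still owns, newest first."""
    for name in ("token", "association_properties_token"):
        token = getattr(holder, name)
        if token is not None:
            _CTX.reset(token)
            setattr(holder, name, None)  # guard against detaching twice


def register_span(spans, run_id):
    """Create a holder for run_id, unwinding any holder it replaces."""
    existing = spans.get(run_id)
    if existing is not None:
        detach_holder_contexts(existing)
    props_token = _CTX.set(_CTX.get() + ("association_properties",))
    span_token = _CTX.set(_CTX.get() + ("span",))
    spans[run_id] = SpanHolder(span_token, props_token)
    return spans[run_id]


def demo():
    spans: Dict[str, SpanHolder] = {}
    register_span(spans, "run-1")
    register_span(spans, "run-1")  # same run_id: the old holder is unwound first
    detach_holder_contexts(spans.pop("run-1"))
    return _CTX.get()
```

Running `demo` in a copied context returns an empty tuple. Without the `existing` check in `register_span`, the first holder's two attachments would still be current at the end.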



Development

Successfully merging this pull request may close these issues.

🐛 Bug Report: Orphaned context_api.attach() Calls Corrupt OpenTelemetry Context Stack

2 participants