
fix: batch spans before OTLP export to preserve parent-child relationships#290

Open
devin-ai-integration[bot] wants to merge 6 commits into federated-sdk-release-candidate from devin/1772446166-fix-span-batching

Conversation


@devin-ai-integration devin-ai-integration bot commented Mar 2, 2026

fix: batch spans before OTLP export to preserve parent-child relationships

Summary

HoneyHiveSpanProcessor._send_via_otlp() was calling self.otlp_exporter.export([span]) for every span individually as it completed. This meant each span arrived at the backend as a 1-span batch. The backend's resolveParentIdsFromSpans() can only resolve parent-child links within the same batch, so with single-span batches the parent lookup never succeeded — all events defaulted to parent_id=session_id, producing a completely flat tree.

Changes:

  • span_processor.py: _send_via_otlp() now buffers spans in a thread-safe _span_buffer instead of exporting immediately. A periodic flush timer (5s) and max buffer cap (512 spans) prevent unbounded growth. _flush_span_buffer() exports all buffered spans as a single batch, only clearing the buffer after a successful export. Called by force_flush(), shutdown(), the periodic timer, and when the buffer reaches capacity. The disable_batch=True (immediate) mode still exports spans one at a time.
  • otlp_exporter.py: _spans_to_otlp_json_payload() now groups spans by instrumentation scope into separate scopeSpans entries, instead of putting all spans under a single scope. This produces correct OTLP JSON when a batch contains spans from multiple instrumentors (e.g. autogen-core + openinference.instrumentation.openai).
  • autogen_integration.py: Removed debug scaffolding (capture_spans import/usage, verbose=True) from the public example.
  • Unit tests (tests/ and tests_v2/): Updated to verify new buffering behavior — spans are not exported immediately on on_end, but are held in _span_buffer and exported on explicit _flush_span_buffer().
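The buffer-then-flush scheme described above can be sketched as follows. This is a hedged illustration, not the SDK's actual code: only `_span_buffer`, `_flush_span_buffer`, the 512-span cap, and the 5 s timer come from the PR description; the class name, constants, and generic `exporter` argument are assumptions.

```python
import threading
from typing import List

class BufferingSpanProcessorSketch:
    """Illustrative buffer-then-flush processor; not the SDK's real class."""

    MAX_BUFFER = 512       # cap from the PR description
    FLUSH_INTERVAL = 5.0   # periodic flush timer from the PR description

    def __init__(self, exporter):
        self.exporter = exporter                 # anything with .export(spans)
        self._span_buffer: List[object] = []
        self._buffer_lock = threading.Lock()
        self._timer = None
        self._schedule_flush()

    def _schedule_flush(self):
        self._timer = threading.Timer(self.FLUSH_INTERVAL, self._on_timer)
        self._timer.daemon = True
        self._timer.start()

    def _on_timer(self):
        self._flush_span_buffer()
        self._schedule_flush()

    def on_end(self, span):
        with self._buffer_lock:
            self._span_buffer.append(span)       # buffer, don't export yet
            full = len(self._span_buffer) >= self.MAX_BUFFER
        if full:
            self._flush_span_buffer()            # cap reached: flush now

    def _flush_span_buffer(self) -> bool:
        with self._buffer_lock:
            if not self._span_buffer:
                return True
            spans = list(self._span_buffer)      # copy; clear only on success
        if not self.exporter.export(spans):      # single-batch export
            return False
        with self._buffer_lock:
            # drop only what was exported; spans added meanwhile survive
            self._span_buffer[: len(spans)] = []
        return True

    def force_flush(self) -> bool:
        return self._flush_span_buffer()

    def shutdown(self):
        if self._timer:
            self._timer.cancel()                 # stop the periodic timer
        self._flush_span_buffer()
```

Because parents and children sit in the same exported list, the backend's batch-scoped parent resolution can link them.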

Root cause discovered during autogen-agentchat v0.7.5 integration validation: 73/81 raw spans had correct parent_span_id, but 81/82 ingested events had parent_id=session_id.

End-to-end validation result: After the fix, 42/78 ingested events now have proper parent-child nesting (vs 0/82 before). The remaining 36 root-level events are expected — they represent top-level test function spans and spans that crossed flush-timer boundaries.

Review & Testing Checklist for Human

  • Cross-framework regression: This changes default behavior for ALL frameworks using the OTLP export path (not just autogen). Verify that existing integrations (OpenAI, Anthropic, LangGraph, etc.) still produce correct traces — their spans will now be deferred until flush.
  • Partial nesting (42/78 not 78/78): The 5-second flush timer means spans that arrive after a timer-triggered flush but before the next one won't share a batch with their parents. Verify this is acceptable, or consider whether force_flush() should be called more aggressively (e.g. after each top-level trace completes).
  • disable_batch=True path: Verify this mode still works as expected (immediate single-span export, no buffering). This path is not exercised in the integration test.
  • Data loss on crash: Buffered spans are lost if the process is killed (SIGKILL) before the 5s timer fires or shutdown() is called. The 5s timer + 512-span cap mitigate this but don't eliminate it.

Recommended test plan:

  1. Run an existing OpenAI or Anthropic integration example and verify traces appear nested (not flat) in the HoneyHive UI
  2. Run with disable_batch=True and confirm spans still export correctly
  3. Confirm force_flush() drains the buffer when called explicitly

Notes

  • Requested by: unknown ()
  • Link to Devin Session
  • CI status: 12 passing checks (Python 3.12, 3.13, Docker, Lambda, docs). One flaky performance test failure (test_tracing_minimal_overhead_integration CV 379% vs 300% threshold) unrelated to these changes and not required for merge.

fix: batch spans before OTLP export to preserve parent-child relationships

The HoneyHiveSpanProcessor was exporting each span individually via
otlp_exporter.export([span]), creating 1-span batches. The backend's
resolveParentIdsFromSpans() can only resolve parent-child links within
the same batch, so with single-span batches parent matching never
succeeded and all events defaulted to parent_id=session_id (flat tree).

Changes:
- span_processor.py: Buffer spans in _span_buffer during on_end(),
  flush all buffered spans as a single batch in _flush_span_buffer()
  which is called by force_flush() and shutdown(). Immediate mode
  (disable_batch=True) still exports spans individually.
- otlp_exporter.py: Group spans by instrumentation scope in
  _spans_to_otlp_json_payload() so multi-scope batches produce
  correct OTLP JSON with separate scopeSpans entries.

Co-Authored-By: unknown <>
@devin-ai-integration devin-ai-integration bot requested a review from a team March 2, 2026 10:12

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring


snyk-io bot commented Mar 2, 2026

Snyk checks have passed. No issues have been found so far.

Scanner                 Critical  High  Medium  Low  Total
Open Source Security    0         0     0       0    0 issues
Licenses                0         0     0       0    0 issues



claude bot commented Mar 2, 2026

Claude finished @devin-ai-integration[bot]'s task in 2m 19s


  • Read changed files and gather context
  • Analyze span_processor.py changes
  • Analyze otlp_exporter.py changes
  • Post inline review comments (4 inline comments)
  • Post top-level review summary

Review Complete

Reviewed both changed files. The core approach (batching spans before OTLP export) is correct, but there are two must-fix issues to address before merge:

  1. Unbounded buffer growth — no max size cap or periodic flush timer means memory will grow indefinitely in long-running processes
  2. Data loss on export failure — buffer is cleared before confirming export success, and the export result is never checked

See inline comments for details and suggested fixes.


github-actions bot commented Mar 2, 2026

📚 Documentation preview built — Download artifact

Review instructions & validation status

How to Review

  1. Download the artifact from the link above
  2. Extract the files
  3. Open index.html in your browser

Validation Status

  • API validation: ✅ Passed
  • Build process: ✅ Successful
  • Import tests: ✅ All imports working

Comment on lines +878 to +904
try:
    with self._buffer_lock:
        if not self._span_buffer:
            return True
        spans_to_flush = list(self._span_buffer)
        self._span_buffer.clear()

    if not self.otlp_exporter:
        self._safe_log("warning", "No OTLP exporter for flush")
        return False

    self._safe_log(
        "info",
        "Flushing %d buffered spans as single OTLP batch",
        len(spans_to_flush),
    )

    result = self.otlp_exporter.export(spans_to_flush)

    if hasattr(result, "name"):
        self._safe_log(
            "debug",
            "Batch OTLP export result: %s (span_count=%d)",
            result.name,
            len(spans_to_flush),
        )
    return True
Contributor
Bug: Data loss on export failure & export result ignored

Two issues here:

  1. Spans cleared before export succeeds (line 883): The buffer is cleared before export() is called (line 895). If export() raises an exception or returns FAILURE, those spans are permanently lost. The buffer should only be cleared after confirming a successful export.

  2. Export result not checked (line 904): The method always returns True after calling export, regardless of whether the export actually succeeded (SpanExportResult.SUCCESS vs FAILURE). A failed export should return False or re-buffer the spans.

Suggested change
Current:

try:
    with self._buffer_lock:
        if not self._span_buffer:
            return True
        spans_to_flush = list(self._span_buffer)
        self._span_buffer.clear()
    if not self.otlp_exporter:
        self._safe_log("warning", "No OTLP exporter for flush")
        return False
    self._safe_log(
        "info",
        "Flushing %d buffered spans as single OTLP batch",
        len(spans_to_flush),
    )
    result = self.otlp_exporter.export(spans_to_flush)
    if hasattr(result, "name"):
        self._safe_log(
            "debug",
            "Batch OTLP export result: %s (span_count=%d)",
            result.name,
            len(spans_to_flush),
        )
    return True

Suggested:

try:
    with self._buffer_lock:
        if not self._span_buffer:
            return True
        spans_to_flush = list(self._span_buffer)
        self._span_buffer.clear()
    if not self.otlp_exporter:
        self._safe_log("warning", "No OTLP exporter for flush")
        return False
    self._safe_log(
        "info",
        "Flushing %d buffered spans as single OTLP batch",
        len(spans_to_flush),
    )
    result = self.otlp_exporter.export(spans_to_flush)
    if hasattr(result, "name"):
        self._safe_log(
            "debug",
            "Batch OTLP export result: %s (span_count=%d)",
            result.name,
            len(spans_to_flush),
        )
    from opentelemetry.sdk.trace.export import SpanExportResult
    if result != SpanExportResult.SUCCESS:
        self._safe_log(
            "error",
            "Batch OTLP export failed, %d spans lost",
            len(spans_to_flush),
        )
        return False
    return True

Ideally on failure the spans should be re-added to the buffer for retry, but at minimum the result should be checked and reported honestly.
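The re-buffering idea could look roughly like the following sketch. It assumes the processor exposes `_span_buffer`, `_buffer_lock`, and an `otlp_exporter`, and uses string stand-ins for `SpanExportResult`; names mirror the diff but are illustrative.

```python
import threading

SUCCESS, FAILURE = "SUCCESS", "FAILURE"   # stand-ins for SpanExportResult members

def flush_span_buffer(processor) -> bool:
    """Flush that keeps spans on failure: the buffer is cleared only
    after a confirmed successful export."""
    with processor._buffer_lock:
        if not processor._span_buffer:
            return True
        spans_to_flush = list(processor._span_buffer)   # copy; do NOT clear yet
    try:
        result = processor.otlp_exporter.export(spans_to_flush)
    except Exception:
        return False                      # buffer untouched; retried next flush
    if result != SUCCESS:
        return False                      # spans kept for the next attempt
    with processor._buffer_lock:
        # remove only the exported prefix; spans added during export survive
        processor._span_buffer[: len(spans_to_flush)] = []
    return True
```

A transiently failing exporter then loses nothing: the same spans are simply re-exported on the next timer tick or explicit flush.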

Contributor Author

Addressed in f179131:

  1. Buffer only cleared after successful export: Moved self._span_buffer.clear() to after SpanExportResult.SUCCESS check. New spans arriving during export are preserved via slice assignment.
  2. Export result now checked: _flush_span_buffer() returns False when export() doesn't return SUCCESS, and logs the failure.

Comment on lines +196 to +206
# Group spans by instrumentation scope name
scope_groups: Dict[str, List[ReadableSpan]] = {}
for span in spans:
    scope_name = "unknown"
    if (
        hasattr(span, "instrumentation_scope")
        and span.instrumentation_scope
        and span.instrumentation_scope.name
    ):
        scope_name = span.instrumentation_scope.name
    scope_groups.setdefault(scope_name, []).append(span)
Contributor

Nit: Good improvement — correct OTLP grouping by scope

This is a solid fix. The old code lumped all spans under a single instrumentation scope, which produced incorrect OTLP payloads when spans from different instrumentors (e.g., autogen-core + openinference.instrumentation.openai) were batched together.

One minor suggestion: consider using getattr chaining instead of nested hasattr checks for slightly cleaner code:

scope = getattr(span, "instrumentation_scope", None)
scope_name = getattr(scope, "name", None) or "unknown"
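Applied to the whole grouping loop, the suggestion looks like this sketch; `SimpleNamespace` objects stand in for `ReadableSpan`, and the function name is illustrative:

```python
from types import SimpleNamespace

def group_by_scope(spans):
    """Bucket spans per instrumentation scope, defaulting missing scopes to 'unknown'."""
    scope_groups = {}
    for span in spans:
        scope = getattr(span, "instrumentation_scope", None)
        scope_name = getattr(scope, "name", None) or "unknown"
        scope_groups.setdefault(scope_name, []).append(span)
    return scope_groups

spans = [
    SimpleNamespace(instrumentation_scope=SimpleNamespace(name="autogen-core")),
    SimpleNamespace(instrumentation_scope=SimpleNamespace(name="openinference.instrumentation.openai")),
    SimpleNamespace(instrumentation_scope=None),   # falls back to "unknown"
]
groups = group_by_scope(spans)
```

Each key of `groups` then becomes one `scopeSpans` entry in the OTLP JSON payload.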


claude bot commented Mar 2, 2026

Review Summary

The core idea is sound — batching spans before OTLP export so the backend can resolve parent-child relationships within a single request is the correct fix for the flat-tree problem. The scope-grouping fix in otlp_exporter.py is also a good improvement that produces correct OTLP JSON for multi-instrumentor batches.

However, the current implementation has issues that should be addressed before merging:

Must Fix

  • Unbounded buffer growth (P0): The _span_buffer has no max size cap and no periodic flush timer. For long-running processes (web servers, daemons), this is effectively a memory leak. All spans accumulate in memory until force_flush()/shutdown(), which may never be called promptly. This is a significant behavioral regression from the prior per-span export. At minimum, add a _max_buffer_size threshold that triggers an automatic flush (e.g., 512 spans) and/or a periodic flush timer (e.g., every 5s).

  • Data loss on export failure: In _flush_span_buffer(), the buffer is cleared (line 883) before export() is called (line 895). If the export fails or throws, those spans are permanently lost. Additionally, the export result (SpanExportResult) is never checked — the method always returns True.

Should Fix

  • No unit tests: The buffering logic (_flush_span_buffer, buffer accumulation, thread safety, disable_batch=True path) has zero test coverage. Given this changes the default export behavior for all frameworks, tests are important before merging.

Notes

  • The otlp_exporter.py scope-grouping change is clean and correct. Minor suggestion to use getattr chaining for slightly cleaner code.
  • The shutdown lifecycle calls _flush_span_buffer() 2-3 times redundantly (once via force_flush, once via shutdown, potentially a retry). This is benign but worth documenting.
  • Nice cleanup removing the emoji prefixes from log messages in _send_via_otlp.

Documentation

No public API surface changed, so no docs updates are needed.

- Add _max_buffer_size (512) threshold that triggers automatic flush
- Add periodic flush timer (5s) to prevent unbounded buffer growth
- Only clear buffer after confirmed successful export (SpanExportResult.SUCCESS)
- Cancel flush timer during shutdown to avoid resource leaks
- New spans arriving during export are preserved (not dropped)

Addresses code review feedback:
- P0: Unbounded buffer growth in long-running processes
- Bug: Data loss on export failure (buffer cleared before export confirmed)
- Bug: Export result never checked

Co-Authored-By: unknown <>


Three tests expected immediate export on _send_via_otlp in batched mode.
Updated to verify spans are buffered first, then exported on flush,
matching the new batch export implementation.

Co-Authored-By: unknown <>
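The updated test shape described in this commit message might look like the sketch below; `FakeExporter` and `BufferingStub` are illustrative doubles, not the SDK's real test fixtures:

```python
class FakeExporter:
    """Records each exported batch instead of sending it anywhere."""
    def __init__(self):
        self.batches = []
    def export(self, spans):
        self.batches.append(list(spans))
        return True

class BufferingStub:
    """Minimal double of the processor's batched mode."""
    def __init__(self, exporter):
        self.exporter = exporter
        self._span_buffer = []
    def on_end(self, span):
        self._span_buffer.append(span)        # buffered, not exported
    def _flush_span_buffer(self):
        if self._span_buffer:
            self.exporter.export(self._span_buffer)
            self._span_buffer = []

exporter = FakeExporter()
proc = BufferingStub(exporter)
proc.on_end("parent")
proc.on_end("child")
assert exporter.batches == []                      # no export on on_end
proc._flush_span_buffer()
assert exporter.batches == [["parent", "child"]]   # exported on flush
```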

Four tests expected immediate export in batched mode. Updated to verify
spans are buffered first, then exported on flush.

Co-Authored-By: unknown <>

