[go-fan] Go Module Review: go.opentelemetry.io/otel #3833

@github-actions

🐹 Go Fan Report: OpenTelemetry Go SDK

Module Overview

go.opentelemetry.io/otel v1.43.0 is the Go implementation of the OpenTelemetry observability framework. This project uses its tracing stack: the core API (otel, otel/trace, otel/attribute, otel/codes, otel/propagation), the SDK (otel/sdk/trace, otel/sdk/resource), the semantic conventions (otel/semconv/v1.26.0), and the OTLP/HTTP exporter (otel/exporters/otlp/otlptrace/otlptracehttp).

Current Usage in gh-aw

The tracing integration lives primarily in internal/tracing/ and is used across the gateway's request pipeline.

  • Files: 5 production files + 1 test file
  • Import Count: 11 distinct OTel sub-packages
  • Key APIs Used: TracerProvider, Tracer.Start, Span.End, Span.SetAttributes, Span.SetStatus, Span.RecordError, otel.SetTracerProvider, otel.GetTextMapPropagator, propagation.NewCompositeTextMapPropagator, sdktrace.NewTracerProvider, resource.New, otlptracehttp.New

Spans Instrumented

Span Name                 Kind       Location
gateway.request           Server     internal/server/http_helpers.go
mcp.tool_call             Internal   internal/server/unified.go
gateway.backend.execute   Client     internal/server/unified.go
proxy.difc_pipeline       Internal   internal/proxy/handler.go
proxy.backend.forward     Client     internal/proxy/handler.go

Research Findings

What's Working Well 🌟

The OTel integration is solid and follows most best practices:

  • Noop provider fallback: When no OTLP endpoint is configured, a noop.NewTracerProvider() is used, so span calls cost almost nothing when tracing is disabled
  • W3C TraceContext propagation: Both TraceContext and Baggage propagators are registered globally, enabling distributed trace continuation from upstream agents
  • Sampler selection: Supports AlwaysSample, NeverSample, and TraceIDRatioBased with configurable rate — correct OTel pattern
  • Batched export: sdktrace.WithBatcher(exporter) is used for async, buffered OTLP export (not WithSyncer)
  • Error recording: Both span.RecordError(err) and span.SetStatus(codes.Error, reason) are used together — correct pattern
  • SpanKinds: Correctly set to Server, Internal, or Client based on role
  • Test infra: tracetest.InMemoryExporter and sdktrace.NewSimpleSpanProcessor used in unit tests — idiomatic

Recent Updates (v1.43.0)

OTel Go 1.43.0 is recent and stable. Key features available:

  • trace/noop package (stable) — already in use ✅
  • resource.WithTelemetrySDK() — adds SDK version info to resource
  • semconv/v1.26.0 stable HTTP conventions (http.request.method, url.path, http.response.status_code)
  • sdktrace.WithRawSpanLimits() for fine-grained span attribute/event limits

Improvement Opportunities

🏃 Quick Wins

1. Add semconv.ServiceVersion to resource (internal/tracing/provider.go)

The service version is available via version.Get() but is not included in the OTel resource. Tracing backends (Jaeger, Honeycomb, Datadog) use service version for deployment-aware tracing.

// In provider.go, import internal/version
res, err := resource.New(ctx,
    resource.WithAttributes(
        semconv.ServiceName(serviceName),
        semconv.ServiceVersion(version.Get()),  // ← add this
    ),
    resource.WithProcessPID(),
    resource.WithHost(),
)

2. Add resource.WithSchemaURL(semconv.SchemaURL)

Per OTel spec, resources that use semconv attributes should declare their SchemaURL so backends can interpret them correctly. The semconv package already exports semconv.SchemaURL.

res, err := resource.New(ctx,
    resource.WithSchemaURL(semconv.SchemaURL),   // ← add this
    resource.WithAttributes(
        semconv.ServiceName(serviceName),
        semconv.ServiceVersion(version.Get()),
    ),
    resource.WithProcessPID(),
    resource.WithHost(),
)

3. Use semconv constants for HTTP span attributes

Multiple call sites use raw attribute strings that don't match the stable semconv v1.26.0 HTTP conventions:

// Current (raw strings — not semconv):
attribute.String("http.method", r.Method)
attribute.String("http.path", r.URL.Path)
attribute.Int("http.status_code", httpStatusCode)

// Better (stable semconv v1.26.0 constants, already imported):
semconv.HTTPRequestMethodKey.String(r.Method)
semconv.URLPathKey.String(r.URL.Path)
semconv.HTTPResponseStatusCodeKey.Int(httpStatusCode)

Affected files: internal/server/http_helpers.go, internal/server/unified.go, internal/proxy/handler.go, internal/tracing/http.go.
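Dashboards, alerts, and test assertions that query the old keys will need the same rename. The lookup table below summarizes the three renames quoted above. migrateKey is a hypothetical helper for illustration, not part of the OTel API or of gh-aw.

```go
package main

import "fmt"

// stableHTTPKeys maps the raw attribute keys currently in use to the stable
// semconv v1.26.0 names listed in this report.
var stableHTTPKeys = map[string]string{
	"http.method":      "http.request.method",
	"http.path":        "url.path",
	"http.status_code": "http.response.status_code",
}

// migrateKey returns the stable name for a legacy key, or the key unchanged
// when no rename applies (for example, project-specific keys).
func migrateKey(key string) string {
	if stable, ok := stableHTTPKeys[key]; ok {
		return stable
	}
	return key
}

func main() {
	for _, k := range []string{"http.method", "http.path", "http.status_code", "session.id"} {
		fmt.Printf("%s -> %s\n", k, migrateKey(k))
	}
}
```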

4. Cache tracer in unified.go and proxy/handler.go

Both files call tracing.Tracer() inside the request handler closure, which invokes otel.Tracer(instrumentationName) on every request. Caching at construction time is the established pattern used in http_helpers.go:

// Current (called per-request in unified.go and proxy/handler.go):
ctx, toolSpan := tracing.Tracer().Start(ctx, "mcp.tool_call", ...)

// Better (cache at server construction, like http_helpers.go does):
t := tracing.Tracer()
// ...inside handler:
ctx, toolSpan := t.Start(ctx, "mcp.tool_call", ...)

✨ Feature Opportunities

5. Add resource.WithTelemetrySDK()

OTel recommends including SDK version information in the resource for correlating telemetry with SDK upgrade effects:

res, err := resource.New(ctx,
    resource.WithTelemetrySDK(),   // ← adds telemetry.sdk.name, telemetry.sdk.language, telemetry.sdk.version
    resource.WithSchemaURL(semconv.SchemaURL),
    // ... existing options
)

6. Add span events for tool call phase boundaries

The mcp.tool_call span in unified.go is described as "spanning all phases 0–6" but no events mark the boundaries between phases. Span events are the OTel-native way to capture lifecycle milestones within a span:

toolSpan.AddEvent("guard.evaluation.start")
// ... guard eval logic ...
toolSpan.AddEvent("guard.evaluation.done", oteltrace.WithAttributes(
    attribute.Bool("guard.allowed", allowed),
))
toolSpan.AddEvent("backend.execute.start")
// ... backend execution ...

This lets a reader of a single trace see which phase of a tool call accounts for its latency, without adding a child span per phase.

📐 Best Practice Alignment

7. Consolidate WrapHTTPHandler and WithOTELTracing

Two nearly-identical span-wrapping helpers exist:

  • tracing.WrapHTTPHandler in internal/tracing/http.go — generic, uses explicit tracer, logs per-request
  • server.WithOTELTracing in internal/server/http_helpers.go — adds session.id post-request, used in production middleware stack

They perform the same W3C context extraction + span creation + SpanKindServer pattern. Consider having WithOTELTracing delegate to WrapHTTPHandler (with extra attrs for session.id), or consolidate into a single well-tested implementation.

Recommendations

Prioritized by impact-to-effort ratio:

  1. 🥇 Add semconv.ServiceVersion to resource — 1 line change, high diagnostic value, uses existing version.Get()
  2. 🥇 Add resource.WithSchemaURL(semconv.SchemaURL) — 1 line change, correctness improvement
  3. 🥈 Use semconv constants for HTTP attributes — straightforward refactor across 4 files, improves backend compatibility
  4. 🥈 Cache tracing.Tracer() at construction — minor cleanup, aligns with existing pattern
  5. 🥉 Add resource.WithTelemetrySDK() — low-effort, adds observability metadata
  6. 🎯 Span events for tool call phases — higher effort but significantly improves tracing utility for perf analysis
  7. 🎯 Consolidate span helpers — reduces duplication, improves test coverage

Next Steps

  • Apply quick wins (items 1–4) in a single PR — all in internal/tracing/provider.go and span call sites
  • Evaluate span events for mcp.tool_call phases as a separate observability improvement
  • Consider consolidating WrapHTTPHandler / WithOTELTracing as tech debt cleanup

Generated by Go Fan 🐹 · Round-robin selection after github.com/tetratelabs/wazero
Module summary saved to: specs/mods/go-opentelemetry-otel.md (pending write access)
Run: §24442419216

Note

🔒 Integrity filter blocked 11 items

The following items were blocked because they don't meet the GitHub integrity level.

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Expires on Apr 22, 2026, 7:50 AM UTC
