Describe the feature
When operating the vLLM production stack with an upstream service (e.g., a semantic router or API gateway), there's no visibility into the router's request handling in distributed traces. The vLLM engine already supports OpenTelemetry tracing, configurable via environment variables, but the router creates a gap in the trace: we can see the upstream service and the engine, but not the routing layer in between.
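To illustrate, here is a minimal sketch of what closing this gap could look like on the router side, using the standard OpenTelemetry Python API. This is not an existing production-stack API; the function and attribute names are illustrative. It assumes the engine already exports spans (e.g., via the standard `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT` environment variable) and that an exporter for the router is configured elsewhere. The idea: extract the parent trace context from the incoming request, open a router span, and inject the context into the request forwarded to the engine so its spans join the same trace.

```python
# Hypothetical router-side trace propagation sketch (names are illustrative).
# Assumes opentelemetry-api / opentelemetry-sdk and an OTLP exporter
# configured elsewhere (e.g., via OTEL_EXPORTER_OTLP_ENDPOINT).
import httpx
from opentelemetry import trace, propagate

tracer = trace.get_tracer("vllm.router")  # hypothetical instrumentation name


async def route_request(
    request_headers: dict, body: bytes, backend_url: str
) -> httpx.Response:
    # Continue the trace started by the upstream service (semantic router / gateway)
    # instead of starting a disconnected one.
    parent_ctx = propagate.extract(request_headers)
    with tracer.start_as_current_span("router.route_request", context=parent_ctx) as span:
        span.set_attribute("router.backend", backend_url)  # illustrative attribute
        outgoing_headers = dict(request_headers)
        # Inject the current span context into the forwarded request so the
        # vLLM engine's spans attach to the same trace.
        propagate.inject(outgoing_headers)
        async with httpx.AsyncClient() as client:
            return await client.post(backend_url, content=body, headers=outgoing_headers)
```

With something like this in place, a single trace would show upstream service → router → engine end to end, rather than two disconnected fragments.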
Why do you need this feature?
No response
Additional context
Better visibility / distributed tracing support for an inference system in production!