As part of the 2018/04/10 call, I was asked to research whether the popular open tracing standards could intersect with our thoughts on correlation. I am opening this bug (to be auto-closed within a week) to document the cursory research.
I looked at the W3C Distributed Tracing spec, its repo, and the OpenTracing project. As far as I can tell, OpenTracing for distributed events is a layer on top of Distributed Tracing (it reuses the same headers over HTTP).
I strongly believe that we should integrate OpenTracing into our spec, though I have some reservations about using its fields for the correlation context I outlined in #195:
- Sequential Correlation:
- Tracing APIs will allow visualization of sequencing in cascading events, but do not help us order events across requests.
- The traceparent header is freely changed at cloud boundaries. Clouds use the opaque (and 512B-limited) tracestate header to restore traceparent when a request reenters a cloud, if necessary.
- Tracing is enabled on a per-request basis. The Distributed Tracing spec has an option to suggest recipients also trace that request, though it is a MAY requirement rather than a MUST requirement for many reasons, including security.
- Structured Correlation:
- All candidates come from OpenTracing, not the vanilla Distributed Tracing spec.
- The suggested concept was OpenTracing's Span Tags. These are not a good fit because they aren't propagated across hops.
- Baggage items are a slightly better fit, though they take the opposite stance on propagation: if an action triggers another event, that second event would contain all the baggage items of the first event (feature or bug?).
- OpenTracing is an extremely abstract spec. For example, the Go version of OpenTracing only has mock support for baggage items, and it seems that any key/value label pair is expressed as a raw HTTP header.
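
To make the tag versus baggage distinction above concrete, here is a minimal sketch in Python. It is a toy model, not the OpenTracing API: the `Span` class, its `child` method, and the event names and values are all hypothetical and exist only to illustrate the propagation behaviour described in the list.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Span:
    """Toy stand-in for a tracing span (hypothetical, not the OpenTracing API)."""
    name: str
    tags: Dict[str, str] = field(default_factory=dict)     # stay on this span only
    baggage: Dict[str, str] = field(default_factory=dict)  # copied to every child

    def child(self, name: str) -> "Span":
        # Baggage is inherited by descendants; tags are not carried over.
        return Span(name=name, baggage=dict(self.baggage))


# Hypothetical chain of cascading events.
first = Span("order.created")
first.tags["label"] = "checkout"
first.baggage["correlation"] = "cart-42"

second = first.child("payment.requested")
third = second.child("receipt.sent")

print(third.tags)     # {}
print(third.baggage)  # {'correlation': 'cart-42'}
```

Under these assumptions, a label stored as a tag is lost after the first hop, while a label stored as baggage shows up on every downstream event whether or not that event is meant to be correlated with the first one.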