Conversation
|
|
||
| """ | ||
| The OpenTelemetry tracing API describes the classes used to generate | ||
| distributed traces. |
There was a problem hiding this comment.
These docs are WIP, I just wanted a good example module docstring here to test the generated html docs.
opentelemetry/api/trace/__init__.py
Outdated
|
|
||
| parent = deserialize_span(serialized_span) | ||
| # Explicit parent span assignment | ||
| with tracer.span("child", parent=parent) as child: |
There was a problem hiding this comment.
How would the implicit one and explicit one work together in a mixed mode?
For example, I have a function A which is using implicit in-proc propagation, another function B which is using explicit propagation, if I need to call A from B, what should I do?
There was a problem hiding this comment.
I think functions that want to support both styles (e.g. in libraries) will need to add an optional parameter for the parent Span that defaults to the current one.
There was a problem hiding this comment.
Going down this path, I would imagine either all the libraries end up having parent span as the optional parameter for all the exposed functions, or most libraries will only support implicit context and makes it super difficult for the application developer to use explicit context.
There was a problem hiding this comment.
Supporting both implicit and explicit context like this is going to make for an ugly implementation. The spec should probably describe what happens for each, and I think there's good reason to separate the span constructors for each. We may need to solve this in the spec.
There was a problem hiding this comment.
So honestly adding support for explicit parent at the core API shouldn't look that bad - for libraries (i.e. Django instrumentation), I think it's fine to make it super difficult for the user to override it, or even do not expose it at all IHMO.
| do_work(span=child) | ||
|
|
||
| Applications should generally use a single global tracer, and use either | ||
| implicit or explicit context propagation consistently throughout. |
There was a problem hiding this comment.
How about libraries? The application could use explicit propagation consistently, while calling into a library which expects implicit context propagation.
There was a problem hiding this comment.
Yeah, we definitely should think about and document best practices for what libs should do. For that question, see https://github.com/open-telemetry/opentelemetry-python/pull/11/files#r290703075. Another question is what tracer they should use (e.g.: Use global tracer? Pass tracer at every call? Provide way to set library-wide tracer? Provide way to set a library-wide callback function that it calls to get the tracer?)
There was a problem hiding this comment.
Best-practices aside, users should have the final thing to say - if they, for some reason, need to 'mix' these modes (as you guys call it), that should be supported IHMO.
(As for the other questions, I'd say using a global Tracer would be a good thing to have)
There was a problem hiding this comment.
The application could use explicit propagation consistently, while calling into a library which expects implicit context propagation.
This is a problem since it would mean that passing an explicit context should change the implicit one. This would be unexpected in the application, but required if it's calling into a library that uses implicit context.
Provide way to set a library-wide callback function that it calls to get the tracer?
I thought implementations would provide a global tracer, do you want the callback so integrations can change this?
opentelemetry/api/trace/__init__.py
Outdated
| parent: This span's parent. | ||
|
|
||
| Raises: | ||
| ValueError: if ``name`` is null. |
There was a problem hiding this comment.
Do we only check if name is null or we expect more validations?
There was a problem hiding this comment.
I'd personally advocate for using a generic name here, instead of failing.
There was a problem hiding this comment.
OK, it sounds like implementations should provide a safe default here. In general, do you expect implementations to be more or less permissive than the API? I.e. should implementations be required to throw errors we list in the API? Should they be allowed to throw errors we don't list?
There was a problem hiding this comment.
In general, do you expect implementations to be more or less permissive than the API? I.e. should implementations be required to throw errors we list in the API?
Yes, I bet almost all implementations will have more restriction than the API would allow.
Should they be allowed to throw errors we don't list?
Probably not. One design principle we may want to follow is "the monitoring SDK will try to keep silent instead of changing the application behavior". In this spirit I guess we will have exporters doing some downcast/trimming based on their specific need, and not throw exceptions.
There was a problem hiding this comment.
In that case: do we want to allow implementations to throw here on null/blank name by including it in the API or are we ok with forbidding them from throwing by removing it?
There was a problem hiding this comment.
I bet almost all implementations will have more restriction than the API would allow
This suggests that implementations will throw errors we don't list in the API, or am I missing something?
FWIW the java API says to raise a NPE here. It might be worth taking this conversation there (or to the spec repo).
Using a safe default span name sounds good to me as long as we get the same behavior in all clients, or decide that this is "unspecified behavior" and that different clients can do this differently.
There was a problem hiding this comment.
Personally I'm in favor of option 1.
I'm also in favor one. That being said, this can also be mentioned/discussed in the Specification ;)
There was a problem hiding this comment.
I'd also prefer option 1. Option 3 (generating a random span name) is no good IMHO, because span names should be low-cardinality.
There was a problem hiding this comment.
I added open-telemetry/opentelemetry-specification#141 to track this in specs.
There was a problem hiding this comment.
It sounds like we should list any error implementations could throw in the api method docstring. Safe default names sound good to me, but let's clarify this in the spec first. I added #18 to track this.
opentelemetry/api/trace/__init__.py
Outdated
| """ | ||
| raise NotImplementedError | ||
|
|
||
| def create_span(self, name: str, parent: Span=None) -> Span: |
There was a problem hiding this comment.
How to create a root span which has no parent?
There was a problem hiding this comment.
True. In fact, I think its strange that create_span(..., parent=None) should have a non-None parent. One suggestion: create_span should just never affect or use (as parent) the active span. But instead, enhance the context manager returned by span to return the created span at __enter__.
There was a problem hiding this comment.
I can think of a few options here: use a sentinel default value instead of null and take parent=None to mean that we should create a root span, add a separate method create_root_span, or change the way we create spans and match java's builder pattern more closely.
create_spanshould just never affect or use (as parent) the active span
That's pretty surprising, I'd expect creating a child of the current active span to be the most common use case.
But instead, enhance the context manager returned by
spanto return the created span at__enter__
That's what I'd expect the context manager to do now.
|
After our meeting today, I'm thinking that maybe we should start out with only explicit context propagation/parent span specification. The implicit propagation (which requires thinking about a definition of an active trace context or span per thread/coroutine/"scope"/??? and whether the tracer should be responsible for it or some separate component that is one layer below or next to the tracer) is probably one of the most difficult APIs to get right. |
Co-Authored-By: Christian Neumüller <cn00@gmx.at>
…n into tracer-api-init
| is equivalent to:: | ||
|
|
||
| span = tracer.create_span(name, parent=tracer.get_current_span()) | ||
| with tracer.use_span(span): |
There was a problem hiding this comment.
Maybe I've missed something, I was thinking a different approach:
Implicit
with tracer.span(name):
do_work()Explicit
span = tracer.span(name)
span.start()
do_work()
span.finish()And in the span class we will have __enter__ and __exit__ (or __aenter__ and __aexit__).
There was a problem hiding this comment.
That's a nice interface, but what do you do about the fact that we need to modify the context if span is used as a context manager and not otherwise?
| raise NotImplementedError | ||
|
|
||
| @contextmanager | ||
| def use_span(self, span: Span) -> Iterator[None]: |
There was a problem hiding this comment.
For languages which don't have the with or using concept, it might be good to have a dedicated API. For Python, with span (Span.__enter__ and Span.__exit__) could be a good replacement for use_span?
There was a problem hiding this comment.
There was a problem hiding this comment.
I agree if we can find a nice way to handle modifying the context.
One argument for biting the bullet and figuring out implicit propagation now is that it will affect the API for explicit propagation too. Another is that even in the "explicit propagation" case I'm assuming we don't handle a context object in this API; we just make sure not to touch the implicit context. Another is that hashing out the tracing/context interaction is one of the reasons for this PR. :D
My working assumption is that our context will be a |
reyang
left a comment
There was a problem hiding this comment.
Per our discussion during the meeting, let's get this merged first. We can make further improves on namespace, package layout, improve span creation and global tracer registration.
Carry the final autoresearch tree as one reviewable change that includes the trace benchmark harness, SDK trace hot-path reductions, empty-context activation fast path, direct BoundedAttributes initialization, and focused validation updates. Experiments: open-telemetry#4, open-telemetry#5, open-telemetry#6, open-telemetry#7, open-telemetry#8, open-telemetry#9, open-telemetry#10, open-telemetry#11, open-telemetry#12, open-telemetry#14, open-telemetry#15, open-telemetry#16, open-telemetry#17, open-telemetry#18, open-telemetry#19 Metric: combined_mean_us 44.469470µs → 22.762846µs (-48.8%)
This PR adds the tracing API package and proposes some conventions for the package structure, documentation, and style of the project. As it is now, the API only describes the required classes for using the tracer to record spans. Propagation formats, resources, and the context API are all out of scope for this PR.
The API here borrows from:
.../docs/io/opentelemetry/trace/Tracer.html)The OT java API is our current best source of truth. Ideally we can use this PR to prove the java API and push the discussion here up to the spec.