-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-15044: [C++] Add OpenTelemetry exporters for debugging use #11925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Example of ostream output: Example of OTLP output: |
| namespace sdktrace = opentelemetry::sdk::trace; | ||
|
|
||
| // Custom JSON stdout exporter. Leverages the OTLP HTTP exporter's | ||
| // utilities to log the same format that would be sent to OTLP. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a feature request on the OpenTelemetry side for them to expose this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is open-telemetry/opentelemetry-cpp#1111 which would effectively accomplish the same thing. However if they do add it to the contrib repo that would involve some more packaging work for us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The least riskiest thing would be to just define our own JSON format and not try to reuse any upstream components.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that's not too much work that would sound reasonable to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. CC @cpcloud any thoughts? There was such an exporter on the original PR (#10260 (comment)) but as I recall the recommendation was to use the OTLP collector instead, so it was dropped. However, it seems the collector would be a lot more work for Conbench to integrate + is less convenient for local development workflows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Typically, the collector usage is deploying a container alongside the application, so without knowing anything else I'm not sure why it would be a lot more work.
@lidavidm and I went back and forth on the original PR about this, but I'm generally -1 on adding an exporter because exporters force a choice when used inside of a library. The Otel docs are pretty clear on how to use it within libraries (https://opentelemetry.io/docs/concepts/instrumenting-library/#opentelemetry-api)
Libraries should just instrument ideally, and testing can be done by constructing exporters in tests.
Including an exporter inside a library can also easily conflict with an application's N other exporters.
If these exporters are included I think they should be restricted to only when tests are being compiled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not totally understanding why a custom JSON format is less risky than using an upstream component or just having some documentation showing how to use the collector
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not totally understanding why a custom JSON format is less risky than using an upstream component or just having some documentation showing how to use the collector
Sorry, I meant just in terms of upstream changing APIs on us or something like that. The approach taken here does use the upstream generated Protobuf code and they might hide it from the public headers in the future, for instance.
Libraries should just instrument ideally, and testing can be done by constructing exporters in tests.
That is the intent, but I think Arrow occupies a halfway point between library and application, especially when bindings come into play. You can't enable C++ exporters from Python or R, for instance.
The exporters are not configured by default, for what it's worth - but they can be enabled by env var. The intent is that for development or testing, the env var can be used to easily dump spans somewhere, but an application would not use the env var and would configure its own exporter/tracer provider.
That said, this would all mostly be moot if Conbench ran a collector alongside test runs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't enable C++ exporters from Python or R, for instance.
/me grumbles. I forgot about that. Ugh, I guess that means you'd probably have to use context propagation in-process which is kind of gross.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also see: open-telemetry/community#734
Yeah, in-process propagation is gross, the original PR used to have an example of that (to instrument Flight in Python) but I dropped it in the name of slimming the PR down.
Can't we just output the machine-readable JSON and users can use |
|
I'm not planning on writing any code to format the OTLP output further. I just added that to contrast it with the ostream exporter. (That said, the ostream exporter is not super useful once you log more than ~50 traces, so maybe it's not worth having at all.) |
|
This LGTM. |
|
@pitrou any other comments here? |
|
@github-actions crossbow submit -g cpp |
|
Revision: f7cbd4d Submitted crossbow builds: ursacomputing/crossbow @ actions-1317 |
|
Benchmark runs are scheduled for baseline = 27724a5 and contender = 8a4d812. 8a4d812 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
This adds two exporters that can be toggled with an environment variable, for debug use. One is the standard ostream exporter, which logs a human-friendly but machine-unfriendly format. The other uses a trick to log the JSON OTLP request format, which is easily parsable JSON but not very readable.