Conversation

lidavidm (Member) commented Dec 9, 2021

This adds two exporters that can be toggled with an environment variable, for debug use. One is the standard ostream exporter, which logs a human-friendly but machine-unfriendly format. The other uses a trick to log the JSON OTLP request format, which is easily parsable JSON but not very readable.
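
The environment-variable toggle described above can be sketched as follows. This is a minimal illustration rather than the PR's actual code; the variable name `ARROW_TRACING_BACKEND` and the accepted values are assumptions for the example.

```cpp
#include <cstdlib>
#include <string>

// Which debug exporter (if any) to install. By default no exporter is
// configured, so an application remains free to set up its own.
enum class DebugExporter { kNone, kOStream, kOtlpJson };

// Map an environment-variable value to an exporter choice.
DebugExporter DebugExporterFromString(const char* value) {
  if (value == nullptr) return DebugExporter::kNone;
  const std::string s(value);
  if (s == "ostream") return DebugExporter::kOStream;     // human-friendly format
  if (s == "otlp_json") return DebugExporter::kOtlpJson;  // OTLP JSON request format
  return DebugExporter::kNone;  // unknown values fall back to "no exporter"
}

// Read the toggle from the environment, e.g.
//   ARROW_TRACING_BACKEND=ostream ./my_app
DebugExporter DebugExporterFromEnv() {
  return DebugExporterFromString(std::getenv("ARROW_TRACING_BACKEND"));
}
```

An application that configures its own tracer provider would simply leave the variable unset.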


lidavidm (Member Author) commented Dec 9, 2021

Example of ostream output:

{
  name          : test
  trace_id      : 577bb5359fd4c9babdfb7e8def43d476
  span_id       : 41655856137c4635
  tracestate    : 
  parent_span_id: 0000000000000000
  start         : 1639087427620946845
  duration      : 8538876
  description   : 
  span kind     : Internal
  status        : Unset
  attributes    : 
	thread_id: 140062552533184
	foo: bar
  events        : 
  links         : 
  resources     : 
	service.name: unknown_service
	telemetry.sdk.version: 1.1.0
	telemetry.sdk.name: opentelemetry
	telemetry.sdk.language: cpp
  instr-lib     : arrow
}

Example of OTLP output:

{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"unknown_service"}},{"key":"telemetry.sdk.version","value":{"stringValue":"1.1.0"}},{"key":"telemetry.sdk.name","value":{"stringValue":"opentelemetry"}},{"key":"telemetry.sdk.language","value":{"stringValue":"cpp"}}]},"instrumentationLibrarySpans":[{"instrumentationLibrary":{"name":"arrow"},"spans":[{"traceId":"ID+DVD1Rarxdh8dNLchUmw==","spanId":"7IGA2C88AQk=","parentSpanId":"AAAAAAAAAAA=","name":"test","kind":"SPAN_KIND_INTERNAL","startTimeUnixNano":"1639087084846448841","endTimeUnixNano":"1639087084849284338","attributes":[{"key":"foo","value":{"stringValue":"bar"}},{"key":"thread_id","value":{"stringValue":"140253881153728"}}]},{"traceId":"pPN+j22d17tQVbHZpG6pbA==","spanId":"6apExpYokXw=","parentSpanId":"AAAAAAAAAAA=","name":"test","kind":"SPAN_KIND_INTERNAL","startTimeUnixNano":"1639087084850354448","endTimeUnixNano":"1639087084853673659","attributes":[{"key":"foo","value":{"stringValue":"bar"}},{"key":"thread_id","value":{"stringValue":"140253881153728"}}]},{"traceId":"yVWEeB/z4NedBKrCdezTMg==","spanId":"4AGb3ndTHek=","parentSpanId":"AAAAAAAAAAA=","name":"test","kind":"SPAN_KIND_INTERNAL","startTimeUnixNano":"1639087084873028556","endTimeUnixNano":"1639087084876243603","attributes":[{"key":"foo","value":{"stringValue":"bar"}},{"key":"thread_id","value":{"stringValue":"140253881153728"}}]}]}]}
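
Note that in the OTLP JSON above, `traceId`, `spanId`, and `parentSpanId` are base64-encoded bytes (the protobuf JSON mapping for `bytes` fields), whereas the ostream exporter prints them as hex. A small helper, illustrative only and not part of the PR, shows the correspondence; for example, the all-zero `parentSpanId` of `"AAAAAAAAAAA="` decodes to the same `0000000000000000` shown in the ostream output.

```cpp
#include <cstdint>
#include <string>

// Decode standard base64 (as used for "traceId"/"spanId" in the OTLP JSON
// protobuf mapping) and render the decoded bytes as lowercase hex.
std::string Base64ToHex(const std::string& b64) {
  auto decode_char = [](char c) -> int {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;  // '=' padding or invalid character
  };
  std::string bytes;
  uint32_t buffer = 0;
  int bits = 0;
  for (char c : b64) {
    int v = decode_char(c);
    if (v < 0) break;  // stop at padding
    buffer = (buffer << 6) | static_cast<uint32_t>(v);
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      bytes.push_back(static_cast<char>((buffer >> bits) & 0xFF));
    }
  }
  static const char* kHex = "0123456789abcdef";
  std::string hex;
  for (unsigned char b : bytes) {
    hex.push_back(kHex[b >> 4]);
    hex.push_back(kHex[b & 0xF]);
  }
  return hex;
}
```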

namespace sdktrace = opentelemetry::sdk::trace;

// Custom JSON stdout exporter. Leverages the OTLP HTTP exporter's
// utilities to log the same format that would be sent to OTLP.
Member:

Is there a feature request on the OpenTelemetry side for them to expose this?

Member Author:

There is open-telemetry/opentelemetry-cpp#1111 which would effectively accomplish the same thing. However if they do add it to the contrib repo that would involve some more packaging work for us.

Member Author:

The least risky thing would be to define our own JSON format and not try to reuse any upstream components.

Member:

If that's not too much work that would sound reasonable to me.

Member Author:

Sounds good. CC @cpcloud any thoughts? There was such an exporter on the original PR (#10260 (comment)) but as I recall the recommendation was to use the OTLP collector instead, so it was dropped. However, it seems the collector would be a lot more work for Conbench to integrate + is less convenient for local development workflows.

cpcloud (Contributor), Dec 15, 2021:

Interesting. Typically, using the collector means deploying a container alongside the application, so without knowing anything else I'm not sure why it would be a lot more work.

@lidavidm and I went back and forth about this on the original PR, but I'm generally -1 on adding an exporter, because exporters force a choice when used inside of a library. The OTel docs are pretty clear on how to use the API within libraries: https://opentelemetry.io/docs/concepts/instrumenting-library/#opentelemetry-api

Libraries should just instrument ideally, and testing can be done by constructing exporters in tests.

Including an exporter inside a library can also easily conflict with an application's N other exporters.

If these exporters are included I think they should be restricted to only when tests are being compiled.

Contributor:

I'm not totally understanding why a custom JSON format is less risky than using an upstream component, or than just having some documentation showing how to use the collector.

Member Author:

> Not totally understanding why a custom JSON format is less risky than using an upstream component or just having some documentation showing how to use the collector

Sorry, I meant just in terms of upstream changing APIs on us or something like that. The approach taken here does use the upstream generated Protobuf code and they might hide it from the public headers in the future, for instance.

> Libraries should just instrument ideally, and testing can be done by constructing exporters in tests.

That is the intent, but I think Arrow occupies a halfway point between library and application, especially when bindings come into play. You can't enable C++ exporters from Python or R, for instance.

The exporters are not configured by default, for what it's worth - but they can be enabled by env var. The intent is that for development or testing, the env var can be used to easily dump spans somewhere, but an application would not use the env var and would configure its own exporter/tracer provider.

That said, this would all mostly be moot if Conbench ran a collector alongside test runs.

Contributor:

> You can't enable C++ exporters from Python or R, for instance.

/me grumbles. I forgot about that. Ugh, I guess that means you'd probably have to use context propagation in-process which is kind of gross.

Member Author:

Also see: open-telemetry/community#734

Yeah, in-process propagation is gross, the original PR used to have an example of that (to instrument Flight in Python) but I dropped it in the name of slimming the PR down.

cpcloud (Contributor) commented Dec 15, 2021

> The other uses a trick to log the JSON OTLP request format, which is easily parsable JSON but not very readable.

Can't we just output the machine-readable JSON and let users use jq or something? There's no reason to write code for tools that already do an amazing job of formatting JSON.

lidavidm (Member Author):

I'm not planning on writing any code to format the OTLP output further. I just added that to contrast it with the ostream exporter. (That said, the ostream exporter is not super useful once you log more than ~50 traces, so maybe it's not worth having at all.)

cpcloud (Contributor) commented Dec 15, 2021

This LGTM.

lidavidm (Member Author):

@pitrou any other comments here?

pitrou (Member) commented Dec 16, 2021

@github-actions crossbow submit -g cpp

github-actions bot:

Revision: f7cbd4d

Submitted crossbow builds: ursacomputing/crossbow @ actions-1317

Task Status
test-build-cpp-fuzz Github Actions
test-conda-cpp Github Actions
test-conda-cpp-valgrind Azure
test-debian-10-cpp-amd64 Github Actions
test-debian-10-cpp-i386 Github Actions
test-debian-11-cpp-amd64 Github Actions
test-debian-11-cpp-i386 Github Actions
test-fedora-33-cpp Github Actions
test-ubuntu-18.04-cpp Github Actions
test-ubuntu-18.04-cpp-release Github Actions
test-ubuntu-18.04-cpp-static Github Actions
test-ubuntu-20.04-cpp Github Actions
test-ubuntu-20.04-cpp-14 Github Actions
test-ubuntu-20.04-cpp-17 Github Actions
test-ubuntu-20.04-cpp-bundled Github Actions
test-ubuntu-20.04-cpp-thread-sanitizer Github Actions

pitrou closed this in 8a4d812 on Dec 16, 2021

ursabot commented Dec 16, 2021

Benchmark runs are scheduled for baseline = 27724a5 and contender = 8a4d812. 8a4d812 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.0% ⬆️0.9%] ursa-i9-9960x
[Finished ⬇️0.93% ⬆️0.4%] ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
