Skip to content

datadog: Tracer portion of new tracing library#26148

Merged
jmarantz merged 40 commits into
envoyproxy:mainfrom
DataDog:david.goffredo/dd-trace-cpp-8/9-tracer
Mar 30, 2023
Merged

datadog: Tracer portion of new tracing library#26148
jmarantz merged 40 commits into
envoyproxy:mainfrom
DataDog:david.goffredo/dd-trace-cpp-8/9-tracer

Conversation

@dgoffredo
Copy link
Copy Markdown
Contributor

This is part of the work proposed in #23958.

This adds the Tracer class, which has one member function, startSpan, for creating a new trace or extracting one from incoming (HTTP) request headers.

Tracer is a thread-local wrapper around the corresponding class in dd-trace-cpp, except that failure to configure the underlying tracer puts Tracer into a "no-op mode" that always produces NullSpan instances.

In addition to that code and its corresponding unit tests, I also updated the version of dd-trace-cpp used. The newer version has additional Span methods that made testing easier. Additionally, the newer dd-trace-cpp models TraceID as its own class to accommodate 128-bit IDs. The existing unit test for Span needed to be modified to account for this.

Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
….goffredo/dd-trace-cpp-3/9-tracer_stats' of github.com:DataDog/envoy into david.goffredo/dd-trace-cpp-7/9-agent_http_client

Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
@dgoffredo dgoffredo requested a review from mattklein123 as a code owner March 17, 2023 18:47
@repokitteh-read-only repokitteh-read-only Bot added the deps Approval required for changes to Envoy's external dependencies label Mar 17, 2023
@repokitteh-read-only
Copy link
Copy Markdown

CC @envoyproxy/dependency-shepherds: Your approval is needed for changes made to (bazel/.*repos.*\.bzl)|(bazel/dependency_imports\.bzl)|(api/bazel/.*\.bzl)|(.*/requirements\.txt)|(.*\.patch).
envoyproxy/dependency-shepherds assignee is @mattklein123

🐱

Caused by: #26148 was opened by dgoffredo.

see: more, trace.

@jmarantz
Copy link
Copy Markdown
Contributor

many apologies for the delay in responsiveness on this and other PRs; I will try to catch up over the weekend.

@dgoffredo
Copy link
Copy Markdown
Contributor Author

No problem, thanks for taking a look.

After this one there's the final PR that I proposed yesterday. Almost there!

Copy link
Copy Markdown
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great at a high level.

I feel like we are getting into Datadog semantics which I am not familiar with; would it make sense to also get a review from someone else familiar with this tracing infrastructure?

/wait

config.agent.http_client = std::make_shared<AgentHTTPClient>(
cluster_manager, collector_cluster, collector_reference_host, tracer_stats);

auto maybe_config = datadog::tracing::finalize_config(config);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you give an explicit type here? Not obvious from context. I'm assuming something similar to absl::StatusOr.


auto maybe_config = datadog::tracing::finalize_config(config);
if (auto* error = maybe_config.if_error()) {
auto prefix = "Unable to configure Datadog tracer. Tracing is now disabled. Error: ";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've never seen this pattern for declaring string constants. Maybe you could explicitly use whatever type with_prefix expects to avoid potentially implicitly converting between them?

Not that it matters for performance obviously.

Tracing::SpanPtr Tracer::startSpan(const Tracing::Config&, Tracing::TraceContext& trace_context,
const std::string& operation_name, SystemTime start_time,
const Tracing::Decision tracing_decision) {
auto& thread_local_tracer = thread_local_slot_->getTyped<ThreadLocalTracer>();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider using TypedSlot for thread-safe slots.

IMO the untyped slots are tech-debt; it would be good to remove them from the public API for TheadLocalStore, but it's so hard to find the time for things like this :)

here's an example usage: source/common/rds/rds_route_config_provider_impl.h:46


datadog::tracing::Tracer& tracer = *thread_local_tracer.tracer;
TraceContextReader reader{trace_context};
auto maybe_span = tracer.extract_span(reader, span_config);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: explicit type

maybe_span = tracer.create_span(span_config);
}

assert(maybe_span);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ASSERT or RELEASE_ASSERT per Envoy style?

RELEASE_ASSERT would be more like assert() (can abort production binaries)

class Tracer : public Tracing::Driver, private Logger::Loggable<Logger::Id::tracing> {
public:
/**
* Create a \c Tracer that produces traces according to the specified \p config
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do these back-slash escaped sequences do?

Do they render well in some IDE? I have not noticed these used in Envoy.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screenshot from 2023-03-27 16-56-05

This is doxygen-style fixed-width formatting. I think that \c means "class" and \p means "parameter," though the doxygen docs point out that they can be used interchangeably.

My editor changes the color, and presumably rendered doxygen puts the following word in <code></code> tags.

Envoy code I've looked at doesn't use these operators. I started using them as a substitute for backticks when in an earlier review you advised me to comment doxygen-style.

I could remove them, but I like them. Up to you.

Tracer tracer("fake_cluster", "test_host", config, cluster_manager_, *store_.rootScope(),
thread_local_slot_allocator_);

(void)tracer;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this needed to avoid a compile error?

Envoy style is to use UNREFERENCED_PARAMETER from ./source/common/common/macros.h

@dgoffredo
Copy link
Copy Markdown
Contributor Author

@cgilmour Please take a look at this with Datadog eyes.

Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
@cgilmour
Copy link
Copy Markdown
Contributor

For the datadog-specific bits, I'm happy to confirm about this.
The intention with the earlier (opentracing-based) integration and with this one was to be both comprehensive but also low-footprint, so that future changes are "just a version-bump".

We both are employed by Datadog, and appreciate being a feature in this project.

@dgoffredo
Copy link
Copy Markdown
Contributor Author

Thanks, Caleb.

@jmarantz I addressed your earlier comments in a recent commit (except for the part about doxygen -- let me know).

Once we're good on this PR, I'll massage it into #26284 and that'll be the home stretch.

Copy link
Copy Markdown
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@jmarantz
Copy link
Copy Markdown
Contributor

/retest

@repokitteh-read-only
Copy link
Copy Markdown

Retrying Azure Pipelines:
Check envoy-presubmit didn't fail.

🐱

Caused by: a #26148 (comment) was created by @jmarantz.

see: more, trace.

@jmarantz jmarantz merged commit 37b4dff into envoyproxy:main Mar 30, 2023
RiverPhillips pushed a commit to RiverPhillips/envoy that referenced this pull request Apr 7, 2023
This is part of the work proposed in envoyproxy#23958.

This adds the Tracer class, which has one member function, startSpan, for creating a new trace or extracting one from incoming (HTTP) request headers.

Tracer is a thread-local wrapper around the corresponding class in dd-trace-cpp, except that failure to configure the underlying tracer puts Tracer into a "no-op mode" that always produces NullSpan instances.

In addition to that code and its corresponding unit tests, I also updated the version of dd-trace-cpp used. The newer version has additional Span methods that made testing easier. Additionally, the newer dd-trace-cpp models TraceID as its own class to accommodate 128-bit IDs. The existing unit test for Span needed to be modified to account for this.

Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: River Phillips <riverphillips1@gmail.com>
jmarantz pushed a commit that referenced this pull request Apr 27, 2023
This is the final part of the work proposed in #23958.

This PR additionally contains all of the changes from #26148. This diff will shrink when that PR is merged.

This revision alters the DatadogTracerFactory in source/extensions/tracers/datadog/config.h to produce instances of the new dd-trace-cpp based implementation of the Datadog tracer, instead of the old dd-opentracing-cpp based one.

This revision actually alters the implementation of the Datadog tracer. Previous pull requests have only added code for use here.

This revision also moves some unit tests around:

Two tests that were part of datadog_tracer_impl_test.cc have been moved into agent_http_client_test.cc, because they're about HTTP requests.
One test that was part of datadog_tracer_impl_test.cc has been removed because its behavior is already covered by tests added in other PRs in this series.
The remaining tests in datadog_tracer_impl_test_.cc were moved into config_test.cc, since they all now have to do with the configuration of the tracer and the resulting behavior.
Finally, all references to dd-opentracing-cpp and msgpack have been removed from the repo, and the documentation (READMEs) and demo/ directory from #23958 have been added.

Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
reskin89 pushed a commit to reskin89/envoy that referenced this pull request Jul 11, 2023
)

This is the final part of the work proposed in envoyproxy#23958.

This PR additionally contains all of the changes from envoyproxy#26148. This diff will shrink when that PR is merged.

This revision alters the DatadogTracerFactory in source/extensions/tracers/datadog/config.h to produce instances of the new dd-trace-cpp based implementation of the Datadog tracer, instead of the old dd-opentracing-cpp based one.

This revision actually alters the implementation of the Datadog tracer. Previous pull requests have only added code for use here.

This revision also moves some unit tests around:

Two tests that were part of datadog_tracer_impl_test.cc have been moved into agent_http_client_test.cc, because they're about HTTP requests.
One test that was part of datadog_tracer_impl_test.cc has been removed because its behavior is already covered by tests added in other PRs in this series.
The remaining tests in datadog_tracer_impl_test_.cc were moved into config_test.cc, since they all now have to do with the configuration of the tracer and the resulting behavior.
Finally, all references to dd-opentracing-cpp and msgpack have been removed from the repo, and the documentation (READMEs) and demo/ directory from envoyproxy#23958 have been added.

Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: Ryan Eskin <ryan.eskin89@protonmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

deps Approval required for changes to Envoy's external dependencies

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants