Skip to content

Datadog tracing extension: remove OpenTracing#23958

Closed
dgoffredo wants to merge 13 commits into
envoyproxy:mainfrom
DataDog:david.goffredo/dd-trace-cpp
Closed

Datadog tracing extension: remove OpenTracing#23958
dgoffredo wants to merge 13 commits into
envoyproxy:mainfrom
DataDog:david.goffredo/dd-trace-cpp

Conversation

@dgoffredo
Copy link
Copy Markdown
Contributor

@dgoffredo dgoffredo commented Nov 11, 2022

Commit Message: Datadog tracing extension uses new tracing library
Additional Description: See below.
Risk Level: Medium
Testing: Existing unit tests are sufficient.
Docs Changes: User-facing documentation for the Datadog tracing integration does not need to change, because these changes do not change the configuration.
Release Notes: N/A or unsure
Platform Specific Features: N/A
[Optional Runtime guard:]
[Optional Fixes #Issue] #21083
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional API Considerations:]

Overview

This is rewrite of the Datadog tracing extension.

Now that there is a non-OpenTracing Datadog C++ Tracing library
(dd-trace-cpp), the first order of business is to remove the
OpenTracing dependency on Envoy's Datadog tracing extension by rewriting
the extension in terms of the new library.

This pull request addresses a longstanding issue, and if accepted will
allow Datadog tracing in Envoy to catch up with Datadog tracing in other
projects. Due to the OpenTracing dependency of the existing extension, Datadog
tracing has been frozen on an old version for a long time.

Changing to a non-OpenTracing, recent tracing library allows us to address
multiple problems with the current extension:

  • improper handling of error.* span tags,
  • lack of support for Datadog-specific sampling controls such as the
    DD_TRACE_SAMPLE_RATE, DD_TRACE_RATE_LIMIT, and DD_SPAN_SAMPLING_RULES
    environment variables,
  • malformed traces resulting from multiple request passes through a proxy within
    the same trace.

In general, we have not been able to update the Datadog tracing extension since
January 2021. With these changes we will be able to make updates from now on.

Design

The components of this extension are described in the included README.md file.
There's even a diagram.

There is further description of the inner workings of the underlying tracing
library in its own readme and more extensively in its maintainer
documentation
.

dd-trace-cpp exports mostly concrete types. However, there are two
context-dependent interfaces that are implemented in this pull request.
See event_scheduler.{h,cc} and agent_http_client.{h,cc}.

Testing

I made minimal modifications to the existing unit tests, and added several new ones.

There is also a demo/ directory in the source tree that I used for manual
ad hoc integration testing during development. If that directory does not
belong, then we can remove it.

Code Style

I read Envoy's contributing guidelines and C++ style guide, and have formatted
the code using ci/run_envoy_docker.sh './ci/do_ci.sh format'.

This is my first time contributing to Envoy. Anywhere that the style of this
extension differs from Envoy's house style, I would be happy to make changes
to conform to the house style.

Datadog Contacts

I'll be on vacation for three weeks right after I propose this pull request.
Until I return (or succumb to the temptation to check this pull request),
my teammate @cgilmour can address questions and make modifications.

We're willing to make modifications to dd-trace-cpp per this code review
as well. Envoy will be the first client of the new library.

Let's get this code in ship shape so that we can carry on adding features.

@repokitteh-read-only repokitteh-read-only Bot added the deps Approval required for changes to Envoy's external dependencies label Nov 11, 2022
@repokitteh-read-only
Copy link
Copy Markdown

CC @envoyproxy/dependency-shepherds: Your approval is needed for changes made to (bazel/.*repos.*\.bzl)|(bazel/dependency_imports\.bzl)|(api/bazel/.*\.bzl)|(.*/requirements\.txt)|(.*\.patch).
envoyproxy/dependency-shepherds assignee is @RyanTheOptimist

🐱

Caused by: #23958 was opened by dgoffredo.

see: more, trace.

@dgoffredo dgoffredo changed the title David.goffredo/dd trace cpp Datadog tracing extension: remove OpenTracing Nov 11, 2022
@cgilmour
Copy link
Copy Markdown
Contributor

As David is away for a bit, I'll sort out the conflicts and DCO so this can at least have CI checks run on them.

@RyanTheOptimist
Copy link
Copy Markdown
Contributor

/lgtm deps

@repokitteh-read-only repokitteh-read-only Bot removed the deps Approval required for changes to Envoy's external dependencies label Nov 14, 2022
Comment thread bazel/repositories.bzl Outdated
@RyanTheOptimist
Copy link
Copy Markdown
Contributor

/wait

@dgoffredo dgoffredo force-pushed the david.goffredo/dd-trace-cpp branch 2 times, most recently from ae85761 to 90deb68 Compare December 5, 2022 20:39
@dgoffredo
Copy link
Copy Markdown
Contributor Author

I'm back.

The format checker does not like our use of std::optional and std::string_view.

https://dev.azure.com/cncf/envoy/_build/results?buildId=122576&view=logs&j=55bc2ab9-36c1-5ab8-58a6-8c258728eeff&t=33b57bf9-ef13-58c8-bc2b-f804aeb16d39

I looked through the history of this policy, and it looks like forbidding the use of the std types has to do with supporting iOS 11. Correct me if I'm wrong.

In any case, I'm looking into adding an alternative build mode to dd-trace-cpp that will allow us to use Abseil types in the interface, rather than standard types.

Otherwise this PR remains ready for review.

@repokitteh-read-only repokitteh-read-only Bot added deps Approval required for changes to Envoy's external dependencies and removed waiting labels Dec 6, 2022
@dgoffredo dgoffredo force-pushed the david.goffredo/dd-trace-cpp branch from bcd8734 to 1d8476c Compare December 6, 2022 16:36
@dgoffredo
Copy link
Copy Markdown
Contributor Author

If I'm reading https://dev.azure.com/cncf/envoy/_build/results?buildId=122708&view=results correctly, the CI checks have the following complaints:

  1. override is missing from the destructor of a derived class whose base class has a virtual destructor.
  2. Some of the function names look_like_this instead of lookLikeThis.
  3. The unit test code coverage for source/extensions/tracers/datadog is lower than the allowed minimum of 96.6% (at 65.4%).
  4. The verify_x64 > Verify packages check failed with:
PackagesDistroChecker ERROR [distros] [debian_bullseye/envoy-1.25:proxy-responds] Test failed
stdout: ERROR Envoy did not respond correctly
Response was
Error - Request ID: 01GKMF7PYDY48GJ95D59QAF8VA

Log:
Envoy did not respond correctly
Response was
Error - Request ID: 01GKMF7PYDY48GJ95D59QAF8VA

Log:

(1) I'm happy to fix, though it seems unnecessary.
(2) I'm happy to fix.
(3) I will look to address.
(4) I don't understand.

@RyanTheOptimist
Copy link
Copy Markdown
Contributor

/lgtm deps

@repokitteh-read-only repokitteh-read-only Bot removed the deps Approval required for changes to Envoy's external dependencies label Dec 8, 2022
@RyanTheOptimist
Copy link
Copy Markdown
Contributor

/assign @cgilmour

@dgoffredo
Copy link
Copy Markdown
Contributor Author

dgoffredo commented Dec 8, 2022

I'm currently working on increasing the code coverage of the Datadog tracing unit tests.

Don't hesitate to submit review comments or concerns in the mean time. This pull request is my top priority.

@dgoffredo
Copy link
Copy Markdown
Contributor Author

It looks like I have appeased the CI system. ✔️

@RyanTheOptimist
Copy link
Copy Markdown
Contributor

@cgilmour can you take another look at this PR?

@dgoffredo
Copy link
Copy Markdown
Contributor Author

@cgilmour is on holiday as of today, I believe.

He has already looked over the code, though he did not submit a review.

I'm interested in getting feedback from an Envoy maintainer, if one is available to do a code review.

@RyanTheOptimist
Copy link
Copy Markdown
Contributor

/assign-from @envoyproxy/first-pass-reviewers

@dgoffredo
Copy link
Copy Markdown
Contributor Author

Part 6/9 is grinding through the CI: #25170

@wbpcode
Copy link
Copy Markdown
Member

wbpcode commented Jan 26, 2023

@dgoffredo thanks for these great contributions.
This is a super big prs and it's Chinese New year.
So, I can only review these prs in the next week. Sorry for the delay.

Thanks again.

@dgoffredo
Copy link
Copy Markdown
Contributor Author

dgoffredo commented Jan 26, 2023

@wbpcode Happy New Year! Take your time, I have plenty more to do before I'm blocked on these PRs.

@dgoffredo
Copy link
Copy Markdown
Contributor Author

Part 7/9 is grinding through the CI: #25417

@dgoffredo
Copy link
Copy Markdown
Contributor Author

Here's an update on the status of these pull requests.

I'll propose two or three more pull requests.

"8/9" will introduce the Tracer class (as is seen in this original pull request) and its unit tests.

"9/9" will introduce the Config class (as is seen in this original pull request) and its unit tests.

"10/9" will remove the old implementation, and probably update the version of the dd-trace-cpp dependency (it's changed since work began on this within Envoy).

It's possible that "10/9" will be part of "9/9" instead.

This week is "R&D Week" at my company, so I'm working on other things. I'll resume work on the remaining pull requests next week.

jmarantz pushed a commit that referenced this pull request Mar 10, 2023
This is part of the initiative described in #23958.

This revision introduces a new library dependency for Datadog tracing, and adds a component that will be used in the future when more of the new library is integrated into the Datadog tracing extension.

EventScheduler allows the new tracing library to schedule a recurring timer that fires every two seconds; this is when the tracing library will send its batched traces to the Datadog Agent.

Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
@jmarantz
Copy link
Copy Markdown
Contributor

/wait

@dgoffredo
Copy link
Copy Markdown
Contributor Author

Part 8/9 is grinding through the CI: #26148

@dgoffredo
Copy link
Copy Markdown
Contributor Author

Part 9/9 is grinding through the CI: #26284

Rejoice!

jmarantz pushed a commit that referenced this pull request Mar 30, 2023
This is part of the work proposed in #23958.

This adds the Tracer class, which has one member function, startSpan, for creating a new trace or extracting one from incoming (HTTP) request headers.

Tracer is a thread-local wrapper around the corresponding class in dd-trace-cpp, except that failure to configure the underlying tracer puts Tracer into a "no-op mode" that always produces NullSpan instances.

In addition to that code and its corresponding unit tests, I also updated the version of dd-trace-cpp used. The newer version has additional Span methods that made testing easier. Additionally, the newer dd-trace-cpp models TraceID as its own class to accommodate 128-bit IDs. The existing unit test for Span needed to be modified to account for this.

Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
RiverPhillips pushed a commit to RiverPhillips/envoy that referenced this pull request Apr 7, 2023
This is part of the work proposed in envoyproxy#23958.

This adds the Tracer class, which has one member function, startSpan, for creating a new trace or extracting one from incoming (HTTP) request headers.

Tracer is a thread-local wrapper around the corresponding class in dd-trace-cpp, except that failure to configure the underlying tracer puts Tracer into a "no-op mode" that always produces NullSpan instances.

In addition to that code and its corresponding unit tests, I also updated the version of dd-trace-cpp used. The newer version has additional Span methods that made testing easier. Additionally, the newer dd-trace-cpp models TraceID as its own class to accommodate 128-bit IDs. The existing unit test for Span needed to be modified to account for this.

Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: River Phillips <riverphillips1@gmail.com>
@RyanTheOptimist RyanTheOptimist removed their assignment Apr 7, 2023
@github-actions
Copy link
Copy Markdown

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions Bot added the stale stalebot believes this issue/PR has not been touched recently label Apr 22, 2023
@dgoffredo
Copy link
Copy Markdown
Contributor Author

/wait

I'll close this (without merging) once #26284 is merged.

@github-actions github-actions Bot removed the stale stalebot believes this issue/PR has not been touched recently label Apr 26, 2023
jmarantz pushed a commit that referenced this pull request Apr 27, 2023
This is the final part of the work proposed in #23958.

This PR additionally contains all of the changes from #26148. This diff will shrink when that PR is merged.

This revision alters the DatadogTracerFactory in source/extensions/tracers/datadog/config.h to produce instances of the new dd-trace-cpp based implementation of the Datadog tracer, instead of the old dd-opentracing-cpp based one.

This revision actually alters the implementation of the Datadog tracer. Previous pull requests have only added code for use here.

This revision also moves some unit tests around:

Two tests that were part of datadog_tracer_impl_test.cc have been moved into agent_http_client_test.cc, because they're about HTTP requests.
One test that was part of datadog_tracer_impl_test.cc has been removed because its behavior is already covered by tests added in other PRs in this series.
The remaining tests in datadog_tracer_impl_test_.cc were moved into config_test.cc, since they all now have to do with the configuration of the tracer and the resulting behavior.
Finally, all references to dd-opentracing-cpp and msgpack have been removed from the repo, and the documentation (READMEs) and demo/ directory from #23958 have been added.

Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
@dgoffredo
Copy link
Copy Markdown
Contributor Author

#26284 is merged, so this work is now complete. Thanks for playing.

@dgoffredo dgoffredo closed this Apr 27, 2023
reskin89 pushed a commit to reskin89/envoy that referenced this pull request Jul 11, 2023
)

This is the final part of the work proposed in envoyproxy#23958.

This PR additionally contains all of the changes from envoyproxy#26148. This diff will shrink when that PR is merged.

This revision alters the DatadogTracerFactory in source/extensions/tracers/datadog/config.h to produce instances of the new dd-trace-cpp based implementation of the Datadog tracer, instead of the old dd-opentracing-cpp based one.

This revision actually alters the implementation of the Datadog tracer. Previous pull requests have only added code for use here.

This revision also moves some unit tests around:

Two tests that were part of datadog_tracer_impl_test.cc have been moved into agent_http_client_test.cc, because they're about HTTP requests.
One test that was part of datadog_tracer_impl_test.cc has been removed because its behavior is already covered by tests added in other PRs in this series.
The remaining tests in datadog_tracer_impl_test_.cc were moved into config_test.cc, since they all now have to do with the configuration of the tracer and the resulting behavior.
Finally, all references to dd-opentracing-cpp and msgpack have been removed from the repo, and the documentation (READMEs) and demo/ directory from envoyproxy#23958 have been added.

Signed-off-by: David Goffredo <david.goffredo@datadoghq.com>
Signed-off-by: Ryan Eskin <ryan.eskin89@protonmail.com>
@dgoffredo dgoffredo deleted the david.goffredo/dd-trace-cpp branch July 15, 2023 20:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

deps Approval required for changes to Envoy's external dependencies mobile waiting

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants