Skip to content

Adding distinct activity for distributed tracing to YARP#2098

Merged
samsp-msft merged 10 commits intomainfrom
samsp/diagnostics
May 5, 2023
Merged

Adding distinct activity for distributed tracing to YARP#2098
samsp-msft merged 10 commits intomainfrom
samsp/diagnostics

Conversation

@samsp-msft
Copy link
Copy Markdown
Contributor

@samsp-msft samsp-msft commented Apr 12, 2023

Adding distinct tracing objects to YARP.

These can then be picked up via monitoring middleware such as Open Telemetry.

using Azure.Monitor.OpenTelemetry.AspNetCore;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddReverseProxy().LoadFromConfig(builder.Configuration.GetSection("ReverseProxy"));

var otel = builder.Services.AddOpenTelemetry();
otel.UseAzureMonitor(o =>
{
    o.ConnectionString = builder.Configuration["APPLICATIONINSIGHTS_CONNECTION_STRING"];
});
otel.WithTracing(t =>
{
    t.AddSource("Yarp.*");
});

var app = builder.Build();

app.MapReverseProxy();

app.Run();

The activity source is only active if something is listening to it, such as t.AddSource("Yarp.*");.

Comment thread src/ReverseProxy/Health/ActiveHealthCheckMonitor.cs Outdated
Comment thread src/ReverseProxy/Model/IReverseProxyFeature.cs Outdated
Comment thread src/ReverseProxy/Utilities/Observability.cs Outdated
@samsp-msft
Copy link
Copy Markdown
Contributor Author

Result when seen in App Insights
image

@MihaZupan
Copy link
Copy Markdown
Member

If you add .AddHttpClientInstrumentation() to the builder, you should see another dependency node at about the same time you see Forward now (assuming things aren't broken). That means you don't gain new timing information from the new activity.

Re: Tags like routeId, is there prior art in adding such information to existing Activities?
E.g. to the root one ASP.NET will create, or to the one HttpClient creates internally?

Things like that should be trivial to do now that ActivityListener is a thing

ActivitySource.AddActivityListener(new ActivityListener()
{
    ShouldListenTo = source => source.Name == "System.Net.Http",
    ActivityStarted = activity =>
    {
        string routeId = "..."; // Get that somehow (IHttpContextAccessor?)
        activity.AddTag("YarpRoute", routeId);
    }
});

@samsp-msft samsp-msft marked this pull request as ready for review April 13, 2023 16:14
@samsp-msft samsp-msft requested a review from Tratcher as a code owner April 13, 2023 16:14
Comment thread src/ReverseProxy/Utilities/Observability.cs Outdated
Comment thread src/ReverseProxy/Model/ReverseProxyFeature.cs Outdated
@samsp-msft
Copy link
Copy Markdown
Contributor Author

If you add .AddHttpClientInstrumentation() to the builder, you should see another dependency node at about the same time you see Forward now (assuming things aren't broken). That means you don't gain new timing information from the new activity.

HttpClientInstrumentation is added as part of the AzureMonitor helper.

Adding to the existing activity verses adding another is an interesting question. You don't even need to get to the listener, you can probably just do Activity.Current.SetTag(...)

Comment thread src/ReverseProxy/Health/ActiveHealthCheckMonitor.cs Outdated
Comment thread src/ReverseProxy/Health/ActiveHealthCheckMonitor.cs Outdated
Comment thread src/ReverseProxy/Forwarder/ForwarderMiddleware.cs Outdated
Comment thread src/ReverseProxy/Utilities/Observability.cs Outdated
Comment thread src/ReverseProxy/Utilities/Observability.cs Outdated
Comment thread src/ReverseProxy/Utilities/Observability.cs Outdated
Comment thread src/ReverseProxy/Model/ReverseProxyFeature.cs Outdated
Comment thread src/ReverseProxy/SessionAffinity/SessionAffinityMiddleware.cs Outdated
@samsp-msft samsp-msft changed the title Prototype of Adding distinct activity for distributed tracing to YARP Adding distinct activity for distributed tracing to YARP Apr 14, 2023
Comment thread src/ReverseProxy/Model/ProxyPipelineInitializerMiddleware.cs Outdated
Comment thread src/ReverseProxy/Model/ReverseProxyFeature.cs Outdated
Copy link
Copy Markdown
Member

@Tratcher Tratcher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, just needs some cleaup.

Comment thread src/ReverseProxy/Health/ActiveHealthCheckMonitor.cs Outdated
Comment thread src/ReverseProxy/Model/HttpContextFeaturesExtensions.cs
Comment thread src/ReverseProxy/Model/IReverseProxyFeature.cs Outdated
Comment thread src/ReverseProxy/Model/ReverseProxyFeature.cs Outdated
Comment thread src/ReverseProxy/SessionAffinity/SessionAffinityMiddleware.cs Outdated
Comment thread src/ReverseProxy/Utilities/Observability.cs Outdated
Comment thread docs/docfx/articles/distributed-tracing.md
var activity = Observability.YarpActivitySource.StartActivity("Proxy Forwarder", ActivityKind.Server);
context.SetYarpActivity(activity);

await _next(context);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will now allocate an extra async state machine even if the request isn't being sampled.
Can you please change the logic to something along the lines of what I shared here: #2098 (comment)?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will now allocate an extra async state machine even if the request isn't being sampled. Can you please change the logic to something along the lines of what I shared here: #2098 (comment)?

It only adds it if the activity is created, which depends on the listener. If the activity is null, nothing gets set.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that the method is now async, which means you get an extra state machine allocation even if you don't have the activity.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want you can keep it as-is and I can clean it up in a follow up PR

Comment thread src/ReverseProxy/SessionAffinity/SessionAffinityMiddleware.cs Outdated
Comment thread src/ReverseProxy/Utilities/Observability.cs Outdated
Comment thread src/ReverseProxy/Utilities/Observability.cs Outdated
Comment thread src/ReverseProxy/Utilities/Observability.cs
Comment thread src/ReverseProxy/Forwarder/ForwarderMiddleware.cs Outdated
Comment thread src/ReverseProxy/Model/ProxyPipelineInitializerMiddleware.cs Outdated
Comment thread src/ReverseProxy/Utilities/Observability.cs
Comment thread src/ReverseProxy/Model/ProxyPipelineInitializerMiddleware.cs
@samsp-msft samsp-msft merged commit 2463910 into main May 5, 2023
@samsp-msft samsp-msft deleted the samsp/diagnostics branch May 5, 2023 17:59
MihaZupan added a commit that referenced this pull request May 9, 2023
* Followup after #2098

* Include CreateRequest under destination_health_check activity
@MihaZupan MihaZupan added this to the YARP 2.x milestone May 25, 2023
andrewlock added a commit to DataDog/dd-trace-dotnet that referenced this pull request Apr 1, 2026
…#8370)

## Summary of changes

- Set `DD_TRACE_OTEL_ENABLED=true` by default in integration tests
- Add additional excludes for activity handlers with equivalent custom
instrumentation

## Reason for change

We had an escalation recently, which highlighted that we were missing
some entries in the `IgnoreActivityHandler`. To avoid hitting similar
issues in the future, we can set `DD_TRACE_OTEL_ENABLED=true` by
default, so we know as soon as a new `ActivitySource` lights up.

## Implementation details

A _lot_ of trial and error here, mostly setting
`DD_TRACE_OTEL_ENABLED=true` in the `TracingIntegrationTest` base class
and seeing what breaks 😅 That pointed to a variety of extra handlers and
differences in behaviour:

- `Couchbase.DotnetSdk.OpenTelemetryRequestTracer`, I don't think we
actually need this one strictly, but it showed up while I was trying to
fix persistent issues with Couchbase3, so I think it makes sense to
exclude it
- The _real_ issue I had is that some early versions of Couchbase (3.0.0
- 3.2.0) create activities using `new Activity()`, which means there's
_no_ `ActivitySource` associated, and so we have no way to filter them 💀
- Rather than fight with that, and because those versions are
deprecated, just disabled OTel integration for these specific tested
versions
- `connector-net` - this is the ActivitySource for `MySql.Data` (we
already excluded the one for `MySqlConnector`)
- `RabbitMQ.Client.*` - these were the ones that caused the original
issue, and were breaking DSM
- `Experimental.System.Net.Security` - this one came in .NET 9, and was
causing extra spans in gRPC and Yarp tests
- `Grpc.Net.Client` - The gRPC client has had an `ActivitySource` [for a
long time](grpc/grpc-dotnet#2244) 😅
- `Yarp.ReverseProxy` - ...[as has
Yarp](dotnet/yarp#2098)

In addition, there were some extra "fixes" to the tests required:

- For the `HttpMessageHandler` tests, where the integration is disabled,
we were previously verifying that W3C headers weren't injected, but they
_will_ be if OTel is enabled, so just relaxed the restrictions there.
- For `OpenTelemetrySdkTests.SubmitsOtlpLogs`, the data changes
depending on whether otel is enabled or not, so just reset it to the
default for simplicity rather than wrestle with it (I spent some time
ping-ponging snapshots before I gave up 😅)
- Updated the gRPC snapshots to add `grpc.method` and `grpc.status_code`
which are now added to the aspnetcore spans

## Test coverage

Hopefully self explanatory 😅 

## Other details

Fixes https://datadoghq.atlassian.net/browse/DSMS-138

Technically, this could be a breaking change for some people, so maybe
we should revisit making the ignore activity handler configurable?
igoragoli pushed a commit to DataDog/dd-trace-dotnet that referenced this pull request Apr 8, 2026
…#8370)

## Summary of changes

- Set `DD_TRACE_OTEL_ENABLED=true` by default in integration tests
- Add additional excludes for activity handlers with equivalent custom
instrumentation

## Reason for change

We had an escalation recently, which highlighted that we were missing
some entries in the `IgnoreActivityHandler`. To avoid hitting similar
issues in the future, we can set `DD_TRACE_OTEL_ENABLED=true` by
default, so we know as soon as a new `ActivitySource` lights up.

## Implementation details

A _lot_ of trial and error here, mostly setting
`DD_TRACE_OTEL_ENABLED=true` in the `TracingIntegrationTest` base class
and seeing what breaks 😅 That pointed to a variety of extra handlers and
differences in behaviour:

- `Couchbase.DotnetSdk.OpenTelemetryRequestTracer`, I don't think we
actually need this one strictly, but it showed up while I was trying to
fix persistent issues with Couchbase3, so I think it makes sense to
exclude it
- The _real_ issue I had is that some early versions of Couchbase (3.0.0
- 3.2.0) create activities using `new Activity()`, which means there's
_no_ `ActivitySource` associated, and so we have no way to filter them 💀
- Rather than fight with that, and because those versions are
deprecated, just disabled OTel integration for these specific tested
versions
- `connector-net` - this is the ActivitySource for `MySql.Data` (we
already excluded the one for `MySqlConnector`)
- `RabbitMQ.Client.*` - these were the ones that caused the original
issue, and were breaking DSM
- `Experimental.System.Net.Security` - this one came in .NET 9, and was
causing extra spans in gRPC and Yarp tests
- `Grpc.Net.Client` - The gRPC client has had an `ActivitySource` [for a
long time](grpc/grpc-dotnet#2244) 😅
- `Yarp.ReverseProxy` - ...[as has
Yarp](dotnet/yarp#2098)

In addition, there were some extra "fixes" to the tests required:

- For the `HttpMessageHandler` tests, where the integration is disabled,
we were previously verifying that W3C headers weren't injected, but they
_will_ be if OTel is enabled, so just relaxed the restrictions there.
- For `OpenTelemetrySdkTests.SubmitsOtlpLogs`, the data changes
depending on whether otel is enabled or not, so just reset it to the
default for simplicity rather than wrestle with it (I spent some time
ping-ponging snapshots before I gave up 😅)
- Updated the gRPC snapshots to add `grpc.method` and `grpc.status_code`
which are now added to the aspnetcore spans

## Test coverage

Hopefully self explanatory 😅 

## Other details

Fixes https://datadoghq.atlassian.net/browse/DSMS-138

Technically, this could be a breaking change for some people, so maybe
we should revisit making the ignore activity handler configurable?
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants