Adding distinct activity for distributed tracing to YARP #2098
samsp-msft merged 10 commits into main
…o improve the lifetime.
If you add …
Re: Tags like routeId, is there prior art in adding such information to existing Activities? Things like that should be trivial to do now that …

    ActivitySource.AddActivityListener(new ActivityListener()
    {
        ShouldListenTo = source => source.Name == "System.Net.Http",
        ActivityStarted = activity =>
        {
            string routeId = "..."; // Get that somehow (IHttpContextAccessor?)
            activity.AddTag("YarpRoute", routeId);
        }
    });
HttpClientInstrumentation is added as part of the AzureMonitor helper. Adding to the existing activity versus adding another is an interesting question. You don't even need to get to the listener; you can probably just do Activity.Current.SetTag(...)
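For illustration, here is a minimal sketch of tagging the ambient activity directly, as suggested above. The source name "Demo.Proxy" and the route id "route1" are placeholders invented for this example, not names from YARP. The listener exists only so that the demo activity is actually created; the tagging line itself does not need one.

```csharp
using System;
using System.Diagnostics;

// Hypothetical source name, used only for this demo.
using var source = new ActivitySource("Demo.Proxy");

// A sampling listener is needed only so that StartActivity returns a real Activity here.
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == "Demo.Proxy",
    Sample = (ref ActivityCreationOptions<ActivityContext> _) => ActivitySamplingResult.AllData,
};
ActivitySource.AddActivityListener(listener);

using var activity = source.StartActivity("Outgoing request");

// Anywhere further down the call stack: tag whatever activity is current.
// The null-conditional guards against the case where nothing is being traced.
Activity.Current?.SetTag("YarpRoute", "route1"); // "route1" is a placeholder route id

Console.WriteLine(activity?.GetTagItem("YarpRoute"));
```

Using `?.` matters because `Activity.Current` is null whenever no activity has been started on the current execution flow.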
Other cleanup based on comments.
Tratcher left a comment:
Great, just needs some cleanup.
    var activity = Observability.YarpActivitySource.StartActivity("Proxy Forwarder", ActivityKind.Server);
    context.SetYarpActivity(activity);

    await _next(context);
This will now allocate an extra async state machine even if the request isn't being sampled.
Can you please change the logic to something along the lines of what I shared here: #2098 (comment)?
> This will now allocate an extra async state machine even if the request isn't being sampled. Can you please change the logic to something along the lines of what I shared here: #2098 (comment)?
It only adds it if the activity is created, which depends on the listener. If the activity is null, nothing gets set.
The issue is that the method is now async, which means you get an extra state machine allocation even if you don't have the activity.
If you want you can keep it as-is and I can clean it up in a follow up PR
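The allocation concern in this thread can be sketched as follows. This shows the general "non-async fast path" pattern, not the code from the linked comment. `Invoke`, `AwaitWithActivity`, the "Demo.Proxy" source, and the `Func<Task>` delegate are hypothetical stand-ins for the middleware method, YARP's activity source, and ASP.NET Core's `RequestDelegate`.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical stand-in for YARP's activity source.
var source = new ActivitySource("Demo.Proxy");

// Non-async wrapper: when no activity is created, the inner task is returned
// directly, so this method never allocates an async state machine.
Task Invoke(Func<Task> next)
{
    var activity = source.StartActivity("Proxy Forwarder", ActivityKind.Server);
    return activity is null ? next() : AwaitWithActivity(activity, next);
}

// Async slow path, taken only when a listener caused the activity to be created.
static async Task AwaitWithActivity(Activity activity, Func<Task> next)
{
    try
    {
        await next();
    }
    finally
    {
        activity.Dispose(); // Dispose stops the activity
    }
}

// No listener is registered in this demo, so the fast path is taken
// and the very same task instance comes back.
var inner = Task.CompletedTask;
Console.WriteLine(ReferenceEquals(Invoke(() => inner), inner)); // prints True
```

The trade-off is a second method, in exchange for keeping the unsampled path free of the extra allocation.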
…#8370)

## Summary of changes

- Set `DD_TRACE_OTEL_ENABLED=true` by default in integration tests
- Add additional excludes for activity handlers with equivalent custom instrumentation

## Reason for change

We had an escalation recently, which highlighted that we were missing some entries in the `IgnoreActivityHandler`. To avoid hitting similar issues in the future, we can set `DD_TRACE_OTEL_ENABLED=true` by default, so we know as soon as a new `ActivitySource` lights up.

## Implementation details

A _lot_ of trial and error here, mostly setting `DD_TRACE_OTEL_ENABLED=true` in the `TracingIntegrationTest` base class and seeing what breaks 😅 That pointed to a variety of extra handlers and differences in behaviour:

- `Couchbase.DotnetSdk.OpenTelemetryRequestTracer`, I don't think we actually need this one strictly, but it showed up while I was trying to fix persistent issues with Couchbase3, so I think it makes sense to exclude it
  - The _real_ issue I had is that some early versions of Couchbase (3.0.0 - 3.2.0) create activities using `new Activity()`, which means there's _no_ `ActivitySource` associated, and so we have no way to filter them 💀
  - Rather than fight with that, and because those versions are deprecated, just disabled OTel integration for these specific tested versions
- `connector-net` - this is the ActivitySource for `MySql.Data` (we already excluded the one for `MySqlConnector`)
- `RabbitMQ.Client.*` - these were the ones that caused the original issue, and were breaking DSM
- `Experimental.System.Net.Security` - this one came in .NET 9, and was causing extra spans in gRPC and Yarp tests
- `Grpc.Net.Client` - The gRPC client has had an `ActivitySource` [for a long time](grpc/grpc-dotnet#2244) 😅
- `Yarp.ReverseProxy` - ...[as has Yarp](dotnet/yarp#2098)

In addition, there were some extra "fixes" to the tests required:

- For the `HttpMessageHandler` tests, where the integration is disabled, we were previously verifying that W3C headers weren't injected, but they _will_ be if OTel is enabled, so just relaxed the restrictions there.
- For `OpenTelemetrySdkTests.SubmitsOtlpLogs`, the data changes depending on whether otel is enabled or not, so just reset it to the default for simplicity rather than wrestle with it (I spent some time ping-ponging snapshots before I gave up 😅)
- Updated the gRPC snapshots to add `grpc.method` and `grpc.status_code` which are now added to the aspnetcore spans

## Test coverage

Hopefully self explanatory 😅

## Other details

Fixes https://datadoghq.atlassian.net/browse/DSMS-138

Technically, this could be a breaking change for some people, so maybe we should revisit making the ignore activity handler configurable?

Adds distinct tracing objects (activities) to YARP.
These can then be picked up by monitoring middleware such as OpenTelemetry.
The activity source is only active if something is listening to it, for example
via `t.AddSource("Yarp.*")`.
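The "only active if something is listening" behaviour can be seen with the base class library alone. The sketch below uses a hand-written `ActivityListener` in place of OpenTelemetry, on the assumption that `AddSource` subscribes in a comparable way. "Yarp.Demo" is a made-up source name chosen to match the `Yarp.*` pattern; it is not the name YARP registers.

```csharp
using System;
using System.Diagnostics;

// Hypothetical source name that would match a "Yarp.*" wildcard.
using var source = new ActivitySource("Yarp.Demo");

// Nobody is listening yet, so no Activity object is created.
var before = source.StartActivity("Proxy Forwarder", ActivityKind.Server);
Console.WriteLine(before is null); // True

// Subscribe to every source whose name starts with "Yarp." and sample everything.
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name.StartsWith("Yarp.", StringComparison.Ordinal),
    Sample = (ref ActivityCreationOptions<ActivityContext> _) => ActivitySamplingResult.AllData,
};
ActivitySource.AddActivityListener(listener);

// With a listener attached, the same call now yields a real Activity.
using var after = source.StartActivity("Proxy Forwarder", ActivityKind.Server);
Console.WriteLine(after is not null); // True
```

Because `StartActivity` returns null when unobserved, callers are expected to null-check the result, which is what keeps the untraced path cheap.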