fix(aspnetcore): scope correlation processor per-factory to stop cross-factory tag leak#5891
Conversation
…s-factory tag leak
OptOut_DoesNotTag_AspNetCoreSpans failed on macOS in run 25572055061 with
a foreign GUID stamped into the opt-out exporter's activity. Cause: every
parallel WebApplicationFactory that calls AddAspNetCoreInstrumentation()
subscribes its TracerProvider to the process-global Microsoft.AspNetCore
ActivitySource. A single Activity per request is observed by every
subscribed processor, and any tag any processor writes is visible to every
exporter — so factory A's TUnitTestCorrelationProcessor was tagging
factory B's request activity, polluting the opt-out exporter.
[NotInParallel] on the OptOut class doesn't help because the contaminating
tests are sibling classes without the constraint.
Per-factory baggage approach:
- Each TestWebApplicationFactory gets a CorrelationScope holding a unique
FactoryId.
- A new FactoryScopeStartupFilter adds middleware that stamps the
Activity for every incoming request with a baggage entry carrying the
owning factory's id.
- TUnitTestCorrelationProcessor takes an optional CorrelationScope. When
bound, it tags only activities whose baggage carries its own factory
id; activities tagged by a sibling factory's middleware are skipped.
The parameterless processor constructor and AutoStart's process-wide use
are unchanged, so the TUnit.OpenTelemetry zero-config package and the
existing CorrelationProcessorTests keep their current behavior.
Two new regression tests cover the scoped path: a scoped processor refuses
to tag an activity bearing a different factory's baggage even when the
TraceRegistry fallback would happily resolve a TestId, and the same
processor still tags activities that carry its own factory's id.
Not up to standards ⛔🔴 Issues
|
| Category | Results |
|---|---|
| Security | 1 critical |
| CodeStyle | 1 minor |
🟢 Metrics 0 complexity
Metric Results Complexity 0
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
There was a problem hiding this comment.
Review: fix(aspnetcore): scope correlation processor per-factory
The root-cause analysis is correct and the approach is sound overall. The per-factory baggage filter cleanly isolates activity tagging without breaking the zero-config TUnit.OpenTelemetry path. A few things worth considering:
Medium — tunit.factory.id propagates in W3C baggage headers to downstream services
Activity.AddBaggage(...) writes a value that W3C propagators (including the ActivityPropagationHandler already threaded through the factory's HttpClient) will include in outgoing baggage request headers. In a test setup that talks to real downstream services or records HTTP exchanges, the factory GUID leaks into those headers. It's test-only metadata that downstream services shouldn't see.
Better approach: Store the factory→trace-ID mapping in an in-memory dictionary (similar to TraceRegistry), keyed by the Activity's TraceId, populated by the middleware:
// In FactoryActivityRegistry (internal to TUnit.AspNetCore):
internal static class FactoryActivityRegistry
{
private static readonly ConcurrentDictionary<ActivityTraceId, string> _map = new();
internal static void Register(ActivityTraceId traceId, string factoryId) => _map[traceId] = factoryId;
internal static string? GetFactoryId(ActivityTraceId traceId) => _map.TryGetValue(traceId, out var id) ? id : null;
}The middleware calls FactoryActivityRegistry.Register(activity.TraceId, _scope.FactoryId) instead of activity.AddBaggage(...), and the processor checks the registry instead of reading baggage. This keeps factory identity entirely in-process and out of propagated headers.
Low — OnStart is completely skipped for scoped processors, even for child activities
The comment correctly explains why root-level server activities can't be tagged at OnStart (middleware runs after ActivitySource.StartActivity). But child activities created during request handling (e.g. database spans, outgoing HttpClient spans within the SUT) do have the factory baggage available at OnStart via the parent chain. Skipping OnStart entirely for scoped processors is conservative and causes these spans to only be tagged at OnEnd.
In practice this is fine for BatchExportProcessor and InMemoryExporter, and OnEnd covers the root-activity case. But it's a silent behavioural difference from the unscoped path that may surprise users registering SimpleExportProcessor before the correlation processor. Worth a clarifying note or a targeted early-return only for root activities:
public override void OnStart(Activity activity)
{
// Only skip OnStart for root activities (no in-process parent) — factory baggage
// is set by middleware that runs after StartActivity, so it can't be read here.
if (_scope is not null && activity.Parent is null)
return;
TryTag(activity);
}Low — Missing test: scoped processor with activity that carries no factory baggage
The two new tests cover alien-factory baggage and own-factory baggage. There is no test for a scoped processor observing an activity that has no tunit.factory.id baggage at all (e.g. a background timer or a span started outside any request pipeline, observed by the factory's TracerProvider via AddHttpClientInstrumentation). The expected outcome — skip, even if TraceRegistry has an entry — falls out naturally from null != _scope.FactoryId, but making it explicit would document the intentional behaviour change versus the unscoped processor and guard against future regressions.
Positive notes
- The
CorrelationScope+FactoryScopeStartupFiltersplit cleanly separates concerns; the scope object is a dumb value and the filter/processor share it by reference without coupling. [EditorBrowsable(EditorBrowsableState.Never)]is appropriate for a type that must be public (because it appears in the processor's public constructor) but isn't part of the user-facing API.- Backward compatibility is fully preserved — the parameterless constructor and
AutoStart's process-wide path are unchanged. - The regression tests are well-constructed, especially the
ScopedProcessor_DoesNotTag_ActivitiesFromOtherFactoriestest that includes aTraceRegistryhit to prove it's the factory guard — not the registry miss — that blocks tagging.
The baggage-leakage point (item 1) is the only structural concern worth resolving before merge. Items 2 and 3 are optional improvements.
…Scope additions The PR's per-factory scoping added a new public sealed CorrelationScope type and a new public TUnitTestCorrelationProcessor(CorrelationScope?) constructor to TUnit.OpenTelemetry. The OpenTelemetry_Library_Has_No_API_Changes snapshot test was failing on all CI legs because the verified files didn't reflect the new surface. Regenerated locally from the received output; identical across the three TFM-specific snapshots except for line 4 (framework display name).
There was a problem hiding this comment.
Follow-up Review
Checking against the previous review — the PR has not been updated since that review was posted (still at commit 97cd6f36). None of the three raised items have been addressed yet.
Status of prior findings
Medium — tunit.factory.id leaks into W3C baggage headers (unresolved)
The middleware still calls activity.AddBaggage(CorrelationScope.FactoryIdBaggageKey, _scope.FactoryId). W3C propagators will include this in outgoing baggage: request headers to any downstream service the SUT contacts during tests. This is test infrastructure leaking into observed traffic.
The suggested fix from the prior review remains valid: replace the baggage write with an in-process registry keyed by ActivityTraceId (similar to TraceRegistry), so factory identity never leaves the process. This is the only structural concern blocking merge.
Low — OnStart fully skipped for scoped processors (unresolved)
Child activities (database spans, outgoing HttpClient spans created inside the SUT) have factory baggage available at OnStart via the parent chain, but are silently skipped and only tagged at OnEnd. The suggested narrowing:
if (_scope is not null && activity.Parent is null)
return;…would restore OnStart tagging for child activities while keeping the correct skip for root-level server activities. Optional but worth considering.
Low — Missing test for no-factory-baggage activity on a scoped processor (unresolved)
A scoped processor observing an activity with no tunit.factory.id baggage at all (e.g. a background timer span) should skip it, even when TraceRegistry has an entry. This falls out naturally from the current code (null != _scope.FactoryId) but has no explicit test. Optional.
No regressions introduced, the architecture and test quality remain good. The baggage-leakage item is the only one I'd consider a blocker.
Summary
OptOut_DoesNotTag_AspNetCoreSpansfailed on macOS in run 25572055061 with a foreign GUID stamped into the opt-out exporter's activity.Cause. Every parallel
WebApplicationFactorythat callsAddAspNetCoreInstrumentation()subscribes itsTracerProviderto the process-globalMicrosoft.AspNetCoreActivitySource. A singleActivityper request is observed by every subscribed processor — and any tag any processor writes is visible to every exporter. Factory A'sTUnitTestCorrelationProcessorwas therefore tagging factory B's request activity with factory A'sTestId, polluting the opt-out exporter.[NotInParallel]on the OptOut class doesn't help because the contaminating tests are sibling classes without the constraint.Approach: per-factory baggage filter
TestWebApplicationFactorynow owns aCorrelationScopewith a uniqueFactoryId.FactoryScopeStartupFilteradds middleware that stamps every incoming request'sActivitywith a baggage entry carrying the owning factory's id.TUnitTestCorrelationProcessortakes an optionalCorrelationScopeconstructor argument. When bound, it tags only activities whose baggage matches its own factory id; activities tagged by a sibling factory's middleware are skipped.AutoStart's process-wide use are unchanged, so theTUnit.OpenTelemetryzero-config package keeps its current behavior.Test plan
TUnit.AspNetCore.Testspass locally including bothAutoConfigureOpenTelemetryTestsandAutoConfigureOpenTelemetryOptOutTests.CorrelationProcessorTestsstill pass (parameterless constructor unchanged).TraceRegistrywould resolve aTestId), and still tags activities that carry its own factory's id..NETworkflow runs across all three OS legs.