Properly ingest dotnetup data after data-x-migration #54051

Draft
nagilson wants to merge 32 commits into dotnet:release/dnup from nagilson:nagilson-dnup-otel-migration-for-data-x

nagilson (Member) commented Apr 22, 2026

Resolves #52785

History:
#52792 added working telemetry to dotnetup, with an Application Insights dashboard, but used a separate implementation because the SDK did not use OpenTelemetry at the time (not Native AOT).
#53343 moved to the official key, which broke ingestion and caused all data to be lost (intentional).
#53181 implemented OpenTelemetry usage within the .NET SDK.

This PR:

  • Produces traces (events under activities) instead of only spans, because the data-x platform that ingests the data does not consume spans
  • Reworks the dashboard to query the correct tables and properties, since the data team enforces standards for property names
  • Keeps the spans, as they can still be useful for an Aspire-like dashboard (I have not tested this myself, but this is what others suggest)
  • Tries to BLOCK direct writes to the span or the event trace, so that any data added goes through a wrapper which ensures it is applied to both
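
To illustrate the last bullet, here is a minimal sketch of what such a wrapper could look like. `TelemetryScope`, `SetProperty`, and `HasEventTag` are hypothetical names for illustration, not the types in this PR, and the sketch attaches the event with `Activity.AddEvent` purely to show the "write once, apply to both" idea.

```csharp
using System;
using System.Diagnostics;

// Hypothetical wrapper (not the PR's actual type): callers never touch
// the Activity directly, so every property lands on the span AND the event.
public sealed class TelemetryScope : IDisposable
{
    private readonly Activity? _activity;
    private readonly ActivityTagsCollection _eventTags = new();
    private readonly string _eventName;

    public TelemetryScope(ActivitySource source, string name)
    {
        _eventName = name;
        // StartActivity returns null when no listener samples this source.
        _activity = source.StartActivity(name);
    }

    // The single entry point for adding data.
    public void SetProperty(string key, object? value)
    {
        _activity?.SetTag(key, value); // span side
        _eventTags[key] = value;       // event side
    }

    public bool HasEventTag(string key) => _eventTags.ContainsKey(key);

    public void Dispose()
    {
        // Attach the collected properties as one event, then end the span.
        _activity?.AddEvent(new ActivityEvent(_eventName, DateTimeOffset.UtcNow, _eventTags));
        _activity?.Dispose();
    }
}
```

A test that scans for direct `SetTag` calls outside the wrapper would then be enough to enforce the rule.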

very wip code, but:

Spans don't get consumed by the data-x platform, only events, which go into the 'traces' table. So we need to send this data as a 'trace' (event), but still keep the spans so we can use the Aspire-like dashboards.

I believe duration and other span properties don't get added to the activities, so we need to add them ourselves.
… and activity

Also add a test that fails if the code tries to call SetTag directly.
we observe spotty behavior by running at scale in ci for now!
Comment thread src/Installer/dotnetup/GarbageCollector.cs Outdated
Comment thread src/Installer/dotnetup/Program.cs
Comment thread src/Installer/dotnetup/Commands/Shared/InstallWorkflow.cs
Activity events cause a trace to be logged, but the conventional way to send telemetry to the traces table is through a logger. This implementation therefore uses a logger to send the data, instead of creating another activity from an existing one and using it to publish an activity event.
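
A rough sketch of the logger approach, assuming the `Microsoft.Extensions.Logging` abstractions. `TraceEventWriter` and the `duration_ms` key are hypothetical names for illustration, not what this PR uses; how the key/value state surfaces in the traces table depends on the configured exporter.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using Microsoft.Extensions.Logging;

// Hypothetical helper, not the PR's actual type.
public static class TraceEventWriter
{
    // Copies the span's tags and adds the elapsed time, since a log record
    // does not carry span fields such as duration on its own.
    public static List<KeyValuePair<string, object?>> BuildState(Activity activity)
    {
        var state = new List<KeyValuePair<string, object?>>(activity.TagObjects);
        var elapsed = DateTime.UtcNow - activity.StartTimeUtc;
        state.Add(new KeyValuePair<string, object?>("duration_ms", elapsed.TotalMilliseconds));
        return state;
    }

    // Emits one log record named after the activity, with the state attached.
    public static void Write(ILogger logger, Activity activity)
    {
        var state = BuildState(activity);
        logger.Log(
            LogLevel.Information,
            new EventId(0, activity.OperationName),
            state,
            exception: null,
            formatter: (_, _) => activity.OperationName);
    }
}
```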

Error information is propagated up to each parent activity.
Exceptions tables get dropped by the data-x platform, so we don't monitor those.
This also tries to ensure properties propagate properly onto the spans via the wrapper.
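
For reference, propagating an error up the parent chain can be sketched as below. `ActivityErrors.MarkFailed` and the `error.type` tag name are assumptions for illustration, not the names used in this PR.

```csharp
using System.Diagnostics;

// Hypothetical helper, not the PR's actual type.
public static class ActivityErrors
{
    // Walks from the failing activity up through each in-process parent,
    // marking every one as failed so a query on any level sees the error.
    public static void MarkFailed(Activity? activity, string errorType)
    {
        for (var current = activity; current is not null; current = current.Parent)
        {
            current.SetStatus(ActivityStatusCode.Error, errorType);
            current.SetTag("error.type", errorType); // assumed property name
        }
    }
}
```

Note that `Activity.Parent` is only populated for in-process parents, so this walk stops at the root activity of the current process.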
{
DotnetupTelemetry.Instance.Flush(timeoutMilliseconds: 100);
DotnetupTelemetry.Instance.WriteLogIfNecessary();
DotnetupTelemetry.Instance.Flush(timeoutMilliseconds: 5000);
Member Author

Do we want to call all of these? If we do, do we really want all three to be separate, or can we make this one function, or have it happen automatically on dispose?
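
One possible shape for the "on dispose" option, as a sketch only. `TelemetryShutdown` is a hypothetical type, and the delegates stand in for the real `Flush` and `WriteLogIfNecessary` calls so the sketch assumes nothing about their signatures beyond what the diff shows.

```csharp
using System;

// Hypothetical wrapper; the delegates stand in for the real telemetry calls.
public sealed class TelemetryShutdown : IDisposable
{
    private readonly Action<int> _flush;
    private readonly Action _writeLog;
    private bool _disposed;

    public TelemetryShutdown(Action<int> flush, Action writeLog)
    {
        _flush = flush;
        _writeLog = writeLog;
    }

    // Runs the same three steps, in the same order, exactly once.
    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;
        _flush(100);   // short flush first
        _writeLog();   // then write the log if needed
        _flush(5000);  // final, longer flush
    }
}
```

At the call site this would reduce to something like `using var shutdown = new TelemetryShutdown(ms => DotnetupTelemetry.Instance.Flush(timeoutMilliseconds: ms), () => DotnetupTelemetry.Instance.WriteLogIfNecessary());`.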

…to reclassify the same properties on a bunch of ops
@nagilson
Member Author

I don't think we need the error event; we can simply query for the error property on the failing event(s).
