diff --git a/docs/agent365-guided-setup/a365-observability-instructions.md b/docs/agent365-guided-setup/a365-observability-instructions.md index 69519d3e..0946e1d8 100644 --- a/docs/agent365-guided-setup/a365-observability-instructions.md +++ b/docs/agent365-guided-setup/a365-observability-instructions.md @@ -12,203 +12,665 @@ Add Agent 365 observability to your agent at any point after `a365 setup all` ha --- -## Task A — Add the observability SDK to the project +## Overview -Ask your coding agent (Claude Code, GitHub Copilot, or similar): +This skill instruments Microsoft Agent 365 observability into an existing agent codebase +without disrupting the agent's core logic. It: -> "Using #file:a365-observability-instructions.md, add observability to this project" +1. **Detects** the agent type (.NET AgentFramework, Node.js, or Python) +2. **Installs** the correct A365 observability packages (core + hosting + optional extensions) +3. **Wires** observability in the entry point +4. **Adds** BaggageBuilder context or BaggageMiddleware to message handlers +5. **Implements** the agentic token resolver with caching +6. **Adds** manual instrumentation scopes (InvokeAgentScope, InferenceScope, ExecuteToolScope — **required for store publishing**) +7. **Updates** configuration files with observability settings +8. **Validates** the build passes -> **Do NOT search NuGet package caches, inspect package XML files, browse MS Learn, or probe installed package versions. All required code is provided verbatim below — use it as-is.** +> **Store publishing requirement:** The Agent 365 store validation requires `InvokeAgentScope`, +> `InferenceScope`, and `ExecuteToolScope` to be implemented. This skill wires them. -1. 
Install the observability SDK packages: - - **.NET** — run these two commands in the project directory: - ``` - dotnet add package Microsoft.Agents.A365.Observability.Runtime --prerelease - dotnet add package Microsoft.Agents.A365.Observability.Hosting --prerelease - ``` - - Python / Node.js: see the MS Learn reference for current package names +All changes are **additive** and **idempotent** — rerunning the skill is safe. -### .NET helper files — scaffold before step 2 (agentType = 3 only) +--- -> **agentType = 3 only.** Skip this section for `agentType = 1` (Entra app ID agents) — those use the standard agentic token flow and do not need these files. -> -> These two files bridge gaps in the current SDK release and will be incorporated into `Microsoft.Agents.A365.Observability.Hosting` in a future version. Create them in an `Observability/` subfolder at the root of your project, replacing `` with the project's root namespace. +## Phase 0: Load Detection Cache and Validate -**`Observability/ObservabilityServiceExtensions.cs`** — registers the S2S token cache, background token service, and injectable `Agent365ObservabilityContext`: +**TaskCreate** — "Load detection cache and validate with user" -```csharp -using Microsoft.Agents.A365.Observability.Hosting; -using Microsoft.Agents.A365.Observability.Runtime.Tracing.Contracts; -using Microsoft.Extensions.Configuration; -using Microsoft.Extensions.DependencyInjection; +**Read** `.a365-workspace-detection.json`. -namespace ; +If the file is missing or `detectedAt` is older than 60 minutes: +> "`a365-setup` must be run before this skill — it registers your agent with Agent 365 and writes +> the project detection cache this skill depends on. Run `a365-setup` now, then return here." -// Injectable singleton wrapping AgentDetails for single-tenant agents. -// Pass ctx.AgentDetails to InvokeAgentScope.Start() for span attributes. 
-public sealed class Agent365ObservabilityContext -{ - public AgentDetails AgentDetails { get; } - internal Agent365ObservabilityContext(AgentDetails d) => AgentDetails = d; -} +Stop until the user confirms `a365-setup` has been run. + +Load from cache: `agentStack`, `programmingLanguage`, `usesTeamsOrCopilot`, `agentType`, `authMode` (if previously stored). + +Present the loaded values in one message and wait for confirmation: -public static class ObservabilityServiceExtensions -{ - // Registers S2S token cache + exporter, ObservabilityTokenService, and Agent365ObservabilityContext. - // Config is written by `a365 setup all` under the Agent365Observability section. - public static IServiceCollection AddAgent365Observability( - this IServiceCollection services, - string? clusterCategory = "production") - { - services.AddServiceTracingExporter(clusterCategory); - services.AddHostedService(); - services.AddSingleton(sp => - { - var obs = sp.GetRequiredService().GetSection("Agent365Observability"); - var agentDetails = new AgentDetails( - agentId: obs["AgentId"], - agentName: obs["AgentName"], - agentDescription: obs["AgentDescription"], - agentBlueprintId: obs["AgentBlueprintId"], - tenantId: obs["TenantId"] - ?? throw new InvalidOperationException("Agent365Observability:TenantId is required.")); - return new Agent365ObservabilityContext(agentDetails); - }); - return services; - } -} ``` +Here's what we detected about your agent: + • Stack: {agentStack} + • Language: {programmingLanguage} -**`Observability/ObservabilityTokenService.cs`** — background service that acquires a Power Platform token via a 3-hop FMI chain and refreshes it every 50 minutes: +Reply **yes** to confirm, or describe any corrections. +``` -```csharp -using Azure.Core; -using Azure.Identity; -using Microsoft.Agents.A365.Observability.Hosting.Caching; -using Microsoft.Identity.Client; - -namespace ; - -// Acquires a Power Platform token for A365 observability via a 3-hop FMI chain. 
-// Hop 1+2: Blueprint authenticates (MSI in prod, client secret locally) → -// gets T1 via .WithFmiPath(agentId) to Agent Identity. -// Hop 3: Agent Identity uses T1 as assertion → Power Platform token. -// (ServiceIdentity type — AADSTS82001 does not apply.) -internal sealed class ObservabilityTokenService : BackgroundService -{ - private static readonly string[] FmiScopes = ["api://AzureADTokenExchange/.default"]; - private static readonly string[] PowerPlatformScopes = ["https://api.powerplatform.com/.default"]; - private static readonly TimeSpan RefreshInterval = TimeSpan.FromMinutes(50); - - private readonly IExporterTokenCache _tokenCache; - private readonly ILogger _logger; - private readonly string _blueprintClientId, _blueprintClientSecret, _tenantId, _agentId; - - public ObservabilityTokenService( - IExporterTokenCache tokenCache, - ILogger logger, - IConfiguration configuration) - { - _tokenCache = tokenCache; - _logger = logger; - var obs = configuration.GetSection("Agent365Observability"); - _tenantId = obs["TenantId"] ?? throw new InvalidOperationException("Agent365Observability:TenantId is required."); - _agentId = obs["AgentId"] ?? throw new InvalidOperationException("Agent365Observability:AgentId is required."); - _blueprintClientId = obs["ClientId"] ?? throw new InvalidOperationException("Agent365Observability:ClientId is required."); - // ClientSecret is required at construction time even in production: - // MSI is tried first; the secret is only used as a local-dev fallback. - // Ensure ClientSecret is present in all environments (can be a placeholder in prod if MSI is guaranteed). - _blueprintClientSecret = obs["ClientSecret"] ?? 
throw new InvalidOperationException("Agent365Observability:ClientSecret is required."); - } - - protected override async Task ExecuteAsync(CancellationToken stoppingToken) - { - _logger.LogInformation("ObservabilityTokenService started."); - while (!stoppingToken.IsCancellationRequested) - { - try { await AcquireAndRegisterTokenAsync(stoppingToken); } - catch (Exception ex) when (!stoppingToken.IsCancellationRequested) - { _logger.LogWarning(ex, "Failed to acquire observability token; will retry in {Interval}.", RefreshInterval); } - try { await Task.Delay(RefreshInterval, stoppingToken); } - catch (OperationCanceledException) { break; } - } - _logger.LogInformation("ObservabilityTokenService stopped."); - } - - private async Task AcquireAndRegisterTokenAsync(CancellationToken ct) - { - string t1Token; - string authority = $"https://login.microsoftonline.com/{_tenantId}"; - - // Hop 1+2: Blueprint → T1 via FMI path (MSI in prod, client secret locally) - try - { - // ManagedIdentityCredential.GetTokenAsync uses a resource URI (no /.default suffix). - // FmiScopes uses /.default format — correct for MSAL AcquireTokenForClient. - // These two forms are intentionally different; do not "fix" them to match. 
- var assertion = await new ManagedIdentityCredential() - .GetTokenAsync(new TokenRequestContext(["api://AzureADTokenExchange"]), ct); - t1Token = (await ConfidentialClientApplicationBuilder - .Create(_blueprintClientId) - .WithClientAssertion((AssertionRequestOptions _) => Task.FromResult(assertion.Token)) - .WithAuthority(new Uri(authority)).Build() - .AcquireTokenForClient(FmiScopes).WithFmiPath(_agentId) - .ExecuteAsync(ct)).AccessToken; - } - catch (AuthenticationFailedException) - { - // MSI unavailable — fall back to client secret (local dev) - t1Token = (await ConfidentialClientApplicationBuilder - .Create(_blueprintClientId) - .WithClientSecret(_blueprintClientSecret) - .WithAuthority(new Uri(authority)).Build() - .AcquireTokenForClient(FmiScopes).WithFmiPath(_agentId) - .ExecuteAsync(ct)).AccessToken; - } - - // Hop 3: Agent Identity uses T1 → Power Platform token - var ppResult = await ConfidentialClientApplicationBuilder - .Create(_agentId) - .WithClientAssertion((AssertionRequestOptions _) => Task.FromResult(t1Token)) - .WithAuthority(new Uri(authority)).Build() - .AcquireTokenForClient(PowerPlatformScopes) - .ExecuteAsync(ct); - - _tokenCache.RegisterObservability(_agentId, _tenantId, ppResult.AccessToken, PowerPlatformScopes); - _logger.LogInformation("Observability token registered for agent {AgentId}.", _agentId); - } -} +**TaskUpdate** — Mark complete: "Load detection cache and validate with user" + +--- + +## Phase 0.5: Agent Kind and Authentication Mode + +**TaskCreate** — "Determine agent kind and authentication mode" + +Follow the agent kind and auth mode detection rules to determine the correct values. + +If `agentType` and `authMode` are already present in the detection cache (from a prior skill run in this session), confirm the values with the user and skip the questions. 
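The cache checks in Phase 0 and Phase 0.5 can be sketched as follows. This is a minimal illustration, not part of the skill: it assumes `detectedAt` is stored as an ISO 8601 timestamp (the format is not specified here), and the helper names `load_detection_cache` and `needs_auth_questions` are hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path

CACHE_FILE = Path(".a365-workspace-detection.json")
MAX_AGE = timedelta(minutes=60)  # Phase 0 treats an older cache as stale


def load_detection_cache(path=CACHE_FILE, now=None):
    """Return the cache dict, or None if the file is missing or stale."""
    path = Path(path)
    if not path.exists():
        return None
    cache = json.loads(path.read_text(encoding="utf-8"))
    raw = cache.get("detectedAt")
    if not raw:
        return None
    # Assumption: detectedAt is ISO 8601; a trailing "Z" is normalised for fromisoformat.
    detected_at = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if detected_at.tzinfo is None:
        detected_at = detected_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    if now - detected_at > MAX_AGE:
        return None
    return cache


def needs_auth_questions(cache):
    """True when agentType or authMode is not yet stored in the cache."""
    return not (cache.get("agentType") and cache.get("authMode"))
```

A `None` result corresponds to the "run `a365-setup` first" branch; a `False` result from `needs_auth_questions` corresponds to confirming the stored values and skipping the questions.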
+ +Store `agentType` (`ai-teammate` = AI Teammate, or `system-agent` = Agent (Non AI Teammate)) and `authMode`: +- **AI Teammate:** `user-delegated` (OBO as signed-in user) or `agentic-identity` (OBO as agent's own M365 identity) +- **Agent (Non AI Teammate):** `agentic-identity` (Assistive OBO) or `S2S` (Autonomous / Service Principal) + +**Update `.a365-workspace-detection.json`** — merge `agentType` and `authMode` into the existing cache file, preserving all other fields (`agentStack`, `programmingLanguage`, `usesTeamsOrCopilot`, `detectedAt`). Use the **Write** tool to write the merged object back. + +The `authMode` value drives Phases 3–5: OBO and S2S paths differ in entry point wiring (Phase 3), message handler pattern (Phase 4), and token resolver (Phase 5). It also selects the package set installed in Phase 2. **Phases 6, 7, and 8 are identical regardless of `authMode`.** + +**TaskUpdate** — Mark complete: "Determine agent kind and authentication mode" + +--- + +## Phase 1: Detect Agent Type + +**TaskCreate** — "Detect agent type and load reference patterns" + +1. **Run detection** using the following heuristics: + - Check for `.NET AgentFramework` indicators (Microsoft.Agent.*, AgentFramework) → `.csproj` + - Check for `Node.js` indicators (package.json, @langchain, openai, @microsoft/agents-*) + - Check for `Python` indicators (requirements.txt, pyproject.toml, `.py` files, `microsoft-agents`) + - Determine package file (*.csproj, package.json, pyproject.toml/requirements.txt) + - Determine entry point (Program.cs, index.ts/js, app.py / host_agent_server.py) + - Determine message handler location + +2. **Load reference patterns:** + - If .NET: **Read** `./references/dotnet-observability.md` + - If Node.js: **Read** `./references/nodejs-observability.md` + - If Python: **Read** `./references/python-observability.md` + +3. **If agent type cannot be determined**, write marker `.a365setup-unknown-agent` and **exit early** with a clear error message. + +4. 
**TaskUpdate** — Mark complete and report detected agent type to user. + +--- + +## Phase 2: Install A365 Observability Packages + +**TaskCreate** — "Install A365 observability packages" + +### For .NET AgentFramework + +1. **Bash** — Run package installation (path-dependent): + + **OBO path** (`user-delegated` or `agentic-identity`): + ```bash + dotnet add package Microsoft.Agents.A365.Observability.Runtime + dotnet add package Microsoft.Agents.A365.Observability.Hosting + ``` + + **S2S / autonomous agents — use the unified distro** (preferred; do NOT also add Runtime/Hosting): + ```bash + dotnet add package Microsoft.OpenTelemetry --version 1.0.0-beta.1 + dotnet add package Azure.Identity + dotnet add package Microsoft.Identity.Client + # Required: Microsoft.OpenTelemetry v1.0.0-beta.1 has a hard runtime dependency on + # Microsoft.Extensions.Logging v10. Use the stable 10.0.4 release — do NOT use a + # preview version; Microsoft.Agents.A365.Observability.Hosting requires >= 10.0.4 + # and a lower preview causes NU1605 downgrade errors at restore time. + dotnet add package Microsoft.Extensions.Logging --version "10.0.4" + ``` + > **⚠️ Do NOT add `Microsoft.Agents.A365.Observability.Hosting` or `.Runtime` as + > direct `<PackageReference>` entries when using the unified distro.** `Microsoft.OpenTelemetry` + > already brings both as transitive dependencies and re-exports their types. Adding them + > directly causes **CS0433 type ambiguity** on `AgentDetails`, `CallerDetails`, and + > `IExporterTokenCache`. Remove any explicit Hosting/Runtime references from the `.csproj`. + > + > **⚠️ TFM requirement:** If the project targets `net8.0`, upgrade to `net9.0` or later. + > `Microsoft.OpenTelemetry` v1.0.0-beta.1 has a hard runtime dependency on + > `Microsoft.Extensions.Logging` v10, which is not part of the `net8.0` or `net9.0` + > framework — it must be a direct reference so the assembly is copied to the output + > directory. Without it, startup fails with `FileNotFoundException`. + +2. 
**Optional auto-instrumentation extensions** — ask the user which AI framework they use. + + **If the user selects `Extensions.OpenAI` — pre-flight check (do this first, as a named step):** + ```bash + dotnet list package | grep Azure.AI.OpenAI + ``` + If the installed version is below `2.7.0-beta.2`, upgrade it **before** installing the extension: + ```bash + dotnet add package Azure.AI.OpenAI --version 2.7.0-beta.2 + ``` + Do this proactively — do not wait for a build failure to discover the version conflict. + + Then install the selected extension(s): + ```bash + # Semantic Kernel + dotnet add package Microsoft.Agents.A365.Observability.Extensions.SemanticKernel + # OpenAI (requires Azure.AI.OpenAI >= 2.7.0-beta.2 — checked above) + dotnet add package Microsoft.Agents.A365.Observability.Extensions.OpenAI + # Agent Framework + dotnet add package Microsoft.Agents.A365.Observability.Extensions.AgentFramework + ``` + +3. **Verify** the packages appear in the `.csproj` file. + +### For Node.js + +1. **Bash** — Run package installation (core + hosting + unified distro): + ```bash + npm install @microsoft/opentelemetry + npm install @microsoft/agents-a365-observability + npm install @microsoft/agents-a365-runtime + npm install @microsoft/agents-a365-observability-hosting + ``` + +2. **Optional auto-instrumentation extensions** — ask the user which AI framework they use. + + **If the user selects `extensions-openai` — pre-flight check (do this first):** + The extension requires `@openai/agents ^0.7.0` as a peer dependency — this is the **OpenAI Agents SDK**, NOT the `openai` npm package and NOT `@azure/openai`. 
Check and install the peer dep first: + ```bash + npm list @openai/agents + # If missing or below 0.7.0: + npm install @openai/agents@^0.7.0 + ``` + + Then install the selected extension(s): + ```bash + # OpenAI Agents SDK (requires @openai/agents ^0.7.0 — checked above) + npm install @microsoft/agents-a365-observability-extensions-openai + # LangChain + npm install @microsoft/agents-a365-observability-extensions-langchain + ``` + +3. **Verify** the packages appear in `package.json`. + +### For Python + +1. **Bash** — Run package installation (unified distro + S2S deps): + ```bash + pip3 install microsoft-opentelemetry 2>/dev/null || pip install microsoft-opentelemetry + # S2S path also requires: + pip3 install msal azure-identity httpx 2>/dev/null || pip install msal azure-identity httpx + ``` + > **OBO path:** only `microsoft-opentelemetry` is required. The `msal`, `azure-identity`, and `httpx` packages are only needed for the S2S token service. + +2. **Optional auto-instrumentation extensions** — ask the user which AI framework they use and install accordingly: + ```bash + # Semantic Kernel + pip3 install microsoft-agents-a365-observability-extensions-semantic-kernel 2>/dev/null || pip install microsoft-agents-a365-observability-extensions-semantic-kernel + # OpenAI Agents SDK + pip3 install microsoft-agents-a365-observability-extensions-openai 2>/dev/null || pip install microsoft-agents-a365-observability-extensions-openai + # Agent Framework + pip3 install microsoft-agents-a365-observability-extensions-agent-framework 2>/dev/null || pip install microsoft-agents-a365-observability-extensions-agent-framework + # LangChain + pip3 install microsoft-agents-a365-observability-extensions-langchain 2>/dev/null || pip install microsoft-agents-a365-observability-extensions-langchain + ``` + +3. **Update the dependency manifest** — `pip install` does not modify `requirements.txt` or `pyproject.toml` automatically. 
Explicitly add the installed packages: + - `requirements.txt` project: append `microsoft-opentelemetry` (and S2S deps if applicable) + - `pyproject.toml` project: add under `[project] dependencies` or run `uv add microsoft-opentelemetry` / `poetry add microsoft-opentelemetry` + +4. **Verify** the packages appear in `requirements.txt` or `pyproject.toml`. + +5. **TaskUpdate** — Mark complete. + +--- + +## Phase 3: Wire Observability in Entry Point + +**TaskCreate** — "Wire observability in entry point" + +> **Pre-existing placeholders:** As of CLI 1.1, `a365 setup all` auto-writes `Agent365Observability` placeholder sections to `appsettings.json` (.NET) or `.env` (Node.js/Python). Before creating config from scratch, **check if placeholders already exist** and fill in values rather than duplicating the section. + +### For .NET AgentFramework + +1. **Read** the current entry point (`Program.cs` or detected file). + +2. **Edit** — Add observability wiring following the reference pattern in `dotnet-observability.md`: + - Add using directives for the observability namespaces + - **OBO path** (`user-delegated` or `agentic-identity`): call `builder.Services.AddAgenticTracingExporter();` then `builder.AddA365Tracing();` + - **S2S path**: First **Write** the two scaffold files from the reference doc — `Observability/ObservabilityServiceExtensions.cs` (DI extension with `AddAgent365Observability()` using `ServiceTokenCache` and conditional `ObservabilityTokenService`) and `Observability/ObservabilityTokenService.cs` (background service that acquires the Observability API token via the MSAL FMI 3-hop chain with `.WithFmiPath()` targeting scope `api://9b975845-388f-4429-889e-eab1ef63949c/.default`, supports MSI with client-secret fallback). Then call `builder.Services.AddAgent365Observability();` and `builder.UseMicrosoftOpenTelemetry(...)` with token resolver reading from the `ServiceTokenCache`. 
**Critical:** Set `o.Agent365.Exporter.UseS2SEndpoint = true` in the options callback — without this, the exporter posts to the wrong path (`/observability/` instead of `/observabilityService/`) and gets HTTP 401. See "Known Issues" section. + - Optionally register `adapter.Use(new BaggageTurnMiddleware())` (OBO path only) to auto-populate baggage on every request + - Mark all new lines with: `// A365 Observability — best-effort instrumentation (verify against official sample)` + +3. **Preserve** all existing code — only add new lines, never remove. + +### For Node.js + +1. **Read** the current entry point (`index.ts`, `app.ts`, or detected file). + +2. **Edit** — Add observability initialization following the reference pattern in `nodejs-observability.md`: + - Add imports for `ObservabilityManager` from `@microsoft/agents-a365-observability` + - **OBO path**: Call `useMicrosoftOpenTelemetry({ a365: { enabled: true, tokenResolver } })` from `@microsoft/opentelemetry` **before** any LLM/framework imports. The `tokenResolver` reads from `AgenticTokenCacheInstance`. + - **S2S path**: First **Write** `observability/token-cache.ts` (in-memory token cache with `cacheToken`/`getCachedToken`/`tokenResolver`) and `observability/observability-token-service.ts` using the scaffold pattern from `nodejs-observability.md` (S2S section). This module acquires the Observability API token via MSAL FMI 3-hop chain (`@azure/msal-node` with `fmiPath` parameter, targeting scope `api://9b975845-388f-4429-889e-eab1ef63949c/.default`, supports MSI with client-secret fallback) and refreshes it every 50 min. Then call `useMicrosoftOpenTelemetry()` with the S2S workaround pattern from `nodejs-observability.md` (custom `Agent365Exporter` + `A365SpanProcessor` via `spanProcessors` when `AGENT365_USE_S2S_ENDPOINT=true`). Set `ENABLE_A365_OBSERVABILITY_EXPORTER=false` in `.env`. Also run `npm install @microsoft/opentelemetry @azure/msal-node @azure/identity @opentelemetry/sdk-trace-base`. 
+ - Optionally register `adapter.use(new BaggageMiddleware())` (OBO path) to auto-populate baggage on every request + - Mark all new lines with: `// A365 Observability — best-effort instrumentation (verify against official sample)` + +3. **Preserve** all existing code — only add new lines, never remove. + +### For Python + +1. **Read** the current entry point (`app.py`, `host_agent_server.py`, or detected file). + +2. **Edit** — Add observability configuration following the reference pattern in `python-observability.md`: + - Add `from microsoft.opentelemetry import use_microsoft_opentelemetry` and call `use_microsoft_opentelemetry(enable_a365=True, a365_token_resolver=...)` with `service_name` and `service_namespace` + - **OBO path**: Wire `a365_token_resolver` to return the cached agentic token from `token_cache.py`. + - **S2S path**: First **Write** `observability/token_cache.py` (in-memory token cache with `cache_token`/`get_cached_token`) and `observability/observability_token_service.py` using the scaffold pattern from `python-observability.md` (S2S section). This module acquires the Observability API token via MSAL FMI 3-hop chain (`msal.ConfidentialClientApplication` with `fmi_path` parameter, targeting scope `api://9b975845-388f-4429-889e-eab1ef63949c/.default`, supports MSI with client-secret fallback) and refreshes it every 50 min via an `asyncio` background task. Then call `use_microsoft_opentelemetry(enable_a365=True, a365_token_resolver=...)` from `microsoft.opentelemetry` and schedule `run_token_service()` as an asyncio task. Also install `msal` and `azure-identity` if not already present. + - Optionally register `BaggageMiddleware` or use `ObservabilityHostingManager` on the adapter (OBO path) to auto-populate baggage on every request + - Mark all new lines with: `# A365 Observability — best-effort instrumentation (verify against official sample)` + +3. **Preserve** all existing code — only add new lines, never remove. + +4. 
**TaskUpdate** — Mark complete. + +--- + +## Phase 4: Add BaggageBuilder Context to Message Handler + +**TaskCreate** — "Add BaggageBuilder context to message handler" + +> **Skip this phase** if BaggageMiddleware was registered in Phase 3 — the middleware handles +> baggage propagation automatically for every request. + +> **Auth mode note:** All three `authMode` values use `authHandlerName: "AGENTIC"` in the +> code — the token exchange call is identical. The identity in traces is determined by Azure AD +> provisioning and the incoming token. Add an inline comment indicating which mode was chosen. + +### For .NET AgentFramework + +1. **Read** the detected message handler file. + +2. **Edit** — Follow the reference pattern in `dotnet-observability.md` based on `authMode`: + + **OBO path** (`user-delegated` or `agentic-identity`): + - Inject `IExporterTokenCache` in the constructor + - Use `new BaggageBuilder().FromTurnContext(turnContext).Build()` — requires `using Microsoft.Agents.A365.Observability.Hosting.Extensions;`; `Build()` returns `IDisposable`, use `using var` + - Call `RegisterObservability` with all four arguments per turn (wrap in try/catch — non-fatal): + ```csharp + _agentTokenCache.RegisterObservability( + turnContext.Activity.Recipient.AgenticAppId, + turnContext.Activity.Recipient.TenantId, + new AgenticTokenStruct( + userAuthorization: UserAuthorization, + turnContext: turnContext, + authHandlerName: "AGENTIC"), + EnvironmentUtils.GetObservabilityAuthenticationScope() + ); + ``` + - `user-delegated`: token exchange resolves to the **signed-in user's** identity → traces attributed to the user + - `agentic-identity`: token exchange resolves to the **agentic user** provisioned in Azure AD → traces attributed to the agent + - Add inline comment: `// A365 auth mode: {authMode} — see: https://learn.microsoft.com/en-us/entra/agent-id/agent-on-behalf-of-oauth-flow` + + **S2S path**: + - Inject `Agent365ObservabilityContext` (singleton registered by 
`AddAgent365Observability()`) in the constructor — **not** `IExporterTokenCache` + - **Baggage:** Use `new BaggageBuilder().FromTurnContext(turnContext).Build()` as a separate `using var baggageScope` — `FromTurnContext()` is an extension on `BaggageBuilder` **only**; it does not exist on `InvokeAgentScope` or any scope type + - **Scope:** Use `InvokeAgentScope.Start(new Request(...), new InvokeAgentScopeDetails(endpoint: new Uri("...")), _obs.AgentDetails, callerDetails)` as a separate `using var scope` — `InvokeAgentScopeDetails` has **no parameterless constructor**; always pass at least `endpoint`. `CallerDetails` with the blueprint sponsor's identity is **required** for S2S traces to appear in the portal + - **No** per-turn `RegisterObservability()` call; **no** `.FromTurnContext()` chaining on the scope + - Add inline comment: `// A365 auth mode: S2S — FMI 3-hop chain via ObservabilityTokenService (scope: api://9b975845-388f-4429-889e-eab1ef63949c/.default)` + + Mark all new lines with: `// A365 Observability — best-effort instrumentation (verify against official sample)` + +3. **Preserve** all existing handler logic. + +### For Node.js + +1. **Read** the detected message handler file. + +2. 
**Edit** — Add BaggageBuilder context following the reference pattern in `nodejs-observability.md`: + - Import `BaggageBuilder` from `@microsoft/agents-a365-observability` + - Import `AgenticTokenCacheInstance`, `BaggageBuilderUtils` from `@microsoft/agents-a365-observability-hosting` + - Import `getObservabilityAuthenticationScope` from `@microsoft/agents-a365-runtime` + - **OBO paths only** (`user-delegated` / `agentic-identity`): Call `AgenticTokenCacheInstance.RefreshObservabilityToken(agentId, tenantId, context, authorization, scopes)` at the start of each turn (non-fatal, wrap in try/catch): + - `user-delegated`: `authorization` is the **user's** delegated token → traces attributed to the user + - `agentic-identity`: `authorization` resolves to the **agentic user** provisioned in Azure AD → traces attributed to the agent + - **S2S path**: Do **NOT** call `AgenticTokenCacheInstance.RefreshObservabilityToken` — there is no user authorization token. The `tokenResolver` passed to `useMicrosoftOpenTelemetry()` (set up in Phase 3) handles authentication via the FMI 3-hop chain token service. + - Use `BaggageBuilderUtils.fromTurnContext(new BaggageBuilder(), context).build()` to build baggage automatically from TurnContext + - Wrap the handler body in `await baggageScope.run(async () => { ... })` + - Add inline comment: `// A365 auth mode: {authMode} — see: https://learn.microsoft.com/en-us/entra/agent-id/agent-on-behalf-of-oauth-flow` + - Mark all new lines with: `// A365 Observability — best-effort instrumentation (verify against official sample)` + +3. **Preserve** all existing handler logic. + +### For Python + +1. **Read** the detected message handler file. + +2. 
**Edit** — Add BaggageBuilder context following the reference pattern in `python-observability.md`: + - Import `BaggageBuilder` from `microsoft.opentelemetry.a365.core` + - Import `populate` from `microsoft.opentelemetry.a365.hosting.scope_helpers.populate_baggage` + - Import `AgenticTokenCache`, `AgenticTokenStruct` from `microsoft.opentelemetry.a365.hosting.token_cache_helpers` + - Import `get_observability_authentication_scope` from `microsoft.opentelemetry.a365.runtime` + - Call `token_cache.register_observability(agent_id=..., tenant_id=..., token_generator=AgenticTokenStruct(authorization=AGENT_APP.auth, turn_context=context), observability_scopes=get_observability_authentication_scope())`: + - `user-delegated`: the OBO exchange resolves to the **signed-in user's** identity + - `agentic-identity`: the OBO exchange resolves to the **agentic user** provisioned in Azure AD + - `S2S`: agent authenticates as itself — no user context available + - Use `populate(builder, turn_context)` to auto-populate baggage, then `with builder.build():` + - Wrap existing agent logic inside the baggage scope + - Add inline comment: `# A365 auth mode: {authMode} — see: https://learn.microsoft.com/en-us/entra/agent-id/agent-on-behalf-of-oauth-flow` + - Mark all new lines with: `# A365 Observability — best-effort instrumentation (verify against official sample)` + +3. **Preserve** all existing handler logic. + +4. **TaskUpdate** — Mark complete. + +--- + +## Phase 5: Implement Agentic Token Resolver + +**TaskCreate** — "Implement agentic token resolver with caching" + +For AI Teammate agents using the hosting packages, the built-in token cache (`AddAgenticTracingExporter` for .NET, `AgenticTokenCacheInstance` for Node.js, `AgenticTokenCache` for Python) handles caching automatically — no custom resolver needed. Skip to step 3 for these agents. + +### For .NET AgentFramework (hosting path) + +1. 
`AddAgenticTracingExporter()` (registered in Phase 3) provides the `IExporterTokenCache` DI instance — no additional token resolver class needed. + +2. In the agent class, inject `IExporterTokenCache` in the constructor and call `RegisterObservability(...)` per turn (already done in Phase 4). + +### For .NET AgentFramework (S2S path) + +The `ObservabilityTokenService` background service (created in Phase 3 via the scaffold) acquires and refreshes the Observability API token automatically via the FMI 3-hop chain (Blueprint → Agent Identity → Power Platform PFAT token) — no manual `TokenResolver` delegate needed. + +1. **Check** if `Observability/ObservabilityServiceExtensions.cs` and `Observability/ObservabilityTokenService.cs` exist. If yes, **skip** — they were already created in Phase 3. + +2. **If absent** (Phase 3 was skipped or re-running the skill on a partial state), create them now following the S2S scaffold patterns in `dotnet-observability.md`. These files provide `AddAgent365Observability()` (DI extension registering `AddServiceTracingExporter`, `ObservabilityTokenService`, and `Agent365ObservabilityContext`) and `ObservabilityTokenService` (background service that acquires the Observability API token via the FMI 3-hop chain and refreshes it every 50 minutes). + +### For Node.js (OBO path) + +`AgenticTokenCacheInstance` from `@microsoft/agents-a365-observability-hosting` handles caching automatically. The `useMicrosoftOpenTelemetry()` call in Phase 3 wires it as the `tokenResolver`. No additional token resolver module is needed unless `Use_Custom_Resolver=true` is required (see reference doc for custom resolver pattern). + +### For Node.js (S2S path) + +**Check** if `observability/observability-token-service.ts` exists. If yes, **skip** — it was created in Phase 3. 
+ +**If absent** (Phase 3 was skipped or re-running), create `observability/token-cache.ts` and `observability/observability-token-service.ts` now using the scaffold from `nodejs-observability.md` (S2S section). The token service uses MSAL (`@azure/msal-node`) with `fmiPath` to acquire tokens via the FMI 3-hop chain targeting scope `api://9b975845-388f-4429-889e-eab1ef63949c/.default`. Call `startTokenService(config)` at app startup and pass `tokenResolver` from the cache module to `useMicrosoftOpenTelemetry()`. + +### For Python (OBO path) + +`AgenticTokenCache` from `microsoft.opentelemetry.a365.hosting.token_cache_helpers` handles caching automatically. It was wired as the `token_resolver` in the `configure()` call in Phase 3. No additional module is needed. + +### For Python (S2S path) + +**Check** if `observability/observability_token_service.py` exists. If yes, **skip** — it was created in Phase 3. + +**If absent**, create `observability/token_cache.py` and `observability/observability_token_service.py` now using the scaffold from `python-observability.md` (S2S section). The token service uses MSAL (`msal.ConfidentialClientApplication`) with `fmi_path` to acquire tokens via the FMI 3-hop chain targeting scope `api://9b975845-388f-4429-889e-eab1ef63949c/.default`. Call `acquire_initial_token()` for pre-warm, schedule `run_token_service()` as `asyncio.create_task()`, and pass `token_cache.get_cached_token` as the `a365_token_resolver` in `use_microsoft_opentelemetry()`. + +**TaskUpdate** — Mark complete. + +--- + +## Phase 5.5: Scan ALL Agent Source Files and Add Instrumentation Scopes + +**TaskCreate** — "Scan all source files and instrument InvokeAgentScope, InferenceScope, ExecuteToolScope" + +> **Store publishing requirement:** The Agent 365 store validator requires `InvokeAgentScope`, +> `InferenceScope`, and `ExecuteToolScope` to be present and populating telemetry. Missing any one +> of these three scopes causes store validation failure. 
+ +> **This phase is mandatory. Do NOT skip it or proceed to Phase 6 until it is complete.** +> **Do NOT write any scope code until Step 3 (the summary table) has been confirmed by the user.** + +**Step 1 — Glob all source files** (excluding generated/build output): +- .NET: `**/*.cs` excluding `obj/` +- Node.js: `**/*.ts` or `**/*.js` excluding `node_modules/`, `dist/` +- Python: `**/*.py` + +**Step 2 — Read and scan every file** for instrumentation points: + +| What to look for | Scope to apply | Role | +|-----------------|---------------|------| +| Message handlers, timer loops, `BackgroundService.ExecuteAsync`, autonomous cycles — any agent "turn" or operation | `InvokeAgentScope` | **Root** — required outermost scope | +| LLM/model API calls (`CompleteChatAsync`, `chat.completions.create`, `kernel.InvokeAsync`, `RunStreamingAsync`, etc.) | `InferenceScope` | Child — nest inside `InvokeAgentScope` | +| Tool/function dispatch calls, external API calls acting as tools | `ExecuteToolScope` | Child — nest inside `InvokeAgentScope` | +| Final response / streaming output operations | `OutputScope` | Child — nest inside `InvokeAgentScope` | + +**Step 3 — Present a summary table** of ALL findings and wait for user confirmation before writing any code: + +``` +Files scanned: X +Instrumentation plan: +| File | Method / Location | Operation | Scope to add | +|------|------------------|-----------|-------------| +| Agent/MyAgent.cs | OnMessageAsync | User message handler | InvokeAgentScope (root) | +| Agent/MyAgent.cs | OnMessageAsync → RunStreamingAsync | LLM streaming call | InferenceScope | +| Agent/MyAgent.cs | BuildAgent → GetCurrentWeather | Tool dispatch | ExecuteToolScope | +| WeatherMonitorService.cs | ExecuteAsync (timer loop body) | Autonomous cycle | InvokeAgentScope (root) | +| WeatherMonitorService.cs | ExecuteAsync → CompleteChatAsync | LLM call | InferenceScope | +... + +Confirm to apply, or describe corrections. ``` -2. 
**Register the exporter and tracing in startup code** - - .NET (`agentType = 3`): call `builder.Services.AddAgent365Observability()` (using the extension from the helper files above), then `builder.AddA365Tracing()` - - .NET (`agentType = 1`): call `builder.Services.AddAgenticTracingExporter()`, then `builder.AddA365Tracing()` - - Python / Node.js: see the MS Learn reference -3. **Wire up the token resolver in your agent** - - .NET (`agentType = 3`): inject `Agent365ObservabilityContext` into your agent class and any background services; pass `ctx.AgentDetails` directly to `InvokeAgentScope.Start()` — no `RegisterObservability` call needed - - .NET (`agentType = 1`): call `_agentTokenCache.RegisterObservability(agentId, tenantId, new AgenticTokenStruct(...), EnvironmentUtils.GetObservabilityAuthenticationScope())` inside your turn handler - - Python / Node.js: see the MS Learn reference -4. Add the exporter configuration setting (`EnableAgent365Exporter` / `ENABLE_A365_OBSERVABILITY_EXPORTER`) **enabled by default** in the main config, and **disabled in the development/local override** (e.g. 
`appsettings.Development.json` for .NET, `.env` for Node.js/Python) so that `dotnet run` / local dev stays console-only until the agent is reachable from the platform +**Step 4 — Apply** the scopes per language-specific patterns after confirmation, following the scope hierarchy rule: +- `InvokeAgentScope` is always the outermost scope — one per agent turn or autonomous operation +- `InferenceScope`, `ExecuteToolScope`, `OutputScope` are children — always nested inside an open `InvokeAgentScope` +- Never open child scopes as standalone top-level scopes — they produce orphaned spans the exporter silently drops - > **To verify exporter connectivity from Visual Studio:** temporarily set `"EnableAgent365Exporter": true` and add `"Microsoft.Agents.A365.Observability": "Debug"` and `"OpenTelemetry": "Debug"` to the `LogLevel` section of `appsettings.Development.json`, then revert when done. +### For .NET AgentFramework — scope patterns -> **REQUIRED — do not skip this step.** -> After completing steps 1–4 above, you **must** say to the user, verbatim: -> -> "--- -> **Observability SDK is wired up.** Would you like me to scan your code and add instrumentation automatically? I'll find LLM calls, tool dispatches, agent-to-agent calls, and output operations and wrap each with the appropriate tracing scope. -> -> Reply **yes** to add instrumentation, or **no** to skip (you can add it later). -> ---" +- `CallerDetails` must be passed to `InvokeAgentScope.Start()` as the 4th parameter — required for traces to appear in the MAC portal +- For S2S autonomous agents: read sponsor details from config (`Agent365Observability:Sponsor`) and construct `CallerDetails` with `UserDetails(userId, userName, userEmail)` +- For autonomous background operations with no `ITurnContext` (e.g. 
`BackgroundService`): use `new BaggageBuilder().AgentId(...).TenantId(...).Build()` — `FromTurnContext()` is not available without a turn context +- Pass `UserDetails` directly (not wrapped in `CallerDetails`) to `InferenceScope.Start()` and `ExecuteToolScope.Start()` as the optional 4th parameter +- `InferenceCallDetails` requires `providerName` — it is **not optional** (CS7036 if omitted) +- `ExecuteToolScope.RecordResponse()` takes `string`, not a `Response` object (CS1503 if passed an object) +- The `Agent365ObservabilityContext` singleton should hold both `AgentDetails` and `CallerDetails` -- If **yes**: scan all agent source files, identify operations matching the scope types in Task B, present a summary of planned changes, confirm with the user, then apply — adding the correct scope wrapper and required usings to each. **Follow the hierarchy rule in Task B:** every instrumented block must have `InvokeAgentScope` as its outermost scope; `InferenceScope`, `ExecuteToolScope`, and `OutputScope` are child scopes that go inside it. -- If **no**: skip — instrumentation can be added later via Task B. +### For Node.js — scope patterns -> **Note — recording response data:** Auto-instrumentation adds scope wrappers only. To attach the actual response text to a span, call the appropriate record method manually after you have the result: -> - `invokeAgentScope.RecordResponse(responseText)` — adds the agent's final reply to the `invoke_agent` span -> - `inferenceScope.RecordOutputMessages(...)` / `inferenceScope.RecordInputMessages(...)` — attaches LLM output/input messages to the `Chat` span -> -> These are one-liners and are best added by hand once you know which variable holds the response. 
+- Use `ScopeUtils.populateInvokeAgentScopeFromTurnContext` from `@microsoft/agents-a365-observability-hosting` to auto-populate from TurnContext +- `CallerDetails` must be passed to `InvokeAgentScope.start()` as the 4th parameter +- Pass `UserDetails` directly to `InferenceScope.start()` and `ExecuteToolScope.start()` as the optional 4th parameter +- Export `callerDetails` (for `InvokeAgentScope`) and `userDetails` (for `InferenceScope`/`ExecuteToolScope`) from the entry point module alongside `agentDetails` + +### For Python — scope patterns + +- `CallerDetails` / `UserDetails` must be supplied when creating the top-level `InvokeAgentScope` — required for MAC portal visibility +- For S2S autonomous agents, construct `CallerDetails(UserDetails(userId, userName, userEmail))` from config or environment +- Pass `UserDetails` directly to `InferenceScope`, `ExecuteToolScope`, and `OutputScope` +- Keep shared observability state with both `agent_details` and `caller_details` / `user_details` so nested scopes can reuse them consistently + +All new lines marked with the language-appropriate comment: +- C# / JavaScript / TypeScript: `// A365 Observability — best-effort instrumentation (verify against official sample)` +- Python: `# A365 Observability — best-effort instrumentation (verify against official sample)` + +**TaskUpdate** — Mark complete only after all planned scopes have been applied and confirmed by the user. + +--- + +## Phase 6: Update Configuration Files + +**TaskCreate** — "Update configuration files with observability settings" + +### For .NET AgentFramework + +1. **Read** `appsettings.json` fully — **before writing anything** — and identify: + - Whether a `Logging` section already exists anywhere in the file + - Whether `Logging.LogLevel` already exists + - The existing `EnableAgent365Exporter`, `AgentBlueprintId`, and `TenantId` values + + > **Merge safety rule (enforce without exception):** A JSON file may only have one `Logging` section. 
If `Logging` or `Logging.LogLevel` already exists, **merge** the new log level keys into that block. Never append a second `Logging` section — this produces silently invalid config where only the last block wins. + +2. **Check for existing `a365 setup` configuration:** + - `EnableAgent365Exporter` — always set to `true` in `appsettings.json` (the Development override sets it to `false`; `a365 setup` may have written `false` here, which this skill corrects) + - If `Agent365Observability` section exists → **preserve** all existing values (AgentBlueprintId, TenantId, AgentName, AgentDescription, Sponsor) + - If missing → add with defaults + +3. **Edit** — Add or update observability configuration following the reference pattern: + + **`appsettings.json`** (exporter enabled by default in all environments except Development): + ```json + { + "EnableAgent365Exporter": true, // ← enabled by default; Development override turns it off + "Agent365Observability": { + "AgentBlueprintId": "...", // ← populated by a365 setup (or placeholder if not run) + "TenantId": "...", + "AgentName": "", + "AgentDescription": "", + "Sponsor": { + "UserId": "<>", + "UserName": "<>", + "UserEmail": "<>" + }, + // S2S path only — add: + // "ClientId": "", + // "ClientSecret": "", // MSI tried first in prod; secret is local-dev fallback + // "UseManagedIdentity": false // ← set false for local dev (MSI only works on Azure infra) + }, + "Logging": { + "LogLevel": { + "Default": "Information", + "Microsoft.Agents.A365.Observability": "Debug", + "OpenTelemetry": "Debug" + } + } + } + ``` + + > **S2S note:** `EnableAgent365Exporter` must be `true` for S2S span export to work. `a365 setup` may write `false` — this skill corrects it. Also set `UseManagedIdentity: false` for local dev since MSI is only available on Azure infrastructure (App Service, AKS, VM). On local machines, MSI fails with `CredentialUnavailableError: Network unreachable`. 
+ > + > **Sponsor note:** For S2S / autonomous agents, the `Sponsor` section provides `CallerDetails` for MAC portal trace visibility. Use the Blueprint app ID as `UserId`, the Blueprint display name as `UserName`, and the agent sponsor's email as `UserEmail`. + + **`appsettings.Development.json`** (create if absent — disables exporter for local dev so traces go to console only): + ```json + { + "EnableAgent365Exporter": false + } + ``` + +4. **Critical:** The `Logging.LogLevel` section is **required** for observability events to appear in console output and Microsoft Defender. Without this, the SDK is instrumented but logs are suppressed. The `a365 setup` command does **not** add logging configuration. + +5. **If `appsettings.json` does not exist**, create it with the complete structure above. + +6. **If `Logging` or `Logging.LogLevel` already exists**, merge the new entries into that existing block. Do **not** create a second `Logging` section — only one is allowed in a JSON config file. + +7. **Inform user:** + - "Observability exporter is enabled by default (`EnableAgent365Exporter: true` in `appsettings.json`). For local development, `appsettings.Development.json` overrides this to `false` so traces go to console only." + - If `AgentBlueprintId` or `TenantId` are empty: "Run `a365 setup` to populate AgentBlueprintId and TenantId, or fill them manually from your Entra app registration." + - If S2S path: "Add `ClientId` and `ClientSecret` under `Agent365Observability` in `appsettings.json` — `ObservabilityTokenService` requires both. In production, MSI is tried first and the secret is a local-dev fallback; `ClientSecret` must still be present in config." + +### For Node.js + +1. **Read** `.env` (or `.env.local`, `.env.development`). + +2. **Check for existing `a365 setup` configuration:** + - If `ENABLE_A365_OBSERVABILITY_EXPORTER` exists → **preserve** it (do not change) + - If missing → add with default value `false` + +3. 
**Edit** — Add or update observability environment variables following the reference pattern in `nodejs-observability.md`: + ```dotenv + ENABLE_A365_OBSERVABILITY_EXPORTER=false + SERVICE_NAME=my-agent + A365_OBSERVABILITY_LOG_LEVEL=info|warn|error + Use_Custom_Resolver=false + + # Sponsor / CallerDetails for MAC portal trace visibility + agent365Observability__sponsorUserId=<> + agent365Observability__sponsorUserName=<> + agent365Observability__sponsorUserEmail=<> + ``` + - **S2S path only:** Also add `AGENT365_USE_S2S_ENDPOINT=true` — this tells the distro to use the `/observabilityService/...` endpoint path instead of `/observability/...`. + +4. **If `.env` does not exist**, create it with the variables above. + +5. **If the project uses `.env.example`**, also update it with placeholder values. + +6. **Inform user:** + - If `ENABLE_A365_OBSERVABILITY_EXPORTER` is `false`: "Observability is instrumented but disabled. Set ENABLE_A365_OBSERVABILITY_EXPORTER=true in .env to start exporting traces." + +### For Python + +1. **Read** `.env` (or `.env.local`). + +2. **Edit** — Add or update observability environment variables: + ```dotenv + ENABLE_A365_OBSERVABILITY_EXPORTER=false + ``` + - **S2S path only:** Also add `AGENT365_USE_S2S_ENDPOINT=true` — this tells the distro to use the `/observabilityService/...` endpoint path instead of `/observability/...`. + +3. **If `.env` does not exist**, create it with the variable above. + +4. **Inform user:** + - If `ENABLE_A365_OBSERVABILITY_EXPORTER` is `false`: "Observability is instrumented but disabled. Set ENABLE_A365_OBSERVABILITY_EXPORTER=true in .env to start exporting traces." + +7. **TaskUpdate** — Mark complete. + +--- + +## Phase 7: Validate Build + +**TaskCreate** — "Validate build passes" + +### For .NET AgentFramework + +1. **Bash** — Run: + ```bash + dotnet build + ``` + +2. **If build fails**, collect error output and present to user with suggested fixes. + +3. **If build succeeds**, confirm to user. 
+ +### For Node.js + +1. **Bash** — Run: + ```bash + npm install # Ensure new packages are installed + # --if-present skips a script that does not exist; a real build failure still exits non-zero + npm run build --if-present && npm run compile --if-present + ``` + +2. **If build fails**, collect error output and present to user with suggested fixes. + +3. **If build succeeds** (or no build script exists), confirm to user. + +### For Python + +1. **Bash** — Run an import check to verify the packages load without errors: + ```bash + python3 -c "from microsoft.opentelemetry import use_microsoft_opentelemetry; print('A365 observability imports OK')" 2>/dev/null || python -c "from microsoft.opentelemetry import use_microsoft_opentelemetry; print('A365 observability imports OK')" + ``` + +2. **If import fails**, collect error output and present to user with suggested fixes (usually a missing `pip install`). + +3. **If import succeeds**, confirm to user. + +4. **TaskUpdate** — Mark complete. + +--- + +## Phase 8: Test Locally + +**TaskCreate** — "Test locally" + +Ask the user: + +``` +AskUserQuestion: + question: "Build succeeded. Want to run a quick local test now?" + options: + - "Yes — run the test-local skill" + - "No — I'll test later" +``` + +If yes, invoke the `test-local` skill. + +**TaskUpdate** — Mark complete. + +--- + +## Phase 9: Final Summary ### Task A completion — final summary @@ -274,3 +736,221 @@ using var scope = InvokeAgentScope.Start(new Request(text), new InvokeAgentScope ``` For Python and Node.js, equivalent OpenTelemetry spans are used with the same Agent365 attribute names. See the [MS Learn reference](https://learn.microsoft.com/en-us/microsoft-agent-365/developer/observability) for attribute names and patterns. + +--- + +## Error Handling + +### Unknown Agent Type +If the agent type cannot be determined: +- Write marker: `.a365setup-unknown-agent` +- Exit early with message: "Could not detect agent type. Please verify this is a .NET AgentFramework, Node.js, or Python agent project." 
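The unknown-agent-type handling above can be sketched in Python. This is only an illustration under stated assumptions: the source does not say how detection works, so the file-based heuristic (`*.csproj`, `package.json`, `pyproject.toml` / `requirements.txt`) and the function names `detect_agent_type` and `detect_or_mark` are hypothetical. Only the marker file name and the exit message come from the text.

```python
from pathlib import Path
from typing import Optional

MARKER = ".a365setup-unknown-agent"  # marker name taken from the section above
MESSAGE = (
    "Could not detect agent type. Please verify this is a .NET AgentFramework, "
    "Node.js, or Python agent project."
)


def detect_agent_type(root: Path) -> Optional[str]:
    """Guess the agent type from well-known project files (assumed heuristic)."""
    if any(root.glob("*.csproj")):
        return "dotnet"
    if (root / "package.json").is_file():
        return "nodejs"
    if (root / "pyproject.toml").is_file() or (root / "requirements.txt").is_file():
        return "python"
    return None


def detect_or_mark(root: Path) -> Optional[str]:
    """Return the detected type, or write the marker file when nothing matches."""
    agent_type = detect_agent_type(root)
    if agent_type is None:
        # An empty marker is an assumption; the source does not define its contents.
        (root / MARKER).touch()
    return agent_type
```

A caller would typically stop with `MESSAGE` when `detect_or_mark(project_root)` returns `None`, and continue to package installation otherwise.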
+ +### Build Failures +If the build fails after instrumentation: +- Do NOT revert changes +- Present error output to user +- Suggest fixes based on error messages +- Offer to help debug + +### Missing Files +If expected files are not found: +- Ask user to confirm the project structure +- Suggest running detection again +- Offer to create missing files if appropriate + +--- + +## Idempotency + +This skill is safe to rerun. On subsequent runs: +- Skip package installation if packages already present +- Skip code edits if observability is already wired (detect by marker comments) +- Update configuration only if values are missing +- Always revalidate the build + +--- + +## S2S Known Issues and Workarounds + +### OtelWrite App Role Assignment + +`a365 setup all` **attempts** to grant `Agent365.Observability.OtelWrite` to the Agent Identity SP, but this requires **Global Administrator** privileges. If the logged-in user is not a Global Admin, the assignment silently fails with 403 and trace exports will return HTTP 403 from the observability service. + +**The CLI prints a PowerShell admin consent script** in its output when the assignment fails. When running `a365 setup all`, **always scan the output for this script block** and display it to the user in a fenced code block so they can copy it and hand it to a Global Admin. + +If the script was not captured, grant the permission manually via Entra portal (requires Global Admin): +1. [Entra portal](https://entra.microsoft.com) > App registrations > select Blueprint app > API permissions +2. Add a permission > APIs my organization uses > search `9b975845-388f-4429-889e-eab1ef63949c` +3. 
Add both **Delegated** and **Application** `Agent365.Observability.OtelWrite` > Grant admin consent + +Alternatively, read the `agentIdentityClientId` from `a365.generated.config.json` and use the Graph API: + +```bash +# Create a temp JSON body file (required on Windows due to az rest escaping) +echo '{"principalId":"","resourceId":"2a275186-1775-4439-8551-5438df22cdfc","appRoleId":"8f71190c-00c8-461d-a63b-f74abde9ba52"}' > body.json +az rest --method POST --url "https://graph.microsoft.com/v1.0/servicePrincipals//appRoleAssignments" --body @body.json +rm body.json +``` + +- `resourceId` `2a275186-...` is the Observability API SP object ID +- `appRoleId` `8f71190c-...` is the OtelWrite role ID +- For agents provisioned before CLI 1.1, this manual step is still required + +### Node.js and .NET SDK `/otlp/` URL Path Bug + +The Node.js SDK (`@microsoft/agents-a365-observability@0.2.0-preview.5`) and .NET SDK (`0.3.4-beta`) include `/otlp/` in the S2S export URL path. The Power Platform PFAT gateway returns `401 MSAuth10AuthenticatorTypeUnknown` on this path. Python SDK `0.1.0` does NOT include `/otlp/` and works correctly. + +**Status:** Awaiting SDK fix. No workaround should be applied in generated code — this is an SDK-level issue. + +### S2S Endpoint Path — `useS2SEndpoint` Not Passed by Distro + +The `@microsoft/opentelemetry` distro creates `Agent365Exporter` internally but does NOT pass `useS2SEndpoint: true`. For S2S agents, the exporter defaults to the OBO path (`/observability/tenants/{tenantId}/otlp/agents/{agentId}/traces`), but S2S requires `/observabilityService/...`. + +**This bug affects BOTH Node.js and .NET SDKs:** + +**Node.js (`@microsoft/opentelemetry` v0.1.0-beta.1):** + +1. `A365Configuration` — add `useS2SEndpoint` property + `AGENT365_USE_S2S_ENDPOINT` env var support +2. 
`distro.js` — pass `a365Config.useS2SEndpoint` when constructing `Agent365Exporter` + +**For generated agent code:** Set the env var in `.env`: +``` +AGENT365_USE_S2S_ENDPOINT=true +``` + +This is a distro-level fix. The `useMicrosoftOpenTelemetry()` call does NOT need a custom `spanProcessors` array — the built-in exporter reads the env var via `A365Configuration` and passes it to `Agent365Exporter`. + +**.NET (`Microsoft.OpenTelemetry` v1.0.0-beta.1):** + +The `UseMicrosoftOpenTelemetry()` builder extension does NOT set `UseS2SEndpoint = true` on the `Agent365ExporterOptions` when using the unified distro. Without this, the exporter posts to `/observability/` (OBO path) instead of `/observabilityService/` (S2S path), causing HTTP 401. + +**Fix:** Set `UseS2SEndpoint = true` explicitly in the `UseMicrosoftOpenTelemetry` options callback: +```csharp +builder.UseMicrosoftOpenTelemetry(o => +{ + o.Exporters = ExportTarget.Agent365 | ExportTarget.Console; + o.Agent365.Exporter.UseS2SEndpoint = true; // ← Required for S2S agents + o.Agent365.Exporter.TokenResolver = async (agentId, tenantId) => + { + return tokenCache != null + ? await tokenCache.GetObservabilityToken(agentId, tenantId) + : null; + }; +}); +``` + +**URL paths:** +- OBO: `observability/tenants/{tenantId}/otlp/agents/{agentId}/traces` +- S2S: `observabilityService/tenants/{tenantId}/otlp/agents/{agentId}/traces` + +### Node.js MSAL `fmiPath` Not Supported (AADSTS82008) + +No published version of `@azure/msal-node` (v3.x or v5.x) serializes the `fmiPath` parameter to the token endpoint request body. Passing `fmiPath` in `acquireTokenByClientCredential()` options (even with `as any`) is silently ignored, resulting in: + +``` +AADSTS82008: All agentic applications requesting a token exchange token must include the fmipath parameter on the token request. 
+``` + +**Workaround (implemented in `nodejs-observability.md`):** For the client-secret local-dev path (`acquireT1ViaClientSecret`), use a direct HTTP POST to `https://login.microsoftonline.com/{tenantId}/oauth2/v2.0/token` with `fmi_path={agentId}` as a URL-encoded form parameter. The MSI path still uses MSAL + `ManagedIdentityCredential` which handles FMI via a different mechanism. + +**Status:** Awaiting `@azure/msal-node` to ship native `fmiPath` support. Remove the HTTP workaround once available. + +### Node.js LangChain Instrumentor Initialization Order + +`LangChainTraceInstrumentor.instrument(LangChainCallbacks)` requires `ObservabilityManager` to be fully initialized. Calling it as a standalone statement after `useMicrosoftOpenTelemetry()` throws `"ObservabilityManager is not configured yet"` when `a365.enabled: true`. + +**Workaround:** Use `instrumentationOptions: { langchain: {} }` inside the `useMicrosoftOpenTelemetry()` options object. This ensures the distro initializes the manager and the LangChain instrumentor in the correct order. + +### .NET `Microsoft.OpenTelemetry` v1.0.0-beta.1 Requires .NET 10 Logging + +`Microsoft.OpenTelemetry` v1.0.0-beta.1 has a hard **runtime** dependency on `Microsoft.Extensions.Logging` v10. On projects targeting `net8.0` or `net9.0`, this causes a `FileNotFoundException` for `Microsoft.Extensions.Logging, Version=10.0.0.0` at startup. + +**Workaround:** Add an explicit direct reference to the **stable** v10 release so the assembly is copied to the output directory: +```bash +dotnet add package Microsoft.Extensions.Logging --version "10.0.4" +``` + +> **⚠️ Do NOT use a preview version** (e.g. `10.0.0-*` or `10.0.0-preview.*`). `Microsoft.Agents.A365.Observability.Hosting` already requires `>= 10.0.4` — specifying a lower preview causes a **NU1605 downgrade error** at restore time. 
+ +If the project targets `net8.0`, upgrade the TFM to `net9.0`: +```xml +<TargetFramework>net9.0</TargetFramework> +``` + +**Status:** This is expected to be resolved when `Microsoft.OpenTelemetry` ships a stable release. + +### .NET CS0433 Type Ambiguity — Do NOT Add Hosting/Runtime as Direct References + +When `Microsoft.OpenTelemetry` is referenced, adding `Microsoft.Agents.A365.Observability.Hosting` or `Microsoft.Agents.A365.Observability.Runtime` as **direct** `<PackageReference>` entries causes **CS0433 build errors** — the types `AgentDetails`, `CallerDetails`, and `IExporterTokenCache` exist in both assemblies simultaneously. + +**Cause:** `Microsoft.OpenTelemetry` re-exports all A365 observability types internally. The Hosting and Runtime packages are already brought in transitively. + +**Fix:** Remove the direct `<PackageReference>` entries for `Hosting` and `Runtime` from the `.csproj`. Keep only `Microsoft.OpenTelemetry`, `Azure.Identity`, `Microsoft.Identity.Client`, and `Microsoft.Extensions.Logging`: +```xml + + + + + + + + + +``` + +**Status:** SDK packaging issue — types should not be re-exported by the distro. Awaiting fix in a future release. + +### .NET `InferenceCallDetails` Constructor — `providerName` Is Required + +The `InferenceCallDetails` constructor signature is `(InferenceOperationType operationName, string model, string providerName, int? inputTokens, int? outputTokens, string[]? finishReasons, string? conversationId)`. The `providerName` parameter is **required** (not optional). Omitting it causes CS7036. + +**Correct usage:** +```csharp +new InferenceCallDetails( + operationName: InferenceOperationType.Chat, + model: "gpt-5.4", + providerName: "Azure OpenAI") +``` + +### .NET `ExecuteToolScope.RecordResponse` Takes `string`, Not `Response` + +`ExecuteToolScope.RecordResponse()` accepts a `string` parameter (the tool result), not a `Response` object. Passing `new Response(...)` causes CS1503. 
+ +**Correct usage:** +```csharp +toolScope.RecordResponse(resultString); +``` + +### .NET `appsettings.json` — S2S Configuration Notes + +For S2S / autonomous agents: +- `EnableAgent365Exporter` must be `true` in `appsettings.json` (not `false` — `a365 setup` may write `false` by default) +- `UseManagedIdentity` must be `false` for local development (MSI is only available on Azure infrastructure) +- Both `ClientId` and `ClientSecret` are required under `Agent365Observability` for the FMI 3-hop chain + +### CallerDetails Required for MAC Portal Trace Visibility + +For S2S / autonomous agents, `CallerDetails` with `UserDetails` (`userId`, `userName`, `userEmail`) must be passed to `InvokeAgentScope.Start()` / `.start()`. Without `CallerDetails`, exported spans reach the observability API (HTTP 200) but do **not** appear in the Microsoft Admin Center (MAC) portal's Advanced Hunting view. + +**Node.js API differences:** +- `InvokeAgentScope.start()` takes `CallerDetails` (wraps `userDetails`) as 4th parameter +- `InferenceScope.start()` and `ExecuteToolScope.start()` take `UserDetails` directly as 4th parameter +- `OutputScope.start()` takes `UserDetails` directly as 4th parameter + +**.NET API:** +- `InvokeAgentScope.Start()` takes `CallerDetails` (wraps `UserDetails`) as 4th parameter +- Other scopes do not take `CallerDetails` directly + +**Recommendation:** For autonomous agents without a real user, use the Blueprint sponsor's identity: +- `UserId` = Blueprint App (Client) ID +- `UserName` = Blueprint display name +- `UserEmail` = Agent sponsor's email address + +--- + +## References + +- **.NET Patterns:** [dotnet-observability.md](./references/dotnet-observability.md) +- **Node.js Patterns:** [nodejs-observability.md](./references/nodejs-observability.md) +- **Python Patterns:** [python-observability.md](./references/python-observability.md) diff --git a/docs/agent365-guided-setup/references/dotnet-observability.md 
b/docs/agent365-guided-setup/references/dotnet-observability.md new file mode 100644 index 00000000..3b046a8c --- /dev/null +++ b/docs/agent365-guided-setup/references/dotnet-observability.md @@ -0,0 +1,955 @@ +# .NET AgentFramework — A365 Observability Reference + +Authoritative package versions and code patterns for instrumenting A365 observability +into a .NET AgentFramework agent. All samples mirror the official Microsoft Learn docs +(updated 2026-04-30). + +--- + +## NuGet Packages + +| Package | Purpose | +|---------|---------| +| `Microsoft.Agents.A365.Observability.Runtime` | `AddA365Tracing()`, `BaggageBuilder`, `EnvironmentUtils` — required for all agents | +| `Microsoft.Agents.A365.Observability.Hosting` | `AddAgenticTracingExporter()` — OBO token caching (user-delegated / agentic-identity); `AddServiceTracingExporter()` — S2S token cache (`IExporterTokenCache`) | +| `Microsoft.Agents.A365.Observability.Hosting.Caching` | `IExporterTokenCache`, `AgenticTokenStruct` | +| `Microsoft.Agents.A365.Observability.Hosting.Extensions` | `FromTurnContext()` extension on `BaggageBuilder` | +| `Microsoft.Agents.A365.Observability.Hosting.Middleware` | `BaggageTurnMiddleware`, `UseObservabilityRequestContext` | +| `Microsoft.Agents.A365.Observability.Runtime.Common` | `BaggageBuilder`, `EnvironmentUtils` | +| `Microsoft.Agents.A365.Observability.Runtime.Tracing.Exporters` | `Agent365ExporterOptions`, `Agent365ExporterType` | +| `Microsoft.Agents.A365.Observability.Runtime.Tracing.Contracts` | `AgentDetails`, `InvokeAgentScopeDetails`, `ToolCallDetails`, `InferenceCallDetails`, `Request`, `Channel`, `UserDetails`, `CallerDetails`, `Response`, `SpanDetails` | +| `Microsoft.Agents.A365.Observability.Runtime.Tracing.Scopes` | `InvokeAgentScope`, `ExecuteToolScope`, `InferenceScope`, `OutputScope` | +| `Microsoft.Agents.A365.Observability.Extensions.SemanticKernel` | SK auto-instrumentation (optional) | +| `Microsoft.Agents.A365.Observability.Extensions.OpenAI` | OpenAI 
auto-instrumentation (optional) | +| `Microsoft.Agents.A365.Observability.Extensions.AgentFramework` | AgentFramework auto-instrumentation (optional) | + +Unified Distro (preferred for S2S / autonomous agents): + +| Package | Purpose | +|---------|---------| +| `Microsoft.OpenTelemetry` (v1.0.0-beta.1) | All-in-one: includes A365 observability types (`BaggageBuilder`, `InvokeAgentScope`, `InferenceScope`, `ExecuteToolScope`, `IExporterTokenCache`, `ServiceTokenCache`, `AgentDetails`, etc.) plus OTel pipeline configuration | +| `Azure.Identity` | `ManagedIdentityCredential` for MSI-based token acquisition | +| `Microsoft.Identity.Client` | MSAL `ConfidentialClientApplicationBuilder` with `.WithFmiPath()` for the FMI token chain | + +Install commands: +```bash +# Preferred for S2S / autonomous agents (includes all observability types): +dotnet add package Microsoft.OpenTelemetry --version 1.0.0-beta.1 +dotnet add package Azure.Identity +dotnet add package Microsoft.Identity.Client +# Required: Microsoft.OpenTelemetry v1.0.0-beta.1 has a hard runtime dependency on +# Microsoft.Extensions.Logging v10. On net9.0 the assembly is not in the framework +# so it must be a direct reference to ensure it is copied to the output directory. +# Use 10.0.4 (stable) — do NOT use a preview version; the Hosting package requires >= 10.0.4 +# and specifying a lower preview causes NU1605 downgrade errors. +dotnet add package Microsoft.Extensions.Logging --version "10.0.4" +``` + +> **⚠️ Do NOT add `Microsoft.Agents.A365.Observability.Hosting` or `.Runtime` as direct +> `<PackageReference>` entries on the S2S path.** `Microsoft.OpenTelemetry` already brings +> both as transitive dependencies and re-exports their types. Adding them directly causes +> **CS0433 type ambiguity** (`AgentDetails`, `CallerDetails`, `IExporterTokenCache` exist +> in both assemblies). Remove any explicit Hosting/Runtime references and let them flow +> transitively through `Microsoft.OpenTelemetry`. 
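The CS0433 rule above lends itself to a mechanical check before `dotnet build`. The following is a minimal Python sketch, not part of the skill or of any SDK: it parses a project file and lists direct references to the Hosting or Runtime packages when `Microsoft.OpenTelemetry` is also referenced. The function name `find_conflicting_references`, the sample project XML, and its `0.0.0` version are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

DISTRO = "Microsoft.OpenTelemetry"
# Packages the text above says must stay transitive when the distro is referenced.
TRANSITIVE_ONLY = {
    "Microsoft.Agents.A365.Observability.Hosting",
    "Microsoft.Agents.A365.Observability.Runtime",
}


def find_conflicting_references(csproj_xml: str) -> list:
    """Return direct PackageReference names that would trigger CS0433."""
    root = ET.fromstring(csproj_xml)
    # Assumes an SDK-style project file without an XML namespace.
    direct = {
        ref.get("Include")
        for ref in root.iter("PackageReference")
        if ref.get("Include")
    }
    if DISTRO not in direct:
        return []  # the individual-package (OBO) path references these directly
    return sorted(direct & TRANSITIVE_ONLY)


# Hypothetical project file used only to exercise the check.
SAMPLE = """
<Project Sdk="Microsoft.NET.Sdk.Web">
  <ItemGroup>
    <PackageReference Include="Microsoft.OpenTelemetry" Version="1.0.0-beta.1" />
    <PackageReference Include="Microsoft.Agents.A365.Observability.Hosting" Version="0.0.0" />
  </ItemGroup>
</Project>
"""

for name in find_conflicting_references(SAMPLE):
    print(f"Remove direct reference: {name}")
```

The check returns an empty list when the distro is absent, because on the individual-package path the Hosting and Runtime packages are expected as direct references.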
+ +Install commands (individual packages / OBO path): +```bash +# Required for all agents +dotnet add package Microsoft.Agents.A365.Observability.Runtime + +# Required for OBO agents (authMode: user-delegated or agentic-identity) +dotnet add package Microsoft.Agents.A365.Observability.Hosting + +# Optional auto-instrumentation extensions +dotnet add package Microsoft.Agents.A365.Observability.Extensions.SemanticKernel +dotnet add package Microsoft.Agents.A365.Observability.Extensions.OpenAI +dotnet add package Microsoft.Agents.A365.Observability.Extensions.AgentFramework +``` + +--- + +## Program.cs — S2S Path (`authMode: S2S`) + +Use this pattern for agents of type Agent (non AI Teammate) that run without a signed-in user (Autonomous / S2S). +It requires two scaffold files in `Observability/`; create these before wiring Program.cs. + +> **⚠️ Known issues (v1.0.0-beta.1):** +> - **TFM / Logging v10:** `Microsoft.OpenTelemetry` v1.0.0-beta.1 has a hard **runtime** dependency on `Microsoft.Extensions.Logging` v10. On `net8.0` or `net9.0` this assembly is not part of the framework, causing `FileNotFoundException` at startup. Fix: target `net9.0` (upgrade the TFM from `net8.0` if needed) and run `dotnet add package Microsoft.Extensions.Logging --version "10.0.4"`. Use the **stable** `10.0.4` release: specifying a preview version (e.g. `10.0.0-preview.*`) causes NU1605 downgrade errors because `Microsoft.Agents.A365.Observability.Hosting` already requires `>= 10.0.4`. +> - **CS0433 type ambiguity:** Do NOT add `Microsoft.Agents.A365.Observability.Hosting` or `Microsoft.Agents.A365.Observability.Runtime` as direct `<PackageReference>` entries alongside `Microsoft.OpenTelemetry`. The distro re-exports their types internally; adding them directly creates duplicate-type errors for `AgentDetails`, `CallerDetails`, and `IExporterTokenCache`. Remove the direct references and let them flow transitively through `Microsoft.OpenTelemetry`.
+> - **BaggageBuilder namespace:** `BaggageBuilder` requires `using Microsoft.Agents.A365.Observability.Runtime.Common;`. The `FromTurnContext()` extension additionally requires `using Microsoft.Agents.A365.Observability.Hosting.Extensions;`. Both usings are needed in the agent class. +> - **UseS2SEndpoint:** The distro does NOT set `UseS2SEndpoint = true` on the internal `Agent365Exporter`. You MUST set `o.Agent365.Exporter.UseS2SEndpoint = true` in the `UseMicrosoftOpenTelemetry` options callback, or the exporter posts to `/observability/` (OBO path) instead of `/observabilityService/` (S2S path), causing HTTP 401. +> - **InferenceCallDetails:** The `providerName` parameter is required (not optional). Constructor: `(InferenceOperationType operationName, string model, string providerName, ...)`. +> - **ExecuteToolScope.RecordResponse:** Takes `string`, not `Response` object. +> - **UseManagedIdentity:** Set `false` for local dev. MSI only works on Azure infrastructure. + +### Scaffold: `Observability/ObservabilityServiceExtensions.cs` + +```csharp +using Microsoft.Agents.A365.Observability.Hosting.Caching; +using Microsoft.Agents.A365.Observability.Runtime.Tracing.Contracts; +using Microsoft.Extensions.Configuration; +using Microsoft.Extensions.DependencyInjection; + +namespace ; + +// Injectable singleton wrapping AgentDetails for single-tenant agents. +// Pass ctx.AgentDetails to InvokeAgentScope.Start() for span attributes. +public sealed class Agent365ObservabilityContext +{ + public AgentDetails AgentDetails { get; } + internal Agent365ObservabilityContext(AgentDetails d) => AgentDetails = d; +} + +public static class ObservabilityServiceExtensions +{ + // Registers S2S token cache, ObservabilityTokenService (if credentials are present), + // and Agent365ObservabilityContext. + // Config is written by `a365 setup all` under the Agent365Observability section. 
+ // When Agent365Observability credentials are missing, the agent still runs: spans are + // emitted to the console exporter but not exported to the A365 service. + public static IServiceCollection AddAgent365Observability(this IServiceCollection services) + { + // The S2S cache stores the raw access token string registered by ObservabilityTokenService. + services.AddSingleton<IExporterTokenCache<string>, ServiceTokenCache>(); + + services.AddSingleton(sp => + { + var obs = sp.GetRequiredService<IConfiguration>().GetSection("Agent365Observability"); + var agentDetails = new AgentDetails( + agentId: obs["AgentId"] ?? "local-dev", + agentName: obs["AgentName"] ?? "my-agent", + agentDescription: obs["AgentDescription"] ?? "", + agentBlueprintId: obs["AgentBlueprintId"] ?? "", + tenantId: obs["TenantId"] ?? "local-dev"); + return new Agent365ObservabilityContext(agentDetails); + }); + + // Only start the background token service when the required credentials are configured. + // Without these, the agent runs fine; observability spans go to the console exporter only. + services.AddSingleton<ObservabilityTokenService>(); + services.AddHostedService(sp => + { + var obs = sp.GetRequiredService<IConfiguration>().GetSection("Agent365Observability"); + var useManagedIdentity = !bool.TryParse(obs["UseManagedIdentity"], out var parsedUseManagedIdentity) + || parsedUseManagedIdentity; // default true + + var hasCommonCredentials = !string.IsNullOrEmpty(obs["TenantId"]) + && !string.IsNullOrEmpty(obs["AgentId"]) + && !string.IsNullOrEmpty(obs["ClientId"]) + && !obs["TenantId"]!.StartsWith("<<"); + + var hasClientSecret = !string.IsNullOrEmpty(obs["ClientSecret"]) + && !obs["ClientSecret"]!.StartsWith("<<"); + + var hasCredentials = hasCommonCredentials + && (useManagedIdentity || hasClientSecret); + + return new OptionalHostedService( + hasCredentials ? sp.GetRequiredService<ObservabilityTokenService>() : null, + // Logger category is an assumption; any ILogger<T> satisfies the wrapper. + sp.GetRequiredService<ILogger<ObservabilityTokenService>>(), + hasCredentials ? null : + "Agent365Observability credentials not configured; skipping token service.
" + + "Run 'a365 setup all' to enable A365 observability export."); + }); + + return services; + } + + // Wrapper that conditionally starts a hosted service, allowing graceful skip. + private sealed class OptionalHostedService(IHostedService? inner, ILogger logger, string? skipWarning = null) : IHostedService + { + public Task StartAsync(CancellationToken ct) + { + if (inner != null) + return inner.StartAsync(ct); + + if (skipWarning != null) + logger.LogWarning("{Warning}", skipWarning); + + return Task.CompletedTask; + } + + public Task StopAsync(CancellationToken ct) => inner?.StopAsync(ct) ?? Task.CompletedTask; + } +} +``` + +### Scaffold: `Observability/ObservabilityTokenService.cs` + +> **Important:** The recommended approach is the **3-hop FMI chain** using MSAL with `.WithFmiPath()`: +> +> ``` +> Blueprint (client_credentials / MSI) +> → Hop 1+2: FMI token (api://AzureADTokenExchange/.default with WithFmiPath(agentId)) +> → Agent Identity token +> → Hop 3: Observability API token (scope=api://9b975845-388f-4429-889e-eab1ef63949c/.default) +> ``` +> +> **Auth strategy** is controlled by `Agent365Observability:UseManagedIdentity`: +> - `true` (production) — MSI → Blueprint FIC → Agent Identity → API +> - `false` (local dev) — Client Secret → Blueprint FIC → Agent Identity → API +> +> **Note:** As of CLI 1.1, `a365 setup all` automatically grants `Agent365.Observability.OtelWrite` to the Agent Identity SP (both delegated and application). No manual role assignment is needed for newly provisioned agents. + +```csharp +using Azure.Core; +using Azure.Identity; +using Microsoft.Agents.A365.Observability.Hosting.Caching; +using Microsoft.Identity.Client; + +namespace ; + +// Acquires an Observability API token for A365 observability via a 3-hop FMI chain. +// Hop 1+2: Blueprint authenticates (MSI in prod, client secret locally) → +// gets T1 via .WithFmiPath(agentId) to Agent Identity. +// Hop 3: Agent Identity uses T1 as assertion → Observability API token. 
+// (ServiceIdentity type; AADSTS82001 does not apply.) +// +// Auth strategy is controlled by Agent365Observability:UseManagedIdentity: +// true (production): MSI → Blueprint FIC → Agent Identity → API +// false (local dev): Client Secret → Blueprint FIC → Agent Identity → API +internal sealed class ObservabilityTokenService : BackgroundService +{ + private static readonly string[] FmiScopes = ["api://AzureADTokenExchange/.default"]; + private static readonly string[] ObservabilityScopes = ["api://9b975845-388f-4429-889e-eab1ef63949c/.default"]; + private static readonly TimeSpan RefreshInterval = TimeSpan.FromMinutes(50); + + // The S2S cache holds the raw access token string. + private readonly IExporterTokenCache<string> _tokenCache; + private readonly ILogger<ObservabilityTokenService> _logger; + private readonly string _blueprintClientId, _blueprintClientSecret, _tenantId, _agentId; + private readonly bool _useManagedIdentity; + + public ObservabilityTokenService( + IExporterTokenCache<string> tokenCache, + ILogger<ObservabilityTokenService> logger, + IConfiguration configuration) + { + _tokenCache = tokenCache; + _logger = logger; + var obs = configuration.GetSection("Agent365Observability"); + _tenantId = obs["TenantId"] ?? ""; + _agentId = obs["AgentId"] ?? ""; + _blueprintClientId = obs["ClientId"] ?? ""; + _blueprintClientSecret = obs["ClientSecret"] ??
""; + _useManagedIdentity = obs.GetValue("UseManagedIdentity", true); + } + + protected override async Task ExecuteAsync(CancellationToken stoppingToken) + { + _logger.LogInformation("ObservabilityTokenService started (UseManagedIdentity={UseMsi}).", _useManagedIdentity); + while (!stoppingToken.IsCancellationRequested) + { + try { await AcquireAndRegisterTokenAsync(stoppingToken); } + catch (Exception ex) when (!stoppingToken.IsCancellationRequested) + { _logger.LogWarning(ex, "Failed to acquire observability token; will retry in {Interval}.", RefreshInterval); } + try { await Task.Delay(RefreshInterval, stoppingToken); } + catch (OperationCanceledException) { break; } + } + _logger.LogInformation("ObservabilityTokenService stopped."); + } + + private async Task AcquireAndRegisterTokenAsync(CancellationToken ct) + { + string authority = $"https://login.microsoftonline.com/{_tenantId}"; + + // Hop 1+2: Blueprint → T1 via FMI path + // When UseManagedIdentity is true, try MSI first and fall back to client secret + // on AuthenticationFailedException (e.g. when running locally without MSI). 
+ string t1Token; + if (_useManagedIdentity) + { + try + { + t1Token = await AcquireT1ViaMsiAsync(authority, ct); + } + catch (AuthenticationFailedException ex) + { + _logger.LogWarning(ex, "MSI authentication failed; falling back to client secret."); + t1Token = await AcquireT1ViaClientSecretAsync(authority, ct); + } + } + else + { + t1Token = await AcquireT1ViaClientSecretAsync(authority, ct); + } + + // Hop 3: Agent Identity uses T1 → Observability API token + var obsResult = await ConfidentialClientApplicationBuilder + .Create(_agentId) + .WithClientAssertion((AssertionRequestOptions _) => Task.FromResult(t1Token)) + .WithAuthority(new Uri(authority)).Build() + .AcquireTokenForClient(ObservabilityScopes) + .ExecuteAsync(ct); + + _tokenCache.RegisterObservability(_agentId, _tenantId, obsResult.AccessToken, ObservabilityScopes); + _logger.LogInformation("Observability token registered for agent {AgentId}.", _agentId); + } + + // Returns the T1 (FMI) access token, using the managed identity as the Blueprint's client assertion. + private async Task<string> AcquireT1ViaMsiAsync(string authority, CancellationToken ct) + { + var assertion = await new ManagedIdentityCredential() + .GetTokenAsync(new TokenRequestContext(["api://AzureADTokenExchange"]), ct); + return (await ConfidentialClientApplicationBuilder + .Create(_blueprintClientId) + .WithClientAssertion((AssertionRequestOptions _) => Task.FromResult(assertion.Token)) + .WithAuthority(new Uri(authority)).Build() + .AcquireTokenForClient(FmiScopes).WithFmiPath(_agentId) + .ExecuteAsync(ct)).AccessToken; + } + + // Returns the T1 (FMI) access token, authenticating the Blueprint with its client secret. + private async Task<string> AcquireT1ViaClientSecretAsync(string authority, CancellationToken ct) + { + return (await ConfidentialClientApplicationBuilder + .Create(_blueprintClientId) + .WithClientSecret(_blueprintClientSecret) + .WithAuthority(new Uri(authority)).Build() + .AcquireTokenForClient(FmiScopes).WithFmiPath(_agentId) + .ExecuteAsync(ct)).AccessToken; + } +} +``` + +### Program.cs wiring + +```csharp +using Microsoft.Agents.A365.Observability.Hosting.Caching; +using Microsoft.OpenTelemetry; + +var builder =
WebApplication.CreateBuilder(args); + +// A365 Observability: S2S token cache + background token service + AgentDetails context. +// ObservabilityTokenService acquires tokens via a 3-hop FMI chain (Blueprint → Agent Identity → API) +// and registers them with the ServiceTokenCache every 50 minutes. +builder.Services.AddAgent365Observability(); + +// Microsoft OpenTelemetry distro: configures OTel tracing pipeline + A365 exporter. +// The token resolver reads from the ServiceTokenCache populated by ObservabilityTokenService. +// Note: tokenCache is resolved lazily after Build() via the closure over the local variable. +IExporterTokenCache<string>? tokenCache = null; +builder.UseMicrosoftOpenTelemetry(o => +{ + o.Exporters = builder.Environment.IsDevelopment() + ? ExportTarget.Agent365 | ExportTarget.Console + : ExportTarget.Agent365; + + // ⚠️ Required for S2S: distro does NOT set this automatically in v1.0.0-beta.1 + o.Agent365.Exporter.UseS2SEndpoint = true; + + o.Agent365.Exporter.TokenResolver = async (agentId, tenantId) => + { + return tokenCache != null + ? await tokenCache.GetObservabilityToken(agentId, tenantId) + : null; + }; +}); + +// ... rest of service configuration ... + +var app = builder.Build(); +tokenCache = app.Services.GetService<IExporterTokenCache<string>>(); + +// ... rest of app configuration ... +``` + +--- + +## Program.cs — Hosting Path (AI Teammate, auto token caching) + +Use this pattern when the agent uses the AI Teammate hosting framework. + +```csharp +using Microsoft.Agents.A365.Observability.Runtime; +using Microsoft.Agents.A365.Observability.Hosting; + +var builder = WebApplication.CreateBuilder(args); + +// Registers IExporterTokenCache in DI and handles token caching automatically. +builder.Services.AddAgenticTracingExporter(); + +// Registers the OTel TracerProvider with the A365 exporter.
+builder.AddA365Tracing(); + +var app = builder.Build(); + +// Optional: register HTTP-level baggage middleware (before the Bot Framework pipeline) +// app.UseObservabilityRequestContext((httpContext) => +// { +// var tenantId = GetTenantIdFromContext(httpContext); +// var agentId = GetAgentIdFromContext(httpContext); +// return (tenantId, agentId); +// }); +``` + +--- + +## Adapter — BaggageTurnMiddleware + +Register `BaggageTurnMiddleware` to auto-populate baggage from every incoming `ITurnContext`. +This removes the need to call `BaggageBuilder` manually in each activity handler. + +```csharp +using Microsoft.Agents.A365.Observability.Hosting.Middleware; + +adapter.Use(new BaggageTurnMiddleware()); +// The middleware skips async replies (ContinueConversation) to avoid overwriting baggage. +``` + +For HTTP-level baggage (before the Bot Framework pipeline), register via `UseObservabilityRequestContext`: + +```csharp +using Microsoft.Agents.A365.Observability.Hosting.Middleware; + +app.UseObservabilityRequestContext((httpContext) => +{ + var tenantId = GetTenantIdFromContext(httpContext); + var agentId = GetAgentIdFromContext(httpContext); + return (tenantId, agentId); +}); +``` + +--- + +## Agent Class — Message Handler (OBO Path, `authMode: user-delegated` or `agentic-identity`) + +```csharp +using Microsoft.Agents.Builder; +using Microsoft.Agents.Builder.App.UserAuth; +using Microsoft.Extensions.Logging; +using Microsoft.Agents.A365.Observability.Hosting.Caching; +using Microsoft.Agents.A365.Observability.Runtime.Common; +using System; +using System.Threading; +using System.Threading.Tasks; + +public class MyAgent : AgentApplication +{ + // The OBO cache is keyed to AgenticTokenStruct, the value passed to RegisterObservability below. + private readonly IExporterTokenCache<AgenticTokenStruct> _agentTokenCache; + private readonly ILogger<MyAgent> _logger; + + public MyAgent( + AgentApplicationOptions options, + IExporterTokenCache<AgenticTokenStruct> agentTokenCache, + ILogger<MyAgent> logger) : base(options) + { + _agentTokenCache = agentTokenCache ??
throw new ArgumentNullException(nameof(agentTokenCache)); + _logger = logger ?? throw new ArgumentNullException(nameof(logger)); + } + + protected async Task MessageActivityAsync( + ITurnContext turnContext, + ITurnState turnState, + CancellationToken cancellationToken) + { + // Option A: Manual BaggageBuilder (use if BaggageTurnMiddleware is NOT registered) + // Build() returns IDisposable — use `using var` to scope the baggage context. + using var baggageScope = new BaggageBuilder() + .TenantId(turnContext.Activity.Recipient.TenantId) + .AgentId(turnContext.Activity.Recipient.AgenticAppId) + .Build(); + + // Option B: FromTurnContext helper (preferred — auto-populates from activity) + // Requires: using Microsoft.Agents.A365.Observability.Hosting.Extensions; + // using var baggageScope = new BaggageBuilder() + // .FromTurnContext(turnContext) + // .Build(); + + // Register the agentic token so the exporter can authenticate exports. + try + { + _agentTokenCache.RegisterObservability( + turnContext.Activity.Recipient.AgenticAppId, + turnContext.Activity.Recipient.TenantId, + new AgenticTokenStruct( + userAuthorization: UserAuthorization, + turnContext: turnContext, + authHandlerName: "AGENTIC" + ), + EnvironmentUtils.GetObservabilityAuthenticationScope() + ); + } + catch (Exception ex) + { + _logger.LogWarning(ex, "Error registering for observability."); + } + + // ... existing agent message handling logic ... + } +} +``` + +--- + +## Agent Class — Message Handler (S2S Path, `authMode: S2S`) + +Inject `Agent365ObservabilityContext` instead of `IExporterTokenCache`. +`ObservabilityTokenService` holds the token in the background — no per-turn `RegisterObservability` call. 
+ +```csharp +using Microsoft.Agents.Builder; +using Microsoft.Agents.A365.Observability.Hosting.Extensions; +using Microsoft.Agents.A365.Observability.Runtime.Common; +using Microsoft.Agents.A365.Observability.Runtime.Tracing.Contracts; +using Microsoft.Agents.A365.Observability.Runtime.Tracing.Scopes; + +public class MyAgent : AgentApplication +{ + // CallerDetails is read from Agent365Observability:Sponsor config — injected via + // Agent365ObservabilityContext singleton (see ObservabilityServiceExtensions). + // For autonomous agents, use the Blueprint sponsor's identity. + private readonly Agent365ObservabilityContext _obs; + + public MyAgent(AgentApplicationOptions options, Agent365ObservabilityContext obs) + : base(options) + { + _obs = obs; + } + + protected async Task MessageActivityAsync( + ITurnContext turnContext, + ITurnState turnState, + CancellationToken cancellationToken) + { + // No RegisterObservability() call — ObservabilityTokenService holds the token. + // IMPORTANT: FromTurnContext() is an extension on BaggageBuilder only — it does NOT + // exist on InvokeAgentScope. InvokeAgentScopeDetails has no parameterless constructor; + // pass at least `endpoint`. Keep baggage and scope as two separate using statements. + // authMode: S2S + + // Step 1: propagate baggage from the incoming turn. + // Requires: using Microsoft.Agents.A365.Observability.Hosting.Extensions; + using var baggageScope = new BaggageBuilder() + .FromTurnContext(turnContext) + .Build(); + + // Step 2: start the invoke scope with CallerDetails (required for traces to show up). + using var scope = InvokeAgentScope.Start( + new Request(turnContext.Activity.Text), + new InvokeAgentScopeDetails(endpoint: new Uri("https://your-agent-endpoint")), + _obs.AgentDetails, + _obs.CallerDetails); + + // ... existing agent message handling logic ... 
+ } +} +``` + +```csharp +// ObservabilityServiceExtensions.cs: DI registration with dynamic CallerDetails from config +public sealed class Agent365ObservabilityContext +{ + public AgentDetails AgentDetails { get; } + public CallerDetails CallerDetails { get; } + internal Agent365ObservabilityContext(AgentDetails d, CallerDetails c) + { + AgentDetails = d; + CallerDetails = c; + } +} + +public static class ObservabilityServiceExtensions +{ + public static IServiceCollection AddAgent365Observability(this IServiceCollection services) + { + services.AddSingleton(sp => + { + var obs = sp.GetRequiredService<IConfiguration>().GetSection("Agent365Observability"); + var agentDetails = new AgentDetails( + agentId: obs["AgentId"] ?? "local-dev", + agentName: obs["AgentName"] ?? "unknown", + agentDescription: obs["AgentDescription"] ?? "", + agentBlueprintId: obs["AgentBlueprintId"] ?? "", + tenantId: obs["TenantId"] ?? "local-dev"); + + // Read sponsor/caller details from config; this enables trace visibility in the MAC portal. + // Fallback order: Sponsor value, then a top-level Agent365Observability value, then a literal default. + var sponsor = obs.GetSection("Sponsor"); + var callerDetails = new CallerDetails( + userDetails: new UserDetails( + userId: sponsor["UserId"] ?? obs["ClientId"] ?? "unknown", + userName: sponsor["UserName"] ?? obs["AgentName"] ?? "Blueprint Sponsor", + userEmail: sponsor["UserEmail"] ?? "")); + + return new Agent365ObservabilityContext(agentDetails, callerDetails); + }); + // ... rest of DI registration + return services; + } +} +``` + +--- + +## Manual Instrumentation Scopes + +> **Store publishing requirement:** `InvokeAgentScope`, `InferenceScope`, and `ExecuteToolScope` +> are **required** for store validation. Missing any one causes store validation failure.
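
How the three required scopes relate to each other can be sketched with plain `System.Diagnostics` activities, the tracing primitive that OpenTelemetry for .NET builds on. This is an illustration only: it does not use the A365 SDK, the source name `Sketch.AgentScopes` and the span names are made up, and it assumes the SDK scopes follow the same parent/child behavior when one is started while another is active.

```csharp
using System;
using System.Diagnostics;

// Hypothetical source name; the real SDK scopes manage their own tracing source.
var source = new ActivitySource("Sketch.AgentScopes");

// StartActivity returns null unless a listener samples the source, so sample everything
// and print each span together with its parent when it stops.
using var listener = new ActivityListener
{
    ShouldListenTo = _ => true,
    Sample = (ref ActivityCreationOptions<ActivityContext> _) => ActivitySamplingResult.AllData,
    ActivityStopped = a =>
        Console.WriteLine($"{a.OperationName} parent={a.Parent?.OperationName ?? "none"}")
};
ActivitySource.AddActivityListener(listener);

// Outer span: stands in for the scope wrapping the top-level message handler.
using (var invoke = source.StartActivity("invoke_agent"))
{
    // Inner spans: stand in for one LLM call and one tool call made during the turn.
    using (var inference = source.StartActivity("inference")) { /* call the model */ }
    using (var tool = source.StartActivity("execute_tool")) { /* run the tool */ }
}
```

Each inner span is created while the outer one is current, so it records the outer span as its parent, and disposing a `using` block ends the span. That is the reason the handler-level scope should be opened first and stay open until the inference and tool scopes have been disposed.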
+ +### InvokeAgentScope + +```csharp +using System; +using System.Threading.Tasks; +using Microsoft.Agents.A365.Observability.Runtime.Tracing.Contracts; +using Microsoft.Agents.A365.Observability.Runtime.Tracing.Scopes; + +var agentDetails = new AgentDetails( + agentId: "agent-456", + agentName: "MyAgent", + agentDescription: "Handles user requests.", + agenticUserId: "auid-123", + agenticUserEmail: "agent@contoso.com", + agentBlueprintId: "blueprint-789", + tenantId: "tenant-123" +); + +var scopeDetails = new InvokeAgentScopeDetails( + endpoint: new Uri("https://myagent.contoso.com") +); + +var request = new Request( + content: userInput, + sessionId: "session-abc", + channel: new Channel("msteams"), + conversationId: "conv-xyz" +); + +var callerDetails = new CallerDetails( + userDetails: new UserDetails( + userId: "user-123", + userEmail: "jane.doe@contoso.com", + userName: "Jane Doe" + ) +); + +// Start the scope — dispose automatically ends the span +using var scope = InvokeAgentScope.Start( + request: request, + scopeDetails: scopeDetails, + agentDetails: agentDetails, + callerDetails: callerDetails +); + +scope.RecordInputMessages(new[] { userInput }); + +// ... your agent logic here ... 
+ +scope.RecordOutputMessages(new[] { output }); +``` + +### ExecuteToolScope + +```csharp +using Microsoft.Agents.A365.Observability.Runtime.Tracing.Contracts; +using Microsoft.Agents.A365.Observability.Runtime.Tracing.Scopes; + +// Use the same agentDetails and request instances from InvokeAgentScope above +var userDetails = new UserDetails( + userId: "user-123", + userEmail: "jane.doe@contoso.com", + userName: "Jane Doe" +); + +var toolCallDetails = new ToolCallDetails( + toolName: "summarize", + arguments: "{\"text\": \"...\"}", + toolCallId: "tc-001", + description: "Summarize provided text", + toolType: "function", + endpoint: new Uri("https://tools.contoso.com:8080") +); + +using var scope = ExecuteToolScope.Start( + request: request, + details: toolCallDetails, + agentDetails: agentDetails, + userDetails: userDetails +); + +// ... your tool logic here ... + +scope.RecordResponse("{\"summary\": \"The text was summarized.\"}"); +``` + +### InferenceScope + +```csharp +using Microsoft.Agents.A365.Observability.Runtime.Tracing.Contracts; +using Microsoft.Agents.A365.Observability.Runtime.Tracing.Scopes; + +// Use the same agentDetails and request instances from InvokeAgentScope above +var userDetails = new UserDetails( + userId: "user-123", + userEmail: "jane.doe@contoso.com", + userName: "Jane Doe" +); + +var inferenceDetails = new InferenceCallDetails( + operationName: InferenceOperationType.Chat, + model: "gpt-4o-mini", + providerName: "Azure OpenAI", + inputTokens: 123, + outputTokens: 456, + finishReasons: new[] { "stop" } +); + +using var scope = InferenceScope.Start( + request: request, + details: inferenceDetails, + agentDetails: agentDetails, + userDetails: userDetails +); + +// ... your inference logic here ... 
+ +scope.RecordOutputMessages(new[] { "AI response message" }); +scope.RecordInputTokens(123); +scope.RecordOutputTokens(456); +``` + +### OutputScope (async scenarios) + +```csharp +using Microsoft.Agents.A365.Observability.Runtime.Tracing.Contracts; +using Microsoft.Agents.A365.Observability.Runtime.Tracing.Scopes; + +// Use the same agentDetails and request instances from InvokeAgentScope above + +// Get the parent context from the originating scope +var parentContext = invokeScope.GetActivityContext(); + +var response = new Response(new[] { "Here is your organized inbox with 15 urgent emails." }); + +using var scope = OutputScope.Start( + request: request, + response: response, + agentDetails: agentDetails, + spanDetails: new SpanDetails(parentContext: parentContext) +); +// Output messages are recorded automatically from the response +``` + +--- + +## appsettings.json — Complete Pattern + +> **Note:** If you ran `a365 setup`, the following values are **already present** in your +> `appsettings.json`: `EnableAgent365Exporter: false`, `Agent365Observability.AgentBlueprintId`, +> and `Agent365Observability.TenantId`. Preserve these existing values when instrumenting. 
+ +**OBO path (`authMode: user-delegated` or `agentic-identity`):** + +```json +{ + "EnableAgent365Exporter": true, + "Agent365Observability": { + "AgentBlueprintId": "your-blueprint-id", + "TenantId": "your-tenant-id", + "AgentName": "My Agent", + "AgentDescription": "Description of what this agent does" + }, + "Logging": { + "LogLevel": { + "Default": "Information", + "Microsoft.Agents.A365.Observability": "Information", + "OpenTelemetry": "Warning" + } + } +} +``` + +**S2S path (`authMode: S2S`):** + +```json +{ + "Agent365Observability": { + "AgentBlueprintId": "<>", + "TenantId": "<>", + "AgentName": "<>", + "AgentDescription": "<>", + "AgentId": "<>", + "ClientId": "<>", + "ClientSecret": "<>", + "UseManagedIdentity": true, + "Sponsor": { + "UserId": "<>", + "UserName": "<>", + "UserEmail": "<>" + } + }, + "Logging": { + "LogLevel": { + "Default": "Information", + "Microsoft.Agents": "Warning", + "Microsoft.Hosting.Lifetime": "Information" + } + } +} +``` + +> **S2S auth note:** `UseManagedIdentity` defaults to `true`. In production (Azure), the service uses Managed Identity and the `ClientSecret` is only needed as a local-dev fallback. Set to `false` in `appsettings.Development.json` if you always want client-secret auth locally. +> +> **Sponsor note:** For S2S / autonomous agents, the `Sponsor` section provides the `CallerDetails` required for MAC portal trace visibility. Use the Blueprint app ID as `UserId`, the Blueprint display name as `UserName`, and the agent sponsor's email as `UserEmail`. + +> **Critical:** The `Logging.LogLevel` section is **required** for observability events to be +> captured in console output and forwarded to Microsoft Defender. Without this, the SDK is +> instrumented but logs are suppressed. The `a365 setup` command does **not** add logging +> configuration — you must add it manually or via this instrumentation skill. 
+ +> **Local dev convention:** Set `EnableAgent365Exporter: false` in `appsettings.Development.json` +> to keep local runs console-only. The main `appsettings.json` should have it **enabled** so +> deployed environments export by default without requiring an env override. + +## appsettings.Development.json + +```json +{ + "EnableAgent365Exporter": false, + "Logging": { + "LogLevel": { + "Default": "Information", + "Microsoft.Agents.A365.Observability": "Debug", + "OpenTelemetry": "Debug" + } + } +} +``` + +## Validate Locally + +Set `EnableAgent365Exporter` to `false` in `appsettings.Development.json` — spans export to the console. + +To investigate export failures, enable verbose logging: + +```json +{ + "EnableAgent365Exporter": true, + "Logging": { + "LogLevel": { + "Microsoft.Agents.A365.Observability": "Debug" + } + } +} +``` + +Or set environment variables: + +```bash +EnableAgent365Exporter=True +A365_OBSERVABILITY_DOMAIN_OVERRIDE=https://your-test-endpoint.example.com +# For S2S exports, override to the Observability API scope used by FMI Hop 3. +A365_OBSERVABILITY_SCOPE_OVERRIDE=api://9b975845-388f-4429-889e-eab1ef63949c/.default +``` + +Key log messages: + +```text +info: Agent365ExporterCore: Obtained token for agent {agentId} tenant {tenantId}. +info: Agent365ExporterCore: Sending {count} spans to {requestUri} for agent {agentId} tenant {tenantId}. +info: Agent365ExporterCore: HTTP {statusCode} exporting spans. 'x-ms-correlation-id': '{correlationId}'. +error: Agent365Exporter: Exception exporting spans: {exception} +warn: Agent365ExporterCore: No token obtained for agent {agentId} tenant {tenantId}. Skipping export. +``` + +> If you don't register an `ILoggerFactory` in DI, the exporter automatically falls back to a console logger. 
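
Before running locally on the S2S path, it can help to check whether the configured values would let the background token service start. The sketch below restates the scaffold's credential check as a pure function over a dictionary so it can be run on its own. It is an approximation rather than SDK code: the helper names `IsSet` and `HasCredentials` and the sample values are hypothetical, and it applies the `<<` placeholder test to every key, which is slightly stricter than the scaffold.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical local-dev values: client-secret auth requested, but the secret is still a placeholder.
var dev = new Dictionary<string, string>
{
    ["TenantId"] = "contoso-tenant",
    ["AgentId"] = "agent-456",
    ["ClientId"] = "blueprint-789",
    ["UseManagedIdentity"] = "false",
    ["ClientSecret"] = "<<PLACEHOLDER>>"
};

Console.WriteLine(HasCredentials(dev)); // prints "False": the secret is required but not set

// A value counts as configured when it is non-empty and is not a "<<...>>" placeholder.
static bool IsSet(IReadOnlyDictionary<string, string> cfg, string key) =>
    cfg.TryGetValue(key, out var v)
    && !string.IsNullOrEmpty(v)
    && !v.StartsWith("<<", StringComparison.Ordinal);

static bool HasCredentials(IReadOnlyDictionary<string, string> cfg)
{
    // UseManagedIdentity defaults to true when the key is absent or cannot be parsed.
    bool useManagedIdentity = !cfg.TryGetValue("UseManagedIdentity", out var raw)
        || !bool.TryParse(raw, out var parsed)
        || parsed;

    bool common = IsSet(cfg, "TenantId") && IsSet(cfg, "AgentId") && IsSet(cfg, "ClientId");

    // A client secret is only required when managed identity is turned off.
    return common && (useManagedIdentity || IsSet(cfg, "ClientSecret"));
}
```

With `UseManagedIdentity` left at its default of `true`, the same dictionary passes the check even without a usable secret, which matches the scaffold: the failure then surfaces later, when MSI is unavailable on a developer machine.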
+ +--- + +## Key Types Reference + +| Type | Namespace | Purpose | +|------|-----------|---------| +| `BaggageBuilder` | `Microsoft.Agents.A365.Observability.Runtime.Common` | Propagates context across spans; `Build()` returns `IDisposable` — use `using var` | +| `EnvironmentUtils` | `Microsoft.Agents.A365.Observability.Runtime.Common` | `GetObservabilityAuthenticationScope()` helper | +| `IExporterTokenCache` | `Microsoft.Agents.A365.Observability.Hosting.Caching` | DI interface for caching and retrieving agentic tokens | +| `ServiceTokenCache` | `Microsoft.Agents.A365.Observability.Hosting.Caching` | S2S implementation of `IExporterTokenCache` | +| `AgenticTokenStruct` | `Microsoft.Agents.A365.Observability.Hosting.Caching` | Wraps `TurnContext` + `UserAuthorization` + `AuthHandlerName` for token resolution. Uses **constructor** syntax: `new AgenticTokenStruct(userAuthorization: ..., turnContext: ..., authHandlerName: "AGENTIC")` | +| `Agent365ExporterOptions` | `Microsoft.Agents.A365.Observability.Runtime.Tracing.Exporters` | Exporter config (`TokenResolver`, `MaxQueueSize`, `ScheduledDelayMilliseconds`, etc.) 
| +| `Agent365ExporterType` | `Microsoft.Agents.A365.Observability.Runtime.Tracing.Exporters` | Enum for `AddA365Tracing()` exporter type param | +| `AddAgenticTracingExporter()` | `Microsoft.Agents.A365.Observability.Hosting` | DI extension for OBO token caching (`IExporterTokenCache`) — user-delegated / agentic-identity | +| `AddServiceTracingExporter()` | `Microsoft.Agents.A365.Observability.Hosting` | Legacy/manual DI extension for S2S token cache (`IExporterTokenCache`) when not using the unified distro | +| `Agent365ObservabilityContext` | Scaffold (`Observability/`) | Singleton wrapping `AgentDetails` for S2S agents — inject instead of per-turn `RegisterObservability` | +| `ObservabilityTokenService` | Scaffold (`Observability/`) | `BackgroundService` — acquires the export token via the FMI 3-hop chain (`.WithFmiPath()` + agent assertion); refreshes every 50 min | +| `AddAgent365Observability()` | Scaffold (`Observability/`) | Registers `ServiceTokenCache`, `ObservabilityTokenService` (conditional), and `Agent365ObservabilityContext` | +| `UseMicrosoftOpenTelemetry()` | `Microsoft.OpenTelemetry` | Configures OTel pipeline with A365 exporter (preferred for S2S) | +| `ExportTarget` | `Microsoft.OpenTelemetry` | Enum: `Agent365`, `Console`, `AzureMonitor` | +| `AddA365Tracing()` | `Microsoft.Agents.A365.Observability.Runtime` | Registers OTel TracerProvider with A365 exporter | +| `BaggageTurnMiddleware` | `Microsoft.Agents.A365.Observability.Hosting.Middleware` | Adapter middleware — auto-populates baggage from every `ITurnContext` | +| `FromTurnContext()` | `Microsoft.Agents.A365.Observability.Hosting.Extensions` | Extension on **`BaggageBuilder` only** — auto-populates from activity. Does NOT exist on `InvokeAgentScope` or any scope type. 
| +| `InvokeAgentScope` | `Microsoft.Agents.A365.Observability.Runtime.Tracing.Scopes` | Required for store publishing — wrap top-level message handler | +| `ExecuteToolScope` | `Microsoft.Agents.A365.Observability.Runtime.Tracing.Scopes` | Required for store publishing — wrap each tool call | +| `InferenceScope` | `Microsoft.Agents.A365.Observability.Runtime.Tracing.Scopes` | Required for store publishing — wrap each LLM call | +| `OutputScope` | `Microsoft.Agents.A365.Observability.Runtime.Tracing.Scopes` | For async scenarios where parent scope can't capture output synchronously | +| `AgentDetails` | `Microsoft.Agents.A365.Observability.Runtime.Tracing.Contracts` | Agent identity for scope telemetry | +| `InvokeAgentScopeDetails` | `Microsoft.Agents.A365.Observability.Runtime.Tracing.Contracts` | Endpoint details for `InvokeAgentScope` | +| `ToolCallDetails` | `Microsoft.Agents.A365.Observability.Runtime.Tracing.Contracts` | Tool info for `ExecuteToolScope` | +| `InferenceCallDetails` | `Microsoft.Agents.A365.Observability.Runtime.Tracing.Contracts` | Model/token info for `InferenceScope` | +| `CallerDetails` / `UserDetails` | `Microsoft.Agents.A365.Observability.Runtime.Tracing.Contracts` | Caller identity | + +--- + +## Agent365ExporterOptions Properties + +| Property | Description | Default | +|----------|-------------|---------| +| `UseS2SEndpoint` | Use service-to-service endpoint path | `false` | +| `MaxQueueSize` | Max queue size for batch processor | `2048` | +| `ScheduledDelayMilliseconds` | Delay between export batches | `5000` | +| `ExporterTimeoutMilliseconds` | Timeout for export operation | `30000` | +| `MaxExportBatchSize` | Max batch size | `512` | + +--- + +## Configuration Sources + +The `a365 setup` command (as of April 2026) automatically writes the following to `appsettings.json`: + +```json +{ + "EnableAgent365Exporter": false, + "Agent365Observability": { + "AgentBlueprintId": "", + "TenantId": "", + "AgentName": "", + "AgentDescription": 
"" + } +} +``` + +**What `a365 setup` does NOT add:** +- `Logging.LogLevel` configuration (required for Defender visibility) +- `Agent365Observability:Sponsor` values for `CallerDetails` (required for S2S / autonomous agent trace visibility in MAC portal) + +**When instrumenting observability:** +1. Preserve existing `EnableAgent365Exporter`, `AgentBlueprintId`, `TenantId` values +2. Add `Logging.LogLevel` section if missing +3. Populate `AgentName` and `AgentDescription` if empty + +--- + +## Troubleshooting + +| Symptom | Cause | Fix | +|---------|-------|-----| +| No traces in console | OTel not wired | Call `builder.UseMicrosoftOpenTelemetry()` (or `builder.AddA365Tracing()` for OBO path) | +| No logs in Defender | Missing `Logging.LogLevel` config | Add `Microsoft.Agents.A365.Observability: Debug` to appsettings.json | +| `AgenticAppId` is null | Missing `AGENTIC_APP_ID` env var | Set it in `.env` or App Service config | +| Token resolver returns null | `AddAgenticTracingExporter()` not called | Add to `Program.cs` DI | +| 401 from A365 exporter | OAuth consent not granted | Run `a365 setup permissions observability`; also check if upgrading past `0.3-beta` (requires new `Agent365.Observability.OtelWrite` permission) | +| Build error on `BaggageBuilder` | Wrong namespace | Use `Microsoft.Agents.A365.Observability.Runtime.Common` | +| Build error on `AgenticTokenStruct` | Object initializer syntax used | Use constructor: `new AgenticTokenStruct(userAuthorization: ..., turnContext: ..., authHandlerName: "AGENTIC")` | +| Build error on `IExporterTokenCache` | Wrong namespace | Use `Microsoft.Agents.A365.Observability.Hosting.Caching` | +| Build error on `AddAgenticTracingExporter` | Wrong namespace | Use `Microsoft.Agents.A365.Observability.Hosting` | +| Build error on `AddA365Tracing` | Wrong namespace | Use `Microsoft.Agents.A365.Observability.Runtime` | +| Spans dropped silently | Missing tenant/agent ID in baggage | Ensure `BaggageBuilder` is set up before 
creating spans, or register `BaggageTurnMiddleware` | +| S2S: token service skipped at startup | Placeholder or missing `Agent365Observability` credentials | Run `a365 setup all` or populate `TenantId`, `AgentId`, `ClientId`, and `ClientSecret` (when `UseManagedIdentity` is `false`) | +| S2S: 401 on export | Token acquired for wrong scope or app | Verify FMI Hop 3 scope is `api://9b975845-388f-4429-889e-eab1ef63949c/.default`. For agents provisioned before CLI 1.1, verify Agent Identity SP has `Agent365.Observability.OtelWrite` app role via Entra portal | +| S2S: FMI Hop 1+2 fails | Blueprint credentials wrong or `.WithFmiPath(agentId)` target incorrect | Check `ClientId` (Blueprint app ID) and `ClientSecret` in appsettings; verify `AgentId` matches the Agent Identity app ID | +| S2S: FMI Hop 3 → 401 on export | Wrong scope or missing role | FMI Hop 3 scope is `api://9b975845-388f-4429-889e-eab1ef63949c/.default`; Agent Identity SP needs `OtelWrite` role assigned via Graph API | +| S2S: MSI fails locally | No Managed Identity available in dev | Set `UseManagedIdentity: false` in appsettings.Development.json, ensure `ClientSecret` is populated | +| S2S: `UseMicrosoftOpenTelemetry` not found | Unified distro not installed | Run `dotnet add package Microsoft.OpenTelemetry --version 1.0.0-beta.1` | +| S2S: Runtime `FileNotFoundException` for `Microsoft.Extensions.Logging v10.0.0` | `Microsoft.OpenTelemetry` v1.0.0-beta.1 depends on v10 logging | (1) Upgrade TFM to `net9.0`. (2) Run `dotnet add package Microsoft.Extensions.Logging --version "10.0.4"` — use the **stable** version, not a preview; specifying a preview causes NU1605 downgrade errors because `Microsoft.Agents.A365.Observability.Hosting` requires `>= 10.0.4`. 
| +| S2S: CS0433 type ambiguity on `AgentDetails` / `CallerDetails` / `IExporterTokenCache` | `Microsoft.Agents.A365.Observability.Hosting` and/or `.Runtime` added as direct references alongside `Microsoft.OpenTelemetry` | Remove the direct `<PackageReference>` entries for `Hosting` and `Runtime` from the `.csproj`. `Microsoft.OpenTelemetry` already brings both transitively and re-exports their types — direct references create duplicate symbols. | +| S2S: HTTP 401 on span export (correct token) | `UseS2SEndpoint` not set — exporter posts to `/observability/` instead of `/observabilityService/` | Set `o.Agent365.Exporter.UseS2SEndpoint = true` in `UseMicrosoftOpenTelemetry` options | +| S2S: CS7036 on `InferenceCallDetails` — missing `providerName` | `providerName` is required (not optional) | Use: `new InferenceCallDetails(operationName: ..., model: ..., providerName: "Azure OpenAI")` | +| S2S: CS1503 on `ExecuteToolScope.RecordResponse` | Method takes `string`, not `Response` | Use: `toolScope.RecordResponse(resultString)` | +| S2S: `InvokeAgentScopeDetails` constructor error | No parameterless constructor exists | Pass at least `endpoint`: `new InvokeAgentScopeDetails(endpoint: new Uri("..."))` | +| S2S: `InvokeAgentScope` has no `FromTurnContext` | `FromTurnContext` is a `BaggageBuilder` extension only | Create `BaggageBuilder` separately: `new BaggageBuilder().FromTurnContext(tc).Build()` | +| Build error: `Azure.AI.OpenAI` version conflict with `Extensions.OpenAI` | Package requires `Azure.AI.OpenAI >= 2.7.0-beta.2` | Run `dotnet add package Azure.AI.OpenAI --version 2.7.0-beta.2` before adding the extension | diff --git a/docs/agent365-guided-setup/references/nodejs-observability.md b/docs/agent365-guided-setup/references/nodejs-observability.md new file mode 100644 index 00000000..31ad9085 --- /dev/null +++ b/docs/agent365-guided-setup/references/nodejs-observability.md @@ -0,0 +1,1057 @@ +# Node.js — A365 Observability Reference + +Authoritative package versions and code 
patterns for instrumenting A365 observability +into a Node.js agent. All samples mirror the official Microsoft Learn docs (updated 2026-04-30). + +--- + +## npm Packages + +| Package | Purpose | +|---------|---------| +| `@microsoft/agents-a365-observability` | Logger/exporter helpers such as `setLogger`, `ExporterEventNames`, and additional observability contracts | +| `@microsoft/agents-a365-observability-hosting` | `AgenticTokenCacheInstance`, `BaggageBuilderUtils`, `BaggageMiddleware`, `ObservabilityHostingManager`, `ScopeUtils` | +| `@microsoft/agents-a365-runtime` | `getObservabilityAuthenticationScope()`, `ClusterCategory` | + +**Unified distro entry point:** + +| Package | Purpose | +|---------|---------| +| `@microsoft/opentelemetry` (v0.1.0-beta.1) | Required entry point: `useMicrosoftOpenTelemetry()`, `shutdownMicrosoftOpenTelemetry()`, all scope types (`BaggageBuilder`, `InvokeAgentScope`, `InferenceScope`, `ExecuteToolScope`), `AgentDetails`, and all contract types | +| `@azure/msal-node` (^3.6.0) | MSAL `ConfidentialClientApplication` with `fmiPath` for the FMI token chain | +| `@azure/identity` (^4.6.0) | `ManagedIdentityCredential` for MSI-based token acquisition | + +Install commands: +```bash +# Required for all agents +npm install @microsoft/opentelemetry@0.1.0-beta.1 +npm install @microsoft/agents-a365-observability +npm install @microsoft/agents-a365-runtime + +# Required for AI Teammate agents (hosting path) +npm install @microsoft/agents-a365-observability-hosting + +# S2S token service dependencies +npm install @azure/msal-node @azure/identity + +# Optional auto-instrumentation extensions +npm install @microsoft/agents-a365-observability-extensions-openai +npm install @microsoft/agents-a365-observability-extensions-langchain +``` + +The unified distro `useMicrosoftOpenTelemetry()` entry point is used for both OBO and S2S flows. + +Minimum Node.js: **18.x** (LTS). TypeScript: **5.x** recommended. 
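The minimum runtime stated above can be checked at startup before any observability code runs. The following is a minimal sketch, not part of any A365 package: `meetsMinimumNode` is a hypothetical helper name, and the default of 18 simply mirrors the minimum listed in this reference.

```typescript
// Hypothetical startup guard (not part of any A365 or OpenTelemetry package).
// Returns true when a Node.js version string meets the minimum major version.
export function meetsMinimumNode(version: string, minimumMajor: number = 18): boolean {
  // Accept both "18.19.0" and "v18.19.0".
  const major = parseInt(version.replace(/^v/, '').split('.')[0], 10);
  // An unparseable version string is treated as not meeting the minimum.
  return !isNaN(major) && major >= minimumMajor;
}
```

In an entry point, this would typically be called with `process.versions.node` before `useMicrosoftOpenTelemetry()` runs, logging a warning or exiting when it returns `false`.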
+ +--- + +## Entry Point — Observability Init (before any LLM imports) + +### Configuration + +Initialize the unified distro before importing the rest of your app so LangChain auto-instrumentation can patch libraries. + +```typescript +// A365 Observability — best-effort instrumentation (verify against official sample) +// index.ts — must be called BEFORE importing other modules +import { configDotenv } from 'dotenv'; +configDotenv(); + +import { useMicrosoftOpenTelemetry } from '@microsoft/opentelemetry'; +import { tokenResolver } from './token-cache'; +import { AgenticTokenCacheInstance } from '@microsoft/agents-a365-observability-hosting'; + +useMicrosoftOpenTelemetry({ + a365: { + enabled: true, + // Option 1: Custom token resolver with local cache (sample default when Use_Custom_Resolver=true) + tokenResolver: process.env.Use_Custom_Resolver === 'true' + ? (agentId: string, tenantId: string) => tokenResolver(agentId, tenantId) ?? '' + : (agentId: string, tenantId: string) => AgenticTokenCacheInstance.getObservabilityToken(agentId, tenantId) ?? '', + }, + // instrumentationOptions is optional — omit unless you need framework-specific auto-instrumentation. + // The @microsoft/agents-a365-observability-extensions-langchain package has a peer dep conflict + // with @langchain/core@^0.3.0, so manual scopes (InvokeAgentScope, InferenceScope, etc.) are preferred. +}); +``` + +> **Auto-instrumentation:** `instrumentationOptions: { langchain: {} }` is optional and only useful +> if `@microsoft/agents-a365-observability-extensions-langchain` is installed (requires `@langchain/core@^1.1.32`). +> For most agents, manual scopes are sufficient and avoid the peer dependency conflict. + +### S2S configuration (`authMode: S2S`) + +S2S observability is supported for Node.js. 
The token service uses a **3-hop FMI (Federated Managed Identity) token chain**: + +``` +Blueprint (client_credentials / MSI) + → Hop 1+2: FMI token (api://AzureADTokenExchange/.default with fmiPath=agentId) + → Agent Identity token + → Hop 3: Observability API token (scope=api://9b975845-388f-4429-889e-eab1ef63949c/.default) +``` + +No OBO user token is required. + +> **Auth strategy** is controlled by `AGENT365_USE_MANAGED_IDENTITY`: +> - `true` (production) — MSI → Blueprint FIC → Agent Identity → API +> - `false` (local dev) — Client Secret → Blueprint FIC → Agent Identity → API + +> **IMPORTANT — MSAL `fmiPath` limitation (as of 2026-04-30):** In the +> `acquireTokenByClientCredential()` flow used by the **client-secret path**, published versions +> of `@azure/msal-node` (v3.x or v5.x) do not serialize a caller-supplied `fmiPath` to the +> token endpoint. Passing `fmiPath` via `acquireTokenByClientCredential()` with `as any` results +> in `AADSTS82008: All agentic applications requesting a token exchange token must include the +> fmipath parameter`. **Workaround:** For the client-secret path (`acquireT1ViaClientSecret`), +> use a direct HTTP POST to the `/oauth2/v2.0/token` endpoint with `fmi_path` as a form +> parameter. The MSI path (`acquireT1ViaMsi`) is still expected to work because it obtains the +> Blueprint/FMI token through `ManagedIdentityCredential` (a `client_assertion` flow, not a +> `client_credentials` + secret flow), rather than relying on MSAL to serialize `fmiPath` on a +> standard client-credential request. This workaround will be removed once MSAL ships native +> `fmiPath` support for the client-secret credential path. + +> **Note:** `a365 setup all` attempts to grant `Agent365.Observability.OtelWrite` to the Agent Identity SP, but this requires **Global Administrator** privileges. If the assignment fails (403), a Global Admin must manually grant the role via Entra portal — otherwise trace exports will return HTTP 403. 
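The failure modes described in this reference surface as distinct errors, which makes a small triage helper possible. The sketch below is illustrative only: `describeExportFailure` and the hint strings are assumptions made for this example, not SDK output, and the mapping encodes only what this document states (AADSTS82008 when `fmi_path` is missing, HTTP 403 when the `OtelWrite` role is not granted, HTTP 401 for a wrong scope or the non-S2S endpoint).

```typescript
// Illustrative diagnostic helper; names and hint values are assumptions, not SDK output.
export type ExportFailureHint =
  | 'missing-fmi-path'
  | 'missing-otelwrite-role'
  | 'wrong-endpoint-or-scope'
  | 'unknown';

export function describeExportFailure(status: number, body: string = ''): ExportFailureHint {
  // AADSTS82008: the token endpoint did not receive the fmi_path parameter.
  if (body.indexOf('AADSTS82008') !== -1) return 'missing-fmi-path';
  // 403 on export: the Agent Identity SP lacks Agent365.Observability.OtelWrite.
  if (status === 403) return 'missing-otelwrite-role';
  // 401 on export: wrong token scope, or spans were sent to the non-S2S endpoint.
  if (status === 401) return 'wrong-endpoint-or-scope';
  return 'unknown';
}
```

A caller could log the returned hint next to the raw status so that the matching troubleshooting entry is easier to find.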
+ +> **IMPORTANT — SDK `useS2SEndpoint` bug (v0.1.0-beta.1):** The `@microsoft/opentelemetry` +> distro does **not** pass `useS2SEndpoint` to `Agent365Exporter`. The exporter defaults +> `useS2SEndpoint` to `false`, sending spans to `/observability/` instead of +> `/observabilityService/`. S2S tokens are rejected (HTTP 401) on the non-S2S endpoint. +> **Workaround:** Create a custom `Agent365Exporter` with `useS2SEndpoint: true` via +> `spanProcessors` and do **not** pass `a365` options to the distro (see Step 3 entry point). +> Also set `ENABLE_A365_OBSERVABILITY_EXPORTER=false` in `.env`. This env var has the highest +> precedence: if it is left set to `true`, it overrides a programmatic `enabled: false` and +> re-creates the broken built-in exporter. + +> **Auto-instrumentation note:** The `instrumentationOptions: { langchain: {} }` option is +> **not required** for autonomous agents. The distro attempts OpenAI Agents auto-instrumentation +> by default (logs a benign `ERR_MODULE_NOT_FOUND` warning for `@openai/agents` if not installed). +> The optional `@microsoft/agents-a365-observability-extensions-langchain` package has a peer +> dependency on `@langchain/core@^1.1.32` which conflicts with `@langchain/core@^0.3.0` used +> by most LangChain projects — skip it and use manual scopes instead. 
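The two settings involved in the S2S endpoint workaround are easy to leave inconsistent. A minimal pre-flight sketch follows, assuming the env var names used in this reference (`AGENT365_USE_S2S_ENDPOINT`, `ENABLE_A365_OBSERVABILITY_EXPORTER`). `checkS2SExporterEnv` is a hypothetical helper, not an SDK function, and treating an unset `ENABLE_A365_OBSERVABILITY_EXPORTER` as a warning is a conservative assumption.

```typescript
// Hypothetical pre-flight check; not part of @microsoft/opentelemetry.
type Env = { [key: string]: string | undefined };

export function checkS2SExporterEnv(env: Env): string[] {
  const warnings: string[] = [];
  const useS2S = (env.AGENT365_USE_S2S_ENDPOINT || 'false').toLowerCase() === 'true';
  const builtIn = (env.ENABLE_A365_OBSERVABILITY_EXPORTER || '').toLowerCase();

  // With the custom S2S exporter in place, the built-in exporter should stay off.
  // Assumption: anything other than an explicit "false" is flagged.
  if (useS2S && builtIn !== 'false') {
    warnings.push(
      'Set ENABLE_A365_OBSERVABILITY_EXPORTER=false when AGENT365_USE_S2S_ENDPOINT=true.'
    );
  }
  return warnings;
}
```

In an entry point, the result could be passed to `console.warn` one message at a time, for example by calling the helper with `process.env` right after `configDotenv()`.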
+ +#### Step 1 — Create `observability/token-cache.ts` + +Simple in-memory token cache shared by the token service and the OTel exporter: + +```typescript +// observability/token-cache.ts +// A365 Observability — best-effort instrumentation (verify against official sample) + +interface CacheEntry { + token: string; + expiresAt: number; // Unix ms +} + +const EXPIRY_BUFFER_MS = 5 * 60 * 1000; // 5 minutes + +const cache = new Map<string, CacheEntry>(); + +export function cacheToken(agentId: string, tenantId: string, token: string, expiresInMs: number = 60 * 60 * 1000): void { + const key = `${agentId}:${tenantId}`; + cache.set(key, { + token, + expiresAt: Date.now() + expiresInMs, + }); +} + +export function getCachedToken(agentId: string, tenantId: string): string | null { + const key = `${agentId}:${tenantId}`; + const entry = cache.get(key); + + if (!entry) { + return null; + } + + if (Date.now() + EXPIRY_BUFFER_MS >= entry.expiresAt) { + cache.delete(key); + return null; + } + + return entry.token; +} + +/** + * Token resolver called by the A365 Observability exporter when exporting telemetry. 
+ */ +export const tokenResolver = (agentId: string, tenantId: string): string | null => { + return getCachedToken(agentId, tenantId); +}; +``` + +#### Step 2 — Create `observability/observability-token-service.ts` + +Background token acquisition via MSAL 3-hop FMI chain: + +```typescript +// observability/observability-token-service.ts +// A365 Observability — best-effort instrumentation (verify against official sample) +// A365 auth mode: S2S — 3-hop FMI token chain (MSAL) +// Hop 1+2: Blueprint (MSI or client secret) → T1 via FMI path → Agent Identity +// Hop 3: Agent Identity uses T1 as assertion → Observability API token + +import { ConfidentialClientApplication } from '@azure/msal-node'; +import { ManagedIdentityCredential } from '@azure/identity'; +import { cacheToken } from './token-cache'; + +const FMI_SCOPES = ['api://AzureADTokenExchange/.default']; +const OBSERVABILITY_SCOPES = ['api://9b975845-388f-4429-889e-eab1ef63949c/.default']; +const REFRESH_INTERVAL_MS = 50 * 60 * 1000; // 50 minutes + +export interface TokenServiceConfig { + tenantId: string; + agentId: string; + blueprintClientId: string; + blueprintClientSecret: string; + useManagedIdentity: boolean; +} + +export function startTokenService(config: TokenServiceConfig): ReturnType<typeof setInterval> { + console.log(`[A365 Observability] Token service started (useManagedIdentity=${config.useManagedIdentity}).`); + + const run = async () => { + try { + await acquireAndRegisterToken(config); + } catch (error) { + console.warn(`[A365 Observability] Failed to acquire token; will retry in ${REFRESH_INTERVAL_MS / 1000}s.`, error); + } + }; + + // Acquire immediately, then on interval + run(); + return setInterval(run, REFRESH_INTERVAL_MS); +} + +async function acquireAndRegisterToken(config: TokenServiceConfig): Promise<void> { + const authority = `https://login.microsoftonline.com/${config.tenantId}`; + + // Hop 1+2: Blueprint → T1 via FMI path + const t1Token = config.useManagedIdentity + ? 
await acquireT1ViaMsi(authority, config.blueprintClientId, config.agentId) + : await acquireT1ViaClientSecret(authority, config.blueprintClientId, config.blueprintClientSecret, config.agentId); + + // Hop 3: Agent Identity uses T1 → Observability API token + const identityApp = new ConfidentialClientApplication({ + auth: { + clientId: config.agentId, + authority, + clientAssertion: t1Token, + }, + }); + + const obsResult = await identityApp.acquireTokenByClientCredential({ + scopes: OBSERVABILITY_SCOPES, + }); + + if (!obsResult?.accessToken) { + throw new Error('Failed to acquire observability token: no access token returned'); + } + + const expiresInMs = obsResult.expiresOn + ? obsResult.expiresOn.getTime() - Date.now() + : 55 * 60 * 1000; + cacheToken(config.agentId, config.tenantId, obsResult.accessToken, expiresInMs); + console.log(`[A365 Observability] Token registered for agent ${config.agentId}.`); +} + +async function acquireT1ViaMsi(authority: string, blueprintClientId: string, agentId: string): Promise<string> { + // ManagedIdentityCredential.getToken uses a resource URI (no /.default suffix). 
+ const credential = new ManagedIdentityCredential(); + const msiToken = await credential.getToken('api://AzureADTokenExchange'); + + const blueprintApp = new ConfidentialClientApplication({ + auth: { + clientId: blueprintClientId, + authority, + clientAssertion: msiToken.token, + }, + }); + + const result = await blueprintApp.acquireTokenByClientCredential({ + scopes: FMI_SCOPES, + azureRegion: undefined, + fmiPath: agentId, + } as any); // fmiPath is available in MSAL Node but not yet in stable types + + if (!result?.accessToken) { + throw new Error('FMI T1 via MSI failed: no access token returned'); + } + return result.accessToken; +} + +async function acquireT1ViaClientSecret(authority: string, blueprintClientId: string, blueprintClientSecret: string, agentId: string): Promise<string> { + // Direct HTTP request — @azure/msal-node does not yet serialize fmiPath to the token endpoint. + // Use native fetch to POST with fmi_path form parameter until MSAL ships support. + const tokenUrl = `${authority}/oauth2/v2.0/token`; + const params = new URLSearchParams({ + client_id: blueprintClientId, + client_secret: blueprintClientSecret, + scope: FMI_SCOPES[0], + grant_type: 'client_credentials', + fmi_path: agentId, + }); + + const response = await fetch(tokenUrl, { + method: 'POST', + headers: { 'Content-Type': 'application/x-www-form-urlencoded' }, + body: params.toString(), + }); + + if (!response.ok) { + const errorBody = await response.text(); + throw new Error(`FMI T1 via client secret failed (${response.status}): ${errorBody}`); + } + + const data = await response.json() as { access_token?: string }; + if (!data.access_token) { + throw new Error('FMI T1 via client secret failed: no access_token in response'); + } + return data.access_token; +} +``` + +#### Step 3 — Wire in entry point (`index.ts`) + +```typescript +// authMode: S2S — service principal, no user OBO. 
+import { configDotenv } from 'dotenv'; +configDotenv(); + +import { + useMicrosoftOpenTelemetry, + shutdownMicrosoftOpenTelemetry, + Agent365Exporter, + A365SpanProcessor, +} from '@microsoft/opentelemetry'; +import type { AgentDetails, CallerDetails, UserDetails } from '@microsoft/opentelemetry'; +import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base'; + +import { tokenResolver } from './observability/token-cache'; +import { startTokenService } from './observability/observability-token-service'; + +// ── Configuration ──────────────────────────────────────────────────────────── +const TENANT_ID = process.env.AGENT365_TENANT_ID || ''; +const AGENT_ID = process.env.AGENT365_AGENT_ID || ''; +const BLUEPRINT_ID = process.env.AGENT365_BLUEPRINT_ID || ''; +const CLIENT_ID = process.env.AGENT365_CLIENT_ID || ''; +const CLIENT_SECRET = process.env.AGENT365_CLIENT_SECRET || ''; +const AGENT_NAME = process.env.AGENT365_AGENT_NAME || 'my-agent'; +const AGENT_DESCRIPTION = process.env.AGENT365_AGENT_DESCRIPTION || ''; +const SPONSOR_USER_ID = process.env.agent365Observability__sponsorUserId || CLIENT_ID || ''; +const SPONSOR_USER_NAME = process.env.agent365Observability__sponsorUserName || AGENT_NAME; +const SPONSOR_USER_EMAIL = process.env.agent365Observability__sponsorUserEmail || ''; +const USE_MANAGED_IDENTITY = (process.env.AGENT365_USE_MANAGED_IDENTITY || 'true').toLowerCase() === 'true'; +const USE_S2S_ENDPOINT = (process.env.AGENT365_USE_S2S_ENDPOINT || 'false').toLowerCase() === 'true'; + +function hasA365Credentials(): boolean { + const requiredValues = [TENANT_ID, AGENT_ID, CLIENT_ID]; + const hasRequired = requiredValues.every(v => v && !v.startsWith('<<')); + if (!hasRequired) return false; + if (USE_MANAGED_IDENTITY) return true; + return !!CLIENT_SECRET && !CLIENT_SECRET.startsWith('<<'); +} + +const A365_ENABLED = hasA365Credentials(); + +// ── Agent Details ──────────────────────────────────────────────────────────── +export const agentDetails: 
AgentDetails = { + agentId: AGENT_ID || 'local-dev', + agentName: AGENT_NAME, + agentDescription: AGENT_DESCRIPTION, + agentBlueprintId: BLUEPRINT_ID, + tenantId: TENANT_ID || 'local-dev', +}; + +export const userDetails: UserDetails = { + userId: SPONSOR_USER_ID || 'unknown', + userName: SPONSOR_USER_NAME || 'Blueprint Sponsor', + userEmail: SPONSOR_USER_EMAIL, +}; + +export const callerDetails: CallerDetails = { + userDetails, +}; + +// ── Observability ──────────────────────────────────────────────────────────── +// Microsoft OpenTelemetry distro with A365 exporter. +// Token resolver reads from in-memory cache populated by the background token service. +// +// SDK workaround (v0.1.0-beta.1): The distro does not pass `useS2SEndpoint` +// to Agent365Exporter. When AGENT365_USE_S2S_ENDPOINT=true, we supply our own +// A365SpanProcessor + Agent365Exporter via `spanProcessors` instead. +// IMPORTANT: Set ENABLE_A365_OBSERVABILITY_EXPORTER=false in .env to prevent +// the env var from overriding the programmatic `enabled` setting. +const a365TokenResolver = (agentId: string, tenantId: string) => + tokenResolver(agentId, tenantId) ?? ''; + +const s2sSpanProcessors = A365_ENABLED && USE_S2S_ENDPOINT + ? [ + new A365SpanProcessor(), + new BatchSpanProcessor( + new Agent365Exporter({ + useS2SEndpoint: true, + tokenResolver: a365TokenResolver, + }) + ), + ] + : []; + +useMicrosoftOpenTelemetry({ + // When using S2S workaround, don't pass a365 options (avoids duplicate exporter + // or noisy console fallback). Otherwise let the distro create its own exporter. + a365: A365_ENABLED && !USE_S2S_ENDPOINT + ? { + enabled: true, + tokenResolver: a365TokenResolver, + } + : undefined, + spanProcessors: s2sSpanProcessors, +}); + +// ... import app modules AFTER observability init ... + +// Start background token service after server is listening +const tokenServiceInterval = A365_ENABLED + ? 
startTokenService({ + tenantId: TENANT_ID, + agentId: AGENT_ID, + blueprintClientId: CLIENT_ID, + blueprintClientSecret: CLIENT_SECRET, + useManagedIdentity: USE_MANAGED_IDENTITY, + }) + : undefined; + +// Graceful shutdown: +function shutdown(signal: string) { + console.log(`${signal} received — shutting down`); + if (tokenServiceInterval) { + clearInterval(tokenServiceInterval); + } + shutdownMicrosoftOpenTelemetry().finally(() => process.exit(0)); +} +process.on('SIGTERM', () => shutdown('SIGTERM')); +process.on('SIGINT', () => shutdown('SIGINT')); +``` + +#### S2S environment variables + +```dotenv +# Agent 365 Observability — S2S +AGENT365_TENANT_ID= +AGENT365_AGENT_ID= +AGENT365_BLUEPRINT_ID= +AGENT365_CLIENT_ID= +AGENT365_CLIENT_SECRET= +AGENT365_AGENT_NAME=my-agent +AGENT365_AGENT_DESCRIPTION= +agent365Observability__sponsorUserId=<> +agent365Observability__sponsorUserName=<> +agent365Observability__sponsorUserEmail=<> +AGENT365_USE_MANAGED_IDENTITY=true +AGENT365_USE_S2S_ENDPOINT=true +# IMPORTANT: Must be false when using the S2S workaround (AGENT365_USE_S2S_ENDPOINT=true), +# because this env var overrides the programmatic `enabled` setting in A365Configuration. +# The custom Agent365Exporter with useS2SEndpoint handles export instead. +ENABLE_A365_OBSERVABILITY_EXPORTER=false +``` + +Message handler baggage setup is **identical** to `user-delegated` / `agentic-identity` — only the token resolver and credential source differ. Do **not** call `AgenticTokenCacheInstance.RefreshObservabilityToken` for S2S agents. + +--- + +## Adapter — BaggageMiddleware + +Register `BaggageMiddleware` to auto-populate baggage from every incoming `TurnContext`. +This removes the need to call `BaggageBuilder` manually in each activity handler. 
+ +```typescript +import { BaggageMiddleware } from '@microsoft/agents-a365-observability-hosting'; + +// Option 1: Register middleware directly on the adapter +adapter.use(new BaggageMiddleware()); +// The middleware skips async replies (ContinueConversation) to avoid overwriting baggage. +``` + +```typescript +import { ObservabilityHostingManager } from '@microsoft/agents-a365-observability-hosting'; + +// Option 2: Use ObservabilityHostingManager for composite configuration +const manager = new ObservabilityHostingManager(); +manager.configure(adapter, { enableBaggage: true }); +``` + +--- + +## Message Handler — Token Refresh + BaggageBuilder + +For OBO / user-delegated / agentic-identity flows, the official sample now builds the baggage scope from `TurnContext`, optionally adds `sessionDescription(...)`, preloads the exporter token, then runs the agent logic inside `baggageScope.run(...)`. + +The sample supports **two token refresh patterns**: +- **Option 1 (sample default when `Use_Custom_Resolver=true`)** — exchange the OBO token yourself and cache it with `createAgenticTokenCacheKey(...)` +- **Option 2** — call `AgenticTokenCacheInstance.RefreshObservabilityToken(...)` + +```typescript +// A365 Observability — best-effort instrumentation (verify against official sample) +import { BaggageBuilder } from '@microsoft/opentelemetry'; +import { AgenticTokenCacheInstance, BaggageBuilderUtils } from '@microsoft/agents-a365-observability-hosting'; +import { getObservabilityAuthenticationScope } from '@microsoft/agents-a365-runtime'; +import tokenCache, { createAgenticTokenCacheKey } from './token-cache'; + +// Inside your AgentApplication subclass / message handler: +async function handleMessage(turnContext: TurnContext, state: ApplicationTurnState) { + const baggageScope = BaggageBuilderUtils.fromTurnContext( + new BaggageBuilder(), + turnContext + ).sessionDescription('Initial onboarding session') + .build(); + + await preloadObservabilityToken(turnContext); + + 
try { + await baggageScope.run(async () => { + // ... your LangChain invocation, tool calls, streaming, etc. ... + }); + } finally { + baggageScope.dispose(); + } +} + +async function preloadObservabilityToken(turnContext: TurnContext): Promise<void> { + const agentId = turnContext.activity?.recipient?.agenticAppId ?? ''; + const tenantId = turnContext.activity?.recipient?.tenantId ?? ''; + + if (process.env.Use_Custom_Resolver === 'true') { + // Option 1: Custom cache + const aauToken = await agentApplication.authorization.exchangeToken(turnContext, 'agentic', { + scopes: getObservabilityAuthenticationScope() + }); + const cacheKey = createAgenticTokenCacheKey(agentId, tenantId); + tokenCache.set(cacheKey, aauToken?.token || ''); + } else { + // Option 2: Built-in cache + await AgenticTokenCacheInstance.RefreshObservabilityToken( + agentId, + tenantId, + turnContext, + agentApplication.authorization, + getObservabilityAuthenticationScope() + ); + } +} +``` + +> If you already registered `BaggageMiddleware`, you can usually skip the manual `BaggageBuilderUtils.fromTurnContext(...)` call, but the per-turn token preload/refresh step is still required for OBO export. + +--- + +## Manual Instrumentation Scopes + +> **Store publishing requirement:** `InvokeAgentScope`, `InferenceScope`, and `ExecuteToolScope` +> are **required** for store validation. Missing any one causes store validation failure. + +> **Import source:** Import all scope types (`InvokeAgentScope`, `InferenceScope`, `ExecuteToolScope`, `BaggageBuilder`, `AgentDetails`, etc.) from `@microsoft/opentelemetry`. 
+ +```typescript +import { + BaggageBuilder, + InvokeAgentScope, + InferenceScope, + ExecuteToolScope, + InferenceOperationType, +} from '@microsoft/opentelemetry'; +import type { + AgentDetails, + InferenceDetails, + InvokeAgentScopeDetails, + A365Request, + ToolCallDetails, +} from '@microsoft/opentelemetry'; +``` + +### InvokeAgentScope + +```typescript +import { + InvokeAgentScope, + InvokeAgentScopeDetails, + AgentDetails, + CallerDetails, + UserDetails, + Channel, + Request, + ServiceEndpoint, +} from '@microsoft/opentelemetry'; + +// Use the same agentDetails and request instances across all scopes in a request. +const agentDetails: AgentDetails = { + agentId: 'agent-456', + agentName: 'Email Assistant', + agentDescription: 'An AI agent powered by Azure OpenAI', + agentAUID: 'auid-123', // note: interface field is agentAUID (uppercase UID) + agentEmail: 'agent@contoso.com', + agentBlueprintId: 'blueprint-789', + tenantId: 'tenant-123', +}; + +const scopeDetails: InvokeAgentScopeDetails = { + endpoint: { host: 'myagent.contoso.com', port: 443 } as ServiceEndpoint, +}; + +const request: Request = { + content: 'Please help me organize my emails', + sessionId: 'session-42', + conversationId: 'conv-xyz', + channel: { name: 'msteams' } as Channel, +}; + +const callerDetails: CallerDetails = { + userDetails: { + userId: 'user-123', + userEmail: 'jane.doe@contoso.com', + userName: 'Jane Doe', + } as UserDetails, +}; + +const scope = InvokeAgentScope.start(request, scopeDetails, agentDetails, callerDetails); + +try { + await scope.withActiveSpanAsync(async () => { + scope.recordInputMessages(['Please help me organize my emails']); + + const response = await invokeAgent(request.content); + + scope.recordOutputMessages(['I found 15 urgent emails', 'Here is your organized inbox']); + }); +} catch (error) { + scope.recordError(error as Error); + throw error; +} finally { + scope.dispose(); +} +``` + +> **TIP:** For S2S autonomous agents, export `callerDetails` and 
`userDetails` from the entry +> point module so all scope files can import them alongside `agentDetails`. +> Read sponsor details from env vars: +> - `agent365Observability__sponsorUserId` (fallback: `clientId`) +> - `agent365Observability__sponsorUserName` (fallback: `agentName`) +> - `agent365Observability__sponsorUserEmail` + +#### InvokeAgentScope with ScopeUtils (hosting path — auto-populates from TurnContext) + +```typescript +import { InvokeAgentScopeDetails, AgentDetails, ServiceEndpoint } from '@microsoft/opentelemetry'; +import { ScopeUtils } from '@microsoft/agents-a365-observability-hosting'; + +const agentDetails: AgentDetails = { agentId: 'agent-456' }; +const scopeDetails: InvokeAgentScopeDetails = { + endpoint: { host: 'myagent.contoso.com', port: 443 } as ServiceEndpoint, +}; + +const scope = ScopeUtils.populateInvokeAgentScopeFromTurnContext( + agentDetails, + scopeDetails, + context, // TurnContext + authToken // authentication token string +); + +try { + await scope.withActiveSpanAsync(async () => { + const response = await invokeAgent(context.activity.text); + scope.recordOutputMessages([response]); + }); +} finally { + scope.dispose(); +} +``` + +### ExecuteToolScope + +```typescript +import { ExecuteToolScope, ToolCallDetails } from '@microsoft/opentelemetry'; + +// Use the same agentDetails, userDetails, and request instances from InvokeAgentScope above. 
+
+const toolDetails: ToolCallDetails = {
+  toolName: 'email-search',
+  arguments: JSON.stringify({ query: 'from:boss@company.com', limit: 10 }),
+  toolCallId: 'tool-call-456',
+  description: 'Search emails by criteria',
+  toolType: 'function',
+  endpoint: {
+    host: 'tools.contoso.com',
+    port: 8080,
+    protocol: 'https'
+  },
+};
+
+const scope = ExecuteToolScope.start(request, toolDetails, agentDetails, userDetails);
+
+try {
+  return await scope.withActiveSpanAsync(async () => {
+    const result = await searchEmails(toolDetails.arguments);
+    scope.recordResponse(result);
+    return result;
+  });
+} catch (error) {
+  scope.recordError(error as Error);
+  throw error;
+} finally {
+  scope.dispose();
+}
+```
+
+#### ExecuteToolScope with ScopeUtils
+
+```typescript
+import { ToolCallDetails } from '@microsoft/opentelemetry';
+import { ScopeUtils } from '@microsoft/agents-a365-observability-hosting';
+
+const toolDetails: ToolCallDetails = {
+  toolName: 'email-search',
+  arguments: JSON.stringify({ query: 'from:boss@company.com' }),
+  toolCallId: 'tool-call-456',
+  toolType: 'function',
+};
+
+const scope = ScopeUtils.populateExecuteToolScopeFromTurnContext(
+  toolDetails,
+  context, // TurnContext
+  authToken // authentication token string
+);
+
+try {
+  await scope.withActiveSpanAsync(async () => {
+    const result = await searchEmails(toolDetails.arguments);
+    scope.recordResponse(JSON.stringify(result));
+  });
+} finally {
+  scope.dispose();
+}
+```
+
+### InferenceScope
+
+#### Example
+
+```typescript
+// A365 Observability — best-effort instrumentation (verify against official sample)
+import {
+  InferenceScope,
+  InferenceOperationType,
+} from '@microsoft/opentelemetry';
+import type {
+  AgentDetails,
+  InferenceDetails,
+  Request,
+  UserDetails,
+} from '@microsoft/opentelemetry';
+
+const inferenceDetails: InferenceDetails = {
+  operationName: InferenceOperationType.CHAT,
+  model: 'gpt-4o-mini',
+};
+
+const request: Request = {
+  conversationId: context.activity?.conversation?.id || `conv-${Date.now()}`,
+};
+
+const agentDetails: AgentDetails = {
+  agentId: context.activity?.recipient?.agenticAppId || agentName,
+  agentName,
+  tenantId: context.activity?.recipient?.tenantId || 'sample-tenant',
+};
+
+const userDetails: UserDetails = {
+  userId: process.env.agent365Observability__sponsorUserId || context.activity?.from?.id || 'blueprint-app-id',
+  userName: process.env.agent365Observability__sponsorUserName || context.activity?.from?.name || agentName,
+  userEmail: process.env.agent365Observability__sponsorUserEmail || '',
+};
+
+let response = '';
+const scope = InferenceScope.start(request, inferenceDetails, agentDetails, userDetails);
+try {
+  await scope.withActiveSpanAsync(async () => {
+    response = await invokeAgent(prompt);
+    scope.recordOutputMessages([response]);
+    scope.recordInputMessages([prompt]);
+    scope.recordInputTokens(45);
+    scope.recordOutputTokens(78);
+    scope.recordFinishReasons(['stop']);
+  });
+} catch (error) {
+  scope.recordError(error as Error);
+  throw error;
+} finally {
+  scope.dispose();
+}
+```
+
+#### InferenceScope with ScopeUtils
+
+```typescript
+import { InferenceDetails, InferenceOperationType } from '@microsoft/opentelemetry';
+import { ScopeUtils } from '@microsoft/agents-a365-observability-hosting';
+
+const inferenceDetails: InferenceDetails = {
+  operationName: InferenceOperationType.CHAT,
+  model: 'gpt-4o-mini',
+  providerName: 'azure-openai',
+};
+
+const scope = ScopeUtils.populateInferenceScopeFromTurnContext(
+  inferenceDetails,
+  context, // TurnContext
+  authToken // authentication token string
+);
+
+try {
+  await scope.withActiveSpanAsync(async () => {
+    const response = await callLLM();
+    scope.recordOutputMessages([response.text]);
+    scope.recordInputTokens(response.usage.inputTokens);
+    scope.recordOutputTokens(response.usage.outputTokens);
+  });
+} finally {
+  scope.dispose();
+}
+```
+
+### OutputScope (async scenarios)
+
+```typescript
+import { OutputScope, OutputResponse, SpanDetails } from '@microsoft/opentelemetry';
+
+// Use the same agentDetails, userDetails, and request instances from InvokeAgentScope above.
+
+// Get the parent context from the originating scope
+const parentContext = invokeScope.getSpanContext();
+
+const response: OutputResponse = {
+  messages: ['Here is your organized inbox with 15 urgent emails.'],
+};
+
+const scope = OutputScope.start(
+  request,
+  response,
+  agentDetails,
+  userDetails,
+  { parentContext } as SpanDetails
+);
+
+// Output messages are recorded automatically from the response
+scope.dispose();
+```
+
+---
+
+## Advanced: Custom Token Resolver
+
+```typescript
+import { useMicrosoftOpenTelemetry } from '@microsoft/opentelemetry';
+import { AgenticTokenCacheInstance } from '@microsoft/agents-a365-observability-hosting';
+import { tokenResolver } from './token-cache'; // your custom resolver
+
+useMicrosoftOpenTelemetry({
+  a365: {
+    enabled: true,
+    tokenResolver:
+      process.env.Use_Custom_Resolver === 'true'
+        ? (agentId: string, tenantId: string) => tokenResolver(agentId, tenantId) ?? ''
+        : (agentId: string, tenantId: string) =>
+            AgenticTokenCacheInstance.getObservabilityToken(agentId, tenantId) ?? '',
+  },
+});
+```
+
+---
+
+## Auto-Instrumentation Extensions
+
+### OpenAI Agents SDK
+
+> **Peer dependency:** `@microsoft/agents-a365-observability-extensions-openai` requires
+> `@openai/agents ^0.7.0` (the **OpenAI Agents SDK**) — this is NOT the `openai` npm package
+> and NOT `@azure/openai`. Install the peer dep first:
+> ```bash
+> npm install @openai/agents@^0.7.0
+> npm install @microsoft/agents-a365-observability-extensions-openai
+> ```
+
+```typescript
+import { OpenAIAgentsTraceInstrumentor } from '@microsoft/agents-a365-observability-extensions-openai';
+
+// Assumes useMicrosoftOpenTelemetry(...) already ran in your entry point.
+const instrumentor = new OpenAIAgentsTraceInstrumentor({
+  enabled: true,
+  tracerName: 'openai-agents-tracer',
+  tracerVersion: '1.0.0'
+});
+
+instrumentor.enable();
+```
+
+### LangChain
+
+> **IMPORTANT:** `LangChainTraceInstrumentor.instrument()` requires `ObservabilityManager` to be
+> fully initialized first. Calling it **after** `useMicrosoftOpenTelemetry()` as a separate
+> statement will throw `"ObservabilityManager is not configured yet"` if `a365.enabled` is `true`.
+>
+> **Preferred approach:** Use `instrumentationOptions: { langchain: {} }` inside the
+> `useMicrosoftOpenTelemetry()` call. This ensures correct initialization order:
+>
+> ```typescript
+> useMicrosoftOpenTelemetry({
+>   a365: { enabled: true, tokenResolver: ... },
+>   instrumentationOptions: {
+>     langchain: {},
+>   },
+> });
+> ```
+>
+> **Alternative (conditional):** If you must call `instrument()` separately, guard it:
+> ```typescript
+> if (process.env.ENABLE_A365_OBSERVABILITY_EXPORTER === 'true') {
+>   LangChainTraceInstrumentor.instrument(LangChainCallbacks);
+> }
+> ```
+
+```typescript
+import { LangChainTraceInstrumentor } from '@microsoft/agents-a365-observability-extensions-langchain';
+import * as LangChainCallbacks from '@langchain/core/callbacks/manager';
+
+// Assumes useMicrosoftOpenTelemetry(...) already ran in your entry point.
+LangChainTraceInstrumentor.instrument(LangChainCallbacks);
+```
+
+---
+
+## .env Variables
+
+> **Note:** If you ran `a365 setup`, `ENABLE_A365_OBSERVABILITY_EXPORTER=false` is **already
+> present** in your `.env` file. Preserve this value when instrumenting.
+
+```dotenv
+# ── A365 Observability ────────────────────────────────────────────────────────
+# Set to true to export to Microsoft Admin Center (production only).
+# a365 setup automatically adds this with value "false".
+ENABLE_A365_OBSERVABILITY_EXPORTER=false
+
+# Shown in Microsoft Admin Center observability dashboard.
+SERVICE_NAME=my-agent
+
+# Log level: pipe-separated list of levels to emit.
+A365_OBSERVABILITY_LOG_LEVEL=info|warn|error
+
+# Set to true to use a custom token resolver instead of AgenticTokenCacheInstance.
+# Default: false (use built-in cache). Set to true for local testing with custom auth.
+Use_Custom_Resolver=false
+
+# Sponsor / CallerDetails for MAC portal trace visibility (S2S / autonomous agents).
+agent365Observability__sponsorUserId=<>
+agent365Observability__sponsorUserName=<>
+agent365Observability__sponsorUserEmail=<>
+# ─────────────────────────────────────────────────────────────────────────────
+```
+
+| Variable | Local | Production |
+|---|---|---|
+| `ENABLE_A365_OBSERVABILITY_EXPORTER` | `false` | `true` |
+| `Use_Custom_Resolver` | `true` (optional) | `false` |
+| `agent365Observability__sponsorUserId` | `<>` | `<>` |
+| `agent365Observability__sponsorUserName` | `<>` | `<>` |
+| `agent365Observability__sponsorUserEmail` | `<>` | `<>` |
+| `NODE_ENV` | `development` | `production` |
+
+---
+
+## Validate Locally
+
+Set `ENABLE_A365_OBSERVABILITY_EXPORTER=false` — spans export to the console.
+
+To investigate export failures, enable verbose logging:
+
+```bash
+ENABLE_A365_OBSERVABILITY_EXPORTER=true
+A365_OBSERVABILITY_LOG_LEVEL='info|warn|error'
+```
+
+Key console messages:
+
+```text
+[INFO] [Agent365Exporter] Exporting 245 spans
+[INFO] [Agent365Exporter] Partitioned into 3 identity groups (2 spans skipped)
+[INFO] [Agent365Exporter] Token resolved successfully via tokenResolver
+[EVENT] export-group succeeded in 98ms {"tenantId":"...","agentId":"...","correlationId":"abc-123"}
+[ERROR] [Agent365Exporter] Failed with status 401, correlation ID: abc-123
+[WARN] export-partition-span-missing-identity: 5 spans skipped due to missing tenant or agent ID
+```
+
+Custom logger for capturing export events to a file:
+
+```typescript
+import { setLogger, ExporterEventNames } from '@microsoft/agents-a365-observability';
+
+setLogger({
+  info: (msg, ...args) => myLogger.info(msg, ...args),
+  warn: (msg, ...args) => myLogger.warn(msg, ...args),
+  error: (msg, ...args) => myLogger.error(msg, ...args),
+  event: (eventType: ExporterEventNames, isSuccess: boolean, durationMs: number,
+          message?: string, details?: Record<string, unknown>) => {
+    myLogger.info({ eventType, isSuccess, durationMs, message, ...details });
+  }
+});
+```
+
+---
+
+## Key API Surface
+
+| Symbol | Module | Purpose |
+|--------|--------|---------|
+| `useMicrosoftOpenTelemetry(options)` | `@microsoft/opentelemetry` | Configure the OTel pipeline with the A365 exporter |
+| `shutdownMicrosoftOpenTelemetry()` | `@microsoft/opentelemetry` | Graceful shutdown of the OTel provider |
+| `tokenResolver` | `./observability/token-cache` | Returns cached token for the A365 exporter |
+| `startTokenService(config)` | `./observability/observability-token-service` | Background MSAL FMI token acquisition |
+| `BaggageBuilder` | `@microsoft/opentelemetry` | Fluent builder for tenant/agent/correlation baggage |
+| `BaggageBuilderUtils.fromTurnContext(builder, ctx)` | `@microsoft/agents-a365-observability-hosting` | Populates baggage from a `TurnContext` automatically |
+| `BaggageMiddleware` | `@microsoft/agents-a365-observability-hosting` | Adapter middleware — auto-populates baggage for every request |
+| `ObservabilityHostingManager` | `@microsoft/agents-a365-observability-hosting` | Composite hosting configuration |
+| `ScopeUtils.populateInvokeAgentScopeFromTurnContext` | `@microsoft/agents-a365-observability-hosting` | Creates `InvokeAgentScope` from `TurnContext` |
+| `ScopeUtils.populateExecuteToolScopeFromTurnContext` | `@microsoft/agents-a365-observability-hosting` | Creates `ExecuteToolScope` from `TurnContext` |
+| `ScopeUtils.populateInferenceScopeFromTurnContext` | `@microsoft/agents-a365-observability-hosting` | Creates `InferenceScope` from `TurnContext` |
+| `AgenticTokenCacheInstance.getObservabilityToken(agentId, tenantId)` | `@microsoft/agents-a365-observability-hosting` | Retrieve cached observability token |
+| `AgenticTokenCacheInstance.RefreshObservabilityToken(...)` | `@microsoft/agents-a365-observability-hosting` | Refresh and cache token for the current turn |
+| `getObservabilityAuthenticationScope()` | `@microsoft/agents-a365-runtime` | Returns the OAuth2 scope string for the observability API. **Deprecated** in v0.2.0-preview.5 — still functional; modern replacement is `defaultObservabilityConfigurationProvider.getConfiguration().observabilityAuthenticationScopes` |
+| `InvokeAgentScope.start(request, scopeDetails, agentDetails, callerDetails)` | `@microsoft/opentelemetry` | Start agent invocation telemetry scope |
+| `ExecuteToolScope.start(request, toolDetails, agentDetails, userDetails)` | `@microsoft/opentelemetry` | Start tool execution telemetry scope |
+| `InferenceScope.start(request, inferenceDetails, agentDetails, userDetails)` | `@microsoft/opentelemetry` | Start LLM inference telemetry scope |
+| `OutputScope.start(request, response, agentDetails, userDetails, spanDetails)` | `@microsoft/opentelemetry` | Start output telemetry scope (async scenarios) |
+| `setLogger(logger)` | `@microsoft/agents-a365-observability` | Optional custom exporter logger |
+| `ExporterEventNames` | `@microsoft/agents-a365-observability` | Event names emitted by the exporter logger |
+| `scope.withActiveSpanAsync(fn)` | — | Execute async work within the active OTel span |
+| `scope.recordInputMessages(msgs)` / `scope.recordOutputMessages(msgs)` | — | Record prompts and completions |
+| `scope.recordInputTokens(n)` / `scope.recordOutputTokens(n)` | — | Record token counts |
+| `scope.recordFinishReasons(reasons)` | — | Record finish reasons (e.g. `['stop']`) |
+| `scope.recordError(error)` | — | Record an error on the span |
+| `scope.dispose()` | — | End and export the span (call in `finally`) |
+
+---
+
+## Troubleshooting
+
+| Symptom | Cause | Fix |
+|---------|-------|-----|
+| No console traces | `useMicrosoftOpenTelemetry()` not initialized early enough | Call it in the entry point before importing LLM or agent modules |
+| Spans missing baggage | Handler not wrapped in baggage scope | Register `BaggageMiddleware` or wrap handler body in `baggageScope.run()` |
+| Token resolver always returns `''` | `RefreshObservabilityToken` not called per turn | Call it at the start of each message handler turn |
+| `Cannot find module '@microsoft/agents-a365-observability'` | Package not installed | Run `npm install @microsoft/agents-a365-observability` |
+| `Cannot find module '@microsoft/agents-a365-observability-hosting'` | Package not installed | Run `npm install @microsoft/agents-a365-observability-hosting` |
+| Traces not in Admin Center | Exporter env var not set | Set `ENABLE_A365_OBSERVABILITY_EXPORTER=true` in production |
+| 401 on export | Missing permission | Check if upgrading past `0.2.0-preview.1` (requires new `Agent365.Observability.OtelWrite` permission) |
+| Spans dropped silently | Missing tenant/agent ID | Ensure `BaggageBuilder` (or `BaggageMiddleware`) populates tenant/agent ID before creating spans |
+| TypeScript error on `agentAuid` in `AgentDetails` | Interface field is `agentAUID` (uppercase UID), not `agentAuid` | Change to `agentAUID: '...'` |
+| `extensions-openai` install fails / peer dep error | Missing `@openai/agents` peer dep | Run `npm install @openai/agents@^0.7.0` first; this is the OpenAI Agents SDK, not the `openai` package |
+| S2S: AADSTS82001 or AADSTS1002012 | Direct MSAL client credentials not supported | Use the 3-hop FMI chain: Blueprint → FMI path → Agent Identity → Observability API token. |
+| S2S: 401 on export | Token scope mismatch | Ensure Hop 3 scope is `api://9b975845-388f-4429-889e-eab1ef63949c/.default`. Also ensure Agent Identity SP has OtelWrite role assigned |
+| S2S: 403 on `observabilityService/` endpoint | Missing app role | Assign `Agent365.Observability.OtelWrite` to the **Agent Identity** SP (not just the Blueprint) via Graph API |
+| S2S: MSI fails locally | No Managed Identity in dev | Set `AGENT365_USE_MANAGED_IDENTITY=false` and provide `AGENT365_CLIENT_SECRET` |
+| S2S: token resolver never called | `RefreshObservabilityToken` called for S2S | Remove `AgenticTokenCacheInstance.RefreshObservabilityToken` — not used in S2S; token comes from `a365.tokenResolver` in `useMicrosoftOpenTelemetry(...)` |
+| `fromTurnContext` not found on `BaggageBuilder` | Static method is on `BaggageBuilderUtils`, not `BaggageBuilder` | Use `BaggageBuilderUtils.fromTurnContext(new BaggageBuilder(), context)` |
diff --git a/docs/agent365-guided-setup/references/python-observability.md b/docs/agent365-guided-setup/references/python-observability.md
new file mode 100644
index 00000000..3f7e4f58
--- /dev/null
+++ b/docs/agent365-guided-setup/references/python-observability.md
@@ -0,0 +1,875 @@
+# Python — A365 Observability Reference
+
+Authoritative package versions and code patterns for instrumenting A365 observability
+into a Python agent. All samples mirror the official Microsoft Learn docs (updated 2026-04-30).
+
+---
+
+## pip Packages
+
+| Package | Purpose |
+|---------|---------|
+| `microsoft-opentelemetry` | Unified distro entry point: `use_microsoft_opentelemetry()`, all scope types from `microsoft.opentelemetry.a365.core`, hosting helpers from `microsoft.opentelemetry.a365.hosting`, and OBO/S2S exporter wiring |
+| `msal` | MSAL Python `ConfidentialClientApplication` for Hop 3 token acquisition (Hop 1+2 uses direct HTTP POST — see known issue below) |
+| `azure-identity` | `ManagedIdentityCredential` for MSI-based token acquisition (async variant) |
+| `httpx` | Direct HTTP POST for FMI Hop 1+2 token acquisition (MSAL `fmi_path` workaround) |
+
+Install commands:
+```bash
+pip3 install microsoft-opentelemetry 2>/dev/null || pip install microsoft-opentelemetry
+pip3 install msal azure-identity httpx 2>/dev/null || pip install msal azure-identity httpx
+```
+
+---
+
+## Entry Point — Observability Init
+
+### Unified Distro
+
+```python
+# A365 Observability — best-effort instrumentation (verify against official sample)
+from microsoft.opentelemetry import use_microsoft_opentelemetry
+from token_cache import get_cached_agentic_token
+
+use_microsoft_opentelemetry(
+    enable_a365=True,
+    enable_azure_monitor=False,
+    a365_token_resolver=lambda agent_id, tenant_id: get_cached_agentic_token(
+        tenant_id, agent_id
+    ),
+)
+```
+
+This matches the current official sample: initialize the unified distro once at startup,
+then refresh the per-turn OBO token in your message handler.
+
+### S2S configuration (`authMode: S2S`)
+
+S2S observability is supported for Python. The token service uses a **3-hop FMI (Federated Managed Identity) token chain**:
+
+```
+Blueprint (client_credentials / MSI)
+  → Hop 1+2: FMI token (api://AzureADTokenExchange/.default with fmi_path=agentId)
+    → Agent Identity token
+      → Hop 3: Observability API token (scope=api://9b975845-388f-4429-889e-eab1ef63949c/.default)
+```
+
+No OBO user token is required.
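The Hop 1+2 request in the chain above is an ordinary OAuth2 client-credentials POST with one extra form field, `fmi_path`. The following is a minimal sketch of that form body, assuming the field names used by this reference's token service. `build_fmi_form` and `token_url` are hypothetical helper names for illustration, not SDK functions, and nothing is sent over the network.

```python
FMI_SCOPE = "api://AzureADTokenExchange/.default"
JWT_BEARER = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"


def token_url(tenant_id):
    """Token endpoint for the tenant (v2.0 endpoint, as used in this reference)."""
    return f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"


def build_fmi_form(blueprint_client_id, agent_id, client_secret=None, client_assertion=None):
    """Hypothetical helper: build the Hop 1+2 form fields.

    Exactly one credential is expected: a client secret (local dev) or a
    client assertion such as a Managed Identity token (production).
    """
    if (client_secret is None) == (client_assertion is None):
        raise ValueError("Provide exactly one of client_secret or client_assertion")

    form = {
        "grant_type": "client_credentials",
        "client_id": blueprint_client_id,
        "scope": FMI_SCOPE,
        "fmi_path": agent_id,  # selects the Agent Identity the token is issued for
    }
    if client_secret is not None:
        form["client_secret"] = client_secret
    else:
        form["client_assertion_type"] = JWT_BEARER
        form["client_assertion"] = client_assertion
    return form


local_form = build_fmi_form("blueprint-id", "agent-456", client_secret="dev-secret")
prod_form = build_fmi_form("blueprint-id", "agent-456", client_assertion="msi-token")
print(sorted(local_form))
print(sorted(prod_form))
```

Posting either dictionary as form data to `token_url(tenant_id)` would perform Hop 1+2; the access token in the response then serves as the client assertion for Hop 3.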
+
+> **Auth strategy** is controlled by `AGENT365_USE_MANAGED_IDENTITY`:
+> - `true` (production) — MSI → Blueprint FIC → Agent Identity → API
+> - `false` (local dev) — Client Secret → Blueprint FIC → Agent Identity → API
+
+> **⚠️ Known Issue (msal v1.34.0):** Python MSAL does NOT properly support `fmi_path` as a parameter to `acquire_token_for_client()`. Passing it causes `TypeError: Session.request() got an unexpected keyword argument 'fmi_path'`. Use **direct HTTP POST** to the token endpoint with `fmi_path` as a form parameter for Hop 1+2 (same workaround as Node.js). MSAL is fine for Hop 3 (no `fmi_path` needed).
+
+> **Note:** As of CLI 1.1, `a365 setup all` automatically grants `Agent365.Observability.OtelWrite` to the Agent Identity SP (both delegated and application). No manual role assignment is needed for newly provisioned agents.
+
+#### Step 1 — Create `observability/token_cache.py`
+
+Simple in-memory token cache shared by the token service and the OTel exporter:
+
+```python
+# observability/token_cache.py
+# A365 Observability — best-effort instrumentation (verify against official sample)
+
+"""Simple in-memory token cache for observability tokens."""
+
+import threading
+from datetime import datetime, timedelta, timezone
+
+_lock = threading.Lock()
+_cache: dict[str, tuple[str, datetime]] = {}
+
+# Tokens are considered valid if they expire more than 5 minutes from now.
+_EXPIRY_BUFFER = timedelta(minutes=5)
+
+
+def cache_token(agent_id: str, tenant_id: str, token: str, expires_in: timedelta = timedelta(hours=1)) -> None:
+    """Cache an observability token for a specific agent/tenant pair."""
+    key = f"{agent_id}:{tenant_id}"
+    expires_at = datetime.now(timezone.utc) + expires_in
+    with _lock:
+        _cache[key] = (token, expires_at)
+
+
+def get_cached_token(agent_id: str, tenant_id: str) -> str | None:
+    """Retrieve a cached token if it exists and hasn't expired."""
+    key = f"{agent_id}:{tenant_id}"
+    with _lock:
+        entry = _cache.get(key)
+        if entry is None:
+            return None
+        token, expires_at = entry
+        if datetime.now(timezone.utc) + _EXPIRY_BUFFER >= expires_at:
+            del _cache[key]
+            return None
+        return token
+```
+
+#### Step 2 — Create `observability/observability_token_service.py`
+
+Background token acquisition via 3-hop FMI chain (direct HTTP POST for Hop 1+2, MSAL for Hop 3):
+
+```python
+# observability/observability_token_service.py
+# A365 Observability — best-effort instrumentation (verify against official sample)
+# A365 auth mode: S2S — 3-hop FMI token chain (direct HTTP POST + MSAL)
+# Hop 1+2: Blueprint (MSI or client secret) → T1 via token endpoint POST + fmi_path → Agent Identity
+# Hop 3: Agent Identity uses T1 as assertion → Observability API token
+
+import asyncio
+import logging
+from datetime import timedelta
+
+import httpx
+import msal
+
+from observability import token_cache
+
+logger = logging.getLogger(__name__)
+
+FMI_SCOPE = "api://AzureADTokenExchange/.default"
+OBSERVABILITY_SCOPES = ["api://9b975845-388f-4429-889e-eab1ef63949c/.default"]
+REFRESH_INTERVAL_SECONDS = 50 * 60  # 50 minutes
+
+
+async def acquire_initial_token(
+    tenant_id: str,
+    agent_id: str,
+    blueprint_client_id: str,
+    blueprint_client_secret: str,
+    use_managed_identity: bool,
+) -> None:
+    """Acquire the first observability token before background services start."""
+    await _acquire_and_register_token(
+        tenant_id, agent_id, blueprint_client_id, blueprint_client_secret, use_managed_identity
+    )
+
+
+async def run_token_service(
+    tenant_id: str,
+    agent_id: str,
+    blueprint_client_id: str,
+    blueprint_client_secret: str,
+    use_managed_identity: bool,
+) -> None:
+    """Run the background token acquisition loop."""
+    logger.info("ObservabilityTokenService started (use_managed_identity=%s).", use_managed_identity)
+
+    while True:
+        try:
+            await _acquire_and_register_token(
+                tenant_id, agent_id, blueprint_client_id, blueprint_client_secret, use_managed_identity
+            )
+        except asyncio.CancelledError:
+            raise
+        except Exception:
+            logger.warning(
+                "Failed to acquire observability token; will retry in %d seconds.",
+                REFRESH_INTERVAL_SECONDS,
+                exc_info=True,
+            )
+
+        await asyncio.sleep(REFRESH_INTERVAL_SECONDS)
+
+
+async def _acquire_and_register_token(
+    tenant_id: str,
+    agent_id: str,
+    blueprint_client_id: str,
+    blueprint_client_secret: str,
+    use_managed_identity: bool,
+) -> None:
+    authority = f"https://login.microsoftonline.com/{tenant_id}"
+    token_url = f"{authority}/oauth2/v2.0/token"
+
+    # Hop 1+2: Blueprint → T1 via FMI path
+    if use_managed_identity:
+        t1_token = await _acquire_t1_via_msi(token_url, blueprint_client_id, agent_id)
+    else:
+        t1_token = await _acquire_t1_via_client_secret(
+            token_url, blueprint_client_id, blueprint_client_secret, agent_id
+        )
+
+    # Hop 3: Agent Identity uses T1 → Observability API token
+    identity_app = msal.ConfidentialClientApplication(
+        client_id=agent_id,
+        client_credential={"client_assertion": t1_token},
+        authority=authority,
+    )
+    obs_result = identity_app.acquire_token_for_client(scopes=OBSERVABILITY_SCOPES)
+
+    if "access_token" not in obs_result:
+        raise RuntimeError(f"Failed to acquire observability token: {obs_result.get('error_description', obs_result)}")
+
+    token_cache.cache_token(agent_id, tenant_id, obs_result["access_token"], expires_in=timedelta(minutes=55))
+    logger.info("Observability token registered for agent %s.", agent_id)
+
+
+async def _acquire_t1_via_msi(token_url: str, blueprint_client_id: str, agent_id: str) -> str:
+    """Acquire T1 token using Managed Identity (production) — direct HTTP POST."""
+    from azure.identity.aio import ManagedIdentityCredential
+
+    async with ManagedIdentityCredential() as credential:
+        msi_token = await credential.get_token("api://AzureADTokenExchange")
+
+    async with httpx.AsyncClient() as client:
+        resp = await client.post(
+            token_url,
+            data={
+                "grant_type": "client_credentials",
+                "client_id": blueprint_client_id,
+                "client_assertion_type": "urn:ietf:params:oauth:client-assertion-type:jwt-bearer",
+                "client_assertion": msi_token.token,
+                "scope": FMI_SCOPE,
+                "fmi_path": agent_id,
+            },
+        )
+        result = resp.json()
+
+    if "access_token" not in result:
+        raise RuntimeError(f"FMI T1 via MSI failed: {result.get('error_description', result)}")
+    return result["access_token"]
+
+
+async def _acquire_t1_via_client_secret(
+    token_url: str, blueprint_client_id: str, blueprint_client_secret: str, agent_id: str
+) -> str:
+    """Acquire T1 token using client secret (local dev) — direct HTTP POST with fmi_path."""
+    async with httpx.AsyncClient() as client:
+        resp = await client.post(
+            token_url,
+            data={
+                "grant_type": "client_credentials",
+                "client_id": blueprint_client_id,
+                "client_secret": blueprint_client_secret,
+                "scope": FMI_SCOPE,
+                "fmi_path": agent_id,
+            },
+        )
+        result = resp.json()
+
+    if "access_token" not in result:
+        raise RuntimeError(f"FMI T1 via client secret failed: {result.get('error_description', result)}")
+    return result["access_token"]
+```
+
+#### Step 3 — Wire in entry point (`main.py` or `app.py`)
+
+```python
+# authMode: S2S — 3-hop FMI token chain via direct HTTP POST + MSAL, no user OBO.
+import asyncio
+import logging
+import os
+
+from dotenv import load_dotenv
+from aiohttp import web
+
+from microsoft.opentelemetry import use_microsoft_opentelemetry
+from microsoft.opentelemetry.a365.core import AgentDetails
+
+from observability import token_cache
+from observability.observability_token_service import acquire_initial_token, run_token_service
+
+load_dotenv()
+
+# ── Configuration ────────────────────────────────────────────────────────────
+TENANT_ID = os.environ.get("AGENT365_TENANT_ID", "")
+AGENT_ID = os.environ.get("AGENT365_AGENT_ID", "")
+BLUEPRINT_ID = os.environ.get("AGENT365_BLUEPRINT_ID", "")
+CLIENT_ID = os.environ.get("AGENT365_CLIENT_ID", "")
+CLIENT_SECRET = os.environ.get("AGENT365_CLIENT_SECRET", "")
+AGENT_NAME = os.environ.get("AGENT365_AGENT_NAME", "my-agent")
+AGENT_DESCRIPTION = os.environ.get("AGENT365_AGENT_DESCRIPTION", "")
+USE_MANAGED_IDENTITY = os.environ.get("AGENT365_USE_MANAGED_IDENTITY", "true").lower() == "true"
+
+def _has_a365_credentials() -> bool:
+    required_values = [TENANT_ID, AGENT_ID, CLIENT_ID]
+    if not all(v and not v.startswith("<<") for v in required_values):
+        return False
+    if USE_MANAGED_IDENTITY:
+        return True
+    return bool(CLIENT_SECRET) and not CLIENT_SECRET.startswith("<<")
+
+A365_ENABLED = _has_a365_credentials()
+
+# ── Agent Details ────────────────────────────────────────────────────────────
+agent_details = AgentDetails(
+    agent_id=AGENT_ID or "local-dev",
+    agent_name=AGENT_NAME,
+    agent_description=AGENT_DESCRIPTION,
+    agent_blueprint_id=BLUEPRINT_ID,
+    tenant_id=TENANT_ID or "local-dev",
+)
+
+# ── Microsoft OpenTelemetry Distro ───────────────────────────────────────────
+use_microsoft_opentelemetry(
+    enable_a365=True,
+    enable_azure_monitor=False,
+    enable_console=True,  # disable in production
+    a365_use_s2s_endpoint=True,  # CRITICAL for S2S — posts to /observabilityService/
+    a365_enable_observability_exporter=True,
+    a365_token_resolver=lambda aid, tid: token_cache.get_cached_token(aid, tid) or "",
+)
+
+# ── Background Tasks ─────────────────────────────────────────────────────────
+async def start_background_tasks(app: web.Application) -> None:
+    if A365_ENABLED:
+        try:
+            await acquire_initial_token(
+                tenant_id=TENANT_ID,
+                agent_id=AGENT_ID,
+                blueprint_client_id=CLIENT_ID,
+                blueprint_client_secret=CLIENT_SECRET,
+                use_managed_identity=USE_MANAGED_IDENTITY,
+            )
+        except Exception:
+            logging.warning("Initial token acquisition failed; continuing with background refresh.", exc_info=True)
+
+        app["token_task"] = asyncio.create_task(
+            run_token_service(
+                tenant_id=TENANT_ID,
+                agent_id=AGENT_ID,
+                blueprint_client_id=CLIENT_ID,
+                blueprint_client_secret=CLIENT_SECRET,
+                use_managed_identity=USE_MANAGED_IDENTITY,
+            )
+        )
+    else:
+        logging.warning(
+            "Agent365 credentials not configured — skipping token service. "
+            "Run 'a365 setup all' to enable A365 observability export."
+        )
+
+    # ... rest of background task startup ...
+```
+
+> **⚠️ `a365_use_s2s_endpoint=True` is required for S2S agents.** Without it, the exporter posts to `/observability/` (OBO endpoint) instead of `/observabilityService/` (S2S endpoint), causing 401 errors. The Python SDK uniquely supports this as a native kwarg — no custom `spanProcessors` workaround needed (unlike Node.js).
+
+#### S2S environment variables
+
+```dotenv
+# Agent 365 Observability — S2S
+AGENT365_TENANT_ID=
+AGENT365_AGENT_ID=
+AGENT365_BLUEPRINT_ID=
+AGENT365_CLIENT_ID=
+AGENT365_CLIENT_SECRET=
+AGENT365_AGENT_NAME=my-agent
+AGENT365_AGENT_DESCRIPTION=
+AGENT365_USE_MANAGED_IDENTITY=true
+
+# Sponsor identity for CallerDetails (MAC portal visibility)
+AGENT365_SPONSOR_USER_ID=
+AGENT365_SPONSOR_USER_EMAIL=
+AGENT365_SPONSOR_USER_NAME=
+```
+
+Message handler baggage setup is **identical** to `user-delegated` / `agentic-identity` — only the token resolver and credential source differ. Do **not** use the OBO per-turn token-registration flow for S2S agents.
+
+### Hosting path — OBO token cache (AI Teammate agents)
+
+#### Unified Distro
+
+```python
+# A365 Observability — best-effort instrumentation (verify against official sample)
+from microsoft.opentelemetry import use_microsoft_opentelemetry
+from token_cache import cache_agentic_token, get_cached_agentic_token
+
+use_microsoft_opentelemetry(
+    enable_a365=True,
+    a365_token_resolver=lambda agent_id, tenant_id: get_cached_agentic_token(
+        tenant_id, agent_id
+    ),
+)
+```
+
+```python
+# token_cache.py
+# A365 Observability — best-effort instrumentation (verify against official sample)
+
+"""Token caching utilities for Agent 365 Observability exporter authentication."""
+
+import logging
+
+logger = logging.getLogger(__name__)
+
+_agentic_token_cache = {}
+
+
+def cache_agentic_token(tenant_id: str, agent_id: str, token: str) -> None:
+    """Cache the agentic token for use by Agent 365 Observability exporter."""
+    key = f"{tenant_id}:{agent_id}"
+    _agentic_token_cache[key] = token
+    logger.debug(f"Cached agentic token for {key}")
+
+
+def get_cached_agentic_token(tenant_id: str, agent_id: str) -> str | None:
+    """Retrieve cached agentic token for Agent 365 Observability exporter."""
+    key = f"{tenant_id}:{agent_id}"
+    return _agentic_token_cache.get(key)
+```
+
+#### Alternative: AgenticTokenCache helper
+
+```python
+from microsoft.opentelemetry import use_microsoft_opentelemetry
+from microsoft.opentelemetry.a365.hosting.token_cache_helpers import AgenticTokenCache
+
+token_cache = AgenticTokenCache()
+
+use_microsoft_opentelemetry(
+    enable_a365=True,
+    a365_token_resolver=token_cache.get_observability_token,
+)
+```
+
+---
+
+## Adapter — Hosting Baggage
+
+Register hosting baggage helpers to auto-populate baggage from every incoming `TurnContext`.
+This removes the need to call `BaggageBuilder` manually in each activity handler.
+
+### Unified Distro
+
+```python
+from microsoft.opentelemetry.a365.hosting import (
+    ObservabilityHostingManager,
+    ObservabilityHostingOptions,
+)
+
+ObservabilityHostingManager.configure(
+    adapter.middleware_set,
+    ObservabilityHostingOptions(enable_baggage=True),
+)
+```
+
+Use these import paths when you need manual baggage wiring too:
+
+```python
+from microsoft.opentelemetry.a365.core import BaggageBuilder, InvokeAgentScope
+from microsoft.opentelemetry.a365.hosting.scope_helpers.populate_baggage import populate
+```
+
+---
+
+## Message Handler — Token Refresh + BaggageBuilder
+
+### Unified Distro
+
+```python
+# A365 Observability — best-effort instrumentation (verify against official sample)
+from microsoft.opentelemetry.a365.core import BaggageBuilder
+from microsoft.opentelemetry.a365.hosting.scope_helpers.populate_baggage import populate
+from microsoft.opentelemetry.a365.runtime import get_observability_authentication_scope
+from token_cache import cache_agentic_token
+
+async def _setup_observability_token(self, context: TurnContext, tenant_id: str, agent_id: str):
+    try:
+        exaau_token = await self.agent_app.auth.exchange_token(
+            context,
+            scopes=get_observability_authentication_scope(),
+            auth_handler_id=self.auth_handler_name,
+        )
+        cache_agentic_token(tenant_id, agent_id, exaau_token.token)
+    except Exception as e:
+        logger.warning(f"Failed to cache observability token: {e}")
+
+
+@AGENT_APP.activity("message", auth_handlers=["AGENTIC"])
+async def on_message(context: TurnContext, state: TurnState):
+    tenant_id = context.activity.recipient.tenant_id
+    agent_id = context.activity.recipient.agentic_app_id
+
+    await self._setup_observability_token(context, tenant_id, agent_id)
+
+    builder = BaggageBuilder()
+    populate(builder, context)
+
+    with builder.build():
+        # ... your agent message handling logic ...
+        pass
+```
+
+Manual `BaggageBuilder` (without the `populate()` helper):
+
+```python
+from microsoft.opentelemetry.a365.core import BaggageBuilder
+
+with (
+    BaggageBuilder()
+    .tenant_id("tenant-123")
+    .agent_id("agent-456")
+    .conversation_id("conv-789")
+    .build()
+):
+    # Any spans started in this context will receive these as attributes
+    pass
+```
+
+---
+
+## Manual Instrumentation Scopes
+
+> **Store publishing requirement:** `InvokeAgentScope`, `InferenceScope`, and `ExecuteToolScope`
+> are **required** for store validation. Missing any one causes store validation failure.
+
+> **Import source:** Use the unified distro import path: `from microsoft.opentelemetry.a365.core import ...`.
+
+```python
+from microsoft.opentelemetry.a365.core import (
+    AgentDetails,
+    BaggageBuilder,
+    InferenceCallDetails,
+    InferenceOperationType,
+    InferenceScope,
+    InvokeAgentScope,
+    InvokeAgentScopeDetails,
+    ExecuteToolScope,
+    ToolCallDetails,
+    Request,
+    ServiceEndpoint,
+)
+```
+
+### InvokeAgentScope
+
+```python
+from microsoft.opentelemetry.a365.core import (
+    InvokeAgentScope,
+    InvokeAgentScopeDetails,
+    AgentDetails,
+    CallerDetails,
+    UserDetails,
+    Channel,
+    Request,
+    ServiceEndpoint,
+)
+
+# Reuse the same agent_details and request instances across all scopes in a request.
+agent_details = AgentDetails( + agent_id="agent-456", + agent_name="My Agent", + agent_description="An AI agent powered by Azure OpenAI", + agentic_user_id="auid-123", + agentic_user_email="agent@contoso.com", + agent_blueprint_id="blueprint-789", + tenant_id="tenant-123", +) + +scope_details = InvokeAgentScopeDetails( + endpoint=ServiceEndpoint(hostname="myagent.contoso.com", port=443), +) + +request = Request( + content="User asks a question", + session_id="session-42", + conversation_id="conv-xyz", + channel=Channel(name="msteams"), +) + +caller_details = CallerDetails( + user_details=UserDetails( + user_id="user-123", + user_email="jane.doe@contoso.com", + user_name="Jane Doe", + ), +) + +with InvokeAgentScope.start(request, scope_details, agent_details, caller_details) as scope: + # Record input messages + scope.record_input_messages(["User asks a question"]) + # Perform agent invocation logic + response = call_agent(...) + # Record output messages + scope.record_output_messages([response]) +``` + +### Shared Observability Context Module (`observability/obs_context.py`) + +For autonomous/S2S agents, create a shared module to avoid circular imports between agent, monitor, and main: + +```python +# observability/obs_context.py +import os +from microsoft.opentelemetry.a365.core import AgentDetails, CallerDetails, UserDetails + +# ── Configuration from .env ────────────────────────────────────────────────── +A365_ENABLED = os.environ.get("ENABLE_A365_OBSERVABILITY", "").lower() == "true" +TENANT_ID = os.environ.get("AGENT365_TENANT_ID", "") +AGENT_ID = os.environ.get("AGENT365_AGENT_ID", "") +BLUEPRINT_ID = os.environ.get("AGENT365_BLUEPRINT_ID", "") +CLIENT_ID = os.environ.get("AGENT365_CLIENT_ID", "") +CLIENT_SECRET = os.environ.get("AGENT365_CLIENT_SECRET", "") +USE_MANAGED_IDENTITY = os.environ.get("AGENT365_USE_MANAGED_IDENTITY", "false").lower() == "true" +USE_S2S_ENDPOINT = os.environ.get("AGENT365_USE_S2S_ENDPOINT", "false").lower() == "true" + +# ── 
Shared instances (imported by agent & monitor modules) ─────────────────── +agent_details = AgentDetails( + agent_id=AGENT_ID, + agent_name=os.environ.get("AGENT365_AGENT_NAME", ""), + agent_description=os.environ.get("AGENT365_AGENT_DESCRIPTION", ""), + agent_blueprint_id=BLUEPRINT_ID, + tenant_id=TENANT_ID, +) + +# CallerDetails — for autonomous agents, use Blueprint sponsor identity +caller_details = CallerDetails( + user_details=UserDetails( + user_id=os.environ.get("AGENT365_SPONSOR_USER_ID", BLUEPRINT_ID), + user_email=os.environ.get("AGENT365_SPONSOR_USER_EMAIL", ""), + user_name=os.environ.get("AGENT365_SPONSOR_USER_NAME", ""), + ), +) +``` + +> **Why CallerDetails?** Without `CallerDetails`, traces will NOT appear in the Microsoft Admin Center (MAC) portal. For autonomous agents with no real user, use the Blueprint sponsor's identity. The `user.id`, `user.email`, and `user.name` span attributes are set from CallerDetails. + +> **Import pattern:** Import `agent_details` and `caller_details` from `obs_context` in your agent and monitor modules — do NOT create them inline to avoid circular imports with `main.py`. + +### ExecuteToolScope + +```python +from microsoft.opentelemetry.a365.core import ( + ExecuteToolScope, + ToolCallDetails, + Request, + ServiceEndpoint, +) + +# Use the same agent_details and request instances from InvokeAgentScope above. 
+ +tool_details = ToolCallDetails( + tool_name="summarize", + tool_type="function", + tool_call_id="tc-001", + arguments="{'text': '...'}", + description="Summarize provided text", + endpoint=ServiceEndpoint(hostname="tools.contoso.com", port=8080), +) + +with ExecuteToolScope.start(request, tool_details, agent_details) as scope: + result = run_tool(tool_details) + scope.record_response(result) +``` + +### InferenceScope + +> **⚠️ Python SDK uses camelCase parameter names** (matching the underlying .NET/Java convention): +> `operationName`, `model`, `providerName`, `inputTokens`, `outputTokens`, `finishReasons`, `thoughtProcess`, `endpoint`. +> Do NOT use snake_case (`operation_name`, `provider_name`) — this causes `TypeError` at runtime. + +```python +from microsoft.opentelemetry.a365.core import ( + InferenceScope, + InferenceCallDetails, + InferenceOperationType, +) + +# Use the same agent_details and request instances from InvokeAgentScope above. + +inference_details = InferenceCallDetails( + operationName=InferenceOperationType.CHAT, + model="gpt-4o-mini", + providerName="azure-openai", + inputTokens=123, + outputTokens=456, + finishReasons=["stop"], +) + +with InferenceScope.start(request, inference_details, agent_details) as scope: + completion = call_llm(...) + scope.record_output_messages([completion.text]) + scope.record_input_tokens(completion.usage.input_tokens) + scope.record_output_tokens(completion.usage.output_tokens) +``` + +### OutputScope (async scenarios) + +```python +from microsoft.opentelemetry.a365.core import ( + OutputScope, + Response, + SpanDetails, +) + +# Use the same agent_details and request instances from InvokeAgentScope above. 
+ +# Get the parent context from the originating scope +parent_context = invoke_scope.get_context() + +response = Response(messages=["Here is your organized inbox with 15 urgent emails."]) + +with OutputScope.start( + request, + response, + agent_details, + span_details=SpanDetails(parent_context=parent_context), +): + # Output messages are recorded automatically from the response + pass +``` + +--- + +## Auto-Instrumentation Extensions + +The unified distro handles supported framework instrumentation automatically after startup. +No framework-specific bootstrap call or manual extension instrumentor is needed. + +### Semantic Kernel + +```python +from microsoft.opentelemetry import use_microsoft_opentelemetry + +use_microsoft_opentelemetry(enable_a365=True) +# Semantic Kernel is auto-instrumented when installed. +``` + +### OpenAI Agents SDK + +```python +from microsoft.opentelemetry import use_microsoft_opentelemetry + +use_microsoft_opentelemetry(enable_a365=True) +# OpenAI Agents SDK instrumentation is handled by the distro. +``` + +### Agent Framework + +```python +from microsoft.opentelemetry import use_microsoft_opentelemetry + +use_microsoft_opentelemetry(enable_a365=True) +# Agent Framework instrumentation is handled by the distro. +``` + +### LangChain + +```python +from microsoft.opentelemetry import use_microsoft_opentelemetry + +use_microsoft_opentelemetry(enable_a365=True) +# LangChain instrumentation is handled by the distro. +``` + +--- + +## .env Variables + +> **⚠️ Python requires TWO env vars** (unlike Node.js and .NET which use a single flag): +> - `ENABLE_A365_OBSERVABILITY_EXPORTER` — controls exporter creation +> - `ENABLE_A365_OBSERVABILITY` — controls A365 span creation +> +> Without **both** set to `true`, `use_microsoft_opentelemetry()` can initialize successfully but `InvokeAgentScope.start()` still creates a **no-op scope** (no actual OTel spans are produced or exported). 
This is the #1 cause of "spans seem to run but nothing is exported." + +> **Note:** If you ran `a365 setup`, `ENABLE_A365_OBSERVABILITY_EXPORTER=false` is **already +> present** in your `.env` file. Preserve this value when instrumenting. + +```dotenv +# ── A365 Observability ──────────────────────────────────────────────────────── +# BOTH are required for Python (set to true for production): +ENABLE_A365_OBSERVABILITY_EXPORTER=false +ENABLE_A365_OBSERVABILITY=true +# ───────────────────────────────────────────────────────────────────────────── +``` + +--- + +## Validate Locally + +Set `ENABLE_A365_OBSERVABILITY_EXPORTER=false` — spans export to the console. + +To investigate export failures, enable verbose logging in your application startup: + +```python +import logging + +logging.basicConfig(level=logging.DEBUG) +logging.getLogger("microsoft.opentelemetry").setLevel(logging.DEBUG) +logging.getLogger("microsoft.opentelemetry.a365").setLevel(logging.DEBUG) +``` + +Key log messages: + +```text +DEBUG Token resolved for agent {agentId} tenant {tenantId} +DEBUG Exporting {n} spans to {url} +DEBUG HTTP 200 - correlation ID: abc-123 +ERROR Token resolution failed: {error} +ERROR HTTP 401 exporting spans - correlation ID: abc-123 +INFO No spans with tenant/agent identity found; nothing exported. 
+``` + +Import check to verify packages are installed: + +```bash +python -c "from microsoft.opentelemetry import use_microsoft_opentelemetry; from microsoft.opentelemetry.a365.hosting import ObservabilityHostingManager; print('A365 observability imports OK')" +``` + +--- + +## `use_microsoft_opentelemetry()` kwargs + +| Kwarg | Description | +|-------|-------------| +| `enable_a365` | Enables A365 observability instrumentation and exporter wiring | +| `a365_token_resolver` | Sync callable `(agent_id, tenant_id) -> str \| None` for OBO or S2S export authentication | +| `a365_cluster_category` | Optional cluster label such as `prod` | +| `a365_use_s2s_endpoint` | Uses the service-to-service export endpoint | +| `a365_suppress_invoke_agent_input` | Suppresses input messages on `InvokeAgent` spans | +| `a365_enable_observability_exporter` | Enables the A365 exporter in code instead of env-only configuration | +| `a365_observability_scope_override` | Overrides the default observability OAuth scope | +| `resource` | Standard OpenTelemetry `Resource` for `service.name` / `service.namespace` | + +--- + +## Key API Surface + +| Symbol | Module | Purpose | +|--------|--------|---------| +| `use_microsoft_opentelemetry()` | `microsoft.opentelemetry` | Configure the unified OTel pipeline with the A365 exporter | +| `AgentDetails` | `microsoft.opentelemetry.a365.core` | Agent identity for manual scopes | +| `BaggageBuilder` | `microsoft.opentelemetry.a365.core` | Propagates tenant/agent/conversation context across spans | +| `populate(builder, turn_context)` | `microsoft.opentelemetry.a365.hosting.scope_helpers.populate_baggage` | Auto-populates `BaggageBuilder` from `TurnContext` | +| `ObservabilityHostingManager` | `microsoft.opentelemetry.a365.hosting` | Composite hosting configuration for adapter middleware | +| `AgenticTokenCache` | `microsoft.opentelemetry.a365.hosting.token_cache_helpers` | Official hosting token-cache helper for AI Teammate agents | +| 
`cache_agentic_token()` / `get_cached_agentic_token()` | `token_cache` | Custom in-memory token cache module for per-turn OBO refresh | +| `acquire_initial_token()` / `run_token_service()` | `observability.observability_token_service` | Background FMI token acquisition for S2S (direct HTTP POST for Hop 1+2, MSAL for Hop 3) | +| `get_observability_authentication_scope()` | `microsoft.opentelemetry.a365.runtime` | Returns the OAuth2 scope string | +| `InvokeAgentScope.start(request, scope_details, agent_details, caller_details)` | `microsoft.opentelemetry.a365.core` | Start agent invocation telemetry scope (context manager) | +| `ExecuteToolScope.start(request, tool_details, agent_details)` | `microsoft.opentelemetry.a365.core` | Start tool execution telemetry scope (context manager) | +| `InferenceScope.start(request, inference_details, agent_details)` | `microsoft.opentelemetry.a365.core` | Start LLM inference telemetry scope (context manager) | +| `OutputScope.start(request, response, agent_details, span_details)` | `microsoft.opentelemetry.a365.core` | Start output telemetry scope (async scenarios) | +| `scope.record_input_messages(msgs)` / `scope.record_output_messages(msgs)` | — | Record prompts and completions | +| `scope.record_input_tokens(n)` / `scope.record_output_tokens(n)` | — | Record token counts | +| `scope.record_response(result)` | — | Record tool execution result | +| `scope.get_context()` | — | Get OTel context for use as parent in `OutputScope` | + +--- + +## Troubleshooting + +| Symptom | Cause | Fix | +|---------|-------|-----| +| No console traces | `use_microsoft_opentelemetry()` not called | Call the observability initializer before any spans are created | +| Spans missing baggage | Handler not wrapped in baggage scope | Use `ObservabilityHostingManager.configure(...)` or `with builder.build():` | +| Token resolver returns `None` | Per-turn OBO token cache was never refreshed | Call `exchange_token()` and `cache_agentic_token()` at the start 
of each message handler turn | +| `ModuleNotFoundError` | Package not installed | Run `pip install microsoft-opentelemetry` and install `msal azure-identity httpx` when needed | +| Traces not in Admin Center | Exporter env var not set | Set `ENABLE_A365_OBSERVABILITY_EXPORTER=true` in production | +| 401 on export | Missing permission | Check if upgrading past `0.3.0` (requires new `Agent365.Observability.OtelWrite` permission) | +| Spans dropped silently | Missing tenant/agent ID | Ensure `BaggageBuilder` or `populate()` adds tenant/agent identity before creating spans | +| S2S: OBO token-refresh code still runs in the handler | S2S does not use per-turn OBO token exchange | Remove the OBO handler refresh path; token comes from the background token service via `a365_token_resolver` | +| S2S 401: wrong Hop 3 scope | FMI Hop 3 used `https://api.powerplatform.com/.default` from older samples | Change Hop 3 scope to `api://9b975845-388f-4429-889e-eab1ef63949c/.default` | +| S2S 401 even with correct scope | `OtelWrite` role not on Agent Identity SP | For agents provisioned before CLI 1.1, manually assign `Agent365.Observability.OtelWrite` to the Agent Identity SP via Entra portal (App registrations > Blueprint app > API permissions) | +| S2S: Spans appear to run but nothing is exported | `ENABLE_A365_OBSERVABILITY=true` not set | Python SDK has **two** env vars: `ENABLE_A365_OBSERVABILITY_EXPORTER` (exporter creation) AND `ENABLE_A365_OBSERVABILITY` (span creation). Both must be `true`. Without the second, `InvokeAgentScope.start()` creates a no-op scope. 
| +| S2S: `_is_telemetry_enabled()` returns `False` | `ENABLE_A365_OBSERVABILITY` env var missing | Set `ENABLE_A365_OBSERVABILITY=true` in `.env` — this is separate from `ENABLE_A365_OBSERVABILITY_EXPORTER` | +| S2S: MSI fails locally | No Managed Identity in dev | Set `AGENT365_USE_MANAGED_IDENTITY=false` and provide `AGENT365_CLIENT_SECRET` | +| S2S: FMI Hop 1+2 returns 400 | `fmi_path` missing or wrong `client_id` | Ensure `fmi_path=` (Agent Identity app ID, not Blueprint ID) and `client_id=` | +| S2S: `TypeError: Session.request() got an unexpected keyword argument 'fmi_path'` | MSAL Python v1.34.0 bug | Use direct HTTP POST to `https://login.microsoftonline.com/{tenantId}/oauth2/v2.0/token` with `fmi_path` as form data instead of MSAL `acquire_token_for_client(fmi_path=...)`. MSAL is still used for Hop 3 (no `fmi_path` needed). | +| S2S: `InferenceCallDetails.__init__() got an unexpected keyword argument 'operation_name'` | Python SDK uses camelCase kwargs | Use `operationName=`, `providerName=`, `inputTokens=`, `outputTokens=`, `finishReasons=` (camelCase, NOT snake_case) | +| S2S: HTTP 400 TenantIdInvalid from exporter | Token not yet acquired when exporter first fires | Ensure `acquire_initial_token()` runs in lifespan BEFORE monitor starts. The `a365_token_resolver` returns `""` when no cached token exists, causing 400. 
| +| S2S: HTTP 403 `insufficient_scope: Required app role: Agent365.Observability.OtelWrite` | OtelWrite role not assigned to Agent Identity SP | Run PowerShell: `Connect-MgGraph; $sp = Get-MgServicePrincipal -Filter "appId eq ''"` then `New-MgServicePrincipalAppRoleAssignment` with OtelWrite role from observability API SP (`9b975845-388f-4429-889e-eab1ef63949c`) | +| S2S: FMI Hop 3 returns `AADSTS700024` | Agent Identity has no FMI credential | Verify `a365 setup all` completed successfully — it creates the federated credential on the Agent Identity | +| S2S: HTTP 200 but `rejectedSpans > 0` | Missing baggage context (tenant_id/agent_id) | Ensure `BaggageBuilder().tenant_id(...).agent_id(...).build()` wraps all scope code — without it, spans lack identity and are rejected |
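
Several of the symptoms above (no-op scopes, a resolver returning `None` or `""`) can be caught locally before any export is attempted. The following is a minimal preflight sketch, not part of the A365 SDK: `check_observability_env` and `check_token_resolver` are hypothetical helper names, and the sketch assumes the flags are enabled only by the string `true` (case-insensitive), mirroring how `obs_context.py` reads them. The SDK's own parsing may differ, so treat the result as a hint rather than a guarantee.

```python
import os
from typing import Callable, List, Mapping, Optional

# Flag names taken from the ".env Variables" section of this document.
SPAN_FLAG = "ENABLE_A365_OBSERVABILITY"
EXPORTER_FLAG = "ENABLE_A365_OBSERVABILITY_EXPORTER"


def _is_true(env: Mapping[str, str], name: str) -> bool:
    # Assumption: only the string "true" (any case) counts as enabled.
    return env.get(name, "").strip().lower() == "true"


def check_observability_env(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return one message per flag that is not set to true (hypothetical helper)."""
    env = os.environ if env is None else env
    problems = []
    if not _is_true(env, SPAN_FLAG):
        problems.append(f"{SPAN_FLAG} is not 'true': scopes are expected to be no-ops")
    if not _is_true(env, EXPORTER_FLAG):
        problems.append(f"{EXPORTER_FLAG} is not 'true': spans stay local and are not sent to Agent 365")
    return problems


def check_token_resolver(
    resolver: Callable[[str, str], Optional[str]], agent_id: str, tenant_id: str
) -> Optional[str]:
    """Return a message if the resolver yields None or an empty token, else None."""
    # Argument order follows the documented resolver shape: (agent_id, tenant_id).
    token = resolver(agent_id, tenant_id)
    if not token:
        return f"token resolver returned {token!r} for agent {agent_id!r}; refresh the cache first"
    return None


if __name__ == "__main__":
    for problem in check_observability_env():
        print("WARN", problem)
```

Running the module directly prints one `WARN` line per flag that is not enabled. `check_token_resolver` can be called with the same callable that is passed as `a365_token_resolver`, once the first token acquisition is expected to have completed.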