feat(ai) cost-attribution telemetry + per-call ceiling + observability docs (Wave 3 PR 7) by jkeeley2073 · Pull Request #91 · Early-Bird-Solutions-LLC/PinballWizard

jkeeley2073 · 2026-05-07T15:53:02Z

Summary

Wave 3 PR 7 — closes the cost half of Wave 3. Per ADR-0015. PR 8 (eval harness) is in flight in a worktree-isolated subagent and lands separately; the two only conflict on Program.cs (which PR 7 does NOT touch).

What ships

Cost calculation surface

AiFoundryOptions.PricingTable — per-deployment USD-cent rates (per 1k input + output tokens), defaults populated for the three Phase 3 deployments from May 2026 Azure OpenAI public pricing:
- gpt-4o-mini: 0.015c/1k in, 0.060c/1k out
- gpt-4-1 (mapping to gpt-4.1): 0.20c/1k in, 0.80c/1k out
- text-embedding-3-large: 0.013c/1k uniform
ModelPricing record — Industry units (per 1k); cost-counter values stay readable in dashboards
AiCostCalculator + IAiCostCalculator — Pure-function calculator. Missing rows → 0c + debug log
TokenUsage record — Per-call token-count snapshot (DeploymentName + InputTokens + OutputTokens)

Token-usage extraction abstraction

ITokenUsageReader + NullTokenUsageReader — Abstraction seam. As of Microsoft.Agents.AI 1.4.0 GA the SDK doesn't expose a stable Usage surface on response objects (agent-framework#2688 — open feature request). NullTokenUsageReader (default) returns null so cost stays at 0c until the SDK lands the API. Pricing + ceiling machinery is in place so the swap is a one-class change in a follow-up PR. Interface accepts object so the type contract doesn't break across SDK reshuffles

`AiRouter` integration

After agent invocation, attempts ITokenUsageReader.TryRead; on hit, computes cost via IAiCostCalculator and tags pinwiz.ai.cost_usd_cents (with model + sub_agent + prompt_version attributes per ADR-0015)
Per-call cost ceiling: if cumulative cost exceeded AiFoundryOptions.PerCallCostCeilingUsdCents (default $0.10), router returns refusal with RefusalCategory.CostCeilingHit (previously-reserved enum value, now wired)
AgentResponse type name correction: the actual Microsoft.Agents.AI 1.4.0 type is AgentResponse (NOT AgentRunResponse — Microsoft Learn docs are ahead of the package). Fixed in the new router branches; existing code already used var so prior PRs were unaffected

`docs/observability.md`

New section AI orchestrator instruments (Phase 3) with the full pinwiz.ai.* inventory + the inherited gen_ai.* attribute list (do NOT duplicate those into pinwiz.* per ADR-0015 lean posture)
New section Daily AI cost aggregation (Phase 6 deploy gate) with the KQL query template Phase 6 alert rules pin against
Deferred section updated: "Real ITokenUsageReader impl pending .NET Agent Performance Metrics and Token Usage microsoft/agent-framework#2688" as explicit revisit trigger

Tests

10 new AiCostCalculatorTests:

known-deployment per-k rates, heavy-tier higher rate, unknown-deployment-zero, zero-tokens-zero, empty-deployment-zero, null-usage-throws, empty-pricing-table-zero, ctor-arg-null guards, default-pricing-table-shape pin

629/629 tests passing (was 619, +10 new).

Audit posture

Build green: 0 warnings, 0 errors under TreatWarningsAsErrors
Tests green: 629/629
Identity check ✅
No bare catch
No XML doc comments per feedback_no_xml_docs.md
Sibling-diff against ConfidenceCalculator / existing options patterns: matches
Every option field is read (PricingTable, PerCallCostCeilingUsdCents consumed by AiRouter)
[GeneratedRegex] etc. — no new analyzer suppressions

What this PR does NOT do (intentional)

Real token usage extraction — NullTokenUsageReader placeholder until Microsoft.Agents.AI exposes Usage on AgentResponse (issue #2688). Cost stays at 0c in production until that swap; pricing + ceiling machinery is ready
Program.cs changes — none, so PR 7 + PR 8 don't conflict on the CLI surface
pinwiz.ai.escalations population — counter exists from PR 4; populated in a future PR when sub-agent escalation routing trace is read from Foundry's connected-agents dispatch
Unit tests for AiRouter cost-ceiling path — would require mocking IAiAgent which is sealed in Microsoft.Agents.AI. The cost-ceiling path is exercised end-to-end via the eval harness (PR 8); H2 hand-off is the live validation

Test plan

Build green (0 warnings, 0 errors)
Tests green (629/629)
Identity check (personal noreply on commit)
DI gating verified (NullTokenUsageReader registered as singleton; AiCostCalculator registered)
Reviewer confirms the deferred-real-impl approach is acceptable until Microsoft.Agents.AI lands Usage on AgentResponse
After PR 8 + H2 hand-off: live exercise validates the full cost-attribution + ceiling path against deployed Foundry

Sources:

…y docs (Wave 3 PR 7) Wave 3 PR 7 of Phase 3 — closes the cost half of the wave per ADR-0015. PR 8 (eval harness) lands separately; the two only conflict on Program.cs (which PR 7 doesn't touch). What ships: PricingTable (AiFoundryOptions): - Per-deployment USD-cent rates per 1k input/output tokens, keyed by deployment name. Defaults populated for the three Phase 3 deployments at May 2026 Azure OpenAI public pricing: gpt-4o-mini — 0.015c/1k in, 0.060c/1k out gpt-4-1 (gpt-4.1) — 0.20c/1k in, 0.80c/1k out text-embedding-3-large — 0.013c/1k uniform Operators override via configuration when prices shift. ModelPricing record: - Per-deployment pricing fact (InputCentsPer1K + OutputCentsPer1K). Industry units (per 1k) keep cost-counter values readable in dashboards even though Microsoft publishes pricing per 1M tokens. AiCostCalculator + IAiCostCalculator (Application/Ai/Cost/): - Pure-function calculator. Looks up the deployment in PricingTable; missing rows return 0 cents (best-effort) and emit a debug log so operators can spot mis-keyed deployments without breaking the hot path. ITokenUsageReader + NullTokenUsageReader: - Abstraction for extracting a TokenUsage snapshot from Microsoft.Agents.AI's response object. As of 1.4.0 GA the SDK doesn't yet expose a stable Usage surface (microsoft/agent- framework#2688); NullTokenUsageReader (default) returns null so cost stays at 0c until the SDK lands the API. Pricing + ceiling machinery is in place so the swap is a one-class change in a follow-up PR. Interface accepts `object` so the type contract doesn't break across SDK reshuffles. AiRouter integration: - After agent invocation, attempts to read TokenUsage; on hit, computes cost and tags pinwiz.ai.cost_usd_cents counter (with model + sub_agent + prompt_version attributes per ADR-0015). - Per-call cost ceiling: if cumulative cost exceeded AiFoundryOptions.PerCallCostCeilingUsdCents (default $0.10), the router returns a refusal with RefusalCategory.CostCeilingHit (the previously-reserved enum value, now wired). Refusal text invites the user to ask a more focused question. - AgentResponse type name correction: the actual Microsoft.Agents.AI 1.4.0 type is AgentResponse (NOT AgentRunResponse — Microsoft Learn docs are ahead of the package as of build time). Fixed in the new router branches; the existing code already used `var` so the prior PRs were unaffected. docs/observability.md: - New section: AI orchestrator instruments (Phase 3) with the full pinwiz.ai.* inventory + the inherited gen_ai.* attribute list (do NOT duplicate those into pinwiz.* per ADR-0015 lean posture). - New section: Daily AI cost aggregation (Phase 6 deploy gate) with the KQL query template Phase 6 alert rules pin against. - Deferred section updated: 'Real ITokenUsageReader impl pending microsoft/agent-framework#2688' as the explicit revisit trigger. Tests: - AiCostCalculatorTests (10 tests): known-deployment per-k rates, heavy-tier higher rate, unknown-deployment-zero, zero-tokens-zero, empty-deployment-zero, null-usage-throws, empty-pricing-table-zero, ctor-arg-null guards, default-pricing-table-shape pin. Build green (0 warnings under TreatWarningsAsErrors), 629/629 tests passing (was 619, +10 new). Identity verified. Sources: - Microsoft.Agents.AI 1.4.0 GA NuGet (April 2026) - microsoft/agent-framework#2688 (.NET token usage on response — open) - Azure OpenAI public pricing (May 2026)

Resolves the lone CONFLICTING file ServiceCollectionExtensions.cs in Application/Ai/ — PR #91 added IAiCostCalculator + ITokenUsageReader singletons; this PR (#92) added the four custom evaluators. Both sets of singletons need to register; the resolved file keeps both blocks side-by-side in the import order Cost-then-Evaluators (alphabetical). Build green (0 warnings under TreatWarningsAsErrors); 687/687 tests passing — that's 629 (PR #91 baseline after merge to main) + 58 new from this PR's evaluator + harness + ground-truth-file tests. Identity verified.

jkeeley2073 added the claude-code Generated with Claude Code label May 7, 2026

jkeeley2073 merged commit d672558 into main May 7, 2026
4 of 5 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(ai) cost-attribution telemetry + per-call ceiling + observability docs (Wave 3 PR 7)#91

feat(ai) cost-attribution telemetry + per-call ceiling + observability docs (Wave 3 PR 7)#91
jkeeley2073 merged 1 commit into
mainfrom
Dev-Phase3CostTelemetryAndCeiling

jkeeley2073 commented May 7, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

jkeeley2073 commented May 7, 2026

Summary

What ships

Cost calculation surface

Token-usage extraction abstraction

AiRouter integration

docs/observability.md

Tests

Audit posture

What this PR does NOT do (intentional)

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

`AiRouter` integration

`docs/observability.md`