diff --git a/docs/adr/28003-fallback-audit-metrics-without-aw-info.md b/docs/adr/28003-fallback-audit-metrics-without-aw-info.md new file mode 100644 index 00000000000..c62ca549c46 --- /dev/null +++ b/docs/adr/28003-fallback-audit-metrics-without-aw-info.md @@ -0,0 +1,77 @@ +# ADR-28003: Fallback Strategy for Audit Metrics When aw_info.json Is Absent + +**Date**: 2026-04-23 +**Status**: Draft +**Deciders**: pelikhan + +--- + +## Part 1 — Narrative (Human-Friendly) + +### Context + +The `gh aw audit` command aggregates run-level metrics (token usage, turn count, engine config) to produce audit reports. These metrics are primarily sourced from `aw_info.json`, a structured artifact written by newer workflow runs. Legacy runs that predate the introduction of `aw_info.json` do not produce this artifact, causing the audit command to emit `engine_config: null`, `metrics.token_usage: null`, and `metrics.turns: null` even when alternative data sources — `agent_usage.json` and raw agent log files (`agent-stdio.log`, `events.jsonl`) — are present in the run directory. This gap reduces the usefulness of audit reports for historical analysis and fleet-wide comparisons. + +### Decision + +We will implement a multi-level fallback strategy in the audit pipeline that recovers metrics from alternative artifacts and log files when `aw_info.json` is absent. For token usage, the pipeline will fall back to `agent_usage.json` when the firewall proxy `token-usage.jsonl` is unavailable. For engine config, the pipeline will infer the engine by parsing available log files with all registered engine parsers and selecting the parser that recovers the strongest signal (prioritizing turn count, then token usage, then tool calls). In the audit report, token usage will cascade through: run-level parsed metrics → artifact token summaries → log inference; turn count, which artifact token summaries do not record, will cascade through: run-level parsed metrics → log inference.
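The cascade above can be sketched as follows. This is a minimal illustration, not the actual implementation in `pkg/cli`: `LogMetrics` here is a simplified stand-in with only the two fields the cascade touches, and the artifact summary is reduced to its input/output token counts.

```go
package main

import "fmt"

// LogMetrics is a simplified stand-in for the pipeline's parsed run metrics;
// only the two fields the fallback cascade touches are modeled here.
type LogMetrics struct {
	TokenUsage int
	Turns      int
}

// resolveAuditMetrics applies the cascade: a non-zero value from a
// higher-priority source is never overwritten by a lower-priority one.
// Token usage falls back to the artifact summary (input + output), then to
// log inference; turns skip the artifact level because token summaries
// carry no turn count.
func resolveAuditMetrics(runLevel LogMetrics, artifactInput, artifactOutput int, inferred LogMetrics) LogMetrics {
	out := runLevel
	if out.TokenUsage == 0 {
		out.TokenUsage = artifactInput + artifactOutput // level 2: artifact summary
	}
	if out.TokenUsage == 0 {
		out.TokenUsage = inferred.TokenUsage // level 3: log inference
	}
	if out.Turns == 0 {
		out.Turns = inferred.Turns // level 2 for turns: log inference
	}
	return out
}

func main() {
	// Legacy run: no run-level metrics; artifact summary and log inference available.
	m := resolveAuditMetrics(LogMetrics{}, 5944, 8698, LogMetrics{TokenUsage: 300, Turns: 7})
	fmt.Println(m.TokenUsage, m.Turns) // 14642 7
}
```

Because each fallback only fires when the higher-priority value is zero, runs that already carry `aw_info.json`-derived metrics pass through unchanged, which is what makes the chain additive and non-destructive.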
+ +### Alternatives Considered + +#### Alternative 1: Require aw_info.json and Backfill Historical Data + +Enforce `aw_info.json` as a mandatory artifact and run a one-time migration to retroactively populate it for historical runs. This was rejected because it requires coordinating infrastructure changes across all historical workflow runs and cannot recover data that was never recorded. + +#### Alternative 2: Surface Null Values and Document Limitations + +Accept `null` metric fields for older runs and document that pre-`aw_info.json` runs have incomplete audit data. This was rejected because it degrades the audit tool's utility for historical fleet analysis and provides no path forward for operators who need accurate metrics across their entire run history. + +### Consequences + +#### Positive +- Audit reports are populated for legacy runs, enabling accurate historical fleet analysis. +- The fallback chain is additive and non-destructive: runs with `aw_info.json` are unaffected. +- `agent_usage.json` token data (including `effective_tokens`) is surfaced through the same `TokenUsageSummary` abstraction already used by the primary path. + +#### Negative +- The audit pipeline now has three distinct code paths for metric acquisition, increasing complexity and surface area for bugs. +- Inferred engine identification via log scoring is heuristic: the parser selection algorithm (weighted by turns, then token usage, then tool calls) may misidentify the engine when log content is ambiguous or shared across parsers. +- `agent_usage.json` is treated as a single-request summary, so per-model and per-request breakdowns are not available via this fallback. + +#### Neutral +- The `TokenUsageEntry` struct gains an `effective_tokens` field to accommodate `agent_usage.json` data; `token-usage.jsonl` entries omit this field and continue using computed effective token totals. 
- The engine inference function (`inferBestEngineMetricsFromContent`) iterates all registered engines and may add latency proportional to the number of registered parsers for runs without `aw_info.json`. + +--- + +## Part 2 — Normative Specification (RFC 2119) + +> The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHALL**, **SHALL NOT**, **SHOULD**, **SHOULD NOT**, **RECOMMENDED**, **MAY**, and **OPTIONAL** in this section are to be interpreted as described in [RFC 2119](https://www.rfc-editor.org/rfc/rfc2119). + +### Token Usage Acquisition + +1. Implementations **MUST** attempt to load token usage from `token-usage.jsonl` (firewall proxy log) first. +2. Implementations **MUST** fall back to `agent_usage.json` when `token-usage.jsonl` is absent or cannot be located. +3. Implementations **MUST** preserve an `effective_tokens` value recorded in `agent_usage.json` rather than recompute it. When `agent_usage.json` omits `effective_tokens`, implementations **SHOULD** compute it from resolved token weights (custom `aw_info.json` weights when available, otherwise the built-in defaults). +4. Implementations **SHOULD** search for `agent_usage.json` at the root of the run directory before walking subdirectories, to minimize filesystem traversal. + +### Audit Metric Fallback Chain + +1. Implementations **MUST** populate `metrics.token_usage` by cascading through, in order: (1) run-level parsed log metrics, (2) `input_tokens + output_tokens` from the artifact `TokenUsageSummary`, (3) token usage inferred from log content. +2. Implementations **MUST** populate `metrics.turns` by cascading through, in order: (1) run-level parsed log metrics, (2) turn count inferred from log content. +3. Implementations **MUST NOT** overwrite a non-zero metric value with a fallback value from a lower-priority source. + +### Engine Config Inference + +1. When `aw_info.json` is absent, implementations **MUST** attempt engine inference by parsing available log files using all engines registered in the global engine registry. +2. 
Implementations **MUST** select the inferred engine by maximising a weighted score: `turns * 100000 + len(tool_calls) * 1000 + token_usage`. +3. Implementations **MUST NOT** return an inferred engine config if no registered engine parser recovers any useful signal (turns, token usage, or tool calls). +4. Implementations **SHOULD** prefer `events.jsonl` over `agent-stdio.log` for engine inference when both are present. + +### Conformance + +An implementation is considered conformant with this ADR if it satisfies all **MUST** and **MUST NOT** requirements above. Failure to meet any **MUST** or **MUST NOT** requirement constitutes non-conformance. + +--- + +*This is a DRAFT ADR generated by the [Design Decision Gate](https://github.com/github/gh-aw/actions/runs/24834078573) workflow. The PR author must review, complete, and finalize this document before the PR can merge.* diff --git a/pkg/cli/audit_expanded.go b/pkg/cli/audit_expanded.go index cca6622baaf..7a6bd6301f4 100644 --- a/pkg/cli/audit_expanded.go +++ b/pkg/cli/audit_expanded.go @@ -2,6 +2,7 @@ package cli import ( "encoding/json" + "errors" "fmt" "os" "path/filepath" @@ -11,6 +12,7 @@ import ( "github.com/github/gh-aw/pkg/logger" "github.com/github/gh-aw/pkg/timeutil" + "github.com/github/gh-aw/pkg/workflow" ) var auditExpandedLog = logger.New("cli:audit_expanded") @@ -112,6 +114,10 @@ func findAwInfoPath(logsPath string) string { // extractEngineConfig parses aw_info.json and returns an AuditEngineConfig func extractEngineConfig(logsPath string) *AuditEngineConfig { + return extractEngineConfigWithInferredEngine(logsPath, "") +} + +func extractEngineConfigWithInferredEngine(logsPath, inferredEngineID string) *AuditEngineConfig { if logsPath == "" { return nil } @@ -119,6 +125,16 @@ func extractEngineConfig(logsPath string) *AuditEngineConfig { awInfoPath := findAwInfoPath(logsPath) if awInfoPath == "" { auditExpandedLog.Printf("aw_info.json not found in %s", logsPath) + if inferredEngineID != "" { + 
registry := workflow.GetGlobalEngineRegistry() + if engine, err := registry.GetEngine(inferredEngineID); err == nil { + auditExpandedLog.Printf("Inferred engine config without aw_info.json: engine=%s", inferredEngineID) + return &AuditEngineConfig{ + EngineID: inferredEngineID, + EngineName: engine.GetDisplayName(), + } + } + } return nil } awInfo, err := parseAwInfo(awInfoPath, false) @@ -148,6 +164,89 @@ func extractEngineConfig(logsPath string) *AuditEngineConfig { return config } +func inferFallbackLogMetrics(logsPath string) (LogMetrics, string) { + if logsPath == "" { + return LogMetrics{}, "" + } + + if eventsJSONLPath := findEventsJSONLFile(logsPath); eventsJSONLPath != "" { + if metrics, err := parseEventsJSONLFile(eventsJSONLPath, false); err == nil && hasUsefulFallbackMetrics(metrics) { + return metrics, "copilot" + } + } + + agentLogPath := findAgentStdioLogPath(logsPath) + if agentLogPath == "" { + return LogMetrics{}, "" + } + content, err := os.ReadFile(agentLogPath) + if err != nil { + return LogMetrics{}, "" + } + return inferBestEngineMetricsFromContent(string(content)) +} + +func findAgentStdioLogPath(logsPath string) string { + root := filepath.Join(logsPath, "agent-stdio.log") + if _, err := os.Stat(root); err == nil { + return root + } + + var found string + walkErr := filepath.Walk(logsPath, func(path string, info os.FileInfo, err error) error { + if err != nil || info == nil || info.IsDir() { + return nil + } + if info.Name() == "agent-stdio.log" { + found = path + return filepath.SkipAll + } + return nil + }) + if walkErr != nil && !errors.Is(walkErr, filepath.SkipAll) { + auditExpandedLog.Printf("Failed while searching for agent-stdio.log in %s: %v", logsPath, walkErr) + } + return found +} + +func hasUsefulFallbackMetrics(metrics LogMetrics) bool { + return metrics.TokenUsage > 0 || metrics.Turns > 0 || metrics.EstimatedCost > 0 || len(metrics.ToolCalls) > 0 +} + +func inferBestEngineMetricsFromContent(logContent string) (LogMetrics, 
string) { + registry := workflow.GetGlobalEngineRegistry() + engineIDs := registry.GetSupportedEngines() + const ( + // Prioritize selecting parsers that recover turn count first (primary signal for audit quality), + // then token usage, then tool call shape. + fallbackTurnsWeight = 100000 + fallbackToolCallsWeight = 1000 + ) + + var bestMetrics LogMetrics + var bestEngineID string + bestScore := -1 + + for _, engineID := range engineIDs { + engine, err := registry.GetEngine(engineID) + if err != nil { + continue + } + metrics := engine.ParseLogMetrics(logContent, false) + score := metrics.TokenUsage + (metrics.Turns * fallbackTurnsWeight) + (len(metrics.ToolCalls) * fallbackToolCallsWeight) + if score > bestScore { + bestScore = score + bestMetrics = metrics + bestEngineID = engineID + } + } + + if !hasUsefulFallbackMetrics(bestMetrics) { + return LogMetrics{}, "" + } + return bestMetrics, bestEngineID +} + // extractPromptAnalysis reads prompt.txt and returns analysis metrics func extractPromptAnalysis(logsPath string) *PromptAnalysis { if logsPath == "" { diff --git a/pkg/cli/audit_expanded_test.go b/pkg/cli/audit_expanded_test.go index 230a3a2c96a..23f843fcb8c 100644 --- a/pkg/cli/audit_expanded_test.go +++ b/pkg/cli/audit_expanded_test.go @@ -111,6 +111,29 @@ func TestExtractEngineConfigWithDetails(t *testing.T) { assert.Equal(t, "org/repo", result.Repository, "Repository should match") } +func TestExtractEngineConfigInferredWithoutAwInfo(t *testing.T) { + tmpDir := testutil.TempDir(t, "engine-infer-*") + logContent := `{"type":"result","subtype":"success","num_turns":3,"usage":{"input_tokens":100,"output_tokens":200}}` + require.NoError(t, os.WriteFile(filepath.Join(tmpDir, "agent-stdio.log"), []byte(logContent), 0o644)) + + _, inferredEngineID := inferFallbackLogMetrics(tmpDir) + result := extractEngineConfigWithInferredEngine(tmpDir, inferredEngineID) + require.NotNil(t, result, "Engine config should be inferred when aw_info.json is missing but agent log is 
available") + assert.NotEmpty(t, result.EngineID, "Inferred engine ID should not be empty") +} + +func TestInferFallbackLogMetricsFindsNestedAgentStdioLog(t *testing.T) { + tmpDir := testutil.TempDir(t, "engine-infer-nested-*") + nestedDir := filepath.Join(tmpDir, "agent", "logs") + require.NoError(t, os.MkdirAll(nestedDir, 0o755)) + logContent := `{"type":"result","subtype":"success","num_turns":4,"usage":{"input_tokens":120,"output_tokens":80}}` + require.NoError(t, os.WriteFile(filepath.Join(nestedDir, "agent-stdio.log"), []byte(logContent), 0o644)) + + metrics, inferredEngineID := inferFallbackLogMetrics(tmpDir) + assert.Positive(t, metrics.Turns, "Fallback metrics should be extracted from nested agent-stdio.log") + assert.NotEmpty(t, inferredEngineID, "Engine ID should be inferred from nested agent-stdio.log") +} + func TestExtractPromptAnalysis(t *testing.T) { tests := []struct { name string diff --git a/pkg/cli/audit_report.go b/pkg/cli/audit_report.go index fc90342c07c..8d5c593e925 100644 --- a/pkg/cli/audit_report.go +++ b/pkg/cli/audit_report.go @@ -279,6 +279,32 @@ func buildAuditData(processedRun ProcessedRun, metrics LogMetrics, mcpToolUsage WarningCount: run.WarningCount, } + needsFallbackMetrics := metricsData.TokenUsage == 0 || metricsData.Turns == 0 + needsFallbackEngineConfig := run.LogsPath != "" && findAwInfoPath(run.LogsPath) == "" + var fallbackMetrics LogMetrics + var inferredEngineID string + if run.LogsPath != "" && (needsFallbackMetrics || needsFallbackEngineConfig) { + fallbackMetrics, inferredEngineID = inferFallbackLogMetrics(run.LogsPath) + } + + // Fallback token usage: when the run-level metric is missing/zero for older + // runs, use aggregated input+output tokens from agent_usage/token usage artifacts. 
+ if metricsData.TokenUsage == 0 && processedRun.TokenUsage != nil { + metricsData.TokenUsage = processedRun.TokenUsage.TotalInputTokens + processedRun.TokenUsage.TotalOutputTokens + } + if metricsData.TokenUsage == 0 && metrics.TokenUsage > 0 { + metricsData.TokenUsage = metrics.TokenUsage + } + if metricsData.Turns == 0 && metrics.Turns > 0 { + metricsData.Turns = metrics.Turns + } + if metricsData.TokenUsage == 0 && fallbackMetrics.TokenUsage > 0 { + metricsData.TokenUsage = fallbackMetrics.TokenUsage + } + if metricsData.Turns == 0 && fallbackMetrics.Turns > 0 { + metricsData.Turns = fallbackMetrics.Turns + } + // Populate effective tokens from the firewall proxy summary when available, // otherwise fall back to the effective tokens stored on the run itself. if processedRun.TokenUsage != nil && processedRun.TokenUsage.TotalEffectiveTokens > 0 { @@ -347,7 +373,7 @@ func buildAuditData(processedRun ProcessedRun, metrics LogMetrics, mcpToolUsage performanceMetrics := generatePerformanceMetrics(processedRun, metricsData, toolUsage) // Extract expanded audit data - engineConfig := extractEngineConfig(run.LogsPath) + engineConfig := extractEngineConfigWithInferredEngine(run.LogsPath, inferredEngineID) promptAnalysis := extractPromptAnalysis(run.LogsPath) sessionAnalysis := buildSessionAnalysis(processedRun, metrics) safeOutputSummary := buildSafeOutputSummary(createdItems) diff --git a/pkg/cli/audit_report_test.go b/pkg/cli/audit_report_test.go index fa3a4a19c3b..b435578fbe6 100644 --- a/pkg/cli/audit_report_test.go +++ b/pkg/cli/audit_report_test.go @@ -896,6 +896,35 @@ func TestBuildAuditDataMinimal(t *testing.T) { _ = auditData.Jobs } +func TestBuildAuditDataFallbackMetricsWithoutAwInfo(t *testing.T) { + tmpDir := testutil.TempDir(t, "audit-fallback-*") + logContent := `{"type":"result","subtype":"success","num_turns":7,"usage":{"input_tokens":100,"output_tokens":200}}` + require.NoError(t, os.WriteFile(filepath.Join(tmpDir, "agent-stdio.log"), []byte(logContent), 
0o644)) + + processedRun := ProcessedRun{ + Run: WorkflowRun{ + DatabaseID: 42, + WorkflowName: "Fallback Metrics Workflow", + Status: "completed", + Conclusion: "success", + LogsPath: tmpDir, + TokenUsage: 0, + Turns: 0, + }, + TokenUsage: &TokenUsageSummary{ + TotalInputTokens: 5944, + TotalOutputTokens: 8698, + TotalEffectiveTokens: 243846, + TotalCacheReadTokens: 1170605, + TotalCacheWriteTokens: 86049, + }, + } + + auditData := buildAuditData(processedRun, workflow.LogMetrics{}, nil) + assert.Equal(t, 14642, auditData.Metrics.TokenUsage, "token usage should fall back to input+output from agent usage summary") + assert.Equal(t, 7, auditData.Metrics.Turns, "turns should fall back to inferred value from agent log") +} + func TestRenderJSONComplete(t *testing.T) { auditData := AuditData{ Overview: OverviewData{ diff --git a/pkg/cli/token_usage.go b/pkg/cli/token_usage.go index 9818d0ca08d..550e9d2056d 100644 --- a/pkg/cli/token_usage.go +++ b/pkg/cli/token_usage.go @@ -30,8 +30,11 @@ type TokenUsageEntry struct { OutputTokens int `json:"output_tokens"` CacheReadTokens int `json:"cache_read_tokens"` CacheWriteTokens int `json:"cache_write_tokens"` - DurationMs int `json:"duration_ms"` - ResponseBytes int `json:"response_bytes"` + // EffectiveTokens is populated by agent_usage.json fallback data. token-usage.jsonl + // entries usually omit this field and rely on computed effective token totals. + EffectiveTokens int `json:"effective_tokens"` + DurationMs int `json:"duration_ms"` + ResponseBytes int `json:"response_bytes"` } // AmbientContextMetrics captures token footprint for the first LLM invocation. @@ -84,6 +87,7 @@ type ModelTokenUsageRow struct { // tokenUsageJSONLPath is the relative path within the firewall logs directory const tokenUsageJSONLPath = "api-proxy-logs/token-usage.jsonl" +const agentUsageJSONPath = "agent_usage.json" // parseTokenUsageFile parses a token-usage.jsonl file and returns the aggregated summary. 
// Custom weights, when non-nil, override the built-in model multipliers and token class @@ -283,6 +287,93 @@ func findTokenUsageFile(runDir string) string { return "" } +// findAgentUsageFile searches for agent_usage.json in the run directory. +func findAgentUsageFile(runDir string) string { + primary := filepath.Join(runDir, agentUsageJSONPath) + if _, err := os.Stat(primary); err == nil { + tokenUsageLog.Printf("Found agent usage file at primary path: %s", primary) + return primary + } + + var found string + _ = filepath.Walk(runDir, func(path string, info os.FileInfo, err error) error { + if err != nil || info == nil || info.IsDir() { + return nil + } + if info.Name() == agentUsageJSONPath { + found = path + return filepath.SkipAll + } + return nil + }) + + if found != "" { + tokenUsageLog.Printf("Found agent usage file via walk: %s", found) + } + return found +} + +func parseAgentUsageFile(filePath string, customWeights *types.TokenWeights) (*TokenUsageSummary, error) { + cleanPath := filepath.Clean(filePath) + data, err := os.ReadFile(cleanPath) + if err != nil { + return nil, fmt.Errorf("failed to read agent usage file: %w", err) + } + + var entry TokenUsageEntry + if err := json.Unmarshal(data, &entry); err != nil { + return nil, fmt.Errorf("failed to parse agent usage file: %w", err) + } + + summary := &TokenUsageSummary{ + TotalInputTokens: entry.InputTokens, + TotalOutputTokens: entry.OutputTokens, + TotalCacheReadTokens: entry.CacheReadTokens, + TotalCacheWriteTokens: entry.CacheWriteTokens, + TotalEffectiveTokens: entry.EffectiveTokens, + ByModel: make(map[string]*ModelTokenUsage), + } + + totalInputPlusCacheRead := summary.TotalInputTokens + summary.TotalCacheReadTokens + if totalInputPlusCacheRead > 0 { + summary.CacheEfficiency = float64(summary.TotalCacheReadTokens) / float64(totalInputPlusCacheRead) + } + + hasTokenData := summary.TotalInputTokens > 0 || + summary.TotalOutputTokens > 0 || + summary.TotalCacheReadTokens > 0 || + 
summary.TotalCacheWriteTokens > 0 || + summary.TotalEffectiveTokens > 0 + if hasTokenData { + summary.TotalRequests = 1 + summary.ByModel["unknown"] = &ModelTokenUsage{ + Provider: entry.Provider, + InputTokens: entry.InputTokens, + OutputTokens: entry.OutputTokens, + CacheReadTokens: entry.CacheReadTokens, + CacheWriteTokens: entry.CacheWriteTokens, + EffectiveTokens: entry.EffectiveTokens, + Requests: 1, + } + } + + summary.AmbientContext = &AmbientContextMetrics{ + InputTokens: entry.InputTokens, + CachedTokens: entry.CacheReadTokens, + EffectiveTokens: entry.InputTokens + entry.CacheReadTokens, + } + + // If the file does not include effective_tokens, compute it using resolved + // token weights (custom aw_info weights when available, otherwise defaults). + if summary.TotalEffectiveTokens == 0 { + populateEffectiveTokensWithCustomWeights(summary, customWeights) + } + + tokenUsageLog.Printf("Parsed agent usage file: input=%d, output=%d, cache_read=%d, cache_write=%d, effective=%d", + summary.TotalInputTokens, summary.TotalOutputTokens, summary.TotalCacheReadTokens, summary.TotalCacheWriteTokens, summary.TotalEffectiveTokens) + return summary, nil +} + // analyzeTokenUsage finds and parses the token-usage.jsonl file from a run directory. // It automatically reads custom token weights from aw_info.json when present and // applies them to the effective token computation. 
@@ -290,21 +381,32 @@ func analyzeTokenUsage(runDir string, verbose bool) (*TokenUsageSummary, error) tokenUsageLog.Printf("Analyzing token usage in: %s", runDir) filePath := findTokenUsageFile(runDir) - if filePath == "" { - return nil, nil + if filePath != "" { + if verbose { + fileInfo, _ := os.Stat(filePath) + if fileInfo != nil { + fmt.Fprintf(os.Stderr, " Found token usage file: %s (%d bytes)\n", filepath.Base(filePath), fileInfo.Size()) + } + } + + // Try to load custom token weights from aw_info.json for this run + customWeights := extractCustomTokenWeightsFromDir(runDir) + return parseTokenUsageFile(filePath, customWeights) } + agentUsagePath := findAgentUsageFile(runDir) + if agentUsagePath == "" { + return nil, nil + } if verbose { - fileInfo, _ := os.Stat(filePath) + fileInfo, _ := os.Stat(agentUsagePath) if fileInfo != nil { - fmt.Fprintf(os.Stderr, " Found token usage file: %s (%d bytes)\n", filepath.Base(filePath), fileInfo.Size()) + fmt.Fprintf(os.Stderr, " Found agent usage file: %s (%d bytes)\n", filepath.Base(agentUsagePath), fileInfo.Size()) } } - // Try to load custom token weights from aw_info.json for this run customWeights := extractCustomTokenWeightsFromDir(runDir) - - return parseTokenUsageFile(filePath, customWeights) + return parseAgentUsageFile(agentUsagePath, customWeights) } // extractCustomTokenWeightsFromDir reads aw_info.json from a run directory and returns diff --git a/pkg/cli/token_usage_test.go b/pkg/cli/token_usage_test.go index 1e5ca341bf5..0f431aa064f 100644 --- a/pkg/cli/token_usage_test.go +++ b/pkg/cli/token_usage_test.go @@ -291,6 +291,39 @@ func TestAnalyzeTokenUsage(t *testing.T) { assert.Equal(t, 1, summary.TotalRequests, "should have 1 request") assert.Equal(t, 100, summary.TotalInputTokens, "should have correct input tokens") }) + + t.Run("falls back to agent_usage.json when token-usage.jsonl is missing", func(t *testing.T) { + tmpDir := testutil.TempDir(t, "analyze-agent-usage") + agentUsageFile := 
filepath.Join(tmpDir, "agent_usage.json") + content := `{"input_tokens":5944,"output_tokens":8698,"cache_read_tokens":1170605,"cache_write_tokens":86049,"effective_tokens":243846}` + require.NoError(t, os.WriteFile(agentUsageFile, []byte(content), 0o644)) + + summary, err := analyzeTokenUsage(tmpDir, false) + require.NoError(t, err, "should parse agent_usage.json without error") + require.NotNil(t, summary, "should return summary from agent_usage.json") + assert.Equal(t, 5944, summary.TotalInputTokens, "input tokens should match agent usage") + assert.Equal(t, 8698, summary.TotalOutputTokens, "output tokens should match agent usage") + assert.Equal(t, 243846, summary.TotalEffectiveTokens, "effective tokens should match agent usage") + assert.Equal(t, 1, summary.TotalRequests, "agent usage fallback should synthesize one request") + }) + + t.Run("applies custom weights from aw_info when agent_usage effective_tokens is missing", func(t *testing.T) { + tmpDir := testutil.TempDir(t, "analyze-agent-usage-custom-weights") + awInfoFile := filepath.Join(tmpDir, "aw_info.json") + awInfoContent := `{"token_weights":{"multipliers":{"unknown":2}}}` + require.NoError(t, os.WriteFile(awInfoFile, []byte(awInfoContent), 0o644)) + + agentUsageFile := filepath.Join(tmpDir, "agent_usage.json") + agentUsageContent := `{"input_tokens":10,"output_tokens":5,"cache_read_tokens":0,"cache_write_tokens":0}` + require.NoError(t, os.WriteFile(agentUsageFile, []byte(agentUsageContent), 0o644)) + + summary, err := analyzeTokenUsage(tmpDir, false) + require.NoError(t, err, "should parse agent_usage.json with custom weights") + require.NotNil(t, summary, "should return summary from agent_usage.json") + assert.Equal(t, 60, summary.TotalEffectiveTokens, "custom multiplier should be applied to computed effective tokens") + require.Contains(t, summary.ByModel, "unknown", "unknown model bucket should be present") + assert.Equal(t, 60, summary.ByModel["unknown"].EffectiveTokens, "per-model effective 
tokens should use custom weights") + }) } func TestCacheEfficiency(t *testing.T) {