Merged
10 changes: 6 additions & 4 deletions specs/reporters.md
@@ -407,7 +407,7 @@ At `Verbose+`.

#### 19. `analysis_complete`

Fields: `findingsCount`, `failedHunks?`, `failedExtractions?`, `skippedFiles?`, `usage?`, `totalDuration`
Fields: `findingsCount`, `failedHunks?`, `failedExtractions?`, `skippedFiles?`, `usage?`, `totalDuration`, `traceId?`

This is a multi-line summary section, not a single event.

@@ -420,8 +420,9 @@ SUMMARY
Use -v for failure details
3 files skipped
Analysis completed in 4.2s · 5.0k in / 800 out · $0.01
Trace: abc123def456...
```
(Counts colored by severity. Warnings in yellow. Hint dimmed. Usage line dimmed.)
(Counts colored by severity. Warnings in yellow. Hint dimmed. Usage line dimmed. Trace ID dimmed, `Verbose+` only, only when Sentry is initialized.)

The `-v` hint appears only when there are failures (`failedHunks > 0` or `failedExtractions > 0`) **and** verbosity is below `Verbose`. At `Verbose+`, per-hunk details are already shown via events 16-18 so the hint is suppressed.

@@ -434,8 +435,9 @@ The `-v` hint appears only when there are failures (`failedHunks > 0` or `failed
[2026-02-08T14:30:56.000Z] warden: 3 files skipped
[2026-02-08T14:30:56.000Z] warden: Usage: 5.0k input, 800 output, $0.01
[2026-02-08T14:30:56.000Z] warden: Total time: 4.2s
[2026-02-08T14:30:56.000Z] warden: Trace: abc123def456...
```
Warnings and skipped files only shown when non-zero. The `-v` hint follows the same gating as TTY.
Warnings and skipped files only shown when non-zero. The `-v` hint follows the same gating as TTY. Trace ID shown at `Verbose+` only, when Sentry is initialized.
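The gating rules for the `-v` hint and the trace line can be sketched as two small predicates. This is a minimal sketch, not the reporter's actual code: the `Verbosity` enum values and the `SummaryInput` shape are assumptions made for illustration; only the conditions themselves come from the spec text.

```typescript
// Hypothetical verbosity levels; the real enum in the reporter may differ.
enum Verbosity {
  Quiet = 0,
  Normal = 1,
  Verbose = 2,
  Debug = 3,
}

// Hypothetical input shape, named after the `analysis_complete` fields.
interface SummaryInput {
  failedHunks: number;
  failedExtractions: number;
  traceId?: string;
}

// The hint appears only when something failed AND verbosity is below Verbose.
function shouldShowVerboseHint(input: SummaryInput, verbosity: Verbosity): boolean {
  const hasFailures = input.failedHunks > 0 || input.failedExtractions > 0;
  return hasFailures && verbosity < Verbosity.Verbose;
}

// The trace line appears only at Verbose+ and only when a trace ID exists
// (no trace ID is available when Sentry is not initialized).
function shouldShowTrace(input: SummaryInput, verbosity: Verbosity): boolean {
  return Boolean(input.traceId) && verbosity >= Verbosity.Verbose;
}
```

Keeping both checks as pure functions of the summary data and the verbosity level would let the TTY and non-TTY renderers share them, which matches the spec's statement that both modes follow the same gating.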

**JSONL:** Summary record (the last line in the JSONL file)

@@ -555,7 +557,7 @@ Each line in a JSONL file is one of three record types, discriminated by the pre

### Shared Types

**RunMetadata**: `{ timestamp: string, durationMs: number, cwd: string }`
**RunMetadata**: `{ timestamp: string, durationMs: number, cwd: string, traceId?: string }`

**UsageStats**: `{ inputTokens: int, outputTokens: int, cacheReadInputTokens?: int, cacheCreationInputTokens?: int, costUSD: number }` -- `inputTokens` is the total input token count; `cacheReadInputTokens` and `cacheCreationInputTokens` are subsets of it.
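Because the two cache fields are subsets of `inputTokens`, the uncached portion can be derived by subtraction. The following is a minimal sketch; `uncachedInputTokens` is a hypothetical helper, not part of Warden, and the sample numbers are made up.

```typescript
// Mirrors the UsageStats shape from the spec; `int` fields are plain numbers in TypeScript.
interface UsageStats {
  inputTokens: number;
  outputTokens: number;
  cacheReadInputTokens?: number;
  cacheCreationInputTokens?: number;
  costUSD: number;
}

// Hypothetical helper: input tokens that were neither read from nor written to the cache.
function uncachedInputTokens(usage: UsageStats): number {
  const cached = (usage.cacheReadInputTokens ?? 0) + (usage.cacheCreationInputTokens ?? 0);
  return usage.inputTokens - cached;
}

const usage: UsageStats = {
  inputTokens: 5000,
  outputTokens: 800,
  cacheReadInputTokens: 3000,
  cacheCreationInputTokens: 500,
  costUSD: 0.01,
};

console.log(uncachedInputTokens(usage)); // 1500
```

Consumers that sum `inputTokens` and the cache fields together would double-count, which is the reason the subset relationship is worth spelling out.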

48 changes: 39 additions & 9 deletions specs/telemetry.md
@@ -23,6 +23,26 @@ Observability via Sentry: tracing, error context, and business metrics. All tele
| `tracesSampleRate` | `1.0` (every transaction traced) |
| `enableLogs` | `true` (structured Sentry logs) |

### Global Attributes

Set via `Sentry.getGlobalScope().setAttributes()`. These propagate automatically to all metrics and spans.

| Attribute | Set when | Value |
|-----------|----------|-------|
| `warden.source` | `initSentry()` | `github-action` or `cli` |
| `warden.repository` | After context built | `owner/repo` (e.g. `getsentry/sentry`) |

### Trace ID

The trace ID from the root span serves as the unique run identifier. It is surfaced in:

- **CLI summary** (`-v`): Dimmed `Trace: {id}` line in the SUMMARY section at Verbose+ verbosity
- **CLI debug output** (`-vv`): `reporter.debug()` at the start of the command span (safety net if run crashes before summary)
- **Sentry structured logs**: `trace.id` field in the `Workflow initialized` log entry
- **JSONL run metadata**: `traceId` field in `JsonlRunMetadata`

Operators can use the trace ID to locate the corresponding Sentry trace for any Warden run.
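As one illustration of that lookup, an operator could pull the trace ID out of captured non-TTY output. This is a sketch under assumptions: `extractTraceId` is a hypothetical helper, the line format is taken from the non-TTY example in `specs/reporters.md`, and the ID is assumed to be a run of hexadecimal characters.

```typescript
// Matches the non-TTY summary line, e.g.
// "[2026-02-08T14:30:56.000Z] warden: Trace: <id>"
// Assumption: the ID is a run of hexadecimal characters.
const TRACE_LINE = /warden: Trace: ([0-9a-fA-F]+)/;

// Hypothetical helper: return the first trace ID found in captured log output.
function extractTraceId(logOutput: string): string | undefined {
  for (const line of logOutput.split('\n')) {
    const match = TRACE_LINE.exec(line);
    if (match) return match[1];
  }
  return undefined;
}

const sample = [
  '[2026-02-08T14:30:56.000Z] warden: Total time: 4.2s',
  '[2026-02-08T14:30:56.000Z] warden: Trace: abc123def456',
].join('\n');

console.log(extractTraceId(sample)); // abc123def456
```

The line is only emitted at `Verbose+` with Sentry initialized, so a result of `undefined` does not by itself mean the run was untraced.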

### Integrations

| Integration | Purpose |
@@ -229,16 +249,26 @@ Untagged `captureException` calls exist at top-level catch handlers in `src/cli/

Emitted via `Sentry.metrics.*`. Each function is a no-op when Sentry is not initialized and wrapped in try/catch so metrics never break the workflow.
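The no-op-plus-try/catch guard described here can be sketched in isolation. This is a standalone illustration with the Sentry calls replaced by an in-memory array; the `initialized` flag and `emitted` list are stand-ins, not the contents of `src/sentry.ts`.

```typescript
// Stand-in for the module-level flag that initSentry() would set.
let initialized = false;

// Run a metrics callback only when telemetry is initialized; swallow any error.
function safeEmit(fn: () => void): void {
  if (!initialized) return;
  try {
    fn();
  } catch {
    // Metrics must never break the main workflow.
  }
}

// Stand-in for the metrics backend.
const emitted: string[] = [];

safeEmit(() => emitted.push('before-init')); // skipped: not initialized

initialized = true;
safeEmit(() => emitted.push('workflow.runs')); // recorded
safeEmit(() => {
  throw new Error('metrics backend unavailable'); // swallowed
});

console.log(emitted); // [ 'workflow.runs' ]
```

Routing every emitter through one guard keeps the "never break the workflow" rule in a single place instead of repeating the check in each metric function.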

All metrics inherit `warden.source` and `warden.repository` from the global scope (see **Global Attributes** above). Only per-metric attributes are listed below.

### Run count (`emitRunMetric`)

| Metric | Type | Per-metric attributes |
|--------|------|-----------------------|
| `workflow.runs` | count | -- (inherits globals) |

Called once per analysis workflow execution (CLI run or GitHub Action workflow).

### Skill-level (`emitSkillMetrics`)

| Metric | Type | Attributes |
|--------|------|------------|
| `skill.duration` | distribution (ms) | `skill`, `repository`, `source` |
| `tokens.input` | distribution | `skill`, `repository`, `source` |
| `tokens.output` | distribution | `skill`, `repository`, `source` |
| `cost.usd` | distribution | `skill`, `repository`, `source` |
| `findings.total` | count | `skill`, `repository`, `source` |
| `findings` | count | `skill`, `repository`, `source`, `severity` |
| Metric | Type | Per-metric attributes |
|--------|------|-----------------------|
| `skill.duration` | distribution (ms) | `skill` |
| `tokens.input` | distribution | `skill` |
| `tokens.output` | distribution | `skill` |
| `cost.usd` | distribution | `skill` |
| `findings.total` | count | `skill` |
| `findings` | count | `skill`, `severity` |
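To make the inheritance concrete, here is a simplified model of how the attributes on one emitted metric are composed. It is a sketch only: the merge function is hypothetical and does not reproduce the Sentry SDK's internals, and the skill name and severity value are invented. The spec's global keys (`warden.*`) and per-metric keys (`skill`, `severity`) never collide, so merge precedence does not matter here.

```typescript
type Attributes = Record<string, string | number | boolean>;

// Stand-in for the global scope; in Warden these are set via setGlobalAttributes().
const globalAttributes: Attributes = {};

function setGlobalAttributes(attrs: Attributes): void {
  Object.assign(globalAttributes, attrs);
}

// Hypothetical model of what ends up attached to one emitted metric.
function resolveMetricAttributes(perMetric: Attributes): Attributes {
  return { ...globalAttributes, ...perMetric };
}

setGlobalAttributes({ 'warden.source': 'cli' }); // at init time
setGlobalAttributes({ 'warden.repository': 'getsentry/sentry' }); // after context is built

const findingsAttrs = resolveMetricAttributes({ skill: 'example-skill', severity: 'high' });
console.log(findingsAttrs);
```

Under this model a `findings` data point carries four attributes even though the emitter passes only two, which is why the tables list per-metric attributes alone.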

### Extraction (`emitExtractionMetrics`)

@@ -305,7 +335,7 @@ Called from `evaluateFixesAndResolveStale` when stale comments are resolved.

| File | Role |
|------|------|
| `src/sentry.ts` | Init, integrations, metric emission functions |
| `src/sentry.ts` | Init, integrations, global attributes, metric emission functions |
| `src/sdk/analyze.ts` | `executeQuery` (gen AI span), `analyzeFile` / `analyzeHunk` (workflow spans), extraction + retry + dedup metrics |
| `src/action/fix-evaluation/index.ts` | `evaluateFixAttempts` / per-comment spans, fix eval metrics |
| `src/action/workflow/base.ts` | `ActionFailedError` sentinel, `setFailed()` |
11 changes: 9 additions & 2 deletions src/action/workflow/pr-workflow.ts
@@ -7,7 +7,7 @@
import { readFileSync } from 'node:fs';
import { dirname, join } from 'node:path';
import type { Octokit } from '@octokit/rest';
import { Sentry, logger, emitStaleResolutionMetric } from '../../sentry.js';
import { Sentry, logger, emitStaleResolutionMetric, setGlobalAttributes, emitRunMetric } from '../../sentry.js';
import { loadWardenConfig, resolveSkillConfigs } from '../../config/loader.js';
import type { ResolvedTrigger } from '../../config/loader.js';
import type { WardenConfig } from '../../config/schema.js';
@@ -631,7 +631,14 @@ export async function runPRWorkflow(
});
}

logger.info('Workflow initialized', { 'trigger.count': matchedTriggers.length });
setGlobalAttributes({ 'warden.repository': context.repository.fullName });
emitRunMetric();

const traceId = span.spanContext().traceId;
logger.info('Workflow initialized', {
'trigger.count': matchedTriggers.length,
'trace.id': traceId,
});

if (matchedTriggers.length === 0) {
await cleanupOrphanedComments(octokit, context, inputs.anthropicApiKey);
22 changes: 18 additions & 4 deletions src/cli/main.ts
@@ -1,7 +1,7 @@
import { existsSync } from 'node:fs';
import { dirname, join, resolve } from 'node:path';
import { config as dotenvConfig } from 'dotenv';
import { Sentry, flushSentry } from '../sentry.js';
import { Sentry, flushSentry, setGlobalAttributes, emitRunMetric, getTraceId } from '../sentry.js';
import { loadWardenConfig, resolveSkillConfigs } from '../config/loader.js';
import type { SkillRunnerOptions } from '../sdk/runner.js';
import { resolveSkillAsync } from '../skills/loader.js';
@@ -138,14 +138,15 @@ async function outputResultsAndHandleFixes(
const { reports, filteredReports, hasFailure, failureReasons } = processed;

// Write JSONL output if requested (uses unfiltered reports for complete data)
const traceId = getTraceId();
if (options.output) {
writeJsonlReport(options.output, reports, totalDuration);
writeJsonlReport(options.output, reports, totalDuration, { traceId });
reporter.success(`Wrote JSONL output to ${options.output}`);
}

// Always write automatic run log for debugging
const runLogPath = getRunLogPath(repoPath);
writeJsonlReport(runLogPath, reports, totalDuration);
writeJsonlReport(runLogPath, reports, totalDuration, { traceId });
reporter.debug(`Run log: ${runLogPath}`);

// Collect fixable findings early so we know whether to suppress diffs in the report
@@ -174,7 +175,7 @@ async function outputResultsAndHandleFixes(

// Show summary (uses filtered reports for display)
reporter.blank();
reporter.renderSummary(filteredReports, totalDuration);
reporter.renderSummary(filteredReports, totalDuration, { traceId });

// Handle fixes: --fix (automatic) always runs, interactive step-through in TTY mode
if (fixableFindings.length > 0) {
@@ -267,6 +268,10 @@ async function runSkills(
skillsToRun = [];
}

// Set global telemetry context and emit run metric
setGlobalAttributes({ 'warden.repository': context.repository.fullName });
emitRunMetric();

// Handle case where no skills to run
if (skillsToRun.length === 0) {
if (options.json) {
@@ -488,6 +493,10 @@ async function runConfigMode(options: CLIOptions, reporter: Reporter): Promise<n

reporter.contextFiles(pullRequest.files);

// Set global telemetry context and emit run metric
setGlobalAttributes({ 'warden.repository': context.repository.fullName });
emitRunMetric();

// Resolve skills into triggers and match
const resolvedTriggers = resolveSkillConfigs(config, options.model);
const matchedTriggers = resolvedTriggers.filter((t) => matchTrigger(t, context, 'local'));
@@ -696,6 +705,11 @@ export async function main(): Promise<void> {
async (span) => {
span.setAttribute('cli.command', command);

const traceId = getTraceId();
if (traceId) {
reporter.debug(`Trace ID: ${traceId}`);
}

switch (command) {
case 'init':
return runInit(options, reporter);
5 changes: 4 additions & 1 deletion src/cli/output/jsonl.ts
@@ -43,6 +43,7 @@ export interface JsonlRunMetadata {
timestamp: string;
durationMs: number;
cwd: string;
traceId?: string;
}

/**
@@ -125,7 +126,8 @@ function aggregateUsage(reports: SkillReport[]): UsageStats | undefined {
export function writeJsonlReport(
outputPath: string,
reports: SkillReport[],
durationMs: number
durationMs: number,
options?: { traceId?: string }
): void {
const resolvedPath = resolve(process.cwd(), outputPath);
const timestamp = new Date().toISOString();
@@ -135,6 +137,7 @@ export function writeJsonlReport(
timestamp,
durationMs,
cwd,
traceId: options?.traceId,
};

const lines: string[] = [];
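
Because `traceId` is assigned from `options?.traceId`, it is `undefined` whenever no trace ID is passed. Assuming the run metadata is serialized with `JSON.stringify` (the serialization call is not shown in this hunk), the key would then be absent from the output rather than written as `null`. A standalone sketch, where `buildRunMetadata` is a hypothetical helper and the `cwd` and timestamp values are fixed for illustration:

```typescript
interface JsonlRunMetadata {
  timestamp: string;
  durationMs: number;
  cwd: string;
  traceId?: string;
}

// Hypothetical helper mirroring how the hunk builds the metadata object.
function buildRunMetadata(durationMs: number, options?: { traceId?: string }): JsonlRunMetadata {
  return {
    timestamp: new Date(0).toISOString(), // fixed timestamp keeps the example deterministic
    durationMs,
    cwd: '/repo',
    traceId: options?.traceId,
  };
}

console.log(JSON.stringify(buildRunMetadata(4200)));
// {"timestamp":"1970-01-01T00:00:00.000Z","durationMs":4200,"cwd":"/repo"}

console.log(JSON.stringify(buildRunMetadata(4200, { traceId: 'abc123' })));
// {"timestamp":"1970-01-01T00:00:00.000Z","durationMs":4200,"cwd":"/repo","traceId":"abc123"}
```

Readers of the JSONL file should therefore treat a missing `traceId` as "no trace available" and not expect an explicit null.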
8 changes: 7 additions & 1 deletion src/cli/output/reporter.ts
@@ -185,7 +185,7 @@ export class Reporter {
/**
* Render the summary section.
*/
renderSummary(reports: SkillReport[], totalDuration: number): void {
renderSummary(reports: SkillReport[], totalDuration: number, options?: { traceId?: string }): void {
const allFindings: Finding[] = [];
let totalFailedHunks = 0;
let totalFailedExtractions = 0;
@@ -227,6 +227,9 @@ export class Reporter {
} else {
this.log(chalk.dim(durationLine));
}
if (options?.traceId && this.verbosity >= Verbosity.Verbose) {
this.log(chalk.dim(`Trace: ${options.traceId}`));
}
} else {
this.logPlain(`Summary: ${formatFindingCountsPlain(counts)}`);
if (totalFailedHunks > 0) {
@@ -245,6 +248,9 @@ export class Reporter {
this.logPlain(`Usage: ${formatUsagePlain(totalUsage)}`);
}
this.logPlain(`Total time: ${formatDuration(totalDuration)}`);
if (options?.traceId && this.verbosity >= Verbosity.Verbose) {
this.logPlain(`Trace: ${options.traceId}`);
}
}
}

4 changes: 1 addition & 3 deletions src/cli/output/tasks.ts
@@ -432,9 +432,7 @@ export async function runSkillTask(
}

// Emit metrics and log completion
emitSkillMetrics(report, {
repository: context.repository.fullName,
});
emitSkillMetrics(report);
logger.info(logger.fmt`Skill execution complete: ${displayName}`, {
'finding.count': report.findings.length,
'duration_ms': report.durationMs,
51 changes: 38 additions & 13 deletions src/sentry.ts
@@ -6,7 +6,6 @@ import { getVersion } from './utils/index.js';
export type SentryContext = 'cli' | 'action';

let initialized = false;
let deploymentContext: SentryContext | undefined;

export function initSentry(context: SentryContext): void {
const dsn = process.env['WARDEN_SENTRY_DSN'];
@@ -26,14 +25,41 @@ export function initSentry(context: SentryContext): void {
],
});

deploymentContext = context;
Sentry.setTag('deployment.context', context);
Sentry.setTag('service.version', getVersion());
Sentry.getGlobalScope().setAttributes({
'warden.source': context === 'action' ? 'github-action' : 'cli',
});
}

export { Sentry };
export const { logger } = Sentry;

/**
* Set attributes on the global Sentry scope.
* These automatically apply to ALL metrics and spans.
*/
export function setGlobalAttributes(attrs: Record<string, string | number | boolean>): void {
if (!initialized) return;
try {
Sentry.getGlobalScope().setAttributes(attrs);
} catch {
// Never break the workflow
}
}

/**
* Get the trace ID from the active span, if available.
* Useful for correlating runs to Sentry traces in logs and output.
*/
export function getTraceId(): string | undefined {
if (!initialized) return undefined;
try {
return Sentry.getActiveSpan()?.spanContext().traceId;
} catch {
return undefined;
}
}

/**
* Run a metrics callback only when Sentry is initialized.
* Swallows errors so metrics never break the main workflow.
@@ -47,20 +73,19 @@ function safeEmit(fn: () => void): void {
}
}

export interface SkillMetricsContext {
/** Full repository name (e.g. "owner/repo") */
repository?: string;
/**
* Emit a single run count. Call once per analysis workflow execution.
* Inherits warden.source and warden.repository from global scope.
*/
export function emitRunMetric(): void {
safeEmit(() => {
Sentry.metrics.count('workflow.runs', 1);
});
}

export function emitSkillMetrics(report: SkillReport, context?: SkillMetricsContext): void {
export function emitSkillMetrics(report: SkillReport): void {
safeEmit(() => {
const attrs: Record<string, string> = { skill: report.skill };
if (context?.repository) {
attrs['repository'] = context.repository;
}
if (deploymentContext) {
attrs['source'] = deploymentContext;
}

Sentry.metrics.distribution('skill.duration', report.durationMs ?? 0, {
unit: 'millisecond',