feat(metrics): Add rework rate OTEL metrics (#265) — 5th DORA metric by tamirdresher · Pull Request #415 · bradygaster/squad

tamirdresher · 2026-03-15T14:08:01Z

Summary

Add PR rework rate as OTEL metrics in squad-sdk, following the exact pattern of the existing metrics (#261–#264) in `otel-metrics.ts`.

This replaces #381 (the standalone CLI command approach) with a proper SDK-level OTEL metric that integrates into the existing telemetry pipeline.

What's included

`packages/squad-sdk/src/runtime/rework.ts` — Pure calculation module

Typed interfaces: `PrInfo`, `PrReview`, `PrCommit`, `PrReworkResult`, `ReworkSummary`
`calculatePrRework()` — per-PR rework analysis (commits after first review)
`calculateReworkSummary()` — aggregate metrics across multiple PRs
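
To make "commits after first review" concrete, here is a minimal sketch of the two calculations. The field names (`committedAt`, `submittedAt`) and the result shapes are assumptions for illustration and may not match the actual interfaces in `rework.ts`. Pooling commits across PRs in the summary is also just one possible aggregation choice.

```typescript
// Hypothetical shapes for illustration; the real interfaces in rework.ts may differ.
interface PrCommit { sha: string; committedAt: string } // ISO 8601 timestamp
interface PrReview { state: string; submittedAt: string } // ISO 8601 timestamp
interface PrInfo { number: number; commits: PrCommit[]; reviews: PrReview[] }
interface PrReworkResult {
  prNumber: number;
  totalCommits: number;
  reworkCommits: number; // commits pushed after the first review
  reworkRate: number; // percentage, 0 to 100
}
interface ReworkSummary {
  prCount: number;
  totalCommits: number;
  reworkCommits: number;
  reworkRate: number; // pooled percentage across all PRs (an assumption)
}

function calculatePrRework(pr: PrInfo): PrReworkResult {
  const reviewTimes = pr.reviews.map((r) => Date.parse(r.submittedAt));
  // With no reviews, nothing can count as rework, so use +Infinity as the cutoff.
  const firstReview =
    reviewTimes.length > 0 ? Math.min(...reviewTimes) : Number.POSITIVE_INFINITY;
  const reworkCommits = pr.commits.filter(
    (c) => Date.parse(c.committedAt) > firstReview,
  ).length;
  const totalCommits = pr.commits.length;
  return {
    prNumber: pr.number,
    totalCommits,
    reworkCommits,
    // Guard against division by zero for PRs with no commit data.
    reworkRate: totalCommits === 0 ? 0 : (reworkCommits / totalCommits) * 100,
  };
}

function calculateReworkSummary(results: PrReworkResult[]): ReworkSummary {
  const totalCommits = results.reduce((sum, r) => sum + r.totalCommits, 0);
  const reworkCommits = results.reduce((sum, r) => sum + r.reworkCommits, 0);
  return {
    prCount: results.length,
    totalCommits,
    reworkCommits,
    reworkRate: totalCommits === 0 ? 0 : (reworkCommits / totalCommits) * 100,
  };
}
```

Keeping both functions free of I/O is what lets them be tested with plain fixture objects.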

`otel-metrics.ts` — `#265 — Rework Rate Metrics` section

Four OTEL instruments following the exact same lazy-init pattern:

`squad.rework.rate` (Gauge) — current rework rate percentage
`squad.rework.cycles` (Histogram) — review cycles per PR
`squad.rework.rejection_rate` (Gauge) — percentage of PRs with changes requested
`squad.rework.time_ms` (Histogram) — time spent in rework

Export functions: `recordReworkMetrics(result)` and `recordReworkSummary(summary)`
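
A rough sketch of what a lazy-init pattern for these four instruments could look like. `MeterLike` is a hypothetical, minimal interface modeled on the `createGauge` / `createHistogram` methods of the OpenTelemetry `Meter`, so the example runs without the OTEL packages. Passing the meter as a parameter, and the split of which function records which instrument, are assumptions rather than the SDK's actual signatures.

```typescript
// Hypothetical minimal subset of a meter; real code would presumably obtain
// a Meter from @opentelemetry/api instead of using this interface.
interface Recorder { record(value: number): void }
interface InstrumentOptions { description?: string; unit?: string }
interface MeterLike {
  createGauge(name: string, options?: InstrumentOptions): Recorder;
  createHistogram(name: string, options?: InstrumentOptions): Recorder;
}

interface ReworkInstruments {
  rate: Recorder;
  cycles: Recorder;
  rejectionRate: Recorder;
  timeMs: Recorder;
}

// Module-level cache: instruments are created on first use, then reused.
let reworkInstruments: ReworkInstruments | undefined;

function ensureReworkInstruments(meter: MeterLike): ReworkInstruments {
  if (!reworkInstruments) {
    reworkInstruments = {
      rate: meter.createGauge("squad.rework.rate", {
        description: "Current rework rate percentage", unit: "%" }),
      cycles: meter.createHistogram("squad.rework.cycles", {
        description: "Review cycles per PR" }),
      rejectionRate: meter.createGauge("squad.rework.rejection_rate", {
        description: "Percentage of PRs with changes requested", unit: "%" }),
      timeMs: meter.createHistogram("squad.rework.time_ms", {
        description: "Time spent in rework", unit: "ms" }),
    };
  }
  return reworkInstruments;
}

// Per-PR values go to the histograms (assumed split).
function recordReworkMetrics(
  meter: MeterLike,
  result: { reviewCycles: number; reworkTimeMs: number },
): void {
  const i = ensureReworkInstruments(meter);
  i.cycles.record(result.reviewCycles);
  i.timeMs.record(result.reworkTimeMs);
}

// Aggregate values go to the gauges (assumed split).
function recordReworkSummary(
  meter: MeterLike,
  summary: { reworkRate: number; rejectionRate: number },
): void {
  const i = ensureReworkInstruments(meter);
  i.rate.record(summary.reworkRate);
  i.rejectionRate.record(summary.rejectionRate);
}

// Clears the cache so tests can start from a clean state.
function resetReworkInstruments(): void {
  reworkInstruments = undefined;
}
```

Deferring creation to the first record call means importing the module has no side effects, and the reset function gives tests a way to verify instrument creation in isolation.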

Tests — 19 Vitest tests (all passing)

9 tests for `calculatePrRework` (edge cases: no reviews, post-review commits, multiple cycles, nested commit objects, empty data, missing author, filtered reviews, rework time)
4 tests for `calculateReworkSummary` (empty, aggregate, all-clean, single high-rework)
6 tests for OTEL metric recording (`recordReworkMetrics`, `recordReworkSummary`, instrument creation, reset)

Changeset

`@bradygaster/squad-sdk: minor` — Add rework rate OTEL metrics (#265)

Metrics tracked (the 5th DORA metric)

| Metric | Description |
| --- | --- |
| Rework Rate | commits after first review / total commits × 100 |
| Review Cycles | changes-requested → approved transitions |
| Rejection Rate | PRs with ≥1 changes requested / total PRs × 100 |
| Rework Time | last approval − first changes-requested |
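
The three review-based definitions in the table can be sketched as follows. This is an illustrative reading of the formulas, not the code from `rework.ts`: the `ReviewEvent` shape is hypothetical, and returning 0 when a PR has no changes-requested or no approval is an assumption.

```typescript
type ReviewState = "APPROVED" | "CHANGES_REQUESTED" | "COMMENTED";
// Hypothetical review shape; submittedAt is an ISO 8601 timestamp.
interface ReviewEvent { state: ReviewState; submittedAt: string }

// Review Cycles: count changes-requested -> approved transitions in time order.
function countReviewCycles(reviews: ReviewEvent[]): number {
  const ordered = [...reviews].sort(
    (a, b) => Date.parse(a.submittedAt) - Date.parse(b.submittedAt),
  );
  let cycles = 0;
  let changesPending = false;
  for (const review of ordered) {
    if (review.state === "CHANGES_REQUESTED") {
      changesPending = true;
    } else if (review.state === "APPROVED" && changesPending) {
      cycles += 1;
      changesPending = false;
    }
  }
  return cycles;
}

// Rework Time: last approval minus first changes-requested, in milliseconds.
function reworkTimeMs(reviews: ReviewEvent[]): number {
  const times = (state: ReviewState): number[] =>
    reviews.filter((r) => r.state === state).map((r) => Date.parse(r.submittedAt));
  const requested = times("CHANGES_REQUESTED");
  const approved = times("APPROVED");
  if (requested.length === 0 || approved.length === 0) return 0;
  // Clamp to 0 when the last approval predates the first change request.
  return Math.max(0, Math.max(...approved) - Math.min(...requested));
}

// Rejection Rate: PRs with at least one changes-requested review / total PRs x 100.
function rejectionRate(reviewsPerPr: ReviewEvent[][]): number {
  if (reviewsPerPr.length === 0) return 0;
  const rejected = reviewsPerPr.filter((reviews) =>
    reviews.some((r) => r.state === "CHANGES_REQUESTED"),
  ).length;
  return (rejected / reviewsPerPr.length) * 100;
}
```

For a PR reviewed as changes-requested at 10:00, approved at 11:00, changes-requested at 12:00, and approved at 13:30, this yields 2 review cycles and a rework time of 3.5 hours.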

Refs #265

…ORA metric

Add PR rework rate as OTEL metrics in squad-sdk, following the exact pattern of existing metrics (bradygaster#261-bradygaster#264) in otel-metrics.ts.

New modules:
- runtime/rework.ts: Pure calculation module with typed interfaces (PrInfo, PrReview, PrCommit, PrReworkResult, ReworkSummary) and functions (calculatePrRework, calculateReworkSummary)
- otel-metrics.ts bradygaster#265 section: Four OTEL instruments
  - squad.rework.rate (Gauge) — rework rate percentage
  - squad.rework.cycles (Histogram) — review cycles per PR
  - squad.rework.rejection_rate (Gauge) — rejection percentage
  - squad.rework.time_ms (Histogram) — time spent in rework
- Export functions: recordReworkMetrics(), recordReworkSummary()

Includes 19 Vitest tests (9 for calculatePrRework, 4 for calculateReworkSummary, 6 for OTEL metric recording).

Replaces bradygaster#381 (CLI command approach) with proper SDK-level OTEL metrics.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

tamirdresher · 2026-03-15T14:21:06Z

Follow-up Ideas for Rework Rate

Three enhancements that would make this metric self-sustaining:

1. GitHub Actions Cron — Automated Assessment

A scheduled workflow (daily/weekly) that runs the calculation without any CLI session. Calls gh to get recent merged PRs, computes rework metrics, and:

Emits OTEL metrics to a configured collector
Writes results to .squad/monitoring/rework-metrics.json
If rework rate exceeds threshold, creates an issue
Could feed into Ralph's work-check cycle
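
As a rough illustration of the threshold step, the scheduled job could reduce to a pure function like the one below. All names here (`ReworkSnapshot`, `assessRework`, the issue wording) are hypothetical. Fetching PRs with `gh`, writing the JSON file, and actually creating the issue are left to the workflow and are not shown.

```typescript
// Hypothetical snapshot produced by one scheduled run.
interface ReworkSnapshot {
  generatedAt: string; // ISO 8601 timestamp of the run
  prCount: number; // merged PRs included in the window
  reworkRate: number; // percentage, 0 to 100
}
interface IssueDraft { title: string; body: string }
interface Assessment {
  reportJson: string; // content intended for .squad/monitoring/rework-metrics.json
  issue: IssueDraft | null; // non-null only when the threshold is exceeded
}

function assessRework(snapshot: ReworkSnapshot, thresholdPercent: number): Assessment {
  const reportJson = JSON.stringify(snapshot, null, 2);
  // "Exceeds" is read strictly: a rate equal to the threshold does not trigger.
  if (snapshot.reworkRate <= thresholdPercent) {
    return { reportJson, issue: null };
  }
  return {
    reportJson,
    issue: {
      title: `Rework rate ${snapshot.reworkRate.toFixed(1)}% exceeds ${thresholdPercent}% threshold`,
      body: `Computed over ${snapshot.prCount} merged PRs at ${snapshot.generatedAt}.`,
    },
  };
}
```

Keeping the decision separate from the side effects means the same function could serve both a cron workflow and Ralph's work-check cycle.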

2. Ralph Integration — Self-Correction Loop

Wire into Ralph's weekly retro: high rework = flag it, Lead runs retro, team adjusts. Low rework sustained over N weeks = reduce review gates, lower polling, auto-promote skill confidence. The "nap" concept: a well-calibrated squad can relax its guardrails.
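
One way to picture the loop is as a small decision function over weekly rework rates. This is only a sketch of the idea: `RetroPolicy`, the action names, and the thresholds are hypothetical and not part of Ralph today.

```typescript
type RetroAction = "flag-for-retro" | "relax-guardrails" | "hold";

// Hypothetical policy knobs; actual values would need tuning per squad.
interface RetroPolicy {
  highThreshold: number; // latest weekly rate above this triggers a retro
  lowThreshold: number; // rates below this count as "low rework"
  sustainedWeeks: number; // N consecutive low weeks needed before relaxing
}

// weeklyRates is ordered oldest to newest, each value a percentage.
function decideRetroAction(weeklyRates: number[], policy: RetroPolicy): RetroAction {
  const latest = weeklyRates[weeklyRates.length - 1];
  if (latest === undefined) return "hold"; // no data yet
  if (latest > policy.highThreshold) return "flag-for-retro";
  const recent = weeklyRates.slice(-policy.sustainedWeeks);
  const sustainedLow =
    recent.length === policy.sustainedWeeks &&
    recent.every((rate) => rate < policy.lowThreshold);
  return sustainedLow ? "relax-guardrails" : "hold";
}
```

With a policy of `{ highThreshold: 30, lowThreshold: 15, sustainedWeeks: 3 }`, the history `[35, 12, 9, 8]` relaxes guardrails because the last three weeks are all below 15, while `[10, 10, 45]` flags a retro.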

3. Azure DevOps Support

Current impl uses gh CLI. For ADO repos, need az repos pr list or ADO REST API. The calculation logic in rework.ts is provider-agnostic — just needs PrInfo, PrReview, PrCommit shaped data. An ADO adapter that maps to these interfaces would make it work across both platforms.
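
A sketch of what such a mapping might involve. Azure DevOps reports reviewer votes as integers (10 approved, 5 approved with suggestions, 0 no vote, -5 waiting for author, -10 rejected). The `AdoVote` input shape, the `PrReview` fields, and the choice to treat both negative votes as changes requested are assumptions for illustration.

```typescript
// Hypothetical target shape; the real PrReview in rework.ts may differ.
interface PrReview {
  author: string;
  state: "APPROVED" | "CHANGES_REQUESTED";
  submittedAt: string; // ISO 8601 timestamp
}

// Hypothetical input: one reviewer vote plus the time it was cast,
// assembled by the caller from whatever ADO data source is used.
interface AdoVote { reviewer: string; vote: number; votedAt: string }

function mapAdoVote(v: AdoVote): PrReview | null {
  if (v.vote > 0) {
    // 10 (approved) and 5 (approved with suggestions)
    return { author: v.reviewer, state: "APPROVED", submittedAt: v.votedAt };
  }
  if (v.vote < 0) {
    // -5 (waiting for author) and -10 (rejected), both treated as rework signals
    return { author: v.reviewer, state: "CHANGES_REQUESTED", submittedAt: v.votedAt };
  }
  return null; // 0 means no vote, so there is no review to record
}

function mapAdoVotes(votes: AdoVote[]): PrReview[] {
  return votes
    .map(mapAdoVote)
    .filter((review): review is PrReview => review !== null);
}
```

Because the output matches the review shape the calculation consumes, the rework functions themselves would not need to know which platform the data came from.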

Happy to contribute any of these as follow-up PRs.

tamirdresher · 2026-03-15T14:34:02Z

Note: CI failure is a pre-existing issue on dev branch, not from this PR.

The CLI build fails on missing SDK exports (listRoles, searchRoles, getCategories, getRoleById, generateCharterFromRole) in roles.ts, cast.ts, and coordinator.ts. These imports exist in the CLI but the SDK doesn't export them yet. This breaks any PR targeting dev right now.

Our changes (squad-sdk only) compile clean - the rework.ts module and otel-metrics.ts additions type-check without errors.

Re: ADO support - the calculation in rework.ts is already provider-agnostic (just needs PrInfo/PrReview/PrCommit data). Rather than shipping adapter classes, this is better handled at the agent/skill layer - the Squad coordinator already knows how to call gh vs az based on repo context. A skill template documenting both data sources is sufficient.

* chore(squad): Phase 2 launch — thinking feedback, P0 bugs, dual telemetry

  Phase 1 complete: 5 issues closed (bradygaster#325, bradygaster#326, bradygaster#327, bradygaster#328, bradygaster#329), 5 PRs merged. Phase 2 launched with Cheritto (thinking feedback), Hockney (P0 bugs), Saul (dual telemetry). Decision inbox merged and archived.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): Phase 2 Wave 1 merged, Wave 2 launched

  Session: 2026-02-23T2145-phase2-wave2

  Phase 2 Wave 1 complete (PRs bradygaster#351, bradygaster#352, bradygaster#353 merged). Wave 2 launched: Cheritto on ghost response detection (bradygaster#332), Hockney on error hardening (bradygaster#334).

  Changes:
  - Session log created: 2026-02-23T2145-phase2-wave2.md
  - Merged 3 inbox decisions (Cheritto, Hockney, Saul)
  - Deleted inbox files post-merge

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): Epic bradygaster#323 complete — all phases shipped 🎉

  All 3 phases delivered:
  - Phase 1 (Testing Wave): 6 issues closed
  - Phase 2 (Improvement): 6 issues closed
  - Phase 3 (Breathtaking): 7 issues closed
  - 17 PRs merged, 19 issues closed total

  Session log: 2026-02-23T2320-epic-complete.md
  Decisions merged from inbox: P2 UX Polish, first-run wow moment

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* hostile QA: end-to-end quality assessment — 10 findings, 4 HIGH severity

  Candid assessment requested by Brady. Traced every code path in cli-entry.ts, shell/index.ts, shell/commands.ts, App.tsx, coordinator.ts, spawn.ts, and the SDK adapter client.

  Key findings:
  - Dead sessions never evicted from agentSessions Map after connection drop
  - No React ErrorBoundary — any render throw kills the shell
  - Nasty-inputs corpus (95 strings) is never imported by any test
  - No SIGTERM handler in interactive shell
  - MemoryManager exported but never instantiated (dead code)
  - Single streaming content slot clobbers multi-agent output
  - User input silently dropped during processing (no type-ahead buffer)

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): quality review findings — 7 issues filed

  Quality audit complete: 5 agents assessed CLI across testing, coverage, stability, accessibility, UX.

  Results: 4 P0 blockers (bradygaster#365–bradygaster#368), 3 P1 items (bradygaster#369–bradygaster#371).
  Blocking: Waingro dead sessions, ErrorBoundary, dropped input; Marquez help text consistency.

  Changes:
  - Logged session summary to .squad/log/2026-02-24T0205-quality-review-complete.md
  - Updated .squad/identity/now.md with quality review findings and new issue numbers

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): merge decision — Marquez UX audit findings

  Quality assessment merged from inbox (Grade B): 11 improvements (3 P0, 4 P1, 4 P2). help text, stub commands, vocabulary, separators, roster.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(squad): test sprint launch

  Session: 2026-02-24T0210-test-sprint

  Changes:
  - Logged test sprint: 5 agents, 7+ issues
  - Branches: P0 fixes, stale tests, E2E, hostile/SDK, A11y

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: correct ThinkingIndicator assertion to match component behavior

  The ThinkingIndicator renders empty string when isThinking=false, not 'No agents active'. Fix the test assertion.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>


* docs: fill content gaps from 7 recent PRs (#442, #429, #424, #417, #415, #412, #411)

  Documents features and changes from recent PRs that shipped without corresponding docs updates:
  - #429: Update model catalog with Sonnet 4.6, Opus 4.6, GPT-5.4 defaults
  - #424: Document --sdk switch for TypeScript config generation
  - #412: Document --roles flag for opt-in base roles
  - #411: Note Ralph in init + @copilot routing template removal
  - #442: Add Session Recovery skill documentation
  - #417: Document CastingEngine character casting
  - #415: Add rework rate OTEL metrics reference

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: remove duplicate gpt-5.1-codex-mini from Fast/Cheap tier

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

tamirdresher force-pushed the feat/rework-rate-otel-metric branch from 0f44a19 to c654cf7 Compare March 15, 2026 14:33

bradygaster merged commit a1fcdb8 into bradygaster:dev Mar 15, 2026
1 of 2 checks passed

bradygaster mentioned this pull request Mar 15, 2026

feat(metrics): Add rework rate CLI command (5th DORA metric) #381

Closed

diberry mentioned this pull request Mar 19, 2026

docs: fill content gaps from 7 recent PRs #458

Merged

bradygaster merged 1 commit into bradygaster:dev from tamirdresher:feat/rework-rate-otel-metric
