agent-retro

Compatibility: Claude Code only (for now). This skill reads Claude Code's JSONL session transcripts. Support for other agents (Gemini CLI, Codex, Cursor, etc.) is on the roadmap.

A session retrospective skill for AI coding agents. Run /agent-retro at the end of a session to analyze what happened, identify friction, and get concrete suggestions for improving your skills, rules, and workflows.

Follows the Agent Skills open standard.

What it does

/agent-retro reads your session transcript from disk and produces a structured analysis:

Full conversation arc — every user message and assistant response, in order. No sampling, no skipping.
Token cost breakdown — per-agent attribution so you can see where the budget went.
Tool result waste detection — flags oversized tool results (a 45KB file read that was never used is money burned).
Friction analysis — identifies user corrections, redirects, and abandoned approaches, then traces each to a root cause.
Actionable proposals — specific edits to skills, rules, or config. Not vague "improve X" — the actual text to change.

The output is a markdown retro file plus an interactive walkthrough where you approve or defer each proposed action.

Why this exists

Every agent framework has inline reflection (retry loops, critic agents). None of them do post-session systemic reflection — looking across an entire conversation to find patterns of failure and proposing configuration changes to prevent them next time.

This is the equivalent of an agile sprint retrospective, but for a single AI session, producing machine-editable artifacts rather than sticky notes on a board. See references/design.md for the full design rationale and comparison with Reflexion, LangGraph, CrewAI, and others.

Install

Recommended (works with any Agent Skills-compatible tool):

npx skills add giannimassi/agent-retro

Or clone manually into your skills directory:

# Claude Code
git clone https://github.com/giannimassi/agent-retro.git ~/.claude/skills/agent-retro

# Gemini CLI (when supported)
# git clone https://github.com/giannimassi/agent-retro.git ~/.gemini/skills/agent-retro

Usage

At the end of any session:

/agent-retro

The skill will:

Find your current session's transcript
Extract structured data (streaming, no full file load)
Analyze the conversation arc for friction patterns
Classify what the session produced
Propose concrete improvements
Walk you through each proposal for approval

Extraction script standalone

The Python extraction script can be used independently:

# Quick session verification (reads only first/last 64KB)
python3 scripts/extract.py <session.jsonl> --metadata-only

# Compact extraction (tool counts, no individual calls)
python3 scripts/extract.py <session.jsonl> --summary

# Full extraction (includes every tool call detail)
python3 scripts/extract.py <session.jsonl>

How it works

Token-efficient extraction

The extraction script (scripts/extract.py) minimizes token usage:

Streaming — processes JSONL line-by-line, never loads the full file into memory
No tool result content — tracks result sizes without including the content. A 50MB session transcript produces ~30KB of extraction output.
Head/tail metadata — --metadata-only reads only the first and last 64KB of the file for session verification, borrowing a technique from Claude Code's internal readSessionLite function.
Full conversation arc — all user messages and assistant text are preserved. The arc is the whole point of a retro — you can't analyze friction you can't see.

What the extraction captures

Field	What	Why
`session`	ID, cwd, branch, duration	Context
`tokens`	Input/output/cache totals + USD cost	Budget analysis
`tools`	Call counts or full call list	Pattern detection
`agents`	Each dispatch with type, model, cost	Delegation efficiency
`skills`	Each invocation with args	Skill triggering analysis
`git`	Branches, commits, PR operations	What was shipped
`files`	Files read/written/edited	Scope tracking
`conversation_arc`	Full timeline of messages	Friction detection
`tool_result_sizes`	Per-tool total/avg/max bytes	Waste detection

Example output

See examples/sample-retro.md for a retro from a real session.

Roadmap

This skill currently only works with Claude Code. The analysis steps (friction detection, root cause tracing, action proposals) are agent-agnostic — only the transcript reading is Claude-specific.

Planned support:

Agent	Transcript format	Status
Claude Code	`~/.claude/projects/<cwd>/<session>.jsonl`	Supported
Gemini CLI	`~/.gemini/sessions/`	Planned
OpenAI Codex	TBD	Planned
Cursor	TBD	Planned
Roo Code	TBD	Planned

Contributions welcome — if you know where another agent stores its session data, open an issue.

Requirements

Python 3.8+ (stdlib only, no dependencies)
Currently: Claude Code v2.1.59+

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
examples		examples
references		references
scripts		scripts
tests		tests
LICENSE		LICENSE
README.md		README.md
SKILL.md		SKILL.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

agent-retro

What it does

Why this exists

Install

Usage

Extraction script standalone

How it works

Token-efficient extraction

What the extraction captures

Example output

Roadmap

Requirements

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 1

Languages

Folders and files

Latest commit

History

Repository files navigation

agent-retro

What it does

Why this exists

Install

Usage

Extraction script standalone

How it works

Token-efficient extraction

What the extraction captures

Example output

Roadmap

Requirements

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 1

Languages

Packages