armory

Curated, production-grade skills, agents, hooks, rules, commands, utilities, and presets for AI coding agents. No magic, no demos — battle-tested workflows built for developers who use AI seriously.

Overview

armory is a collection of packages for Claude Code and Claude.ai. Each package is a self-contained prompt or automation unit that extends Claude with a repeatable, opinionated workflow for a specific task domain. Packages span seven types: skills, agents, hooks, rules, commands, utilities, and presets.

Philosophy: Packages in this collection are practical and context-free. They define the how, not just the what — covering inputs, outputs, edge cases, and failure modes. They are tested in real workloads, not constructed as examples.

Intended for developers who treat AI coding agents as a serious part of their workflow.

Package Catalog

Agents — Orchestrators

Orchestrator agents compose skills and other agents into multi-phase workflows. Each can run solo or be spawned by another agent via the Agent tool.

Agent	Model	Description
team-lead	opus	Meta-orchestrator — decomposes multi-domain requests, delegates to specialized agents, synthesizes results
codebase-auditor	sonnet	Unified quality assessment — spawns code-reviewer, security-reviewer, secret-scanner in parallel, merges report
project-architect	opus	Phased requirements discovery producing architecture documents with diagrams and tech stack justification
project-planner	sonnet	Task decomposition with dependency mapping, three-point estimates, milestone timelines, and risk logs
research-analyst	opus	Multi-source investigation with parallel agents across web, academic, video, and competitive sources
idea-scout	opus	Business idea validation — Lean Canvas, parallel market/competitive/feasibility research, weighted scorecard
full-stack-builder	opus	End-to-end implementation from spec — scaffolding, sprints, quality passes, documentation, pre-delivery review
release-captain	sonnet	Ship lifecycle with quality gates — pre-flight, secret scan, changelog, version bump, PR creation
proposal-writer	opus	Technical proposals with ROI calculations, three-tier pricing, and Problem-Agitate-Solve framing
content-strategist	sonnet	Multi-channel content creation with per-channel adaptation and automated quality passes
media-producer	sonnet	Visual and video format router — selects the right skill based on concept type and output needs

Agents — Analyzers

Agent	Model	Description
code-reviewer	sonnet	Multi-phase code review with severity-ranked findings
security-reviewer	sonnet	OWASP Top 10 vulnerability scanning
secret-scanner	haiku	Pre-commit detection of hardcoded credentials
test-engineer	sonnet	Co-evolutionary skill evolution with generate-verify-refine loops

Model routing: Agents marked opus run on Claude Opus 4.7 with xhigh effort by default in Claude Code. Reserve max effort for genuinely hard, novel problems, since it brings diminishing returns and a risk of overthinking. Drop to high effort when running concurrent sessions or for cost-sensitive work. Opus 4.7 uses adaptive thinking, so there is no fixed thinking budget to tune.

Skills — Development & Tooling

Skill	Description
agent-builder	Build AI agents using the Claude Agent SDK and headless CLI mode — covers tool definitions, MCP servers, and programmatic orchestration
github	GitHub CLI operations via `gh` — issues, PRs, CI/Actions, releases, search, REST/GraphQL API, with error handling and automation workflows
filesystem	File and directory operations via Claude Code built-in tools — replaces the Filesystem MCP server with native Read, Write, Edit, Glob, Grep
mcp-to-skill	Convert MCP servers into on-demand skills to reduce active context window token usage
gpu-optimizer	GPU optimization for consumer GPUs (8-24GB VRAM) — PyTorch, XGBoost, CuPy/RAPIDS, memory management, and CUDA tuning
tavily	AI-optimized web search and content extraction via Tavily API with structured output parsing
test-harness	Comprehensive pytest suite generation — happy path, edge cases, error conditions, fixtures, mocks, async, parametrized tests
debug-investigator	Systematic debugging framework — hypothesis-driven investigation with bisection, log analysis, instrumentation, and minimal reproduction
to-markdown	Convert any file or URL to clean Markdown via MarkItDown — PDF, DOCX, XLSX, PPTX, HTML, images, audio, CSV, JSON, XML, YouTube, EPub
web-fetch	Web content fetching via curl and WebFetch — replaces the Fetch MCP server with native HTTP operations and jq parsing
lightpanda-browser	Lightweight headless browser automation via Lightpanda + agent-browser CDP — 9x lower memory, 11x faster, for scraping, DOM extraction, and form automation
skill-library	Agent-native catalog for browsing, installing, updating, syncing, and removing armory skills from within a Claude Code session
env-validator	Validate `.env` files against project requirements — missing vars, type mismatches, insecure defaults, `.env.example` drift

Skills — Research & Analysis

Skill	Description
literature-review	Systematic literature review — search, screen, extract, and synthesize academic research with gap analysis and structured citations
youtube-search	Search YouTube by keyword via yt-dlp — returns structured metadata (title, URL, channel, views, duration, date) for discovery and source curation
youtube-analysis	YouTube video transcript extraction and structured concept analysis — multi-level summaries, key concepts, takeaways, no API keys required
notebooklm	Google NotebookLM automation via notebooklm-py — create notebooks, add sources, chat, generate podcasts, videos, infographics, quizzes, flashcards, and more
research-critique	Critical analysis of research papers — methodology evaluation, claims-evidence alignment, contribution assessment with collegial analytical posture
immune	Hybrid adaptive memory with Cheatsheet (positive patterns) and Immune (negative patterns) — Hot/Cold tiered memory, multi-domain antibody scanning, auto-learning

Skills — Review & Quality

Skill	Description
architecture-reviewer	Architecture reviews across 7 scored dimensions — structural integrity, scalability, security, performance, enterprise readiness, operations, data
code-refiner	Deep code simplification and refactoring — structural complexity analysis, anti-pattern detection, idiomatic rewrites across Python, Go, TS, Rust
pr-review	Diff-based PR review across 5 dimensions — code quality, test coverage, silent failures, type design, comment quality with severity-ranked output
pre-landing-review	Gate-oriented safety audit with two-pass severity triage — CRITICAL (SQL, races, trust) blocks landing, INFORMATIONAL is advisory
plan-review	Pre-implementation plan audit stress-testing scope, assumptions, risks, and failure modes with product and engineering lenses
manuscript-review	Pre-publication manuscript audit with 24 diagnostic dimensions, citation hygiene, and cross-element coherence
manuscript-provenance	Computational provenance audit verifying every number, table, and figure in a manuscript traces back to code
repo-sentinel	Security audit and enforcement for public repos — 12 attack surfaces, pre-release readiness, history scrubbing, CI gates
package-evaluator	Evaluate package quality across 6 weighted dimensions with type-specific signals — frontmatter, triggers, structure, depth, consistency, compliance
devils-advocate	Challenges AI-generated plans, code, designs, and decisions — pre-mortem, inversion, Socratic questioning with steel-manning and clear verdicts
dependency-audit	Dependency risk assessment — license compliance, maintenance health scoring, CVE detection, bloat identification, supply chain analysis
qa-systematic	Systematic web QA testing with 8-category health scoring, issue taxonomy, and regression tracking — full, quick, and regression modes
ux-expert	UX audit and redesign for B2B SaaS dashboards — 8-dimension analysis, wireframes, component recommendations, severity-ranked findings

Skills — Visualization & Documents

Skill	Description
architecture-diagram	Layered architecture diagrams as self-contained HTML with inline SVG icons and CSS Grid layout
concept-to-image	Turn concepts into polished HTML visuals, export as PNG or SVG
concept-to-video	Turn concepts into animated explainer videos using Manim — MP4/GIF output with audio overlay, templates, multi-scene
remotion-video	Production motion graphics using Remotion (React) — branded content, data-driven video, audio sync, TailwindCSS
html-presentation	Convert documents and outlines into self-contained HTML slide presentations
static-web-artifacts-builder	Self-contained interactive HTML artifacts — infographics, dashboards, diagrams
md-to-pdf	Markdown to styled PDF with Mermaid diagrams, KaTeX math, and syntax highlighting

Skills — Documentation & Release

Skill	Description
changelog-composer	Structured changelogs from git history — conventional commit parsing, audience filtering, breaking change detection
ship-workflow	Automated release pipeline — merge main, run tests, pre-landing review, version bump, changelog, bisectable commits, PR
engineering-retro	Git-based engineering retrospective — commit analysis, velocity metrics, session patterns, health scoring over time windows
adr-writer	Architecture Decision Records — context capture, alternatives analysis, consequence projection, status lifecycle
api-docs-generator	API documentation audit and enhancement — FastAPI docstrings, Pydantic examples, OpenAPI spec enrichment, coverage reports

Skills — Backend & Data

Skill	Description
sql-optimizer	SQL performance analysis — EXPLAIN interpretation, anti-pattern detection, index recommendations, rewrites
migration-risk-analyzer	Database migration risk assessment — lock analysis, downtime estimation, rollback strategies, validation
benchmark-runner	Structured benchmark design — metric selection, test case matrix, environment capture, statistical rigor

Skills — Business Validation

Skill	Description
idea-validator	Full business idea validation orchestrator — Lean Canvas, JTBD, parallel market/competitive/feasibility agents, SWOT/PESTLE, weighted scoring
market-analyzer	Market sizing and trend analysis — TAM/SAM/SOM calculation, Rogers adoption curve, data triangulation, timing assessment
competitive-analyzer	Competitive landscape analysis — Porter's Five Forces, feature/pricing matrices, positioning maps, moat taxonomy
feasibility-assessor	Financial and technical feasibility — unit economics (CAC/LTV), revenue modeling, break-even, technical risk scoring, build-vs-buy

Skills — AI/ML & Planning

Skill	Description
prompt-lab	Systematic prompt engineering — variant generation, evaluation rubrics, failure mode analysis, test suites
rag-auditor	RAG pipeline evaluation — retrieval metrics, generation quality, failure taxonomy, diagnostic queries
task-decomposer	Feature decomposition — phased task breakdown, dependency mapping, edge case enumeration, sizing
estimate-calibrator	Calibrated three-point estimates — PERT ranges, unknown identification, confidence intervals, bias correction

Skills — Writing

Skill	Description
humanize	Detect and remove AI-generated writing patterns — 24 lexical patterns + 12 statistical signals, 6 domain profiles, 5-phase pipeline with semantic preservation
linkedin-post-style	Write LinkedIn posts in a specific technical voice with visual companion support — carousels via md-to-pdf, images via concept-to-image, video via concept-to-video

Skills — Skill Evolution (EvoSkills)

Skill	Description
paper-to-skill	Convert research papers into executable skill packages via methodology extraction and co-evolutionary refinement
skill-distiller	Distill Opus-quality skill packages into deterministic, Haiku-executable workflows via trace-driven distillation
surrogate-verifier	Information-isolated verification generating structured test assertions and failure diagnostics for skills

Research lineage: the EvoSkills pipeline (arXiv 2604.01687) handles offline co-evolutionary refinement. The immune skill together with armory's auto-memory system implements the stateful-prompt concept from Memento-Skills (arXiv 2603.18743) — the read-write reflective loop for continual learning without parameter updates.

Skills — Deprecated

Skills below are superseded by base model capabilities. They remain installable but receive no further updates.

Skill	Reason
doc-condenser	Base model handles summarization natively
regex-builder	Base model generates regex at equivalent quality
sequential-thinking	Base model handles chain-of-thought natively

Rules

Rule	Description
commit-standards	Conventional commit format, branch naming
test-standards	Coverage thresholds, test quality requirements
security-standards	Secret management, input validation, auth
token-efficiency	Token-efficient tool usage patterns

Commands

Command	Description
tdd	Test-driven development workflow
security-scan	Security vulnerability audit
refactor	Code simplification workflow
evolve	Co-evolutionary skill generation

Hooks

Hook	Description
git-protection	Block dangerous git operations
pre-edit-backup	Backup files before edits
cost-tracker	Log session cost/token usage
anatomy-index	Maintain project file index with token estimates
read-dedup	Warn on duplicate file reads within a session
prompt-context	Inject text file as additionalContext on every prompt

Utilities

Utility	Description
arxiv-search	Search arXiv for papers, output structured JSON metadata
dependency-tree	Visualize project dependency graph
test-coverage-report	Coverage summary for changed files

Presets

Presets install curated bundles of passive packages (rules, hooks, commands) in one command. For active workflow orchestration, use agents instead.

Preset	Packages	Description
core	3 skills, 1 hook, 1 rule	Baseline review-commit lifecycle. Start here.
sec-strict	5 skills, 3 agents, 2 rules, 2 hooks, 1 command	Audit-grade security stack with codebase-auditor. Superset of `core`.
python-strict	4 skills, 2 agents, 3 rules, 2 hooks, 2 commands	Full Python enforcement — TDD, type checking, test coverage, security standards.
ai-builder	6 skills	AI/ML development toolkit — agent building, prompt engineering, GPU optimization, RAG auditing.
skill-evolution	6 skills, 1 agent, 1 command	EvoSkills pipeline — co-evolutionary skill factory with paper-to-skill, distillation, and verification.
terse-mode	1 hook	Terse output enforcement via prompt-context hook with compaction-immune rule injection.

Deprecated Presets

These presets are superseded by orchestrator agents, which provide autonomous workflow orchestration instead of manual skill invocation.

Preset	Replacement
~~biz-validation~~	`idea-scout` agent
~~media-craft~~	`media-producer` agent
~~content-ops~~	`content-strategist` agent
~~research~~	`research-analyst` agent
~~eng-ops~~	`release-captain` + `full-stack-builder` agents

Installation

Option 1 — Skills CLI (recommended)

Install any package directly using npx skills:

# Install all packages
npx skills add Mathews-Tom/armory

# Install a specific skill or agent
npx skills add Mathews-Tom/armory -s architecture-reviewer
npx skills add Mathews-Tom/armory -s codebase-auditor

# List available packages without installing
npx skills add Mathews-Tom/armory -l

Option 2 — Profile installer

git clone https://github.com/Mathews-Tom/armory.git
cd armory

# Install by profile
just install-profile core
just install-profile python-strict

# Install by type
uv run scripts/install.py --type skills
uv run scripts/install.py --type agents

# Interactive TUI
uv run scripts/install.py

The interactive TUI displays a version-aware table of all packages, detects installed versions, and lets you select which to install or upgrade. Profiles install curated bundles of packages across all types.

Option 3 — Claude Code plugin marketplace (skills, agents, commands only)

# From a shell: register the marketplace
claude plugin marketplace add Mathews-Tom/armory

# Inside a Claude Code session: install the plugin
/plugin install armory

This uses Claude Code's native plugin system and loads a subset of armory's catalog.

Package type	Supported via plugin marketplace
skills	✅ yes
agents	✅ yes
commands	✅ yes
hooks	❌ no — requires `npx skills` or the profile installer
rules	❌ no — armory-specific type, not a Claude Code plugin concept
utilities	❌ no — armory-specific type, not a Claude Code plugin concept
presets	❌ no — use `just install-profile` instead

For the full catalog across all seven package types, use Option 1 (Skills CLI) or Option 2 (profile installer).

Option 4 — Manual

Clone the repo and symlink individual package folders:

git clone https://github.com/Mathews-Tom/armory.git

# Create the target directories if they do not exist yet
mkdir -p ~/.claude/skills ~/.claude/agents

# Skills
ln -s "$(pwd)/armory/skills/architecture-reviewer" ~/.claude/skills/architecture-reviewer

# Agents
ln -s "$(pwd)/armory/agents/codebase-auditor" ~/.claude/agents/codebase-auditor

Or download .skill / .agent archives from the Releases page.

Usage

Packages activate when Claude detects a matching intent. Each package defines trigger phrases in its frontmatter description — check the definition file (SKILL.md, AGENT.md, etc.) in each folder.

Example triggers:

"Run a security audit before I push this to GitHub"
-> activates: repo-sentinel (skill)

"Review this code for quality issues"
-> activates: code-reviewer (agent)

"Evaluate the quality of this package"
-> activates: package-evaluator (skill)

Commands are invoked explicitly via slash syntax:

/tdd calculate_discount    -> TDD workflow for a function
/security-scan src/        -> security vulnerability audit
/refactor src/utils.py     -> code simplification

Hooks fire automatically on Claude Code lifecycle events (PreToolUse, PostToolUse, Stop, UserPromptSubmit). Rules load as context when relevant. Presets install bundles via just install-profile.
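
As a rough illustration of the hook lifecycle described above, the sketch below shows the kind of decision logic a PreToolUse hook might apply to a Bash tool call. It is not armory's git-protection implementation: the `decide` function and the blocked patterns are hypothetical. It assumes Claude Code's hook contract, in which the event arrives as JSON carrying `tool_name` and `tool_input` fields and an exit code of 2 blocks the tool call.

```python
import re

# Hypothetical patterns for illustration; armory's git-protection hook may use different rules.
BLOCKED_PATTERNS = [
    r"\bgit\s+push\b.*(--force\b|\s-f\b)",
    r"\bgit\s+reset\s+--hard\b",
    r"\bgit\s+clean\b.*\s-\w*f",
]


def decide(event: dict) -> tuple[int, str]:
    """Return (exit_code, message) for a PreToolUse event.

    Exit code 0 lets the tool call proceed; exit code 2 blocks it.
    """
    if event.get("tool_name") != "Bash":
        return 0, ""
    command = event.get("tool_input", {}).get("command", "")
    for pattern in BLOCKED_PATTERNS:
        if re.search(pattern, command):
            return 2, f"Blocked dangerous git operation: {command}"
    return 0, ""


# Sample event shaped like the JSON a PreToolUse hook is assumed to receive on stdin.
sample = {"tool_name": "Bash", "tool_input": {"command": "git push --force origin main"}}
print(decide(sample))
```

In a real hook script, the event would be read with `json.load(sys.stdin)`, the message written to stderr, and the process exited with the returned code.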

Package Quality

Every package is evaluated against 6 shared dimensions using the package-evaluator, with type-specific signals for agents, hooks, rules, commands, utilities, and presets:

Dimension	Weight	What it measures
Frontmatter Quality	20%	Description length, trigger phrases, "Use when" clause
Trigger Coverage	18%	Synonym breadth, implied contexts, interrogative forms
Structural Completeness	20%	Workflow, error handling, output format, type-specific metadata
Content Depth	22%	Decision frameworks, multi-step workflows, type-specific signals
Consistency & Integrity	12%	Name matching, file references, description-body alignment
CONTRIBUTING Compliance	8%	Naming conventions, length limits, YAML validity
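
To make the weighting concrete, here is a minimal sketch of how the six dimension scores could combine into one number. The weights come from the table above. The dictionary keys, the 0-100 score scale, and the plain weighted sum are assumptions, since the package-evaluator's actual aggregation is not described here.

```python
# Weights taken from the table above; they sum to 100%.
WEIGHTS = {
    "frontmatter_quality": 0.20,
    "trigger_coverage": 0.18,
    "structural_completeness": 0.20,
    "content_depth": 0.22,
    "consistency_integrity": 0.12,
    "contributing_compliance": 0.08,
}


def overall_score(scores: dict) -> float:
    """Combine per-dimension scores (assumed to be on a 0-100 scale) into a weighted total."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores for one package.
example = {
    "frontmatter_quality": 90,
    "trigger_coverage": 80,
    "structural_completeness": 85,
    "content_depth": 70,
    "consistency_integrity": 100,
    "contributing_compliance": 100,
}
print(round(overall_score(example), 1))
```

Because Content Depth carries the largest weight, a low score there pulls the total down more than a slip on CONTRIBUTING Compliance would.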

Eval Coverage

Every package has eval cases in {type}/{name}/evals/cases.yaml: positive triggers (should activate) and negative triggers (should not activate). Deprecated packages are held to 0 positive and 2 negative cases.
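
The count rule can be sketched as a small check. The layout of the parsed cases (a dict with `positive` and `negative` lists) and the `check_case_counts` helper are hypothetical, and the real schema enforced by validate_evals.py may differ. The requirement that an active package has at least one case of each kind is also an assumption.

```python
def check_case_counts(cases: dict, deprecated: bool = False) -> list:
    """Return a list of problems; an empty list means the counts look fine.

    `cases` uses a hypothetical layout: {"positive": [...], "negative": [...]}.
    """
    positive = cases.get("positive", [])
    negative = cases.get("negative", [])
    problems = []
    if deprecated:
        # Stated in the text: deprecated packages carry 0 positive + 2 negative cases.
        if len(positive) != 0:
            problems.append(f"deprecated package has {len(positive)} positive cases, expected 0")
        if len(negative) != 2:
            problems.append(f"deprecated package has {len(negative)} negative cases, expected 2")
    else:
        # Assumption: an active package needs at least one case of each kind.
        if not positive:
            problems.append("no positive cases")
        if not negative:
            problems.append("no negative cases")
    return problems


active = {"positive": ["Review this PR"], "negative": ["Write a poem"]}
retired = {"positive": [], "negative": ["Summarize this doc", "Condense this file"]}
print(check_case_counts(active))
print(check_case_counts(retired, deprecated=True))
```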

Validation:

uv run scripts/validate_evals.py    # Schema validation for all eval files
uv run scripts/generate_manifest.py # Regenerate manifest.yaml

CI pipeline (.github/workflows/evals.yml):

PR gate: validates manifest sync + eval schema on every pull request across all 7 type directories
Weekly cron: Monday runs for model drift detection

Pre-commit hook: auto-regenerates manifest.yaml when any package definition file changes.

MCP Server

An MCP server exposes armory packages as discoverable tools for any agent session. Register in your Claude Code config:

{
  "mcpServers": {
    "armory": { "command": "uv", "args": ["run", "mcp/server.py"] }
  }
}

Available tools:

Tool	Description
`search_packages`	Keyword search with type, category, and tag filters
`get_package`	Full metadata for a single package by name
`recommend_packages`	Context-aware recommendations by language, framework, or task
`list_categories`	All categories with package counts

Spec Compliance

Skills are validated against the agentskills.io open standard:

uv run scripts/validate_agentskills.py           # Warnings only (default)
uv run scripts/validate_agentskills.py --strict   # Extra fields are errors

All 57 skills pass with 0 errors. The validator checks the 6-field frontmatter spec (name, description, license, compatibility, metadata, allowed-tools) and flags Claude Code-specific fields as warnings.

Packaging

Each package can be archived for distribution. Archive type is auto-detected from the directory:

uv run scripts/package.py skills/architecture-reviewer  # produces .skill
uv run scripts/package.py agents/code-reviewer           # produces .agent
uv run scripts/package.py hooks/git-protection            # produces .hook
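
One way to picture the auto-detection is a lookup keyed on the package's parent directory. This is a sketch under assumptions, not the logic in scripts/package.py: only the three suffixes shown in the commands above are confirmed, and the `archive_name` helper and the "name plus suffix" file name are hypothetical.

```python
from pathlib import PurePosixPath

# Only the three mappings shown in the commands above are confirmed.
ARCHIVE_SUFFIX = {"skills": ".skill", "agents": ".agent", "hooks": ".hook"}


def archive_name(package_dir: str) -> str:
    """Derive an archive file name such as 'code-reviewer.agent' from a package path."""
    path = PurePosixPath(package_dir)
    type_dir = path.parent.name  # e.g. "skills" for "skills/architecture-reviewer"
    if type_dir not in ARCHIVE_SUFFIX:
        raise ValueError(f"unknown package type directory: {type_dir!r}")
    return path.name + ARCHIVE_SUFFIX[type_dir]


print(archive_name("skills/architecture-reviewer"))
print(archive_name("agents/code-reviewer"))
```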

Cross-Platform Adapters

Packages are authored as Claude Code-native definitions. The adapter generator transforms them into platform-specific formats for Cursor, OpenAI Codex, and Gemini CLI.

Generate

# All platforms
uv run scripts/generate_adapters.py

# Single platform
uv run scripts/generate_adapters.py --platform cursor
uv run scripts/generate_adapters.py --platform codex
uv run scripts/generate_adapters.py --platform gemini

# Filter by package type
uv run scripts/generate_adapters.py --platform cursor --type skills --type rules

# Preview without writing
uv run scripts/generate_adapters.py --dry-run

Output lands in adapters/{platform}/ (gitignored — generated, not source).

Platform Mapping

Armory Type	Cursor	Codex	Gemini
Skills	`.cursor/rules/{name}.mdc`	`skills/AGENTS.md`	`.gemini/skills/{name}/SKILL.md`
Agents	`.cursor/rules/{name}.mdc`	`agents/AGENTS.md`	`.gemini/agents/{name}.md`
Rules	`.cursor/rules/{name}.mdc` (alwaysApply)	`standards/AGENTS.md`	Sections in `GEMINI.md`
Commands	`.cursor/commands/{name}.md`	`workflows/AGENTS.md`	`.gemini/commands/workflow/{name}.toml`
Hooks	—	—	—
Utilities	—	—	Wrapped as `.gemini/skills/`
Presets	—	—	—
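
The mapping table can also be read as a lookup from package type and platform to an output path. The sketch below copies the path templates from the table. The `adapter_path` helper is hypothetical and is not part of generate_adapters.py.

```python
from typing import Optional

# Path templates copied from the mapping table; a missing entry means no equivalent.
ADAPTER_PATHS = {
    "skills": {
        "cursor": ".cursor/rules/{name}.mdc",
        "codex": "skills/AGENTS.md",
        "gemini": ".gemini/skills/{name}/SKILL.md",
    },
    "agents": {
        "cursor": ".cursor/rules/{name}.mdc",
        "codex": "agents/AGENTS.md",
        "gemini": ".gemini/agents/{name}.md",
    },
    "rules": {
        "cursor": ".cursor/rules/{name}.mdc",  # written with alwaysApply
        "codex": "standards/AGENTS.md",
        "gemini": "GEMINI.md",  # rules become sections in this file
    },
    "commands": {
        "cursor": ".cursor/commands/{name}.md",
        "codex": "workflows/AGENTS.md",
        "gemini": ".gemini/commands/workflow/{name}.toml",
    },
    # Directory only; the table gives no file name for wrapped utilities.
    "utilities": {"gemini": ".gemini/skills/"},
}


def adapter_path(package_type: str, name: str, platform: str) -> Optional[str]:
    """Return the output path for a package on a platform, or None if unsupported."""
    template = ADAPTER_PATHS.get(package_type, {}).get(platform)
    return template.format(name=name) if template else None


print(adapter_path("skills", "pr-review", "cursor"))
print(adapter_path("hooks", "git-protection", "cursor"))
```

Hooks and presets have no entries, so the lookup returns None for them on every platform, matching the dashes in the table.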

Quick Install (no Python required)

Download pre-built adapter packages from the latest release:

# Cursor
npx @anthropic-armory/installer --target cursor

# Codex
npx @anthropic-armory/installer --target codex

# Gemini
npx @anthropic-armory/installer --target gemini --dir /path/to/project

Or download directly from GitHub Releases:

# Cursor — extract .cursor/ into your project root
curl -sL https://github.com/Mathews-Tom/armory/releases/download/latest/armory-cursor.tar.gz | tar -xz

# Codex — extract AGENTS.md + subdirectories into project root
curl -sL https://github.com/Mathews-Tom/armory/releases/download/latest/armory-codex.tar.gz | tar -xz

# Gemini — extract .gemini/ into your project root
curl -sL https://github.com/Mathews-Tom/armory/releases/download/latest/armory-gemini.tar.gz | tar -xz

Install via Python (with TUI)

The Python installer supports all targets with --target:

uv run scripts/install.py --target cursor --project-dir /path/to/project
uv run scripts/install.py --target codex --project-dir /path/to/project
uv run scripts/install.py --target gemini --project-dir /path/to/project

Generate Locally

Generate adapter output from source (requires Python 3.12):

uv run scripts/generate_adapters.py --platform cursor
uv run scripts/generate_adapters.py --platform codex
uv run scripts/generate_adapters.py --platform gemini

Output lands in adapters/{platform}/ (gitignored — generated, not source).

Platform Details

Cursor: Rules with alwaysApply: true (project standards) load on every prompt. Skills and agents load when Cursor matches the description or glob pattern.

Codex: The root AGENTS.md is a condensed index under the 32 KiB budget. Full content is in subdirectory AGENTS.md files, loaded via Codex's hierarchical discovery.

Gemini: Skills are a near 1:1 copy (references, scripts, and assets included). Rules become sections in GEMINI.md. Commands are converted to TOML format.
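
For the Codex budget mentioned above, a size check might look like the sketch below. Measuring the budget in UTF-8 bytes and treating exactly 32 KiB as acceptable are assumptions, and the `fits_budget` helper is hypothetical.

```python
CODEX_BUDGET_BYTES = 32 * 1024  # 32 KiB, as stated above


def fits_budget(text: str, budget: int = CODEX_BUDGET_BYTES) -> bool:
    """Return True if the UTF-8 encoded text is within the byte budget."""
    return len(text.encode("utf-8")) <= budget


# Hypothetical condensed index content.
index = "# armory\n" + "- skill: example\n" * 100
print(fits_budget(index))
```

Counting encoded bytes instead of characters matters once the index contains non-ASCII text, because one character can take several bytes.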

What's Lost

Not all package types have equivalents on every platform:

Hooks have no equivalent on Cursor or Codex. Gemini has hooks but uses a different event model.
Presets require a dependency resolver that no target platform provides.
Utilities with executable scripts are skipped on Cursor and Codex (passive context only). Gemini wraps them as skills.

Contributing

See CONTRIBUTING.md for guidelines on submitting new packages or improving existing ones.

Looking for something to build? Check WANTED.md for missing skill domains, requested agents, and infrastructure improvements.

Contributors

See CONTRIBUTORS.md for the full list.

Attributions

See ATTRIBUTIONS.md for the full list of upstream libraries, tools, and projects that armory packages wrap, depend on, or were inspired by.

License

MIT. See LICENSE for details.

Migrated from praxis-skills. If you had skills installed from the previous repo, re-run the installer to update paths. Existing skills continue to work — the content is unchanged.
