Curated, production-grade skills, agents, hooks, rules, commands, utilities, and presets for AI coding agents. No magic, no demos — battle-tested workflows built for developers who use AI seriously.
armory is a collection of packages for Claude Code and Claude.ai. Each package is a self-contained prompt or automation unit that extends Claude with a repeatable, opinionated workflow for a specific task domain. Packages span seven types: skills, agents, hooks, rules, commands, utilities, and presets.
Philosophy: Packages in this collection are practical and context-free. They define the how, not just the what — covering inputs, outputs, edge cases, and failure modes. They are tested in real workloads, not constructed as examples.
Intended for developers who treat AI coding agents as a serious part of their workflow.
Orchestrator agents compose skills and other agents into multi-phase workflows. Each can run solo or be spawned by another agent via the Agent tool.
| Agent | Model | Description |
|---|---|---|
| team-lead | opus | Meta-orchestrator — decomposes multi-domain requests, delegates to specialized agents, synthesizes results |
| codebase-auditor | sonnet | Unified quality assessment — spawns code-reviewer, security-reviewer, secret-scanner in parallel, merges report |
| project-architect | opus | Phased requirements discovery producing architecture documents with diagrams and tech stack justification |
| project-planner | sonnet | Task decomposition with dependency mapping, three-point estimates, milestone timelines, and risk logs |
| research-analyst | opus | Multi-source investigation with parallel agents across web, academic, video, and competitive sources |
| idea-scout | opus | Business idea validation — Lean Canvas, parallel market/competitive/feasibility research, weighted scorecard |
| full-stack-builder | opus | End-to-end implementation from spec — scaffolding, sprints, quality passes, documentation, pre-delivery review |
| release-captain | sonnet | Ship lifecycle with quality gates — pre-flight, secret scan, changelog, version bump, PR creation |
| proposal-writer | opus | Technical proposals with ROI calculations, three-tier pricing, and Problem-Agitate-Solve framing |
| content-strategist | sonnet | Multi-channel content creation with per-channel adaptation and automated quality passes |
| media-producer | sonnet | Visual and video format router — selects the right skill based on concept type and output needs |
| Agent | Model | Description |
|---|---|---|
| code-reviewer | sonnet | Multi-phase code review with severity-ranked findings |
| security-reviewer | sonnet | OWASP Top 10 vulnerability scanning |
| secret-scanner | haiku | Pre-commit detection of hardcoded credentials |
| test-engineer | sonnet | Co-evolutionary skill evolution with generate-verify-refine loops |
Model routing: Agents marked `opus` run on Claude Opus 4.7 with `xhigh` effort by default in Claude Code. Use `max` effort only for genuinely hard novel problems (diminishing returns, overthinking risk); use `high` when running concurrent sessions or for cost-sensitive work. Opus 4.7 uses adaptive thinking, so there is no fixed thinking budget to tune.
| Skill | Description |
|---|---|
| agent-builder | Build AI agents using the Claude Agent SDK and headless CLI mode — covers tool definitions, MCP servers, and programmatic orchestration |
| github | GitHub CLI operations via gh — issues, PRs, CI/Actions, releases, search, REST/GraphQL API, with error handling and automation workflows |
| filesystem | File and directory operations via Claude Code built-in tools — replaces the Filesystem MCP server with native Read, Write, Edit, Glob, Grep |
| mcp-to-skill | Convert MCP servers into on-demand skills to reduce active context window token usage |
| gpu-optimizer | GPU optimization for consumer GPUs (8-24GB VRAM) — PyTorch, XGBoost, CuPy/RAPIDS, memory management, and CUDA tuning |
| tavily | AI-optimized web search and content extraction via Tavily API with structured output parsing |
| test-harness | Comprehensive pytest suite generation — happy path, edge cases, error conditions, fixtures, mocks, async, parametrized tests |
| debug-investigator | Systematic debugging framework — hypothesis-driven investigation with bisection, log analysis, instrumentation, and minimal reproduction |
| to-markdown | Convert any file or URL to clean Markdown via MarkItDown — PDF, DOCX, XLSX, PPTX, HTML, images, audio, CSV, JSON, XML, YouTube, EPub |
| web-fetch | Web content fetching via curl and WebFetch — replaces the Fetch MCP server with native HTTP operations and jq parsing |
| lightpanda-browser | Lightweight headless browser automation via Lightpanda + agent-browser CDP — 9x lower memory, 11x faster, for scraping, DOM extraction, and form automation |
| skill-library | Agent-native catalog for browsing, installing, updating, syncing, and removing armory skills from within a Claude Code session |
| env-validator | Validate .env files against project requirements — missing vars, type mismatches, insecure defaults, .env.example drift |
| Skill | Description |
|---|---|
| literature-review | Systematic literature review — search, screen, extract, and synthesize academic research with gap analysis and structured citations |
| youtube-search | Search YouTube by keyword via yt-dlp — returns structured metadata (title, URL, channel, views, duration, date) for discovery and source curation |
| youtube-analysis | YouTube video transcript extraction and structured concept analysis — multi-level summaries, key concepts, takeaways, no API keys required |
| notebooklm | Google NotebookLM automation via notebooklm-py — create notebooks, add sources, chat, generate podcasts, videos, infographics, quizzes, flashcards, and more |
| research-critique | Critical analysis of research papers — methodology evaluation, claims-evidence alignment, contribution assessment with collegial analytical posture |
| immune | Hybrid adaptive memory with Cheatsheet (positive patterns) and Immune (negative patterns) — Hot/Cold tiered memory, multi-domain antibody scanning, auto-learning |
| Skill | Description |
|---|---|
| architecture-reviewer | Architecture reviews across 7 scored dimensions — structural integrity, scalability, security, performance, enterprise readiness, operations, data |
| code-refiner | Deep code simplification and refactoring — structural complexity analysis, anti-pattern detection, idiomatic rewrites across Python, Go, TS, Rust |
| pr-review | Diff-based PR review across 5 dimensions — code quality, test coverage, silent failures, type design, comment quality with severity-ranked output |
| pre-landing-review | Gate-oriented safety audit with two-pass severity triage — CRITICAL (SQL, races, trust) blocks landing, INFORMATIONAL is advisory |
| plan-review | Pre-implementation plan audit stress-testing scope, assumptions, risks, and failure modes with product and engineering lenses |
| manuscript-review | Pre-publication manuscript audit with 24 diagnostic dimensions, citation hygiene, and cross-element coherence |
| manuscript-provenance | Computational provenance audit verifying every number, table, and figure in a manuscript traces back to code |
| repo-sentinel | Security audit and enforcement for public repos — 12 attack surfaces, pre-release readiness, history scrubbing, CI gates |
| package-evaluator | Evaluate package quality across 6 weighted dimensions with type-specific signals — frontmatter, triggers, structure, depth, consistency, compliance |
| devils-advocate | Challenges AI-generated plans, code, designs, and decisions — pre-mortem, inversion, Socratic questioning with steel-manning and clear verdicts |
| dependency-audit | Dependency risk assessment — license compliance, maintenance health scoring, CVE detection, bloat identification, supply chain analysis |
| qa-systematic | Systematic web QA testing with 8-category health scoring, issue taxonomy, and regression tracking — full, quick, and regression modes |
| ux-expert | UX audit and redesign for B2B SaaS dashboards — 8-dimension analysis, wireframes, component recommendations, severity-ranked findings |
| Skill | Description |
|---|---|
| architecture-diagram | Layered architecture diagrams as self-contained HTML with inline SVG icons and CSS Grid layout |
| concept-to-image | Turn concepts into polished HTML visuals, export as PNG or SVG |
| concept-to-video | Turn concepts into animated explainer videos using Manim — MP4/GIF output with audio overlay, templates, multi-scene |
| remotion-video | Production motion graphics using Remotion (React) — branded content, data-driven video, audio sync, TailwindCSS |
| html-presentation | Convert documents and outlines into self-contained HTML slide presentations |
| static-web-artifacts-builder | Self-contained interactive HTML artifacts — infographics, dashboards, diagrams |
| md-to-pdf | Markdown to styled PDF with Mermaid diagrams, KaTeX math, and syntax highlighting |
| Skill | Description |
|---|---|
| changelog-composer | Structured changelogs from git history — conventional commit parsing, audience filtering, breaking change detection |
| ship-workflow | Automated release pipeline — merge main, run tests, pre-landing review, version bump, changelog, bisectable commits, PR |
| engineering-retro | Git-based engineering retrospective — commit analysis, velocity metrics, session patterns, health scoring over time windows |
| adr-writer | Architecture Decision Records — context capture, alternatives analysis, consequence projection, status lifecycle |
| api-docs-generator | API documentation audit and enhancement — FastAPI docstrings, Pydantic examples, OpenAPI spec enrichment, coverage reports |
| Skill | Description |
|---|---|
| sql-optimizer | SQL performance analysis — EXPLAIN interpretation, anti-pattern detection, index recommendations, rewrites |
| migration-risk-analyzer | Database migration risk assessment — lock analysis, downtime estimation, rollback strategies, validation |
| benchmark-runner | Structured benchmark design — metric selection, test case matrix, environment capture, statistical rigor |
| Skill | Description |
|---|---|
| idea-validator | Full business idea validation orchestrator — Lean Canvas, JTBD, parallel market/competitive/feasibility agents, SWOT/PESTLE, weighted scoring |
| market-analyzer | Market sizing and trend analysis — TAM/SAM/SOM calculation, Rogers adoption curve, data triangulation, timing assessment |
| competitive-analyzer | Competitive landscape analysis — Porter's Five Forces, feature/pricing matrices, positioning maps, moat taxonomy |
| feasibility-assessor | Financial and technical feasibility — unit economics (CAC/LTV), revenue modeling, break-even, technical risk scoring, build-vs-buy |
| Skill | Description |
|---|---|
| prompt-lab | Systematic prompt engineering — variant generation, evaluation rubrics, failure mode analysis, test suites |
| rag-auditor | RAG pipeline evaluation — retrieval metrics, generation quality, failure taxonomy, diagnostic queries |
| task-decomposer | Feature decomposition — phased task breakdown, dependency mapping, edge case enumeration, sizing |
| estimate-calibrator | Calibrated three-point estimates — PERT ranges, unknown identification, confidence intervals, bias correction |
| Skill | Description |
|---|---|
| humanize | Detect and remove AI-generated writing patterns — 24 lexical patterns + 12 statistical signals, 6 domain profiles, 5-phase pipeline with semantic preservation |
| linkedin-post-style | Write LinkedIn posts in a specific technical voice with visual companion support — carousels via md-to-pdf, images via concept-to-image, video via concept-to-video |
| Skill | Description |
|---|---|
| paper-to-skill | Convert research papers into executable skill packages via methodology extraction and co-evolutionary refinement |
| skill-distiller | Distill Opus-quality skill packages into deterministic, Haiku-executable workflows via trace-driven distillation |
| surrogate-verifier | Information-isolated verification generating structured test assertions and failure diagnostics for skills |
Research lineage: the EvoSkills pipeline (arXiv 2604.01687) handles offline co-evolutionary refinement. The `immune` skill, together with armory's auto-memory system, implements the stateful-prompt concept from Memento-Skills (arXiv 2603.18743): the read-write reflective loop for continual learning without parameter updates.
Skills below are superseded by base model capabilities. They remain installable but receive no further updates.
| Skill | Reason |
|---|---|
| doc-condenser | Base model handles summarization natively |
| regex-builder | Base model generates regex at equivalent quality |
| sequential-thinking | Base model handles chain-of-thought natively |
| Rule | Description |
|---|---|
| commit-standards | Conventional commit format, branch naming |
| test-standards | Coverage thresholds, test quality requirements |
| security-standards | Secret management, input validation, auth |
| token-efficiency | Token-efficient tool usage patterns |
| Command | Description |
|---|---|
| tdd | Test-driven development workflow |
| security-scan | Security vulnerability audit |
| refactor | Code simplification workflow |
| evolve | Co-evolutionary skill generation |
| Hook | Description |
|---|---|
| git-protection | Block dangerous git operations |
| pre-edit-backup | Backup files before edits |
| cost-tracker | Log session cost/token usage |
| anatomy-index | Maintain project file index with token estimates |
| read-dedup | Warn on duplicate file reads within a session |
| prompt-context | Inject text file as additionalContext on every prompt |
| Utility | Description |
|---|---|
| arxiv-search | Search arXiv for papers, output structured JSON metadata |
| dependency-tree | Visualize project dependency graph |
| test-coverage-report | Coverage summary for changed files |
Presets install curated bundles of passive packages (rules, hooks, commands) in one command. For active workflow orchestration, use agents instead.
| Preset | Packages | Description |
|---|---|---|
| core | 3 skills, 1 hook, 1 rule | Baseline review-commit lifecycle. Start here. |
| sec-strict | 5 skills, 3 agents, 2 rules, 2 hooks, 1 command | Audit-grade security stack with codebase-auditor. Superset of core. |
| python-strict | 4 skills, 2 agents, 3 rules, 2 hooks, 2 commands | Full Python enforcement — TDD, type checking, test coverage, security standards. |
| ai-builder | 6 skills | AI/ML development toolkit — agent building, prompt engineering, GPU optimization, RAG auditing. |
| skill-evolution | 6 skills, 1 agent, 1 command | EvoSkills pipeline — co-evolutionary skill factory with paper-to-skill, distillation, and verification. |
| terse-mode | 1 hook | Terse output enforcement via prompt-context hook with compaction-immune rule injection. |
Superseded by orchestrator agents that provide autonomous workflow orchestration instead of manual skill invocation.
| Preset | Replacement |
|---|---|
| | idea-scout agent |
| | media-producer agent |
| | content-strategist agent |
| | research-analyst agent |
| | release-captain + full-stack-builder agents |
Option 1 — Skills CLI (recommended)
Install any package directly using npx skills:
# Install all packages
npx skills add Mathews-Tom/armory
# Install a specific skill or agent
npx skills add Mathews-Tom/armory -s architecture-reviewer
npx skills add Mathews-Tom/armory -s codebase-auditor
# List available packages without installing
npx skills add Mathews-Tom/armory -l
Option 2 — Profile installer
git clone https://github.com/Mathews-Tom/armory.git
cd armory
# Install by profile
just install-profile core
just install-profile python-strict
# Install by type
uv run scripts/install.py --type skills
uv run scripts/install.py --type agents
# Interactive TUI
uv run scripts/install.py
The interactive TUI displays a version-aware table of all packages, detects installed versions, and lets you select which to install or upgrade. Profiles install curated bundles of packages across all types.
Option 3 — Claude Code plugin marketplace (skills, agents, commands only)
claude plugin marketplace add Mathews-Tom/armory
/plugin install armory
This uses Claude Code's native plugin system and loads a subset of armory's catalog.
| Package type | Supported via plugin marketplace |
|---|---|
| skills | ✅ yes |
| agents | ✅ yes |
| commands | ✅ yes |
| hooks | ❌ no — requires npx skills or the profile installer |
| rules | ❌ no — armory-specific type, not a Claude Code plugin concept |
| utilities | ❌ no — armory-specific type, not a Claude Code plugin concept |
| presets | ❌ no — use just install-profile instead |
For the full catalog across all seven package types, use Option 1 (Skills CLI) or Option 2 (profile installer).
Option 4 — Manual
Clone the repo and symlink individual package folders:
git clone https://github.com/Mathews-Tom/armory.git
# Skills
ln -s "$(pwd)/armory/skills/architecture-reviewer" ~/.claude/skills/architecture-reviewer
# Agents
ln -s "$(pwd)/armory/agents/codebase-auditor" ~/.claude/agents/codebase-auditor
Or download .skill / .agent archives from the Releases page.
Packages activate when Claude detects a matching intent. Each package defines trigger phrases in its frontmatter description — check the definition file (SKILL.md, AGENT.md, etc.) in each folder.
Example triggers:
"Run a security audit before I push this to GitHub"
-> activates: repo-sentinel (skill)
"Review this code for quality issues"
-> activates: code-reviewer (agent)
"Evaluate the quality of this package"
-> activates: package-evaluator (skill)
Commands are invoked explicitly via slash syntax:
/tdd calculate_discount -> TDD workflow for a function
/security-scan src/ -> security vulnerability audit
/refactor src/utils.py -> code simplification
Hooks fire automatically on Claude Code lifecycle events (PreToolUse, PostToolUse, Stop, UserPromptSubmit). Rules load as context when relevant. Presets install bundles via just install-profile.
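As an illustration of how a PreToolUse hook can block an operation, here is a minimal sketch in the spirit of git-protection. It is not armory's actual hook: the pattern list and the `decide` function are hypothetical. It assumes the event payload carries `tool_name` and `tool_input.command`, and that exit code 2 signals a block to Claude Code.

```python
import re

# Hypothetical patterns; armory's git-protection hook may use a different list.
DANGEROUS_GIT = [
    re.compile(r"\bgit\s+push\b.*(--force\b|\s-f\b)"),  # force pushes
    re.compile(r"\bgit\s+reset\s+--hard\b"),            # discards local changes
    re.compile(r"\bgit\s+clean\b.*\s-[a-zA-Z]*f"),      # deletes untracked files
]


def decide(payload: dict) -> int:
    """Return an exit code for a PreToolUse event: 0 allows, 2 blocks."""
    # Only shell commands are inspected; every other tool passes through.
    if payload.get("tool_name") != "Bash":
        return 0
    command = payload.get("tool_input", {}).get("command", "")
    return 2 if any(p.search(command) for p in DANGEROUS_GIT) else 0
```

A real hook script would parse the JSON payload from stdin, call `decide`, write a short reason to stderr when blocking, and exit with the returned code.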
Every package is evaluated against 6 shared dimensions using the package-evaluator, with type-specific signals for agents, hooks, rules, commands, utilities, and presets:
| Dimension | Weight | What it measures |
|---|---|---|
| Frontmatter Quality | 20% | Description length, trigger phrases, "Use when" clause |
| Trigger Coverage | 18% | Synonym breadth, implied contexts, interrogative forms |
| Structural Completeness | 20% | Workflow, error handling, output format, type-specific metadata |
| Content Depth | 22% | Decision frameworks, multi-step workflows, type-specific signals |
| Consistency & Integrity | 12% | Name matching, file references, description-body alignment |
| CONTRIBUTING Compliance | 8% | Naming conventions, length limits, YAML validity |
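The weights in the table sum to 100%, so an overall score can be read as a weighted average of the six dimension scores. The sketch below shows that arithmetic only. It assumes each dimension is scored from 0 to 100, and `overall_score` is a hypothetical helper, not the package-evaluator's own code.

```python
# Weights copied from the table above, in percent.
WEIGHTS = {
    "Frontmatter Quality": 20,
    "Trigger Coverage": 18,
    "Structural Completeness": 20,
    "Content Depth": 22,
    "Consistency & Integrity": 12,
    "CONTRIBUTING Compliance": 8,
}


def overall_score(scores: dict) -> float:
    """Weighted average of per-dimension scores (assumed to be on a 0-100 scale)."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return sum(scores[name] * weight for name, weight in WEIGHTS.items()) / 100
```

For example, a package scoring 50 on every dimension except Content Depth, where it scores 100, would land at 61.0 under this scheme.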
Every package has eval cases in {type}/{name}/evals/cases.yaml: positive triggers (should activate) and negative triggers (should not). Deprecated packages must have exactly 0 positive and 2 negative cases.
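The case-count rule can be sketched as a small check. This is illustrative only: the `positive` and `negative` keys are hypothetical field names rather than the real cases.yaml schema, and requiring at least one case of each kind for active packages is an assumption.

```python
def check_case_counts(cases: dict, deprecated: bool) -> list:
    """Return a list of problems with a package's eval cases (empty if none)."""
    # "positive" / "negative" are assumed key names, not the documented schema.
    positive = cases.get("positive", [])
    negative = cases.get("negative", [])
    problems = []
    if deprecated:
        # Stated rule: deprecated packages carry 0 positive and 2 negative cases.
        if len(positive) != 0:
            problems.append(f"deprecated package has {len(positive)} positive cases, expected 0")
        if len(negative) != 2:
            problems.append(f"deprecated package has {len(negative)} negative cases, expected 2")
    else:
        # Assumption: active packages need at least one case of each kind.
        if not positive:
            problems.append("no positive cases")
        if not negative:
            problems.append("no negative cases")
    return problems
```

In the repository itself, the equivalent checks are run by scripts/validate_evals.py.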
Validation:
uv run scripts/validate_evals.py # Schema validation for all eval files
uv run scripts/generate_manifest.py # Regenerate manifest.yaml
CI pipeline (.github/workflows/evals.yml):
- PR gate: validates manifest sync + eval schema on every pull request across all 7 type directories
- Weekly cron: Monday runs for model drift detection
Pre-commit hook: auto-regenerates manifest.yaml when any package definition file changes.
An MCP server exposes armory packages as discoverable tools for any agent session. Register in your Claude Code config:
{
  "mcpServers": {
    "armory": { "command": "uv", "args": ["run", "mcp/server.py"] }
  }
}
Available tools:
| Tool | Description |
|---|---|
| search_packages | Keyword search with type, category, and tag filters |
| get_package | Full metadata for a single package by name |
| recommend_packages | Context-aware recommendations by language, framework, or task |
| list_categories | All categories with package counts |
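To make the search_packages filters concrete, here is a rough in-memory sketch of keyword search with optional type, category, and tag filters. It is not the MCP server's implementation. The `Package` record, its fields, and the sample categories and tags are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Package:
    # Hypothetical record shape; the real manifest fields may differ.
    name: str
    pkg_type: str
    category: str
    description: str
    tags: List[str] = field(default_factory=list)


def search_packages(
    packages: List[Package],
    keyword: str,
    pkg_type: Optional[str] = None,
    category: Optional[str] = None,
    tag: Optional[str] = None,
) -> List[str]:
    """Return names of packages matching the keyword and every filter given."""
    needle = keyword.lower()
    hits = []
    for pkg in packages:
        # Case-insensitive keyword match against name and description.
        if needle not in pkg.name.lower() and needle not in pkg.description.lower():
            continue
        if pkg_type is not None and pkg.pkg_type != pkg_type:
            continue
        if category is not None and pkg.category != category:
            continue
        if tag is not None and tag not in pkg.tags:
            continue
        hits.append(pkg.name)
    return hits


# Names and descriptions come from the tables above; categories and tags are made up.
catalog = [
    Package("security-reviewer", "agent", "review", "OWASP Top 10 vulnerability scanning", ["security"]),
    Package("sql-optimizer", "skill", "data", "SQL performance analysis", ["sql"]),
    Package("git-protection", "hook", "safety", "Block dangerous git operations", ["git"]),
]
```

Calling `search_packages(catalog, "git", pkg_type="hook")` narrows the keyword hits to hooks only, which is the kind of filtering the tool description suggests.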
Skills are validated against the agentskills.io open standard:
uv run scripts/validate_agentskills.py # Warnings only (default)
uv run scripts/validate_agentskills.py --strict # Extra fields are errors
All 57 skills pass with 0 errors. The validator checks the 6-field frontmatter spec (name, description, license, compatibility, metadata, allowed-tools) and flags Claude Code-specific fields as warnings.
Each package can be archived for distribution. Archive type is auto-detected from the directory:
uv run scripts/package.py skills/architecture-reviewer # produces .skill
uv run scripts/package.py agents/code-reviewer # produces .agent
uv run scripts/package.py hooks/git-protection # produces .hook
Packages are authored as Claude Code-native definitions. The adapter generator transforms them into platform-specific formats for Cursor, OpenAI Codex, and Gemini CLI.
# All platforms
uv run scripts/generate_adapters.py
# Single platform
uv run scripts/generate_adapters.py --platform cursor
uv run scripts/generate_adapters.py --platform codex
uv run scripts/generate_adapters.py --platform gemini
# Filter by package type
uv run scripts/generate_adapters.py --platform cursor --type skills --type rules
# Preview without writing
uv run scripts/generate_adapters.py --dry-run
Output lands in adapters/{platform}/ (gitignored; generated, not source).
| Armory Type | Cursor | Codex | Gemini |
|---|---|---|---|
| Skills | .cursor/rules/{name}.mdc | skills/AGENTS.md | .gemini/skills/{name}/SKILL.md |
| Agents | .cursor/rules/{name}.mdc | agents/AGENTS.md | .gemini/agents/{name}.md |
| Rules | .cursor/rules/{name}.mdc (alwaysApply) | standards/AGENTS.md | Sections in GEMINI.md |
| Commands | .cursor/commands/{name}.md | workflows/AGENTS.md | .gemini/commands/workflow/{name}.toml |
| Hooks | — | — | — |
| Utilities | — | — | Wrapped as .gemini/skills/ |
| Presets | — | — | — |
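The mapping table above can be read as a lookup from (package type, platform) to an output path. The sketch below encodes it that way for illustration. `adapter_target` is a hypothetical helper, not part of scripts/generate_adapters.py, and unsupported combinations return `None`.

```python
from typing import Optional

# Path templates transcribed from the mapping table; absent keys mean "no equivalent".
ADAPTER_TARGETS = {
    ("skills", "cursor"): ".cursor/rules/{name}.mdc",
    ("skills", "codex"): "skills/AGENTS.md",
    ("skills", "gemini"): ".gemini/skills/{name}/SKILL.md",
    ("agents", "cursor"): ".cursor/rules/{name}.mdc",
    ("agents", "codex"): "agents/AGENTS.md",
    ("agents", "gemini"): ".gemini/agents/{name}.md",
    ("rules", "cursor"): ".cursor/rules/{name}.mdc",
    ("rules", "codex"): "standards/AGENTS.md",
    ("rules", "gemini"): "GEMINI.md",  # rules become sections in this file
    ("commands", "cursor"): ".cursor/commands/{name}.md",
    ("commands", "codex"): "workflows/AGENTS.md",
    ("commands", "gemini"): ".gemini/commands/workflow/{name}.toml",
    ("utilities", "gemini"): ".gemini/skills/",  # wrapped as skills; exact layout not stated
}


def adapter_target(pkg_type: str, platform: str, name: str) -> Optional[str]:
    """Return the output path for a package, or None if the platform has no equivalent."""
    template = ADAPTER_TARGETS.get((pkg_type, platform))
    return template.format(name=name) if template is not None else None
```

For instance, `adapter_target("commands", "gemini", "tdd")` yields `.gemini/commands/workflow/tdd.toml`, while any hooks or presets lookup yields `None`.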
Download pre-built adapter packages from the latest release:
# Cursor
npx @anthropic-armory/installer --target cursor
# Codex
npx @anthropic-armory/installer --target codex
# Gemini
npx @anthropic-armory/installer --target gemini --dir /path/to/project
Or download directly from GitHub Releases:
# Cursor — extract .cursor/ into your project root
curl -sL https://github.com/Mathews-Tom/armory/releases/download/latest/armory-cursor.tar.gz | tar -xz
# Codex — extract AGENTS.md + subdirectories into project root
curl -sL https://github.com/Mathews-Tom/armory/releases/download/latest/armory-codex.tar.gz | tar -xz
# Gemini — extract .gemini/ into your project root
curl -sL https://github.com/Mathews-Tom/armory/releases/download/latest/armory-gemini.tar.gz | tar -xz
The Python installer supports all targets with --target:
uv run scripts/install.py --target cursor --project-dir /path/to/project
uv run scripts/install.py --target codex --project-dir /path/to/project
uv run scripts/install.py --target gemini --project-dir /path/to/project
Generate adapter output from source (requires Python 3.12):
uv run scripts/generate_adapters.py --platform cursor
uv run scripts/generate_adapters.py --platform codex
uv run scripts/generate_adapters.py --platform gemini
Output lands in adapters/{platform}/ (gitignored; generated, not source).
Cursor: Rules with alwaysApply: true (project standards) load on every prompt. Skills and agents load when Cursor matches the description or glob pattern.
Codex: The root AGENTS.md is a condensed index under the 32 KiB budget. Full content is in subdirectory AGENTS.md files, loaded via Codex's hierarchical discovery.
Gemini: Skills are a near 1:1 copy (references, scripts, and assets included). Rules become sections in GEMINI.md. Commands are converted to TOML format.
Not all package types have equivalents on every platform:
- Hooks have no equivalent on Cursor or Codex. Gemini has hooks but uses a different event model.
- Presets require a dependency resolver that no target platform provides.
- Utilities with executable scripts are skipped on Cursor and Codex (passive context only). Gemini wraps them as skills.
See CONTRIBUTING.md for guidelines on submitting new packages or improving existing ones.
Looking for something to build? Check WANTED.md for missing skill domains, requested agents, and infrastructure improvements.
See CONTRIBUTORS.md for the full list.
See ATTRIBUTIONS.md for the full list of upstream libraries, tools, and projects that armory packages wrap, depend on, or were inspired by.
MIT. See LICENSE for details.
Migrated from praxis-skills. If you had skills installed from the previous repo, re-run the installer to update paths. Existing skills continue to work — the content is unchanged.