The Skill Development Workbench for ClawHub. An offline-first pipeline to build, validate, and publish agent skills: make new → make lint → make scan → make test → make publish.
Author: @gitgoodordietrying
A complete development workbench for ClawHub skills. Twenty-five published skills, a linter, an offline security scanner (87 patterns, 13 categories including prompt injection detection, SARIF output), a zero-trust skill verifier (guilty-until-proven-innocent line classification), a test framework (100% coverage), and a gated publishing pipeline — all driven from a single Makefile.
What you can do here: scaffold new skills from templates, lint them for structure and content quality, scan them for malicious patterns (offline, no network required), run behavioral tests, and publish through a gated pipeline.
What this isn't: a runtime for running OpenClaw agents, or a sandbox for executing untrusted code. For that, see openclaw-vault — the hardened container companion where your API key never enters the container.
"How is this better than just having Claude write a SKILL.md?"
An AI can draft a skill file in seconds. What it can't do is gate its own output through a security pipeline, track quality over time, or prove to a human operator that the result is safe. This workbench exists because the gap between "generated" and "production-ready" is where real risk lives.
What the pipeline provides that raw authoring doesn't:
- 87 malicious patterns across 13 categories: MITRE ATT&CK mapped, derived from real trojanized skills (the ClawHavoc campaign). Every skill is scanned offline before publishing.
- Prompt injection detection: 16 patterns catch LLM manipulation attempts such as override instructions, persona hijacking, stealth commands, data theft instructions, and format token injection.
- Multi-file scanning: the scanner inspects ALL files in skill directories (`.md`, `.sh`, `.py`, `.js`, `.ts`, `.yaml`, `.yml`, `.json`), not just `SKILL.md`.
- Strict mode: `make scan-strict` blocks HIGH findings too, not just CRITICAL. Prevents SSH key theft, persistence, and container escape patterns from slipping through.
- Post-install quarantine: when `ALLOW_INSTALL=1` is used, the scanner auto-runs on newly installed skills and quarantines failures.
- Suppression audit: `.scanignore` files are validated, and ranges over 50 lines are rejected (prevents `L1-L9999` blanket suppression).
- Behavioral assertions: 168+ test assertions enforce content consistency, structural requirements, and domain accuracy across all skills.
- Mandatory test gate: every skill must have a test file to publish. No more silent test skipping.
- Gated publishing: `make publish` won't run until lint, scan, and test all pass. No human judgment call required at the gate.
- SARIF output for CI: GitHub code scanning integration via `make scan-sarif`, so the pipeline runs on every PR automatically.
- Transparent allowlisting: skills that legitimately discuss security patterns (like `security-audit`) use explicit `.scanignore` files, not global suppression. Every exception is auditable.
- Zero-trust verification: `make verify-all` classifies every line in every skill file. One unrecognizable line quarantines the entire skill. Catches novel attacks the blocklist misses.
- Tool test suite: 40+ behavioral tests verify the workbench tools themselves (`make test-tools`).
- Trend tracking: `make stats-trend` shows whether skills are growing or dying. `make stats-rank` shows competitive positioning.
Run make report to see a concrete summary of what the pipeline catches.
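
The gating idea in the list above can be sketched in a few lines of shell. This is only an illustration of "run stages in order, stop at the first failure"; the `run_gate` function is hypothetical and is not taken from the workbench's Makefile or `skill-publish.sh`.

```shell
# Hypothetical gate: run each stage in order and stop at the first failure.
# Stages are passed as command strings so the sketch stays generic.
run_gate() {
  for stage in "$@"; do
    echo "==> $stage"
    if ! sh -c "$stage"; then
      echo "gate closed: '$stage' failed" >&2
      return 1
    fi
  done
  echo "gate open: all stages passed"
}

# Example wiring, assuming the Makefile targets named in this README:
# run_gate "make lint" "make scan" "make test" && make publish SKILL=my-tool VERSION=1.0.0
```

Because the function returns non-zero as soon as one stage fails, anything chained after it with `&&` never runs on a failed gate.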
Commands for the human operator to assess workbench health and competitive position:
```shell
make verify                               # 12-point workbench health check
make report                               # Pipeline value summary
make scan-strict                          # Scan with --strict (HIGH blocks)
make verify-all                           # Zero-trust verify all skills
make verify-skill SKILL=docker-sandbox    # Verify single skill
make verify-report SKILL=docker-sandbox   # Per-line verdict report
make test-tools                           # Run tool behavioral tests
make check-all                            # Full pipeline + self-test + tool tests
make explore                              # Top 20 skills by downloads
make explore QUERY="docker"               # Semantic search for competitors
make explore SORT=trending                # What's hot right now
make explore SORT=installs LIMIT=50       # Most installed, larger set
make stats                                # Current adoption metrics
make stats-trend                          # Growth deltas vs previous snapshots
make stats-rank                           # Our skills ranked against top 50
```
Transparency about what the workbench can't do:
- Who installed your skills: the ClawHub API doesn't expose installer identities or per-user usage data. Download counts are the best signal available.
- Bot/agent network visibility: no way to distinguish human installs from automated agent installs. The API doesn't differentiate.
- Web dashboard: this is a CLI workbench; terminal tables are the right UX for the workflow. If you want a GUI, see the roadmap for the planned meta-repo.
- Skill dependencies: ClawHub has no dependency resolution. Skills are standalone by design.
- Auto-version bumping: too risky without human review. Version is always an explicit `VERSION=x.y.z` parameter.
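
Since the version is always passed by hand, a small format check can catch typos before anything is published. The sketch below is an assumption about what such a check could look like; `is_valid_version` is a hypothetical helper, and the README does not say how the publisher actually validates `VERSION`.

```shell
# Hypothetical helper: succeed only for a plain MAJOR.MINOR.PATCH string
# with no leading zeros, such as 1.0.0 or 0.12.3.
is_valid_version() {
  case "$1" in
    "" | *[!0-9.]*) return 1 ;;   # reject empty input and anything but digits and dots
  esac
  printf '%s\n' "$1" |
    grep -Eq '^(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)$'
}

# Example use:
# is_valid_version "$VERSION" || { echo "VERSION must look like x.y.z" >&2; exit 1; }
```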
Open this repo in VS Code with the Dev Containers extension. The devcontainer installs Node.js, Python, molthub, and all dependencies automatically.
Requirements: bash, git, python3 (for YAML/JSON validation). Optional: molthub (npm install -g molthub).
```shell
make help                # Show all commands
make new SKILL=my-tool   # Scaffold from template
make lint                # Lint all skills
make scan                # Security scan all skills
make test                # Run behavioral tests
make check               # Full pipeline: lint + scan + test
```
Automates skill quality review. Checks:
- Frontmatter: delimiters, required fields (`name`, `description`, `metadata`), valid slug, description length, valid JSON metadata
- Structure: H1 title, `## When to Use` section, `## Tips` section
- Content quality: line count (150-700), code block density (8+ blocks), language tags on fences, no TODO/FIXME/XXX placeholders
- Metadata consistency: each binary in `requires.anyBins` is referenced in the content body
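
A rough sketch of the content-quality checks listed above, using the thresholds stated there (150-700 lines, 8+ code blocks, no placeholders). The function names are hypothetical and this is not the code in `skill-lint.sh`; it only shows that the checks reduce to counting lines and fences.

```shell
# Count fenced code blocks: every pair of fence lines is one block.
count_code_blocks() {
  fences=$(grep -c '^```' "$1" || true)   # grep -c prints 0 but exits 1 on no match
  echo $((fences / 2))
}

# Hypothetical content-quality check; prints each problem and
# returns non-zero if any check fails.
lint_content() {
  file=$1
  rc=0
  lines=$(($(wc -l < "$file")))
  blocks=$(count_code_blocks "$file")
  if [ "$lines" -lt 150 ] || [ "$lines" -gt 700 ]; then
    echo "line count $lines is outside 150-700"
    rc=1
  fi
  if [ "$blocks" -lt 8 ]; then
    echo "only $blocks code blocks (need 8+)"
    rc=1
  fi
  if grep -Eq 'TODO|FIXME|XXX' "$file"; then
    echo "placeholder text found"
    rc=1
  fi
  return "$rc"
}
```

Collecting every failure before returning, instead of stopping at the first one, lets an author fix all reported problems in a single pass.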
Offline security scanner — works without network. 87 patterns across 13 categories with MITRE ATT&CK IDs, derived from the real moltbook-ay trojan and ClawHavoc campaign analysis. Scans ALL files in skill directories, not just SKILL.md.
| Category | Severity | What it catches |
|---|---|---|
| C2/Download | CRITICAL | curl/wget/fetch to external URLs |
| Archive execution | CRITICAL | Password-protected ZIP/7z extraction |
| Exec download | CRITICAL | chmod+execute, bash -c with curl, eval with subshell |
| Credential access | HIGH | Reading .env, .ssh keys, AWS/K8s creds, /proc/environ, PEM files |
| Data exfiltration | CRITICAL | curl POST, netcat, DNS exfil, SCP, git push, FTP to IPs |
| Obfuscation | HIGH | Base64/hex decode to shell, Python/Perl/Ruby eval, OpenSSL decrypt |
| Persistence | HIGH | crontab, bashrc/profile/zshrc/fish, at now, launchctl |
| Privilege escalation | MEDIUM-HIGH | sudo chmod 777, setuid, sudo su, nsenter |
| Container escape | HIGH | --privileged, SYS_ADMIN, mount host, docker.sock, sysrq |
| Supply chain | MEDIUM | Unsafe npm install, pip --pre, registry hijack |
| Environment injection | MEDIUM | LD_PRELOAD, PATH manipulation, env -i |
| Resource abuse | HIGH | Fork bomb, infinite loop with network |
| Prompt injection | HIGH-CRITICAL | Override attempts, persona hijacking, stealth instructions, data theft, LLM control tokens |
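
To make the blocklist approach concrete, here is a minimal sketch of a pattern-driven scan. The three entries and the `severity::category::regex` layout are assumptions for illustration; they are not copied from `tools/lib/patterns.sh`, whose real format is not shown in this README.

```shell
# Hypothetical three-entry pattern table, one "severity::category::regex" per line.
PATTERNS='CRITICAL::Exec download::curl[^|]*[|] *(ba)?sh
HIGH::Persistence::crontab +-
HIGH::Container escape::--privileged'

# Print every match in a file; return non-zero when anything was found.
scan_file() {
  file=$1
  found=0
  while IFS= read -r entry; do
    severity=${entry%%::*}      # text before the first "::"
    rest=${entry#*::}
    category=${rest%%::*}       # text between the two separators
    regex=${rest#*::}           # everything after the second "::"
    hits=$(grep -nE -e "$regex" "$file") || continue   # no match: try next pattern
    found=1
    printf '%s\n' "$hits" | while IFS= read -r hit; do
      echo "$severity [$category] $file:$hit"
    done
  done <<EOF
$PATTERNS
EOF
  [ "$found" -eq 0 ]
}
```

Everything runs on local files with `grep`, which is what makes this kind of scan usable without a network connection.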
Output modes: make scan (colored terminal), make scan-summary (one line per skill), make scan-json (structured JSON), make scan-sarif (SARIF 2.1.0 for GitHub code scanning), make scan-strict (HIGH blocks too). Scanner self-test: make self-test.
Skills that legitimately discuss these patterns (like security-audit) can use <!-- scan:ignore --> inline or a .scanignore file. Scanignore files are audited: ranges >50 lines are rejected.
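
The 50-line limit on suppression ranges can be sketched as follows. The on-disk format of `.scanignore` is not documented here, so this assumes one `L<start>-L<end>` range per line (matching the `L1-L9999` notation used earlier); `audit_scanignore` is a hypothetical name.

```shell
# Hypothetical audit: reject any "L<start>-L<end>" range longer than 50 lines.
audit_scanignore() {
  rc=0
  for range in $(grep -Eo '^L[0-9]+-L[0-9]+' "$1"); do
    start=${range#L}
    start=${start%%-*}          # digits between the leading "L" and the dash
    end=${range##*-L}           # digits after "-L"
    span=$((end - start + 1))
    if [ "$span" -lt 1 ]; then
      echo "rejected: $range is not a valid range"
      rc=1
    elif [ "$span" -gt 50 ]; then
      echo "rejected: $range spans $span lines (limit 50)"
      rc=1
    fi
  done
  return "$rc"
}
```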
Guilty until proven innocent. The scanner uses a blocklist (scan for known-bad, let everything else through). The verifier flips this: every line in every file must be classified as SAFE, or the entire skill is quarantined. No partial passes, no thresholds.
Every line gets one of three verdicts:
| Verdict | Meaning |
|---|---|
| SAFE | Line matches a known-safe pattern (structural markdown, prose under 500 chars, code inside fenced blocks, frontmatter fields) |
| SUSPICIOUS | Line doesn't match any safe pattern (possible obfuscation, unknown encoding, excessively long content) |
| MALICIOUS | Line triggers the 87-pattern blocklist |
Release rule: A skill is released from quarantine ONLY if it has ZERO malicious lines AND ZERO suspicious lines. One unrecognizable line quarantines the entire skill.
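
The release rule can be sketched as a classifier plus an all-or-nothing loop. This is a simplified illustration, not the logic in `line-classifier.sh`: the single blocklist regex and the "printable ASCII under 500 characters" safe rule are stand-in assumptions, so legitimate non-ASCII prose would be flagged here.

```shell
# Hypothetical classifier: blocklist first, then a deliberately narrow safe rule.
classify_line() {
  line=$1
  if printf '%s\n' "$line" | grep -Eq 'curl[^|]*[|] *(ba)?sh'; then
    echo MALICIOUS
  elif [ "${#line}" -lt 500 ] &&
       printf '%s\n' "$line" | LC_ALL=C grep -Eq '^[[:print:][:space:]]*$'; then
    echo SAFE
  else
    echo SUSPICIOUS             # anything unrecognized is treated as guilty
  fi
}

# Release only if every line is SAFE; a single other verdict quarantines the file.
verify_file() {
  verdict=RELEASED
  n=0
  while IFS= read -r line || [ -n "$line" ]; do
    n=$((n + 1))
    v=$(classify_line "$line")
    if [ "$v" != SAFE ]; then
      echo "$1:$n: $v"
      verdict=QUARANTINED
    fi
  done < "$1"
  echo "$verdict"
  [ "$verdict" = RELEASED ]
}
```

Note that the default branch is SUSPICIOUS, not SAFE: a line has to earn release by matching a safe rule, which is the inversion of the scanner's blocklist model.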
Two-stage defense: Post-install, skills pass through skill-scan.sh --strict (blocklist, fast) then skill-verify.sh --strict (allowlist, thorough). Both must pass.
Trust manifests: Our own skills can carry .trust files with SHA-256 content hashes, allowing them to skip verification when unchanged. External skills never have trust manifests.
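
A trust manifest of this kind boils down to storing a hash and comparing it later. The sketch below assumes a `.trust` file that holds a single SHA-256 of `SKILL.md`; the real manifest written by `trust-manifest.sh` may cover more files or use a different layout, and the function names are hypothetical.

```shell
# Print the SHA-256 of a file, using whichever common tool is available.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d' ' -f1
  else
    shasum -a 256 "$1" | cut -d' ' -f1
  fi
}

# write_trust <skill-dir>: record the current hash of SKILL.md.
write_trust() {
  sha256_of "$1/SKILL.md" > "$1/.trust"
}

# is_trusted <skill-dir>: succeed only if a manifest exists and still matches.
is_trusted() {
  [ -f "$1/.trust" ] &&
    [ "$(cat "$1/.trust")" = "$(sha256_of "$1/SKILL.md")" ]
}
```

Any edit to the file changes its hash, so a modified skill falls back to full verification even if its manifest is left in place.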
```shell
make verify-skill SKILL=docker-sandbox    # Verify single skill
make verify-all                           # Verify all skills
make verify-report SKILL=docker-sandbox   # Per-line verdict report
```
Behavioral assertions for skills, the "pytest for SKILL.md files":
```shell
assert_section_exists "$SKILL" "When to Use"
assert_contains "$SKILL" "docker\s+(run|build|exec)"
assert_not_contains "$SKILL" "(TODO|FIXME|XXX)"
assert_min_code_blocks "$SKILL" 8
assert_frontmatter_field "$SKILL" "name" "^docker-sandbox$"
```
Write `tests/<skill-name>.test.sh` with `test_*` functions. The runner discovers and executes them automatically.
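
As an illustration, a test file might look like the sketch below. The two assertion functions are minimal stand-ins written for this example so that it runs on its own; the real implementations live in `tests/_framework/assertions.sh` and may behave differently. The default `SKILL` path is also an assumption.

```shell
# Path to the skill under test (assumed default; a runner would normally set this).
SKILL=${SKILL:-skills/my-tool/SKILL.md}

# Stand-in assertions so the sketch is self-contained.
assert_section_exists() {
  grep -Eq "^## $2\$" "$1" || { echo "FAIL: missing section '$2'"; return 1; }
}
assert_not_contains() {
  ! grep -Eq "$2" "$1" || { echo "FAIL: found forbidden pattern $2"; return 1; }
}

# Each test_* function holds one behavioral expectation.
test_has_usage_section() {
  assert_section_exists "$SKILL" "When to Use"
}

test_no_placeholders() {
  assert_not_contains "$SKILL" "(TODO|FIXME|XXX)"
}
```

Keeping one expectation per `test_*` function means a failure message points directly at the broken requirement.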
Gated pipeline: lint → scan → test must all pass before molthub publish runs. Usage:
```shell
make publish SKILL=my-tool VERSION=1.0.0
```
Twenty-four production-quality skills filling infrastructure gaps in the ClawHub registry.
| Skill | Install | What It Does |
|---|---|---|
| Docker Sandbox | `molthub install docker-sandbox` | Docker sandbox VM management, network proxy, workspace mounting, troubleshooting |
| CSV Data Pipeline | `molthub install csv-pipeline` | CSV/JSON/TSV processing with awk and Python: filter, join, aggregate, deduplicate, validate, convert |
| API Development | `molthub install api-dev` | curl testing, bash/Python test runners, OpenAPI spec generation, mock servers, Express scaffolding |
| CI/CD Pipeline | `molthub install cicd-pipeline` | GitHub Actions for Node/Python/Go/Rust, matrix builds, caching, Docker build+push, secrets management |

| Skill | Install | What It Does |
|---|---|---|
| SQL Toolkit | `molthub install sql-toolkit` | SQLite/PostgreSQL/MySQL: schema design, queries, CTEs, window functions, migrations, EXPLAIN, indexing |
| Test Patterns | `molthub install test-patterns` | Jest/Vitest, pytest, Go, Rust, Bash: unit tests, mocking, fixtures, coverage, TDD, integration testing |
| Log Analyzer | `molthub install log-analyzer` | Log parsing, error patterns, stack trace extraction, structured logging setup, real-time monitoring, correlation |
| Security Audit Toolkit | `molthub install security-audit-toolkit` | Dependency scanning, secret detection, OWASP patterns, SSL/TLS verification, file permissions, audit scripts |
| Infrastructure as Code | `molthub install infra-as-code` | Terraform, CloudFormation, Pulumi: VPC, compute, storage, state management, multi-environment patterns |
| Performance Profiler | `molthub install perf-profiler` | CPU/memory profiling, flame graphs, benchmarking, load testing, memory leak detection, query optimization |

| Skill | Install | What It Does |
|---|---|---|
| Git Workflows | `molthub install git-workflows` | Interactive rebase, bisect, worktree, reflog recovery, cherry-pick, subtree/submodule, sparse checkout, conflict resolution |
| Regex Patterns | `molthub install regex-patterns` | Validation patterns, parsing, extraction across JS/Python/Go/grep, search-and-replace, lookahead/lookbehind |
| SSH Tunnel | `molthub install ssh-tunnel` | Local/remote/dynamic port forwarding, jump hosts, SSH config, key management, scp/rsync, connection debugging |
| Container Debug | `molthub install container-debug` | Docker logs, exec, networking diagnostics, resource inspection, multi-stage build debugging, health checks, Compose |
| Data Validation | `molthub install data-validation` | JSON Schema, Zod (TypeScript), Pydantic (Python), CSV/JSON integrity checks, migration validation |
| Shell Scripting | `molthub install shell-scripting` | Argument parsing, error handling, trap/cleanup, temp files, parallel execution, portability, config parsing |
| DNS & Networking | `molthub install dns-networking` | DNS debugging (dig/nslookup), port testing, firewall rules, curl diagnostics, proxy config, certificates |
| Cron & Scheduling | `molthub install cron-scheduling` | Cron syntax, systemd timers, one-off jobs, timezone/DST handling, job monitoring, locking, idempotent patterns |
| Encoding & Formats | `molthub install encoding-formats` | Base64, URL encoding, hex, Unicode, JWT decoding, hashing/checksums, serialization format conversion |
| Makefile & Build | `molthub install makefile-build` | Make targets, pattern rules, Go/Python/Node/Docker Makefiles, Just and Task as modern alternatives |

| Skill | Install | What It Does |
|---|---|---|
| Skill Writer | `molthub install skill-writer` | SKILL.md authoring guide: format spec, frontmatter schema, content patterns, templates, publishing checklist |
| Skill Reviewer | `molthub install skill-reviewer` | Skill quality audit: scoring rubric, defect checklists, structural/content/actionability review framework |
| Skill Search Optimizer | `molthub install skill-search-optimizer` | Registry discoverability: semantic search mechanics, description optimization, visibility testing, competitive positioning |

| Skill | Install | What It Does |
|---|---|---|
| Emergency Rescue Kit | `molthub install emergency-rescue` | Git disasters, credential leaks, disk full, OOM kills, database failures, deploy rollbacks, SSH lockouts, network outages, with step-by-step recovery |
Research & Security Findings
Ecosystem exploration produced several research artifacts:
- Trojanized skill discovery: `moltbook-ay` contained instructions to download and execute malware via password-protected archives. Classic social engineering adapted for autonomous agents. No code was executed; the `molthub install` process was verified from source to be download-extract-write only.
- ClawHub platform analysis: API reverse-engineering, registry discovery protocol, skill format schema, publishing flow, semantic search mechanics, and registry statistics at one week old (200+ skills). Full report: clawdhub-platform-report.md.
- Security compilation: Willison's "lethal trifecta" framework, CVE-2026-25253 (one-click RCE), the ClawHavoc supply chain campaign (341 malicious skills), the Moltbook database breach, and 21,639 exposed instances. Full analysis: security-report.md.
- End-to-end narrative: from package vetting to 24 published skills, ecosystem retraction, and lessons learned. Full narrative: journey.md.
```text
clawhub-forge/
  skills/                          # Published skill bundles
    docker-sandbox/SKILL.md
    csv-pipeline/SKILL.md
    ... (25 skills total)
  tools/                           # Workbench tooling
    lib/
      common.sh                    # Colors, logging, skill discovery
      frontmatter.sh               # YAML frontmatter parser + validator
      patterns.sh                  # Malicious pattern database (87 patterns, 13 categories, MITRE ATT&CK)
      line-classifier.sh           # Zero-trust line classifier (SAFE/SUSPICIOUS/MALICIOUS)
      trust-manifest.sh            # .trust file generation + hash validation
      sarif_formatter.py           # SARIF 2.1.0 output formatter
    skill-lint.sh                  # Linter
    skill-scan.sh                  # Offline security scanner
    skill-verify.sh                # Zero-trust skill verifier
    skill-test.sh                  # Test runner wrapper
    skill-new.sh                   # Skill scaffolder (creates skill + test)
    skill-publish.sh               # Gated publisher
    skill-stats.sh                 # Adoption metrics with trends + ranking
    registry-explore.sh            # Registry browsing + competitive search
    workbench-verify.sh            # 12-point health verification
    pipeline-report.sh             # Pipeline value summary
  templates/                       # Skill templates
    cli-tool/SKILL.md              # CLI/tool reference template
    workflow/SKILL.md              # Process/methodology template
    language-ref/SKILL.md          # Language/syntax reference template
    _test.template.sh              # Auto-generated test file template
  tests/                           # Behavioral tests
    _framework/
      runner.sh                    # Test file discovery + execution
      assertions.sh                # assert_section_exists, assert_contains, etc.
    scanner-self-test/             # Scanner accuracy validation
      known-bad.md                 # Hits every pattern category
      known-clean.md               # Zero findings expected
      allowlisted.md               # Findings suppressed via .scanignore
      run.sh                       # Self-test runner
    docker-sandbox.test.sh         # 25 test files (100% skill coverage)
    ...
  docs/
    journey.md                     # Full session narrative
    research/
      clawdhub-platform-report.md  # API reverse-engineering report
      security-report.md           # Trojanized skill + security findings
    setup/
      claude-speckit.md            # Spec-driven development reference
      airgapped-sandbox.md         # Docker sandbox setup guide
  .devcontainer/
    devcontainer.json              # Dev container config
    package.json                   # Workbench package manifest
    setup.sh                       # Post-create setup script
  .github/workflows/
    skill-ci.yml                   # CI: lint → scan → test on PR
  Makefile                         # Single entry point for all commands
```
Each skill is a SKILL.md file with YAML frontmatter that tells an AI agent when and how to use it:
```markdown
---
name: my-skill
description: When to activate this skill
metadata: {"clawdbot":{"emoji":"...","requires":{"anyBins":["tool1","tool2"]}}}
---
# Skill Title

Reference material, patterns, commands, and examples the agent
can follow to perform the task.
```
Install any skill with `molthub install <slug>`. Skills are placed in `./skills/<slug>/` and loaded by the agent on demand.
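
To show how the frontmatter in the skill format above can be read from a script, here is a small sketch that prints one top-level field. `frontmatter_field` is a hypothetical helper and not the parser in `tools/lib/frontmatter.sh`; it assumes simple single-line `key: value` fields between the two `---` delimiters.

```shell
# Hypothetical helper: print the value of one top-level frontmatter field.
# Usage: frontmatter_field <file> <key>
frontmatter_field() {
  awk -v key="$2" '
    /^---[ \t]*$/ { fence++; next }          # count the --- delimiter lines
    fence == 1 {                              # only look inside the frontmatter
      prefix = key ":"
      if (index($0, prefix) == 1) {
        value = substr($0, length(prefix) + 1)
        sub(/^[ \t]+/, "", value)             # trim leading whitespace
        print value
        exit
      }
    }
    fence >= 2 { exit }                       # stop once the body starts
  ' "$1"
}

# Example use:
# frontmatter_field skills/my-skill/SKILL.md name
```

Stopping at the second delimiter keeps lines in the body that happen to start with `name:` from being mistaken for metadata.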
This workbench and two companion repos form a trifecta for safe engagement with the OpenClaw ecosystem:
- openclaw-vault — Run agents safely. Hardened container with proxy-side API key injection, domain allowlisting, kill switch, 15-point security verification.
- moltbook-pioneer — Socialize safely. Research and safe participation in the Moltbook agentic social network. Feed scanner, agent census, identity management.
Skills are published to ClawHub under its registry terms. Source files in this repo are MIT licensed.