
feat(plugin): add destructive-command-guard plugin #23946

Open
leszekszpunar wants to merge 9 commits into anthropics:main from leszekszpunar:feat/destructive-command-guard


@leszekszpunar leszekszpunar commented Feb 7, 2026

Summary

  • Add destructive-command-guard plugin -- a PreToolUse hook that blocks irreversible Bash commands and warns about edits to agent policy files
  • Blocks dangerous mass-deletion (root, home, cwd, system paths), mass Docker operations, destructive git commands, indirect execution, and alternative deletion tools
  • Warns (once per session via systemMessage) on edits to CLAUDE.md, .claude/settings.json, .claude/settings.local.json, hooks/hooks.json
  • Supports multi-command chain splitting, env var disable (ENABLE_DESTRUCTIVE_GUARD=0), and session-scoped state with 30-day auto-cleanup
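The overall flow described above could look roughly like the following. This is a minimal sketch, not the plugin's actual code: the payload field names (`tool_name`, `tool_input`, `command`) are assumed from the Claude Code hook input format, and `BLOCKED_COMMANDS` and `decide` are hypothetical names with a deliberately tiny blocklist.

```python
import json
import os
import sys
from typing import Mapping, Tuple

# Hypothetical, deliberately tiny blocklist; the real plugin checks far more.
BLOCKED_COMMANDS = frozenset({"rm -rf /", "rm -rf ~", "docker system prune -af"})


def decide(event: Mapping, env: Mapping[str, str]) -> Tuple[int, str]:
    """Return (exit_code, message): 2 blocks the tool call, 0 allows it."""
    if env.get("ENABLE_DESTRUCTIVE_GUARD", "1") == "0":
        return 0, ""  # guard disabled via the env var from the summary
    if event.get("tool_name") != "Bash":
        return 0, ""  # this sketch only inspects Bash tool calls
    command = event.get("tool_input", {}).get("command", "").strip()
    if command in BLOCKED_COMMANDS:
        return 2, f"Blocked destructive command: {command}"
    return 0, ""


def main() -> int:
    """Read the hook payload from stdin and report the decision."""
    event = json.load(sys.stdin)
    code, message = decide(event, os.environ)
    if message:
        print(message, file=sys.stderr)  # the reason is reported on stderr
    return code
```

A hook script would end with `sys.exit(main())`; keeping `decide` free of I/O makes it easy to unit-test.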

Closes #23871, closes #23870

Related issues

This plugin addresses or mitigates multiple open reports of destructive agent behavior:

| Issue | Description | Coverage |
| --- | --- | --- |
| #23871 | Agent lacks detection for self-destructive operations | Direct fix: blocks mass deletion, mass Docker operations, and destructive git commands |
| #23870 | Agent can overwrite/delete CLAUDE.md and policy files | Direct fix: warns on policy file edits |
| #23010 | Plan mode missing permission prompt for destructive rm | Blocks the rm pattern regardless of mode |
| #22638 | Claude ignored CLAUDE.md, executed git stash drop causing data loss | Blocks git stash drop without specific ref |
| #18290 | Destructive git checkout without confirmation | Blocks git checkout -- . pattern |
| #14944 | docker volume rm without confirmation, Supabase data loss | Blocks mass docker volume rm $(docker volume ls -q) |
| #17908 | CLAUDE.md rules for destructive ops not reliably followed | Plugin enforces rules at tool level, independent of model compliance |
| #13904 | Destructive git reset --hard during test restructuring | Blocks git reset --hard without explicit target |
| #19874 | Plan mode has no tool-level enforcement | Plugin operates at PreToolUse level, enforced regardless of mode |
| #17296 | Dangerous destructive model behavior | Provides blocklist-based safety net |
| #24196 | HIGH-PRIORITY Unauthorized rm -rf in chained command caused data loss | Blocks chained rm -rf via multi-command splitting (&&, ;) |
| #24319 | Newline in command string caused rm -rf to delete project directory | Command splitter handles \n separators, checks each part independently |

Not covered (out of scope for this plugin):

Security hardening

  • Session ID sanitization (CWE-22 path traversal prevention)
  • Debug logs in ~/.claude/ not /tmp (CWE-377 symlink attack prevention)
  • Atomic state writes via tempfile + os.replace (race condition mitigation)
  • Indirect execution detection (eval, sh -c, bash -c, pipe-to-shell, base64 decode)
  • Path normalization (catches //, /./, /../..)
  • Command splitter handles ||, newlines, &&, ;, |
  • Adversarial review + two rounds of Gemini cross-verification applied
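Two of the hardening items above, session ID sanitization and atomic state writes, can be sketched as follows. This is an illustration under assumptions, not the plugin's code: the function names, the allowed character set, the 64-character cap, and the `.json` state file layout are all hypothetical.

```python
import contextlib
import json
import os
import re
import tempfile
from pathlib import Path


def sanitize_session_id(session_id: str) -> str:
    """Drop every character that could help build a path traversal (CWE-22)."""
    cleaned = re.sub(r"[^A-Za-z0-9_-]", "", session_id)
    return cleaned[:64] or "default"


def write_state_atomically(state_dir: Path, session_id: str, state: dict) -> Path:
    """Write state to a temp file, then swap it into place with os.replace."""
    state_dir.mkdir(parents=True, exist_ok=True)
    target = state_dir / f"{sanitize_session_id(session_id)}.json"
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=str(state_dir), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle)
        os.replace(tmp_name, target)  # readers see the old or the new file, never a partial one
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)  # do not leave a stray temp file behind on failure
        raise
    return target
```

Because the session ID is reduced to a safe character set before it is used in a file name, an input such as `../../etc/passwd` cannot escape the state directory.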

Test plan

  • 102 automated tests passing (rm, Docker, git, file protection, indirect execution, find -exec, xargs, env var, session ID)
  • Verify blocked commands return exit code 2 in live session
  • Verify allowed commands return exit code 0 in live session
  • Verify protected file edits emit systemMessage on first occurrence, silent on repeat
  • Install plugin via claude --plugin-dir ./plugins/destructive-command-guard and test end-to-end
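The "systemMessage on first occurrence, silent on repeat" behavior in the test plan could be modeled like this. It is a sketch with assumed details: the warning is tracked per file here (the PR does not say whether it is per file or per session overall), the `already_warned` set stands in for the persisted session state, and `policy_warning` is a hypothetical name.

```python
import json
from typing import Optional, Set

# Policy files named in the PR summary.
PROTECTED_NAMES = (
    "CLAUDE.md",
    ".claude/settings.json",
    ".claude/settings.local.json",
    "hooks/hooks.json",
)


def policy_warning(file_path: str, already_warned: Set[str]) -> Optional[str]:
    """Return a JSON hook response on the first edit of a policy file, else None."""
    normalized = file_path.replace("\\", "/")
    for name in PROTECTED_NAMES:
        if normalized == name or normalized.endswith("/" + name):
            if name in already_warned:
                return None  # silent on repeat within the same session
            already_warned.add(name)
            return json.dumps({"systemMessage": f"Warning: editing policy file {name}"})
    return None  # not a protected file
```

The edit is never blocked in this sketch; the caller would print the returned JSON to stdout and exit with code 0.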

…xamples

Options in interactive-commands.md Pattern 2 used inline
comma-separated format without descriptions, causing
skill-creator to crash with 'description.split is not a function'
when generating AskUserQuestion tool calls from these patterns.

Convert all bare option lists to bulleted format with
parenthetical descriptions, consistent with the AskUserQuestion
tool schema and the rest of the reference document.

Fixes anthropics#23855

Add a PreToolUse hook plugin that protects against self-destructive
agent operations (addresses anthropics#23871 and anthropics#23870):

- Block dangerous mass-deletion commands targeting /, ~, ., .., *, $HOME
- Block mass Docker operations (system prune, volume prune,
  container mass-removal via subshell, compose down with volumes)
- Block destructive git commands (clean without dry-run,
  checkout all changes, hard reset without explicit target)
- Warn (once per session) on edits to policy files (CLAUDE.md,
  .claude/settings.json, .claude/settings.local.json, hooks.json)
- Support splitting of multi-command chains (&&, ;, |)
- Disable via ENABLE_DESTRUCTIVE_GUARD=0 env var
- Session-scoped state with automatic cleanup (30 days)

Address bypass vectors found during adversarial security review and
Gemini code analysis:

- Add indirect execution blocking (eval, sh -c, bash -c, pipe-to-shell,
  base64 decode to shell)
- Add system path protection for rm (covers /etc, /usr, /var, /home,
  /boot, /opt, /bin, /sbin, /lib, /Users, /Applications, /System)
- Add path normalization to catch //, /./, /../.. variants
- Add command substitution and variable expansion detection in rm targets
- Add backtick detection in rm arguments
- Add Docker container/image/network/builder prune blocking
- Add git push --force, git branch -D, git stash clear blocking
- Add file protection via Bash commands (echo >, sed -i, mv, cp, tee,
  truncate, dd targeting CLAUDE.md and policy files)
- Add find -delete blocking on dangerous paths
- Fix symlink attack vector: move debug log from /tmp to ~/.claude
- Fix path traversal: sanitize session_id input (CWE-22)
- Fix race condition: use atomic writes via tempfile + os.replace
- Document security model limitations in README
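The path normalization and system path protection from this commit might be approximated as below. This is a sketch, assuming exact matches against the listed directories only; `normalize_target` and `is_protected_target` are hypothetical names. Note that `posixpath.normpath` keeps exactly two leading slashes, so the `//` case needs explicit handling.

```python
import posixpath

# System paths listed in the commit message, plus the filesystem root.
SYSTEM_PATHS = frozenset({
    "/", "/etc", "/usr", "/var", "/home", "/boot", "/opt", "/bin",
    "/sbin", "/lib", "/Users", "/Applications", "/System",
})


def normalize_target(path: str) -> str:
    """Collapse //, /./ and /../ so disguised paths compare equal."""
    normalized = posixpath.normpath(path)
    if normalized.startswith("//"):
        # normpath preserves a double leading slash; fold it into one.
        normalized = "/" + normalized.lstrip("/")
    return normalized


def is_protected_target(path: str) -> bool:
    """True when the normalized path is exactly one of the protected directories."""
    return normalize_target(path) in SYSTEM_PATHS
```

With this, `//etc`, `/usr/./` and `/var/log/../..` are all recognized, while an ordinary path such as `/tmp/build` is not.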
Replace all 68 Polish comments and docstrings in
destructive_command_guard.py with English equivalents to maintain
consistency with the rest of the repository.

…ss ops

Address additional destructive patterns found in related open issues:

- Block 'git stash drop' without a specific stash ref (anthropics#22638: user lost
  3 days of stashed work). 'git stash drop stash@{N}' still allowed.
- Block 'docker volume rm $(docker volume ls -q)' mass removal (anthropics#14944:
  Supabase data volumes deleted). Named volume removal still allowed.
- Update README with new patterns
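The two rules in this commit can be illustrated with token-level checks. This is a sketch, not the plugin's implementation: the function names are hypothetical, and commands that `shlex` cannot parse (for example unbalanced quotes) are not handled.

```python
import shlex


def is_unscoped_stash_drop(command: str) -> bool:
    """True for 'git stash drop' with no explicit stash ref such as stash@{1}."""
    tokens = shlex.split(command)
    if tokens[:3] != ["git", "stash", "drop"]:
        return False
    refs = [token for token in tokens[3:] if not token.startswith("-")]
    return not refs


def is_mass_volume_removal(command: str) -> bool:
    """True when 'docker volume rm' takes its targets from a command substitution."""
    tokens = shlex.split(command)
    if tokens[:3] != ["docker", "volume", "rm"]:
        return False
    return any(token.startswith(("$(", "`")) for token in tokens[3:])
```

Under these checks `git stash drop stash@{2}` and `docker volume rm pgdata` pass, while the unscoped and subshell-driven forms are flagged.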
Address four bypass vectors identified during final Gemini code analysis:

- Fix command splitter to handle || operator and newline separators
  (previously only split on &&, ;, |)
- Block find -exec rm and find -execdir rm on dangerous paths
- Block xargs rm piped from find on dangerous paths
- Update README with expanded alternative deletion patterns
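A splitter covering the separators named in this commit could be as small as the following. This is a naive sketch: `split_commands` is a hypothetical name, and unlike a real shell parser it does not respect quoting, so separators inside quoted strings are split as well.

```python
import re
from typing import List

# '||' and '&&' come first so they are not consumed as two single '|' characters.
_SEPARATORS = re.compile(r"\|\||&&|[;|\n]")


def split_commands(command: str) -> List[str]:
    """Split a shell command line into parts that can be checked independently."""
    return [part.strip() for part in _SEPARATORS.split(command) if part.strip()]
```

Each returned part can then be run through the same block checks, so a destructive command cannot hide behind `||` or a newline.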
…lookup

Replace fragile substring-based message matching in
check_bash_file_modification with a call to check_file_protection(),
ensuring consistent message resolution across Write/Edit and Bash
code paths.

@leszekszpunar
Author

Update: Added git restore . / --staged . / --worktree . blocking (commit 60d3663). This closes the gap where the modern git restore equivalent of git checkout -- . was not guarded. Also covers :/ (whole-repo pathspec) and short flags (-S, -W).
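A check along the lines described in this update might look like the sketch below. It is an assumption-laden illustration: `is_mass_git_restore` is a hypothetical name, and every non-flag token after `git restore` is treated as a pathspec, which is cruder than real git option parsing.

```python
import shlex

# Pathspecs that select the whole working directory or the whole repository.
_WHOLE_TREE = frozenset({".", ":/"})


def is_mass_git_restore(command: str) -> bool:
    """True when 'git restore' targets '.' or ':/' regardless of --staged/--worktree flags."""
    tokens = shlex.split(command)
    if tokens[:2] != ["git", "restore"]:
        return False
    pathspecs = [token for token in tokens[2:] if not token.startswith("-")]
    return any(spec in _WHOLE_TREE for spec in pathspecs)
```

Because flags are skipped rather than matched, the long forms (`--staged`, `--worktree`) and the short forms (`-S`, `-W`) are treated the same way.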

@frmoretto

Hey @leszekszpunar — nice work on the safety coverage here, clearly a lot of thought went into it.

Just wanted to flag that Hardstop (npm i hardstop) already covers this space as a published Claude Code plugin (v1.4.3), and was submitted through the official plugin submission process. It's been on npm for a while now with:

  • 428 security patterns (destructive commands, credential theft, infrastructure teardown, prompt injection, indirect execution, macOS-specific vectors)
  • 100% code coverage across all hooks
  • Chain-aware command splitting (&&, |, ;, \n)
  • LLM-assisted semantic analysis for obfuscated/edge-case commands
  • Fail-closed design — blocks by default when uncertain
  • Separate hardstop-patterns package for reuse in other tools
  • SLSA build provenance via Sigstore

There's significant overlap between the two projects. I'd encourage the maintainers to consider the existing ecosystem work here before merging a parallel implementation. Happy to collaborate if there are gaps worth addressing together.

@leszekszpunar
Author

Hey @frmoretto — thanks for the heads-up, and nice work on Hardstop. I've reviewed the codebase and patterns in detail. You're right there's overlap, but I think the two plugins occupy different design points rather than being direct duplicates.

Where Hardstop clearly wins:

  • Breadth of coverage — cloud CLIs (AWS/GCP/Firebase), reverse shells, credential exfiltration, macOS/Windows-specific vectors, SQL injection, IaC destructive ops. We have zero coverage there.
  • Read tool protection — blocking reads of .ssh/, .aws/credentials, .env etc. We don't hook into Read at all.
  • LLM semantic fallback for novel/obfuscated commands.
  • Risk scoring, MITRE ATT&CK mapping, structured audit logging.
  • Cross-platform (Windows/PowerShell support).

Where this plugin goes deeper:

  • Git analysis — we parse flags semantically rather than pattern-match. We distinguish --force vs --force-with-lease, git clean -n (dry-run, allowed) vs git clean -fd (blocked), git reset --hard with vs without target, git restore --staged . vs specific files, git stash drop with vs without ref. Hardstop's safe allowlist blanket-allows git checkout, git restore, git merge — meaning git checkout -- . passes Layer 1 uncaught.
  • Policy file protection — we warn on edits to CLAUDE.md, settings.json, hooks.json (both via Write/Edit tools and Bash redirects/sed/tee/mv). This guards against agent self-modification, which is a distinct threat vector.
  • Write/Edit tool monitoring — we hook into Write, Edit, MultiEdit. Hardstop only hooks Bash/PowerShell/Read.
  • Zero dependencies — Python 3.7+ stdlib only. No YAML, no npm runtime, no subprocess calls to Claude CLI. Single 820-line file.
  • Path normalization — explicit handling of //, /./, /../.. before matching.
  • Variable/subshell detection — explicitly blocks dynamic expansion targets like $(cmd), $VAR, backtick substitution with targeted error messages.

Different design philosophy:

  • Hardstop is broad + LLM-assisted — wide coverage net with semantic fallback.
  • This plugin is narrow + deep — fewer categories but more precise per-command analysis within those categories, zero external dependencies, and no API cost.

I think these are genuinely complementary. If maintainers are interested, one path forward could be combining the deep semantic analysis from this plugin (git, Docker, policy files, Write/Edit monitoring) with the broader pattern library from hardstop-patterns. Open to discussing that.

…ted text

Add _strip_quoted_content() that neutralizes content inside single quotes,
double quotes (with backslash escape handling), and heredocs before pattern
matching. This prevents false positives when dangerous patterns like rm -rf
or git reset --hard appear as text arguments rather than executable commands
(e.g., gh pr comment --body "text about rm -rf").

handle_bash() now strips quoted content once and passes the safe version to
all check functions. Actual dangerous commands remain blocked -- eval, bash -c,
and other indirect execution vectors are still caught because the command
structure (first token) survives stripping.
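The quote-stripping idea can be sketched as a small state machine. This is an illustration only, not the `_strip_quoted_content()` from the commit: `strip_quoted` is a hypothetical name, heredocs are left out, and the quote characters themselves are kept so the command structure stays visible.

```python
def strip_quoted(command: str) -> str:
    """Blank out the contents of single- and double-quoted strings."""
    result = []
    quote = ""  # empty when outside quotes, otherwise the active quote character
    i = 0
    while i < len(command):
        ch = command[i]
        if not quote:
            if ch == "\\" and i + 1 < len(command):
                result.append(command[i:i + 2])  # keep an escaped character verbatim
                i += 2
                continue
            if ch in ("'", '"'):
                quote = ch  # entering a quoted region
            result.append(ch)
        elif quote == '"' and ch == "\\" and i + 1 < len(command):
            i += 2  # skip an escaped character inside double quotes
            continue
        elif ch == quote:
            quote = ""  # leaving the quoted region
            result.append(ch)
        i += 1
    return "".join(result)
```

For the example in the commit message, `gh pr comment --body "text about rm -rf"` becomes `gh pr comment --body ""`, while `bash -c 'rm -rf /'` still begins with `bash -c` and can be caught by an indirect-execution check.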
@leszekszpunar
Author

Hi team 👋

Friendly bump — this PR has been open for ~18 days with no review yet. It's mergeable with no conflicts.

Quick context on why this matters:

Would really appreciate an initial review when someone has bandwidth. Happy to address any feedback. Thanks!

@leszekszpunar
Author

Update (March 20): Since this PR was submitted, destructive-command incidents have continued to escalate. Recent reports from the last few days:

There's also growing community momentum around safety plugins — several new guard PRs since February: #34257, #33390, #31633, #30521, #30692.

This plugin remains conflict-free and opt-in. It specifically addresses git/Docker/filesystem destruction patterns — areas where broader pattern-matching approaches have documented gaps (e.g., git checkout -- ., git restore --staged .).

Would appreciate a review when the team has bandwidth. Happy to adapt to any feedback.
