AI-powered command approval gate for Claude Code. Classifies shell commands as allow, deny, or ask before execution.
shall runs as a Claude Code PreToolUse hook. Every Bash command is sent to an AI model which classifies it:
- allow — normal development commands pass through silently
- deny — dangerous commands are blocked (surfaced as "ask" so the human decides)
- ask — ambiguous commands or merge operations prompt the user for approval
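The three verdicts above, including the rule that a deny is shown to the user as an ask, can be sketched as follows. This is a minimal illustration and not shall's actual code: the function name `surface_verdict` and the handling of unrecognized model output are assumptions.

```python
# Illustrative sketch only. The verdict names come from this README;
# the function name and the fallback for unknown output are assumptions.
VERDICTS = {"allow", "deny", "ask"}


def surface_verdict(model_verdict: str) -> str:
    """Map the model's classification to what the user experiences."""
    verdict = model_verdict.strip().lower()
    if verdict not in VERDICTS:
        # Assumption: output that cannot be parsed is treated as ambiguous.
        return "ask"
    if verdict == "deny":
        # Dangerous commands are surfaced as "ask" so the human decides.
        return "ask"
    return verdict
```

With this mapping only `allow` passes silently; every other outcome ends in a prompt.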
Default provider: Gemini 2.5 Flash Lite (fast, no battery drain). Falls back to Ollama (local, offline-capable) if Gemini is unavailable.
macOS
brew install nushell

Linux
brew install nushell
# or: https://www.nushell.sh/book/installation.html

Windows
winget install Nushell.Nushell

export GEMINI_API_KEY=your-key-here # add to ~/.bashrc or ~/.zshrc

Get a free key at aistudio.google.com.

mkdir -p ~/.claude/hooks
curl -o ~/.claude/hooks/shall.nu \
  https://raw.githubusercontent.com/bonisoft3/shall/main/shall.nu

Add to ~/.claude/settings.json (create the file if it doesn't exist):
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "nu ~/.claude/hooks/shall.nu"
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "nu ~/.claude/hooks/shall.nu"
          }
        ]
      }
    ]
  }
}

That's it. Claude Code will now evaluate every Bash command through the gate. The PostToolUse entry feeds back which commands actually ran so shall can learn from your manual approvals (see Learning from your approvals).
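If the gate does not seem to fire, a quick sanity check is to confirm that both hook events are registered for the Bash matcher. The helper below is a hypothetical sketch (`registered_events` is not part of shall); it only inspects the settings structure shown above.

```python
import json


def registered_events(settings_text: str, script: str = "shall.nu") -> list[str]:
    """Return the hook events whose Bash matcher runs the given script."""
    settings = json.loads(settings_text)
    events = []
    for event, entries in settings.get("hooks", {}).items():
        for entry in entries:
            if entry.get("matcher") != "Bash":
                continue
            commands = [h.get("command", "") for h in entry.get("hooks", [])]
            if any(script in command for command in commands):
                events.append(event)
                break
    return sorted(events)
```

For the configuration above, passing the file's contents should return both `PostToolUse` and `PreToolUse`; a missing entry means that half of the setup was skipped.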
For offline use, install Ollama as a fallback:
brew install ollama
ollama pull qwen2.5-coder:7b

shall falls back to Ollama automatically when Gemini is unavailable (no API key, quota exhausted, network down).
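The fallback behavior can be pictured roughly like this. It is a sketch under stated assumptions, not the logic in shall.nu: the two classifier callables are placeholders for the real API calls, and re-raising the error when fallback is disabled is a guess at reasonable behavior.

```python
from typing import Callable, Mapping

Classifier = Callable[[str], str]  # takes a command, returns a verdict


def classify_with_fallback(
    command: str,
    env: Mapping[str, str],
    gemini: Classifier,
    ollama: Classifier,
) -> str:
    """Try Gemini first, then Ollama if Gemini is unavailable."""
    provider = env.get("SHALL_PROVIDER", "gemini")
    fallback = env.get("SHALL_FALLBACK", "true").lower() == "true"
    if provider == "ollama":
        return ollama(command)
    try:
        if not env.get("GEMINI_API_KEY"):
            raise RuntimeError("GEMINI_API_KEY is not set")
        return gemini(command)
    except Exception:
        # Covers a missing key, exhausted quota, and network failures alike.
        if not fallback:
            raise  # Assumption: with fallback off, the error propagates.
        return ollama(command)
```

Passing the environment and the classifiers in as arguments keeps the sketch testable without network access.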
Tested with 31 commands (23 allow, 1 ask, 7 deny):
| Model | Score | Avg latency | Provider |
|---|---|---|---|
| gemini-2.5-flash-lite | 31/31 | ~630ms | Gemini (default) |
| gemini-2.5-flash | 31/31 | ~1.9s | Gemini |
| qwen2.5-coder:7b | 31/31 | ~19.7s | Ollama |
| gemma3:1b | 29/31 | ~11.5s | Ollama |
| qwen2.5-coder:3b | 28/31 | ~13.6s | Ollama |
Environment variables:
| Variable | Default | Description |
|---|---|---|
| SHALL_PROVIDER | gemini | Provider: gemini or ollama |
| SHALL_FALLBACK | true | Fall back to ollama if gemini fails |
| GEMINI_API_KEY | — | Google AI API key (required for gemini) |
| GEMINI_MODEL | gemini-2.5-flash-lite | Gemini model ID |
| GEMINI_URL | https://generativelanguage.googleapis.com/v1beta | Gemini API base URL |
| OLLAMA_MODEL | qwen2.5-coder:7b | Ollama model for classification |
| OLLAMA_URL | http://localhost:11434 | Ollama API endpoint |
| SHALL_HISTORY | ~/.claude/shall-history.jsonl | Verdict log used for on-the-fly learning |
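As a rough picture of how these variables combine with their defaults, the sketch below resolves a configuration from an environment mapping. The variable names and defaults are taken from the table; the `resolve_config` helper itself is hypothetical and is not how shall.nu is written.

```python
import os
from typing import Mapping, Optional

# Defaults copied from the table above. GEMINI_API_KEY has no default.
DEFAULTS = {
    "SHALL_PROVIDER": "gemini",
    "SHALL_FALLBACK": "true",
    "GEMINI_MODEL": "gemini-2.5-flash-lite",
    "GEMINI_URL": "https://generativelanguage.googleapis.com/v1beta",
    "OLLAMA_MODEL": "qwen2.5-coder:7b",
    "OLLAMA_URL": "http://localhost:11434",
    "SHALL_HISTORY": "~/.claude/shall-history.jsonl",
}


def resolve_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Overlay the environment on the documented defaults."""
    env = os.environ if env is None else env
    config = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    config["GEMINI_API_KEY"] = env.get("GEMINI_API_KEY")  # None when unset
    config["SHALL_HISTORY"] = os.path.expanduser(config["SHALL_HISTORY"])
    return config
```

For example, `resolve_config({"OLLAMA_MODEL": "gemma3:1b"})` changes only the Ollama model and leaves every other setting at its default.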
If a command is consistently misclassified, add it as a few-shot example in the gate_prompt function in shall.nu. Small models are very sensitive to examples — a single added example often fixes an entire class of misclassifications.
shall keeps a log at ~/.claude/shall-history.jsonl (one record per Bash call). Every PreToolUse verdict is written with executed: null; the PostToolUse hook flips it to executed: true once Claude Code actually runs the command.
That gives shall one strong signal: cases where it said ask but you approved anyway. On the next call, up to 10 such overrides (deduped by command, newest first) are injected into the prompt as additional → allow examples, so the model nudges toward allowing similar commands automatically.
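The log lifecycle and the override selection described above can be sketched as below. Only `executed` and the ask/allow verdicts come from this README; the `command` and `verdict` field names, the matching of PostToolUse events by command text, and all function names are assumptions made for illustration.

```python
import json


def log_verdict(lines: list[str], command: str, verdict: str) -> list[str]:
    """PreToolUse: append a record whose executed flag is still null."""
    record = {"command": command, "verdict": verdict, "executed": None}
    return lines + [json.dumps(record)]


def mark_executed(lines: list[str], command: str) -> list[str]:
    """PostToolUse: flip the newest pending record for this command."""
    updated = list(lines)
    for i in range(len(updated) - 1, -1, -1):
        record = json.loads(updated[i])
        if record["command"] == command and record["executed"] is None:
            record["executed"] = True
            updated[i] = json.dumps(record)
            break
    return updated


def recent_overrides(lines: list[str], limit: int = 10) -> list[str]:
    """Commands shall flagged as ask but that ran anyway, newest first."""
    seen = set()
    overrides = []
    for line in reversed(lines):
        record = json.loads(line)
        if record.get("verdict") != "ask" or record.get("executed") is not True:
            continue
        if record["command"] in seen:
            continue  # dedupe by command, keeping the newest occurrence
        seen.add(record["command"])
        overrides.append(record["command"])
        if len(overrides) == limit:
            break
    return overrides
```

Each command returned by `recent_overrides` would then be rendered as a `<command> → allow` example in the prompt.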
Caveats:
- Only the ask → approved direction is captured. If shall said allow and you canceled, that is not currently logged (Claude Code does not fire PostToolUse for canceled tools).
- The file is rotated to the last 1000 lines once it grows past 1500.
- To wipe the learned context, delete the file. To inspect it: tail ~/.claude/shall-history.jsonl | nu --commands 'lines | each { from json } | table'.
- Set SHALL_HISTORY=/dev/null to disable logging entirely.

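The rotation rule in the caveats amounts to a simple threshold check. A minimal sketch, with the limits taken from the text and a hypothetical function name:

```python
def rotate(lines: list, keep: int = 1000, threshold: int = 1500) -> list:
    """Trim the log to its newest `keep` lines once it exceeds `threshold`."""
    if len(lines) > threshold:
        return lines[-keep:]
    return lines
```

Using a threshold above the retained size means the file is rewritten only once every few hundred records, not on every call.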
# Install dev tools
mise install
# Syntax check
sayt build
# Run tests (default: gemini)
sayt test # or: nu shall.test.nu
# Test ollama provider
nu shall.test.nu --provider ollama
# Test a different model
nu shall.test.nu --provider ollama --model qwen2.5-coder:3b
# Prompt optimization (multi-model comparison)
sayt verify # runs promptfoo eval
# Integration tests (Docker, ollama only)
docker compose run --build integrate