PromptDiff treats prompts like code: compare revisions, call out risky behavior changes, and leave a tidy local report that reviewers can trust.
It is a small TypeScript CLI for developers and agent operators who want deterministic prompt review evidence without SaaS dashboards, telemetry, or hidden network calls.
Prompt edits can quietly change instruction hierarchy, tool access, and output contracts. Those changes are easy to miss in a normal text diff but can break agents, parsers, and safety assumptions. PromptDiff gives those changes names.
npm install
npm run build
For local development, run the built CLI directly:
node dist/cli.js --help
After publishing or installing globally, use:
promptdiff --help
npm run build
node dist/cli.js compare examples/prompts/v1.md examples/prompts/v2.md --out prompt-risk.md
node dist/cli.js compare examples/prompts/v1.md examples/prompts/v2.md --format json
node dist/cli.js check examples/prompts/*.md --rules examples/rules.json --fail-on high
The compare command diffs two prompt/template revisions and produces stable Markdown or JSON.
promptdiff compare old.md new.md --format markdown --out report.md --fail-on high
promptdiff compare old.json new.json --format json
It classifies:
- instruction changes, including dangerous “ignore/bypass/override” language
- removed safety/refusal/secret-handling guardrails
- tool surface changes such as shell, browser, network, delete, database, or email access
- output-contract changes such as JSON/schema/format shifts
- secret-like values, redacted before output by default
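The classification above can be pictured as pattern matching over lines that were added or removed between revisions. The following is a minimal sketch under stated assumptions, not PromptDiff's actual implementation: the names `Finding`, `PATTERNS`, and `classifyChanges`, the category labels, and every regular expression are hypothetical and chosen only to mirror the list.

```typescript
// Hypothetical sketch: categories and patterns are illustrative assumptions,
// not the rules PromptDiff actually ships.
type Category = "instruction" | "guardrail" | "tool" | "output-contract";

interface Finding {
  category: Category;
  change: "added" | "removed";
  line: string;
}

const PATTERNS: Array<[Category, RegExp]> = [
  ["instruction", /\b(ignore|bypass|override)\b/i],
  ["guardrail", /\b(refuse|never reveal|secret|safety)\b/i],
  ["tool", /\b(shell|browser|network|delete|database|email)\b/i],
  ["output-contract", /\b(json|schema|format)\b/i],
];

function classifyChanges(oldText: string, newText: string): Finding[] {
  const oldLines = oldText.split("\n").map((l) => l.trim());
  const newLines = newText.split("\n").map((l) => l.trim());
  const findings: Finding[] = [];

  // Report every line present on one side only that matches a risk pattern.
  const scan = (lines: string[], other: string[], change: "added" | "removed"): void => {
    for (const line of lines) {
      if (line === "" || other.indexOf(line) !== -1) continue;
      for (const [category, pattern] of PATTERNS) {
        if (pattern.test(line)) findings.push({ category, change, line });
      }
    }
  };

  scan(oldLines, newLines, "removed");
  scan(newLines, oldLines, "added");
  return findings;
}
```

Because the sketch is purely line-based and regex-driven, the same pair of inputs always yields the same findings in the same order, which is the property a deterministic report depends on.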
The check command validates one or more prompt files against a simple JSON rules file.
{
  "requiredPhrases": ["protect customer secrets"],
  "forbiddenPhrases": ["ignore previous instructions"],
  "requireSections": ["Role", "Instructions", "Output Contract"],
  "maxSeverity": "high"
}
promptdiff check prompts/*.md --rules promptdiff.rules.json --fail-on high
Exit code 2 means the configured quality gate failed. Exit code 1 means a command/runtime error.
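To make the rules file concrete, here is a rough sketch of how its phrase and section fields could be evaluated. It rests on several assumptions that the source does not confirm: phrase matching is case-insensitive, a "section" is a Markdown heading whose title equals the section name, and the names `Rules`, `checkPrompt`, and `gateExitCode` are hypothetical. The `maxSeverity` field is left out because its exact semantics are not described here.

```typescript
// Hypothetical sketch of rule evaluation; matching semantics are assumptions.
interface Rules {
  requiredPhrases?: string[];
  forbiddenPhrases?: string[];
  requireSections?: string[];
}

function checkPrompt(text: string, rules: Rules): string[] {
  const lower = text.toLowerCase();
  const problems: string[] = [];

  for (const phrase of rules.requiredPhrases ?? []) {
    if (lower.indexOf(phrase.toLowerCase()) === -1) {
      problems.push(`missing required phrase: ${phrase}`);
    }
  }
  for (const phrase of rules.forbiddenPhrases ?? []) {
    if (lower.indexOf(phrase.toLowerCase()) !== -1) {
      problems.push(`forbidden phrase present: ${phrase}`);
    }
  }

  // Assumption: sections are Markdown headings such as "# Role".
  const headings = text
    .split("\n")
    .filter((line) => /^#{1,6}\s+/.test(line))
    .map((line) => line.replace(/^#{1,6}\s+/, "").trim().toLowerCase());
  for (const section of rules.requireSections ?? []) {
    if (headings.indexOf(section.toLowerCase()) === -1) {
      problems.push(`missing section: ${section}`);
    }
  }
  return problems;
}

// Mirrors the documented convention: 2 for a failed gate, 0 otherwise.
function gateExitCode(problems: string[]): number {
  return problems.length > 0 ? 2 : 0;
}
```

Returning a list of problems rather than throwing keeps gate failures (exit code 2) separate from genuine runtime errors (exit code 1).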
- Markdown
- plain text
- JSON with stable key ordering
- JSONL / NDJSON with stable key ordering per line
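Stable key ordering means two JSON documents with the same content serialize identically regardless of the order in which keys were written. One common way to get that, sketched here as an assumption rather than as PromptDiff's own code, is to sort object keys recursively before serializing. The names `Json`, `sortKeys`, and `stableStringify` are hypothetical.

```typescript
// Hypothetical sketch: recursive key sorting for deterministic JSON output.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function sortKeys(value: Json): Json {
  if (Array.isArray(value)) {
    // Array order is meaningful, so only the elements are normalized.
    return value.map((item) => sortKeys(item));
  }
  if (value !== null && typeof value === "object") {
    const sorted: { [key: string]: Json } = {};
    for (const key of Object.keys(value).sort()) {
      const child = value[key];
      if (child !== undefined) sorted[key] = sortKeys(child);
    }
    return sorted;
  }
  return value;
}

function stableStringify(value: Json): string {
  return JSON.stringify(sortKeys(value));
}

// For JSONL / NDJSON, each record is normalized independently.
function stableJsonl(records: Json[]): string {
  return records.map((record) => stableStringify(record)).join("\n");
}
```

With this approach, `stableStringify({ b: 1, a: 2 })` and `stableStringify({ a: 2, b: 1 })` produce the same string, so a textual diff of two reports reflects content changes only.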
- Local-first by design.
- No telemetry.
- No external network calls.
- No hidden file writes; files are written only when --out is provided.
- Redaction is enabled by default for common API key, token, password, and credential patterns.
- Reports use deterministic timestamps (deterministic-local) to avoid noisy snapshots.
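As an illustration of default redaction, the sketch below masks values that follow common credential keywords, plus one key-like token shape. Both patterns, the `[REDACTED]` placeholder, and the function name `redact` are assumptions made for this example; the patterns PromptDiff actually uses may differ.

```typescript
// Hypothetical sketch: patterns and placeholder are illustrative assumptions.
function redact(text: string): string {
  return text
    // keyword followed by ":" or "=" and a value, e.g. "password: hunter2"
    .replace(/\b(api[_-]?key|token|password|secret)(\s*[:=]\s*)\S+/gi, "$1$2[REDACTED]")
    // bare tokens with an "sk-" prefix and at least 16 alphanumeric characters
    .replace(/\bsk-[A-Za-z0-9]{16,}\b/g, "[REDACTED]");
}
```

Keeping the keyword and separator while replacing only the value leaves the report readable: a reviewer can still see that a credential appeared in the prompt without seeing the credential itself.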
PromptDiff is a deterministic heuristic tool, not an LLM judge. It is deliberately conservative and review-friendly. It will miss subtle semantic changes, and it may flag harmless text that looks like risky instruction/tool/output-contract language.
Fixtures live in examples/:
npm run build
node dist/cli.js compare examples/prompts/v1.md examples/prompts/v2.md
node dist/cli.js check examples/prompts/safe.md --rules examples/rules.json
npm run smoke
npm install
npm run check
npm test
npm run build
npm run smoke
bash scripts/validate.sh
Issues and small PRs are welcome. Please keep changes deterministic, local-first, and covered by fixtures/tests. If you add a new risk classifier, add one example prompt pair and one test that proves the report stays stable.
MIT