Open
Labels
- "enhancement": New feature or request
- "impact: infrastructure": Adds new subsystem (config, MCP, state). Medium risk.
- "phase: 6-stateful-escalation": Phase 6: Escalation logic (highest internal risk)
- "risk: high": High risk: contract change, deadlock potential, or external deps
- "topic: platform-integration": MCP, multi-platform, CLAUDE.md framing
Description
Summary
The initial CLAUDE.md behavioral framing prompt has been added (commit b560871). This issue tracks improvements and iterations on the proactive prevention approach: shaping assistant behavior before hallucinations occur, rather than only catching them afterward.
Current State
CLAUDE.md in the plugin root is auto-loaded by Claude Code when the plugin is installed. It currently:
- Bans speculation language ("likely", "probably", "I think", etc.)
- Requires tool-verified claims
- Enforces evidence-based communication
- Prevents completeness overclaims
Planned Improvements
Framing Refinements
- Add examples of good vs bad responses for each category
- Include a decision tree: "Can I verify this? → Yes: verify → No: say 'I don't know'"
- Tune the wording to minimize false interventions on legitimate language
- Add escape hatch instructions for when speculation is explicitly requested by the user
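The decision tree and the escape hatch above could be sketched roughly as follows. This is a minimal illustration, not the plugin's implementation: the `Action` enum, the `decide` function, and both parameter names are hypothetical, and the choice to verify whenever verification is possible (even when speculation was requested) is an assumption.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical outcomes of the framing decision tree."""
    VERIFY = "verify the claim with a tool before stating it"
    ADMIT_UNKNOWN = "say 'I don't know'"
    SPECULATE_LABELED = "speculate, clearly labeled as speculation"


def decide(can_verify: bool, speculation_requested: bool = False) -> Action:
    """Walk the decision tree for a single claim.

    Assumption: verification wins whenever it is possible; the escape
    hatch only applies when the claim cannot be verified.
    """
    if can_verify:
        return Action.VERIFY
    if speculation_requested:
        # Escape hatch: the user explicitly asked for a best guess.
        return Action.SPECULATE_LABELED
    return Action.ADMIT_UNKNOWN
```

Writing the tree as a single function makes the ordering of the checks explicit, which may help when wording the corresponding CLAUDE.md instructions.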
Integration with Stop Hook
- Stop hook should reference CLAUDE.md rules in its output ("This violates the framing rule about...")
- Track repeat violations within a session and escalate the response: a gentle reminder on the first violation, a stronger warning on the second, and a block that requires verification on the third.
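One possible shape for the escalation logic is sketched below. It assumes violations are counted per session in memory; a real stop hook may run as a separate process on each invocation and would then need to persist the counts somewhere. `EscalationTracker`, `record_violation`, and the fields of the returned dictionary are hypothetical names, not part of any existing hook contract.

```python
from collections import defaultdict

# Escalation ladder from the issue: reminder, then warning, then block.
LEVELS = ("reminder", "warning", "block")


class EscalationTracker:
    """Counts framing-rule violations per session (in-memory sketch)."""

    def __init__(self) -> None:
        self._counts = defaultdict(int)

    def record_violation(self, session_id: str, rule: str) -> dict:
        """Register one violation and return the response to emit."""
        self._counts[session_id] += 1
        count = self._counts[session_id]
        # Third and later violations stay at the highest level.
        level = LEVELS[min(count, len(LEVELS)) - 1]
        return {
            "level": level,
            "count": count,
            "block": level == "block",
            # Reference the CLAUDE.md rule in the hook output.
            "message": f"This violates the framing rule about {rule}.",
        }
```

Keeping the counter keyed by session means a new session starts again at the gentle reminder, which matches the "within a session" wording above.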
Multi-Platform Framing
- Cursor: equivalent framing via .cursor-plugin/configuration
- Codex: framing guidance in .codex/INSTALL.md
- OpenCode: system prompt injection in .opencode/plugins/hallucination-detector.cjs
Metrics
- Count how many stop-hook triggers occur per session
- Compare trigger rates before/after CLAUDE.md is installed
- Identify which categories are most frequently violated (focus improvement there)
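The three metrics above could be computed along these lines. This is a sketch under stated assumptions: each stop-hook trigger is assumed to be logged as a `(session_id, category)` pair, and the function names, category labels, and sample data are all hypothetical and purely illustrative.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical log record: (session_id, violated_category).
Event = Tuple[str, str]


def triggers_per_session(events: List[Event]) -> Counter:
    """Count stop-hook triggers for each session."""
    return Counter(session_id for session_id, _ in events)


def mean_trigger_rate(events: List[Event], session_count: int) -> float:
    """Average triggers per session.

    session_count is passed separately so that sessions with zero
    triggers still count toward the denominator.
    """
    if session_count <= 0:
        raise ValueError("session_count must be positive")
    return len(events) / session_count


def most_violated(events: List[Event], top: int = 3) -> List[Tuple[str, int]]:
    """Return the most frequently violated categories, highest first."""
    return Counter(category for _, category in events).most_common(top)


# Made-up sample data for two sessions before and two after installing CLAUDE.md.
before = [
    ("s1", "speculation"), ("s1", "speculation"), ("s1", "unverified claim"),
    ("s2", "speculation"), ("s2", "completeness overclaim"),
    ("s2", "unverified claim"),
]
after = [("s3", "speculation"), ("s4", "unverified claim")]

rate_before = mean_trigger_rate(before, session_count=2)
rate_after = mean_trigger_rate(after, session_count=2)
```

Comparing `rate_before` with `rate_after` gives the before/after trigger-rate comparison, and `most_violated(before)` points at the categories where framing improvements would matter most.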
Related Issues
- feat: config file handling — cascading settings from multiple sources #10 (config — framing text should be customizable)