Labels

- enhancement (New feature or request)
- impact: external (Requires network, APIs, or new runtimes. High risk.)
- phase: 7-external-isolated (Phase 7: External deps, separate packages)
- risk: high (High risk — contract change, deadlock potential, or external deps)
- topic: verification (RAG, fact-checking, external verification)
Description
Summary
Add an optional second-pass verification tier that uses RAG (Retrieval-Augmented Generation) to fact-check claims against real-world sources, complementing the existing pattern-based linguistic analysis.
Motivation
The current detector catches linguistic red flags (speculation, pseudo-quantification, etc.) — patterns that suggest unreliability. But it cannot verify whether a factual claim is actually true. A RAG tier would catch factual errors that sound confident.
Example the current detector misses:
"React 19 was released in March 2024 and introduced Server Actions."
This contains no speculation language — it sounds authoritative. But if the date or feature attribution is wrong, only source verification can catch it.
Approach (inspired by agenticassets/exa-hallucination-detector)
Three-stage pipeline:
- Claim extraction — identify verifiable factual assertions from assistant output
- Source retrieval — search for corroborating/contradicting sources (Exa.ai, web search, or local docs)
- Verification — compare claims against retrieved sources, produce True/False/Insufficient verdict
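The three stages above could be wired together roughly as follows. This is an illustrative sketch only: the function names (`extractClaims`, `retrieveSources`, `verifyClaim`, `ragVerify`), the naive sentence-level claim heuristic, and the `backend.search` / `supports` shapes are all assumptions, not an existing API.

```javascript
// Hypothetical three-stage pipeline sketch; names and shapes are illustrative.

// Stage 1: claim extraction — pull declarative, checkable sentences.
// A real implementation would use an LLM or a proper claim-mining model;
// this keyword heuristic just shows the intended interface.
function extractClaims(text) {
  return text
    .split(/(?<=[.!?])\s+/)
    .filter((s) => /\b(was|is|released|introduced|supports?)\b/i.test(s))
    .map((sentence) => ({ sentence }));
}

// Stage 2: source retrieval — backend-specific (Exa.ai, web search, local docs).
// `backend.search` is an assumed adapter method, not a real client call.
async function retrieveSources(claim, backend) {
  return backend.search(claim.sentence);
}

// Stage 3: verification — compare the claim against retrieved sources and
// produce the True/False/Insufficient verdict described above.
function verifyClaim(claim, sources) {
  if (sources.length === 0) return { claim, verdict: 'Insufficient' };
  const supported = sources.some((s) => s.supports);
  return { claim, verdict: supported ? 'True' : 'False' };
}

async function ragVerify(text, backend, maxClaims = 5) {
  const claims = extractClaims(text).slice(0, maxClaims);
  const results = [];
  for (const claim of claims) {
    const sources = await retrieveSources(claim, backend);
    results.push(verifyClaim(claim, sources));
  }
  return results;
}
```

Running this over the React 19 example with any backend would surface a single claim and a verdict depending on what the sources say.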
Design Constraints
- Must be optional — zero-dependency pattern matching remains the default tier
- Must not slow down normal operation — RAG verification should be explicitly opted into or run as a separate pass
- Must support multiple backends — Exa.ai, WebSearch, local document search, custom API
- Must degrade gracefully — if no API key configured, skip RAG tier silently
Configuration
```js
// .hallucination-detectorrc.cjs
module.exports = {
  ragVerification: {
    enabled: false,
    backend: 'exa', // 'exa' | 'websearch' | 'local' | 'custom'
    apiKey: process.env.EXA_API_KEY,
    maxClaimsPerCheck: 5,
    confidenceThreshold: 70, // 0-100, below this → flag as unverified
  },
};
```
Acceptance Criteria
- Claim extraction works on assistant text output
- At least one backend (web search) works without paid API keys
- Results integrate with existing trigger match output format
- Zero impact on performance when disabled (default)
- Tests cover extraction, verification, and graceful degradation
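One way the "integrate with existing trigger match output" criterion could look: map each unverified or refuted claim onto the detector's trigger-match shape. The shape below (`trigger`, `severity`, `excerpt`, `detail`) is an assumption for illustration, since the actual output format is not shown in this issue:

```javascript
// Hypothetical adapter from a RAG verdict to a trigger-match-like object;
// the field names are assumed, not taken from the existing codebase.
function toTriggerMatch(result, confidenceThreshold = 70) {
  if (result.verdict === 'True' && result.confidence >= confidenceThreshold) {
    return null; // verified claims produce no trigger
  }
  return {
    trigger: 'rag-unverified-claim',
    severity: result.verdict === 'False' ? 'high' : 'medium',
    excerpt: result.claim,
    detail: `verdict=${result.verdict} confidence=${result.confidence ?? 0}`,
  };
}
```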
Related Issues
- feat: config file handling — cascading settings from multiple sources #10 (needed for RAG settings)
- feat: add short-termism / temporal bias detection #5 (source grounding overlaps with this feature)
References
- agenticassets/exa-hallucination-detector — RAG-based approach using Exa.ai + Claude