Open
Labels
- enhancement (New feature or request)
- impact: additive (Adds patterns to existing pipeline. Low risk.)
- phase: 1-additive-patterns (Phase 1: Zero-risk additive regex patterns)
- risk: low (Low risk, additive, backward-compatible)
- topic: detection-patterns (New regex-based detection categories)
- topic: source-credibility (Source/citation/attribution detection)
Description
Summary
Add regex-based detection of potentially fabricated URLs, DOIs, and citation markers in assistant output. LLMs frequently hallucinate academic citations, URLs, and DOIs.
Technique (from ObvioSpectre/hallucination-detector)
Three regex patterns that detect common fabrication signals:
// URLs in output
const URL_PATTERN = /https?:\/\/(?:[-\w.]|(?:%[\da-fA-F]{2}))+/g;
// DOI patterns (academic citations)
const DOI_PATTERN = /\b10\.\d{4,9}\/[-._;()/:A-Z0-9]+\b/gi;
// Citation markers like [1], [Smith, 2024]
const CITATION_PATTERN = /\[(?:\d+|[A-Za-z]+(?:, \d{4})?)\]/g;
Detection Logic
- If citations, DOIs, or URLs are present in assistant output → elevated risk (LLMs frequently fabricate these)
- If a URL contains example.com, placeholder, or ... → definitely fabricated (score 1.0)
- If DOIs are present but not verifiable → flag as ungrounded
- No citations → no flag (not all responses need citations)
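The rules above could be sketched roughly as follows. This is only an illustration, not the referenced project's implementation: `scoreFabricatedSources`, `PLACEHOLDER_HINTS`, and the 0.5 value used for "elevated risk" are assumptions (the issue only fixes the 1.0 score for placeholder URLs). Note that `URL_PATTERN` as written stops at the first `/` after the host, so the sketch reads on to the next whitespace in order to see placeholder text in a URL's path.

```javascript
// Patterns copied from the issue text.
const URL_PATTERN = /https?:\/\/(?:[-\w.]|(?:%[\da-fA-F]{2}))+/g;
const DOI_PATTERN = /\b10\.\d{4,9}\/[-._;()/:A-Z0-9]+\b/gi;
const CITATION_PATTERN = /\[(?:\d+|[A-Za-z]+(?:, \d{4})?)\]/g;

// Assumption: substrings that mark a URL as a known placeholder.
const PLACEHOLDER_HINTS = ["example.com", "placeholder", "..."];

// Assumption: 0.5 stands in for "elevated risk"; only 1.0 comes from the issue.
const ELEVATED_SCORE = 0.5;

// Hypothetical helper: returns the highest score triggered plus the reasons.
function scoreFabricatedSources(text) {
  const reasons = [];
  let score = 0;

  for (const m of text.matchAll(URL_PATTERN)) {
    // The pattern matches scheme and host only, so extend the match to the
    // next whitespace character to include the path.
    const fullUrl = text.slice(m.index).split(/\s/, 1)[0];
    const isPlaceholder = PLACEHOLDER_HINTS.some((hint) => fullUrl.includes(hint));
    score = Math.max(score, isPlaceholder ? 1.0 : ELEVATED_SCORE);
    reasons.push(`${isPlaceholder ? "placeholder URL" : "URL"}: ${fullUrl}`);
  }

  for (const m of text.matchAll(DOI_PATTERN)) {
    // No verification is attempted here, so every DOI counts as ungrounded.
    score = Math.max(score, ELEVATED_SCORE);
    reasons.push(`ungrounded DOI: ${m[0]}`);
  }

  for (const m of text.matchAll(CITATION_PATTERN)) {
    score = Math.max(score, ELEVATED_SCORE);
    reasons.push(`citation marker: ${m[0]}`);
  }

  // No URLs, DOIs, or citation markers: score stays 0 and nothing is flagged.
  return { score, reasons };
}
```

With these assumptions, text without any sources scores 0, a response citing `[1]` scores 0.5, and anything linking to `https://example.com/paper` scores 1.0. Taking the maximum rather than a sum keeps the score within [0, 1] regardless of how many markers appear.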
Why This Works
LLMs are notorious for inventing plausible-looking academic citations, DOIs, and URLs. This is zero-cost, zero-dependency detection that catches a specific and well-documented hallucination failure mode.
New Category
fabricated_source — a fifth detection category alongside speculation, causality, pseudo-quantification, and completeness.
Acceptance Criteria
- URL, DOI, and citation marker regex patterns implemented
- Known-fake URL detection (example.com, placeholder patterns)
- Suppression for URLs in code blocks and inline code
- Tests cover positive and negative cases
- Integrated into existing findTriggerMatches pipeline
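One possible way to meet the code-suppression criterion is to blank out code spans before running the URL pattern. The sketch below assumes Markdown-style output (triple-backtick fences and single-backtick inline code); `maskCode` and `findUrlsOutsideCode` are hypothetical names, and how the result would plug into findTriggerMatches is not shown because its signature is not given in this issue.

```javascript
// Assumption: code is delimited Markdown-style. The fence string is built
// with repeat() only to avoid writing a literal fence inside this snippet.
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + "[\\s\\S]*?" + FENCE, "g");
const INLINE_CODE = /`[^`\n]+`/g;

// URL pattern copied from the issue text.
const URL_PATTERN_FROM_ISSUE = /https?:\/\/(?:[-\w.]|(?:%[\da-fA-F]{2}))+/g;

// Replace code spans with spaces of equal length so that match offsets in
// the masked text still line up with the original text.
function maskCode(text) {
  const blank = (match) => " ".repeat(match.length);
  // Fenced blocks are masked first so their backticks are not misread as
  // inline code delimiters.
  return text.replace(FENCED_BLOCK, blank).replace(INLINE_CODE, blank);
}

// Hypothetical helper: URLs that appear outside code blocks and inline code.
function findUrlsOutsideCode(text) {
  return maskCode(text).match(URL_PATTERN_FROM_ISSUE) ?? [];
}
```

Masking with same-length whitespace, rather than deleting the code, is a design choice that keeps any reported match positions valid against the original output. An unterminated fence is left unmasked in this sketch, which the negative test cases would need to decide how to handle.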
References
- ObvioSpectre/hallucination-detector, detectors/source_check.py