feat: self-contradiction detection via negation polarity

## Summary

Detect self-contradictory statements within a single assistant response using negation polarity heuristics. When the assistant says "X is Y" in one sentence and "X is not Y" in another, flag it as a contradiction.

## Technique (from ObvioSpectre/hallucination-detector)

A lightweight approximation of natural language inference (NLI) that does not require an NLI model:

1. Split response into sentences
2. For each pair of sentences on the same topic:
   - Check whether exactly one of the two contains a negation word
   - If so, flag the pair as an internal contradiction
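The two steps above start from sentence splitting and pair enumeration. A minimal sketch, assuming the split rule from the acceptance criteria (`.`, `!`, or `?` followed by whitespace); `splitSentences` and `sentencePairs` are hypothetical names, not taken from the referenced project:

```javascript
// Hypothetical helper: split after '.', '!' or '?' when whitespace follows.
// The lookbehind keeps the punctuation attached to its sentence.
function splitSentences(text) {
  return text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Hypothetical helper: enumerate every unordered pair of sentences.
// This is O(n^2), which should be acceptable for a single response.
function sentencePairs(sentences) {
  const pairs = [];
  for (let i = 0; i < sentences.length; i++) {
    for (let j = i + 1; j < sentences.length; j++) {
      pairs.push([sentences[i], sentences[j]]);
    }
  }
  return pairs;
}
```

This naive rule will also split after abbreviations such as "e.g. ", so a real implementation may need an exception list.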

### Negation Words

```javascript
const NEGATION_WORDS = ['not', 'no', 'never', "didn't", "isn't", "wasn't", "aren't", "won't", "can't", "doesn't", 'none', 'neither', 'nor', 'unable', 'lacks', 'failed'];
```
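One way to apply this list is to match whole tokens rather than substrings, so that "notebook" or "nothing" does not count as "not". The sketch below is an assumption about how the check could look; `hasNegation` and `NEGATION_SET` are hypothetical names:

```javascript
// Same words as the list above, stored in a Set for O(1) lookup.
const NEGATION_SET = new Set([
  'not', 'no', 'never', "didn't", "isn't", "wasn't", "aren't", "won't",
  "can't", "doesn't", 'none', 'neither', 'nor', 'unable', 'lacks', 'failed',
]);

// Hypothetical helper: true if any whole token is a negation word.
function hasNegation(sentence) {
  const tokens = sentence
    .toLowerCase()
    .replace(/\u2019/g, "'") // normalize curly apostrophes
    .match(/[a-z']+/g) || [];
  return tokens
    .map((t) => t.replace(/^'+|'+$/g, '')) // drop quote marks around a word
    .some((t) => NEGATION_SET.has(t));
}
```

Token matching keeps contractions such as "isn't" intact, which a plain `\w+` tokenizer would break apart.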

### Detection Logic (Regex-Adaptable)

Without embeddings, approximate "same topic" by checking for shared noun phrases or subjects:

```javascript
// Pattern: "X is Y" ... "X is not Y"
// Pattern: "always X" ... "never X"  
// Pattern: "X works" ... "X doesn't work"
```
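The patterns above can be approximated without hand-written regexes by comparing the content words of two sentences. This is only a sketch under stated assumptions: the stopword list, the crude plural stripping, the Jaccard measure, the `0.5` threshold, and all function names are illustrative choices, not part of the referenced detector.

```javascript
// Assumption: a tiny illustrative stopword list; a real one would be larger.
const STOPWORDS = new Set([
  'the', 'a', 'an', 'is', 'are', 'was', 'were', 'it', 'this', 'that',
  'to', 'of', 'in', 'on', 'and', 'or', 'always',
]);
const NEGATION_WORDS = new Set([
  'not', 'no', 'never', "didn't", "isn't", "wasn't", "aren't", "won't",
  "can't", "doesn't", 'none', 'neither', 'nor', 'unable', 'lacks', 'failed',
]);

// Lowercase word tokens, keeping apostrophes inside contractions.
function tokenize(sentence) {
  const words = sentence.toLowerCase().replace(/\u2019/g, "'").match(/[a-z']+/g) || [];
  return words.map((w) => w.replace(/^'+|'+$/g, '')).filter((w) => w.length > 0);
}

// Content words: drop stopwords and negations, then strip a trailing "s"
// so that "works" and "work" compare equal (a very crude stemmer).
function contentWords(sentence) {
  const kept = tokenize(sentence).filter(
    (w) => !STOPWORDS.has(w) && !NEGATION_WORDS.has(w)
  );
  return new Set(kept.map((w) => (w.length > 3 && w.endsWith('s') ? w.slice(0, -1) : w)));
}

// Jaccard similarity between two sets of words.
function jaccard(a, b) {
  const shared = [...a].filter((w) => b.has(w)).length;
  const union = new Set([...a, ...b]).size;
  return union === 0 ? 0 : shared / union;
}

// Flag a pair when exactly one sentence is negated and the topics overlap.
function isPolarityFlip(a, b, threshold = 0.5) {
  const negA = tokenize(a).some((w) => NEGATION_WORDS.has(w));
  const negB = tokenize(b).some((w) => NEGATION_WORDS.has(w));
  if (negA === negB) return false;
  return jaccard(contentWords(a), contentWords(b)) >= threshold;
}
```

Under these assumptions, "The parser works." and "The parser doesn't work." are flagged, while "The cache is enabled." and "The logger is not configured." are not, because the two sentences share no content words. Double negatives and antonym pairs ("enabled" vs. "not disabled") are outside what this heuristic can judge.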

### Why This Works

Self-contradiction is a strong hallucination signal: it suggests the assistant is confabulating rather than reasoning from consistent evidence. This catches a failure mode that none of our current four categories detect.

## New Category

`internal_contradiction`, a new detection category.

## Acceptance Criteria

- [ ] Sentence splitting implemented (regex: split on `.!?` followed by whitespace)
- [ ] Negation polarity detection for sentence pairs
- [ ] Handles common negation patterns beyond simple "not"
- [ ] Low false positive rate — only flags when sentences are on the same topic
- [ ] Tests with clear contradiction examples and non-contradiction controls
- [ ] Suppression for quoted text and code blocks
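For the suppression criterion, one possible approach is to strip code and quoted spans before sentence splitting, so that negations inside them never reach the pair check. The sketch below assumes markdown-style input; `stripSuppressedSpans` is a hypothetical name, and single-quoted text is deliberately left alone because it is hard to tell apart from apostrophes.

```javascript
// Build the fence marker at runtime to avoid a literal fence in this snippet.
const FENCE = '`'.repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + '[\\s\\S]*?' + FENCE, 'g');

// Hypothetical pre-processing step: remove spans that should not be analyzed.
function stripSuppressedSpans(text) {
  return text
    .replace(FENCED_BLOCK, ' ')                  // fenced code blocks
    .replace(/`[^`\n]*`/g, ' ')                  // inline code
    .replace(/^>.*$/gm, ' ')                     // markdown blockquote lines
    .replace(/"[^"\n]*"/g, ' ')                  // straight double quotes
    .replace(/\u201C[^\u201D\n]*\u201D/g, ' ')   // curly double quotes
    .replace(/\s+/g, ' ')                        // collapse leftover whitespace
    .trim();
}
```

Running this before the sentence splitter means a quoted claim such as `"it is not set"` cannot be paired against the assistant's own statement that the flag is set.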

## References

- ObvioSpectre/hallucination-detector `detectors/consistency.py`
- Negation-polarity heuristic approximates NLI contradiction detection
