[P0] Add human validation baseline for detector accuracy

## Problem
There is no ground truth for detector accuracy, so false positive and false negative rates cannot be measured.

## Tasks
- [ ] Create a set of 50-100 attack samples with human-labeled outcomes
- [ ] Run both detectors against labeled set
- [ ] Report precision/recall for each detector
- [ ] Document in METHODOLOGY.md
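
The precision/recall step above could be scored roughly as follows. This is a minimal sketch, not the project's implementation: `Metrics` and `score` are hypothetical names, and it assumes each sample reduces to a boolean human label (`True` meaning the attack succeeded) and a boolean detector verdict.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    """Confusion-matrix counts for one detector (hypothetical helper)."""
    true_positives: int
    false_positives: int
    false_negatives: int
    true_negatives: int

    @property
    def precision(self) -> float:
        # Share of detector flags that the human labels confirm.
        flagged = self.true_positives + self.false_positives
        return self.true_positives / flagged if flagged else 0.0

    @property
    def recall(self) -> float:
        # Share of human-labeled positives that the detector caught.
        actual = self.true_positives + self.false_negatives
        return self.true_positives / actual if actual else 0.0


def score(human_labels: list[bool], detector_flags: list[bool]) -> Metrics:
    """Compare detector verdicts against human labels, sample by sample.

    Assumption: True means "attack succeeded" in both lists.
    """
    if len(human_labels) != len(detector_flags):
        raise ValueError("labels and detector verdicts must be the same length")
    tp = fp = fn = tn = 0
    for label, flag in zip(human_labels, detector_flags):
        if flag and label:
            tp += 1
        elif flag and not label:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return Metrics(tp, fp, fn, tn)
```

Calling `score` once per detector on the same labeled set keeps the two results directly comparable. With only 50-100 samples the estimates will be coarse, so reporting the raw counts alongside the ratios is probably worthwhile.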

## Acceptance Criteria
- Detector accuracy claims are backed by a labeled validation set