## Problem

There is no ground truth for detector accuracy, so false positive and false negative rates can't be measured.

## Tasks

- [ ] Create 50-100 attack samples with human-labeled outcomes
- [ ] Run both detectors against the labeled set
- [ ] Report precision/recall for each detector
- [ ] Document in METHODOLOGY.md

## Acceptance Criteria

- Detector accuracy claims are backed by a labeled validation set
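
For the precision/recall task, here is a minimal sketch of how the numbers could be computed once the labeled set exists. It assumes each sample reduces to a boolean human label (attack succeeded or not) and a boolean verdict per detector. The function name `precision_recall` and the list-of-booleans layout are hypothetical, not taken from the existing code.

```python
from typing import Optional, Sequence


def precision_recall(labels: Sequence[bool], predictions: Sequence[bool]) -> dict:
    """Compare detector verdicts against human labels.

    labels[i] is the human-labeled outcome for sample i (assumed format),
    predictions[i] is the detector's verdict for the same sample.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")

    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)          # true positives
    fp = sum(1 for y, p in pairs if not y and p)      # false positives
    fn = sum(1 for y, p in pairs if y and not p)      # false negatives
    tn = sum(1 for y, p in pairs if not y and not p)  # true negatives

    # Leave a metric undefined (None) rather than guessing when its
    # denominator is zero, e.g. a detector that never fires.
    precision: Optional[float] = tp / (tp + fp) if (tp + fp) else None
    recall: Optional[float] = tp / (tp + fn) if (tp + fn) else None

    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "precision": precision, "recall": recall,
    }


if __name__ == "__main__":
    # Toy data for illustration only, not real validation results.
    human_labels = [True, True, False, False]
    detector_verdicts = [True, False, True, False]
    print(precision_recall(human_labels, detector_verdicts))
```

Running the same function once per detector against the same labels keeps the two reports directly comparable. Including the raw counts alongside the ratios in METHODOLOGY.md would help, since precision and recall on 50-100 samples carry wide uncertainty.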