Summary
Implement dataset sampling strategies for large evaluation sets.
Tasks
- Add DatasetSource.sample(n, strategy) method.
- Support strategies: random, stratified (by metadata field), first-n.
- Ensure reproducibility with a seed parameter.
- Document sampling in dataset docs.
Acceptance criteria
- dataset.sample(100, strategy="stratified", field="difficulty") works.
- Same seed produces same sample.
- Original dataset unchanged.
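
A minimal sketch of how the tasks and criteria above could fit together, not the actual implementation. It makes several assumptions: that DatasetSource wraps an in-memory list of dict records, that `field` and `seed` are keyword arguments, that stratified sampling allocates slots to each group in proportion to its size, and that `n` larger than the dataset is clamped. The `_stratified` helper and the `records` attribute are hypothetical names used only for illustration.

```python
import random
from collections import defaultdict
from typing import Any, Optional


class DatasetSource:
    """Hypothetical stand-in: holds a list of dict records in memory."""

    def __init__(self, records: list[dict[str, Any]]) -> None:
        self.records = list(records)  # copy the list so callers cannot mutate ours

    def sample(
        self,
        n: int,
        strategy: str = "random",
        field: Optional[str] = None,
        seed: Optional[int] = None,
    ) -> "DatasetSource":
        """Return a new DatasetSource with up to n records; self is not modified."""
        if n < 0:
            raise ValueError("n must be non-negative")
        n = min(n, len(self.records))  # assumption: clamp rather than raise
        rng = random.Random(seed)  # local RNG, so global random state is untouched

        if strategy == "first-n":
            picked = self.records[:n]
        elif strategy == "random":
            picked = rng.sample(self.records, n)
        elif strategy == "stratified":
            if field is None:
                raise ValueError("stratified sampling requires a field")
            picked = self._stratified(n, field, rng)
        else:
            raise ValueError(f"unknown strategy: {strategy!r}")
        # Records are shared by reference (shallow); only the list is new.
        return DatasetSource(picked)

    def _stratified(
        self, n: int, field: str, rng: random.Random
    ) -> list[dict[str, Any]]:
        """Proportional allocation per group, largest remainders get the spare slots."""
        total = len(self.records)
        if total == 0:
            return []
        groups: dict[Any, list[dict[str, Any]]] = defaultdict(list)
        for record in self.records:
            groups[record.get(field)].append(record)

        counts: dict[Any, int] = {}
        remainders: dict[Any, int] = {}
        for key, rows in groups.items():
            # Integer arithmetic keeps the allocation exact.
            counts[key], remainders[key] = divmod(n * len(rows), total)
        leftover = n - sum(counts.values())
        by_remainder = sorted(groups, key=lambda k: remainders[k], reverse=True)
        for key in by_remainder[:leftover]:
            counts[key] += 1

        picked: list[dict[str, Any]] = []
        for key, rows in groups.items():
            picked.extend(rng.sample(rows, counts[key]))
        return picked


# Usage mirroring the acceptance criteria (the data here is made up).
dataset = DatasetSource(
    [{"id": i, "difficulty": "easy" if i % 2 else "hard"} for i in range(1000)]
)
subset = dataset.sample(100, strategy="stratified", field="difficulty", seed=42)
```

Using a dedicated `random.Random(seed)` instance rather than `random.seed()` keeps the sample reproducible without affecting other code that relies on the global generator. Returning a new object instead of filtering in place is one way to satisfy the "original dataset unchanged" criterion.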