Objective
Rename the dataset field to eval_set throughout the codebase and change its derivation to use the YAML name field instead of the old dataset field or filename fallback.
Motivation
- Align with agentv-bench, which already uses eval_set terminology
- dataset is generic and overloaded; eval_set is more precise
- Many eval files share the same filename (dataset.eval.yaml), making filename-derived dataset values meaningless; the name field is a better source
Design
- Rename: dataset → eval_set in types (EvalTest, EvaluationResult), Zod schema, orchestrator, CLI commands, OTel exporter, tests, and JSONL baselines
- Derivation change: eval_set reads from suite.name (already in schema) instead of suite.dataset, falling back to filename
- Clean break: Remove dataset from RawTestSuite, with no backward-compat shim
- CLI: --group-by dataset becomes --group-by eval-set
- Examples: Add name: field to example YAMLs that only had description:
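The derivation and grouping described above could look roughly like the sketch below. It is illustrative only: the interfaces are trimmed-down stand-ins for the real RawTestSuite and EvaluationResult types, deriveEvalSet and groupByEvalSet are hypothetical helper names, the score field is invented for the example, and the fallback rule (file base name up to the first dot) is an assumption about how the filename fallback behaves, not something this description specifies.

```typescript
// Hypothetical, trimmed-down shapes; the real types carry more fields.
interface RawTestSuite {
  name?: string; // YAML `name:` field (already in the schema)
  description?: string;
}

interface EvaluationResult {
  eval_set: string; // formerly `dataset`
  score: number; // assumed field, used only to make the example concrete
}

// Derive eval_set from suite.name; when name is missing or blank, fall back
// to the file's base name up to the first dot (assumed fallback rule).
function deriveEvalSet(suite: RawTestSuite, filePath: string): string {
  const name = suite.name?.trim();
  if (name) return name;
  const base = filePath.split(/[\\/]/).pop() ?? filePath;
  const stem = base.split(".")[0];
  return stem || base;
}

// Bucket results by eval_set, roughly what `--group-by eval-set` needs.
function groupByEvalSet(
  results: EvaluationResult[],
): Map<string, EvaluationResult[]> {
  const groups = new Map<string, EvaluationResult[]>();
  for (const result of results) {
    const bucket = groups.get(result.eval_set) ?? [];
    bucket.push(result);
    groups.set(result.eval_set, bucket);
  }
  return groups;
}
```

Under these assumptions, a suite with name: checkout-flows stored in evals/dataset.eval.yaml yields the eval_set checkout-flows, while a suite with no name in the same file falls back to dataset, which is exactly the uninformative value the name field is meant to replace.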
Acceptance Signals
- eval_set in results JSONL output reflects name from YAML (not filename)
- --group-by eval-set works
- No remaining uses of dataset as a field name in core/CLI code
Non-Goals
- Renaming filenames (e.g., dataset.eval.yaml stays; it's a file naming convention)
- Adding name: to every example YAML (just representative ones)
Related