Issue Description
The evaluation guide shows how to run evaluations with experiments, but doesn't demonstrate how to create an experiment via the API.
Current State
The documentation currently:
- Shows generating an experiment ID locally:
  experiment_id = f"email-eval-{datetime.now().strftime('%Y%m%d-%H%M%S')}"
- References using this ID in completion metadata
- Shows sending annotations for the experiment
However, it's missing the actual API call to create the experiment entity.
Proposed Improvement
The evaluation pipeline script should:
- Create a new experiment via the experiments API
- Add completions to this experiment (already shown)
- Send annotations for evaluation results (already shown)
This would provide a complete end-to-end example of programmatically running experiments.
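As a rough sketch of where the missing step would sit in the script, the snippet below creates the experiment before any completions are sent. The base URL, the `/experiments` path, the payload fields (`id`, `title`), and the helper names are all assumptions made for illustration, not the documented API; the real endpoint and schema need to be confirmed against the experiments API reference. The HTTP call is injected as a `send` function so the flow can be checked without network access.

```python
from datetime import datetime
from typing import Callable, Optional

# ASSUMPTION: placeholder base URL; the issue does not state the real one.
API_BASE = "https://api.example.com/v1"


def make_experiment_id(now: Optional[datetime] = None) -> str:
    """Generate the local ID the same way the guide currently does."""
    now = now or datetime.now()
    return f"email-eval-{now.strftime('%Y%m%d-%H%M%S')}"


def build_create_experiment_request(experiment_id: str, title: str) -> dict:
    """Describe the proposed 'create experiment' call.

    ASSUMPTION: the HTTP method, path, and body fields are hypothetical.
    """
    return {
        "method": "POST",
        "url": f"{API_BASE}/experiments",
        "body": {"id": experiment_id, "title": title},
    }


def run_pipeline(send: Callable[[dict], None],
                 now: Optional[datetime] = None) -> str:
    """Run the steps in the proposed order and return the experiment ID."""
    experiment_id = make_experiment_id(now)

    # Step 1 (currently missing from the guide): create the experiment entity.
    send(build_create_experiment_request(experiment_id, "Email evaluation"))

    # Step 2: add completions that reference experiment_id in their metadata
    #         (already shown in the guide).
    # Step 3: send annotations for the evaluation results
    #         (already shown in the guide).
    return experiment_id


if __name__ == "__main__":
    # Print the request instead of sending it, so nothing touches the network.
    run_pipeline(send=print)
```

In a real script, `send` would wrap an authenticated HTTP client; keeping it injectable also makes the ordering of the three steps straightforward to unit test.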
Affected Section
Lines 471-472 in the "Create the Evaluation Pipeline" section currently show only local ID generation without the API call.
-- Claude Code