Add openai_eval type to delegate evals to OpenAI APIs by krisztianfekete · Pull Request #73 · agentevals-dev/agentevals

krisztianfekete · 2026-03-30T10:14:32Z

This PR adds support for OpenAI's Evals API as an evaluator backend in agentevals. Instead of running grading logic locally, it delegates to OpenAI's hosted evaluation infrastructure. The first supported grader is TextSimilarityGrader (fuzzy_match, BLEU, ROUGE, cosine, etc.), but the design is set up so adding more grader types later is straightforward.

We introduce a new evaluator type, openai_eval.

You can configure it like this:

evaluators:
  - name: response_similarity
    type: openai_eval
    threshold: 0.8
    grader:
      type: text_similarity
      evaluation_metric: bleu
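
As a rough illustration of how an entry like this could be validated once the YAML has been loaded into a dict, here is a minimal sketch. The names `OpenAIEvalConfig` and `parse_openai_eval` are hypothetical and not taken from this PR, and the metric list reflects my understanding of what OpenAI's text similarity grader accepts, so treat it as an assumption to check against the API reference.

```python
from dataclasses import dataclass
from typing import Any

# Assumption: metric names believed to be accepted by OpenAI's
# text_similarity grader; verify against the current API reference.
TEXT_SIMILARITY_METRICS = frozenset({
    "fuzzy_match", "bleu", "gleu", "meteor", "cosine",
    "rouge_1", "rouge_2", "rouge_3", "rouge_4", "rouge_5", "rouge_l",
})


@dataclass(frozen=True)
class OpenAIEvalConfig:
    """Hypothetical in-memory form of one `openai_eval` evaluator entry."""
    name: str
    threshold: float
    grader_type: str
    evaluation_metric: str


def parse_openai_eval(entry: dict[str, Any]) -> OpenAIEvalConfig:
    """Validate one evaluator entry (already loaded from YAML into a dict)."""
    if entry.get("type") != "openai_eval":
        raise ValueError(f"not an openai_eval evaluator: {entry.get('type')!r}")
    grader = entry.get("grader") or {}
    # Only the text_similarity grader is supported in this sketch.
    if grader.get("type") != "text_similarity":
        raise ValueError(f"unsupported grader type: {grader.get('type')!r}")
    metric = grader.get("evaluation_metric")
    if metric not in TEXT_SIMILARITY_METRICS:
        raise ValueError(f"unknown evaluation_metric: {metric!r}")
    return OpenAIEvalConfig(
        name=entry["name"],
        threshold=float(entry["threshold"]),
        grader_type=grader["type"],
        evaluation_metric=metric,
    )
```

Rejecting unknown grader types and metrics up front would surface configuration mistakes before any request is sent to OpenAI.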

Under the hood, the backend runs a create-eval, create-run, poll, collect-results, cleanup cycle against the OpenAI API. Because the agent traces already exist and only need grading, the backend puts both the actual and expected text into the item namespace and sets include_sample_schema: False, so OpenAI never tries to generate model outputs.
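
The cycle described above can be sketched roughly as follows. This is not the code from this PR: `EvalsBackend` is a hypothetical thin wrapper around the OpenAI client, the item field names `actual` and `expected` are assumptions, and the payload shapes and run status strings follow my reading of the Evals API, so they should be checked against the API reference.

```python
import time
from typing import Any, Callable, Protocol


def build_eval_payload(name: str, metric: str, threshold: float) -> dict[str, Any]:
    """Eval definition: both texts live under `item`, with no sample schema."""
    return {
        "name": name,
        "data_source_config": {
            "type": "custom",
            "item_schema": {
                "type": "object",
                "properties": {
                    "actual": {"type": "string"},    # assumed field name
                    "expected": {"type": "string"},  # assumed field name
                },
                "required": ["actual", "expected"],
            },
            # No sample namespace, so no model output is generated.
            "include_sample_schema": False,
        },
        "testing_criteria": [{
            "type": "text_similarity",
            "name": name,
            "input": "{{item.actual}}",
            "reference": "{{item.expected}}",
            "evaluation_metric": metric,
            "pass_threshold": threshold,
        }],
    }


def build_run_data_source(pairs: list[tuple[str, str]]) -> dict[str, Any]:
    """Inline JSONL-style rows, one per (actual, expected) pair."""
    rows = [{"item": {"actual": a, "expected": e}} for a, e in pairs]
    return {"type": "jsonl", "source": {"type": "file_content", "content": rows}}


class EvalsBackend(Protocol):
    """Hypothetical wrapper interface; not part of the OpenAI SDK."""
    def create_eval(self, payload: dict[str, Any]) -> str: ...
    def create_run(self, eval_id: str, data_source: dict[str, Any]) -> str: ...
    def run_status(self, eval_id: str, run_id: str) -> str: ...
    def scores(self, eval_id: str, run_id: str) -> list[float]: ...
    def delete_eval(self, eval_id: str) -> None: ...


def grade(
    backend: EvalsBackend,
    name: str,
    metric: str,
    threshold: float,
    pairs: list[tuple[str, str]],
    poll_interval: float = 1.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[float]:
    """Create eval, create run, poll, collect results, and always clean up."""
    eval_id = backend.create_eval(build_eval_payload(name, metric, threshold))
    try:
        run_id = backend.create_run(eval_id, build_run_data_source(pairs))
        deadline = time.monotonic() + timeout
        while True:
            status = backend.run_status(eval_id, run_id)
            if status == "completed":
                break
            if status in ("failed", "canceled"):  # assumed terminal statuses
                raise RuntimeError(f"eval run ended with status {status!r}")
            if time.monotonic() >= deadline:
                raise TimeoutError("eval run did not finish in time")
            sleep(poll_interval)
        return backend.scores(eval_id, run_id)
    finally:
        # Cleanup runs on success and on failure alike.
        backend.delete_eval(eval_id)
```

Putting the cleanup in a `finally` block is one way to avoid leaving evals behind on the OpenAI side when a run fails or times out.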

krisztianfekete requested a review from peterj March 30, 2026 10:14

add openai_eval type to delegate eval to OpenAI APIs

3899891

krisztianfekete force-pushed the feature/add-openai-eval-type branch from a74e734 to 3899891 Compare March 30, 2026 10:16

krisztianfekete requested a review from EItanya March 30, 2026 13:07

krisztianfekete changed the title ~~Add openai_eval type to delegate eval to OpenAI APIs~~ Add openai_eval type to delegate evals to OpenAI APIs Mar 30, 2026

EItanya approved these changes Mar 30, 2026

EItanya merged commit d7ef558 into main Mar 30, 2026
4 checks passed

This was referenced Mar 30, 2026

document OpenAI Graders #74

Merged

Add StringCheckGrader OpenAI Grader #95

Open

Add ScoreModelGrader OpenAI Grader #96

Open

Add LabelModelGrader OpenAI Grader #97

Open
