llama squad #5

Open
jeff3071 wants to merge 5 commits into open-evals:main from jeff3071:llama-evaluate-squad

@jeff3071 jeff3071 commented Mar 20, 2023

Eval details 📑

Eval name

squad

Eval description

We evaluate Llama on 100 examples from the SQuAD dataset using the Open-evals framework, which extends OpenAI's Evals to other language models. We treat the sentence immediately following the prompt as Llama's output and use include accuracy as the metric to measure its performance.
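
A minimal sketch of the output-extraction step described above, not the code from this PR. It assumes the model echoes the prompt at the start of its generation, and that a "sentence" ends at the first line break or at `.`, `!` or `?` followed by whitespace. The function name `first_sentence_after_prompt` is hypothetical.

```python
import re


def first_sentence_after_prompt(prompt: str, generated: str) -> str:
    """Return the first sentence of `generated` that follows `prompt`.

    Hypothetical helper: the sentence boundary rule here is an assumption,
    not something the PR description specifies.
    """
    # Assumption: the raw generation starts with the prompt; drop it if so.
    if generated.startswith(prompt):
        continuation = generated[len(prompt):]
    else:
        continuation = generated
    stripped = continuation.strip()
    if not stripped:
        return ""
    # Keep only the first line, then cut at the first sentence-ending mark.
    first_line = stripped.splitlines()[0]
    return re.split(r"(?<=[.!?])\s+", first_line, maxsplit=1)[0].strip()
```

Anything the model generates after that first sentence (for example, an invented follow-up question) is discarded before scoring.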

For a model completion `a` and a reference list of correct answers `B`:
include: `any([(a in b) for b in B])`
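
The expression above can be sketched as a small scoring function, with accuracy taken as the fraction of examples for which it returns `True`. This is an illustration under stated assumptions, not the framework's implementation: the containment direction (`a in b`) is copied literally from the description, and the names `include` and `include_accuracy` are hypothetical.

```python
from typing import Sequence


def include(completion: str, references: Sequence[str]) -> bool:
    """True if the completion is a substring of any reference answer.

    Direction taken literally from the description: `a in b`.
    Note that an empty completion is a substring of every string.
    """
    return any(completion in ref for ref in references)


def include_accuracy(
    completions: Sequence[str],
    reference_lists: Sequence[Sequence[str]],
) -> float:
    """Fraction of examples whose completion passes `include`."""
    results = [include(a, B) for a, B in zip(completions, reference_lists)]
    # Guard against an empty evaluation set.
    return sum(results) / len(results) if results else 0.0


if __name__ == "__main__":
    completions = ["Denver Broncos", "1066"]
    references = [["Denver Broncos", "The Denver Broncos"], ["in 1067"]]
    print(include_accuracy(completions, references))
```

Swapping the operands (`ref in completion`) would instead test whether a reference answer appears inside the completion, so the direction matters when completions are longer than the references.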

| model | squad(100) |
| --- | --- |
| llama | 0.63 |
| gpt-3.5-turbo | 0.9 |
| text-davinci-003 | 0.87 |
| text-davinci-002 | 0.66 |
| text-davinci-001 | 0.58 |
| ada | 0.35 |

@jeff3071 jeff3071 marked this pull request as ready for review March 23, 2023 02:44