[evaluate] support gpt evaluation #3807
chengeharrison merged 1 commit into hpcaitech:dev/evaluation
Conversation
bert-score
rouge_chinese
scikit-metrics
nltk
Requirements for evaluation should be included in Chat/evaluate/requirements.txt instead of Chat/requirements.txt.
Is evaluation a part of the Chat? If we put these in Chat/evaluate/requirements.txt, won't we need to install them separately from the Chat whenever we want to use the evaluate feature?
Chat/examples and Chat/inference have their own requirements.txt.
TongLi3701
left a comment
Thanks Yuanchen.
Overall looks fine. Just have some questions.
- Are we going to add a sample config file later?
- Do we support GPT-4? I assume we only need to add an argument controlling whether GPT-4 or GPT-3.5 is used, so the function could be renamed accordingly?
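A hypothetical sketch of what such an argument could look like: the evaluation call takes a `model` parameter instead of hard-coding GPT-3.5 into the function name. The function and parameter names here are illustrative only, not the PR's actual API:

```python
def build_evaluation_request(answers, prompt, metrics, category, model="gpt-3.5-turbo"):
    """Assemble a chat-completion request for evaluating one category.

    Switching `model` to "gpt-4" requires no rename of the function itself.
    """
    system_msg = prompt.format(metrics=", ".join(metrics))
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_msg},
            {"role": "user", "content": f"[{category}]\n" + "\n".join(answers)},
        ],
    }

# Example: the same helper serves both GPT-3.5 and GPT-4.
request = build_evaluation_request(
    ["answer A", "answer B"],
    "Score the answers on: {metrics}.",
    ["relevance", "fluency"],
    "general",
    model="gpt-4",
)
```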
jieba
bert-score
rouge_chinese
scikit-metrics
nltk
openai
seaborn
pandas
matplotlib
numpy
We do not have an additional setup.py for evaluation, so adding a requirements.txt here cannot install all these packages.
All of these should be moved back to applications/Chat/requirements.txt, since setup.py fetches the requirements file from there.
But Chat/examples and Chat/inference have their own requirements.txt.
I assume what we did before was wrong? If you check the setup.py file, it only fetches the requirements.txt in the same directory.
Any comments, or is my understanding wrong? @ver217
requirements.txt is essential to use coati. examples/requirements.txt holds extra requirements for running the examples, which I think is OK. As for inference, installing GPTQ is very difficult, and users must follow its README and install it manually.
As for evaluation, do you think it is a part of coati? If so, move the evaluation directory into coati/ and add its requirements to requirements.txt. Otherwise, keep them apart.
coati is regarded as a library, so if evaluation is a part of coati, you'd better provide a CLI.
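If evaluation does become part of coati, a minimal CLI entry point could look like the sketch below. The flag names and the `coati-evaluate` program name are assumptions for illustration, not the PR's actual interface:

```python
import argparse

def build_parser():
    # Hypothetical CLI for the evaluation module; flag names are illustrative.
    parser = argparse.ArgumentParser(
        prog="coati-evaluate",
        description="Run GPT-based evaluation on model answers.",
    )
    parser.add_argument("--config", required=True, help="Path to the evaluation config file.")
    parser.add_argument("--answers", required=True, help="Path to the answers JSON file.")
    parser.add_argument("--model", default="gpt-3.5-turbo",
                        help="OpenAI model used as the evaluator.")
    return parser

# Example invocation with explicit arguments.
args = build_parser().parse_args(["--config", "config.json", "--answers", "answers.json"])
```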
Ok, sure.
We can keep it separate, but make sure the README gets a setup section telling users to run pip install -r requirements.txt.
```python
# gpt35 evaluation
for category in self.params:
    category_metrics = self.params[category]["GPT-3.5"]

    prompt = self.gpt_evaluation_prompt.get(category, None)
    if prompt is None:
        print(f"No prompt for category {category}! Use prompt for category general now.")
        prompt = self.gpt_evaluation_prompt["general"]

    self.gpt35_evaluation_results[category] = gpt_evaluate.gpt35_evaluate(
        answers_per_category[category], prompt, category_metrics, category
    )
```
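The prompt-fallback step in that loop can be sketched in isolation as follows; the dictionary contents here are made up for illustration:

```python
# Hypothetical prompt table; only "general" is guaranteed to exist.
gpt_evaluation_prompt = {
    "general": "Rate the answer for overall quality.",
    "summarization": "Rate the summary for faithfulness and brevity.",
}

def select_prompt(category):
    # Fall back to the "general" prompt when a category has no dedicated one.
    prompt = gpt_evaluation_prompt.get(category)
    if prompt is None:
        print(f"No prompt for category {category}! Use prompt for category general now.")
        prompt = gpt_evaluation_prompt["general"]
    return prompt
```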
Do we support GPT-4? It seems that all functions are designed for GPT-3.5?
For evaluation, we currently only support GPT-3.5. Evaluation using GPT-4 requires sampling, which is costly, and we can't test it because we don't have access to GPT-4 at the moment.
Sure. We can add it to the TODO list.
📌 Checklist before creating the PR
[doc/gemini/tensor/...]: A concise description

🚨 Issue number
#3714
Some legacy code was not formatted using pre-commit.
📝 What does this PR do?
Add support for gpt evaluation.
💥 Checklist before requesting a review