
Most #189

56 commits
750ec01
Merge pull request #169 from hpcaitech/main
jamesthesnake Sep 17, 2023
786d11e
Merge pull request #183 from hpcaitech/main
jamesthesnake Oct 15, 2023
0789a8d
Merge pull request #184 from jamesthesnake/co
jamesthesnake Oct 15, 2023
c6cd629
[Inference]ADD Bench Chatglm2 script (#4963)
CjhHa1 Oct 24, 2023
a000b61
Merge pull request #188 from hpcaitech/main
jamesthesnake Oct 25, 2023
1db6727
[Pipeline inference] Combine kvcache with pipeline inference (#4938)
FoolPlayer Oct 27, 2023
4e4a10c
updated c++17 compiler flags (#4983)
kurisusnowdeng Oct 27, 2023
cf579ff
[Inference] Dynamic Batching Inference, online and offline (#4953)
CjhHa1 Oct 30, 2023
459a88c
[Kernels]Updated Triton kernels into 2.1.0 and adding flash-decoding …
tiandiao123 Oct 30, 2023
abe071b
fix ColossalEval (#4992)
chengeharrison Oct 31, 2023
4f0234f
[doc]Update doc for colossal-inference (#4989)
tiandiao123 Oct 31, 2023
be82b5d
[hotfix] Fix the bug where process groups were not being properly rel…
littsk Oct 31, 2023
c040d70
[hotfix] fix the bug of repeatedly storing param group (#4951)
Oct 31, 2023
335cb10
[doc] add supported feature diagram for hybrid parallel plugin (#4996)
ppt0011 Oct 31, 2023
b6696be
[Pipeline Inference] Merge pp with tp (#4993)
FoolPlayer Nov 1, 2023
8993c8a
[release] update version (#4995)
ver217 Nov 1, 2023
dc003c3
[moe] merge moe into main (#4978)
oahzxl Nov 2, 2023
d99b2c9
[hotfix] fix grad accumulation plus clipping for gemini (#5002)
Nov 2, 2023
1a3315e
[hotfix] Add layer norm gradients all-reduce for sequence parallel (#…
littsk Nov 3, 2023
c36e782
[format] applied code formatting on changed files in pull request 492…
github-actions[bot] Nov 6, 2023
ef4c14a
[Inference] Fix bug in ChatGLM2 Tensor Parallelism (#5014)
CjhHa1 Nov 7, 2023
67f5331
[misc] add code owners (#5024)
ver217 Nov 8, 2023
f71e63b
[moe] support optimizer checkpoint (#5015)
oahzxl Nov 8, 2023
239cd92
Support mtbench (#5025)
chengeharrison Nov 9, 2023
7244412
[moe]: fix ep/tp tests, add hierarchical all2all (#4982)
cwher Nov 9, 2023
a448938
[shardformer] Fix serialization error with Tensor Parallel state savi…
imgaojun Nov 9, 2023
576a2f7
[gemini] gemini support tensor parallelism. (#4942)
flybird11111 Nov 10, 2023
70885d7
[hotfix] Suport extra_kwargs in ShardConfig (#5031)
KKZ20 Nov 10, 2023
43ad0d9
fix wrong EOS token in ColossalChat
Orion-Zheng Nov 14, 2023
28052a7
[Kernels]Update triton kernels into 2.1.0 (#5046)
tiandiao123 Nov 16, 2023
b2ad0d9
[pipeline,shardformer] Fix p2p efficiency in pipeline, allow skipping…
zeyugao Nov 16, 2023
3e02154
[gemini] gemini support extra-dp (#5043)
flybird11111 Nov 16, 2023
97cd0cd
[shardformer] fix llama error when transformers upgraded. (#5055)
flybird11111 Nov 16, 2023
3c08f17
[hotfix]: modify create_ep_hierarchical_group and add test (#5032)
cwher Nov 17, 2023
bc09b95
[exampe] fix llama example' loss error when using gemini plugin (#5060)
flybird11111 Nov 18, 2023
fd6482a
[inference] Refactor inference architecture (#5057)
Xu-Kai Nov 19, 2023
bce9197
[Kernels]added flash-decoidng of triton (#5063)
tiandiao123 Nov 20, 2023
8d56c9c
[misc] remove outdated submodule (#5070)
ver217 Nov 20, 2023
e5ce4c8
[npu] add npu support for gemini and zero (#5067)
ver217 Nov 20, 2023
0c7d8be
[hotfix/hybridengine] fix bug when tp*pp size = 1 (#5069)
FoolPlayer Nov 20, 2023
fb103cf
[inference] update examples and engine (#5073)
Xu-Kai Nov 20, 2023
8921a73
[format] applied code formatting on changed files in pull request 506…
github-actions[bot] Nov 20, 2023
4e3959d
[hotfix/hybridengine] Fix init model with random parameters in benchm…
FoolPlayer Nov 20, 2023
1cd7efc
[inference] refactor examples and fix schedule (#5077)
ver217 Nov 21, 2023
dce05da
fix thrust-transform-reduce error (#5078)
imgaojun Nov 21, 2023
fd3567e
[nfc] fix typo in docs/ (#4972)
digger-yu Nov 21, 2023
0d48230
[nfc] fix typo and author name (#5089)
digger-yu Nov 22, 2023
4ccb9de
[gemini]fix gemini optimzer, saving Shardformer in Gemini got list as…
flybird11111 Nov 22, 2023
75af66c
[Hotfix] Fix model policy matching strategy in ShardFormer (#5064)
KKZ20 Nov 22, 2023
aae4966
[shardformer]fix flash attention, when mask is casual, just don't unp…
flybird11111 Nov 22, 2023
3acbf6d
[npu] add npu support for hybrid plugin and llama (#5090)
oahzxl Nov 22, 2023
e53e729
[Feature] Add document retrieval QA (#5020)
YeAnbang Nov 23, 2023
68fcaa2
remove duplicate import (#5100)
oahzxl Nov 23, 2023
2bdf76f
fix typo change lazy_iniy to lazy_init (#5099)
digger-yu Nov 24, 2023
35722c5
Merge pull request #192 from hpcaitech/main
jamesthesnake Nov 26, 2023
62ebc0a
Merge pull request #193 from jamesthesnake/ra
jamesthesnake Nov 26, 2023
1 change: 1 addition & 0 deletions .github/CODEOWNERS
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
* @hpcaitech/colossalai-qa
4 changes: 3 additions & 1 deletion .github/workflows/release_test_pypi_before_merge.yml
@@ -27,7 +27,9 @@ jobs:
echo $new_version > ./version.txt
echo "version=$new_version" >> $GITHUB_OUTPUT

- run: python setup.py sdist build
- run: |
pip install --upgrade pip
python setup.py sdist build

# publish to PyPI if executed on the main branch
- name: Publish package to PyPI
54 changes: 54 additions & 0 deletions .github/workflows/run_colossalqa_unit_tests.yml
@@ -0,0 +1,54 @@
name: Run colossalqa unit tests

on:
pull_request:
types: [synchronize, opened, reopened]
paths:
- 'applications/ColossalQA/colossalqa/**'
- 'applications/ColossalQA/requirements.txt'
- 'applications/ColossalQA/setup.py'
- 'applications/ColossalQA/tests/**'
- 'applications/ColossalQA/pytest.ini'

jobs:
tests:
name: Run colossalqa unit tests
if: |
github.event.pull_request.draft == false &&
github.base_ref == 'main' &&
github.event.pull_request.base.repo.full_name == 'hpcaitech/ColossalAI'
runs-on: [self-hosted, gpu]
container:
image: hpcaitech/pytorch-cuda:1.12.0-11.3.0
volumes:
- /data/scratch/test_data_colossalqa:/data/scratch/test_data_colossalqa
- /data/scratch/llama-tiny:/data/scratch/llama-tiny
options: --gpus all --rm
timeout-minutes: 30
defaults:
run:
shell: bash
steps:
- name: Checkout ColossalAI
uses: actions/checkout@v2

- name: Install colossalqa
run: |
cd applications/ColossalQA
pip install -e .

- name: Execute Unit Testing
run: |
cd applications/ColossalQA
pytest tests/
env:
NCCL_SHM_DISABLE: 1
MAX_JOBS: 8
ZH_MODEL_PATH: bigscience/bloom-560m
ZH_MODEL_NAME: bloom
EN_MODEL_PATH: bigscience/bloom-560m
EN_MODEL_NAME: bloom
TEST_DATA_PATH_EN: /data/scratch/test_data_colossalqa/companies.txt
TEST_DATA_PATH_ZH: /data/scratch/test_data_colossalqa/companies_zh.txt
TEST_DOCUMENT_LOADER_DATA_PATH: /data/scratch/test_data_colossalqa/tests/*
SQL_FILE_PATH: /data/scratch/test_data_colossalqa/sql_file_path
4 changes: 0 additions & 4 deletions .gitmodules
@@ -1,7 +1,3 @@
[submodule "inference"]
path = inference
url = https://github.com/hpcaitech/EnergonAI.git
branch = main
[submodule "examples/tutorial/fastfold/FastFold"]
path = examples/tutorial/fastfold/FastFold
url = https://github.com/hpcaitech/FastFold
75 changes: 75 additions & 0 deletions LICENSE
@@ -477,3 +477,78 @@ Copyright 2021- HPC-AI Technology Inc. All rights reserved.
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


---------------- LICENSE FOR torch-int ----------------

MIT License

Copyright (c) 2022 Guangxuan Xiao

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


---------------- LICENSE FOR smoothquant ----------------

MIT License

Copyright (c) 2022 MIT HAN Lab

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


---------------- LICENSE FOR LangChain TEAM ----------------

The MIT License

Copyright (c) Harrison Chase

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
3 changes: 2 additions & 1 deletion README.md
@@ -132,7 +132,8 @@ distributed training and inference in a few lines.
- One half-day of training using a few hundred dollars yields similar results to mainstream large models, open-source and commercial-free domain-specific LLM solution.
[[code]](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Colossal-LLaMA-2)
[[blog]](https://www.hpc-ai.tech/blog/one-half-day-of-training-using-a-few-hundred-dollars-yields-similar-results-to-mainstream-large-models-open-source-and-commercial-free-domain-specific-llm-solution)
[[model weights]](https://huggingface.co/hpcai-tech/Colossal-LLaMA-2-7b-base)
[[HuggingFace model weights]](https://huggingface.co/hpcai-tech/Colossal-LLaMA-2-7b-base)
[[Modelscope model weights]](https://www.modelscope.cn/models/colossalai/Colossal-LLaMA-2-7b-base/summary)

| | Backbone | Tokens Consumed | | MMLU | CMMLU | AGIEval | GAOKAO | CEval |
| :----------------------------: | :--------: | :-------------: | :------------------: | :-----------: | :-----: | :----: | :----: | :------------------------------: |
@@ -118,7 +118,7 @@ def main(args):
tokenizer.pad_token = tokenizer.eos_token
elif args.model == "llama":
tokenizer = LlamaTokenizer.from_pretrained(args.pretrain)
tokenizer.eos_token = "<\s>"
tokenizer.eos_token = "</s>"
tokenizer.pad_token = tokenizer.unk_token
else:
raise ValueError(f'Unsupported model "{args.model}"')
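
The one-character fix above (repeated across several example scripts) replaces the malformed EOS string `"<\s>"` with `"</s>"`, the end-of-sequence token the LLaMA sentencepiece tokenizer actually uses. A minimal standalone check, plain Python with no tokenizer download, shows why the original literal was wrong:

```python
# "\s" is not a recognized Python escape, so "<\s>" keeps the backslash
# verbatim: it is the four characters <, \, s, > -- not the EOS token "</s>".
malformed = "<\s>"
correct = "</s>"

assert malformed == "<\\s>"      # backslash survives in the string
assert malformed != correct      # never matches the real EOS token

# A tokenizer configured with the malformed string would map it to an
# unknown token rather than the true EOS id, so generation would not
# terminate where expected.
print(repr(malformed), repr(correct))
```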
@@ -68,7 +68,7 @@ def train(args):
padding_side="right",
use_fast=False,
)
tokenizer.eos_token = "<\s>"
tokenizer.eos_token = "</s>"
tokenizer.pad_token = tokenizer.unk_token
else:
raise ValueError(f'Unsupported model "{args.model}"')
2 changes: 1 addition & 1 deletion applications/Chat/examples/inference.py
@@ -39,7 +39,7 @@ def eval(args):
tokenizer.pad_token = tokenizer.eos_token
elif args.model == "llama":
tokenizer = LlamaTokenizer.from_pretrained("hf-internal-testing/llama-tokenizer")
tokenizer.eos_token = "<\s>"
tokenizer.eos_token = "</s>"
tokenizer.pad_token = tokenizer.unk_token
else:
raise ValueError(f'Unsupported model "{args.model}"')
2 changes: 1 addition & 1 deletion applications/Chat/examples/train_prompts.py
@@ -125,7 +125,7 @@ def main(args):
tokenizer = LlamaTokenizer.from_pretrained(
"hf-internal-testing/llama-tokenizer" if args.tokenizer is None else args.tokenizer
)
tokenizer.eos_token = "<\s>"
tokenizer.eos_token = "</s>"
tokenizer.pad_token = tokenizer.unk_token
else:
raise ValueError(f'Unsupported model "{args.model}"')
2 changes: 1 addition & 1 deletion applications/Chat/examples/train_reward_model.py
@@ -72,7 +72,7 @@ def train(args):
tokenizer = LlamaTokenizer.from_pretrained(
"hf-internal-testing/llama-tokenizer" if args.tokenizer is None else args.tokenizer
)
tokenizer.eos_token = "<\s>"
tokenizer.eos_token = "</s>"
tokenizer.pad_token = tokenizer.unk_token
else:
raise ValueError(f'Unsupported model "{args.model}"')
2 changes: 1 addition & 1 deletion applications/Chat/examples/train_sft.py
@@ -75,7 +75,7 @@ def train(args):
tokenizer = LlamaTokenizer.from_pretrained(
"hf-internal-testing/llama-tokenizer" if args.tokenizer is None else args.tokenizer
)
tokenizer.eos_token = "<\s>"
tokenizer.eos_token = "</s>"
tokenizer.pad_token = tokenizer.unk_token
elif args.model == "chatglm":
tokenizer = ChatGLMTokenizer.from_pretrained(
22 changes: 20 additions & 2 deletions applications/Colossal-LLaMA-2/README.md
@@ -25,7 +25,9 @@
* [2023/09] [One Half-Day of Training Using a Few Hundred Dollars Yields Similar Results to Mainstream Large Models, Open-Source and Commercial-Free Domain-Specific Llm Solution](https://www.hpc-ai.tech/blog/one-half-day-of-training-using-a-few-hundred-dollars-yields-similar-results-to-mainstream-large-models-open-source-and-commercial-free-domain-specific-llm-solution)
[[code]](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Colossal-LLaMA-2)
[[blog]](https://www.hpc-ai.tech/blog/one-half-day-of-training-using-a-few-hundred-dollars-yields-similar-results-to-mainstream-large-models-open-source-and-commercial-free-domain-specific-llm-solution)
[[model weights]](https://huggingface.co/hpcai-tech/Colossal-LLaMA-2-7b-base)
[[HuggingFace model weights]](https://huggingface.co/hpcai-tech/Colossal-LLaMA-2-7b-base)
[[Modelscope model weights]](https://www.modelscope.cn/models/colossalai/Colossal-LLaMA-2-7b-base/summary)


## Colossal-LLaMA-2-7B
The [Colossal-AI](https://github.com/hpcaitech/ColossalAI) team has introduced the open-source model **Colossal-LLaMA-2-7B-base**. This model, a derivation of LLaMA-2, has undergone continual pre-training involving approximately 8.5 billion tokens over a duration of 15 hours with 64 A800 GPUs. At a cost of **less than $1,000**, you can achieve results **similar to those that cost millions of dollars to pretrain from scratch**. It is licensed under the LLaMA-2 license and [Apache 2.0 License](https://github.com/hpcaitech/ColossalAI/blob/main/LICENSE) **without any additional commercial use restrictions**. This solution can also be used to build models of specific domain knowledge or tasks.
@@ -122,7 +124,23 @@ pred = model.generate(**inputs,
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True)[len(input):])
```

You can also download model weights from [🤗HuggingFace](https://huggingface.co/hpcai-tech/Colossal-LLaMA-2-7b-base).
You can also load our model from ModelScope with the following code:
```Python
from modelscope import AutoModelForCausalLM, AutoTokenizer, snapshot_download
model_dir = snapshot_download('colossalai/Colossal-LLaMA-2-7b-base', revision='v1.0.1')
tokenizer = AutoTokenizer.from_pretrained(model_dir, device_map="auto", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto", trust_remote_code=True).eval()
generation_kwargs = {"max_new_tokens": 256,
"top_p": 0.95,
"temperature": 0.3
}
input = '离离原上草,'
inputs = tokenizer(input, return_token_type_ids=False, return_tensors='pt')
inputs = inputs.to('cuda:0')
output = model.generate(**inputs, **generation_kwargs)
print(tokenizer.decode(output.cpu()[0], skip_special_tokens=True)[len(input):])
```
You can download model weights from [🤗HuggingFace](https://huggingface.co/hpcai-tech/Colossal-LLaMA-2-7b-base) or [👾Modelscope](https://modelscope.cn/models/colossalai/Colossal-LLaMA-2-7b-base/summary).

## Usage
### Install
@@ -6,25 +6,20 @@

import torch
import torch.nn.functional as F
from einops import rearrange
from flash_attn.bert_padding import pad_input, unpad_input
from flash_attn.flash_attn_interface import flash_attn_func, flash_attn_varlen_kvpacked_func
from flash_attn.ops.rms_norm import rms_norm
from transformers.models.llama.modeling_llama import (
LlamaRMSNorm,
LlamaAttention,
LlamaModel,
LlamaForCausalLM,
LlamaModel,
LlamaRMSNorm,
apply_rotary_pos_emb,
repeat_kv,
)

from colossalai.logging import get_dist_logger
from einops import rearrange

from flash_attn.bert_padding import pad_input, unpad_input
from flash_attn.flash_attn_interface import (
flash_attn_func,
flash_attn_varlen_kvpacked_func,
)
from flash_attn.ops.rms_norm import rms_norm


logger = get_dist_logger()

@@ -65,6 +60,7 @@ def attention_forward(
past_key_value: Optional[Tuple[torch.Tensor]] = None,
output_attentions: bool = False,
use_cache: bool = False,
**kwargs,
) -> Tuple[torch.Tensor, Optional[torch.Tensor], Optional[Tuple[torch.Tensor]]]:
"""
Re-define LLaMA-2 `LlamaAttention` forward method using flash-attention.
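
The `**kwargs` parameter added to `attention_forward` above is what keeps a monkey-patched forward method working when newer `transformers` releases pass extra keyword arguments to `LlamaAttention.forward`. A toy sketch of the pattern (illustrative names, not the actual ColossalAI patch):

```python
# Sketch: a replacement forward that tolerates arguments introduced by
# future callers. Without **kwargs, an unexpected keyword (e.g. a
# padding_mask added in a later transformers version) raises TypeError.
def patched_forward(hidden_states, attention_mask=None, use_cache=False, **kwargs):
    # Unknown keyword arguments are accepted and ignored here, mirroring
    # the forward-compatibility the diff above adds.
    return hidden_states, None, None


# The call succeeds even though padding_mask is not a named parameter.
out, attn, cache = patched_forward("h", attention_mask=None, padding_mask="extra")
print(out)
```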
2 changes: 2 additions & 0 deletions applications/ColossalEval/colossal_eval/dataset/__init__.py
@@ -6,6 +6,7 @@
from .gaokaobench import GaoKaoBenchDataset
from .longbench import LongBenchDataset
from .mmlu import MMLUDataset
from .mtbench import MTBenchDataset

__all__ = [
"AGIEvalDataset",
@@ -16,4 +17,5 @@
"LongBenchDataset",
"MMLUDataset",
"ColossalDataset",
"MTBenchDataset",
]
72 changes: 72 additions & 0 deletions applications/ColossalEval/colossal_eval/dataset/mtbench.py
@@ -0,0 +1,72 @@
import copy
import json
import os
from collections import defaultdict
from typing import Dict, List

from colossal_eval.utils import get_json_list

from colossalai.logging import DistributedLogger

from .base import BaseDataset

default_inference_kwargs = {
"calculate_loss": False,
"all_classes": None,
"language": "English",
"pretrain": False,
"max_new_tokens": 1024,
"turns": 2,
}


class MTBenchDataset(BaseDataset):
"""
Dataset class for mt_bench dataset.
Data source: https://github.com/lm-sys/FastChat/blob/main/fastchat/llm_judge/data/mt_bench/question.jsonl
This dataset class will convert the original dataset into the inference dataset.
"""

def __init__(self, path, logger, few_shot):
self.multiturn = True
self.dataset = self.load(path, logger, few_shot)

@staticmethod
def load(path: str, logger: DistributedLogger, few_shot: bool) -> List[Dict]:
dataset = {"test": defaultdict(dict)}

file_path = os.path.join(path, "question.jsonl")
ref_path = os.path.join(path, "reference_answer/gpt-4.jsonl")

reference = defaultdict(list)
ref_origin = get_json_list(ref_path)
for ref in ref_origin:
reference[ref["question_id"]] = ref["choices"][0]["turns"]

with open(file_path, "r", encoding="utf-8") as file:
for line in file:
question = json.loads(line)
category = question["category"]
turn_number = len(question["turns"])
data_point = {
"id": question["question_id"],
"dataset": "mtbench",
"split": "test",
"category": category,
"instruction": question["turns"],
"input": "",
"output": [],
"target": [""] * turn_number
if question["question_id"] not in reference
else reference[question["question_id"]],
}

if category in dataset["test"]:
dataset["test"][category]["data"].append(data_point)
else:
dataset["test"][category] = {
"data": [data_point],
"inference_kwargs": copy.deepcopy(default_inference_kwargs),
}

return dataset
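
The `load` method above groups mt_bench questions by category and attaches GPT-4 reference answers when available, falling back to empty targets otherwise. A self-contained sketch of that grouping logic, using in-memory records with illustrative data instead of the JSONL files, behaves the same way:

```python
import copy
from collections import defaultdict

default_inference_kwargs = {"calculate_loss": False, "turns": 2}

# Illustrative stand-ins for question.jsonl and reference_answer/gpt-4.jsonl.
questions = [
    {"question_id": 81, "category": "writing", "turns": ["t1", "t2"]},
    {"question_id": 101, "category": "reasoning", "turns": ["t1", "t2"]},
]
reference = {81: ["ref1", "ref2"]}  # question 101 has no reference answer

dataset = {"test": defaultdict(dict)}
for q in questions:
    data_point = {
        "id": q["question_id"],
        "instruction": q["turns"],
        # Fall back to empty targets when no GPT-4 reference exists.
        "target": reference.get(q["question_id"], [""] * len(q["turns"])),
    }
    cat = q["category"]
    if cat in dataset["test"]:
        dataset["test"][cat]["data"].append(data_point)
    else:
        dataset["test"][cat] = {
            "data": [data_point],
            "inference_kwargs": copy.deepcopy(default_inference_kwargs),
        }

print(sorted(dataset["test"]))  # ['reasoning', 'writing']
```

Deep-copying `default_inference_kwargs` per category matters: each category gets its own mutable settings dict rather than sharing one.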