Ra #193
Merged

51 commits
750ec01
Merge pull request #169 from hpcaitech/main
jamesthesnake Sep 17, 2023
1db6727
[Pipeline inference] Combine kvcache with pipeline inference (#4938)
FoolPlayer Oct 27, 2023
4e4a10c
updated c++17 compiler flags (#4983)
kurisusnowdeng Oct 27, 2023
cf579ff
[Inference] Dynamic Batching Inference, online and offline (#4953)
CjhHa1 Oct 30, 2023
459a88c
[Kernels]Updated Triton kernels into 2.1.0 and adding flash-decoding …
tiandiao123 Oct 30, 2023
abe071b
fix ColossalEval (#4992)
chengeharrison Oct 31, 2023
4f0234f
[doc]Update doc for colossal-inference (#4989)
tiandiao123 Oct 31, 2023
be82b5d
[hotfix] Fix the bug where process groups were not being properly rel…
littsk Oct 31, 2023
c040d70
[hotfix] fix the bug of repeatedly storing param group (#4951)
Oct 31, 2023
335cb10
[doc] add supported feature diagram for hybrid parallel plugin (#4996)
ppt0011 Oct 31, 2023
b6696be
[Pipeline Inference] Merge pp with tp (#4993)
FoolPlayer Nov 1, 2023
8993c8a
[release] update version (#4995)
ver217 Nov 1, 2023
dc003c3
[moe] merge moe into main (#4978)
oahzxl Nov 2, 2023
d99b2c9
[hotfix] fix grad accumulation plus clipping for gemini (#5002)
Nov 2, 2023
1a3315e
[hotfix] Add layer norm gradients all-reduce for sequence parallel (#…
littsk Nov 3, 2023
c36e782
[format] applied code formatting on changed files in pull request 492…
github-actions[bot] Nov 6, 2023
ef4c14a
[Inference] Fix bug in ChatGLM2 Tensor Parallelism (#5014)
CjhHa1 Nov 7, 2023
67f5331
[misc] add code owners (#5024)
ver217 Nov 8, 2023
f71e63b
[moe] support optimizer checkpoint (#5015)
oahzxl Nov 8, 2023
239cd92
Support mtbench (#5025)
chengeharrison Nov 9, 2023
7244412
[moe]: fix ep/tp tests, add hierarchical all2all (#4982)
cwher Nov 9, 2023
a448938
[shardformer] Fix serialization error with Tensor Parallel state savi…
imgaojun Nov 9, 2023
576a2f7
[gemini] gemini support tensor parallelism. (#4942)
flybird11111 Nov 10, 2023
70885d7
[hotfix] Suport extra_kwargs in ShardConfig (#5031)
KKZ20 Nov 10, 2023
43ad0d9
fix wrong EOS token in ColossalChat
Orion-Zheng Nov 14, 2023
28052a7
[Kernels]Update triton kernels into 2.1.0 (#5046)
tiandiao123 Nov 16, 2023
b2ad0d9
[pipeline,shardformer] Fix p2p efficiency in pipeline, allow skipping…
zeyugao Nov 16, 2023
3e02154
[gemini] gemini support extra-dp (#5043)
flybird11111 Nov 16, 2023
97cd0cd
[shardformer] fix llama error when transformers upgraded. (#5055)
flybird11111 Nov 16, 2023
3c08f17
[hotfix]: modify create_ep_hierarchical_group and add test (#5032)
cwher Nov 17, 2023
bc09b95
[exampe] fix llama example' loss error when using gemini plugin (#5060)
flybird11111 Nov 18, 2023
fd6482a
[inference] Refactor inference architecture (#5057)
Xu-Kai Nov 19, 2023
bce9197
[Kernels]added flash-decoidng of triton (#5063)
tiandiao123 Nov 20, 2023
8d56c9c
[misc] remove outdated submodule (#5070)
ver217 Nov 20, 2023
e5ce4c8
[npu] add npu support for gemini and zero (#5067)
ver217 Nov 20, 2023
0c7d8be
[hotfix/hybridengine] fix bug when tp*pp size = 1 (#5069)
FoolPlayer Nov 20, 2023
fb103cf
[inference] update examples and engine (#5073)
Xu-Kai Nov 20, 2023
8921a73
[format] applied code formatting on changed files in pull request 506…
github-actions[bot] Nov 20, 2023
4e3959d
[hotfix/hybridengine] Fix init model with random parameters in benchm…
FoolPlayer Nov 20, 2023
1cd7efc
[inference] refactor examples and fix schedule (#5077)
ver217 Nov 21, 2023
dce05da
fix thrust-transform-reduce error (#5078)
imgaojun Nov 21, 2023
fd3567e
[nfc] fix typo in docs/ (#4972)
digger-yu Nov 21, 2023
0d48230
[nfc] fix typo and author name (#5089)
digger-yu Nov 22, 2023
4ccb9de
[gemini]fix gemini optimzer, saving Shardformer in Gemini got list as…
flybird11111 Nov 22, 2023
75af66c
[Hotfix] Fix model policy matching strategy in ShardFormer (#5064)
KKZ20 Nov 22, 2023
aae4966
[shardformer]fix flash attention, when mask is casual, just don't unp…
flybird11111 Nov 22, 2023
3acbf6d
[npu] add npu support for hybrid plugin and llama (#5090)
oahzxl Nov 22, 2023
e53e729
[Feature] Add document retrieval QA (#5020)
YeAnbang Nov 23, 2023
68fcaa2
remove duplicate import (#5100)
oahzxl Nov 23, 2023
2bdf76f
fix typo change lazy_iniy to lazy_init (#5099)
digger-yu Nov 24, 2023
35722c5
Merge pull request #192 from hpcaitech/main
jamesthesnake Nov 26, 2023
Files changed
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -0,0 +1 @@
* @hpcaitech/colossalai-qa
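This one-line rule makes the @hpcaitech/colossalai-qa team the default owner of every path in the repository, so GitHub automatically requests a review from that team on each pull request.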
4 changes: 3 additions & 1 deletion .github/workflows/release_test_pypi_before_merge.yml
@@ -27,7 +27,9 @@ jobs:
echo $new_version > ./version.txt
echo "version=$new_version" >> $GITHUB_OUTPUT

- run: python setup.py sdist build
- run: |
pip install --upgrade pip
python setup.py sdist build

# publish to PyPI if executed on the main branch
- name: Publish package to PyPI
54 changes: 54 additions & 0 deletions .github/workflows/run_colossalqa_unit_tests.yml
@@ -0,0 +1,54 @@
name: Run colossalqa unit tests

on:
pull_request:
types: [synchronize, opened, reopened]
paths:
- 'applications/ColossalQA/colossalqa/**'
- 'applications/ColossalQA/requirements.txt'
- 'applications/ColossalQA/setup.py'
- 'applications/ColossalQA/tests/**'
- 'applications/ColossalQA/pytest.ini'

jobs:
tests:
name: Run colossalqa unit tests
if: |
github.event.pull_request.draft == false &&
github.base_ref == 'main' &&
github.event.pull_request.base.repo.full_name == 'hpcaitech/ColossalAI'
runs-on: [self-hosted, gpu]
container:
image: hpcaitech/pytorch-cuda:1.12.0-11.3.0
volumes:
- /data/scratch/test_data_colossalqa:/data/scratch/test_data_colossalqa
- /data/scratch/llama-tiny:/data/scratch/llama-tiny
options: --gpus all --rm
timeout-minutes: 30
defaults:
run:
shell: bash
steps:
- name: Checkout ColossalAI
uses: actions/checkout@v2

- name: Install colossalqa
run: |
cd applications/ColossalQA
pip install -e .

- name: Execute Unit Testing
run: |
cd applications/ColossalQA
pytest tests/
env:
NCCL_SHM_DISABLE: 1
MAX_JOBS: 8
ZH_MODEL_PATH: bigscience/bloom-560m
ZH_MODEL_NAME: bloom
EN_MODEL_PATH: bigscience/bloom-560m
EN_MODEL_NAME: bloom
TEST_DATA_PATH_EN: /data/scratch/test_data_colossalqa/companies.txt
TEST_DATA_PATH_ZH: /data/scratch/test_data_colossalqa/companies_zh.txt
TEST_DOCUMENT_LOADER_DATA_PATH: /data/scratch/test_data_colossalqa/tests/*
SQL_FILE_PATH: /data/scratch/test_data_colossalqa/sql_file_path
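For orientation, here is a minimal sketch of how a ColossalQA test could consume the fixtures and model names exported above; the environment variable names come from the workflow, but the test body itself is an illustrative assumption, not code from this PR:

import os

# Paths and model ids injected by the workflow's env block.
EN_DATA = os.environ["TEST_DATA_PATH_EN"]   # companies.txt fixture
EN_MODEL = os.environ["EN_MODEL_PATH"]      # bigscience/bloom-560m

def test_english_fixture_is_nonempty():
    # The workflow mounts /data/scratch/test_data_colossalqa into the
    # container, so the fixture file is readable at this path.
    with open(EN_DATA, encoding="utf-8") as f:
        lines = [line for line in f if line.strip()]
    assert lines, "expected at least one document in the English fixture"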
4 changes: 0 additions & 4 deletions .gitmodules
@@ -1,7 +1,3 @@
[submodule "inference"]
path = inference
url = https://github.com/hpcaitech/EnergonAI.git
branch = main
[submodule "examples/tutorial/fastfold/FastFold"]
path = examples/tutorial/fastfold/FastFold
url = https://github.com/hpcaitech/FastFold
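The removed entry pointed the inference path at the external EnergonAI repository; with commit 8d56c9c ("[misc] remove outdated submodule (#5070)") and inference now refactored in-tree (#5057), the submodule is no longer needed.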
25 changes: 25 additions & 0 deletions LICENSE
@@ -527,3 +527,28 @@ Copyright 2021- HPC-AI Technology Inc. All rights reserved.
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


---------------- LICENSE FOR LangChain TEAM ----------------

The MIT License

Copyright (c) Harrison Chase

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
(file name not captured in this view)
@@ -118,7 +118,7 @@ def main(args):
tokenizer.pad_token = tokenizer.eos_token
elif args.model == "llama":
tokenizer = LlamaTokenizer.from_pretrained(args.pretrain)
tokenizer.eos_token = "<\s>"
tokenizer.eos_token = "</s>"
tokenizer.pad_token = tokenizer.unk_token
else:
raise ValueError(f'Unsupported model "{args.model}"')
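This one-line change, repeated across the Chat example files below, fixes a subtle string bug: "\s" is not a valid Python escape sequence, so the old literal kept the backslash and never matched Llama's real end-of-sequence token. A standalone illustration (not part of the PR's diff):

# "\s" is not a recognized escape, so Python keeps the backslash
# (recent interpreters warn about it), leaving the four characters <\s>.
old = "<\s>"   # the buggy value: '<', '\\', 's', '>'
new = "</s>"   # Llama's actual end-of-sequence token
print(list(old))   # ['<', '\\', 's', '>']
print(old == new)  # False -- generation would never stop on the true EOS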
(file name not captured in this view)
@@ -68,7 +68,7 @@ def train(args):
padding_side="right",
use_fast=False,
)
tokenizer.eos_token = "<\s>"
tokenizer.eos_token = "</s>"
tokenizer.pad_token = tokenizer.unk_token
else:
raise ValueError(f'Unsupported model "{args.model}"')
2 changes: 1 addition & 1 deletion applications/Chat/examples/inference.py
@@ -39,7 +39,7 @@ def eval(args):
tokenizer.pad_token = tokenizer.eos_token
elif args.model == "llama":
tokenizer = LlamaTokenizer.from_pretrained("hf-internal-testing/llama-tokenizer")
tokenizer.eos_token = "<\s>"
tokenizer.eos_token = "</s>"
tokenizer.pad_token = tokenizer.unk_token
else:
raise ValueError(f'Unsupported model "{args.model}"')
2 changes: 1 addition & 1 deletion applications/Chat/examples/train_prompts.py
@@ -125,7 +125,7 @@ def main(args):
tokenizer = LlamaTokenizer.from_pretrained(
"hf-internal-testing/llama-tokenizer" if args.tokenizer is None else args.tokenizer
)
tokenizer.eos_token = "<\s>"
tokenizer.eos_token = "</s>"
tokenizer.pad_token = tokenizer.unk_token
else:
raise ValueError(f'Unsupported model "{args.model}"')
2 changes: 1 addition & 1 deletion applications/Chat/examples/train_reward_model.py
@@ -72,7 +72,7 @@ def train(args):
tokenizer = LlamaTokenizer.from_pretrained(
"hf-internal-testing/llama-tokenizer" if args.tokenizer is None else args.tokenizer
)
tokenizer.eos_token = "<\s>"
tokenizer.eos_token = "</s>"
tokenizer.pad_token = tokenizer.unk_token
else:
raise ValueError(f'Unsupported model "{args.model}"')
2 changes: 1 addition & 1 deletion applications/Chat/examples/train_sft.py
@@ -75,7 +75,7 @@ def train(args):
tokenizer = LlamaTokenizer.from_pretrained(
"hf-internal-testing/llama-tokenizer" if args.tokenizer is None else args.tokenizer
)
tokenizer.eos_token = "<\s>"
tokenizer.eos_token = "</s>"
tokenizer.pad_token = tokenizer.unk_token
elif args.model == "chatglm":
tokenizer = ChatGLMTokenizer.from_pretrained(
2 changes: 2 additions & 0 deletions applications/ColossalEval/colossal_eval/dataset/__init__.py
@@ -6,6 +6,7 @@
from .gaokaobench import GaoKaoBenchDataset
from .longbench import LongBenchDataset
from .mmlu import MMLUDataset
from .mtbench import MTBenchDataset

__all__ = [
"AGIEvalDataset",
@@ -16,4 +17,5 @@
"LongBenchDataset",
"MMLUDataset",
"ColossalDataset",
"MTBenchDataset",
]
72 changes: 72 additions & 0 deletions applications/ColossalEval/colossal_eval/dataset/mtbench.py
@@ -0,0 +1,72 @@
import copy
import json
import os
from collections import defaultdict
from typing import Dict, List

from colossal_eval.utils import get_json_list

from colossalai.logging import DistributedLogger

from .base import BaseDataset

default_inference_kwargs = {
"calculate_loss": False,
"all_classes": None,
"language": "English",
"pretrain": False,
"max_new_tokens": 1024,
"turns": 2,
}


class MTBenchDataset(BaseDataset):
"""
Dataset class for mt_bench dataset.
Data source: https://github.com/lm-sys/FastChat/blob/main/fastchat/llm_judge/data/mt_bench/question.jsonl
This dataset class will convert the original dataset into the inference dataset.
"""

def __init__(self, path, logger, few_shot):
self.multiturn = True
self.dataset = self.load(path, logger, few_shot)

@staticmethod
def load(path: str, logger: DistributedLogger, few_shot: bool) -> List[Dict]:
dataset = {"test": defaultdict(dict)}

file_path = os.path.join(path, "question.jsonl")
ref_path = os.path.join(path, "reference_answer/gpt-4.jsonl")

reference = defaultdict(list)
ref_origin = get_json_list(ref_path)
for ref in ref_origin:
reference[ref["question_id"]] = ref["choices"][0]["turns"]

with open(file_path, "r", encoding="utf-8") as file:
for line in file:
question = json.loads(line)
category = question["category"]
turn_number = len(question["turns"])
data_point = {
"id": question["question_id"],
"dataset": "mtbench",
"split": "test",
"category": category,
"instruction": question["turns"],
"input": "",
"output": [],
"target": [""] * turn_number
if question["question_id"] not in reference
else reference[question["question_id"]],
}

if category in dataset["test"]:
dataset["test"][category]["data"].append(data_point)
else:
dataset["test"][category] = {
"data": [data_point],
"inference_kwargs": copy.deepcopy(default_inference_kwargs),
}

return dataset
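To make the loader concrete: each line of question.jsonl is a JSON object with a question_id, a category, and a list of turns, and reference answers exist only for some question ids, which is why load falls back to empty targets. A sketch of one such line, with field names taken from the code above and values that are merely illustrative:

import json

sample = {
    "question_id": 81,            # keyed against reference_answer/gpt-4.jsonl
    "category": "writing",        # groups questions into per-category splits
    "turns": [
        "Compose an engaging travel blog post about a recent trip to Hawaii.",
        "Rewrite your previous response. Start every sentence with the letter A.",
    ],
}
print(json.dumps(sample))  # one JSON object per line of question.jsonl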
(file name not captured in this view)
@@ -1,12 +1,15 @@
import os
from typing import Dict, List

import colossal_eval.evaluate.dataset_evaluator.metrics as metric_helper
import numpy as np
import tqdm
from colossal_eval.utils import jdump

LabelBasedMetrics = ["first_token_accuracy", "matthews_correlation"]
LossBasedMetrics = ["perplexity", "ppl_score", "ppl_score_over_choices", "per_byte_perplexity", "per_byte_ppl_score"]
CombinedMetrics = ["combined_single_choice_accuracy"]
GPTMetrics = ["mtbench_single_judge"]
OtherMetrics = [
"f1_score",
"f1_zh_score",
@@ -29,8 +32,9 @@ class DatasetEvaluator(object):

"""

def __init__(self):
pass
def __init__(self, config_path: str, save_path: str):
self.config_path = config_path
self.save_path = save_path

def _calculate_label_metrics(self, metric: str, category: str):
"""Calculate label-based metrics."""
@@ -60,6 +64,11 @@ def _calculate_label_metrics(self, metric: str, category: str):
sample["output"], ref, all_classes=self.data[category]["inference_kwargs"]["all_classes"]
),
)

score = max(
score,
metric_helper.accuracy_by_options(sample["input"], sample["output"], ref),
)
softmaxs.append(references[i] if score == 1 else -1)
else:
softmaxs.append(np.argmax(np.array(list(sample["softmax_over_choices"].values()))))
@@ -151,6 +160,24 @@ def _calculate_other_metrics(self, metric: str, category: str):
self.evaluation_results[metric][category] = (total_score, len(self.data[category]["data"]))
self.evaluation_results[metric]["ALL"] += total_score * weight

def _calculate_gpt_metrics(self, metric: str, category: str):
"""Calculate gpt metrics."""
weight = len(self.data[category]["data"]) / self.metric_total_length[metric]

metric_method = eval("gpt_helper." + metric)

judgements, avg_ratings = metric_method(self.data[category]["data"], self.config_path)
self.judgements[category] = judgements

self.evaluation_results[metric][category] = (np.mean(avg_ratings), len(self.data[category]["data"]))
self.evaluation_results[metric]["ALL"] += np.mean(avg_ratings) * weight

for i in range(avg_ratings.shape[0]):
if f"{metric}_{i+1}" not in self.evaluation_results:
self.evaluation_results[f"{metric}_{i+1}"] = {cat: 0 for cat in (["ALL"] + self.categories)}
self.evaluation_results[f"{metric}_{i+1}"][category] = (avg_ratings[i], len(self.data[category]["data"]))
self.evaluation_results[f"{metric}_{i+1}"]["ALL"] += avg_ratings[i] * weight

def _calculate_loss_metrics(self, metric: str, category: str):
"""Calculate perplexity."""
if metric == "perplexity":
@@ -212,10 +239,20 @@ def _evaluate(self):
for category in self.suggested_categories[metric]:
self._calculate_combined_metrics(metric, category)
pbar.update(1)
elif metric in GPTMetrics:
for category in self.suggested_categories[metric]:
self._calculate_gpt_metrics(metric, category)
pbar.update(1)
elif metric in OtherMetrics:
for category in self.suggested_categories[metric]:
self._calculate_other_metrics(metric, category)
pbar.update(1)
else:
raise Exception(f"{metric} not supported.")

if self.judgements:
judgement_path = os.path.join(self.save_path, f"{self.model_name}_judgements.json")
jdump(self.judgements, judgement_path)

return self.evaluation_results

@@ -235,6 +272,7 @@ def get_evaluation_results(self, data: List[Dict], dataset_name: str, model_name
self.model_name = model_name
self.categories = list(data.keys())
self.metrics = metrics
self.judgements = {}

self.evaluation_results = {
metric: {category: 0 for category in (["ALL"] + self.categories)} for metric in self.metrics
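The new _calculate_gpt_metrics aggregates per-category judge ratings into the "ALL" entry by sample-count weighting, mirroring the other metric helpers. A minimal standalone sketch of that aggregation, with illustrative numbers rather than real judge output:

import numpy as np

category_sizes = {"writing": 10, "math": 10}        # samples per category
per_turn_means = {"writing": np.array([9.0, 8.0]),  # mean rating per turn
                  "math": np.array([6.0, 5.0])}
total = sum(category_sizes.values())

overall = 0.0
for cat, size in category_sizes.items():
    overall += np.mean(per_turn_means[cat]) * (size / total)
print(overall)  # 7.0 -- the value accumulated under evaluation_results[metric]["ALL"]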