Merged
56 commits
677cbfa
[Fix/Example] Fix Llama Inference Loading Data Type (#5763)
yuanheng-zhao May 30, 2024
68359ed
[release] update version (#5752)
ver217 May 31, 2024
3f2be80
fix (#5765)
flybird11111 Jun 3, 2024
1b76564
[test] Fix/fix testcase (#5770)
duanjunwen Jun 3, 2024
4064432
[Hotfix] Add missing init file in inference.executor (#5774)
yuanheng-zhao Jun 3, 2024
e22b827
[CI/tests] simplify some test case to reduce testing time (#5755)
Hz188 Jun 4, 2024
32f4187
[misc] update dockerfile (#5776)
ver217 Jun 4, 2024
ee6fd38
[devops] fix docker ci (#5780)
ver217 Jun 4, 2024
b45000f
[Inference]Add Streaming LLM (#5745)
isky-cd Jun 5, 2024
50b4c8e
[hotfix] fix llama flash attention forward (#5777)
flybird11111 Jun 5, 2024
79f7a7b
[misc] Accelerate CI for zero and dist optim (#5758)
Edenzzzz Jun 5, 2024
80c3c87
[Test/CI] remove test cases to reduce CI duration (#5753)
botbw Jun 5, 2024
10a19e2
[hotfix] fix testcase in test_fx/test_tracer (#5779)
duanjunwen Jun 5, 2024
3f7e313
[gemini] optimize reduce scatter d2h copy (#5760)
botbw Jun 5, 2024
c46e097
Allow building cuda extension without a device. (#5535)
ccoulombe Jun 5, 2024
b9d646f
[misc] fix dist logger (#5782)
ver217 Jun 5, 2024
a1e39f4
[install]fix setup (#5786)
flybird11111 Jun 6, 2024
5ead00f
[misc] update requirements (#5787)
ver217 Jun 6, 2024
73e88a5
[shardformer] fix import (#5788)
ver217 Jun 6, 2024
7a7e869
upgrade colossal-chat support tp_group>1, add sp for sft
YeAnbang May 27, 2024
929e1e3
upgrade ppo dpo rm script
YeAnbang May 28, 2024
7e65b71
run pre-commit
YeAnbang May 28, 2024
0b4a335
moupdate ci tests, st ci test cases passed, tp failed in generation f…
YeAnbang May 28, 2024
7ae87b3
fix training script
YeAnbang May 28, 2024
b1031f7
fix ci
YeAnbang May 28, 2024
1b880ce
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 28, 2024
b8b5cac
fix transformers version
YeAnbang May 29, 2024
62eb28b
remove duplicated test
YeAnbang May 29, 2024
0bbac15
fix datasets version
YeAnbang May 29, 2024
bf57b13
remove models that require huggingface auth from ci
YeAnbang May 29, 2024
45195ac
remove local data path
YeAnbang May 29, 2024
e16ccc2
update ci
YeAnbang May 29, 2024
ac1520c
remove baichuan from template test due to transformer version conflict
YeAnbang Jun 3, 2024
790e136
merge
YeAnbang Jun 7, 2024
04386d9
Refactor modeling by adding attention backend
char-1ee Jun 3, 2024
eec77e5
Fix tests and naming
char-1ee Jun 3, 2024
5f398fc
Pass inference model shard configs for module init
char-1ee Jun 7, 2024
ceba662
Clean up
char-1ee Jun 7, 2024
0d7ff10
replace the customized dataloader setup with the build-in one
YeAnbang Jun 7, 2024
77db216
replace the customized dataloader setup with the build-in one
YeAnbang Jun 7, 2024
f5981e8
Remove flash attention backend
char-1ee Jun 7, 2024
2abdede
fix readme
YeAnbang Jun 10, 2024
b303976
Fix test import
char-1ee Jun 10, 2024
77a219a
Merge pull request #5771 from char-1ee/refactor/modeling
char-1ee Jun 10, 2024
84eab13
update sft trainning script
YeAnbang Jun 11, 2024
c0948af
[Inference]refactor baichuan (#5791)
LRY89757 Jun 11, 2024
74f4a29
Merge pull request #5759 from hpcaitech/colossalchat_upgrade
YeAnbang Jun 11, 2024
587bbf4
[test] fix chatglm test kit (#5793)
ver217 Jun 11, 2024
aa125bc
[shardformer] fix modeling of bloom and falcon (#5796)
ver217 Jun 11, 2024
aac941e
[test] fix qwen2 pytest distLarge (#5797)
GuangyaoZhang Jun 12, 2024
8554585
[Inference] Fix flash-attn import and add model test (#5794)
char-1ee Jun 12, 2024
d9dddf5
[Gemini] Use async stream to prefetch and h2d data moving (#5781)
Hz188 Jun 12, 2024
3bcbba9
[gemini] quick fix on possible async operation (#5803)
botbw Jun 13, 2024
2ddf624
[shardformer] upgrade transformers to 4.39.3 (#5815)
flybird11111 Jun 14, 2024
8795bb2
Support 4d parallel + flash attention (#5789)
Edenzzzz Jun 17, 2024
45ec5b3
fix .github worfflow conflict with main branch
Hz188 Jun 18, 2024
4 changes: 2 additions & 2 deletions .github/workflows/compatiblity_test_on_dispatch.yml
@@ -51,11 +51,11 @@ jobs:
     container:
       image: ${{ matrix.container }}
       options: --gpus all --rm -v /dev/shm -v /data/scratch/:/data/scratch/
-    timeout-minutes: 120
+    timeout-minutes: 200
     steps:
       - name: Install dependencies
         run: |
-          pip install -U pip setuptools wheel --user
+          pip install -U pip setuptools==68.2.2 wheel --user
       - uses: actions/checkout@v2
         with:
           repository: hpcaitech/TensorNVMe
4 changes: 2 additions & 2 deletions .github/workflows/compatiblity_test_on_pr.yml
@@ -42,14 +42,14 @@ jobs:
     container:
       image: ${{ matrix.container }}
       options: --gpus all --rm -v /dev/shm -v /data/scratch/:/data/scratch/
-    timeout-minutes: 120
+    timeout-minutes: 200
     concurrency:
       group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}-run-test-${{ matrix.container }}
       cancel-in-progress: true
     steps:
       - name: Install dependencies
         run: |
-          pip install -U pip setuptools wheel --user
+          pip install -U pip setuptools==68.2.2 wheel --user
       - uses: actions/checkout@v2
         with:
           repository: hpcaitech/TensorNVMe
4 changes: 2 additions & 2 deletions .github/workflows/compatiblity_test_on_schedule.yml
@@ -39,11 +39,11 @@ jobs:
     container:
       image: ${{ matrix.container }}
       options: --gpus all --rm -v /dev/shm -v /data/scratch/:/data/scratch/
-    timeout-minutes: 120
+    timeout-minutes: 200
     steps:
       - name: Install dependencies
         run: |
-          pip install -U pip setuptools wheel --user
+          pip install -U pip setuptools==68.2.2 wheel --user
 
       - uses: actions/checkout@v2
         with:
2 changes: 2 additions & 0 deletions .github/workflows/release_docker_after_publish.yml
@@ -28,6 +28,8 @@ jobs:
           docker tag $tag $latest
           echo "tag=${tag}" >> $GITHUB_OUTPUT
           echo "latest=${latest}" >> $GITHUB_OUTPUT
+        env:
+          DOCKER_BUILDKIT: 0
 
       - name: Log in to Docker Hub
         uses: docker/login-action@f054a8b539a109f9f41c372932f1ae047eff08c9
11 changes: 6 additions & 5 deletions .github/workflows/run_chatgpt_examples.yml
@@ -4,10 +4,11 @@ on:
   pull_request:
     types: [synchronize, opened, reopened]
     paths:
-      - "applications/Chat/coati/**"
-      - "applications/Chat/requirements.txt"
-      - "applications/Chat/setup.py"
-      - "applications/Chat/examples/**"
+      - "applications/ColossalChat/coati/**"
+      - "applications/ColossalChat/requirements.txt"
+      - "applications/ColossalChat/setup.py"
+      - "applications/ColossalChat/examples/**"
+      - "applications/ColossalChat/tests/**"
 
 jobs:
   tests:
@@ -41,7 +42,7 @@ jobs:
 
       - name: Install Transformers
         run: |
-          pip install transformers==4.34.1
+          pip install transformers==4.36.2
 
       - name: Execute Examples
         run: |
11 changes: 5 additions & 6 deletions .github/workflows/run_chatgpt_unit_tests.yml
@@ -4,12 +4,11 @@ on:
   pull_request:
     types: [synchronize, opened, reopened]
     paths:
-      - 'applications/Chat/coati/**'
-      - 'applications/Chat/requirements.txt'
-      - 'applications/Chat/setup.py'
-      - 'applications/Chat/requirements-test.txt'
-      - 'applications/Chat/tests/**'
-      - 'applications/Chat/pytest.ini'
+      - 'applications/ColossalChat/coati/**'
+      - 'applications/ColossalChat/requirements.txt'
+      - 'applications/ColossalChat/setup.py'
+      - 'applications/ColossalChat/tests/**'
+      - 'applications/ColossalChat/pytest.ini'
 
 jobs:
   tests: