
[shardformer] Support Mistral for Shardformer #4836

Closed
eric8607242 wants to merge 2 commits into hpcaitech:main from eric8607242:feature/mistral

Conversation

@eric8607242
Contributor

@eric8607242 eric8607242 commented Sep 28, 2023

📌 Checklist before creating the PR

  • I have created an issue for this PR for traceability
  • The title follows the standard format: [doc/gemini/tensor/...]: A concise description
  • I have added relevant tags if possible for us to better distinguish different PRs

🚨 Issue number

#4835

📝 What does this PR do?

Hi, I added a new policy and model to support Mistral with ShardFormer.
The current policy supports tensor parallel, fused normalization, and flash attention.

Although the minimum version requirement for Mistral is currently transformers==4.34.0.dev0, it is still very exciting to make ColossalAI support such an impressive model!
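For context, a ShardFormer tensor-parallel policy essentially maps decoder-layer submodules onto column/row-parallel linear layers. The sketch below is illustrative only and is not the code in this diff; the import paths and description classes are assumed to match the ColossalAI shardformer interfaces of this period, and a complete policy would additionally need attribute replacement (e.g. dividing the attention head counts per rank) plus the fused-normalization and flash-attention hooks.

```python
# Illustrative sketch only (not the code in this PR). Assumes transformers>=4.34
# (MistralDecoderLayer) and the colossalai.shardformer policy interfaces.
from transformers.models.mistral.modeling_mistral import MistralDecoderLayer

from colossalai.shardformer.layer import Linear1D_Col, Linear1D_Row
from colossalai.shardformer.policies.base_policy import (
    ModulePolicyDescription,
    Policy,
    SubModuleReplacementDescription,
)


class MistralPolicySketch(Policy):
    def config_sanity_check(self):
        pass

    def preprocess(self):
        return self.model

    def module_policy(self):
        policy = {}
        if self.shard_config.enable_tensor_parallelism:
            policy[MistralDecoderLayer] = ModulePolicyDescription(
                sub_module_replacement=[
                    # "in" projections become column-parallel, "out" projections row-parallel
                    SubModuleReplacementDescription("self_attn.q_proj", Linear1D_Col),
                    SubModuleReplacementDescription("self_attn.k_proj", Linear1D_Col),
                    SubModuleReplacementDescription("self_attn.v_proj", Linear1D_Col),
                    SubModuleReplacementDescription("self_attn.o_proj", Linear1D_Row),
                    SubModuleReplacementDescription("mlp.gate_proj", Linear1D_Col),
                    SubModuleReplacementDescription("mlp.up_proj", Linear1D_Col),
                    SubModuleReplacementDescription("mlp.down_proj", Linear1D_Row),
                ],
            )
            # A real policy would also shrink per-rank attributes here
            # (e.g. num_heads, num_key_value_heads) via attribute_replacement.
        return policy

    def postprocess(self):
        return self.model
```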

💥 Checklist before requesting a review

  • I have linked my PR to an issue (instruction)
  • My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
  • I have performed a self-review of my code
  • I have added thorough tests.
  • I have added docstrings for all the functions/methods I implemented

⭐️ Do you enjoy contributing to Colossal-AI?

  • 🌝 Yes, I do.
  • 🌚 No, I don't.

Tell us more if you don't enjoy contributing to Colossal-AI.

@eric8607242 eric8607242 changed the title [shardformer] Add Mistral support for Shardformer [shardformer] Support Mistral for Shardformer Sep 30, 2023
@github-actions
Contributor

github-actions Bot commented Oct 2, 2023

The code coverage for the changed files is 27%.

Complete report:
Name                                            Stmts   Miss  Cover
-------------------------------------------------------------------
colossalai/shardformer/layer/normalization.py      50     10    80%
colossalai/shardformer/modeling/mistral.py         42     42     0%
colossalai/shardformer/policies/mistral.py         58     58     0%
-------------------------------------------------------------------
TOTAL                                             150    110    27%

@Fridge003
Contributor

Hi, have you tested the correctness of your code?

@eric8607242
Contributor Author

eric8607242 commented Oct 7, 2023

Hi @Fridge003,

I have successfully fine-tuned Mistral-7B with the HybridParallelPlugin (flash attention + tensor parallelism + fused normalization) using my code.
Did you encounter any issues running with this setting?
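For anyone who wants to reproduce this setting, here is a minimal sketch of the plugin configuration; the concrete sizes and precision are example values, not taken from this PR.

```python
# Minimal sketch of the setting described above; assumes processes are started with
# `torchrun` and that this ColossalAI version exposes these HybridParallelPlugin flags.
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import HybridParallelPlugin

colossalai.launch_from_torch(config={})

plugin = HybridParallelPlugin(
    tp_size=2,                       # tensor parallelism
    pp_size=1,                       # pipeline parallelism is not supported by this PR
    enable_flash_attention=True,
    enable_fused_normalization=True,
    precision="bf16",
)
booster = Booster(plugin=plugin)
# model / optimizer / dataloader are then wrapped with booster.boost(...)
```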

@Fridge003
Contributor

Hi, @eric8607242. Usually when we implement a policy for a new model, the corresponding tests should also be added under the folder tests/test_shardformer/test_model. Since our CI doesn't use the newest version of the transformers library, this pull request might bypass the CI tests. I want to make sure that the code in this PR doesn't cause errors.

@eric8607242
Contributor Author

Hi @Fridge003,

I see.
This code does not raise any errors in my testing.
As I mentioned above, Mistral-7B can be fine-tuned successfully.

However, pipeline parallelism is not yet supported in this PR.

@zhoumengbo

Hello, why has this code not been merged into the main branch yet? Is there something wrong? Could you provide the complete fine-tuning code for Mistral?

@eric8607242
Contributor Author

Hi @zhoumengbo,
Unfortunately, I have no idea why this PR has not been merged yet.

I modified the fine-tuning code from https://github.com/hpcaitech/ColossalAI/blob/main/examples/language/gpt/hybridparallelism/finetune.py
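Roughly, the adaptation amounts to swapping the GPT-2 model and tokenizer in that example for Mistral before boosting. A hedged sketch is below; the checkpoint name and hyperparameters are placeholders, not this PR's code.

```python
# Sketch of adapting the linked GPT-2 finetune example to Mistral (illustrative only).
# Assumes transformers>=4.34 and that distributed init (colossalai.launch_from_torch)
# has already happened, as in the original example.
import torch
from transformers import AutoTokenizer, MistralForCausalLM

from colossalai.booster import Booster
from colossalai.booster.plugin import HybridParallelPlugin

model_name = "mistralai/Mistral-7B-v0.1"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = MistralForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

plugin = HybridParallelPlugin(tp_size=2, pp_size=1,
                              enable_flash_attention=True,
                              enable_fused_normalization=True)
booster = Booster(plugin=plugin)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# booster.boost returns the sharded model/optimizer, as in the GPT example.
model, optimizer, *_ = booster.boost(model, optimizer)
```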

@flybird11111 flybird11111 marked this pull request as draft November 10, 2023 02:25
@flybird11111 flybird11111 marked this pull request as ready for review November 10, 2023 02:25
@flybird11111 flybird11111 requested a review from a team as a code owner November 10, 2023 02:25
@flybird11111
Contributor

> Hi @zhoumengbo, Unfortunately, I have no idea why this PR has not been merged yet.
>
> I modified the fine-tuning code from https://github.com/hpcaitech/ColossalAI/blob/main/examples/language/gpt/hybridparallelism/finetune.py

Hi, this PR lacks unit tests; we would greatly appreciate it if you could help add them. Alternatively, you could submit it to the feature/shardformer branch, and we will enhance the tests for this PR. Thank you.

@flybird11111 flybird11111 marked this pull request as draft November 10, 2023 03:13
@flybird11111 flybird11111 marked this pull request as ready for review November 10, 2023 03:13
@eric8607242 eric8607242 marked this pull request as draft November 14, 2023 02:07
@eric8607242 eric8607242 marked this pull request as ready for review November 14, 2023 02:11
@eric8607242 eric8607242 changed the base branch from main to feature/shardformer November 23, 2023 14:42
@eric8607242 eric8607242 changed the base branch from feature/shardformer to main November 23, 2023 14:43
@eric8607242
Contributor Author

Closing this PR, as I have created a new PR targeting feature/shardformer in #5103.
