[shardformer] Add dropout layer in shard model and refactor policy api #3949

Merged
FrankLeeeee merged 10 commits into hpcaitech:feature/shardformer from FoolPlayer:use_dropout
Jun 12, 2023

FoolPlayer commented Jun 9, 2023

📌 Checklist before creating the PR

  • I have created an issue for this PR for traceability
  • The title follows the standard format: [doc/gemini/tensor/...]: A concise description
  • I have added relevant tags if possible for us to better distinguish different PRs

🚨 Issue number

Fixes #3935 and #3958

📝 What does this PR do?

  • Add a dropout layer to the sharded model.
  • Refactor the structure of the policy and the logic of the sharder.
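
For readers unfamiliar with why a sharded model needs its own dropout layer, here is a minimal, dependency-free sketch of the general idea. It is not the code from this PR. It assumes that each tensor-parallel rank derives its own RNG seed from a shared base seed, so that the dropout masks applied to different shards of one activation are reproducible per rank. The names `rank_seed`, `dropout`, and `BASE_SEED` are hypothetical and exist only for illustration.

```python
import random


def rank_seed(base_seed: int, rank: int) -> int:
    """Derive a reproducible, rank-specific seed (assumed scheme: base + rank)."""
    return base_seed + rank


def dropout(values, p, rng, training=True):
    """Inverted dropout: zero each element with probability p and
    scale the surviving elements by 1 / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(values)
    scale = 1.0 / (1.0 - p)
    return [v * scale if rng.random() >= p else 0.0 for v in values]


# Hypothetical setup: one activation row split column-wise across 2 ranks.
BASE_SEED = 1234
shards = {0: [1.0] * 8, 1: [1.0] * 8}

# Each rank draws its mask from its own generator, so the result for a
# given rank is the same on every run with the same base seed.
outputs = {
    rank: dropout(shard, p=0.5, rng=random.Random(rank_seed(BASE_SEED, rank)))
    for rank, shard in shards.items()
}
```

In a real tensor-parallel setup the same idea would be applied to the framework's own RNG state rather than to Python's `random` module.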

💥 Checklist before requesting a review

  • I have linked my PR to an issue (instruction)
  • My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
  • I have performed a self-review of my code
  • I have added thorough tests.
  • I have added docstrings for all the functions/methods I implemented

⭐️ Do you enjoy contributing to Colossal-AI?

  • 🌝 Yes, I do.
  • 🌚 No, I don't.

Tell us more if you don't enjoy contributing to Colossal-AI.

FoolPlayer changed the title from "[shardformer] Add dropout layer in shard model" to "[shardformer] Add dropout layer in shard model and refactor policy api" on Jun 12, 2023
@FoolPlayer (Contributor, Author)

Local pytest for shardformer:

[image: screenshot of the local pytest run]

FrankLeeeee merged commit fd8b9c5 into hpcaitech:feature/shardformer Jun 12, 2023
FoolPlayer deleted the use_dropout branch June 12, 2023 09:00
FrankLeeeee pushed a commit that referenced this pull request Jun 26, 2023
#3949)

* add dist dropout in model

* update docstring and bert policy with dropout

* refactor basepolicy and sharded, update bert

* update format

* update gpt2 policy

* update bert policy

* remove unused code

* update readme for new policy usage
flybird11111 pushed a commit to flybird11111/ColossalAI that referenced this pull request Jul 3, 2023
hpcaitech#3949)

FrankLeeeee pushed a commit that referenced this pull request Jul 4, 2023
#3949)

ver217 pushed a commit to ver217/ColossalAI that referenced this pull request Jul 13, 2023
hpcaitech#3949)


Development

Successfully merging this pull request may close these issues.

  • [shardformer] Refactor some module api
  • [shardformer] add dropout layer to sharded layer
