
[moe]: add flash attention & optimize top2 router#4712

Merged
oahzxl merged 6 commits into hpcaitech:feature/MoE from cwher:close-moe on Sep 18, 2023

Conversation

@cwher
Contributor

@cwher cwher commented Sep 14, 2023

📌 Checklist before creating the PR

  • I have created an issue for this PR for traceability
  • The title follows the standard format: [doc/gemini/tensor/...]: A concise description
  • I have added relevant tags if possible for us to better distinguish different PRs

🚨 Issue number

Link this PR to your issue with words like fixed to automatically close the linked issue upon merge

e.g. fixed #1234, closed #1234, resolved #1234

📝 What does this PR do?

Summarize your work here.
if you have any plots/diagrams/screenshots/tables, please attach them here.

💥 Checklist before requesting a review

  • I have linked my PR to an issue (instruction)
  • My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
  • I have performed a self-review of my code
  • I have added thorough tests
  • I have added docstrings for all the functions/methods I implemented

⭐️ Do you enjoy contributing to Colossal-AI?

  • 🌝 Yes, I do.
  • 🌚 No, I don't.

Tell us more if you don't enjoy contributing to Colossal-AI.

Comment thread on examples/language/openmoe/model/modeling_openmoe.py (outdated)
Comment thread on examples/language/openmoe/benchmark/benchmark_train.py (outdated)
@cwher cwher requested a review from oahzxl September 15, 2023 01:56
@oahzxl
Contributor

oahzxl commented Sep 15, 2023

Need to fix the import error in the test.

@oahzxl
Contributor

oahzxl commented Sep 15, 2023

> need to fix import error in test

This PR may be helpful: e2c0e7f#diff-60052b0dcfac281a5cb4066eec7883fb118236af31fe84e240742c44dbe0c034

@cwher cwher changed the base branch from feature/moe to feature/MoE September 18, 2023 02:35
@cwher cwher changed the title from "[moe]: add flash attention" to "[moe]: add flash attention & optimize top2 router" Sep 18, 2023
@oahzxl oahzxl merged commit f59a6c1 into hpcaitech:feature/MoE Sep 18, 2023
oahzxl pushed a commit to oahzxl/ColossalAI that referenced this pull request Oct 26, 2023
* feat: add benchmark train

* perf: use flash_attn

* fix: modify benchmark config

* fix: check flash attn installation

* fix: update config with args

* perf: optimize top2 router
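
The commit messages above mention two techniques: checking the flash-attn installation and optimizing the top-2 router. A minimal sketch of both ideas follows; all names here (`HAS_FLASH_ATTN`, `top2_route`) are illustrative assumptions, not the actual ColossalAI implementation. The first pattern treats flash-attn as an optional dependency and records whether it is importable; the second picks the two highest-scoring experts per token and renormalizes their softmax gate weights.

```python
import math

# "fix: check flash attn installation" pattern: flash-attn is an optional
# dependency, so probe for it at import time instead of failing hard.
try:
    from flash_attn import flash_attn_func  # noqa: F401
    HAS_FLASH_ATTN = True
except ImportError:
    HAS_FLASH_ATTN = False


def top2_route(logits):
    """Hypothetical top-2 gate: return [(expert_index, weight), ...].

    logits: list of per-expert gate scores for a single token.
    The two largest softmax probabilities are kept and renormalized
    so the returned weights sum to 1.
    """
    # Numerically stable softmax over the expert scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Indices of the two highest-probability experts.
    top2 = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:2]
    # Renormalize the two kept weights.
    norm = probs[top2[0]] + probs[top2[1]]
    return [(i, probs[i] / norm) for i in top2]
```

Routing a token with scores `[0.1, 2.0, 0.3, 1.0]` selects experts 1 and 3, with the larger share of weight going to expert 1.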
