From a9237724fcc71ef2ab49533602e0ad2fdd763226 Mon Sep 17 00:00:00 2001
From: binmakeswell
Date: Tue, 28 Nov 2023 17:11:42 +0800
Subject: [PATCH 1/3] [doc] add moe news

---
 examples/language/openmoe/README.md | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/examples/language/openmoe/README.md b/examples/language/openmoe/README.md
index a0821a5330a4..45657f192024 100644
--- a/examples/language/openmoe/README.md
+++ b/examples/language/openmoe/README.md
@@ -1,6 +1,15 @@
 ## OpenMoE
 [OpenMoE](https://github.com/XueFuzhao/OpenMoE) is the open-source community's first decoder-only MoE transformer. OpenMoE is implemented in Jax, and [Colossal-AI](https://github.com/hpcaitech/ColossalAI) has pioneered an efficient open-source support for this model in PyTorch, enabling a broader range of users to participate in and use this model. The following example of [Colossal-AI](https://github.com/hpcaitech/ColossalAI) demonstrates finetune and inference methods.
+
+

+ +

+
+* [2023/11] [Enhanced MoE Parallelism, Open-source MoE Model Training Can Be 9 Times More Efficient](https://www.hpc-ai.tech/blog/enhanced-moe-parallelism-open-source-moe-model-training-can-be-9-times-more-efficient)
+[[code]](https://github.com/hpcaitech/ColossalAI/tree/main/examples/language/openmoe)
+[[blog]](https://www.hpc-ai.tech/blog/enhanced-moe-parallelism-open-source-moe-model-training-can-be-9-times-more-efficient)
+
 
 ## Usage
 
 ### 1. Installation

From 00b8c3702d09a1536f6fbc36ccabb26264765a4c Mon Sep 17 00:00:00 2001
From: binmakeswell
Date: Tue, 28 Nov 2023 17:12:29 +0800
Subject: [PATCH 2/3] [doc] add moe news

---
 README.md | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 1898d255e31c..98592b85c8ca 100644
--- a/README.md
+++ b/README.md
@@ -25,7 +25,8 @@
 ## Latest News
-* [2023/09] [One Half-Day of Training Using a Few Hundred Dollars Yields Similar Results to Mainstream Large Models, Open-Source and Commercial-Free Domain-Specific Llm Solution](https://www.hpc-ai.tech/blog/one-half-day-of-training-using-a-few-hundred-dollars-yields-similar-results-to-mainstream-large-models-open-source-and-commercial-free-domain-specific-llm-solution)
+* [2023/11] [Enhanced MoE Parallelism, Open-source MoE Model Training Can Be 9 Times More Efficient](https://www.hpc-ai.tech/blog/enhanced-moe-parallelism-open-source-moe-model-training-can-be-9-times-more-efficient)
+* [2023/09] [One Half-Day of Training Using a Few Hundred Dollars Yields Similar Results to Mainstream Large Models, Open-Source and Commercial-Free Domain-Specific LLM Solution](https://www.hpc-ai.tech/blog/one-half-day-of-training-using-a-few-hundred-dollars-yields-similar-results-to-mainstream-large-models-open-source-and-commercial-free-domain-specific-llm-solution)
 * [2023/09] [70 Billion Parameter LLaMA2 Model Training Accelerated by 195%](https://www.hpc-ai.tech/blog/70b-llama2-training)
 * [2023/07] [HPC-AI Tech Raises 22 Million USD in Series A Funding](https://www.hpc-ai.tech/blog/hpc-ai-tech-raises-22-million-usd-in-series-a-funding-to-fuel-team-expansion-and-business-growth)
 * [2023/07] [65B Model Pretraining Accelerated by 38%, Best Practices for Building LLaMA-Like Base Models Open-Source](https://www.hpc-ai.tech/blog/large-model-pretraining)
@@ -52,6 +53,7 @@ Parallel Training Demo