Describe the feature
Hi,
A new LLM called Mistral was released recently, and it shows very impressive performance.
According to the official documentation, Mistral-7B outperforms LLaMAv2-13B on all benchmarks.
I think it would be very exciting to support this powerful model in ShardFormer!