Mistral has released Mixtral 8x7B, a high-quality sparse mixture-of-experts (SMoE) model with open weights. The model is now live on the Together Platform, running at up to 100 tokens per second with competitive pricing of $0.0006/1K tokens. Mixtral outperforms Llama 2 70B on most benchmarks, making it the strongest open-weight model with a permissive license and offering the best cost/performance trade-offs. It handles a context of 32k tokens and supports English, French, Italian, German, and Spanish. It also shows strong performance in code generation and can be finetuned into an instruction-following model that achieves a score of 8.3 on MT-Bench. Users can switch from OpenAI to Mixtral by adding their API key, changing the base URL, and selecting one of the open-source models, as shown in the sketch below.

Additionally, the RedPajama-V2 Dataset is conceptualized as a foundation for creating high-quality datasets and should be filtered depending on the application's intended use.
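As a minimal sketch of what that switch might look like with the official `openai` Python client: only the API key, base URL, and model name change. The endpoint `https://api.together.xyz/v1` and the model identifier `mistralai/Mixtral-8x7B-Instruct-v0.1` are assumptions here and may differ for your account or setup.

```python
# Minimal sketch: pointing the OpenAI client at an OpenAI-compatible endpoint.
# The base URL and model name below are assumptions; adjust as needed.
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["TOGETHER_API_KEY"],        # your Together API key
    base_url="https://api.together.xyz/v1",        # OpenAI-compatible endpoint (assumed)
)

response = client.chat.completions.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # open-source model identifier (assumed)
    messages=[
        {"role": "user", "content": "Explain what a sparse mixture-of-experts model is."}
    ],
    max_tokens=256,
)

print(response.choices[0].message.content)
```

The rest of an existing OpenAI-based codebase can stay as-is, since the request and response shapes follow the same chat-completions format.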