Mistral Small 3, Mistral's latest open-weight model, is now available on the Fireworks platform under the Apache 2.0 license, offering significant speed and efficiency advantages, including generation speeds of around 150 tokens per second and a 32K context window. It surpasses Llama 3.3 70B on pretraining benchmarks while running three times faster on the same hardware, making it well suited to conversational AI, function calling, and specialized fine-tuning in fields such as legal and finance. Fireworks advocates combining small and large models into flexible, cost-effective, high-performing compound AI systems: Mistral Small 3 handles latency-sensitive tasks, while larger models such as DeepSeek V3 or GPT-4o tackle complex reasoning. Mistral Small 3 is available for experimentation and deployment on Fireworks in both serverless and on-demand configurations, making it straightforward to integrate into diverse AI workflows.
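
To make the compound-system idea concrete, here is a minimal sketch of routing requests on Fireworks' OpenAI-compatible API: latency-sensitive prompts go to Mistral Small 3, while requests flagged as needing deeper reasoning are escalated to a larger model. The model identifiers and the routing heuristic are illustrative assumptions, not taken from the article; check the Fireworks model library for the exact names available to your account.

```python
# Hypothetical routing sketch for a compound AI system on Fireworks.
# Model IDs and the routing flag are assumptions for illustration.
# Requires `pip install openai` and a FIREWORKS_API_KEY environment variable.
import os
from openai import OpenAI

# Fireworks exposes an OpenAI-compatible inference endpoint.
client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

# Assumed model identifiers -- verify against the Fireworks model library.
FAST_MODEL = "accounts/fireworks/models/mistral-small-24b-instruct-2501"
LARGE_MODEL = "accounts/fireworks/models/deepseek-v3"

def answer(prompt: str, needs_deep_reasoning: bool = False) -> str:
    """Route quick, latency-sensitive requests to Mistral Small 3 and
    escalate complex reasoning to a larger model."""
    model = LARGE_MODEL if needs_deep_reasoning else FAST_MODEL
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=512,
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    # Fast path: summarization handled by Mistral Small 3.
    print(answer("Summarize this support ticket in one sentence."))
    # Escalated path: multi-step planning handed to the larger model.
    print(answer("Draft a multi-step migration plan for our database.",
                 needs_deep_reasoning=True))
```

In practice the `needs_deep_reasoning` flag would be replaced by whatever routing signal fits your workflow, such as a classifier, a token-length threshold, or an explicit user choice; the point is simply that the small model absorbs the high-volume, low-latency traffic while the large model is reserved for harder requests.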