Mellum2 Goes Open Source: A Fast Model for AI Workflows | The JetBrains AI Blog
Blog post from JetBrains
Mellum2 is an AI model developed by JetBrains and now released as open source under the Apache 2.0 license. It targets the core constraints of production AI systems, namely latency, throughput, and cost, primarily in software engineering environments. The model uses a Mixture-of-Experts architecture with 12 billion total parameters, of which only 2.5 billion are activated per token, which lowers compute cost and speeds up inference. It specializes in natural language and code, which suits it to tasks such as routing, summarization, and intermediate reasoning in AI workflows. Its design follows the idea of "focal models": fast, specialized components that handle high-frequency tasks efficiently, rather than relying solely on large, multimodal models. The post positions Mellum2 as a cost-effective, high-performance option for routing AI workloads, building low-latency retrieval-augmented generation (RAG) pipelines, and running private, local AI deployments.
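To make the "12 billion total, 2.5 billion active" figure concrete, here is a minimal Python sketch of the arithmetic, along with a generic top-k expert router of the kind Mixture-of-Experts layers commonly use. Only the two parameter counts come from the post. The "2 FLOPs per active parameter per token" rule of thumb, the function names, and the four-expert, top-2 example are illustrative assumptions, not details of Mellum2's actual architecture or code.

```python
import math

# Figures stated in the post.
TOTAL_PARAMS = 12e9     # total parameters
ACTIVE_PARAMS = 2.5e9   # parameters activated per token


def active_fraction(total: float, active: float) -> float:
    """Share of the weights that take part in computing a single token."""
    return active / total


def forward_flops_per_token(params: float) -> float:
    """Rough forward-pass cost per token.

    Assumption: the common ~2 FLOPs per participating parameter estimate,
    which ignores attention-over-context cost and router overhead.
    """
    return 2.0 * params


def top_k_route(router_logits: list[float], k: int) -> list[tuple[int, float]]:
    """Generic top-k gating (hypothetical, not Mellum2's router).

    Keeps the k highest-scoring experts and renormalizes their scores
    with a softmax, returning (expert_index, weight) pairs.
    """
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    chosen = order[:k]
    peak = max(router_logits[i] for i in chosen)  # subtract for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    norm = sum(exps)
    return [(i, e / norm) for i, e in zip(chosen, exps)]


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    dense = forward_flops_per_token(TOTAL_PARAMS)
    sparse = forward_flops_per_token(ACTIVE_PARAMS)
    print(f"active share of weights: {frac:.1%}")
    print(f"estimated compute ratio (dense / sparse): {dense / sparse:.1f}x")
    # Four hypothetical experts, two selected per token.
    print(top_k_route([0.1, 2.0, -1.0, 1.5], k=2))
```

Under these assumptions, roughly 21% of the weights participate in each token, so the estimated per-token compute is about 4.8 times lower than for a dense model of the same total size. All 12 billion parameters still have to be held in memory, so the saving applies to compute and latency rather than to model footprint.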