Introducing Mellum2: A 12B Mixture-of-Experts Model by JetBrains

Post Details

Company

Hugging Face

Date Published

June 1, 2026

Author

Nikita Pavlichenko

Word Count

600

Company Posts That Month

94

Language

-

Hacker News Points

-

Post removed?

No

Source URL

huggingface.co/blog/JetBrains/mellum2-launch

Summary

JetBrains has introduced Mellum2, a 12-billion-parameter Mixture-of-Experts (MoE) model for natural language and code tasks. It activates only 2.5 billion parameters per token, which keeps inference high-throughput and low-latency. Released under the Apache 2.0 license, Mellum2 targets routing, retrieval-augmented generation (RAG), summarization, and private deployments, particularly in software engineering contexts. According to the post, it delivers competitive benchmark results at more than twice the inference speed of similarly sized models, which suits it to high-frequency tasks inside larger AI systems. The architecture handles text and code rather than multimodal input, a focus intended to improve efficiency and reduce cost for real-time workloads, and the post positions the model as a specialized component within such systems. The model is available for download on Hugging Face, and a technical report details its architecture, training, and evaluation.
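
To make the "2.5 billion of 12 billion parameters per token" figure concrete, here is a minimal, illustrative sketch of top-k expert routing, the general mechanism behind sparse MoE models. It is not Mellum2's actual router: the number of experts, the value of k, the router logits, and the function names `route_top_k` and `active_fraction` are all assumptions made for illustration. Only the two parameter counts come from the summary above.

```python
import math

# Parameter counts taken from the summary, in billions.
TOTAL_PARAMS_B = 12.0
ACTIVE_PARAMS_B = 2.5


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts for one token.

    Returns (expert_index, weight) pairs; weights are renormalized over
    the selected experts only. Hypothetical helper, not a Mellum2 API.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def active_fraction(active_b, total_b):
    """Share of the model's parameters used for a single token."""
    return active_b / total_b


if __name__ == "__main__":
    # Assumed toy setup: 4 experts, 2 active per token.
    logits = [0.1, 2.0, -1.0, 1.5]
    for expert, weight in route_top_k(logits, k=2):
        print(f"expert {expert}: weight {weight:.3f}")
    share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"active share per token: {share:.1%}")
```

Because only the selected experts run for a given token, per-token compute scales with the active parameter count (about a fifth of the total here) rather than with the full model size. This is the usual explanation for the throughput advantage of sparse MoE models.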

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
RAG	3	992	256	104	-53%
Real-time	1	5,515	1,316	255	-4%
