
EuroLLM-22B

Blog post from HuggingFace

Post Details
Company
HuggingFace
Author
EuroLLM Team, Miguel Moura Ramos, Duarte Alves, and Hippolyte Gisserot-Boukhlef
Word Count
1,162
Summary

EuroLLM-22B is a fully open multilingual language model developed in Europe, supporting the 24 official EU languages and 11 additional international languages. Built on EuroHPC infrastructure, it was trained on approximately 4 trillion tokens using 400 Nvidia H100 GPUs on the MareNostrum5 supercomputer. The model performs strongly on machine translation and general benchmarks, outperforming models such as Gemma-3-27B, Qwen-3-32B, and Apertus-70B. Its development involved several European institutions and used a multi-phase training process to ensure high-quality language understanding and generation. EuroLLM-22B offers a 32K-token context window and handles multi-turn conversations, making it well suited to diverse language tasks. Its creation was supported by grants from EuroHPC, the EU's Horizon Europe Research and Innovation Actions, and the Portuguese Recovery and Resilience Plan, reflecting a collaborative effort across multiple research centers and universities.