
Granite 4.1 LLMs: How They’re Built

Blog post from HuggingFace

Post Details

Company: HuggingFace
Date Published: -
Author: Yousaf Shah
Word Count: 2,848
Language: -
Hacker News Points: -
Summary

Granite 4.1 is a family of open-source, dense, decoder-only large language models (LLMs) developed by IBM, available in 3-billion-, 8-billion-, and 30-billion-parameter sizes. The models are pre-trained on approximately 15 trillion tokens through a multi-phase pipeline that emphasizes data quality and culminates in a long-context extension phase that expands the context window to 512,000 tokens. Post-training consists of supervised fine-tuning on roughly 4.1 million curated samples, followed by a multi-stage reinforcement learning process that strengthens math, coding, instruction following, and general conversation. Notably, the dense 8B model outperforms the previous hybrid Granite 4.0-H-Small despite its simpler architecture.

Released under the Apache 2.0 license, the Granite 4.1 models aim to deliver high performance with predictable latency and lower operational costs, making them well suited to enterprise applications. FP8-quantized variants are also provided to improve inference efficiency, significantly reducing GPU memory usage and disk footprint.
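For orientation, here is a minimal sketch of loading one of these checkpoints with Hugging Face transformers. The repository ID is an assumption based on IBM's usual naming on the Hub (the post does not spell out exact IDs); check the ibm-granite organization for the released names.

```python
# Minimal sketch of loading a Granite 4.1 instruct checkpoint with
# Hugging Face transformers. The repo ID below is an assumed name
# following IBM's Hub conventions, not confirmed by the post.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-4.1-8b-instruct"  # assumed repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the FP8 variant would be a separate repo
    device_map="auto",           # requires the accelerate package
)

messages = [{"role": "user", "content": "What is new in Granite 4.1?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```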
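To make the FP8 claim concrete, a back-of-the-envelope calculation of weight memory at BF16 versus FP8 (2 bytes versus 1 byte per parameter); the figures cover weights only, not KV cache or activations:

```python
# Rough weight-only memory footprint for each Granite 4.1 size,
# comparing BF16 (2 bytes/param) with FP8 (1 byte/param).
# KV cache, activations, and runtime overhead come on top.
GIB = 1024 ** 3

models = {"3B": 3e9, "8B": 8e9, "30B": 30e9}
bytes_per_param = {"BF16": 2, "FP8": 1}

for name, n_params in models.items():
    line = ", ".join(
        f"{prec}: {n_params * width / GIB:5.1f} GiB"
        for prec, width in bytes_per_param.items()
    )
    print(f"Granite 4.1 {name} -> {line}")
```

Halving the bytes per parameter is where the "significantly reducing GPU memory usage and disk footprint" claim comes from; the exact savings depend on which layers the quantization scheme leaves in higher precision.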