
IBM's Granite 4.0 is now on Replicate

Blog post from Replicate

Post Details
Word Count: 618
Language: English
Summary

IBM's Granite 4.0 is the latest family of open-source small language models designed for efficiency and low cost. A hybrid architecture reduces memory usage enough for the models to run on standard consumer GPUs. With 30 billion parameters, the Granite 4.0 models are particularly suited to document summarization, retrieval-augmented generation (RAG) systems, and AI agents. The architecture combines Mamba-2 layers, whose cost scales linearly with sequence length, with Transformer blocks, so long sequences are handled efficiently. A mixture of experts (MoE) routing strategy activates only a subset of the parameters for each token during inference, which keeps the models performant on less powerful hardware. The models are released under the Apache 2.0 license, so they can be used, modified, and customized for both commercial and non-commercial purposes. Granite models can also be accessed and deployed through platforms such as Replicate and LangChain.
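
The mixture of experts routing mentioned in the summary can be illustrated with a minimal top-k routing sketch. This is not Granite's implementation: the toy experts, the hand-written router scores, and the names `route_top_k` and `moe_forward` are all assumptions made for illustration. In a real model the router scores come from a learned layer applied to each token.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts and normalize their weights."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, router_logits, k=2):
    """Blend the outputs of the selected experts; the others never run."""
    routes = route_top_k(router_logits, k)
    output = sum(weight * experts[i](x) for i, weight in routes)
    return output, [i for i, _ in routes]


# Hypothetical toy experts: each one scales its input and records the call.
calls = []


def make_expert(scale):
    def expert(x):
        calls.append(scale)
        return scale * x
    return expert


experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]

# Hand-written router scores (an assumption; normally produced by the model).
router_logits = [0.1, 2.0, -1.0, 2.0]

output, used = moe_forward(1.0, experts, router_logits, k=2)
print(used, output, len(calls))
```

Only two of the four experts are evaluated for this input, which is the property that lets an MoE model hold many parameters while paying the compute cost of a few of them per token.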