Gemma 4 on DeepInfra: Fast & Scalable Open AI Models

Post Details

Company

Deepinfra

Date Published

May 25, 2026

Author

Deep

Word Count

1,488

Company Posts That Month

23

Language

English

Hacker News Points

-

Post removed?

No

Source URL

deepinfra.com/blog/gemma-4-on-deepinfra

Summary

Gemma 4, developed by Google DeepMind and available on DeepInfra, is a family of AI models designed to offer significant improvements over its predecessor, Gemma 3, particularly in areas like mathematics, coding, and agentic tasks. The models, ranging from sub-5B edge-optimized variants to a 31B dense model, leverage a Mixture-of-Experts (MoE) architecture, which activates only a fraction of the total parameters during inference, making them efficient and scalable. Notably, the 26B A4B variant outperforms the previous version with nearly tripled scores on benchmarks like AIME 2026 and LiveCodeBench v6. These models support a wide range of capabilities, including native function calling, extensive multimodal processing, and a 256K token context window, all under an Apache 2.0 license that allows for unrestricted commercial use. DeepInfra provides a straightforward pricing model and an OpenAI-compatible API for seamless integration, appealing to developers who require powerful AI solutions without complex infrastructure setups.

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
Vector Search	3	2,268	422	128	+30%
AI Model Fine-tuning	1	615	196	69	+46%
LLM	1	9,074	1,640	224	+53%

Use This Data

Use this post, company, and trend context to find content marketing opportunities, perform competitive analysis, or address product feature gaps via the Plushcap MCP server or the Plushcap API.