DeepSeek V4 Pro Is Now Available on DeepInfra
Blog post from Deepinfra
DeepSeek V4 Pro, released by DeepInfra on April 24, 2026, is a cutting-edge Mixture of Experts model featuring 1.6 trillion parameters with a focus on efficiency and reasoning depth, tailored for tasks requiring long-context retrieval and advanced reasoning. The model boasts significant architectural advancements, such as a Hybrid Attention Architecture that combines Compressed Sparse Attention and Heavily Compressed Attention, resulting in a substantial reduction of inference FLOPs and KV cache use at a 1-million-token context window. It provides three distinct reasoning modes, allowing developers to balance computational cost and output quality, and is pre-trained on over 32 trillion tokens with weights available under an MIT license on Hugging Face, enabling self-hosting. DeepSeek V4 Pro outperforms other models on competitive coding and agentic tasks, with its benchmarks demonstrating superior performance in reasoning-heavy scenarios. Offered through DeepInfra's managed infrastructure, it features an OpenAI-compatible endpoint, ensuring seamless integration without infrastructure overhead, and is priced on a usage basis, emphasizing its practicality for long-running workloads.