Home / Companies / Modular / Blog / Post Details
Content Deep Dive

Modverse #46: MAX 25.1, MAX Builds, and Democratizing AI Compute

Blog post from Modular

Post Details
Company
Date Published
Author
Caroline Frasca
Word Count
952
Language
English
Hacker News Points
-
Summary

MAX 25.1 introduces significant advancements in AI development, focusing on enhancing agentic and LLM workflows with features like GPU programming, GPU-accelerated embeddings, and OpenAI-compatible function calling. This release debuts MAX Builds, a centralized hub for GenAI models and application recipes, and shifts to a nightly release model, enabling developers to access new features and community-driven improvements continuously. The update includes high-performance optimizations such as paged attention and prefix caching, offline batch inference, and streamlined deployment capabilities from local to cloud environments. MAX Serve's new features, such as Paged Attention and Prefix Caching, improve LLM inference, while community engagement is encouraged through forums, live streams, and events, including a keynote by Chris Lattner at the Democratize Intelligence conference. The release also includes novel projects like CombustUI for Mojo and various community contributions, and emphasizes the advantages of the MAX Engine for accelerating GenAI workloads without relying on CUDA, showcased in upcoming events like NVIDIA GTC.