Powering the agents: Workers AI now runs large models, starting with Kimi K2.5

Blog post from Cloudflare

Post Details
Company: Cloudflare
Date Published:
Author: Michelle Chen, Kevin Flansburg, Ashish Datta, and Kevin Jain
Word Count: 1,949
Language: English
Hacker News Points: -
Summary

Cloudflare is extending Workers AI to run large open-source models, starting with Moonshot AI's Kimi K2.5, to support building and deploying AI agents on its platform. Kimi K2.5 offers a 256k context window, multi-turn tool calling, vision inputs, and structured outputs, which makes it well suited to agentic tasks. Cloudflare engineers have adopted Kimi K2.5 in their internal tools and found it a cost-effective alternative to larger proprietary models, cutting costs by 77% in some use cases. To serve large models, the platform relies on techniques such as custom kernels, data parallelization, and prefix caching, which improve performance and lower cost. Cloudflare has also introduced session affinity headers, which raise cache hit rates, and a revamped asynchronous API for managing inference requests efficiently. Together, these changes let enterprises deploy AI models at scale and move to cost-effective open-source models.
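
To make the link between session affinity and prefix-cache hit rates concrete, here is a minimal toy simulation. It is a sketch under stated assumptions, not Cloudflare's implementation: `Replica`, `route_by_affinity`, and `simulate` are hypothetical names, each session's shared prompt prefix is modeled as an opaque string, and a real serving stack would cache attention key/value state for token prefixes rather than strings.

```python
import hashlib
from itertools import count


class Replica:
    """Toy inference replica that remembers which prompt prefixes it has seen."""

    def __init__(self) -> None:
        self.cached_prefixes: set[str] = set()
        self.hits = 0
        self.misses = 0

    def handle(self, prefix: str) -> None:
        # A hit means the prefix was already processed on this replica.
        if prefix in self.cached_prefixes:
            self.hits += 1
        else:
            self.misses += 1
            self.cached_prefixes.add(prefix)


def route_by_affinity(session_id: str, n_replicas: int) -> int:
    """Map a session to a fixed replica with a stable hash (hypothetical scheme)."""
    digest = hashlib.sha256(session_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_replicas


def simulate(sessions: list[str], turns: int, n_replicas: int, use_affinity: bool) -> float:
    """Return the overall cache hit rate for a multi-turn workload."""
    replicas = [Replica() for _ in range(n_replicas)]
    round_robin = count()  # fallback router: spread requests evenly
    for _ in range(turns):
        for session_id in sessions:
            if use_affinity:
                idx = route_by_affinity(session_id, n_replicas)
            else:
                idx = next(round_robin) % n_replicas
            # Assumption: every turn of a session reuses the same prefix.
            replicas[idx].handle(session_id)
    hits = sum(r.hits for r in replicas)
    total = hits + sum(r.misses for r in replicas)
    return hits / total


if __name__ == "__main__":
    sessions = [f"s{i}" for i in range(5)]
    print(simulate(sessions, turns=4, n_replicas=4, use_affinity=True))   # 0.75
    print(simulate(sessions, turns=4, n_replicas=4, use_affinity=False))  # 0.0
```

With affinity, only the first turn of each session misses, so 15 of 20 requests hit. With round-robin routing in this particular setup, each session lands on a different replica every turn and never finds its prefix cached. Real workloads fall somewhere in between, but the direction of the effect is the same.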