The Cloudflare team has been working to improve the quality and usability of Workers AI, a distributed inference platform built to make inference accessible to everyone. They have introduced several new features: speculative decoding, which speeds up inference times by 2-4x; an asynchronous batch API for large workloads; and expanded LoRA support for more customized responses. The team has also updated pricing and added a new dashboard to make the platform easier to use. In addition, they have released four new models with improved performance and capabilities compared to the existing catalog. Together, these updates aim to make Workers AI faster, more reliable, and more customizable while reducing costs.
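For context, a typical Workers AI request from a Worker goes through the `env.AI.run()` binding. The sketch below shows a minimal text-generation call; the model name is illustrative, and the commented-out `lora` field is an assumption about how a fine-tuned adapter might be attached rather than a confirmed parameter, so treat it as a placeholder and check the Workers AI docs for the exact shape.

```ts
// Minimal sketch of a Workers AI call, assuming the standard env.AI.run()
// binding (declared in the project's wrangler configuration). Model name
// and optional fields are illustrative, not an exact reproduction of the
// platform's API surface.
export interface Env {
  AI: Ai; // Workers AI binding type from @cloudflare/workers-types
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const result = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
      prompt: "Summarize the latest Workers AI updates in one sentence.",
      // Hypothetical: attach a LoRA adapter by name if the model supports it.
      // lora: "my-finetuned-adapter",
    });

    // Return the model output as JSON to the caller.
    return Response.json(result);
  },
};
```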