The Cloudflare team has been working to improve the quality and usability of Workers AI, a distributed inference platform built to make inference accessible to everyone. They have introduced several new features: speculative decoding, which speeds up inference times by 2-4x; an asynchronous batch API for large workloads; and expanded LoRA support for more customized responses. The team has also updated pricing and added a new dashboard to make the platform easier to use. In addition, they have released four new models with improved performance and capabilities compared to the existing catalog. Together, these updates aim to make Workers AI faster, more reliable, and more customizable while reducing costs.
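For context, a typical Workers AI request from a Worker goes through the `env.AI.run()` binding. The sketch below shows a minimal text-generation call; the model name is illustrative, and the commented-out `lora` field is an assumption about how a fine-tuned adapter might be attached rather than a confirmed parameter, so treat it as a placeholder and check the Workers AI docs for the exact shape.

```ts
// Minimal sketch of a Workers AI call, assuming the standard env.AI.run()
// binding (declared in the project's wrangler configuration). Model name
// and optional fields are illustrative, not an exact reproduction of the
// platform's API surface.
export interface Env {
  AI: Ai; // Workers AI binding type from @cloudflare/workers-types
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const result = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
      prompt: "Summarize the latest Workers AI updates in one sentence.",
      // Hypothetical: attach a LoRA adapter by name if the model supports it.
      // lora: "my-finetuned-adapter",
    });

    // Return the model output as JSON to the caller.
    return Response.json(result);
  },
};
```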