
LiteRT: Maximum performance, simplified

Blog post from Google Cloud

Post Details

Company: Google Cloud
Date Published:
Author: Mogan Shieh, Terry Heo, and Jingjiang Li
Word Count: 1,410
Language: English
Hacker News Points: -
Summary

Over the past decade, mobile phones have gained powerful accelerators such as GPUs and NPUs, which can run AI models up to 25x faster than a CPU while consuming as little as one-fifth of the power. Taking advantage of that hardware, however, has meant wrestling with hardware-specific APIs and vendor-specific SDKs.

To remove that friction, the Google AI Edge team has introduced a set of improvements to LiteRT: a new API that simplifies on-device ML inference, cutting-edge GPU acceleration, and NPU support developed with MediaTek and Qualcomm. The release features MLDrift for superior GPU performance, a uniform method for developing and deploying models across various NPUs, and an advanced TensorBuffer API that reduces memory overhead. Asynchronous execution lets workloads run in parallel across different processors, improving the responsiveness and efficiency of AI applications on mobile devices.

Together, these advancements give developers the tools to maximize AI model performance on mobile platforms, with further enhancements and broader support anticipated in the coming year.
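The summary describes the new single-API inference flow only in prose; the sketch below shows roughly what it looks like in Kotlin. The class and method names (`CompiledModel`, `Accelerator`, the buffer helpers) follow LiteRT's announced Kotlin API as best understood, but exact signatures and the package path should be treated as approximations, and the model filename is hypothetical.

```kotlin
import com.google.ai.edge.litert.Accelerator
import com.google.ai.edge.litert.CompiledModel

// Minimal sketch of the simplified LiteRT inference flow described above.
// Names are approximated from the announcement; treat signatures as indicative.
fun runInference(context: android.content.Context, input: FloatArray): FloatArray {
    // A single option selects the accelerator -- no vendor-specific SDK plumbing.
    val model = CompiledModel.create(
        context.assets,
        "model.tflite",                        // hypothetical asset name
        CompiledModel.Options(Accelerator.GPU) // or Accelerator.NPU / Accelerator.CPU
    )

    // The runtime allocates the TensorBuffers itself, which lets it place them
    // in accelerator-friendly memory and avoid extra copies between processors.
    val inputBuffers = model.createInputBuffers()
    val outputBuffers = model.createOutputBuffers()

    inputBuffers[0].writeFloat(input)
    model.run(inputBuffers, outputBuffers)
    return outputBuffers[0].readFloat()
}
```

The asynchronous execution mentioned in the summary would replace the blocking `run` call with a non-blocking variant, so the CPU can prepare the next input while the GPU or NPU is still working on the current one.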