Google Boosts LiteRT and Gemini Nano for On-Device AI Efficiency

Post Details

Company

SSOJet

Date Published

May 26, 2025

Author

Devraj Patel

Word Count

765

Company Posts That Month

57

Language

English

Hacker News Points

-

Source URL

ssojet.com/blog/google-boosts-litert-and-gemini-nano-for-on-device-ai-efficiency

Summary

The latest release of LiteRT, formerly TensorFlow Lite, brings several enhancements to on-device machine learning: a simplified API, improved GPU acceleration, and support for Qualcomm NPUs, all aimed at running AI models faster on mobile devices while reducing power consumption. The release features MLDrift, a new GPU acceleration implementation that improves performance for models such as CNNs and Transformers, and adds the TensorBuffer API, which minimizes unnecessary data transfers between GPU and CPU memory. Asynchronous execution is also supported, so different processors can work concurrently. In addition, LiteRT now supports small language models (SLMs) with multimodality, including the Gemma 3 models optimized for mobile and web applications, and brings in Retrieval Augmented Generation (RAG) so that SLMs can draw on application-specific data. Google is also preparing to announce new ML Kit APIs that let developers use Gemini Nano for on-device AI features, offering a more consistent mobile AI experience with less reliance on cloud resources. The post closes by noting SSOJet's API-first platform for secure SSO and user management for enterprises.
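To make the RAG idea in the summary concrete, here is a minimal sketch of the retrieval step in plain Python. It is not the LiteRT or Google AI Edge API: the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`) are hypothetical, and simple word-count cosine similarity stands in for the learned embeddings a real on-device pipeline would use. The resulting prompt would then be passed to whatever SLM the application runs.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents that share vocabulary with the query,
    ordered from most to least similar. Documents with zero overlap are dropped."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, documents, k=2):
    """Prepend the retrieved application-specific snippets to the question.
    The prompt layout here is an illustrative assumption, not a required format."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

In use, an app would keep its own snippets (help text, user notes, product data) in `documents` and call `build_prompt(question, documents)` before invoking the model, so the model answers from local data without a cloud round trip.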

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
Local AI	3	41	16	9	+32%
RAG	3	899	167	74	-45%
AI Model Fine-tuning	1	671	147	64	-4%
Edge Computing	1	23	14	13	-65%
Real-time	1	3,344	937	222	-51%