Gemini Live API & Lyria 3: Generate Music From Text, Phone & Video Calls

Post Details

Company

Stream

Date Published

May 15, 2026

Author

Amos G.

Word Count

4,437

Company Posts That Month

9

Language

English

Hacker News Points

-

Source URL

getstream.io/blog/gemini-lyria-ai-music

Summary

Google DeepMind's Lyria 3 is an AI music generation model that accepts multimodal prompts, such as text, images, and voice, through the Gemini API. It can create both short 30-second clips and full-length songs from the input prompt, and it covers use cases such as soundtracks, ambient tracks, and cinematic pieces. Integrating it with Vision Agents lets users generate music during video or phone calls via Twilio, with the agent responding by voice in real time. The setup involves configuring several tools, including ngrok for exposing a local server through a public URL, and requires API keys. Users can customize the output through prompt crafting and environment configuration.

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
LLM	14	9,074	1,640	224	+53%
Real-time	6	5,735	1,391	247	-9%
Voice AI	3	3,462	242	43	+46%
AI Agents	1	4,942	1,264	250	+12%