Build an AI Travel Advisor That Speaks with Gemini 3.1 Pro
Blog post from Stream
Google's Gemini 3.1 Pro brings improved reasoning, extended context handling, and enhanced tool use to conversational voice agents. In this guide it serves as the core of a real-time voice AI travel advisor that delivers coherent, natural responses with strong reasoning and storytelling. The setup is built on Vision Agents: Deepgram handles speech-to-text, ElevenLabs handles text-to-speech, and Stream provides real-time communication over WebRTC, with the Vision Agents Gemini plugin connecting the model to the rest of the pipeline. The guide shows how to build and deploy the voice agent, and how to switch between Gemini's standard preview and custom tools variants to find the one that gives the best conversational output.
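To make the moving parts concrete, here is a minimal sketch of the pipeline as plain configuration data. It does not call Vision Agents, Deepgram, ElevenLabs, or Stream; the `VoiceAgentConfig` class, the `with_model` and `describe_pipeline` helpers, and both model identifier strings are assumptions made up for illustration, so check the official model list and plugin docs for the real names.

```python
from dataclasses import dataclass, replace

# Hypothetical identifiers for the two Gemini variants mentioned in the post.
# These strings are placeholders, not confirmed model names.
STANDARD_MODEL = "gemini-3.1-pro-preview"
CUSTOM_TOOLS_MODEL = "gemini-3.1-pro-preview-customtools"


@dataclass(frozen=True)
class VoiceAgentConfig:
    """Illustrative description of the voice pipeline (not a real SDK type)."""
    transport: str = "Stream (WebRTC)"
    stt: str = "Deepgram"
    llm_model: str = STANDARD_MODEL
    tts: str = "ElevenLabs"
    instructions: str = "You are a friendly travel advisor."


def with_model(config: VoiceAgentConfig, use_custom_tools: bool) -> VoiceAgentConfig:
    """Return a copy of the config that targets the chosen Gemini variant."""
    model = CUSTOM_TOOLS_MODEL if use_custom_tools else STANDARD_MODEL
    return replace(config, llm_model=model)


def describe_pipeline(config: VoiceAgentConfig) -> list[str]:
    """List the stages one spoken turn passes through, in order."""
    return [
        f"{config.transport}: receive the caller's audio",
        f"{config.stt}: transcribe speech to text",
        f"{config.llm_model}: generate the advisor's reply",
        f"{config.tts}: synthesize the reply as speech",
        f"{config.transport}: send audio back to the caller",
    ]


if __name__ == "__main__":
    agent = with_model(VoiceAgentConfig(), use_custom_tools=True)
    for step in describe_pipeline(agent):
        print(step)
```

Keeping the model name as a single field is the point of the sketch: switching between the standard and custom tools variants becomes a one-line change, while the speech-to-text, text-to-speech, and transport stages stay untouched.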