
Voice Agents with Gemini Native Audio

3.6K views
Dec 13, 2025
12:21

Google has officially dropped the new Gemini 2.5 Native Audio models, marking a significant leap forward in multimodality. This December update introduces "Thinking" capabilities to the Flash lineup, drastically improving function calling, instruction following, and long-context performance to finally unlock reliable voice agent use cases.

In this video, I break down the technical documentation and pricing, test the model's native expressiveness in AI Studio, and showcase three custom-built agents (a Search Agent, a customer service bot, and a hands-free coding agent) to demonstrate how this model handles complex, asynchronous tasks.

Blog Post: https://blog.google/products/gemini/gemini-audio-model-updates/
Genie Demo App: https://github.com/johnbean393/Genie

Timestamps:
0:00 Intro
0:10 Gemini's Multimodal Evolution
0:48 The Idea of Voice Agents
1:14 Technical Specs
3:17 AI Studio Demo
5:36 Genie App
6:03 Search Agent Demo
8:19 Customer Service Agent Demo
9:47 Coding Agent Demo
11:47 Final Thoughts

#Google #Gemini #VoiceAI #AIAgents #VoiceAgents #ArtificialIntelligence #LLM #TechReview #SoftwareEngineering #GenerativeAI

