Date: 2025-12-15

What you need to know

Gemini Live and Search Live are now using Google's new-and-improved Gemini 2.5 Flash Native Audio model.

The upgraded model is more conversational, can interact with external sources without impeding the chat's flow, and handles complex requests better.

It surpasses the previous 9-25 revision while also topping OpenAI's gpt-realtime model in benchmarks.

Gemini's voice agents are getting a major upgrade this week, as Google is updating the Gemini 2.5 Flash Native Audio model to sound more conversational, better understand user instructions, and fit more easily into complex workflows. The latest Gemini 2.5 Flash Native Audio is rolling out now for developers in Google AI Studio and Vertex AI, and for Gemini Live and Search Live users.

The changes will make it easier to converse with Gemini while chatting live, and can improve the quality of Google's Live Voice Agents. Specifically, the new Gemini 2.5 Flash Native Audio 12-25 model improves multi-turn conversation quality. When you chat with Gemini Live across multiple turns, it'll remember context from earlier turns. The extra context helps create "more cohesive conversations," according to Google.

The model is also better at interacting with external workflows without interrupting the flow of your conversation. It can pick up on your audio cues to figure out when to access these outside functions. These external workflows can provide real-time information that Gemini 2.5 Flash Native Audio can then work into its audio responses.

Gemini's Live Voice Agent is also better at understanding and acting upon complex instructions from a user. Google says these upgrades result in "higher user satisfaction on content completeness." In other words, when interacting with a Live Voice Agent powered by Gemini 2.5 Flash Native Audio 12-25, you may not need to demand to speak to a human representative. The artificial intelligence model might be able to handle more multi-step tasks on its own.

The model is also more reliable overall, with a 90% adherence rate to developer instructions. That's an increase of six percentage points compared to the older Gemini 2.5 Flash Native Audio 9-25 model.

In the ComplexFuncBench Audio benchmark, the latest Gemini 2.5 Flash Native Audio model beats both its predecessor and OpenAI's gpt-realtime model with a score of 71.5%.

The upgraded Gemini 2.5 Flash Native Audio, along with Live Voice Agents, is available now in Google AI Studio and Vertex AI. The model is also debuting in preview in the Gemini API. Android users can find it in action in Gemini Live and Search Live, too.