Google brings Gemini and other AI accessibility tools to Android devices
Improved TalkBack and new Expressive Captions begin rolling out.

What you need to know
- Google is enhancing Android with AI-powered accessibility features, including improved TalkBack with image and screen questioning.
- New "Expressive Captions" will use AI to provide real-time captions across apps, conveying not just words but also the speaker's tone and emotions.
- Developers now have access to Google's open-source Project Euphonia repository to build and personalize speech recognition tools for diverse speech patterns.
- The Expressive Captions rollout covers English in the U.S., U.K., Canada, and Australia on devices running Android 15 and above.
Each year, Google announces a slew of features for Global Accessibility Awareness Day (GAAD), and this year is no different. Today (May 15), Google announced that it is rolling out new updates to its products across Android and Chrome, and adding new resources for developers building speech recognition tools.
Google is highlighting how it's integrating Gemini and AI into its accessibility features to improve usability for people with low vision or hearing loss.
Google is expanding its existing TalkBack screen reader so people can ask questions and get responses about an image they were sent or are viewing. That means the next time a friend texts you a photo of their new guitar, you can get a description of the image and ask follow-up questions about the make and color, or even what else is in the picture.
Taking it a step further, Google says users can get descriptions not only of an image but of the entire screen they're viewing. For instance, "if you’re shopping for the latest sales on your favorite shopping app, you can ask Gemini about the material of an item or if a discount is available," the tech giant explained.
Google is also launching a much-needed feature called Expressive Captions, which helps viewers gauge the mood and emotion of what they're watching. The tech giant says phones will be able to provide real-time captions for anything with sound, across apps.
It uses AI to capture not only what someone says but, more importantly, how they say it. For instance, if you're watching an intense game, the captions will show the commentator calling out an “amaaazing shot,” and a friend's video message will read not as a flat “no” but as a drawn-out “nooooo.”
The AI-powered feature can recognize when a speaker is dragging out a word and caption it accurately, and it can also label sounds like whistling or someone clearing their throat, making captions more complete.
This new version of Expressive Captions is rolling out in English in the U.S., U.K., Canada, and Australia for devices running Android 15 and above.
Lastly, Google is taking steps to make its digital platforms more accessible and inclusive. It is giving developers access to the open-source repository from Project Euphonia, its initiative to use AI to improve products for people with impaired speech.
By opening up its resources, developers around the world will be able to personalize audio tools for research, which could help them train their models on diverse speech patterns, Google explained.
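Google's announcement doesn't detail the repository's interfaces, but the general idea behind personalized speech recognition is fine-tuning an existing model on a small set of one speaker's recordings. As a rough illustration only, here is a minimal sketch of that workflow using the generic Wav2Vec2 model from Hugging Face's transformers library rather than Euphonia's actual code; the file paths and transcripts are hypothetical placeholders.

```python
# Illustrative sketch of personalizing an ASR model on one speaker's
# recordings. Uses the off-the-shelf Wav2Vec2 model from Hugging Face's
# `transformers` library, NOT Project Euphonia's actual code; the file
# names and transcripts below are hypothetical placeholders.
import torch
import torchaudio
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# Hypothetical recordings of a single speaker, paired with transcripts
# (this checkpoint's vocabulary expects uppercase text).
samples = [
    ("recordings/phrase_01.wav", "TURN ON THE LIVING ROOM LIGHTS"),
    ("recordings/phrase_02.wav", "CALL MY DAUGHTER"),
]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()

for path, transcript in samples:
    waveform, sample_rate = torchaudio.load(path)
    waveform = waveform.mean(dim=0)  # downmix to mono
    # Wav2Vec2 expects 16 kHz audio, so resample if needed.
    if sample_rate != 16_000:
        waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)

    inputs = processor(waveform.numpy(), sampling_rate=16_000,
                       return_tensors="pt")
    labels = processor.tokenizer(transcript, return_tensors="pt").input_ids

    # CTC loss between the model's predictions and the target transcript.
    loss = model(inputs.input_values, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

Real personalization work, Euphonia's included, involves far more per-speaker data and careful evaluation; the sketch only shows the basic fine-tuning loop that such tools build on.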

Nandika Ravi is an Editor for Android Central. Based in Toronto, after rocking the news scene as a Multimedia Reporter and Editor at Rogers Sports and Media, she now brings her expertise into the Tech ecosystem. When not breaking tech news, you can catch her sipping coffee at cozy cafes, exploring new trails with her boxer dog, or leveling up in the gaming universe.