Google's AI Test Kitchen lets you teach its AI better conversational skills

Josh Woodward showing off AI Test Kitchen app at Google I/O 2022 (Image credit: Google)

What you need to know

  • During Google I/O 2022, we saw a live demo of the AI Test Kitchen app for Android.
  • The app has three LaMDA 2 demos for in-depth conversations with Google Assistant: List It, Imagine It, and Talk About It. 
  • AI Test Kitchen is currently in closed beta, with Google "gradually" letting in new users.
  • This crowdsourced testing app is designed to reduce "inaccurate or offensive responses" so the AI can stay "on topic."

After unveiling its futuristic Language Model for Dialogue Applications (LaMDA) at last year's I/O, Google announced the new-and-improved LaMDA 2 at Google I/O 2022. And unlike last year's on-stage demo, you'll be able to test Google's revamped AI and machine learning tools for yourself, eventually.

The goal of LaMDA is to let you have extended conversations with Google Assistant, where the AI stays on a particular topic or branches out to other topics based on your interest, simulating a proper conversation. LaMDA isn't available in the Google Assistant app (yet), but you can test it in the AI Test Kitchen app, which is currently in closed beta.

Google's goal with AI Test Kitchen is essentially to crowdsource the work of making its AI more helpful. It first sent the app to Google employees for feedback, leading to a "reduction in inaccurate or offensive responses" from LaMDA. Now, it'll slowly open up the app to everyday people to get their feedback too, and "learn, improve, and innovate responsibly on AI together."

Google I/O 2022 LaMDA test (Image credit: Google)

In the app, you'll find three demos. List It has you bring up a topic like "Plant a vegetable garden" and receive a to-do list of what you'll need to do or learn before starting; you can then tap specific items to learn more or ask it to regenerate the list with new ideas.

The second demo is Imagine It, and it follows the pattern of last year's demo: you ask LaMDA to describe an experience like visiting Pluto or the ocean floor, and then prompt it to take the story in different directions like "what is the temperature like" or "describe the jellyfish" based on your interests.

AI Test Kitchen app (Image credit: Google)

Lastly, Talk About It (Dogs Edition) lets you chat about whatever dog-related topic strikes your fancy. This isn't just to be cute; Google needs to test whether the AI can remember the original topic of discussion or if you can accidentally (or intentionally) get it to forget how new questions relate to old questions.

You can tag LaMDA responses as Nice, Offensive, Off Topic, or Not True, which should help the model learn to anticipate our needs better. 

The difference between LaMDA and a typical smart speaker saying "Here's something we found on the web" is that LaMDA is designed to say things confidently and conversationally, so users will expect its statements to be true. That's why Google itself stated that "there are significant challenges to solve before these models can truly be useful," and why the app will remain in beta for some time.

"While we have improved safety, the model might still generate inaccurate, inappropriate, or offensive responses. That’s why we are inviting feedback in the app, so people can help report problems," Google said.

Hopefully, this goes better than when Twitter users taught Microsoft's Tay chatbot to be racist in 2016, and users don't skew LaMDA's sense of "appropriateness" in the wrong direction. But assuming the testing does go well, LaMDA will evolve Google Assistant beyond a question-and-answer tool into something far more useful.

Michael L Hicks
Senior Editor, VR/AR and fitness
