Google Translate adds two dozen languages used by over 300 million people

Google Pixel 6 Pro Live Translate
(Image credit: Nick Sutrich / Android Central)

What you need to know

  • During Google I/O 2022, Google announced Google Translate would support 24 new languages. 
  • It is the first time Indigenous American languages or an English dialect have been supported on Translate.
  • Google typically uses bilingual learning to translate languages, but used monolingual learning for these new languages.
  • 133 languages in total are now supported on Google Translate. 

Today, during the Google I/O 2022 keynote address, Google CEO Sundar Pichai announced that the company would bring 24 new languages to Google Translate, targeting languages that are "underrepresented on the web today."

"More people are using Google Translate than ever before, but we still have work to do to make it universally accessible," said Pichai.

Google Translate typically relies on "bilingual learning" to translate text, comparing matching phrases across two languages to learn how to translate between them. But because these new languages have less text for Google's AI to delve into, the company had to rely on "monolingual learning," where "the model learns to translate a new language without ever seeing a direct translation of it."

Google Translate examples

(Image credit: Google)

The 24 languages listed below cover about 310 million people across the globe, and include three Indigenous languages of the Americas (Quechua, Guarani and Aymara) and an English dialect (Sierra Leonean Krio), all of which are firsts for Google Translate.

  • Assamese, used by about 25 million people in Northeast India
  • Aymara, used by about two million people in Bolivia, Chile and Peru
  • Bambara, used by about 14 million people in Mali
  • Bhojpuri, used by about 50 million people in northern India, Nepal and Fiji
  • Dhivehi, used by about 300,000 people in the Maldives
  • Dogri, used by about three million people in northern India
  • Ewe, used by about seven million people in Ghana and Togo
  • Guarani, used by about seven million people in Paraguay, Bolivia, Argentina and Brazil
  • Ilocano, used by about 10 million people in northern Philippines
  • Konkani, used by about two million people in Central India
  • Krio, used by about four million people in Sierra Leone
  • Kurdish (Sorani), used by about eight million people, mostly in Iraq
  • Lingala, used by about 45 million people in the Democratic Republic of the Congo, Republic of the Congo, Central African Republic, Angola and the Republic of South Sudan
  • Luganda, used by about 20 million people in Uganda and Rwanda
  • Maithili, used by about 34 million people in northern India
  • Meiteilon (Manipuri), used by about two million people in Northeast India
  • Mizo, used by about 830,000 people in Northeast India
  • Oromo, used by about 37 million people in Ethiopia and Kenya
  • Quechua, used by about 10 million people in Peru, Bolivia, Ecuador and surrounding countries
  • Sanskrit, used by about 20,000 people in India
  • Sepedi, used by about 14 million people in South Africa
  • Tigrinya, used by about eight million people in Eritrea and Ethiopia
  • Tsonga, used by about seven million people in Eswatini, Mozambique, South Africa and Zimbabwe
  • Twi, used by about 11 million people in Ghana

Isaac Caswell and Ankur Bapna, research scientists for Google Translate, wrote a technical post for the Google AI Blog detailing how their new monolingual translation, or "zero-resource translation," tools work. It's technically dense, but explains how Google created datasets for 1,138 languages in order to "learn representations of under-resourced languages directly from monolingual text."

Google Translate has come a long way in the past few years. Recent Pixels like the Pixel 6 can Live Translate spoken words or text viewed through a camera from dozens of languages. Now, Android phones will become more useful for communities across the globe that have historically been left behind by technology that targets just a few languages.

In other language-related Google I/O news, Pichai noted that YouTube's auto-generated captions are now available in 16 languages.

Michael L Hicks

Michael spent years freelancing on every tech topic under the sun before settling down on the real exciting stuff: virtual reality, fitness wearables, gaming, and how tech intersects with our world. He's a semi-reformed Apple-to-Android user who loves running, D&D, and Star Wars. Find him on Twitter at @Michael_L_Hicks.