Google is using AI to add over 100 languages to Google Translate, an expansion the company says is its largest to date. The added languages are spoken by roughly 8% of the world’s population.
In a blog post on Thursday, Google wrote that the new language support was made possible with the company’s PaLM 2 large language model. PaLM 2 is an LLM that features in several Google products, including Bard, which eventually became Gemini.
The last time Google added new languages to Translate was in 2022, when the company added 24 languages to the tool using a different machine-learning approach. That brought the number of supported languages to 133.
This new addition takes the total to 243 supported languages. The 2022 expansion coincided with Google announcing a commitment to use AI models to support the 1,000 most spoken languages. There was no set timeline for reaching 1,000, but the company seems to be making steady progress, with more than 130 new languages introduced in the last two years.
According to Google, the additional languages represent more than 614 million speakers.
The most prominent additions include Cantonese, a variety of Chinese that can be traced back to Middle Chinese. Mandarin, the official language of China, is already in Google Translate. Cantonese is widely spoken but is concentrated in southeastern China, especially in Hong Kong and Guangzhou. The difference between Mandarin and Cantonese is comparable to the difference between French and Italian: the two share a common root and have similarities, but speakers of one cannot readily understand the other.
Another widely spoken addition is Punjabi (Shahmukhi), the variety of Punjabi written in the Perso-Arabic script and the most spoken language in Pakistan.
I think it’s cool that Google is also adding smaller, more regional languages like Manx, a Celtic language spoken on the Isle of Man that nearly went extinct in the 1970s. There’s also Tok Pisin, an English-based creole and lingua franca in Papua New Guinea.
Google says that some of the languages were added in response to user requests or received many volunteer contributions, like Afar, a tonal language spoken in Djibouti, Eritrea and Ethiopia. Cantonese was also highly requested.
It’ll be interesting to see whether these languages make their way into Gemini. Live translation is a boon for travel, and the more languages AI tools can process, the easier it will be to communicate with our neighbors worldwide.