Google’s India interest grows, expands reach to non-English speakers
December 17, 2020
By Zainab Iwayemi
Google on Thursday unveiled plans to invest more in machine learning and AI efforts at its research centre in India and to make its AI models accessible to everyone across the ecosystem.
The company says it also plans to collaborate with local start-ups that are serving users in local languages, and drastically improve the experience of Google products and services for Indian language users.
Google also announced a range of changes it is rolling out across some of its services to make them speak more local languages and unveiled a whole new approach it is taking to translate languages.
Most online services and much of the content on the web are available only in English. India, rated Google’s biggest market by users, has over 600 million internet users, but only a small share of them understand English.
The world’s second-largest internet market continues to experience a digital divide as a result of the language barrier, which has limited hundreds of millions of users’ experience of the World Wide Web to a select few websites and services.
Google currently offers a feature that quickly translates web pages from English into Indian languages. Users in India have used it more than 17 billion times in the past year.
Google, which is counting on emerging markets such as India to sustain its growth, has been working to improve access to information by making the web and its own services accessible to more people.
What changes?
In addition to English and Hindi, which are already available, users will now be able to see search results for their queries in Tamil, Telugu, Bangla, and Marathi. The addition comes four years after Google introduced the Hindi tab, which was followed by a more than tenfold increase in the volume of search queries.
The company noted that typing in a local language is a challenge users face today. “Getting search results in a local language is helpful, but often people want to make their queries in those languages as well. As a result, many users search in English even if they really would prefer to see results in a local language they understand,” it said.
To address this challenge, Search will now show relevant content in supported Indian languages where appropriate, even if the local-language query is typed in English. The new feature, which supports five languages (Hindi, Bangla, Marathi, Tamil, and Telugu), will be activated next month.
Google is also making it easier for users to change the language in which results are shown in an app without having to alter the device’s language settings. The feature, which is currently available in Discover and Google Assistant, will now be available in Maps. Maps supports nine Indian languages.
In the same vein, Google Lens’s Homework feature, which lets users take a picture of a math or science problem and then delivers the answer and walks students through the steps to reach it, now supports Hindi. India is the biggest market for Google Lens, said Nidhi Gupta, senior product manager at Google India.
Unleashing MuRIL
Multilingual Representations for Indian Languages (MuRIL) is a new AI language model that handles transliteration, spelling variations, mixed languages and other linguistic nuances more efficiently and accurately. It supports transliterated text, such as Hindi written in Roman script, a capability missing from previous models of its kind, noted Partha Talukdar, research scientist at Google Research India.
The company also revealed that it trained the new model on Wikipedia articles, text from a dataset called Common Crawl, and other sources. The results show that MuRIL handles Indian languages better than previous, more general language models and can cope with letters and words that have been transliterated, that is, written using the closest corresponding letters of a different alphabet or script.
MuRIL significantly outperforms the earlier model, by 10 per cent on native text and 27 per cent on transliterated text. MuRIL, which was developed by Google executives in India and has been in use for about a year, is now open source. “Building such language-specific modelling for each and every task is not resource-efficient as we often don’t have training data for tasks like this,” Talukdar said.