Google has announced Neural Machine Translation (NMT) for nine Indian languages, and extended the same technology to its auto-translation feature on the Chrome browser as well as Google Maps reviews.
Google’s Neural Machine Translation system is more advanced than its older system, and relies on deep neural networks for translations. The approach is similar to how machine learning works for other tasks such as image recognition, except that in this case the neural network is taught the language and translations sentence by sentence.
NMT support is being rolled out for Hindi, Bengali, Marathi, Tamil, Telugu, Gujarati, Punjabi, Malayalam and Kannada. Google says this system allows for faster, more accurate and better-quality translations compared to its older system. The technology will also be included in the Chrome browser’s auto-translate functionality, allowing for more accurate translations of web content into Indian regional languages.
Google also announced that its Gboard app will now support all 22 scheduled Indian languages, along with an integration for Hindi Dictionary on Search. With this integration, users can now search for words in Hindi, using either Devanagari script or English, and Google will display the meaning of the word, similar to how it does for English words.
For Google, the focus on Indian languages is not new. In 2014, Google announced the “Indian Languages Internet Alliance (ILIA)” to promote the Hindi web and make it more accessible to users. Google has also been crowdsourcing translations in India for various regional languages; the company says over 10 million translations have been contributed so far through the initiative.
“India has over 400 million internet users, and this is expected to go up to 650 million users by 2020. India is the second largest internet user market in the world. Most people access this through mobile with over 300 million people in India accessing the internet with smartphones. Our mission at Google is to make Internet inclusive for every Indian,” said Rajan Anandan, Vice President, South East Asia and India, Google, at the event.
However, the challenge for India will always be around the presence of online content in regional languages. “Google has been working hard to solve this problem,” added Anandan.
Google started the project of using neural networks for translation in September 2015 with its own TensorFlow technology. In September 2016, Chinese-to-English translation using this system was launched.
“When we started the project, we expected it to take 3 years for network to start translating. However, in a little over 13 and a half months, we’ve managed to launch this,” explained Melvin Johnson, Research Scientist Engineer at Google, who has been part of this project.
“The Neural Machine Translation is bridging the gap between phrase-based translation and human translation. We’ve now moved it to a multi-lingual model, where the system is learning translations for multiple languages. It looks for parallel documents in the web for two different languages, and then breaks them down into sentences,” explained Johnson at the announcement.
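To illustrate the step Johnson describes, breaking two parallel documents down into sentences, here is a minimal Python sketch. It is not Google’s method: the function names are hypothetical, the splitting is a naive regular expression, and sentences are paired purely by position on the assumption that both documents contain the same number of sentences in the same order.

```python
import re

# Split on whitespace that follows a sentence terminator.
# "\u0964" is the Devanagari danda, used as a full stop in Hindi.
_SENTENCE_END = re.compile(r"(?<=[.!?\u0964])\s+")


def split_sentences(text):
    """Naively split text into sentences on terminal punctuation."""
    return [s.strip() for s in _SENTENCE_END.split(text.strip()) if s.strip()]


def pair_sentences(source_doc, target_doc):
    """Pair sentences by position (illustrative assumption only).

    Returns an empty list when the two documents do not split into the
    same number of sentences, since positional pairing would be unsafe.
    """
    src = split_sentences(source_doc)
    tgt = split_sentences(target_doc)
    if len(src) != len(tgt):
        return []
    return list(zip(src, tgt))


if __name__ == "__main__":
    english = "Hello there. How are you?"
    hindi = "नमस्ते। आप कैसे हैं?"
    for en, hi in pair_sentences(english, hindi):
        print(en, "<->", hi)
```

Real-world text rarely lines up this neatly, so a production system would need a far more robust way of matching sentences than counting them.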
For Google and its products, ensuring a presence in India’s regional languages matters a lot. As Google’s own study conducted with KPMG shows, Indian language internet users in the country far outnumber English internet users.
Google and KPMG’s study puts the number of Indian language internet users at 234 million and predicts this will rise to 536 million by 2021. The current number of English language internet users in India is pegged at 175 million, which is expected to rise to just 199 million by 2021.
According to Google and KPMG’s study, nearly nine out of every 10 new internet users in India over the next five years are expected to be Indian language users, with major growth expected in Hindi, where the number could rise to 75 million by 2021. In such a scenario, it becomes all the more imperative to have content on the web that is available in regional languages.
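The study’s figures cited above can be checked with some quick arithmetic. The short Python sketch below computes the projected growth for each group and the share of new users who would be Indian language users. It assumes the two groups do not overlap and together account for all new users, which the article does not state; the study itself may have arrived at its “nine out of 10” figure differently.

```python
def growth(current_millions, projected_millions):
    """Return (users added in millions, percentage growth)."""
    added = projected_millions - current_millions
    percent = 100.0 * added / current_millions
    return added, percent


# Figures in millions, as quoted from the Google-KPMG study above.
indian_added, indian_pct = growth(234, 536)
english_added, english_pct = growth(175, 199)

# Assumption: the two groups are disjoint and make up all new users.
indian_share_of_new = 100.0 * indian_added / (indian_added + english_added)

print(f"Indian language users: +{indian_added} million ({indian_pct:.1f}% growth)")
print(f"English language users: +{english_added} million ({english_pct:.1f}% growth)")
print(f"Share of new users using Indian languages: {indian_share_of_new:.1f}%")
```

Under these assumptions, Indian language users grow by 302 million (about 129 per cent) against 24 million (about 14 per cent) for English, putting the Indian language share of new users at roughly 93 per cent, broadly in line with the study’s claim.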
Finally, Google’s Gboard app, which is available on iOS and Android, now supports transliteration for Hindi, Bengali, Telugu, Marathi, Tamil, Urdu and Gujarati, along with support for the 22 scheduled Indian languages. Gboard has the Google Search button built in, and users can now rely on this to search in their own regional language thanks to the new translation tools.
Gboard will also offer auto-correction and prediction in these new languages. A keyboard layout in the native script will be available in addition to the traditional QWERTY layout.