As you may have noticed, localization into Indian languages is becoming a hot topic. With growing numbers of Indian-language speakers accessing the internet and using mobile apps, companies are taking notice and finding ways to reach this large population of potential consumers. Microsoft is one of the frontrunners.
In 1998, Microsoft launched Project Bhasha to promote local language computing in India through support for Indic language input tools. Now, two decades later, Microsoft has announced that it is stepping up its Indian language translation initiative. Announced to coincide with India’s 69th Republic Day on January 26, Microsoft’s real-time translation of Bengali, Hindi, and Tamil will now incorporate artificial intelligence (AI) and deep neural networks (DNN).
Many Microsoft desktop products and mobile apps currently support 22 constitutionally recognized Indian languages. The company previously used statistical machine translation (SMT) for Indic languages, and faced notable challenges training its systems due to, among other hindrances, language complexity and limited available content. While SMT had its own successes, Microsoft believes it has demonstrated that AI and DNN integration markedly increases translation fluency.
In its announcement, the company specifically noted that its re-engineered translation systems better handle aspects of language such as grammatical gender (feminine, masculine, and neuter) and the distinction between formal and informal speech, which produces more natural-sounding translations.
“We have witnessed at least 20 percent improvement in translation quality for all Indic languages,” said Krishna Doss Mohan, Senior Program Manager for Microsoft India and a member of the India languages work team.
Microsoft’s AI-Powered Speech Translation
AI and DNN will now power Indic language support across the breadth of Microsoft’s products—including the Edge browser, Bing search, and Microsoft Office 365 applications like Excel, Word, Outlook, PowerPoint, and Skype. While a boon for Microsoft partners and customers in Indian markets, the effort is not without continuing challenges.
“Six Indian languages are part of the top 20 global languages by population. Ironically, these languages are not on top of the digital content list. There’s not enough material on the internet that we could use to train the system,” said Mohan.
The rewards are nevertheless apparent. According to a 2017 report by Google and KPMG, Indian-language internet users will number more than 2.5 times India’s English-language internet user base by 2021. Global companies seeking to capitalize on products and services targeting India’s online consumers, especially for internet retail (e-tail) and mobile communications, must therefore understand and localize for 234 million Indic language users online.
“Microsoft celebrates the diversity of languages in India and wants to make the vast internet even more accessible,” said Sundar Srinivasan, General Manager for AI and Research at Microsoft India. “We’re committed to empower every Indian and every business in India by bringing the power of AI into their daily life and become a driving force for Digital India.”