In essence, sentiment analysis is large-scale, real-time, real-world product testing in which the data is entirely user-generated and therefore arrives in inconsistent formats and in many languages. Making sense of this data requires close cooperation between global brands and translation companies, but the resulting insights are almost always worth the effort.
So what are some of the main challenges with multilingual sentiment analysis?
- You need a system to capture and track the wealth of user sentiment data from global social media sites.
- The data is high-volume and arrives through diverse channels and in diverse formats, such as ratings, reviews, and comments. It needs to be homogenized to some degree so that it is machine-translation friendly.
- Content will include informal or slang terms across multiple locales and languages. That is, the data will be non-standard not just in format but also in terminology.
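As a rough illustration of the homogenizing step, the sketch below maps ratings, reviews, and comments onto one common record before anything is sent to machine translation. The `Post` schema, the input keys (`type`, `lang`, `body`, `stars`), and the 1-5 star scale are all assumptions made for this example, not features of any particular platform or tool.

```python
from dataclasses import dataclass
from typing import Optional
import re
import unicodedata


@dataclass
class Post:
    """Common record for any piece of user feedback (hypothetical schema)."""
    source: str                     # e.g. "review", "rating", "comment"
    language: str                   # language tag such as "de" or "pt-BR"
    text: str                       # cleaned free text; empty for bare ratings
    rating: Optional[float] = None  # rescaled to 0.0-1.0 when present


def clean_text(raw: str) -> str:
    """Normalize Unicode and collapse whitespace so MT input is consistent."""
    text = unicodedata.normalize("NFC", raw)
    return re.sub(r"\s+", " ", text).strip()


def normalize(record: dict) -> Post:
    """Map one raw record (assumed keys) onto the common Post schema."""
    rating = None
    if record.get("stars") is not None:
        # Assumes a 1-5 star scale; rescale it to 0.0-1.0.
        rating = (float(record["stars"]) - 1.0) / 4.0
    return Post(
        source=record.get("type", "comment"),
        language=record.get("lang", "und"),  # "und" = undetermined language
        text=clean_text(record.get("body", "")),
        rating=rating,
    )
```

Slang and informal terms are not handled here; in practice they are more likely to be addressed through customized machine translation engines or glossaries than through text cleanup alone.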
How to handle multilingual sentiment analysis
Companies must work closely with their localization partner to understand the options available to them and to forge the solution best suited to their needs. Here are the six key steps we normally recommend for handling multilingual sentiment analysis:
- Fine-tune off-the-shelf tools or create proprietary tools to compile, translate, and analyze user posts. Ready-to-use tools may not always have multilingual capabilities, so choose one that allows for easy integration with customized machine translation engines.
- Aggregate and translate posts using machine translation. The usual approach is to translate all posts from their various languages into English or another primary language, and then perform sentiment analysis on the translated content.
- Analyze posts for opinion-related terms and phrases. These are the terms that will give you clues about how users feel about your product.
- Assign a sentiment score based on the number of positive and negative words.
- Report sentiment scores by language, market, platform, product, date range, etc. via a dashboard. This will give you insights into why a product may be performing well in one market while failing in another.
- Drill down to original and translated posts. Make sure your tool lets you see the original as well as the translated text, both for future reference and for spot quality checks.
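The scoring and reporting steps above can be sketched in a few lines. This is a minimal example under stated assumptions: the posts have already been machine-translated into English, the word lists are tiny illustrative stand-ins for a real sentiment lexicon, and the field names (`language`, `original`, `translated`) are hypothetical. The score is simply the count of positive words minus the count of negative words.

```python
from collections import defaultdict
import re

# Tiny illustrative lexicons; a real system would use a much larger resource.
POSITIVE = {"great", "love", "excellent", "good", "fast"}
NEGATIVE = {"bad", "broken", "slow", "hate", "poor"}


def sentiment_score(translated_text: str) -> int:
    """Count of positive words minus count of negative words."""
    words = re.findall(r"[a-z']+", translated_text.lower())
    positives = sum(1 for w in words if w in POSITIVE)
    negatives = sum(1 for w in words if w in NEGATIVE)
    return positives - negatives


def report_by_language(posts: list[dict]) -> dict[str, float]:
    """Average sentiment score per source language."""
    scores = defaultdict(list)
    for post in posts:
        scores[post["language"]].append(sentiment_score(post["translated"]))
    return {lang: sum(vals) / len(vals) for lang, vals in scores.items()}


# Each post keeps its original text next to the translation for drill-down.
posts = [
    {"language": "de", "original": "Tolles Produkt, ich liebe es",
     "translated": "Great product, I love it"},
    {"language": "de", "original": "Lieferung war langsam",
     "translated": "Delivery was slow"},
    {"language": "fr", "original": "Mauvais et cassé",
     "translated": "Bad and broken"},
]
report = report_by_language(posts)
```

Grouping by market, platform, product, or date range would follow the same pattern with a different key. A plain word count ignores negation and sarcasm, which is one reason the drill-down to the original post matters for spot checks.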
It’s not just about consumers
The practice of collecting and analyzing consumer sentiment may have started in the B2C space, but it’s spreading fast into other industries, including regulated ones such as life sciences. For instance, advances in applied natural language processing (NLP) enable pharmaceutical companies to mine user-generated content online and on social media in order to identify potential Adverse Drug Reactions (ADRs).
Collecting ADRs is a regulatory requirement, and one that pharmaceutical companies comply with. But the reality is that only a fraction of ADRs are actually reported by patients or healthcare professionals via the official channels. So listening on social media and looking for possible drug reactions opens up a brand-new way of identifying potential ADRs. In this scenario, customized tools analyze comments that users post publicly on social media, in multiple languages, and provide a short-list of potential ADRs that can then be narrowed down to actual adverse reactions.
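To make the short-listing idea concrete, here is a deliberately simple sketch. It flags translated posts that mention both a watched drug name and a symptom term. Keyword co-occurrence is a crude stand-in for the customized NLP tools described above, and the watch lists, the placeholder drug name, and the field names are all hypothetical.

```python
import re

# Hypothetical watch lists; real ones would come from pharmacovigilance teams.
DRUG_NAMES = {"exampledrug"}
SYMPTOM_TERMS = {"rash", "nausea", "dizziness", "headache"}


def shortlist_potential_adrs(posts: list[dict]) -> list[dict]:
    """Return posts that mention both a watched drug and a symptom term."""
    shortlist = []
    for post in posts:
        words = set(re.findall(r"[a-z]+", post["translated"].lower()))
        drugs = words & DRUG_NAMES
        symptoms = words & SYMPTOM_TERMS
        if drugs and symptoms:
            # Keep the whole post so reviewers can see the original text too.
            shortlist.append(
                {**post, "drugs": sorted(drugs), "symptoms": sorted(symptoms)}
            )
    return shortlist


candidates = shortlist_potential_adrs([
    {"language": "es", "original": "Tomé exampledrug y tuve náuseas",
     "translated": "I took exampledrug and had nausea"},
    {"language": "es", "original": "Hoy me duele la cabeza",
     "translated": "I have a headache today"},
])
```

Every entry on such a short-list is only a candidate; narrowing it down to actual adverse reactions still requires review by qualified people.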
So if you haven’t already, consider applying some of the current multilingual social listening approaches yourself. This may be a new way of connecting with other parts of your organization, such as marketing. It is also an opportunity to show the tangible contribution that the language and translation function makes to global revenue and to a better understanding of your international customers.