Your Introduction to Enterprise MT

Lee Densmer 10 Aug 2020

If your business, like many others in this digital age, has growing content volumes and needs faster and faster turnaround times, yet your budget for localization is not increasing, then you might be considering machine translation (MT).

In order to help enterprises start their journey with MT, we recently held a webinar in which Adam LaMontagne, MT Program Manager at RWS Moravia, reviewed the history of MT, its enterprise use cases and how to get started. Here is a recap of what he presented.

A bit of history

Machine translation (MT), one of the oldest sub-fields of artificial intelligence (going back to the 1950s), uses software to translate texts from one language to another. When they first introduced the technology, scientists were confident that any kinks would be ironed out within a few years. However, some 70 years later, MT technology is still very much an ongoing experiment, with some notable advances and innovations in recent years but also much headway still to be made.

One of the most significant changes to the world of MT was the shift from rule-based to statistical models. Rule-based MT, which entails developing linguistic rules to translate from one language to another, is still in use today in some applications. However, since the late 1980s, most MT applications have used statistical models. On top of bringing costs down, statistical MT takes greater advantage of modern CPU capabilities and can enable economies of scale because, unlike rule-based models, the same algorithm can be used to train many language pairs. That being said, statistical models have a certain ceiling in terms of quality, most notably in fluency.

A newer sub-category of statistical models is neural MT (NMT), which is built on the same fundamental concepts but mimics the brain's neural systems in its design. NMT harnesses the increased processing power of modern computers to offer significant improvements in translation quality. NMT models can use deep learning techniques to produce both faster and better-quality translations compared to traditional statistical models. At present, they represent the state of the art of enterprise MT and are in use by giants like Google and Microsoft. It should be noted that NMT comes at a higher cost due to the increased computational power it demands.

In what enterprise applications is MT useful?

Generally speaking, MT makes the most sense in translation programs where content volumes are high enough to justify the investment. This can be particularly true with content that is useful but not a high enough priority to justify the cost of full human translation, such as customer feedback. MT can also be particularly beneficial in contexts where speed is a crucial factor, costs need to be cut or a fixed budget is running up against increasing translation needs.

Another scenario in which MT can be a great choice is when a self-service solution is needed to facilitate communication among employees and/or users across a language barrier, such as in a community forum for a multinational company. And a final MT application worth mentioning is as an integrated component of a larger service workflow, such as in multilingual sentiment analysis or speech-to-speech translation.

All said, MT is now being used in all sorts of scenarios, even for marketing material, which was formerly thought to be too complicated for a machine to handle because it is highly branded and its meaning is nuanced.

Where is MT advantageous?

The first advantage MT can offer enterprises in these cases is greater productivity, which can drive faster time-to-market, help keep budgets in check and handle growing volumes. Particularly in larger multinational organizations with vast amounts of content to be translated, a single robust MT system can serve multiple purposes, providing translation of anything from internal communications to blog posts and community forums. In addition, a properly deployed MT system can avoid the typos or misspellings that can easily evade the human eye, thereby achieving greater consistency across the board.

What factors should be considered when choosing an MT service?

Quality requirements

MT can work in a variety of contexts, and translation quality can vary just as widely as the contexts themselves. Ultimately, quality tends to improve with the calibre and volume of the training data fed into the system. Moreover, the quality, consistency and complexity of the source content will have a direct effect on translation quality. For example, slang or acronyms not included in the training data can lead to poorer translation results.

It is important to weigh budget and time considerations alongside the need for quality and nuance. For example, if you only need to convey the basic idea of a text, your MT solution will likely be simpler and your specified quality level easier to achieve. On the other hand, where high emotional impact is required, such as in sales and marketing content, a more specialized engine would be needed, in which case, achieving the level of nuance necessary may be much more of a challenge.

Language differences

Another factor that is important to keep in mind is that not all languages or language pairs are created equal. For instance, while Dutch-to-English translation and vice versa generally produce good results thanks to these languages’ similarities, Chinese-to-English translation is a much more complex task due to the vast differences in syntax, morphology and logic between the languages.

Generic vs. customized MT

MT can be deployed in several different ways, depending on both needs and constraints. For instance, generic engines can be used fairly straightforwardly and with lower deployment costs, but typically at some cost to quality. Customizing an engine to a specific application will typically improve results, but the price tag can vary greatly depending on the specific languages and content involved. In many cases, however, much of the cost of customization is upfront rather than ongoing. As noted previously, in terms of ROI, customized MT generally makes the most sense for high content volumes.
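
To make the volume argument concrete, here is a minimal sketch of a break-even calculation. It assumes a simplified cost model that is not from the webinar: customization has a one-time upfront cost and lowers the total per-word cost (MT plus any post-editing) compared to a generic engine. The function name, parameters and any figures you plug in are hypothetical.

```python
def break_even_words(upfront_cost: float,
                     generic_cost_per_word: float,
                     custom_cost_per_word: float) -> float:
    """Return the word volume at which a customized engine pays for itself.

    Simplified, hypothetical model: the customized engine costs
    `upfront_cost` once, then saves a fixed amount on every word
    compared to the generic engine.
    """
    saving_per_word = generic_cost_per_word - custom_cost_per_word
    if saving_per_word <= 0:
        # Without a per-word saving, the upfront cost is never recovered.
        raise ValueError("customized engine must lower the per-word cost")
    return upfront_cost / saving_per_word


if __name__ == "__main__":
    # Illustrative figures only, not real pricing.
    words = break_even_words(upfront_cost=10_000,
                             generic_cost_per_word=0.10,
                             custom_cost_per_word=0.05)
    print(f"Break-even volume: about {words:,.0f} words")
```

Under these illustrative figures, the upfront cost is recovered after roughly 200,000 words; below that volume, the generic engine would be the cheaper option in this model.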

Post-editing

MT can be used with differing levels of post-editing, the process by which human translators/editors review MT output. This phase can be particularly helpful in identifying terminology the MT engine may have failed to recognize or render correctly, which can then be fixed by the post-editor. Plus, the engine can be retrained to improve future output and reduce post-editing effort. Depending on quality requirements and the corresponding scale of post-editing in the workflow, this phase can involve varying levels of investment of both time and money.

Security

Finally, it bears mentioning that not all MT providers offer the same level of security for data that passes through MT engines. Free MT services such as Google Translate, for instance, may store data uploaded to them and use it to train their engines. One of the advantages, therefore, of a paid MT service is that security needs can be clarified in the discovery phase and ensured through direct agreements with the MT service provider.

What does MT deployment look like?

While each case is unique, MT deployment generally consists of the following:

1. Discovery

This initial stage serves to analyse existing processes, clarify goals such as languages and content types to be translated and identify quality requirements.

2. Pilot

In this stage, an appropriate engine is selected based on discovery findings and, in the case of a customized engine, training of the engine begins.

3. Testing

The selected engine gets tested and evaluated through automatic quality assessments, human review or both. The testing phase is meant to produce results that can indicate whether quality goals are being met or whether results should be further optimized.

4. Engine improvement

If the test phase reflects a need to improve the results being generated, the MT system can be fed additional data and trained further.

5. Deployment

In this stage, the MT engine gets integrated into the workflow, either by connecting to CAT (computer-assisted translation) or TMS (translation management system) tools or as a stand-alone application.

6. Maintenance

The MT engine is monitored and retrained over time as content evolves. This is particularly important as new products or services are introduced, in cases of re-branding or when a company wants to change its tone.
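
As an illustration of the automatic quality assessments mentioned in the testing stage, here is a minimal sketch that compares MT output against a human reference (for example, a post-edited version) by counting word-level edits. This is a simplified measure in the spirit of edit-rate metrics such as TER; it ignores the word-shift operations that TER also counts, and it is not necessarily the method used in any particular deployment. The function names and the whitespace tokenization are assumptions made for the example.

```python
def word_edit_distance(hypothesis: str, reference: str) -> int:
    """Count word insertions, deletions and substitutions needed
    to turn the MT hypothesis into the reference (Levenshtein distance
    over whitespace-separated words)."""
    hyp_words = hypothesis.split()
    ref_words = reference.split()
    # prev[j] = edits needed to turn an empty hypothesis into ref_words[:j]
    prev = list(range(len(ref_words) + 1))
    for i, hyp_word in enumerate(hyp_words, start=1):
        curr = [i]  # deleting all i hypothesis words matches an empty reference
        for j, ref_word in enumerate(ref_words, start=1):
            substitution_cost = 0 if hyp_word == ref_word else 1
            curr.append(min(prev[j] + 1,                       # delete hypothesis word
                            curr[j - 1] + 1,                   # insert reference word
                            prev[j - 1] + substitution_cost))  # keep or substitute
        prev = curr
    return prev[-1]


def edit_rate(hypothesis: str, reference: str) -> float:
    """Edits per reference word; 0.0 means the output matches the reference."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(hypothesis, reference) / ref_len


if __name__ == "__main__":
    mt_output = "the cat sat on mat"
    post_edited = "the cat sat on the mat"
    print(f"Edit rate: {edit_rate(mt_output, post_edited):.2f}")
```

A lower edit rate suggests less post-editing effort. Tracking a measure like this across test sets is one way to judge whether retraining is improving results, ideally alongside human review.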

Conclusion: successful MT deployment requires careful consideration

As you can see, there are quite a few factors to consider when designing and deploying an MT system. Depending on the languages and content involved, the required level of quality and budget and time considerations, different solutions will make sense in different contexts. Therefore, it is always wise to work with a professional localization firm well-versed in working with MT solutions. They will help you assess your needs, prioritize your considerations of cost, time and quality and deploy an MT solution that will get you the results you need.

At RWS Moravia, we pride ourselves on being leaders in the field, and it is our passion to work with companies to identify and implement the right MT technology for their specific goals.

If this summary has piqued your interest, or if you want to hear more on the subject, you can listen to the full webinar on demand.

Author

Lee Densmer

Lee Densmer has been in the localization industry since 2001, starting as a project manager and moving up into solutions architecture and marketing management. Like many localization professionals, she entered the field through an interest and education in languages. She holds a master’s in linguistics from the University of Colorado. Lee lives in Idaho and enjoys foreign travel and exploring the mountains of the region.
