Given the intense pressure to get the most out of corporate localization budgets and to speed products to foreign markets, it is no surprise that Machine Translation (MT) often comes up in our business conversations. It is well known that an MT engine translates quickly: tens of thousands of words overnight. There is also no doubt that MT gets translation done more cheaply: with no human intervention, the only cost is that of the engine. However, with no human touch, the linguistic quality and the degree of faithfulness to the source text are suspect at best.
The quality of machine translated content varies, and it depends on many things, including:
- Quality and level of standardization of source material (garbage in, garbage out)
- Availability, quality, and size of the translation memory (TM) used to train the engine
- Level of post-editing applied
- Client’s definition of high versus low quality
- Language pair
When MT is in play, it is mission critical to discuss how to achieve the desired level of output quality. Often, we are asked to provide one of two levels of post-editing: light or full.
Light post-editing involves taking the raw MT output and making as few modifications as possible to the text in order to make the translation understandable, factually accurate, and grammatically correct.
Light post-editing tasks include:
- correcting only the most obvious typos, wrong words, and grammatical errors
- rewriting confusing sentences partially or completely
- fixing machine-induced mistakes
- deleting unnecessary or extra translation alternatives generated by the machine
- making key terminology consistent, but with no in-depth term checking
The localized text needs to convey the meaning of the source text correctly. It doesn’t matter if there is no 1-to-1 correspondence between the source and target texts, as long as the original concept is there in the translation. Only two kinds of errors are corrected: major errors (those that impair the user’s ability to perform the task, comprehend the text correctly, or work productively) and critical errors (those that may incur legal consequences, or that prevent the user from performing the task or comprehending the text at all). The resulting content might sound robotic or a little off in tone and style, yet it is fluent enough for a reader to understand the meaning. All stylistic polishing is skipped.
This level of light editing is not easy to achieve: naturally detail-oriented linguists literally have to force themselves to skip over ‘minor’ errors and limit their work. Their job is to achieve the stated quality level and no more. A light edit has a faster pace than a full edit, and if linguists do more than a light post-edit, they may not be paid for that extra effort.
The key phrases for light post-editing are ‘factual correctness’ and ‘good enough’.
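The scoping rule above can be sketched as a small decision function. This is only an illustration, not a tool any vendor is known to use: the names `Severity`, `Level`, `Issue`, and `in_scope` are hypothetical, the sample issues are invented, and the sketch simplifies by treating every minor error as out of scope for a light edit, even though a light edit does still catch the most obvious typos.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    MINOR = 1     # stylistic or cosmetic; the meaning is still clear
    MAJOR = 2     # hinders the task, comprehension, or productivity
    CRITICAL = 3  # legal exposure, or blocks the task or comprehension


class Level(Enum):
    LIGHT = "light"
    FULL = "full"


@dataclass
class Issue:
    description: str
    severity: Severity


def in_scope(issue: Issue, level: Level) -> bool:
    """Return True if a post-editor working at this level should fix the issue."""
    if level is Level.FULL:
        # A full post-edit corrects every error, whatever its severity.
        return True
    # A light post-edit covers only major and critical errors.
    return issue.severity in (Severity.MAJOR, Severity.CRITICAL)


# Invented sample issues, for illustration only.
issues = [
    Issue("slightly robotic tone", Severity.MINOR),
    Issue("instruction step mistranslated", Severity.MAJOR),
    Issue("warranty clause meaning reversed", Severity.CRITICAL),
]

light_fixes = [i.description for i in issues if in_scope(i, Level.LIGHT)]
full_fixes = [i.description for i in issues if in_scope(i, Level.FULL)]
```

Here `light_fixes` leaves out the tone issue while `full_fixes` includes all three, which is the gap between ‘good enough’ and a full edit.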
Full post-editing, a slower and more in-depth pass, must produce absolutely accurate translations that consistently use correct and approved terminology, have the appropriate tone and style, have no stylistic inconsistencies or variations, and are free from any grammatical mistakes. After this edit, the translation should read as if it had originally been written in the target language.
Full post-editing tasks include all of the light post-editing tasks plus:
- checking terminology against approved terminological resources to make sure it is consistent and appropriate
- cross-referencing translations against other resources
- making syntactic modifications in accordance with practices for the target language
- producing stylistically consistent, fluent content
- adapting all cultural references, including idioms, examples, etc.
- ensuring perfect faithfulness — a 1-to-1 correspondence — between the source and target text
- applying correct formatting and tagging
- correcting ALL grammatical errors, typos, punctuation errors, and spelling mistakes
The expectation is high: fully post-edited content must be equal to human translation in all aspects. Therefore, content must meet the quality criteria defined by the client for human translations.
This is a tall order, especially when the factors I listed at the beginning of this post do not all line up perfectly. In some cases, the effort to achieve human-level quality from MT output may exceed the effort to have the content translated by a linguist in the first place.
Shades of grey
Is it as simple as choosing one level of post-edit or the other? Unfortunately, no. There are plenty of clients who define a level of quality that seeks the polish of full post-editing as well as the speed and low cost of light post-editing. To meet their quality needs, post-editing processes, activities, and throughputs must be discussed and defined beforehand.
Last but not least, the effort to achieve ANY level of quality can differ greatly from project to project, client to client, and language to language. Therefore, quality levels, throughputs, and expectations must be carefully defined regardless of whether machine translation and post-editing are part of the translation process.
What has been your experience in achieving MT quality?