Anecdotes of a localization veteran
09 Nov 2020
August 2020 marked my 30th year in the localization services industry (despite numerous attempts to escape). To mark the milestone, I am happy to share some of the experiences that have made an impression on me over the years.
I got into localization after earning a Bachelor’s degree in French, Russian and International Relations from the University of Surrey and a Master’s degree in Artificial Intelligence and Computational Linguistics from the University of Sussex. Much of what was covered in the Master’s degree is commonplace now, but back in the day, it was largely theoretical because computers just weren't powerful enough. I was lucky, however, to start my career in Welwyn Garden City, Hertfordshire, England, in the internal translation department of Rank Xerox, which at the time had advanced localization processes. We used computer-assisted translation (CAT) tools integrated with a graphical user interface (UI) environment for immediate contextualization of the translation, wrote Simplified English for machine translation (MT) and were developing MT language pairs for SYSTRAN. It was a great starting point for the next 30 years.
Know your audience
Early in my career, I was asked to give a demonstration of some of our translation tools. The experience taught me my first lesson about presenting: find out what the audience understands and start from there. The demo was for a very senior overseas visitor to the department, though no one seemed completely sure of who he was. I recall him wearing a large Stetson hat. He pulled up a chair next to mine in front of two screens, one showing the translation environment split-screen and the other showing a representation of the graphical UI of the product. It was very advanced for the time. I showed terminology lookup and translation memory (TM), translated some strings and pulled them into the UI simulation to show context. I also demonstrated how issues with space constraints caused by the translation being longer than the English were flagged. I adapted the translation in the translation environment and pulled it across to show it fitting in the space available. I remember feeling relieved that the demo had gone well. Then the visitor turned to my colleagues gathered around and, as I remember it, he asked, “Why do we have to translate this sh*t anyway?”
In the pink
One early project that demonstrated a basic issue in localization was the European re-engineering of a Japanese photocopier. It could produce a range of different shades, the names of which were shown on the single-line LCD. After many delays in engineering, the product was localized into English, which went well enough. The characters ピンク, which translate to “pink” in English, displayed perfectly. The problems arose with the Finnish language version, when the machine crashed on that same message. That was not surprising, as the Finnish word for pink is “vaaleanpunainen,” which was too long for the display and has no suitable abbreviation. In the end, the display showed numbers for the colours instead of their names, and a chart mapping the numbers to colours was added to the user manual.
Tombstones
One of the biggest projects I was lucky to spend time on was the translation into Russian of the massive DocuTech Production Publisher, an early digital printer and a first in terms of its capabilities. It had hundreds of pages of help information available from a touch-screen display. The newly burned chip with the English source and its Russian translation arrived from the United States and was installed on the machine. But there was a problem. Every so often, seemingly randomly, it displayed a black rectangle or “tombstone” tile rather than a Russian character. Relatively fresh out of university, with three years of Russian as a foreign language under my belt, I had become the go-to Russian-language expert and was summoned to take a look. My conversational Russian vocabulary, useful when talking about the wonders of the Soviet Union (which had collapsed in the time between my graduation and starting work), was not much help in understanding a service manual for a photocopier. However, after a while, I realized that the tombstones were not random at all, but were occurring where there should have been a “soft sign” (the character “ь”). The soft sign appears after certain consonants and changes how they are pronounced; it softens the sound and alters the word’s meaning. For example, “угол,” transliterated as ugol, without a soft sign means “corner.” The word “уголь,” transliterated as ugol’ (with the soft sign represented by the single quote), means “coal.” We got on the phone with the developers to explain what we were seeing. They explained that there wasn’t enough room on the chip for all the English and Russian characters, so they had removed a few of the Russian ones that didn't look important.
After some debate and more than a little persuasion regarding the importance of this seemingly innocuous character, they agreed to remove a few pre-revolutionary Russian letters from the character set (hoping they wouldn’t come back into fashion after the fall of the Soviet Union) and reinstated the necessary soft sign.
Ready, steady, translate
I was also fortunate enough to be involved in multiple aspects of MT. One project compared the efficiency of post-editing raw MT in a desktop publishing (DTP) environment versus translating using a CAT tool. In those pre-Trados days, the cutting edge for CAT was a terminal attached to a server, with a split screen for source and target and automatic dictionary lookup with keystroke shortcuts. For MT, we were using SYSTRAN output post-edited in Viewpoint, Xerox’s DTP environment, which was more advanced than anything Microsoft had at the time. There are all sorts of considerations in such a comparison to make it truly fair, or at least to give it the appearance of fairness. And I think we did a pretty good job of comparing efficiencies in as unbiased a way as possible. What came out of the study, however, was a surprise, to me at least: there wasn’t a significant difference between the two approaches. TM match rates for the documentation were high, meaning that CAT throughputs and those of post-edited MT were essentially identical. The biggest factor in efficiency turned out to be the linguist’s expertise in using the CAT tool or the DTP environment. Today, the more advanced approaches use TM and MT combined, with adaptive approaches being developed. However, the main lesson is still valid. Last year, one of the talks I attended at an MT conference presented a study that found that the most important factor in a post-editor’s throughput isn’t the quality of the raw MT, but the skill of the post-editor in finding efficient ways of working. Plus ça change…
Obscene Finnish
Over the years, I have been involved in a variety of localization projects in automotive and heavy engineering. They’re always interesting, especially if a tour of the factory is involved. One such project was to investigate the reported poor-quality translation of an automotive diagnostic tool provided to a UK car manufacturer that no longer exists (nothing to do with the diagnostic tool, I hasten to add). If I remember correctly, there was a group of European engineers visiting the UK for training purposes, so it was an ideal moment to get feedback for various languages in situ on the factory floor. I interviewed engineers from various countries who used the tool. It quickly became clear from the experience of the French engineers that the main issue was to do with variables and concatenated strings. For example, the four messages—“{door} open” and “{door} closed” and “{sunroof} open” and “{sunroof} closed”—were generated from two strings in the software, which were “{item}” plus “open” and “{item}” plus “closed.” In French, two versions of the adjectives “open” (“ouverte” and “ouvert”) and “closed” (“fermée” and “fermé”) must agree with the feminine noun “door” and the masculine noun “sunroof.” At the time of translation, the translator had no idea what “open” and “closed” referred to and had translated them only in the masculine forms. There were other internationalization issues, too, reflecting the list of don’ts that should be familiar to all of us in the language services industry today. The plan was that I would report the issues and they would be fixed in the translation, if possible, as the software couldn’t be re-engineered. When I approached the Finnish engineers, however, they weren’t so forthcoming. (I had expected that, of all languages, Finnish would have the most problems.) 
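The concatenation problem described above is easy to reproduce. What follows is a minimal Python sketch, not the diagnostic tool's actual code: the dictionaries, the function names and the French strings chosen for “door” and “sunroof” are all assumptions made for illustration. It contrasts joining separately translated fragments with storing one complete, translatable message per item and state.

```python
# Hypothetical illustration; none of these names come from the original tool.

# Approach 1: concatenation. Each fragment is translated once, with no context,
# so the adjective can only ever have one form (here, the masculine).
ITEMS_FR = {"door": "porte", "sunroof": "toit ouvrant"}
STATES_FR = {"open": "ouvert", "closed": "fermé"}


def concatenated_message(item: str, state: str) -> str:
    """Join separately translated fragments, as in the anecdote."""
    return f"{ITEMS_FR[item]} {STATES_FR[state]}"


# Approach 2: one complete message per (item, state) pair. The translator sees
# the noun and can make the adjective agree with its gender.
MESSAGES_FR = {
    ("door", "open"): "porte ouverte",
    ("door", "closed"): "porte fermée",
    ("sunroof", "open"): "toit ouvrant ouvert",
    ("sunroof", "closed"): "toit ouvrant fermé",
}


def full_message(item: str, state: str) -> str:
    """Look up a whole message instead of assembling it from parts."""
    return MESSAGES_FR[(item, state)]


if __name__ == "__main__":
    # Print both versions side by side for comparison.
    for item in ("door", "sunroof"):
        for state in ("open", "closed"):
            print(f"{concatenated_message(item, state):<22} | {full_message(item, state)}")
```

The second approach costs more strings (one per combination), which is the usual trade-off: more to translate, but each string can be made grammatically correct in every language.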
After some probing, they explained that every morning when they booted up the diagnostic tool, for some reason that I could not fathom (I struggle to believe it was caused by concatenated strings), it would post a message that was an obscene remark in Finnish. It would brighten up their day and they didn’t want me to have it changed.
Y2K stereotypes
During a brief pause from localization, I did some work supporting an audit of readiness of key sites around Europe for the Year 2000 bug. I was tasked with accompanying a US senior manager who had sent out a massive audit questionnaire to be completed by each site. We were supposed to visit the sites, go through the questionnaire and spot-check what had been filled in. The activity brought out what might be considered stereotypical reactions of the various nations involved:
- The Italians told us to not bother coming because there wasn’t going to be a problem.
- The Germans also told us to not bother coming, as they’d filled out the questionnaire thoroughly.
- The Swiss hosted us politely and took us around the site with their completed questionnaire at the ready.
- The French hosted us politely with a mid-morning break for coffee and then a large lunch with wine followed by coffee. In fact, every time a difficult question was asked, they would tell us it was time to break for coffee. In this way, they managed to deflect any detailed inspection of their questionnaire until the taxi arrived (early) to take us back to the airport.