Few people who’ve ever dealt with language have been spared the myth that the Eskimo language has a few dozen (or, according to some bold sources, up to a hundred) words for snow. Upon hearing this mind-boggling and deeply wrong claim, non-Eskimo speakers, awestruck by the number of conceptual distinctions Eskimos can allegedly encode into their vocabulary, frown scornfully at the pathetically all-inclusive English “snow”. Eskimo speakers, in turn, are most likely struck by their inability to identify those dozens of words for snow in their own language.
I was reminded of this linguistic hoax by a blog post where an intriguing classification of languages caught my eye. The author, Doug McGowan, divided languages into high-res(olution) and low-res(olution), meaning languages that make a lot of distinctions and those that don’t, respectively. Having learned that there was no Eskimo conceptualization of snow that English wouldn’t know how to express, I wondered: “Is this division real? What aspects of language does it apply to? And how does it matter to localization?”
Sound inventory and phonological resolution
Different languages have different numbers of sounds (called phonemes), ranging from a total of 11 (Rotokas) to more than 110 (!Xóõ). The question is: does a language with 11 phonemes have a lower phonological resolution than a language with 100? The answer is: probably yes. The next question is: does it matter?
Speakers of a phonologically low-res language are unlikely ever to reach a stage where they cannot coin new words because they have run out of sounds. Language has numerous ways to compensate for a limited phonemic inventory (tone, stress, vowel length and other exotic tricks). Plus, one can always add yet another sound to an already existing word and create a new one. So, speakers of those low-res languages shouldn’t be all too worried.
Does phonorogical lesorution cause tloubles?
Phonological resolution starts causing trouble only when a speaker of a language that lacks a given phonological distinction attempts to pronounce a word from a language that makes it. Think of the English words “belly” and “berry”, which sound the same to Japanese speakers, since Japanese does not distinguish between /r/ and /l/.
Simplifying the picture a bit, the problem arises only in the direction from a high-res language with more phonological distinctions (having both sounds /r/ and /l/) to a low-res language with fewer distinctions (lacking one of the two sounds or having one sound for both “r/l”).
From a localization point of view, the problem surfaces exclusively when non-English speakers try to pronounce an English brand name. So, stick to those 11 sounds of Rotokas when coining the name of your next product, and chances are that the majority of your international customers will pronounce it in a recognizable way.
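A rough sketch of that naming advice as a check: given a candidate name and a small inventory, list whatever falls outside it. The function name and the inventory below are hypothetical, and letters are only a crude stand-in for phonemes, since spelling is not pronunciation. A serious check would work on a phonemic transcription.

```python
def sounds_outside(name, inventory):
    """Return the letters of `name` that are not in `inventory`,
    in order of first appearance (letters stand in for sounds here)."""
    missing = []
    for symbol in name.lower():
        if symbol.isalpha() and symbol not in inventory and symbol not in missing:
            missing.append(symbol)
    return missing


# Hypothetical small inventory for illustration only; it is not offered
# as an accurate description of any particular language.
SMALL_INVENTORY = set("ptkbdgaeiou")

print(sounds_outside("Toga", SMALL_INVENTORY))    # []
print(sounds_outside("Flexly", SMALL_INVENTORY))  # ['f', 'l', 'x', 'y']
```

An empty result suggests the name stays within the inventory; anything listed is a sound some customers may have to approximate.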
Lexical inventory and conceptual resolution
Different languages have different word inventories, which is fairly intuitive. Speaking of the Inuit (Eskimos) again, they may indeed have more than the average number of words for snow, spending their lives in the blizzard-swept Arctic, while Arab nomads in the Sahara may have just one word for snow, naturally reflecting their environment and the things that matter to them.
Strikingly, Arabic has just one word for sand, too, but allegedly quite a few words for camel. Similarly, English has a whole lot of words for horse reflecting different breeds, ages, colors, etc.; and each one of us has a plethora of words for different fonts on our PCs, making me wonder about our environment and the things that matter to us.
The in-laws problem
In any case, differences in lexical inventories lead to a very tangible, frequent and hard-to-crack problem: how to translate a word from a “low-res” language into a language that makes more fine-grained conceptual distinctions?
For instance, the 2-in-1 English kinship term “mother-in-law” has distinct Bulgarian translations depending on whether it means “husband’s mother” or “wife’s mother”. Without any context, the correct translation is impossible to figure out. The problem can only be alleviated by providing extensive comments about the meaning of the English word. But then again, can a native-English source writer know that in some language out there it matters whether a female relative is the mother of a wife or of a husband, let alone whether a camel is a “she-camel that gives birth only to male offspring”, and comment on it?
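To make the in-laws problem concrete, here is a minimal sketch of a lookup that refuses to translate “mother-in-law” into Bulgarian unless the source comes with a context note. The two Bulgarian words are the standard kinship terms; the context labels and the function name are hypothetical, invented for this illustration.

```python
# Bulgarian terms keyed by a context label.
# The labels themselves are hypothetical, made up for this sketch.
MOTHER_IN_LAW_BG = {
    "husbands_mother": "свекърва",
    "wifes_mother": "тъща",
}


def translate_mother_in_law(context=None):
    """Return the Bulgarian term for the given context label.

    Raises ValueError when the context is missing or unknown, because
    the English source term alone does not say which word is right.
    """
    if context not in MOTHER_IN_LAW_BG:
        raise ValueError("ambiguous source term: 'mother-in-law' needs a context note")
    return MOTHER_IN_LAW_BG[context]


print(translate_mother_in_law("wifes_mother"))
```

Calling the function without a context raises an error instead of guessing, which is roughly the position a translator is in when the source string arrives without a comment.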
Where things get really tough — grammatical resolution
Differences in grammatical resolution are the number one problem for localizers and second language learners alike. Here, we’re not talking about errors such as mispronouncing “berry” or mistranslating a wife’s mother as a husband’s mother (even though the latter may be a critical error in some families). The errors caused by grammatical resolution are too obvious and unattractive — errors that would make customers raise eyebrows and second language teachers reach for the red pen.
A beautiful illustration of discrepancy in grammatical resolution comes from the Japanese-English pair again. As mentioned in Doug’s post, Japanese has no grammatically-encoded singular-plural distinction, so a Japanese-to-English translator must search for cues to determine if a Japanese noun is to be translated with or without the -s plural marker in English.
But Japanese grammar is in no way impoverished: it makes up for the lack of plural with, for instance, its rich and complex system of honorific suffixes encoding different levels of politeness. Faced with the invariant and ubiquitous English “you”, Japanese translators have no choice but to try to determine what would be the best-suited honorific morphology for the target audience to be added to nouns, verbs, and even adjectives.
Is English a grammatically low-res language?
In general, when it comes to grammatical distinctions, English appears to be situated on the low-res end of the scale in quite a few linguistic categories:
- English has no grammatical gender. Slavic languages have up to four; some Bantu languages of Africa have as many as 10.
- English verbal tenses only distinguish between past, present and (arguably) future. The Bamileke languages of Cameroon have up to 9 tenses, drawing distinctions between, for example, near future and remote future.
- English has just one passive voice form. Quite a few Germanic languages have two passives.
- Apart from the third person -s, English has no verbal agreement. Romance languages, such as Spanish, French and Italian, have scores of agreement markers.
On top of all this, English has a typologically rare and dangerous penchant for creating new words and names by concatenating their components, as in “sample service level agreement”. Such a compound noun can be parsed in five logical ways, “sample of a service level agreement” and “agreement about a sample service level” being two of them.
While an English speaker can happily live with this ambiguity, the overwhelming majority of languages must disambiguate the compound by inserting little words between the components to encode the actual relationship between them. Undoubtedly a tough task for a translator when context is absent.
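One way to arrive at five parses is to count the ways a four-word compound can be grouped into nested pairs. The sketch below enumerates those groupings; the function names are mine, and treating every parse as a binary bracketing is an assumption made for the illustration.

```python
def bracketings(words):
    """Return every binary bracketing of `words` as nested tuples."""
    if len(words) == 1:
        return [words[0]]
    results = []
    # Try every split point, then combine each parse of the left part
    # with each parse of the right part.
    for split in range(1, len(words)):
        for left in bracketings(words[:split]):
            for right in bracketings(words[split:]):
                results.append((left, right))
    return results


def show(tree):
    """Render a nested tuple as a bracketed string."""
    if isinstance(tree, str):
        return tree
    left, right = tree
    return f"[{show(left)} {show(right)}]"


compound = ["sample", "service", "level", "agreement"]
parses = bracketings(compound)
for tree in parses:
    print(show(tree))
print(len(parses))  # 5
```

Here “[sample [[service level] agreement]]” corresponds to “sample of a service level agreement”, while “[[sample [service level]] agreement]” corresponds to “agreement about a sample service level”. A five-word compound already has 14 such groupings, so the translator’s guessing game gets harder fast.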
Englishmen are content with English…and the Japanese with Japanese
Even though English may be one of the low-res languages of the world, English speakers never feel limited by its resolution level. Similarly, Japanese speakers do not mourn the lack of a plural. After all, one can always use words like “many”, “multiple” and the like to express the idea of “many cat”.
The resolution problem surfaces only when there’s a need to convert sounds, grammatical distinctions and meaning from one language to another. The rule of thumb is this: a low-res source language needs a lot of disambiguating context information, while a high-res source language can do with less source commenting. Thus, source commenting is vital: it gives translators that critical information about the specific mother-in-law and the right honorific level with which she needs to be addressed.