As Central Asian and Middle Eastern economies grow, forward-thinking globalization directors are starting to target a group of culturally diverse markets that share one distinctive trait: bidirectional writing systems, or BiDi for short.
To help, we’re providing a “crash course” in BiDi languages for globalization managers. Stay tuned for the rest of the series, which will spotlight key implications for international software testing and for localizing content for BiDi markets.
What are BiDi Languages?
BiDi languages use writing systems in which the main text runs right-to-left (RTL), while numbers and embedded foreign words run left-to-right (LTR). Together, they are used by a very large population.
The two principal modern BiDi languages are Arabic and Hebrew. Native Hebrew speakers number roughly 5 million. Arabic boasts 295 million native speakers; it is also the liturgical language of 1.6 billion Muslims worldwide and one of the six official languages of the United Nations.
In addition, roughly 310 million people speak one of the ten major languages written in the Arabic script: Farsi (Persian), Urdu, Dari, Pashto, Uyghur, Sindhi, Malay (Jawi), Kazakh, Kurdish, and Kyrgyz.
Why This Matters
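Successfully accommodating BiDi languages is the ticket to reaching more than 600 million people in developing economies: roughly 5 million Hebrew speakers, 295 million Arabic speakers, and 310 million speakers of other Arabic-script languages. If you’re localizing for Arabic alone, you’re not even reaching half the BiDi market.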
How Did BiDi Languages Become a Thing?
Arabic, fine; Hebrew, sure. But how did so many other languages end up with bidirectional writing systems?
Primarily, it was the spread of Islam outward from the Arabian Peninsula, starting in the 7th century AD. Having been introduced to the Arabic script via the Qur’an, these non-Arabic peoples either abandoned their previous writing systems or, where a language had only been spoken, developed a written form modeled on the Arabic script.
In the 19th and early 20th centuries, European colonization left a legacy of dual languages in many of today’s BiDi markets: for example, French and Arabic co-exist in Algeria, and most Egyptians are familiar with English along with the dominant Arabic. Israel uses both Hebrew and Arabic, with English as an unofficial third, thanks to the influence of American pop culture. Even where there is no official second language, many BiDi languages rely on Western alphabets for loanwords and brand names.
Why This Matters
Mixed-language support is imperative for packaged, homegrown, and web-based software targeting BiDi languages. BiDi text often features snippets of LTR text in the middle of an RTL text block, all of which needs to be editable. How do cursors behave? Which way is “forward,” and which way is “backward”? How does selecting and highlighting text work? And how will you handle markets with more than one official language, where one is RTL and another is LTR?
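To make the mixed-direction problem concrete, here is a minimal Python sketch that splits a string into runs of the same direction. It relies on the standard library's `unicodedata.bidirectional`, which reports each character's Unicode bidirectional class. The function names `strong_direction` and `directional_runs` are hypothetical, and the rule that spaces and punctuation simply join the preceding run is a simplifying assumption. This is not the full Unicode Bidirectional Algorithm, only an illustration of why one stored string can contain several visual directions.

```python
import unicodedata

# Strong bidirectional classes: "R" (e.g. Hebrew letters) and "AL" (Arabic
# letters) are right-to-left; "L" (e.g. Latin letters) is left-to-right.
RTL_CLASSES = {"R", "AL"}
LTR_CLASSES = {"L"}


def strong_direction(ch):
    """Return "RTL", "LTR", or None for weak/neutral characters."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class in RTL_CLASSES:
        return "RTL"
    if bidi_class in LTR_CLASSES:
        return "LTR"
    return None


def directional_runs(text):
    """Split text (stored in logical order) into same-direction runs.

    Simplifying assumption: characters with no strong direction (spaces,
    digits, punctuation) are attached to the run that precedes them.
    """
    runs = []
    for ch in text:
        direction = strong_direction(ch)
        if runs and (direction is None or direction == runs[-1][0]):
            runs[-1] = (runs[-1][0], runs[-1][1] + ch)
        else:
            runs.append((direction, ch))
    return runs


# Hebrew word, an English brand name, then another Hebrew word.
sample = "\u05e9\u05dc\u05d5\u05dd Windows \u05e2\u05d5\u05dc\u05dd"

if __name__ == "__main__":
    for direction, chunk in directional_runs(sample):
        print(direction, repr(chunk))
```

The string is stored in the order it is typed (logical order), but each run is drawn in its own direction. That gap between stored order and displayed order is why “forward” for a cursor or a selection is ambiguous at the boundary between two runs.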
The good news is that the tech industry is beginning to prioritize support for BiDi languages. Internet standards built on Unicode now allow internationalized resource identifiers (IRIs) and internationalized domain names (IDNs), opening the doors for savvy globalization programs to gain first-mover advantages in these markets. Domain names and internet addresses in Arabic, Hebrew, Farsi, and other BiDi languages are now a possibility.
At any rate, BiDi scripts are among the world’s most widely used writing systems, and it’s therefore important to address the challenges of preparing software, help content, web pages, mobile devices, and so forth for BiDi languages.
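As a rough illustration of how a BiDi-language domain name can travel over infrastructure that only understands ASCII, the sketch below uses Python's built-in `idna` codec to convert a hostname to its ASCII-compatible form and back. The helper names `to_ascii_hostname` and `to_unicode_hostname` are hypothetical, and the built-in codec follows the older IDNA 2003 rules, so treat this as a demonstration of the idea and not as production-grade domain handling.

```python
def to_ascii_hostname(hostname):
    """Convert a Unicode hostname to its ASCII-compatible ("xn--") form."""
    return hostname.encode("idna").decode("ascii")


def to_unicode_hostname(ascii_hostname):
    """Convert an ASCII-compatible hostname back to Unicode for display."""
    return ascii_hostname.encode("ascii").decode("idna")


# An Arabic-script label (the Arabic word for Egypt), written with escapes
# so the source file itself stays free of mixed-direction text.
arabic_label = "\u0645\u0635\u0631"
mixed_hostname = "example." + arabic_label

if __name__ == "__main__":
    encoded = to_ascii_hostname(mixed_hostname)
    print(encoded)
    print(to_unicode_hostname(encoded) == mixed_hostname)
```

Each dot-separated label is converted on its own, so a plain ASCII label such as `example` passes through unchanged while the Arabic label is rewritten with an `xn--` prefix.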
In the weeks ahead, we’ll examine not just the technical challenges — how do you program a cursor for BiDi? — but also the challenges of user orientation — where do you put “next” and “back” buttons on a web page? — and common cultural pitfalls when targeting BiDi language regions.
But next week, we’ll dive into the key characteristics of BiDi languages and the implications for globalization.
In the meantime, please let us know if you have a special interest in localization for BiDi languages. Burning questions? Interesting experiences, insights, or “lessons learned”? Please share in the comments and we’ll do our best to address them.