Recently, I read an article in which Alan Packer, director of engineering for Facebook’s language technology team, talked about the use of neural machine translation (MT) by his company. One of the things he said was that they needed to build their own MT engine and not use a third-party service because people spoke differently on Facebook.
It got me wondering whether that was indeed true, and brought to mind a few relevant questions. How does our social media communication affect language? And how does it affect translation?
The language of FB? Not really
In the article, Packer elaborates on what speaking differently means: the use of slang, colloquialisms, acronyms coined on the go that suddenly go viral, hashtags, and sometimes even more than one language in a single post.
If you compare this to the way people communicate on other social networks, Packer could have a point. People aren’t as informal on LinkedIn, for example, because they may be conscious of their boss, work colleagues, and clients looking over their shoulder. Twitter, unlike Facebook, is not a walled garden, so conversation there still needs to be decipherable to outsiders.
However, if you compare Facebook’s language to the language we use with friends and family, how different do you think it is? Not much, except for maybe “friending” someone or “unliking” something, terminology unique to Facebook. On FB, we’re in our pajamas, so we can let our guard down a little and speak more informally.
And it’s for this reason that Facebook cannot use off-the-shelf MT solutions: those engines haven’t had the same corpus as Facebook has to learn from. Instead, they have learned from government documents and other content on the web that has more formal language. Hence, the output would not mimic the informality that Facebook users expect.
We get that. Still, it’s hard to accept that people speak uniquely on Facebook. Let’s step back a little. If you’re as old as I am, you may remember Yahoo! Messenger, MSN Messenger, or ICQ. (Does indeed sound like a long time ago, sigh.) You would also know the familiar conversation starter with strangers in those days: “ASL (Age, Sex, Location)?” Come to think of it, we talked to a lot more strangers on the Internet back then than we do now.
The point is that we had started BRB-ing and G2G-ing long before Facebook arrived on the scene, and the brevity and irreverence aren’t limited to Facebook even today.
It’s not just English
English is not the only language that people use differently on social media. People the world over are tweaking and twisting their language to the horror of traditionalists just so they can move their fingers a little less.
There’s also the niggling, larger question of whether social media, and the Internet by extension, is ruining our language, but The Economist thinks we worry for nothing and it might be right. For the record, all leading social network sites (SNSs) take language quite seriously. They invest heavily in translation and localization and often lead market launches with the local language. In many countries with emerging languages, where in-language content on the Internet is comparatively rare, Facebook and Twitter enable conversations that don’t need English.
How does social media language affect translation?
Not in a good way, as you can guess. Informal, colloquial language and slang are notoriously difficult to translate. Translating them requires that you expertly straddle two languages, and that’s a tall order even for those who know the languages but are not immersed in both cultures.
Now, most SNSs are entrusting this translation to machines for reasons of scalability and speed, which are understandable concerns. No human could ever keep up with the number of social media posts published in a single hour, and professional translation may be overkill in this case. Gisting is usually enough.
But the thing is, translation no longer sits in its own corner. It’s not just a language issue. Rather, translation will be key to whether artificial intelligence (AI) can transform our lives or not. More and more, the world is getting multilingual. Our own business and personal networks include people who don’t speak the same language as us. So, the accuracy of machine-translated status messages, conversations, and hashtags may assume more importance. Errors will not just mean misunderstandings between people, but can also impact the feasibility of AI itself.
How will this change in the Age of the Voice?
Everything on the Internet (and that’s pretty much everything in the world), we hear, will be voice-driven in the near future. How will the voice phenomenon on social media affect language? I mean, can I just say “Skype my boss brb” before I go for coffee and trust that my boss won’t think I’m gone for good? And what about using emoticons, given that they don’t convey the same meaning everywhere? My guess is that we’ll speak more properly so that still-maturing AI systems can understand us better. If so, it would be good news for instant translation solutions.
I started this post asking if Facebook’s language is really unique unto itself. It isn’t, though it is true that every new wave of technology, including the one Facebook is a part of, has influenced language. Language will only continue to evolve and grow, and technology will be left playing catch-up. And things are only going to get more interesting with voice.