Content Does Not Equal Text


In my blog post “Is There Such a Thing as a Foreign Market?” I elaborate on one of the key assertions made in the article “Sound and Vision,” published in the September issue of MultiLingual magazine. In both pieces, I suggest that the concept of the “foreign market” may be approaching the end of its useful life.

In this post, I’d like to elaborate on a second key assertion from that article: Your content does not equal its form.

The principle of separating content from form has in large part been institutionalized, thanks to advancements such as CSS, XML-based publishing, and systems designed around the concepts of component content management and single-source publishing.

Well, at least in the world of text, anyway.

If you’re talking about content that doesn’t exist natively as text—including audio and video—then as far as I’m concerned, we’re still living in the wild frontier.

But I’d like to punch it up a bit, because I don’t think that very many people would disagree with the above. So, let’s make the assertion bolder: Content does not equal text.

If not text, then what?

I’m going to need to drop a little ontology on you here.

Ontology is a field that originates in classical philosophy and attempts to answer the question “What is a thing?”, as opposed to what is a characteristic of a thing, a part of a thing, a collection of things, a permutation of a thing, and so on (that is, what is something’s “category of being”). Not surprisingly, it’s an important concept in computer science, where abstract concepts need to be made more concrete so that they can be modelled and developed into working systems.

In the mid-to-late 1990s, it was the application of ontological thinking that yielded the assertion that books, documents, web pages, etc., aren’t content; they’re forms of content.

It’s an idea that gave birth to the content management revolution, which in turn helped to create today’s content and channel-rich world. If you’re a content producer, this is why your authoring efforts can be deployed across more channels more flexibly and affordably than ever before.


Figure 1: The model of content versus form at the heart of many contemporary content management systems.

In this now-standard model, the “ground-level object” (the “thing”) is assumed to be some unit of text, encoded in some human-readable language. Pick any sentence in this blog post, for example. It’s the molecule in the model.
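To make the model concrete, here is a minimal sketch in Python of one unit of text being delivered in two forms. The function names and the two output formats are my own illustrative choices, not taken from any particular content management system.

```python
from html import escape

# One unit of content, stored once with no presentation attached.
content_unit = "Your content does not equal its form."


def as_html(text: str) -> str:
    """Render the unit as an HTML paragraph (one possible form)."""
    return f"<p>{escape(text)}</p>"


def as_plain_quote(text: str) -> str:
    """Render the same unit as a plain-text block quote (another form)."""
    return f"> {text}"


if __name__ == "__main__":
    # The content is authored once; each channel gets its own form.
    print(as_html(content_unit))
    print(as_plain_quote(content_unit))
```

Notice that even here, the thing being stored is still a string of English text. Only the packaging around it changes.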

But what if we break the model down even further and look at what comprises a unit of text?


Figure 2: Focusing on a new ground-level object: the core idea.

The new ground-level object becomes something like a sememe: a stand-alone unit encapsulating the core idea of a message. The atoms that make the molecule. Everything else is a representation of that idea, the result of that idea going through a multi-layered process of encoding.

In the case of me typing this sentence, the idea was first encoded by my brain into English words, then encoded into written English text by my fingers on the keyboard, then ultimately encoded into a digital object by my laptop.

It’s intuitive that ideas can live outside the context of written language; we have non-written conversations with other people all the time. And the existence of art and music demonstrates that a core idea can be expressed outside of the traditional construct of “human language” altogether.

But representing that idea in a computer system without anchoring it to a piece of written, human-language, interpretable text? Well, that’s a different challenge altogether.

But it’s one worth tackling, because…

Our digital world is becoming less text-centric

With the shift toward mobile computing, advances in speech recognition technology, the rise of new social channels, the proliferation of internet-connected appliances, and the ease with which new types of digital content can be created by anyone, there’s been an increase in the amount of audio, video, and other classes of non-text-based digital information out there.

And we’re no longer necessarily typing on keyboards these days either.

All of this content lives in different modalities, but we don’t want it to be siloed into those different modes. This is the grand promise of the internet: all the world’s information should be accessible and useful from anywhere. Regardless of language…and form.

One approach to indexing content by its meaning has been semantic metadata that travels with the content. This approach helps make content more findable and usable across its various modes, but it doesn’t necessarily address the challenge of the human language barrier, as these tags and namespaces are often based on (you guessed it) strings of human-language-encoded text.

You can see this limitation in action by going to any search engine that allows you to find images using text. Search for “kitten”. Now search for “小猫” or “gatito” or “Kätzchen”. In all cases, the engine will return pictures of baby cats, but you’ll notice that the sets of results you get back are entirely different for each search term. From the standpoint of findability, these pictures and videos live in different internets, segregated by language.


Figure 3: Nothing to see here. Move along.

To remove this language barrier using metadata, you’d need to either translate all the world’s metadata into all human languages, or find alternative ways of adding content descriptors that aren’t language-dependent.
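As a rough illustration of the second option, here is a minimal Python sketch in which images are tagged with a language-neutral concept identifier and search terms in any language are resolved to that identifier first. Everything in it is hypothetical: the identifier `concept:0001`, the file names, and the tiny label table are invented for the example and do not come from any real registry or search engine.

```python
# Hypothetical, language-neutral concept identifier (not a real registry entry).
KITTEN = "concept:0001"

# Labels in several languages all resolve to the same identifier.
LABELS = {
    "kitten": KITTEN,
    "小猫": KITTEN,
    "gatito": KITTEN,
    "kätzchen": KITTEN,
}

# Images carry concept identifiers rather than language-specific strings.
IMAGE_TAGS = {
    "img_001.jpg": {KITTEN},
    "img_002.jpg": {KITTEN},
    "img_003.jpg": set(),  # untagged, so never returned
}


def search(term: str) -> list[str]:
    """Resolve a search term to a concept, then return images tagged with it."""
    concept = LABELS.get(term.casefold())
    if concept is None:
        return []
    return sorted(name for name, tags in IMAGE_TAGS.items() if concept in tags)


if __name__ == "__main__":
    for term in ("kitten", "小猫", "gatito", "Kätzchen"):
        print(term, search(term))
```

In this sketch all four search terms return the same two images. The language problem has not disappeared, though. It has moved from every piece of metadata into a single label table, which still has to be built and maintained for each language.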

When it comes to voice content, for now, text appears to be the lingua franca for brokering spoken language differences too.

The translation industry seems poised to respond to the upcoming explosion of voice-based systems with a combination of speech recognition, transcription, machine translation, and text-to-speech technologies. And while these technologies are becoming impressively mature, used in tandem they will only ever be as strong as the weakest link in the chain. An error in one step will only compound as it travels through the process, yielding results that are incoherent (but can be surprisingly entertaining).
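A quick back-of-the-envelope sketch in Python shows how errors compound along such a chain. It assumes, as a simplification, that each stage fails independently of the others, and the accuracy figures are invented for illustration rather than measured from any real system.

```python
from math import prod

# Hypothetical per-stage accuracies; illustrative numbers, not measurements.
stage_accuracy = {
    "speech recognition": 0.95,
    "machine translation": 0.90,
    "text-to-speech": 0.98,
}


def end_to_end_accuracy(stages: dict[str, float]) -> float:
    """Multiply per-stage accuracies, assuming stages fail independently."""
    return prod(stages.values())


if __name__ == "__main__":
    weakest = min(stage_accuracy.values())
    overall = end_to_end_accuracy(stage_accuracy)
    print(f"weakest link: {weakest:.2f}")
    print(f"whole chain:  {overall:.4f}")
```

Under that independence assumption, the chain as a whole comes out at roughly 0.84, which is below the 0.90 of its weakest stage. Each added step can only pull the product down further.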

Is there a better way?

Text-free alternatives?

In another post, I’ll further explore how “core ideas” might be encoded without the use of text—or at least without the use of text that belongs to a specific written human language.

And maybe “meaning” doesn’t need to be encoded at all, as long as we can associate content with enough raw statistical data about its characteristics, and can call upon those characteristics in a way that consistently returns and lets us use the “things” that we’re looking for.
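Here is a minimal Python sketch of what “calling upon characteristics” could look like: each piece of content is reduced to a vector of numbers, and related content is found by comparing vectors with cosine similarity, with no text labels involved. The three-number vectors and file names are made up for the example. In practice such features would have to be extracted from the audio or image data itself, which this sketch does not attempt.

```python
from math import sqrt


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical feature vectors; the values are invented for illustration.
LIBRARY = {
    "kitten_photo.jpg": [0.9, 0.1, 0.3],
    "kitten_video.mp4": [0.8, 0.2, 0.4],
    "traffic_report.mp3": [0.1, 0.9, 0.2],
}


def most_similar(name: str, library: dict[str, list[float]]) -> str:
    """Return the other item whose vector is closest to the named item's."""
    query = library[name]
    others = (other for other in library if other != name)
    return max(others, key=lambda other: cosine_similarity(query, library[other]))


if __name__ == "__main__":
    print(most_similar("kitten_photo.jpg", LIBRARY))
```

With these invented numbers, the kitten photo's nearest neighbor is the kitten video rather than the traffic report, and no human language was consulted along the way.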

With regard to voice translation, there is already movement in the direction of removing intermediary text-based steps.

Removing those steps would cut out transcription and text-based machine translation altogether, and may support systems that work more like human interpreters do.

Download the full article: Sound and Vision: Taking Audiovisual Content from Globalization Liability to Globalization Asset


What do you think? Do you see a benefit in separating content from its form even further? How might our content management and translation practices need to evolve in a “post-text” digital world? Let’s start a dialog!