NLP in the context of linguistics, semantic AI and localization

Wali Naderi 30 Jun 2022 5 mins
RWS Crowdsourcing
In a webinar held in May 2022, Maciej Szczerba from Phoenix Technology interviewed Elsa Sklavounou from RWS about natural language processing (NLP). The conversation covered several key topics, such as linguistics, semantic AI, and the use of AI in authoring and localization. NLP sits at the crossroads of linguistics, computer science, and artificial intelligence. It concerns how human language is represented in computing, and especially how computers can be programmed to process and analyze large amounts of natural language data. More broadly, it is the application of computational techniques to analyze and synthesize natural language and speech.

NLP – the origin

Elsa began by acknowledging that NLP is quite an old topic, dating back to the 1950s. It is about establishing relationships between words in different languages, relationships that would otherwise remain obscure or unclear. It is about decoding ambiguous language and then encoding it in a way we can follow.
 
There is a significant amount of complexity involved. Even a single language such as English changes depending on the domain in which it is used, e.g., electronics, medicine, finance, and law. At RWS, NLP is being advanced to capture these domain-specific units of knowledge so that the language can be tracked exhaustively. In practice, they are represented as vectors in attention-based models driven by machine learning (ML) techniques.
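To make the idea of vectors in attention-based models a little more concrete, here is a minimal Python sketch. It is an illustration only, not a description of how RWS builds its models: the two-dimensional vectors, the small vocabulary, and the function names (softmax, attend, contextual) are all assumptions made up for this example. It shows how the same word, "charge", ends up with a different contextual vector depending on whether its neighbours come from electronics or finance.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context

# Hypothetical toy vectors: axis 0 ~ "electronics", axis 1 ~ "finance".
VECTORS = {
    "charge": [1.0, 1.0],   # ambiguous on its own
    "battery": [2.0, 0.0],
    "voltage": [1.5, 0.2],
    "invoice": [0.0, 2.0],
    "fee": [0.2, 1.5],
}

def contextual(word, neighbours):
    """Blend the neighbours' vectors, weighted by similarity to `word`."""
    keys = [VECTORS[n] for n in neighbours]
    return attend(VECTORS[word], keys, keys)

if __name__ == "__main__":
    print(contextual("charge", ["battery", "voltage"]))  # leans toward axis 0
    print(contextual("charge", ["invoice", "fee"]))      # leans toward axis 1
```

Real models learn vectors with hundreds or thousands of dimensions from data; the hand-written numbers here only serve to show the mechanism.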

NLP in the era of cloud computing

With computing power increasing rapidly, even smaller machines can now process a lot of information. In NLP, this has made it possible to process large volumes of data and provide insights into what language means in different contexts and domains. The field is evolving further to cover human aspects such as judgment, intention, common sense, and the behavioral characteristics of people interacting through NLP. It can even translate technical language: for example, codified language can be compiled and transformed into text that is easy to follow even for a non-technical person.
 
In this era of high content speed, NLP is evolving to empower users worldwide to consume content for any purpose, whether educational, commercial, or anything else. Elsa highlighted that today's consumers expect a response within about 5 seconds, and if the lag extends to 12 seconds, the business has already lost the customer. Therefore, NLP needs to be fast, accurate, and responsive, whether it powers predictive text, a smart assistant, search results, or any other application.

What is GPT-3 and what is it not?

Speaking about GPT-3, a hot topic in NLP, Elsa described it as an experimental attempt to browse the entire web in all languages and across all domains. It was done so that all languages can be recognized and then reproduced. It is a new way of compiling data through the graphical user interface. As a result, GPT-3 has the general knowledge to interact but is not an expert in any domain by itself. Virtual agents such as Siri on iPhone or Alexa on Amazon’s audio devices show a similar limitation: these assistants don’t serve us well when we need consultation that requires deep knowledge.

Need for semantic AI: The behavioral and ethical aspects

GPT-3’s limitation brings up the need for semantic AI that can leverage knowledge graphs to add context to our mission by going beyond the text or keywords. This brings us to the behavioral aspect of human judgment. Today we are in the world of spatial computing: the metaverse, extended reality, and virtual reality. According to Elsa, these are all emerging from the way NLP is being exposed. The technology’s evolution and performance are now supporting experimental ways to mirror the human brain (not to replace it).
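As a rough illustration of how a knowledge graph can add context beyond keywords, consider the small Python sketch below. Everything in it is hypothetical: the triples, the entity names, and the disambiguate function are invented for this example and do not describe any RWS product. The keyword "jaguar" alone is ambiguous, but the facts linked to each candidate entity in the graph let us pick the meaning that fits the surrounding words.

```python
# Hypothetical knowledge graph stored as (subject, relation, object) triples.
TRIPLES = [
    ("jaguar_animal", "is_a", "big_cat"),
    ("jaguar_animal", "lives_in", "rainforest"),
    ("jaguar_car", "is_a", "car_brand"),
    ("jaguar_car", "has_part", "engine"),
]

# Hypothetical mapping from a surface keyword to candidate entities.
LABELS = {"jaguar": ["jaguar_animal", "jaguar_car"]}

def neighbours(entity):
    """Return every object directly linked to the entity in the graph."""
    return {obj for subj, _, obj in TRIPLES if subj == entity}

def disambiguate(keyword, context_words):
    """Pick the candidate whose graph neighbours overlap most with the context.

    Returns None when the keyword is unknown or the context gives no evidence.
    """
    context = set(context_words)
    scored = [(len(neighbours(c) & context), c) for c in LABELS.get(keyword, [])]
    best_score, best = max(scored, default=(0, None))
    return best if best_score > 0 else None

if __name__ == "__main__":
    print(disambiguate("jaguar", ["engine", "speed"]))
    print(disambiguate("jaguar", ["rainforest", "river"]))
```

A plain keyword search would treat both sentences the same way; the graph supplies the extra relationships needed to tell them apart.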
 
As an example, Elsa referred to an AI tool that she uses to help her with authoring. She clarified that she is always the creator; the AI simply handles the mundane tasks of sentence formation and typing. This can be a great help in the context of social media, corporate communication, or even day-to-day tasks. But if you are authoring a creative book, this tool won’t help, as it cannot feel and replicate the emotions that ultimately help authors connect with their audiences by sharing feelings or expertise.
 
While NLP’s application is evolving beyond machines to include humans, with behavioral and ethical aspects attached to it, it does not reach the context- or situation-specific creativity of the human brain.

AI is ideal for technical content authoring

NLP is better suited to technical content authoring than to creative writing, as there is a lot of standardization involved. Subject matter expertise can be used to standardize all aspects of technical language, including grammar, format, representation, distribution, and delivery, and so transform content into an experience. In addition, it can bring the consistency that corporations want, so that the relevant data represents the company’s principles and complies with regulations and standards.
 
In a corporate environment, content needs to be accessible, interoperable, findable, and reusable. ML helps retrieve information or knowledge from the large amount of content that a corporation has stored in a central repository.
 
Today, ML is used in media, whether it is Netflix or a music streaming app. These apps remember your preferences to generate results that match your interests. Metadata is used to retrieve the most authentic media, text, or any other type of human production that corporations consider their legacy, representing corporate reputation and standards.
 
This is where we enter the area of subject matter expertise, which is defined by knowledge. Metadata and annotations are used to represent this knowledge. In addition, metadata helps generate knowledge even from outside the content that has been annotated and represented by the knowledge graphs. Thus, it brings the accessibility, interoperability, findability, and reusability that the corporate world wants today.
 
Elsa beautifully summed it up, stating that AI is a mirror of ourselves, not a representation of our brain, social behavior, attitude, judgments, ethical standards, or particularities between these visualizations and cultures. It is more like a simulation of what would have happened if? In some technical fields, results have been impressive, e.g., robotics running tiny theological operations or self-driving cars.
 
AI compiles the knowledge gained from several experiments over the years to reproduce the actions that deliver the result we want in a given situation.

Machine learning and AI: Relevance of manual translation

With technological advances, manual translation has been reinvented as localization, and translators now serve as subject matter experts (SMEs) who build the knowledge graphs for automation and provide specialized translation services. As in other industries, only the repetitive tasks have been automated, while the creative work and the final sign-off remain manual.
 
There are content technology solutions that help corporations ‘create’ content. All the creative content goes through either the company’s internal SMEs or RWS’s SMEs, who are located across 85 countries and cover 120 languages. SMEs localize the content and align it with the expected digital content consumption experience.
 
Automation, all by itself, will never work in language translation. SMEs are absolutely needed because they hold the accountability, serving as quality auditors, compliance officers, chief information officers, and even taking part in the hiring process. Elsa summed up that the borderless approach to data needs to be regulated, and there are public institutions that monitor this digital data currency. The role of language translators has thus evolved into that of experts with a much larger and more critical responsibility.


Wali Naderi
Author

Senior Product Marketing Manager
Wali Naderi has 20 years' experience in the IT industry with some well-known IT organizations in various positions (Product Management, Product Marketing, and Sr. Alliance Management). He joined RWS in late 2020 as a Senior Product Marketing Manager, focusing on the partner community.