Creating a voice app to drive traffic to your brand

Hinde Lamrani 15 Mar 2021
What is a voice app, and why does your business need one? The simple answer is that with the digital world rapidly moving towards voice (or arguably already there), your users demand it. 41% of adults use voice search at least once per day. If you don’t prepare for voice, you’ll fall miles behind your competitors. Personal assistants, phones, cars, lights, ovens, fridges and even toilets: just about everything reacts to voice now, and consumers love it. And for businesses? Optimizing your digital presence to take advantage of voice search is one of the keys to unlocking sales growth and keeping up with consumer trends.

How can your content be read aloud by the voice assistant to the user? There are two ways to achieve voice visibility: either your website’s content ranks high for voice queries (you can’t control this; you can only optimize for it and hope for the best) or you create your own voice app. The latter puts you in control of your presence in voice search: give your app a name and users can invoke it by name, just like going directly to a website. Or, through implicit invocation, the voice assistant can suggest your voice app based on the user’s query.

You can make your app anything you want. For example, a transactional voice app allows a client to order from you by talking to their device, without typing on a computer or mobile device. Or you could build a customer service app that your existing customers can invoke through their voice assistant to get a question answered. This could be a solution for a specific need or just a general enquiry; you can make it as specific or as broad as you want. The app’s contents and information are based on research into the types of tasks your customers want to perform and the data they need to complete them successfully.

The bottom line: if you want to be visible online, you create a website. If you want to be found by voice, you optimize your website for voice or, even better, create a voice app. So, where do you start?

Do you need a voice app?

The first question your business needs to ask itself is “Do we need a voice app?” Not everyone does. If your business isn’t one that consumers would reach by speaking to a voice device, then the expense is unwarranted. However, if you’re consumer-facing and delivering any kind of product or service, there’s a good chance that your customer base is moving towards voice search. If your research concludes that a voice app will enhance your channels of communication with your customers, there are several considerations to clarify before development starts.

What you need to consider before building a conversational voice app

There are a lot of considerations that go into building any app. Far too many to put in a single blog post. So, what we’ll focus on here are the high-level considerations you need in place to be able to brief the voice app development team with your requirements. There are four main components to planning your voice app:
  1. Understand what “intent” means in the context of voice;
  2. Make sure you consider conversational design;
  3. Create the right app—a skill app, an action app or both; and
  4. Figure out the implications of dominant search engines and locales.
Let’s look at these four areas in a little more detail.

Do your research to understand “user intent”

The key to good customer experience is making sure your voice app recognizes what people are asking it and provides relevant results. But there are several ways that people might ask for the same thing. Their “intent” is the same, but people of different age groups, genders, cultures, interests and backgrounds may phrase it in completely different ways. Instead of focusing on just keywords, capturing the intent of what someone is saying is the path to voice success.

To do this, you train your app to understand the intent of the speaker and answer their question in the best way. The natural language understanding engine, or NLU, is the technology that identifies the intent of the user query, and the way to train it is to give it as much data, in the form of training phrases, as possible. You want the NLU to recognize the practically unlimited number of ways someone could ask for the same thing. You’ve already built the intents and possible answers into your app; the training phrases are what teach the NLU to identify the right intent no matter how the user words the request.

Now imagine your app goes multilingual. You need to do the same work for each target language, because simply translating the training phrases is not the best solution: people in each language have their own ways of asking.
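To make the idea of training phrases concrete, here is a deliberately simplified sketch. It is not how a production NLU works (those rely on trained statistical models); it only scores word overlap between what the user said and each training phrase. The intent names, the phrases and the threshold are all invented for illustration.

```python
import re

# Hypothetical intents and training phrases, invented for illustration.
TRAINING_PHRASES = {
    "check_stock": [
        "do you have these shoes in red",
        "are the red shoes in stock",
        "can I get this pair in another colour",
    ],
    "opening_hours": [
        "when are you open",
        "what time do you close today",
        "are you open on sunday",
    ],
}


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def overlap(a, b):
    """Jaccard similarity: shared words divided by all distinct words."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def detect_intent(utterance, threshold=0.2):
    """Return (intent, score) for the closest training phrase.

    Falls back to "fallback" when nothing scores at least `threshold`
    (an arbitrary value chosen for this sketch).
    """
    tokens = tokenize(utterance)
    best_intent, best_score = None, 0.0
    for intent, phrases in TRAINING_PHRASES.items():
        for phrase in phrases:
            score = overlap(tokens, tokenize(phrase))
            if score > best_score:
                best_intent, best_score = intent, score
    if best_intent is None or best_score < threshold:
        return "fallback", best_score
    return best_intent, best_score
```

Even in this toy version, a wording that appears nowhere in the list, such as "have you got the red shoes in stock", still lands on the stock-check intent because it shares enough words with one of the phrases. The more varied the training phrases, the more wordings are caught, which is the point of collecting them per language rather than translating one list.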

Conversational design—keeping the consumer engaged

To keep the customer journey active and the user engaged, the dialog manager (the multi-turn conversation engine responsible for replies) should be programmed to keep the conversation alive even if the original request can’t be fulfilled. For example, if the question is “do you have these shoes in red?” and the answer is “no”, a bare “no” would end the conversation. Instead, the reply could be “no, but we do have them in burgundy”. This is called a repair scenario.
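A repair scenario can be sketched in a few lines. The example below is only an illustration under stated assumptions: the inventory, the colour alternatives and the function name are hypothetical, and a real dialog manager would be configured in the voice platform's own tooling rather than hand-written like this.

```python
# Hypothetical inventory, invented for illustration: product -> colours in stock.
INVENTORY = {
    "trail shoes": ["burgundy", "black"],
    "sandals": [],
}

# Hypothetical mapping from a requested colour to close alternatives.
SIMILAR_COLOURS = {
    "red": ["burgundy", "pink"],
    "navy": ["black", "blue"],
}


def reply_to_colour_query(product, colour):
    """Answer a stock question without ever ending on a bare "no"."""
    in_stock = INVENTORY.get(product, [])

    # Happy path: the requested colour is available.
    if colour in in_stock:
        return f"Yes, we have the {product} in {colour}. Would you like to order a pair?"

    # Repair 1: offer the closest colour that is available.
    for alternative in SIMILAR_COLOURS.get(colour, []):
        if alternative in in_stock:
            return f"No, but we do have the {product} in {alternative}. Would you like to hear more?"

    # Repair 2: no close match, so list whatever is in stock.
    if in_stock:
        options = " and ".join(in_stock)
        return f"No, but the {product} come in {options}. Would one of those work?"

    # Repair 3: nothing in stock, so steer the conversation elsewhere.
    return f"Sorry, the {product} are out of stock right now. Can I help you find something else?"
```

Every branch ends with a question, so the user always has an obvious next thing to say. Asking for red trail shoes produces the burgundy offer from the example above.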

Skill versus action apps—get it right for the right voice assistant

When you want to be visible to a consumer asking their voice device a question, you need to know whether to build your voice app as a skill or an action. Skills and actions are two different ways of coding your app to suit the voice device your customer base uses. This is a crucial point: it isn’t like developing a website that can be seen in any web browser, such as Chrome, Internet Explorer, Safari or Firefox. You are creating your voice app for a specific voice assistant. Each type of app relates to one of the main assistants: you need to develop a skill for Alexa users and an action for Google Assistant.

Here is how it works: when a consumer talks to Google Assistant, for example, and says the name of your app, that is called explicit invocation. The assistant then takes the consumer to your voice app, which is programmed to operate through Google Assistant.

Skill versus action is a fundamental programming difference, which is why you need to commit to one or the other, or both. Whichever your target market uses the most will determine which way you go. Chances are you’re going to have to develop both if you’re rolling out globally. You may even need to develop an app for a different assistant if you’re targeting Russia, South Korea or China, since these countries have other voice assistants that are more popular than Alexa and Google Assistant in their respective markets.

Device technology

Because your app has to sit within a voice assistant’s platform, you have to consider device technology (skill, action or other app). Which of the two main voice device technologies dominates in your market? Is it a Google or an Amazon product? Most English-speaking countries tend to have a mix, but one will still have a greater market share than the other. In the US, for example, Amazon is dominant with 70% of the market. In some areas of the world, however, neither of the big two is the leading voice search device. In South Korea, for example, Google is popular, but not as widely used as Naver (the predominant search engine in South Korea), so it would be a good idea to develop two apps: a Naver voice app and a Google voice app. The predominant voice device in the territory determines the type of app you build (or apps, if you want to cover more than one type of voice device).
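One way to turn this into a brief for a development team is a simple market-to-assistant lookup. The sketch below is illustrative only: the table holds just the two markets discussed above, the ordering within each market is a simplification, listing Google Assistant second for the US is an assumption, and the country codes and function name are invented. Real decisions should rest on current market research.

```python
# Illustrative table: market -> assistants to target, most popular first.
# Only the US and South Korea examples come from the text above; the
# second US entry is an assumption made for this sketch.
DOMINANT_ASSISTANTS = {
    "US": ["Alexa", "Google Assistant"],
    "KR": ["Naver", "Google Assistant"],
}

# The app type each assistant requires. Assistants missing from this
# table are labelled with the generic term "voice app".
APP_TYPE = {
    "Alexa": "skill",
    "Google Assistant": "action",
}


def plan_builds(markets, max_per_market=2):
    """Return, per market, the apps to build for its leading assistants."""
    plan = {}
    for market in markets:
        assistants = DOMINANT_ASSISTANTS.get(market)
        if assistants is None:
            # No data for this market: flag it instead of guessing.
            plan[market] = ["research needed"]
            continue
        plan[market] = [
            f"{assistant} {APP_TYPE.get(assistant, 'voice app')}"
            for assistant in assistants[:max_per_market]
        ]
    return plan
```

Calling `plan_builds(["US", "KR", "FR"])` lists an Alexa skill and a Google Assistant action for the US, a Naver voice app and a Google Assistant action for South Korea, and flags France as needing research because the table has no entry for it.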

You have to address voice

In 2020, Statista valued the smart speaker market at $7.2 billion. By 2025, it predicts the market will be worth $35.5 billion. Businesses with an online presence have to consider voice, even if the decision is that now is not the right time. For most consumer-facing brands, however, now is definitely the right time, because voice activation and search are only getting more popular. Get your research and technology right for all your markets and you’ll be in a good position to take advantage of voice in any language.
Author

Hinde Lamrani

Hinde Lamrani is RWS’s in-house subject matter expert for International Search, specifically focusing on SEO/SEM. She has a decade of experience helping organizations achieve online visibility on a global scale. Hinde is fascinated by IoT and next-gen technology. She enjoys occasional horseback riding and is fluent in three languages.