Speech technology can truly bring the customer experience to life, but it takes a unique blend of creativity, technology, and hardware to do so. We recently interviewed LumenVox Software Engineer Shaun McThomas to gain his perspective on the art of integrating speech technology with IVRs to enhance the customer experience.
Hi, my name is Shaun McThomas. I’m a Software Engineer at LumenVox, and today I’m here to talk to you about creating next-generation conversational Interactive Voice Response systems, or IVRs for short.
What are the biggest issues facing customers and IVRs today?
One of the biggest issues with IVRs today is that callers are forced to follow a rigid script. It’s not a conversation, it’s an interrogation. First, they are asked for one bit of information; then they are asked for another bit, and another. There is no flow like you would have with another person, just a series of “painful, tiny steps” that makes the whole process stiff and uncomfortable for the caller.
Another issue is that you often get trapped in “IVR jail,” where there is no escape route. You are forced to listen to the very end of the prompt before you can respond, “This isn’t what I want; let me go back to the main menu.”
What do you think the solution is to address these pain points?
Most of these issues are easily addressed with an artful blend of good design and use of modern speech recognition technologies, what LumenVox calls Speech Art. If you listen to the very best contact center agents within a business and model how they solve the same issues and how they question a caller, you’ll understand how callers really ask questions and can provide very lifelike IVR responses. By following this model, you can produce frictionless, intuitive (and personalized) interactions with callers, radically improving their experience.
The very first thing a good IVR should do is quickly identify who the caller is and confirm that assumption. Remember, a blend of technologies can make this easier. Look up the phone number they are calling from in your back-end systems and see if you can determine their identity from that. They are more than likely calling from their cell phone, which provides a unique identifier. If needed, you can use both speech recognition and voice biometric authentication to make that process simple and easy.
Once you’ve identified the caller, use data available from your back-end systems to anticipate the reason for calling and personalize the next steps.
For example, if you’re a power company and a customer’s home is in the middle of a known power outage, assume that’s the reason for the call. Likewise, if you’re an airline and the caller has a flight booked with you that departs within the next 24 hours, assume that’s the reason for the call. Now that you’ve made an assumption, confirm that’s the reason they are calling with a simple yes/no prompt and, if the answer is yes, provide them with the appropriate information. If they are not calling for that reason, ask them why they’ve called, but allow them to use natural language to answer. And always give them a way to correct themselves.
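The assume-then-confirm step described above can be sketched in a few lines of Python. This is only an illustration: the `CallerRecord` fields, the reason labels, and the prompt wording are hypothetical and do not come from any LumenVox product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class CallerRecord:
    # Hypothetical back-end record; field names are illustrative only.
    name: str
    in_known_outage: bool = False
    next_departure: Optional[datetime] = None


def anticipate_reason(record: CallerRecord, now: datetime) -> Optional[str]:
    """Return a guessed reason for the call, or None if there is no good guess."""
    if record.in_known_outage:
        return "power_outage"
    if record.next_departure is not None:
        time_left = record.next_departure - now
        # Only assume the flight is the reason if it departs within 24 hours.
        if timedelta(0) <= time_left <= timedelta(hours=24):
            return "upcoming_flight"
    return None


def next_prompt(record: CallerRecord, now: datetime) -> str:
    """Use a yes/no confirmation when we have a guess, else an open question."""
    reason = anticipate_reason(record, now)
    if reason == "power_outage":
        return f"Hi {record.name}, are you calling about the power outage in your area?"
    if reason == "upcoming_flight":
        return f"Hi {record.name}, are you calling about your upcoming flight?"
    return f"Hi {record.name}, how can I help you today?"
```

If the caller answers no to the confirmation, the dialog would fall back to the open question and let them answer in their own words.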
How does Conversational IVR work, exactly?
Conversational IVRs work by leveraging three key technologies: ASR, NLU, and TTS. These aren’t the only pieces of the puzzle, but they are important ones. Let’s talk a little about each.
First, there is Text to Speech (or TTS for short). TTS is the method used to turn text into speech. This is key because it lets you ask questions quickly and easily. It is important to use TTS instead of recordings so that questions can be personalized. For example, when a caller first calls in and you want to verify them, you can use their name and directly ask if it’s them.
Next, there is the Automatic Speech Recognizer (or ASR for short). An ASR’s job is to take speech, recognize it as something meaningful, and then turn it into something useful like text. There are lots of types of ASRs. LumenVox’s new transcription ASR uses machine learning techniques such as deep neural networks for natural language processing. This is effective for transcribing human speech into text. Before this sort of technology existed, you had to constrain your recognizer to a limited set of words (called a grammar), and it could recognize only those words. Modern NLP models can recognize a large set of words, allowing you to speak naturally, and the recognizer will feed you back the raw transcribed text. Once the ASR has done its job, we have that text.
Finally, we need to use another technology, which is Natural Language Understanding, or NLU for short. NLU takes this text and converts it into meaning, expressed as intents and slots. For example:
The caller can say: “I want to fly from New York to LA.” From that, we parse out the intent “to fly,” the origin “New York,” and the destination “LA.”
Using these three technologies we can create a conversation with a caller rather than a scripted interrogation. First, we would use TTS to ask the caller a question, then an ASR to get text back from the caller’s response, then NLU to understand that response, and then finally use that understanding to figure out what additional information we need from the caller or to process the caller’s request.
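To make the idea of intents and slots concrete, here is a toy Python sketch of the kind of structured result an NLU step might return for that sentence. It is an assumption-laden stand-in: real NLU engines use trained models rather than a single regular expression, and the names `parse_utterance`, `intent`, `origin`, and `destination` are hypothetical.

```python
import re
from typing import Dict, Optional

# Toy pattern for a single intent. A real NLU engine would use a trained
# model; this regex only illustrates the shape of the output.
FLIGHT_PATTERN = re.compile(
    r"\bfly from (?P<origin>.+?) to (?P<destination>.+?)[.!?]?$",
    re.IGNORECASE,
)


def parse_utterance(text: str) -> Optional[Dict[str, object]]:
    """Map transcribed text to an intent plus slots, or None if not understood."""
    match = FLIGHT_PATTERN.search(text.strip())
    if match is None:
        return None
    return {
        "intent": "fly",
        "slots": {
            "origin": match.group("origin"),
            "destination": match.group("destination"),
        },
    }


result = parse_utterance("I want to fly from New York to LA.")
print(result)
```

The point is only the shape of the data: one intent that names what the caller wants, and named slots that carry the details.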
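The ask, listen, understand, and follow-up cycle described above could look roughly like the Python sketch below. The three callables stand in for TTS, ASR, and NLU; they are placeholders supplied by the caller of the function, not real LumenVox APIs, and the flight-booking slots and prompt wording are assumptions made for illustration.

```python
from typing import Callable, Dict, Optional

# Hypothetical follow-up questions, one per piece of information we need.
SLOT_PROMPTS = {
    "origin": "Where are you flying from?",
    "destination": "Where are you flying to?",
}


def run_dialog(
    speak: Callable[[str], None],  # TTS stand-in: plays text to the caller
    listen: Callable[[], str],  # ASR stand-in: returns transcribed text
    understand: Callable[[str, Optional[str]], Dict[str, str]],  # NLU stand-in
    max_turns: int = 5,
) -> Dict[str, str]:
    """Keep asking until every required slot is filled or we run out of turns."""
    slots: Dict[str, str] = {}
    asked: Optional[str] = None  # the slot our last question was about, if any
    speak("How can I help you today?")
    for _ in range(max_turns):
        text = listen()
        slots.update(understand(text, asked))
        missing = [name for name in SLOT_PROMPTS if name not in slots]
        if not missing:
            speak(f"Booking a flight from {slots['origin']} to {slots['destination']}.")
            return slots
        asked = missing[0]
        speak(SLOT_PROMPTS[asked])
    # Escape route: hand off instead of looping forever.
    speak("Let me transfer you to an agent.")
    return slots
```

Because the loop only asks about what is still missing, a caller who gives everything in one sentence is never walked through the questions one at a time.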
What sets LumenVox apart?
At LumenVox, we’re creating a Configurable AI Gateway that makes it easy to integrate many different NLU engines with our ASR. This approach opens the possibility to use widely available NLU platforms from IBM, Google, Microsoft, Amazon, and others with your existing IVR along with LumenVox ASR, TTS and Voice Biometrics.
Many technology vendors don’t offer choices in the combinations of ASR, NLP, and NLU that you can use to build a solution. Their entire suite of technology and tools is often proprietary and expensive, and because it’s proprietary, it involves the use of expensive, dedicated professional service teams. At LumenVox we want to be able to easily integrate existing technologies with our speech recognition, text-to-speech, and voice biometrics software as part of the solution stack. We want to take the technology that’s already out there and make it easier for our customers to use.
Ready to take your contact center to the next level by implementing a conversational IVR? Contact us today!