Speech technology can truly bring the customer experience to life, but it takes a unique blend of creativity, technology, and hardware to do so. We recently interviewed LumenVox Software Engineer Shaun McThomas to gain his perspective on the art of integrating speech technology with IVRs to enhance the customer experience.
Hi, my name is Shaun McThomas. I’m a Software Engineer at LumenVox, and today I’m here to talk to you about creating next-generation conversational Interactive Voice Response systems, or IVRs for short.
What are the biggest issues facing customers and IVRs today?
One of the biggest issues with IVRs today is that callers are forced to follow a rigid script. It’s not a conversation, it’s an interrogation. First they are asked for one bit of information, then for another bit, and then another. There is no flow like you would have with another person, just a series of “painful, tiny steps” that makes the whole process rigid and uncomfortable for the caller.
Another issue is that you often get trapped in IVR jail, where there is no escape route. You are forced to listen to the very end of the prompt before you can respond, “This isn’t what I want; let me go back to the main menu.”
What do you think the solution is to address these pain points?
Most of these issues are easily addressed with an artful blend of good design and use of modern speech recognition technologies, what LumenVox calls Speech Art. If you listen to the very best contact center agents within a business and model how they solve the same issues and how they question a caller, you’ll understand how callers really ask questions and can provide very lifelike IVR responses. By following this model, you can produce frictionless, intuitive (and personalized) interactions with callers, radically improving their experience.
The very first thing a good IVR should do is quickly identify who the caller is and then confirm that assumption. Remember, a blend of technologies can make this easier. Look up the phone number they are calling from in your back-end systems and see if you can determine their identity from that. More than likely they are calling from their cell phone, which provides a unique identifier. If needed, you can use both speech recognition and voice biometric authentication to make that process simple and easy.
Once you’ve identified the caller, use data available from your back-end systems to anticipate the reason for calling and personalize the next steps.
For example, if you’re a power company and a customer’s home is in the middle of a known power outage, assume that’s the reason for the call. Likewise, if you’re an airline and they have a flight booked with you that departs within the next 24 hours, assume that’s the reason for the call. Now that you’ve made an assumption, confirm that’s the reason they are calling with a simple yes/no prompt and, if the answer is yes, provide them with the appropriate information. If they are not calling for that reason, ask them why they’ve called, but allow them to use natural language to answer. And always give them a way to correct themselves.
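As a rough illustration of the identify-then-anticipate idea, here is a minimal Python sketch. Everything in it is hypothetical: the `Customer` record, the in-memory `CUSTOMERS` table standing in for a back-end lookup, and the function names are assumptions made for the example, not part of any LumenVox product. It also leaves out the identity confirmation step and simply picks the opening prompt.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Customer:
    name: str
    in_outage_area: bool = False               # hypothetical flag from an outage system
    next_departure: Optional[datetime] = None  # hypothetical result of a booking lookup


# Hypothetical stand-in for a back-end lookup keyed by the caller's phone number.
CUSTOMERS = {
    "+16195550100": Customer("Pat", in_outage_area=True),
    "+16195550101": Customer("Sam", next_departure=datetime(2024, 5, 2, 9, 0)),
}


def identify_caller(phone_number: str) -> Optional[Customer]:
    """Return the customer on file for this number, or None if unknown."""
    return CUSTOMERS.get(phone_number)


def anticipate_reason(customer: Customer, now: datetime) -> Optional[str]:
    """Guess why the customer is calling; None means ask an open question."""
    if customer.in_outage_area:
        return "power_outage"
    if customer.next_departure is not None:
        time_left = customer.next_departure - now
        if timedelta(0) <= time_left <= timedelta(hours=24):
            return "upcoming_flight"
    return None


def opening_prompt(phone_number: str, now: datetime) -> str:
    """Choose a yes/no confirmation when we have a guess, else an open question."""
    customer = identify_caller(phone_number)
    if customer is None:
        return "Thank you for calling. How can I help you?"
    reason = anticipate_reason(customer, now)
    if reason == "power_outage":
        return f"Hi {customer.name}, are you calling about the power outage in your area?"
    if reason == "upcoming_flight":
        return f"Hi {customer.name}, are you calling about your upcoming flight?"
    return f"Hi {customer.name}, how can I help you today?"
```

If the caller answers no to the yes/no prompt, the IVR would fall back to the open question and let them answer in natural language.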
How does Conversational IVR work, exactly?
Conversational IVRs work by leveraging three key technologies: ASR, NLU, and TTS. These aren’t the only pieces of the puzzle, but they are important ones. Let’s talk a little about each.
First, there is Text to Speech (or TTS for short). TTS turns text into speech, which makes it quick and easy to ask questions. It is important to use TTS instead of recordings so that questions can be personalized. For example, when a caller first calls in and you want to verify them, you can use their name and directly ask if it’s them.
Next, there is the Automatic Speech Recognizer (or ASR for short). An ASR’s job is to take speech, recognize it as something meaningful, and then turn it into something useful like text. There are lots of types of ASRs. LumenVox’s new transcription ASR uses machine learning techniques such as deep neural networks for natural language processing. This is effective for transcribing text from human speech. Before this sort of technology existed, you had to constrain your recognizer to a limited set of words (called a grammar), and those were the only words it could recognize. Modern NLP models can recognize a large set of words, allowing you to speak naturally, and the ASR will feed back the raw transcribed text. Once the ASR has done its job, we have that text.
Finally, we need to use another technology, which is Natural Language Understanding, or NLU for short. NLU takes this text and converts it into meaning, expressed as intents and slots. For example:
The caller can say: “I want to fly from New York to LA.” And we parse out the intent “to fly,” the origin “New York,” and the destination “LA.”
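To make the idea of intents and slots concrete, here is a toy Python sketch of what an NLU result for the flight example could look like. The `NluResult` type, the `parse` function, and the slot names `origin` and `destination` are assumptions for illustration only. A real NLU engine uses trained models, not a single regular expression.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class NluResult:
    intent: Optional[str]                       # what the caller wants to do
    slots: Dict[str, str] = field(default_factory=dict)  # details of the request


# Toy pattern standing in for a real NLU engine. It only handles
# utterances shaped like "... fly from <origin> to <destination>".
_FLIGHT = re.compile(
    r"\bfly from (?P<origin>.+?) to (?P<destination>.+?)[.?!]?$",
    re.IGNORECASE,
)


def parse(utterance: str) -> NluResult:
    """Turn transcribed text into an intent plus slots, if the pattern matches."""
    match = _FLIGHT.search(utterance.strip())
    if match is None:
        return NluResult(intent=None)
    return NluResult(
        intent="fly",
        slots={
            "origin": match.group("origin"),
            "destination": match.group("destination"),
        },
    )
```

Calling `parse("I want to fly from New York to LA.")` yields the intent `fly` with `origin` set to `New York` and `destination` set to `LA`.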
Using these three technologies, we can create a conversation with a caller rather than a scripted interrogation. First, we would use TTS to ask the caller a question, then an ASR to get text back from the caller’s response, then NLU to understand that response, and then finally use that understanding to figure out what additional information we need from the caller or to process the caller’s request.
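The ask, listen, understand, and decide cycle can be sketched as a small loop. This is a simplified Python sketch under stated assumptions: `speak`, `listen`, and `understand` are placeholder callables standing in for TTS, ASR, and NLU, and the required slots and follow-up questions are invented for a flight request. It is not how any particular platform implements dialog management.

```python
from typing import Callable, Dict, Optional

# Assumed for this example: a flight request needs these two slots.
REQUIRED_SLOTS = ("origin", "destination")

QUESTIONS = {
    "origin": "Where are you flying from?",
    "destination": "Where are you flying to?",
}


def next_question(slots: Dict[str, str]) -> Optional[str]:
    """Return the prompt for the first missing slot, or None when complete."""
    for name in REQUIRED_SLOTS:
        if name not in slots:
            return QUESTIONS[name]
    return None


def run_dialog(
    speak: Callable[[str], None],                 # TTS stand-in
    listen: Callable[[], str],                    # ASR stand-in, returns text
    understand: Callable[[str], Dict[str, str]],  # NLU stand-in, text -> slots
    max_turns: int = 5,
) -> Dict[str, str]:
    """Ask, listen, and understand until every required slot is filled."""
    slots: Dict[str, str] = {}
    prompt: Optional[str] = "How can I help you?"
    turns = 0
    while prompt is not None and turns < max_turns:
        speak(prompt)
        slots.update(understand(listen()))
        prompt = next_question(slots)  # only ask for what is still missing
        turns += 1
    return slots
```

Because the loop only asks for missing slots, a caller who gives everything in one sentence is never walked through the remaining questions.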
What sets LumenVox apart?
At LumenVox, we’re creating a Configurable AI Gateway that makes it easy to integrate many different NLU engines with our ASR. This approach opens the possibility to use widely available NLU platforms from IBM, Google, Microsoft, Amazon, and others with your existing IVR along with LumenVox ASR, TTS and Voice Biometrics.
Many technology vendors don’t offer choices in the combinations of ASR, NLP, and NLU that you can use to build a solution. Their entire suite of technology and tools is often proprietary and expensive and, because it’s proprietary, involves the use of costly, dedicated professional service teams. At LumenVox we want to be able to easily integrate existing technologies with our speech recognition, text-to-speech, and voice biometrics software as part of the solution stack. We want to take the technology that’s already out there and make it easier for our customers to use.
Ready to take your contact center to the next level by implementing a conversational IVR? Contact us today!
In response to COVID-19 and its burden upon our customers, we wanted to do our part. As of today, LumenVox Advanced Speech Recognition (ASR) customers who have a maintenance agreement can use up to a 50% increase of their current ASR license volume, at no additional cost, for 60 days.
It’s imperative to stay connected to your customers right now. During this uncertain time, LumenVox remains flexible and committed to serving our partners and customers, making sure your service remains seamless, as demand and dependency upon remote channels increases.
“We hope that this flexibility to our licensing allows for increased automation and alleviates some of the contact center stress we are experiencing during COVID-19. This is just one way we can help come together and support one another.”
-Edward Miller, CEO of LumenVox.
This free bursting also demonstrates our commitment to both outstanding customer service and meaningful support, from speech application development to deployment and daily use.
LumenVox ASR is a software solution that converts spoken audio into text. Its ability to recognize naturally spoken language and its tuning flexibility set the technology apart as an industry standard. With LumenVox ASR, user experiences improve, and completion rates rise. To accompany this technology, we provide a wide selection of licensing options, including per-port, monthly subscription, use-based, bursting and Software as a Service (SaaS).
LumenVox is also directly addressing one of the major issues resulting from COVID-19, a spike in fraudulent attacks on contact centers, with Fraud Scanner, a voice-based fraud detection tool. Learn more about this fraud detection strategy here.
To further discuss your licensing needs, or to inquire about any of our solutions, contact a service representative here.
You can read a message from our CEO further detailing our response to COVID-19 here.
How the customer experience evolved from snail mail to Conversational IVR
LumenVox and its artificial intelligence speech technology have initiated an evolution within the world of customer service. In the good ol’ days, people went to the bank to deposit a check. They absolutely had to make the trip; there were no other options. Eventually, banks got savvy and integrated with another institution, the post office, to deliver printed account information to your doorstep, send or receive checks, etc. Later, as the telephone spread through businesses, we got phone support. If we had any questions or concerns, we could call a teller or bank representative to get the needed information. Over time, the number of incoming calls outgrew the number of representatives available. There could never be enough reps to answer all the calls that came in each day.
With speech recognition, self-service phone options were put into place. When customers called the bank, instead of a human voice, they got automated options: “Press 1 for your balance. Press 2 for hours and location,” etc. But those menus can get quite annoying—people quickly got tired of the amount of time it took to listen and navigate. Everyone pressed zero, hoping to skip straight to a human voice.
To solve this issue, speech interactions were introduced, but only via directed dialog, i.e., “Say yes or no.” It wasn’t that transformative, really; there was still a fair amount of patience required by the customer. Eventually phrases such as “Account Balance” and “Did my check clear?” were put into place. This was considered an improvement by customers, but there were issues with the inconsistencies in customers’ speech. As a result, there was a good amount of “I’m sorry, I didn’t quite get that.”
But businesses did finally get it. They learned from all the trying and testing that the ease of the customer experience, now primarily delivered by mobile phone, is just as important to a customer as the quality or price of the items or services purchased. Essentially, time is money, to everyone. This realization is what spurred on the notion of a conversational IVR and made it instrumental in the design of self-service solutions.
The goal of Conversational IVR is to have automation mimic a conversation with the consumer. Open-ended questions, such as “Thank you for calling. How can I help you?”, have become the desired greeting. This evolution into machine-driven yet human-like interaction relies on Deep Neural Networks (DNNs), their ability to seek out, classify and order information in more complex or “intuitive” ways, and their contribution to Natural Language Understanding (NLU) and Natural Language Processing (NLP). NLU/NLP is a specific branch of artificial intelligence that enables human-computer interaction using the input of sentences in text or speech.
Integrating NLU/NLP into a customer service environment results in a customized user experience: Enterprises are now able to recognize and automatically speak to you by name when you dial in. Your phone number is matched to your account information so they can even predict what you may be calling about (e.g., “I see you have booked a flight…”).
Conversational IVR promotes a better customer experience. Frustration is reduced, just as conversation is encouraged. It’s a win-win: Customers have instant, secure access to vital information while businesses can effortlessly exceed service expectations. LumenVox’ next generation of Advanced Speech Recognition incorporates this advanced technology into a seamless solution for businesses. And our presence at Avaya Engage (Booth 300) will allow you to familiarize yourself with its capabilities in person to see exactly how it can benefit your business.
Pivot Technology Services Becomes LumenVox Skills Certified
LumenVox and Pivot announced today that Pivot Technology Services (Pivot) has officially become a LumenVox Skills Certified Partner. LumenVox Partner Skills Certification demonstrates Pivot’s capability to deliver high-quality speech solutions based on the LumenVox speech automation suite. Pivot, a trusted provider and integrator of the world’s leading technology solutions and services, utilizes the complementary LumenVox Automated Speech Recognizer and Text-to-Speech solutions to improve the customer experience and employee productivity within contact center agent portal environments.
“LumenVox is excited to see the facilitation of a large joint opportunity that the evolution of our partnership with Pivot has already brought,” stated Ed Miller, LumenVox CEO, “and we look forward to supporting their development of innovative and dynamic speech-enabled solutions.”
“Becoming LumenVox Skills Certified broadens the benefits for our customers by combining Contact Center solutions design and implementation expertise with the best technologies available. The earned partnership strengthens our strategic position in an underserved niche of the market between traditional VARs on one end and large IT service providers on the other,” stated Jeff Brinckman, Pivot Director of Customer Experience Solutions.
Contact center technologies are ever-evolving. LumenVox Automated Speech Recognizer and Text-to-Speech solutions add functionality and improve capabilities. With Pivot’s expertise and LumenVox’ world-class technology, businesses and their customers can now experience efficient and customized communication.
Pivot Technology Solutions has created a portfolio of operating companies and partners, differentiated in their respective markets by superior competencies and an unmatched commitment to total customer satisfaction. Through its portfolio companies, Pivot has built an organization with deep knowledge of the industry, an extensive partner network, and the implementation capacity required to take on large and complex projects. Additionally, through Pivot Technology Services, the Company offers a comprehensive portfolio of services on a global scale.
The Customer Experience Solutions Practice provides insight and technology leadership that enables organizations to adapt to changing market conditions in the client care space. Recognizing that 85% of customer interactions flow through the contact center architectures, Pivot helps organizations transform their contact center platforms into a strategic corporate asset that creates competitive advantages.
Interested in learning more about how your organization can leverage LumenVox’ suite of speech and authentication solutions by becoming a LumenVox Skills Certified Partner? Contact us today!
LumenVox is excited to announce the release of LumenVox Version 17.0.200. In this release, we have:
- Added support for a new short-utterance transcription (Natural Language) functionality to process audio with a maximum length of approximately 30 seconds.
- Added a new Out of Service configuration option for the ASR (Automated Speech Recognizer) service, allowing system administrators to enter maintenance mode from the Dashboard, which permits currently pending requests to be completed, but any new requests will be rejected (to be potentially handled by other ASR servers in the cluster).
- Added a new feature to the ASR load-balancing mechanism to actively route ASR requests based on the language specified.
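To picture how the Out of Service option and language-based routing could fit together, here is a small Python sketch. It is only an illustration under assumptions: the `AsrServer` record, the `route` function, and the least-busy tie-break are invented for this example and do not describe the actual LumenVox load balancer.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class AsrServer:
    name: str
    languages: Set[str]
    out_of_service: bool = False  # maintenance flag set by an administrator
    pending: int = 0              # requests already accepted and still running

    def accepts_new_requests(self) -> bool:
        # A server in maintenance mode finishes pending work but takes nothing new.
        return not self.out_of_service


def route(servers: List[AsrServer], language: str) -> Optional[AsrServer]:
    """Pick the least-busy in-service server that supports the language."""
    candidates = [
        s for s in servers
        if s.accepts_new_requests() and language in s.languages
    ]
    if not candidates:
        return None  # no server in the cluster can take this request
    chosen = min(candidates, key=lambda s: s.pending)  # assumed tie-break policy
    chosen.pending += 1
    return chosen
```

Note that marking a server out of service leaves its `pending` count alone, which mirrors the idea that requests already in progress are allowed to complete while new ones go to other servers in the cluster.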
Useful for situations where you do not want to be constrained by a specific grammar, or challenged by implementing a more complex and costly Statistical Language Model, the LumenVox Short-Utterance Transcription functionality utilizes a built-in, general Statistical Language Model that has been tuned for everyday use to provide a text representation of supplied audio.
Supporting LumenVox’ commitment to making speech applications more secure and easier to administer, additional enhancements were made to our diagnostic tools and dashboard, including more robust grammar handling within the LumenVox Speech Tuner.
For a comprehensive list of improvements and features released with LumenVox Version 17.0.200, please click here.
If you’d like to watch a previously recorded webinar about the release, including participant Q&A, please click here.
In a recent post, The ROI of Speech, we discussed ways in which speech recognition technology has changed how companies interact with their customers. Perhaps the most significant benefit realized through the implementation of a speech-enabled service solution is the improved intelligence of the customer interaction. Customers are no longer bound to pushing keys to force-fit their call reason into the company’s predetermined options.
Since speech-enabled solutions provide a highly conversational interaction with customers, organizations are empowered to expand the level of intelligence their self-service solutions offer. Benefits from implementing such a solution come from two perspectives: 1) reduced costs and accelerated ROI, and 2) enhanced customer experience.
For the purposes of this post, we’ll focus on how the customer experience is enhanced by implementing a robust speech self-service solution. We’ll specifically address the questions posed in the ROI of Speech post:
- Can I engage customers in a manner that allows me to dynamically generate personalized treatment that results in higher rates of self-service or cross/up-sell opportunities?
- When customers don’t want to play in the IVR, can I gather enough information to avoid costly misroutes?
- Can I take what I know about the customer and provide proactive information that might resolve their need before they move into the transactional path or transfer to an agent?
Understanding how each of these factors ties into the overall speech self-service strategy will help to position the organization for success and yield an intelligent experience that customers will engage in time and time again.
Given the dynamic nature of conversational speech, companies can leverage speech technology to build very robust interactions with consumers. Let’s assume a customer calls to inquire about their checking account. Based on this customer’s profile we know that they are a high-net worth customer and would be eligible for numerous up-sell offers. Using conversational dialogue, we can begin to ask the consumer targeted questions in conjunction with what we know about their relationship with the bank. This depth of conversation would be controlled by the consumer and all information collected would ultimately be used to improve the intelligence of the customer record. Over time, the organization would have a targeted view of this consumer using a combination of profile and behavioral information.
As organizations begin to consider whether a speech-enabled solution is right for them they should take inventory of their current personalization strategies and lay out numerous use cases that could be supported through a more robust speech solution.
The most successful companies across all industry verticals recognize that holding consumers hostage within the IVR system is the most egregious error they can make. The internet is filled with horror stories of consumers being trapped in automation purgatory. In fact, being trapped in the IVR is one of the most common reasons consumers hate to use automation. To combat this, companies tried, often unsuccessfully, to build “second chance” menus to capture caller intent and get callers to the right location. Of course, consumers who despise automation rarely play at this level.
Fortunately, speech recognition technology provides a viable solution for both the consumer and the organization.
For companies, the conversational approach of the speech solution provides a sense of forward progress to the consumer. This approach promotes engagement, which reduces the rate of costly internal transfers and improves the perception of the company.
Consumers benefit from easy transfers without the need for sitting through verbose second chance menus or cycles of repeated commands.
While proactive information can be successfully pushed in a DTMF solution, the use of speech recognition technology can expand the interaction, thereby delivering a much more targeted push. The depth of engagement will be far greater with speech technology. Consumers can provide complex responses and the company can offer multiple data points in a single question. The dynamic interaction will reduce the cognitive load on a consumer, as the flow of information will be more fluid and natural. This approach is highly successful in keeping calls in the self-service channel and avoiding the costlier agent channel across many industry verticals, particularly when high rates of repeat callers are common, for example with credit card and bank account balance inquiries.
The power of speech technology continues to change the face of the self-service world and customer experience as a whole for improved intelligence of the interaction. Understanding the use cases for the technology requires a solid understanding of current capabilities and consumer behavior and expectations. While the questions presented above represent a significant portion of developing a business case for a speech technology solution, numerous other factors must be addressed to build a comprehensive roadmap.
In our next installment, we will discuss how speech can open opportunities for new functionality and scope of coverage across the entire self-service solution.