LumenVox Luminaries is a podcast that broadcasts thought leadership pieces on the subject of voice technology. This episode features Jeff Hopper, Vice President of Business Development, with his perspective on LumenVox’ next generation of conversational IVR.
I want to tell you about some work that we’re doing in our engineering team right now that will begin to become available in 2020. We’ve taken a step back and looked at the existing state of the speech recognition market for the IVR space and the product that we used to have, that we deprecated, what our competitors do, etc. And we’ve concluded that there’s a better way to go about this than the way the industry has historically.
When you look at our competition, their traditional tier-four speech recognition was speech recognition with natural language understanding. It was first and foremost 10-year-old technology and a proprietary black box. The only people who could develop an application for a customer with it were that speech vendor’s professional services team. With my 20 years of personal experience in the space, I can count on the fingers of one hand the people outside of that vendor who can actually build a tier-four application successfully for you.
So our first driver to this new idea was: let’s take advantage of some things that have changed in the state of the art technically, and let’s build a new platform that is more open, more accessible, easier to use and not that proprietary black box, if you will, for speech recognition. If you understand any of the history of natural language IVRs, essentially the idea is that instead of asking specific questions like “What city do you want to fly to?”, where you say “Memphis” or “Nashville” or whatever the choice is and the recognizer can only make a determination from a defined list of choices, you should be able to say things like, “I’d like to book a flight next Tuesday from Seattle to Memphis in the afternoon.” And that recognizer should be able to parse out, from that conversational statement that the caller makes, both the intent (“I want to book a flight”) and all of the values in that statement that are necessary: the departure city is Seattle, the arrival city is Memphis, and the travel date is next Tuesday.
The traditional mechanisms have been to build these proprietary applications that use two parts under the hood, though most people don’t realize they’re two parts. The first is the speech recognizer that takes what I said and converts it into raw text. The second part is something called an NL or, traditionally in the speech space, an SLM: a statistical language model that will take those words, parse them apart, and try to infer the meaning based on machine learning.
It is not very different conceptually from modern machine learning and artificial intelligence, except that it’s built on a much older set of tools and a much more limited set of machine learning capabilities. So when you build an application like that today with our competition’s ASR offering, it is a sealed box. It’s difficult to make changes to it over time, and these applications tend to be extremely expensive from a professional services perspective to deliver.
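The second stage described above, which takes the recognizer's text and returns an intent plus the values in the statement, can be illustrated with a small sketch. This is only a toy: it uses hand-written patterns instead of a trained statistical or machine-learning model, and the function name, city list and field names are assumptions made for illustration, not part of any LumenVox product.

```python
import re

# Hypothetical city list and field names, chosen only for this illustration.
# A real system would learn intents and entities from training data instead
# of relying on hand-written patterns like these.
CITIES = ("Seattle", "Memphis", "Nashville")
CITY_PATTERN = "|".join(CITIES)


def parse_utterance(text):
    """Return the intent and entity values found in transcribed caller text."""
    result = {"intent": None, "entities": {}}

    if re.search(r"\bbook a flight\b", text, re.IGNORECASE):
        result["intent"] = "book_flight"

    match = re.search(rf"\bfrom ({CITY_PATTERN})\b", text, re.IGNORECASE)
    if match:
        result["entities"]["departure_city"] = match.group(1)

    match = re.search(rf"\bto ({CITY_PATTERN})\b", text, re.IGNORECASE)
    if match:
        result["entities"]["arrival_city"] = match.group(1)

    days = "monday|tuesday|wednesday|thursday|friday|saturday|sunday"
    match = re.search(rf"\bnext (?:{days})\b", text, re.IGNORECASE)
    if match:
        result["entities"]["travel_date"] = match.group(0)

    match = re.search(r"\bin the (morning|afternoon|evening)\b", text, re.IGNORECASE)
    if match:
        result["entities"]["time_of_day"] = match.group(1)

    return result


if __name__ == "__main__":
    print(parse_utterance(
        "I'd like to book a flight next Tuesday from Seattle to Memphis in the afternoon."
    ))
```

The point of the sketch is the shape of the output: one intent and a set of named values that an IVR application can act on, rather than a single choice from a fixed list.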
So what we’re proposing, and not just proposing but building the infrastructure for, is a new generation of conversational IVR. And we’re going to do it in a couple of ways. We’ve already done what I call part A of the three parts: we have built an entirely new speech recognition engine based on the latest in machine learning processes, specifically deep neural networks, so that the core recognizer that will work in this stack is absolutely state of the art, has excellent recognition capabilities and is easy to stand up, install and configure to run in your application stack. More importantly, it’s designed to do transcription, not directed dialogue with grammars like that old style of IVR application. It’s intended to take raw speech from a caller and transform that into text.
The second part, part B, of our application stack is going to be a new AI (artificial intelligence) platform that uses machine learning. It’s built on commercially available AI components that already exist today and are also state of the art. They’re components from companies like Google, or from some firms that Google has purchased, that Google has put out into the open source world. We’re going to build the machine learning AI piece that does the intent determination from the text and extracts those values or entities, like departure city, arrival city or whatever the particular conversation might need. We can then pass that back to an application in your IVR to do work. That second part is in engineering now, in the process of productization, and it will give you an excellent starting point to accomplish what is very typically a difficult process with tier-four applications today. And the tool set is one that is widely commercially adopted; there are lots of people who already understand how to use it. We’re essentially just going to provide the plumbing to connect it into the rest of your IVR stack and our speech recognizer in a simple and easy way.
Coming on top of that, in the third part of this process, will be the addition of something that we’re calling an AI gateway. If you look at the slide in front of you right now, you can see the AI platform over on the right-hand side and LumenVox listed down below it as one possible AI platform, but up above you see a number of other names that you’ll recognize: things like Amazon Lex, Microsoft LUIS, Google’s Dialogflow, IBM Watson and others. Those are all widely used, commercially available AI engines that use machine learning to help you parse out the answers you’re looking for from the text. What we’re going to do is provide a configurable gateway that will operate from the LumenVox media server, so that in your IVR applications you can take advantage of existing AI that you’ve already built with those commercial tools: things like FAQ chat bots that are on your website today, or other mobile applications that you’ve built that use text and machine learning or AI to respond to that text. You’ll be able to take those models and add them to your existing IVR stack so you’re not starting from scratch with the learning process for the AI mechanisms. You can continue to reuse something you’ve already built and enhance it. That’s almost always less expensive than starting from scratch to build a new AI platform and a new AI model for your particular business situation.
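To make the gateway idea concrete, here is a minimal sketch of a configurable router that forwards transcribed text to whichever AI backend has been registered under a given name. Everything in it is hypothetical: the class, the method names and the stub backend are invented for illustration, and it does not call any real vendor API or reflect how the LumenVox media server is actually built.

```python
from typing import Callable, Dict

# An AI backend is modelled here as any function that maps text to a result
# dictionary. In practice each backend would wrap a call to an external
# service; that part is deliberately left out.
NluBackend = Callable[[str], dict]


class AiGateway:
    """Hypothetical gateway that routes transcribed text to a named backend."""

    def __init__(self) -> None:
        self._backends: Dict[str, NluBackend] = {}

    def register(self, name: str, backend: NluBackend) -> None:
        self._backends[name] = backend

    def interpret(self, platform: str, text: str) -> dict:
        if platform not in self._backends:
            raise KeyError(f"no AI backend configured under the name {platform!r}")
        result = self._backends[platform](text)
        # Return the backend's answer together with routing details.
        return {"platform": platform, "text": text, **result}


def faq_bot_stub(text: str) -> dict:
    """Stand-in for an existing FAQ chat bot that already understands text."""
    if "hours" in text.lower():
        return {"intent": "opening_hours"}
    return {"intent": "unknown"}


if __name__ == "__main__":
    gateway = AiGateway()
    gateway.register("faq_bot", faq_bot_stub)
    print(gateway.interpret("faq_bot", "What are your hours on Saturday?"))
```

The design choice worth noticing is that the IVR application only names a platform; swapping one AI engine for another becomes a configuration change rather than a rewrite of the voice application.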
We have some customers who are already using this approach in an experimental stack, and by experimental I mean some early proof-of-concept applications. Rather than going out of the LumenVox media server, they’re making the AI request out of their voice application platform today, which requires a little bit more work on their part. But we know that the new generation of recognizer we have in place, when combined with that kind of external AI approach, is actually working well. And then in 2020 we will add that third part, the AI gateway, to the LumenVox media server to make all of the integration work simpler, quicker and easier for you.
Have questions about our next generation of conversational IVR? Contact us today!
As the world continues to react to COVID-19 (coronavirus), the LumenVox team is readily prepared to help our clients deliver excellent customer service amid mounting challenges. Here are a few LumenVox best practice ideas and relevant solutions that apply to the new sign of the times:
Best Practice #1: Play it safe.
It is important for everyone to deliver great customer service while balancing the need to protect staff. At LumenVox we are limiting staff travel and giving our headquarters staff the option to work from home. Many of our colleagues are doing the same, as significant industry events have been rescheduled (See our new appearance dates here and watch this space for upcoming virtual events). In every business, people should come first.
Best Practice #2: Boost remote engagement technology.
Communication is key when daily life is disrupted. Customers rely upon remote channels to gather information. One way you can elevate your business’ remote engagement strategy is with automated outbound calling. LumenVox Call Progress Analysis enables innovative, tone-based outbound calling to inform customers of changes to times, events, etc. What’s special about LumenVox’ solution is that we can deliver a significantly higher payload compared to competitors, as our Call Progress Analysis is able to verify if an outbound call has been connected to a live person or an automated recording system. Additionally, our licensing is flexible enough that we offer “bursting” to assist with higher-than-usual call volumes.
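As a rough illustration of why it matters whether an outbound call reached a live person or a recording, the sketch below shows how a dialer might choose its next step from that result. The result values and action names are assumptions made up for this example; they are not the actual outputs of LumenVox Call Progress Analysis.

```python
from enum import Enum


class CallResult(Enum):
    """Hypothetical outcomes of analysing the start of an outbound call."""
    LIVE_PERSON = "live_person"
    ANSWERING_MACHINE = "answering_machine"
    NO_ANSWER = "no_answer"


def next_action(result: CallResult) -> str:
    """Pick what the outbound application should do for a given outcome."""
    if result is CallResult.LIVE_PERSON:
        # A person can respond, so an interactive message makes sense.
        return "play_interactive_message"
    if result is CallResult.ANSWERING_MACHINE:
        # Wait for the tone so the message is recorded in full.
        return "leave_voicemail_after_tone"
    return "schedule_retry"


if __name__ == "__main__":
    for outcome in CallResult:
        print(outcome.value, "->", next_action(outcome))
```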
Best Practice #3: Ensure secure authentication.
Secure authentication is paramount as call centers are increasingly overloaded with calls. Do you have cutting-edge technology to make sure every customer is who they say they are? Voice biometrics protects and defends vulnerable voice channels from criminal activity. Both active and passive biometrics keep customer accounts secure, and the interaction seamless.
As for LumenVox, our technology is immune, always up and running and we are here to ensure no disruption of our products and services. We want you to be able to say the same. Let us know how we can help you keep your customers feeling connected and safe.
Recently the LumenVox team attended CCW to empower businesses to deliver seamless yet secure customer communication through advanced speech and authentication solutions. While there, we had the opportunity to be interviewed by CrmXchange for their CCW Nashville 2020 column: Solution Providers and End Users Harmonize in Music City.
Good question. We have zeroed in on the following:
Speech Recognition: This includes conversational IVR with text-to-speech capabilities that create a very functional, conversational IVR self-service.
Voice Biometrics: We get really excited about this, as it addresses a huge threat to businesses right now—fraud. Voice biometrics is used to keep customers safe, to secure authentication and keep the customer experience painless and easy. Businesses can implement voice biometrics using IVR, mobile applications or in the contact center.
In 2019 we hit over a million voiceprints. For 2020, our development team combined customer feedback with their biometric expertise and created a robust passive engine, which allows for machine learning to enhance features and customer benefits. With the inclusion of Deep Neural Network technology, LumenVox has positioned itself to provide higher accuracy, more fraud prevention tools and increased customer satisfaction and service.
There are numerous companies in your space…how do your solutions stand out from your competitors?
We’re easier to work with in every way. Our company is very focused on our channel partners. And our customers rave about how flexible the architecture and capabilities are. Our consistently high NPS scores (currently 89) reflect how much they appreciate our responsiveness. All speech recognizers and text-to-speech engines perform the same tasks, but they really don’t offer the higher value that comes from our simple install-configure-use approach. Things like our built-in diagnostics make it quick and easy to set up or troubleshoot things, and the LumenVox Speech Tuner is the easiest product on the market to implement and improve the performance of your speech applications, based on real customer usage.
On the voice biometrics side, while we do the installation and training, we’re also very open and flexible: We have APIs that are easily integrated into a contact center agent’s desktop or CRM applications. We see this as an important differentiator because managers really want applications consolidated. So the capability to take our biometrics results and integrate that information smoothly makes it easier for everyone.
From an implementation services standpoint, we don’t lock our customers into having to use our professional services. We enable both our partners and IVR developers to work from various platforms, using their own services while deploying our speech recognition technology. Our holistic approach results in us being more competitive and cost-effective. We don’t think cutting-edge technology needs to come at a premium price. We want this technology to be available to everyone.
Can you define how biometrics work in the contact center?
All biometrics measure something you’re made of. We’re used to smartphones using fingerprints and facial recognition (faceprints) to authenticate us now. Makes you feel more secure, right? The call center is evolving, too, and with our technology it can create the same level of protection using the human voice. An enterprise contact center obtains a sample of your voice and converts it into a secure file called a voiceprint. Once there’s a voiceprint on file, the next time you call into the center, you don’t have to answer those painful security questions (which, by the way, are often vulnerable to theft). Using a voiceprint means that as a consumer you have a better calling experience. And as a business, you get more security.
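The enroll-once, compare-on-later-calls flow described above can be sketched as follows. This rests on assumptions that the source does not confirm: it treats a voiceprint as a short list of numbers and compares two of them with cosine similarity against a fixed threshold, which is a common textbook approach to speaker verification, not a description of the LumenVox engine. The vectors and the threshold are invented for the example.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two equal-length vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify(enrolled_voiceprint, new_sample, threshold=0.8):
    """Accept the caller only if the new sample is close to the enrolled one.

    The threshold is an arbitrary illustrative value; a real deployment
    would tune it against measured false-accept and false-reject rates.
    """
    return cosine_similarity(enrolled_voiceprint, new_sample) >= threshold


if __name__ == "__main__":
    enrolled = [0.9, 0.1, 0.4]        # stored when the customer enrolled
    same_caller = [0.88, 0.12, 0.41]  # later call from the same person
    impostor = [0.1, 0.9, 0.2]        # later call from someone else
    print(verify(enrolled, same_caller))
    print(verify(enrolled, impostor))
```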
Is it possible to fraudulently manipulate a voice biometric? Can someone pick up a customer’s voice pattern or convincingly imitate them?
Well, even mothers can be fooled by twins. We don’t want to be so hubristic as to claim that we somehow have something over Mother Nature. But voice biometrics takes major precautions: We use multiple factors when creating the voiceprint. The human ear might not be able to detect an impersonator, but our solution will notice hundreds of subtle differences. A company can also use multiple phrases as an identifier or ask varying questions to prevent a breach.
The truth is customers understand that the threat is real and want an added layer of unique security. A recent study noted that 74% of Americans believe that biometrics is a more secure method of verifying accounts than traditional PINs and passwords. Hacks and data breaches are commonplace occurrences now. And there are long, long gaps in notification—customers may not know they were exposed for nine months sometimes. Since many people tend to use the same PINs and passwords for multiple accounts, they are vulnerable. Voice biometrics protects and defends privacy. It’s that simple.
In what industries are recognition technologies becoming prevalent?
Any and all, as the desire for a seamless customer-agent interaction increases. People want to reach a customer representative to solve the bigger, more complex problems and spend their valuable time solving the easier ones on their own through self-service. We’re working more and more with financial institutions and healthcare providers, as we have the capabilities to not only enhance their IVR experience, but also provide stronger security. As businesses grow, we see these two factors go hand-in-hand. People want security, but they don’t want to compromise speed and efficiency to get it. Our speech solutions provide the best of both worlds.
What are some of the tangible benefits of Natural Language Processing IVR applications?
NLU gives self-service that human touch that people really seek in the customer journey. It’s the best of all worlds: Customers can help themselves quickly, but can also feel as if they’re doing it effortlessly–with a fellow human mind at the helm. As for LumenVox, our new Conversational ASR combines 20 years of experience in Speech Recognition with the latest in Artificial Intelligence & Machine Learning, allowing any business to build new AI-based IVR applications that support natural language processing and intent determination from an existing voice application platform (IVR). The best part is that as a business you really don’t have to start from scratch to do this. Text-based AI tools can be given a voice with LumenVox ASR. You can leverage your existing infrastructure and preferred tools to provide rich, voice based self-service that exceeds expectations.
How the customer experience evolved from snail mail to Conversational IVR
LumenVox and its artificial intelligence speech technology have initiated an evolution within the world of customer service. In the good ol’ days, people went to the bank to deposit a check. They absolutely had to make the trip; there were no other options. Eventually, banks got savvy and integrated with another institution, the post office, to provide printed account information to your doorstep, send or receive checks, etc. Eventually, with the integration of the telephone and its proliferation of use in businesses, we got phone support. If we had any questions or concerns, we could call a teller or bank representative to get the needed information. Over time, the number of incoming calls far outstripped the number of representatives available. There could never be enough reps to answer the number of needed phone calls per day.
With touch-tone automation, self-service phone options were put into place. When customers called the bank, instead of a human voice, they got automated options: “Press 1 for your balance. Press 2 for hours and location,” etc. But those menus can get quite annoying; people quickly got tired of the amount of time it took to listen and navigate. Everyone pressed zero, hoping to skip straight to a human voice.
To solve this issue, speech interactions were introduced, but only via directed dialog, i.e., “Say yes or no.” It wasn’t that transformative, really; there was still a fair amount of patience required by the customer. Eventually phrases such as “Account Balance,” “Did my check clear?” were put into place. This was considered an improvement by customers, but there were issues with the inconsistencies in customers’ speech. As a result, there was a good amount of–“I’m sorry, I didn’t quite get that.”
But businesses did finally get it. They learned from all the trying and testing that the level of ease of the customer experience, now primarily implemented by mobile phone, is just as important to a customer as the quality or price of the items or service purchased. Essentially, time is money, to everyone. This realization is what spurred on the notion of a conversational IVR and how it became instrumental in the design of self-service solutions.
The goal of Conversational IVR is to have automation mimic a conversation with the consumer. Open-ended questions, such as “Thank you for calling. How can I help you?”, have become the desired greeting. This evolution into machine-driven yet human-like interaction relies on Deep Neural Networks (DNNs), their ability to seek out, classify and order information in more complex or “intuitive” ways, and their contribution to Natural Language Understanding (NLU) and Natural Language Processing (NLP). NLU/NLP is a specific branch of artificial intelligence that enables human-computer interaction using the input of sentences in text or speech.
Integrating NLU/NLP into a customer service environment results in a customized user experience: Enterprises are now able to recognize and automatically speak to you by name when you dial in. Your phone number is matched to your account information so they can even predict what you may be calling about (e.g., “I see you have booked a flight…”).
Conversational IVR promotes a better customer experience. Frustration is reduced, just as conversation is encouraged. It’s a win-win: Customers have instant, secure access to vital information while businesses can effortlessly exceed service expectations. LumenVox’ next generation of Advanced Speech Recognition incorporates this advanced technology into a seamless solution for businesses. And our presence at Avaya Engage (Booth 300) will allow you to familiarize yourself with its capabilities in person to see exactly how it can benefit your business.
LumenVox and Pivot announced today that Pivot Technology Services (Pivot) has officially become a LumenVox Skills Certified Partner. LumenVox Partner Skills Certification demonstrates Pivot’s capability to deliver high-quality speech solutions based on the LumenVox speech automation suite. Pivot, a trusted provider and integrator of the world’s leading technology solutions and services, utilizes the complementary LumenVox Automated Speech Recognizer and Text-to-Speech solutions to improve the customer experience and employee productivity within contact center agent portal environments.
“LumenVox is excited to see the facilitation of a large joint opportunity that the evolution of our partnership with Pivot has already brought,” stated Ed Miller, LumenVox CEO, “and we look forward to supporting their development of innovative and dynamic speech-enabled solutions.”
“Becoming LumenVox Skills Certified broadens the benefits for our customers by combining Contact Center solutions design and implementation expertise with the best technologies available. The earned partnership strengthens our strategic position in an underserved niche of the market between traditional VARs on one end and large IT service providers on the other,” stated Jeff Brinckman, Pivot Director of Customer Experience Solutions.
Contact center technologies are ever-evolving. LumenVox Automated Speech Recognizer and Text-to-Speech solutions add functionality and improve capabilities. With Pivot’s expertise and LumenVox’ world-class technology, businesses and their customers can now experience efficient and customized communication.
About Pivot: Pivot Technology Solutions has created a portfolio of operating companies and partners, differentiated in their respective markets by superior competencies and an unmatched commitment to total customer satisfaction. Through its portfolio companies, Pivot has built an organization with deep knowledge of the industry, an extensive partner network, and the implementation capacity required to take on large and complex projects. Additionally, through Pivot Technology Services, the Company offers a comprehensive portfolio of services on a global scale.
The Customer Experience Solutions Practice provides insight and technology leadership that enables organizations to adapt to changing market conditions in the client care space. Recognizing that 85% of customer interactions flow through the contact center architectures, Pivot helps organizations transform their contact center platforms into a strategic corporate asset that creates competitive advantages.
Interested in learning more about how your organization can leverage LumenVox’ suite of speech and authentication solutions by becoming a LumenVox Skills Certified Partner? Contact us today!
Artificial Intelligence is a buzzword in tech because it has unlimited potential to transform the human experience. It’s especially relevant in 2020, as the keyword dominates Google searches and news outlets, including Forbes’ list of technology trends.
What is Artificial Intelligence?
AI is a segment of computer science that focuses on emulating human attributes. It’s the things we take for granted as human beings: listening and comprehension, sight, movement (grabbing and manipulating objects) and reasoning about the world.
How Do Artificial Intelligence & LumenVox Intersect?
LumenVox offers artificial intelligence business solutions in two main ways:
The original LumenVox speech recognition software utilized Hidden Markov Models (HMMs) to decipher speech from the sound waves recorded. Hidden Markov Models are simplified probability models that work very well in guessing what words are when heard. For example, if the first sound heard is likely to be “Duh” then, in English, it is very unlikely that the next sound will be an “Ess.” But it is very possible that the next sound will be, perhaps, “Oh” as in the word “Doughnut.” Overlaying the sounds with the probability model of English that speech scientists developed has produced a very good quality speech recognizer. But in recent years Hidden Markov Models have been replaced by Deep Neural Networks (DNNs). Deep Neural Networks are still probability models; however, they do not contain explicit knowledge of the likelihood of sounds or words in close proximity. Instead, Deep Neural Networks are brute-force models that emerge from “deep” analysis of very large data sets. They record connections for which we may have no explanation. From millions of examples of speech, they extract more complex, irregular, and idiosyncratic statistical rules about sounds, as actually spoken by people, than would be practical to represent in a Hidden Markov Model. LumenVox now utilizes Deep Neural Networks in our passive engine to provide state-of-the-art results in speaker verification and recognition.
Voice Biometrics, aka Voiceprints, make a judgement on whether someone really is who they claim to be. With voice biometrics an initial voiceprint is created, in a secure manner, that validates the person’s identity. In future verification attempts, the person’s voice is compared to the original voiceprint. Voice Biometrics are a useful form of AI because they can replace human judgment, prevent human error and be hyperalert to threats with a highly accurate algorithm.
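Returning to the speech recognition example above, where “Duh” is far more likely to be followed by “Oh” than by “Ess,” the idea of scoring sound sequences by how likely each sound is to follow the previous one can be sketched in a few lines. This shows only the transition-probability part of the picture (a plain Markov chain); a full Hidden Markov Model also relates hidden sound states to the audio actually observed. The sound labels and every number in the table are made up for illustration and are not measured from real speech.

```python
# Illustrative, invented probabilities of one sound following another.
TRANSITIONS = {
    ("D", "OH"): 0.30,   # "Duh" then "Oh", as in "doughnut": plausible
    ("D", "S"): 0.001,   # "Duh" then "Ess": very unlikely in English
    ("OH", "N"): 0.20,
    ("S", "N"): 0.05,
}

# Small probability assigned to any pair missing from the table.
FLOOR = 1e-6


def sequence_probability(sounds):
    """Multiply the transition probabilities along a sequence of sounds."""
    probability = 1.0
    for previous, current in zip(sounds, sounds[1:]):
        probability *= TRANSITIONS.get((previous, current), FLOOR)
    return probability


if __name__ == "__main__":
    likely = sequence_probability(["D", "OH", "N"])
    unlikely = sequence_probability(["D", "S", "N"])
    print(likely > unlikely)
```

A recognizer built this way prefers the candidate sequence with the higher score, which is how the explicit knowledge of English sound patterns mentioned above steers the guess.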
How Can Artificial Intelligence Protect You?
If a fraudster calls a business and claims to be Joe Smith, the agent might ask him a few questions to verify his identity. If the putative Joe is an experienced scammer, he may have researched these questions and prepared answers. Using a voiceprint, however, removes this risk since it is nearly impossible to fake a human voice effectively. Instead of allowing the fraud actor access, the intelligence of the machine exceeds the intelligence of a human agent and fraud is detected.
LumenVox’ aim is to safeguard both businesses and customers using artificial intelligence and the human voice, to take two complexities and make an interaction that is simple yet secure.