LumenVox Luminaries is a podcast that broadcasts thought leadership pieces on the subject of voice technology. This episode features Jason Kawakami, LumenVox Senior Sales Engineer, outlining the sophisticated speech science, functions, and benefits of LumenVox Transcription Engine. You can follow Jason on LinkedIn here.
Listen to the Intersection of Speech Science and Transcription Podcast below:
Read the Transcript
So this is a dive into one particular component of our speech products. LTE or LumenVox Transcription Engine is part of the ASR component of our speech suite.
Q: What is LumenVox Transcription Engine?
What the transcription engine does is deliver transcribed text that is representative of the decoded speech. We take in an utterance and process it against an unconstrained grammar called a Statistical Language Model, or SLM. We then provide that text to a downstream piece of technology, which might be an existing trained AI model, or it might be fed into any number of things.
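The decode-then-forward pipeline described above can be sketched as follows. This is an illustrative stand-in, not the LumenVox API: `transcribe`, `route_to_downstream`, and `keyword_nlu` are hypothetical names, and the decoder is stubbed out.

```python
# Minimal sketch of the pipeline: audio in, text out, text to a downstream consumer.
# `transcribe` is a stub standing in for the engine's SLM-based decoder.

def transcribe(utterance_audio: bytes) -> str:
    """Stand-in decoder: in a real system this runs audio against the SLM."""
    return "i want to check my account balance"  # decoded text (stubbed)

def route_to_downstream(text: str, nlu_model) -> dict:
    """Hand the transcribed text to a downstream consumer, e.g. an NLU model."""
    return nlu_model(text)

# A trivially simple downstream "model" that keys on words in the text.
def keyword_nlu(text: str) -> dict:
    intent = "check_balance" if "balance" in text else "unknown"
    return {"intent": intent, "text": text}

result = route_to_downstream(transcribe(b"\x00"), keyword_nlu)
```

Any downstream system that accepts text (an AI model, an analytics store, a ticketing system) can take the place of `keyword_nlu` here.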
Q: What’s the primary use case for LumenVox Transcription Engine?
The primary use case today, in our case, has been natural language IVRs. The application, the SLM, all of the bits of this are focused on that middle of the road: we're providing a supporting technology to IVRs that deliver natural language applications.
A straight-out-of-central-casting use case is speech-enabling a chatbot.
Q: How does LumenVox Transcription Engine work?
We process the audio against a Statistical Language Model, an SLM, instead of a traditional grammar. The traditional grammar is a constrained search space. The Statistical Language Model is a giant search space focused on how one particular language is spoken, whether that's English, Spanish, or another. That mathematical model predicts what is going to be spoken next, and that prediction is used to narrow the search space.
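To make the idea of prediction narrowing the search space concrete, here is a toy n-gram sketch. The tiny corpus and the pruning threshold are invented for illustration; a production SLM is trained on vastly more data and integrated directly into the decoder.

```python
from collections import Counter, defaultdict

# Toy statistical language model: bigram counts from a small corpus give
# P(next word | previous word), and only likely continuations are searched.
corpus = "my account balance please check my account please check my balance".split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def next_word_probs(prev: str) -> dict:
    """Relative-frequency estimate of the next word given the previous one."""
    counts = bigrams[prev]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def prune_search_space(prev: str, threshold: float = 0.2) -> set:
    """Keep only the continuations the model considers likely enough to search."""
    return {w for w, p in next_word_probs(prev).items() if p >= threshold}
```

With a higher threshold, fewer candidate words survive, which is the sense in which the model's prediction narrows what the recognizer has to consider.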
This is speech science: really high-end science on how languages are spoken and what phonemes come out for which words. It's very complex computational science.
Q: What is the SLM tuned for?
Our SLM is tuned for general, typical conversation. We've said the primary use case for our LTE is to feed downstream AI-based processes. The straight-up middle of the road is the NLU IVR: some type of telephony solution that requires decoding a spoken utterance and providing that text to some type of AI to determine the meaning. Another example is taking the audio, whether from the IVR or from the conversation between an agent and a customer, and feeding it into an AI engine that is specifically tuned and trained to detect sentiment.
Q: What are other mainstream use cases?
Another potential here, as we start getting into the shoulders of the road, is agent-assist applications: listening to agent conversations in real time, processing both the agent's leg and the consumer's leg of the audio, and perhaps training a model that is integrated with the company's knowledgebase, so that the knowledgebase prompts the agent with particular articles that will assist them with what consumers are asking for.
As we move farther outside of the mainstream, beyond that middle-of-the-road use case, LTE can support speech-to-text applications: true transcription applications, as the word "transcription" is used by ordinary people rather than speech-industry people. Think note-taking. Note-taking is a big deal in lots of industries; a few that stick out are the medical and legal industries. The transcription engine could be applied to an application doing verbal note-taking in those cases.
The other one I was thinking about is dispatch apps. There are lots and lots of mobile workforces these days, and they're becoming more prevalent in the world we're in. People dispatch service vehicles to your home instead of you taking your car to some central garage to get fixed. Every one of those activities has the standard "Did I complete my task?" and "How much time did I spend?", and there are often notes associated with those trouble tickets or service tickets. Our LTE could take those spoken notes and push them into a text-based system that feeds analytics applications. We can use LTE to provide the words to an application that does the analysis and derives the real meaning of what's being spoken.
Hey, this is Bill Petty with 8 & Out, a podcast series dedicated to highlights, updates, and events around the Avaya DevConnect program, Avaya technologies, and, most importantly, our technology partners. Joining me today is Mr. Jeff Hopper, Vice President of Business Development for LumenVox. Thank you so much for joining us.
Good morning Bill, thank you for having me on. I really appreciate the opportunity whenever I can chat with you.
So, we’ve got some great news that is coming up from LumenVox. I know you guys have some stuff that’s cooking in the background and a couple of announcements that you’d like to make, so tell us what’s going on at LumenVox these days.
Well, we have so much stuff going on it's almost hard to fit into your 8 & Out podcast, but some of the highlights are that we've just wrapped up working with the Avaya product team on the Experience Portal Orchestration Designer feature preview for the next release. We're all lined up for a joint LumenVox product advancement to a new release along with the Avaya product, so we'll be in lockstep and compatibility-tested on day one, which I think will be good for all of our customers because then there's not that out-of-sync thing that can go on. And we are continuing to work on the process I mentioned the last time I was visiting with you, back from the show floor at IAUG in Phoenix earlier this year. We're just about squared away with all of the internal SAP codes and order configuration so that the entire Avaya ecosystem can buy LumenVox directly through Avaya. It's available today, and it will be in, I believe, the next release of the configuration tool, so it's completely automated at that point for order entry. We're all squared away to become the replacement product for the deprecated Loquendo ASR and TTS that's now end of life.
We've also continued to bring a bunch of partners on board, and it's one of the things we're really excited about. We have two new partners since I last chatted with you, CCT and Damovo, firms based in Germany with business both in the EU and in the United States, and we're continuing to see real interest in growing those partnerships. In fact, one of the things I wanted to mention during this call is that we are glad to provide free skill-certification training for new partners when they're onboarded, so they're all up and ready. We think those partnerships start with strong skills and experience, so we will be conducting a series of our LumenVox skill-certification training classes this fall. We do it virtually, like everything these days, over Zoom or Spaces or some virtual mechanism. It's two half-day sessions that customers can participate in, and it really will bring them up to speed on integrating LumenVox as the speech recognizer for those self-service applications.
Then the really good stuff is the new release of LumenVox that we have coming out in tandem with the next release of Experience Portal. We have several new capabilities that I think are going to be really useful for our customer base. There’s a lot of interest in security these days and one of the areas that has come up again and again over the last few years is protecting that information in audio between the Experience Portal and the LumenVox stack. So we’re going to be releasing TLS encryption of that audio traffic so that the audio between your Experience Portal and the LumenVox recognizer, and the return information, is all fully encrypted so that it’s not possible to snoop or sniff or interrupt that – even if you’re inside the walls of your data center. We think that will be a great security advancement for things like PCI compliant applications where there’s a credit card or payment information collected especially.
We're also extending the audio capabilities of the LumenVox Transcription Engine that we introduced last year. It came out in its first version as a short-utterance transcriber; we're extending that to unlimited audio duration now, so we'll be able to handle longer use cases where there's more than just 30 seconds of audio for a particular application.
The thing I'm really excited about is our new configurable AI gateway. It's going to provide a really simple low- or no-code integration with external AI platforms like Google Dialogflow, Microsoft LUIS, Amazon Lex, and IBM Watson. It's going to give developers a new and easier way to build and deploy natural language understanding in their IVR environment. We'll continue to expand the number of AI resources it supports, including one of our own that we'll bring to market next year. We think we're going to have lots of ways to up the ante in these customer self-service conversations, making them more conversational, more modern in style, and easier and better from both a customer experience perspective and a development perspective.
Tell me a little more about this new solution that you’ve got a new release coming out in the fall. So you’ve got encryption, you’ve got this extension of the audio capabilities and you’ve got this new AI gateway for configuration. What kind of impact is that going to have on our mutual end customers?
Well, I think the AI gateway is significant in changing the way we build these applications, and it's a good way to simplify them. Today most of the combination speech recognition and AI tools that create natural language understanding are proprietary. They don't have a wide user base in terms of people who have the skills to develop applications there, because two or three different skill sets apply in the process. What this is going to let you do is tightly integrate commercially available AI platforms like Dialogflow or IBM Watson easily into your IVR. Your existing IVR developers won't have to change anything other than some information that goes into a speech grammar telling us which platform to use, and we'll take care of the integration to that external platform. It will still allow you to keep all of the audio local in the speech recognizer, which is more efficient and more secure, while using the external AI resources to create natural language understanding.
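The routing idea, where the grammar carries a hint naming the external platform and the gateway dispatches accordingly, might look conceptually like this. Everything here is hypothetical: the handler functions are stubs, and the real gateway's configuration format is not described in this level of detail.

```python
# Hypothetical sketch of grammar-driven dispatch. The platform names are real
# products; the handlers are illustrative stubs, not actual integrations.

def dialogflow_handler(text: str) -> dict:
    return {"platform": "dialogflow", "intent": "stub"}

def watson_handler(text: str) -> dict:
    return {"platform": "watson", "intent": "stub"}

HANDLERS = {"dialogflow": dialogflow_handler, "watson": watson_handler}

def gateway_dispatch(grammar_meta: dict, transcribed_text: str) -> dict:
    """Route transcribed text to whichever AI platform the grammar names."""
    platform = grammar_meta.get("ai_platform")
    if platform not in HANDLERS:
        raise ValueError(f"no handler configured for {platform!r}")
    return HANDLERS[platform](transcribed_text)
```

The point of the pattern is that the IVR application only changes the `ai_platform` hint; the dispatch and the external integration live in the gateway.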
What's the impact of the encryption? I mean, we talked about it a little bit, but how important is that to our end customers now that you're seeing data intrusions out in the marketplace? How important is the encryption to the customer?
In general, we all have security concerns these days. I have on my desk an email from an organization–that I’m a customer of–telling me about a data breach. Who hasn’t experienced that? Especially things like financial information, Social Security numbers, and credit card information. They all have legal compliance issues around them, and we all want to make sure we secure that data and prevent accidental use or access to it. By encrypting the audio data and the traffic between the Experience Portal and the LumenVox Speech Recognizer, it’s one more safety wall around that information. That information, once it is encrypted, can’t be decrypted outside of the two platforms, so you can’t snoop and listen to the SIP traffic that carries that audio and steal that information out of it.
You know, as we move forward in this relationship, LumenVox demonstrates what it means to really shake hands with Avaya and come to the market together.
We look to the future and we look forward to what's coming in 2021. Do you guys have any events or an opportunity for people to come by and say hello, or for people to contact you during a virtual event?
Actually, we do. The Avaya fiscal year 21 Americas Sales Conference is coming up in a few weeks. We'll be attending that as a virtual participant like everybody else, and we'd love to have you stop by our virtual booth, say hello, speak to us, tell us about your opportunities, and we can talk about the new product extensions. I am also actively trying to recruit some channel partners to be preview customers for the new AI gateway capability. So if there's somebody in the audience who has an interest in taking a deeper dive into the gateway, who may have a potential project, or who would like to build a proof of concept in their lab to demonstrate to customers, I'd love for them to reach out to me. Again, it's easy to find me, just email@example.com. I'd be glad to get you into that process, because we are seeking some preview customers, especially in the partner ecosystem.
That sounds like an excellent opportunity. For anybody out there who might want to take advantage of that, please do reach out to Jeff. If you're looking for additional information on LumenVox, please head over to the Avaya DevConnect marketplace at www.devconnectmarketplace. Jeff, as always, thank you for participating in the call today.
Thank you, Bill. I appreciate the chance to chat with you and to talk to your audience and look forward to continuing to work with you and Avaya and the whole ecosystem here.
Absolutely! For anybody looking for additional information, again, head over to the DevConnect marketplace. Once again, this is Bill Petty, that was about 8 minutes, and I’m out.
LumenVox Luminaries is a podcast that broadcasts thought leadership pieces on the subject of voice technology. This episode features Dr. Clive Summerfield, LumenVox Managing Director of EMEA-ANZ, discussing a case for active voice biometrics with his perspective on the specific benefits of Active Voice Biometrics.
I’m Clive Summerfield, Managing Director for LumenVox EMEA [Europe, Middle East, Africa] Australia and New Zealand, and I’m here to talk about Active Voice Biometrics. Many of you may have actually heard about Voice Biometrics, and I’m here to talk about the particular benefits of Active Voice Biometrics as opposed to Passive Voice Biometrics.
Q: What is Voice Biometrics?
Voice biometrics is a technology that allows a system to authenticate the identity of a person from an analysis of their voice. Like your fingerprint, your voice is unique. And so, like your fingerprint, your voice can be used as a very powerful technology for authenticating the identity of the speaker: you actually are who you say you are. And that's a literal statement; the sound wave that emanates from your lips and your nostrils contains within it an acoustic signature that is unique to that individual speaker. And that, in a nutshell, is what Voice Biometrics is all about.
Q: What’s the difference between Active and Passive Voice Biometrics?
Voice biometrics comes in two flavors, essentially: Active and Passive. Active is where a system asks you a specific question, and you have to answer that question with the correct voice. The obvious one here is a phrase like "My voice is my password," but it can be anything: it can be your name, your date of birth, your address, your zip code or, here in the UK, your post code. It could be an account number or your telephone number. So this is where Active Voice Biometrics is actively verifying your identity from a phrase that you're actually saying. In Active Voice Biometrics you have to say the correct phrase with the correct voice. Passive, on the other hand, is a technology that sits in the background and listens to a conversation. Passive Voice Biometrics is, as the name implies, passively listening to a conversation and simply using the voice characteristics of the speakers to recognize who they are.
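The Active decision rule above ("the correct phrase with the correct voice") is essentially a conjunction of two checks, which can be sketched as follows. The voice score and threshold are made-up inputs for illustration; a real engine produces its own calibrated scores.

```python
# Illustrative Active Voice Biometrics decision: the caller must say the
# expected phrase AND the voice must match the enrolled voiceprint.
# `voice_score` stands in for a similarity score from a real biometric engine.

def active_verify(spoken_phrase: str, expected_phrase: str,
                  voice_score: float, voice_threshold: float = 0.8) -> bool:
    phrase_ok = spoken_phrase.strip().lower() == expected_phrase.strip().lower()
    voice_ok = voice_score >= voice_threshold
    return phrase_ok and voice_ok  # both conditions must hold
```

An impostor who knows the phrase fails the voice check, and the genuine speaker saying the wrong phrase fails the phrase check; only both together authenticate.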
Q: Where is Active Voice Biometrics most appropriate?
The ideal application for Active Voice Biometrics is in telephone self-service, particularly for authenticating the identity of speakers using IVR systems and self-service applications, such as banking applications, government applications, and applications for retail and telecommunications services. In those applications the IVR prompts you for a piece of information: "Please say your telephone number." And you have to say the correct telephone number with the correct voice in order to positively authenticate your identity, giving you a very strong level of surety that the speaker is the account holder and not an impostor trying to break into your accounts.
I feel like the big application for Active Voice Biometrics has been telephone-based password reset. And there are numerous examples of Active Voice Biometrics being used for telephone password reset applications, principally in help desks and internal-facing employee helpdesk applications and services. But Active Voice Biometrics is far more than just a password reset application; it is also a password replacement technology, where voice can be used instead of passwords for many applications. These include IVR telephone self-service, chatbots, and increasingly online services and IoT devices.
Q: What about the future of Active?
Well, there is good news: voice interactions are growing in the world. Despite the demise of telephone call centers, voice communications are growing exponentially at the moment. According to Gartner, 30% of all searches are now voice-driven, which tells you where voice communications are going in the future. So the digital channel is where all the action is going to be in the near term.
And whilst Active still has a very important role to play in telephone services, the future of Active is actually in the digital channel. Digital channels are almost by definition driven by single phrases, which is where Active has very strong application over and above Passive. Active is the only technology in the digital channel that is applicable for things like second-factor authentication. I like to think that Active Voice Biometrics is actually the world's best second factor, one that can augment things like PINs and passwords that are traditionally stored in browsers. Active Voice Biometrics can be very effectively applied in digital channels, particularly in second-factor authentication for browser channels.
One of the big applications for Active Voice Biometrics is capturing voice through the browser as somebody is accessing a secure website, such as Internet banking and other services of that nature, to positively authenticate that the person is actually the account holder. This provides a much stronger security credential than PINs or passwords on their own. And Active is unique insofar as it's the only technology that can harmonize authentication between the emerging digital channels and the legacy telephone channels: you can use the same phrase in a digital channel as in a telephone channel. No other biometric technology provides this type of flexibility, which means organizations can harmonize their authentication strategies across all their customer service channels, thereby providing a much more effective customer service experience.
LumenVox Luminaries is a podcast that broadcasts thought leadership pieces on the subject of voice technology. This episode features Matt Whipple, Senior Vice President of Global Voice Biometrics Sales with his perspective on fraud and its effect on contact centers as it relates to COVID-19.
You can connect with Matt on LinkedIn here and Twitter here.
Listen to the Podcast
Read the Transcript
Hi, I'm Matt Whipple, the Senior Vice President of Sales for the Voice Biometrics suite of products within LumenVox. I've been working in voice biometrics for approximately 15 years, give or take.
So today we're talking about fraud, specifically fraud as it pertains to current events. COVID is changing the world rapidly, and those changes are increasing fraud dramatically, particularly for contact centers. So today we'll be discussing how COVID-19 is changing the face of fraud in the contact center.
Q: How is COVID-19 affecting fraud in the contact center?
We're seeing a couple of things. One is a dramatic increase in unemployment, and whenever we see increases in unemployment, we see increases in theft. People are either desperate or opportunistic, so let's look at that a little further. When people are unemployed, they don't have income, but they still have mouths to feed; they're willing to take advantage of other people, and particularly companies, when they can. So we get people who are not normally fraudsters starting to perform fraudulent activities. And the least risky way to steal from a financial institution, for example, is over the phone. Walking into a bank with a mask on and a gun in your hand is a surefire way to get caught; calling a call center, well, it's a very difficult way to get caught. So we're seeing fraud rising and fraud-related phone calls rising. Moreover, fraudsters are opportunistic. They're preying on people who are scared about the uncertainty in the market, and as a result fraudsters are using social engineering on individuals for the sake of taking over those individuals' accounts. They call a financial institution; they pretend to be you using stolen credentials, a Social Security number, a mother's maiden name, and so on, and they are increasing their attacks on financial institutions.
Q: How can businesses mitigate the risk?
Specifically around fraud in the contact center, there are a bunch of tools. One thing contact centers have been doing for a long time is looking for risky transactions. If I never place a high-dollar wire transfer out of my account and all of a sudden somebody is trying to wire a whole bunch of money out of my account, that could be a sign of fraud. But it also could be a sign of the times. Maybe I'm wiring money to friends and family who need it, so it doesn't necessarily mean there is fraud; it means that banks have to be especially cautious today about these sorts of transactions, and any tools that can uncover these anomalies are beneficial. Another layer of security (we always think of security in terms of layers) that is being deployed is voice biometrics. When a real customer is calling in on their own account, we compare their voice to their voiceprint on file, and we know who we're speaking to. We know that this is the real customer. We also have the capability to compare a caller's voice to the voiceprints of known fraudsters. If this is somebody who has stolen from a particular financial institution before, for example, we can identify that voice as the voice of a known fraudster; we can flag and prevent those transactions; we can secure our customer's account while ensuring the company isn't losing money to those fraudsters. This is how voice biometrics plays a dual role: one is authenticating real users, and two is stopping the fraudsters from stealing.
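The dual role described here, checking a caller against both the enrolled voiceprint and a fraudster watchlist, can be sketched with a simple similarity comparison. The vectors and threshold below are toy values; real systems derive voice embeddings from audio with a trained model, and the function names are illustrative.

```python
import math

# Sketch of the dual role: compare a caller's (toy) voice embedding against
# (a) the customer's enrolled voiceprint and (b) a watchlist of known fraudsters.

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def classify_caller(caller, enrolled, watchlist, match_thr=0.9):
    """Watchlist hits take priority; otherwise authenticate against enrollment."""
    if any(cosine(caller, fraudster) >= match_thr for fraudster in watchlist):
        return "known_fraudster"
    if cosine(caller, enrolled) >= match_thr:
        return "authenticated"
    return "unverified"
```

Checking the watchlist first reflects the priority in the text: flagging a known fraudster matters even more than authenticating a genuine customer.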
Q: Can you go into greater detail about LumenVox Fraud Scanner?
When we're looking for fraud within the contact center, specifically using voice, there are a couple of techniques we use; one of those is scanning high-risk calls shortly after the fact. Here's the idea: a fraudster is calling in on my account using my stolen Social Security number, my mother's maiden name, whatever the case is. Fraudsters may first perform a benign transaction, like just getting my account balance. That might be a low-risk transaction, and we might not chase it because there's not a lot of damage they can do; that is the fraudster probing my account, which will come back into play a little later. But when we get high-risk transactions, such as a fraudster trying to change my mailing address or my email address, ordering a new credit card, or reporting a lost or stolen card (the areas where the fraudster might have the opportunity to intercept my snail mail or my email, or to get a new card which they can then use either online or in retail), there are certain transactions we want to scan more than others. So what we do is take the call recording just after the call. The fraudster hangs up with the call center agent, and within an hour or a day, depending on business rules (we have flexibility), we scan that call; we compare the caller's voice to the voiceprints of known fraudsters, that is, people who have stolen from us before.
If a fraudster is successful in an account takeover, we listen to that recording, take that fraudster's voice, and add it to the watchlist. Now we compare the high-risk calls from today against that watchlist of known fraudsters. If any of today's voices match the voiceprints of fraudsters who have attacked us in the past, those accounts and transactions are flagged, and then the fraud analyst says, "OK, we've got a voice that matches somebody on the watchlist. It's an account that doesn't look like it has other signs of fraud." Say it's an address change or a new card request, for example; the fraud analyst will call that customer the next morning and say, "Hey, did you change your address? Did you order a new card?" If the customer says, "Yeah, I did," OK, good, then you did the right thing. If, however, the customer says, "No, I didn't do that," the fraud analyst says, "OK, good news: we just caught something on your account; your account is perfectly safe; there's nothing to worry about here." We've created a positive customer touchpoint, we've saved this customer from going through identity theft, and we've saved the bank or financial institution money. So it's a pretty easy process once it's set up: these batch files run almost automatically; the fraud analysts look at the results and react fairly quickly; we help keep customers safe, and we help keep fraud out of the organization.
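The batch workflow above (scan only high-risk call types, compare against the watchlist, queue matches for analyst review) can be sketched as follows. The risk categories, scores, and threshold are illustrative assumptions; a real deployment derives similarity scores from the call audio and configures the business rules.

```python
# Sketch of the after-the-fact batch scan. Only high-risk transaction types
# are scanned; matches go into a fraud-analyst review queue.

HIGH_RISK = {"address_change", "new_card", "email_change", "lost_stolen_card"}

def batch_scan(calls, watchlist_score, flag_threshold=0.85):
    """calls: dicts with 'call_id', 'type', and 'audio' keys.
    watchlist_score: callable mapping audio to its best similarity
    against the fraudster watchlist."""
    review_queue = []
    for call in calls:
        if call["type"] not in HIGH_RISK:
            continue  # benign transactions (e.g. a balance check) are skipped
        if watchlist_score(call["audio"]) >= flag_threshold:
            review_queue.append(call["call_id"])
    return review_queue

calls = [
    {"call_id": 1, "type": "balance_check", "audio": "a"},   # benign: never scanned
    {"call_id": 2, "type": "address_change", "audio": "b"},  # high risk, matches
    {"call_id": 3, "type": "new_card", "audio": "c"},        # high risk, no match
]
flagged = batch_scan(calls, lambda audio: {"a": 0.99, "b": 0.90, "c": 0.10}[audio])
```

Note that call 1 is skipped even though its score would match: the probing call stays under the radar, exactly as described, until the fraudster attempts a high-risk transaction.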
Q: Can you go into greater detail about LumenVox Passive Voice Biometric Authentication?
A passive authentication deployment listens to the audio in real time. You've all heard, "Your call may be monitored or recorded for quality and training purposes." And it's true, your call is being recorded, but there are a couple of things that can be going on behind the scenes. For some of the large banks in the US, and for a few of the smaller banks, what's actually happening is that as a customer has a conversation with a call center agent, their voice is being used to create a voiceprint. Once that voiceprint has been built and the agent has received consent, something like, "We're using voice security now. Is it OK if I tie your voice to the security of your account?" (consumers overwhelmingly say yes), then the next time that caller calls in, we compare the caller's voice to the voiceprint on file. Instead of the agent asking for your Social Security number or your PIN or your mother's maiden name, the caller's voice is compared to the voice profile, and the agent gets a green light on their desktop saying no more security questions are necessary. We lower handle times. We save operational costs. And we increase customer satisfaction as well as agent satisfaction. So this is a very positive technology in terms of customer enhancement while driving operational costs down.
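The passive flow has two gating conditions before the green light appears: enough conversational audio to build a voiceprint, and recorded consent. A small state sketch, with an invented enrollment duration (real engines have their own audio requirements):

```python
# Sketch of passive enrollment state. The 30-second figure is an assumption
# for illustration, not a documented LumenVox requirement.

ENROLL_SECONDS = 30.0  # assumed net speech needed to build a voiceprint

class PassiveProfile:
    def __init__(self):
        self.speech_seconds = 0.0
        self.consented = False

    def add_audio(self, seconds: float):
        """Accumulate net speech from ordinary agent conversations."""
        self.speech_seconds += seconds

    @property
    def enrolled(self) -> bool:
        return self.consented and self.speech_seconds >= ENROLL_SECONDS

    def agent_prompt(self) -> str:
        """What the agent desktop would show: green light skips the questions."""
        if self.enrolled:
            return "no security questions necessary"
        return "ask security questions"
```

Until both conditions hold, the agent falls back to the traditional knowledge-based questions; once enrolled, subsequent calls are verified silently in the background.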
Q: What sets LumenVox apart in the market today?
LumenVox has a long and rich history in both speech recognition and voice biometrics. In speech recognition we're deployed all over the world in dozens of languages. In the voice biometrics world, where we have historically played is password resets, employee-facing applications, and deploying active biometrics in the IVR. What's changing, as the market evolves and as we as a company evolve, is that we recognize fraud is growing, and contact center fraud is growing faster than all other fraud. Humans are the weakest link, and fraudsters know that. They are exploiting the fact that it's human contact center agents, whose job is to be helpful, not to be security experts. It's pretty easy to socially engineer them, so it's a space that many of us within LumenVox have been playing in for a very long time. We've got tremendous depth in fraud-detection capabilities. We're bringing it to market in a slightly different way, one meant to be repeatable, fast, and nimble. As we're catching fraudsters and the market's changing quickly, we're going to differentiate by adapting more quickly than some of the bigger, more established vendors already in this space.
LumenVox Luminaries is a podcast that broadcasts thought leadership pieces on the subject of voice technology. This episode features Jeff Hopper, Vice President of Business Development, with his perspective on LumenVox's next generation of conversational IVR.
I want to tell you about some work that we're doing in our engineering team right now that will begin to become available in 2020. We've taken a step back and looked at the existing state of the speech recognition market for the IVR space, the product that we used to have and deprecated, what our competitors do, and so on. And we've concluded that there's a better way to go about this than the way the industry historically has.
When you look at our competition, their traditional tier-four speech recognition was speech recognition with natural language understanding. It was, first and foremost, 10-year-old technology and a proprietary black box. The only people who could develop an application for a customer with it were that speech vendor's professional services team. With my 20 years of personal experience in the space, I can count on the fingers of one hand the people outside of that vendor who can actually build a tier-four application successfully for you.
So our first driver for this new idea was: let's take advantage of some things that have changed in the state of the art technically, and let's build a new platform that is more open, more accessible, easier to use, and not that proprietary black box for speech recognition. If you understand any of the history of natural language IVRs, essentially the idea is that instead of being asked specific questions, like "What city do you want to fly to?" and answering "Memphis" or "Nashville," where the recognizer can only make a determination from a defined list of choices, you should be able to say things like, "I'd like to book a flight next Tuesday from Seattle to Memphis in the afternoon." The recognizer should be able to parse out both the intent ("I want to book a flight") and all of the necessary values in that statement: the departure city is Seattle, the arrival city is Memphis, and the travel date is next Tuesday, all from that one conversational statement the caller makes. The traditional mechanism has been to build proprietary applications that use two parts under the hood, though most people don't realize there are two parts. The first is the speech recognizer, which takes what I said and converts it into raw text. The second is something called an NL or, traditionally in the speech space, an SLM, a statistical language model, which takes those words, parses them apart, and tries to infer the meaning based on machine learning.
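To make the intent-plus-entities idea concrete, here is a toy parse of the flight-booking utterance from the text. Real NLU uses a trained model; this regex slot-filler is only a sketch of the output shape such a system produces.

```python
import re

# Toy illustration of intent + entity extraction from a conversational
# utterance. The slot names (departure_city, arrival_city, travel_date)
# are illustrative, not a defined schema.

def parse_booking(utterance: str) -> dict:
    result = {"intent": None, "entities": {}}
    if "book a flight" in utterance.lower():
        result["intent"] = "book_flight"
    m = re.search(r"from (\w+) to (\w+)", utterance, re.IGNORECASE)
    if m:
        result["entities"]["departure_city"] = m.group(1)
        result["entities"]["arrival_city"] = m.group(2)
    m = re.search(r"next (\w+)", utterance, re.IGNORECASE)
    if m:
        result["entities"]["travel_date"] = "next " + m.group(1)
    return result

parsed = parse_booking("I'd like to book a flight next Tuesday from Seattle to Memphis")
```

The structured result (an intent label plus named entity values) is exactly what the IVR application needs to act on, which is why the two-part recognizer-plus-NLU design exists.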
It is not very different conceptually from modern machine learning and artificial intelligence, except that it's built on a much older set of tools and a much more limited set of machine learning capabilities. So when you build an application like that today with our competition's ASR offering, it is a sealed box: it's difficult to change over time, and such applications tend to be extremely expensive to deliver from a professional services perspective.
So what we’re proposing, and not just proposing but building the infrastructure for, is a new generation of conversational IVR. And we’re going to do it in a couple of ways. We’ve already done what I call part A of the three parts: we have built an entirely new speech recognition engine based on the latest in machine learning processes, specifically deep neural networks, so that the core recognizer in this stack is absolutely state of the art, has excellent recognition capabilities, and is easy to stand up, install, and configure in your application stack. More importantly, it’s designed to do transcription, not directed dialogue with grammars like that old style of IVR application. It’s intended to take raw speech from a caller and transform it into text. The second part, part B, of our application stack is going to be a new artificial intelligence platform that uses machine learning. It’s built on commercially available, state-of-the-art AI components that already exist today, components from companies like Google, or from firms Google has purchased whose work Google has put out into the open-source world. We’re going to build the machine learning piece that determines the intent from the text and extracts the values or entities, like departure city, arrival city, or whatever the particular conversation might involve. From that text we can pass the results back to an application in your IVR to do work. That second part is in engineering now, in the process of productization, and it will give you an excellent starting point for what is typically a difficult process with tier-four applications today. And the tool set is one that is widely commercially adopted; lots of people already understand how to use it. We’re essentially just going to provide the plumbing to connect it to the rest of your IVR stack and our speech recognizer in a simple and easy way.
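Conceptually, parts A and B form a simple two-stage pipeline: transcription, then intent determination. The sketch below shows that flow with both stages stubbed out; every function name here is an illustrative placeholder, not part of any real LumenVox interface:

```python
def transcribe(audio: bytes) -> str:
    """Part A: a DNN-based recognizer would turn raw caller audio into text.
    Stubbed with a canned transcript for illustration."""
    return "i want to check my account balance"

def determine_intent(text: str) -> dict:
    """Part B: an ML model would classify the intent and extract entities.
    Stubbed with trivial keyword matching."""
    if "balance" in text:
        return {"intent": "check_balance", "entities": {}}
    return {"intent": "unknown", "entities": {}}

def handle_call(audio: bytes) -> dict:
    # The IVR application receives the structured result and acts on it.
    return determine_intent(transcribe(audio))

print(handle_call(b"\x00"))  # {'intent': 'check_balance', 'entities': {}}
```

The "plumbing" described above is essentially what connects these two stages: audio in, text through, structured intent out to the voice application.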
Coming on top of that, the third part of this process will be the addition of something we’re calling an AI gateway. If you look at the slide in front of you right now, you can see the AI platform on the right-hand side, with LumenVox listed below it as one possible AI platform, but up above you see a number of other names you’ll recognize: Amazon Lex, Microsoft LUIS, Google Dialogflow, IBM Watson, and others. Those are all widely used, commercially available AI engines that use machine learning to help you parse out the answers you’re looking for from the text. What we’re going to do is provide a configurable gateway that operates from the LumenVox media server so that, in your IVR applications, you can take advantage of existing AI you’ve already built with those commercial tools, things like FAQ chatbots on your website today, or other mobile applications you’ve built that use text and machine learning to respond to it. You’ll be able to take those models and add them to your existing IVR stack, so you’re not starting from scratch with the learning process for the AI mechanisms. You can continue to reuse something you’ve already built and enhance it. That’s almost always less expensive than building a new AI platform and a new AI model for your particular business situation from scratch.
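A configurable gateway like the one described can be pictured as a registry of engine connectors plus a per-application routing table. The sketch below shows that pattern; the provider names are real products, but every class and method here is a hypothetical illustration, not the actual gateway’s API:

```python
from abc import ABC, abstractmethod

class IntentEngine(ABC):
    """Common interface the gateway expects from every AI back end."""
    @abstractmethod
    def analyze(self, text: str) -> dict: ...

class EchoEngine(IntentEngine):
    """Stand-in for a real connector (e.g. one wrapping Lex or Dialogflow)."""
    def __init__(self, name: str):
        self.name = name

    def analyze(self, text: str) -> dict:
        return {"engine": self.name, "utterance": text}

class AIGateway:
    """Routes recognizer output to whichever engine each application is configured for."""
    def __init__(self):
        self._engines = {}   # engine name -> IntentEngine
        self._routes = {}    # application id -> engine name

    def register(self, name: str, engine: IntentEngine) -> None:
        self._engines[name] = engine

    def route(self, app_id: str, engine_name: str) -> None:
        self._routes[app_id] = engine_name

    def handle(self, app_id: str, transcript: str) -> dict:
        # Look up the configured engine and forward the transcribed text.
        engine = self._engines[self._routes[app_id]]
        return engine.analyze(transcript)

gateway = AIGateway()
gateway.register("lex", EchoEngine("lex"))
gateway.register("dialogflow", EchoEngine("dialogflow"))
gateway.route("billing_ivr", "dialogflow")
print(gateway.handle("billing_ivr", "I want to pay my bill"))
# {'engine': 'dialogflow', 'utterance': 'I want to pay my bill'}
```

The design point is that the IVR application only ever talks to the gateway; swapping Watson for Dialogflow becomes a configuration change rather than an application rewrite, which is what lets existing chatbot models be reused.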
We have some customers who are already using this approach in an experimental stack, and I say experimental because these are early proof-of-concept applications. Rather than going out of the LumenVox media server, they’re making the AI request out of their voice application platform today, which requires a little more work on their part. But we know that the new generation of recognizer we have in place, when combined with that kind of external AI approach, is actually working well. And then in 2020 we will add that third part, the AI gateway, to the LumenVox media server to make all of the integration work simpler, quicker, and easier for you.
Have questions about our next generation of conversational IVR? Contact us today!
Avaya Podcast Network spoke with our very own Jeff Hopper, Vice President Business Development, for the 8 & Out podcast, a series featuring Avaya Select Product Partners. LumenVox has been a proud Avaya Supported Select Products Provider since 2012 and offers LumenVox Call Progress Analysis, LumenVox Speech Recognizer, and the LumenVox Speech-to-Text Server on the Avaya DevConnect Marketplace. In this interview, Jeff explains new, exciting innovations within LumenVox’s technology stack as well as his perspective on the industry itself—where it’s headed, and how LumenVox can continue to set itself apart with flexible, cost-effective solutions.
Read the Transcript Below:
Hey, this is Bill Petty with APN, the Avaya Podcast Network. I’m sitting here live on the Avaya Engage 2020 floor, talking with Jeff Hopper of LumenVox. Jeff, thanks for joining us.
Thank you very much Bill. I’m really delighted to be here, despite the raspy voice from three days on the trade show floor.
I think we all have a little experience with that. So, tell me a little bit about what LumenVox is doing and what you are pitching here to our customers and channels?
Sure. LumenVox is a provider of speech technologies including speech recognition, text-to-speech, call progress analysis, and voice biometrics for authenticating your callers in your customer self-service or contact center environment.
And how pervasive is speech-to-text these days?
You know, it used to not be so much, but now it’s everywhere. We all have things like Amazon Echoes, or Google Homes, or other personal assistant devices, so it’s become an expected component of a contact center these days for self-service and for assisting the agents.
Right, and I know LumenVox is a long-standing relationship partner with Avaya but, tell us a little bit about the kind of progression of what’s happening these days.
Absolutely. It’s one of the most exciting parts of where we are now in our journey with Avaya and with the Avaya customers and partners channel. I’ve been at LumenVox 8 years, my 7th IAUG in that capacity, and when I started we were a DevConnect member. We had developed some business overseas more than in the North American market and then we progressed into the SPP program about 5 years ago. We had some phenomenal growth in business and awareness in the Avaya customer base. We’ve taken on several dozen new large customers in the ecosystem, and just this past year, we’ve signed a further advancement of that agreement. We now have official Avaya part codes so our product can be ordered through Avaya as a reseller – making the process much simpler for everybody and just hopefully helping to accelerate the ease of adopting our speech recognition products.
So I know that as we move a company through those relationship models, you start with DevConnect, you go into the SPP (which is a big deal; you know they’re very selective about who they choose), and then you move into a resale model, kind of expanding the approachability and availability of the solutions. Tell me a little bit about what’s really driving that relationship from the LumenVox and Avaya side.
I would be delighted. It’s something I take personal pride in, having been involved with it for the last 8 years, so I really take some joy in this. I think the best exemplar I have is our Net Promoter Score. We have sustained an average Net Promoter Score of 89, and in a business-to-business model, you know you’re working your keister off to accomplish that for your customers. Our customers and our partners consistently come back and say “great job,” “easy to work with,” “the product is easy to install, configure, and use.” We have just tried to reduce the friction of using speech recognition in the environment and make it easy for everybody to use that technology effectively in their contact center and in their self-service.
Oh, that’s fantastic! You know, at Avaya, one of our key slogans right now is “Experiences that Matter,” and apparently LumenVox is making the experience of installing, implementing, configuring, and using the solution a very positive one. Share with me a little bit about the experience of the end users: how does this relate to what’s going on, and how is your solution a game-changer?
Speech recognition was always something kind of like Harry Potter magic. You had to go to Hogwarts and learn some secret wizard handshakes and incantations. We’ve just tried to simplify that install-configure-and-use part, and then we’ve worked to enable the channel partners who actually build applications to raise their skill level and their expertise in user interface design: all the things that allow them to work with end customers to get a really great customer experience out of the applications.
How is the relationship with Avaya progressing as far as a technology development perspective as you look for opportunities to build hooks and implement within our shared structure? How are things working from that perspective?
It’s been a marvelous year. I was in New York back in November to meet with a team of Avaya executives at the Briefing Center, and we previewed some new technology that we’re bringing to the table in 2020. We’re adding a new approach, if you will, to conversational speech recognition. Where those have been very closed, proprietary systems in the past, we’re building a mechanism that lets you use commercially available AI tools to bring AI to conversational speech; it’s not proprietary. You’ll be able to use any of the major AI resources, whether that’s TensorFlow at a data center level, or Google, or Watson, or any of those things, together with the speech recognition in your self-service applications. You can reuse AI that you’ve already created for other channels and more easily incorporate it into the product stack.
How’s this improving the customer’s experience?
The better, more well-trained AI models are what’s improving the customer experience, and we all know how many people have been working on those kinds of things. When you’ve got a better tool set, not a proprietary one but one lots of people are contributing to, it just makes it easier to get the AI right, give an appropriate response to the caller, and make their experience less friction-bound, if you will.
As you look toward 2020 and start building these new measurement tools, what are you going to be delivering for the customer, and how is that going to improve what they’re seeing in the use of the LumenVox solution?
So, from the perspective of the customer who implements this, they’ll be able to take advantage of other initiatives they’ve already done, like chatbots for example, and voice-enable them in their customer pathways through the contact center. They won’t have to build the entire thing twice with a new learning model; they can simply voice-enable it. I always say, give your chatbot a voice.
I love the idea of a voice-enabled chatbot because I have big thumbs.
Yes, me too.
And I have a really hard time trying to type on that little keyboard in an amount of time that somebody, or the bot, is actually waiting for me to respond. I should be able to talk into it and say, “I need, this is what I’m looking for,” and not have to type it.
Let the computer change the speech into text. One thing we can add here is that because we have such a complete stack of products now, with voice biometric authentication, we can help you secure those application pathways as well as service them with the speech recognition and the AI technology.
So, tell me something unique about what LumenVox is doing. I know we’re here on a tradeshow floor and we’re amongst all these partners; give me an idea of what LumenVox is doing that’s kind of new and unique, especially if it has something to do with what Avaya is doing in the market space.
Sure. I think the thing we’re focused on the most, and I mentioned it a little earlier, is making this stuff easier to use and incorporate into the solutions that get to the end customer, to the caller, so that there are lower project costs and quicker time to market. Basically, accelerating things and making them easier is a big enhancement, because we all know this has traditionally been kind of a black box area. Between technology advancements in the software and how we produce these things, a real strident focus on quality and on a management user interface that makes those things simpler and gets rid of any friction we can, and the growth in computational capacity with cloud computing along with the general reduction in the cost of computing, we’ve arrived at a place in time where speech should be ubiquitous and should be an expected component of the customer service path.
Well, it is becoming more and more pervasive, there’s no doubt.
That’s something that we’re all seeing, and I think it’s something we’re all becoming a little more comfortable with.
Especially if you’re in my generation.
Well, yeah, it was tough; you’d question the little device sitting on your desk that’s listening to every word you say to see if you’re talking to it. But I think we’re becoming a little more comfortable with that. You know, our cell phones are listening to every word. I can guarantee it from the ads that pop up on social media: I talked about wanting to buy my wife a weighted blanket, and all of a sudden I had an ad for a weighted blanket showing up on my phone within an hour. I don’t put my tinfoil hat on, though.
I agree with you completely. You know, we get that question in our space. With voice biometrics, we’ve built a set of products that are very secure. They meet GDPR compliance requirements in Europe for protecting people’s privacy, which is essentially the highest standard around the globe today. We are very mindful of those elements in our product development process, making everything secure whether it’s on your premises, in a private cloud, or even in a public cloud.
So Jeff, I know you guys are here on the floor, and for those that are out there listening, please make sure you go to the Avaya DevConnect marketplace at devconnectmarketplace.com, look up LumenVox, and find out about their solutions. Jeff, thank you so much for sitting down talking with us today.
I appreciate it very much Bill, it’s been a pleasure.