The development of Automatic Speech Recognition techniques continues to accelerate. Already an established technology, Automatic Speech Recognition is growing by leaps and bounds each year, especially as artificial intelligence contributes to its evolution. A crucial building block of artificial intelligence is deep learning.
What is Deep Learning?
Deep learning refers to the process of a computer model learning how to do classification tasks by example, directly from audio, text, or images. These models are trained using very large sets of data and neural network topologies with many hidden layers, to which the word “deep” refers. Deep Neural Networks can achieve state-of-the-art performance in many different fields, even exceeding human-level performance on some of them.
What are Neural Networks?
More specifically, neural networks are a series of algorithms, whose job it is to identify relationships within a set of data, a process that simulates the way a human brain identifies underlying connections. When it comes to speech technology, neural networks enable us to push the limits of speech recognition.
Which Neural Network for Automatic Speech Recognition?
Deep Neural Networks are transforming the way humans interact, playing an important role in the technological revolution of artificial intelligence. At LumenVox, our Research and Development team is currently utilizing Time Depth Separable Convolutional Neural Networks (TDS CNN).
Convolutional Neural Networks are advantageous for a few reasons: They are computationally efficient, making them highly useful for mobile applications, and they have fewer knobs to toy with, fewer parameters to adjust. That means LumenVox customers get an ASR engine with greater speech recognition accuracy without requiring more compute resources.
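One way to see why convolutional layers can be computationally lean is to count learned weights. The sketch below compares a fully connected layer with a 1D convolution over the same input. It is a generic illustration, not a description of the TDS CNN architecture LumenVox uses, and the layer sizes are hypothetical values chosen only for the arithmetic.

```python
def dense_params(time_steps: int, in_ch: int, out_ch: int) -> int:
    """Weights plus biases of a fully connected layer over a flattened (time x channel) input."""
    n_in = time_steps * in_ch
    n_out = time_steps * out_ch
    return n_in * n_out + n_out


def conv1d_params(kernel: int, in_ch: int, out_ch: int) -> int:
    """Weights plus biases of a 1D convolution; the same kernel is reused at every time step."""
    return kernel * in_ch * out_ch + out_ch


# Hypothetical sizes: 100 audio frames, 80 features per frame, kernel width 5.
T, C_IN, C_OUT, K = 100, 80, 80, 5
dense = dense_params(T, C_IN, C_OUT)
conv = conv1d_params(K, C_IN, C_OUT)
print(f"dense: {dense:,} parameters, conv1d: {conv:,} parameters")
```

Because the convolution shares one small kernel across all time steps, its parameter count does not grow with the length of the audio, while the dense layer's count grows quadratically.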
LumenVox’ deep learning technology is applied to many of our technologies, including Automatic Speech Recognizer, Natural Language Processing, and Voice Biometrics. To learn more about our comprehensive stack, or to take an even deeper dive into deep learning, contact us today!
Speech technology can truly bring the customer experience to life, but it takes a unique blend of creativity, technology, and hardware to do so. We recently interviewed LumenVox Software Engineer Shaun McThomas to gain his perspective on the art of integrating speech technology with IVRs to enhance the customer experience.
Hi, my name is Shaun McThomas. I’m a Software Engineer at LumenVox, and today I’m here to talk to you about creating next-generation conversational Interactive Voice Response systems. IVR for short.
What are the biggest issues facing customers and IVRs today?
One of the biggest issues with IVRs today is that callers are forced to follow a rigid script. It’s not a conversation, it’s an interrogation. First, they are asked for one bit of information; then they are asked for another bit and another bit. There is no flow like you would have with another person, just a series of “painful, tiny steps”, making the whole process rigid and uncomfortable for the caller.
Another issue is that you often get trapped in IVR jail with no escape route. You are forced to listen to the very end of the prompt before you can respond, “This isn’t what I want; let me go back to the Main menu.”
What do you think the solution is to address these pain points?
Most of these issues are easily addressed with an artful blend of good design and use of modern speech recognition technologies, what LumenVox calls Speech Art. If you listen to the very best contact center agents within a business and model how they solve the same issues and how they question a caller, you’ll understand how callers really ask questions and can provide very lifelike IVR responses. By following this model, you can produce frictionless, intuitive (and personalized) interactions with callers, radically improving their experience.
The very first thing a good IVR should do is quickly identify who the caller is and confirm that assumption. Remember, a blend of technologies can make this easier. Look up the phone number they are calling from in your back-end systems and see if you can determine their identity from that. You can use both speech recognition and voice biometric authentication to make that process simple and easy if needed. More than likely they are calling from their cell phone, which provides a unique identifier.
Once you’ve identified the caller, use data available from your back-end systems to anticipate the reason for calling and personalize the next steps.
For example, if you’re a power company and a customer’s home is in the middle of a known power outage, assume that’s the reason for the call. Likewise, if you’re an airline and they have a flight booked on your airline that departs within the next 24 hours, assume that’s the reason for the call. Now that you’ve made an assumption, confirm that’s the reason they are calling with a simple yes/no prompt and, if yes, provide them with appropriate information. If they are not calling for that reason, ask them why they’ve called, but allow them to use natural language to answer. And always give them a way to correct themselves.
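The identify-then-anticipate flow described here can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `ACCOUNTS` table, the `Caller` fields, and the prompt wording are all hypothetical stand-ins for whatever your back-end systems actually provide, and the power-outage and flight examples are combined in one table purely for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Caller:
    name: str
    in_outage_area: bool = False                  # hypothetical back-end flag
    hours_to_next_flight: Optional[float] = None  # hypothetical back-end field


# Hypothetical back-end data keyed by the number the caller is dialing from.
ACCOUNTS = {
    "+15550100": Caller("Alex", in_outage_area=True),
    "+15550101": Caller("Sam", hours_to_next_flight=6.0),
    "+15550102": Caller("Jo"),
}


def identify(phone_number: str) -> Optional[Caller]:
    """Look up the caller by phone number; None means we could not identify them."""
    return ACCOUNTS.get(phone_number)


def anticipate_reason(caller: Caller) -> Optional[str]:
    """Guess the most likely reason for the call from back-end data."""
    if caller.in_outage_area:
        return "report_outage"
    if caller.hours_to_next_flight is not None and caller.hours_to_next_flight <= 24:
        return "upcoming_flight"
    return None


def opening_prompt(phone_number: str) -> str:
    """Build a personalized yes/no confirmation, or fall back to an open question."""
    caller = identify(phone_number)
    if caller is None:
        return "Thank you for calling. How can I help you?"
    reason = anticipate_reason(caller)
    if reason == "report_outage":
        return f"Hi {caller.name}, are you calling about the power outage in your area?"
    if reason == "upcoming_flight":
        return f"Hi {caller.name}, are you calling about your upcoming flight?"
    return f"Hi {caller.name}, how can I help you today?"


print(opening_prompt("+15550100"))
```

If the caller answers "no" to the confirmation, the application would fall back to the open question and let them answer in natural language.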
How does Conversational IVR work, exactly?
Conversational IVRs work by leveraging three key technologies: ASR, NLU, and TTS. These aren’t the only pieces of the puzzle, but they are important ones. Let’s talk a little about each.
First, there is Text to Speech (or TTS for short). TTS is the method to turn text into speech. This is key to asking questions quickly and easily. It is important to use TTS instead of recordings to allow questions to be personalized. For example, when a caller first calls in and you want to verify them, you can use their name and directly ask if it’s them.
Next, there is the Automatic Speech Recognizer (or ASR for short). An ASR’s job is to take speech, recognize it as something meaningful, and then turn it into something useful like text. There are lots of types of ASRs. LumenVox’s new transcription ASR uses machine learning techniques such as deep neural networks for natural language processing. This is effective for transcribing text from human speech. Before this sort of technology existed, you had to constrain your recognizer to a limited set of words (called a grammar), and those were the only words it could recognize. Modern NLP models can recognize a large set of words, allowing you to speak naturally, and the ASR can feed back the raw transcribed text. Once the ASR has done its job, we have that text.
Finally, we need to use another technology, which is Natural Language Understanding or NLU for short. NLU takes this text and converts it into meaning, intents, and slots, for example:
The caller can say: “I want to fly from New York to LA.” From that we parse out the intent, “to fly,” the origin, “New York,” and the destination, “LA.”
Using these three technologies we can create a conversation with a caller rather than a scripted interrogation. First, we would use TTS to ask the caller a question, then an ASR to get text back from the caller’s response, then NLU to understand that response, and then finally use that understanding to figure out what additional information we need from the caller or to process the caller’s request.
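The TTS, ASR, NLU loop can be sketched as a single conversational turn. The code below is an illustration only: `toy_nlu` is a tiny rule-based stand-in for a real NLU engine, the `speak` and `listen` callables are hypothetical placeholders for real TTS and ASR calls, and none of the names reflect an actual LumenVox API.

```python
import re
from typing import Callable, Dict, Optional

Slots = Dict[str, Optional[str]]


def toy_nlu(text: str) -> Slots:
    """Rule-based stand-in for an NLU engine: extract an intent and two slots."""
    result: Slots = {"intent": None, "origin": None, "destination": None}
    if re.search(r"\bfly\b", text, re.IGNORECASE):
        result["intent"] = "book_flight"
    match = re.search(r"\bfrom (.+?) to (.+?)[.?!]?$", text.strip(), re.IGNORECASE)
    if match:
        result["origin"] = match.group(1)
        result["destination"] = match.group(2)
    return result


def next_prompt(slots: Slots) -> str:
    """Decide what to ask next based on what we still need to know."""
    if slots["intent"] != "book_flight":
        return "How can I help you?"
    if slots["origin"] is None:
        return "Where are you flying from?"
    if slots["destination"] is None:
        return "Where are you flying to?"
    return f"Booking a flight from {slots['origin']} to {slots['destination']}. Is that right?"


def run_turn(prompt: str,
             speak: Callable[[str], None],
             listen: Callable[[], str],
             understand: Callable[[str], Slots] = toy_nlu) -> str:
    """One turn: TTS asks, ASR transcribes the reply, NLU interprets it."""
    speak(prompt)             # TTS: play the question to the caller
    text = listen()           # ASR: transcribed caller response
    slots = understand(text)  # NLU: intent and slots
    return next_prompt(slots)


# Stubs standing in for real TTS and ASR engines.
spoken = []
reply = run_turn("How can I help you?",
                 speak=spoken.append,
                 listen=lambda: "I want to fly from New York to LA.")
print(reply)
```

Passing the engines in as callables keeps the dialog logic independent of any one vendor, which is the same separation the next section argues for.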
What sets LumenVox apart?
At LumenVox, we’re creating a Configurable AI Gateway that makes it easy to integrate many different NLU engines with our ASR. This approach opens the possibility to use widely available NLU platforms from IBM, Google, Microsoft, Amazon, and others with your existing IVR along with LumenVox ASR, TTS and Voice Biometrics.
Many technology vendors don’t offer choices in the combinations of ASR, NLP, and NLU that you can use to build a solution. Their entire suite of technology and tools is often proprietary, expensive, and because it’s proprietary, involves the use of expensive, dedicated professional service teams. At LumenVox we want to be able to easily integrate existing technologies with our speech recognition, text-to-speech, and voice biometrics software as part of the solution stack. We want to take the technology that’s already out there and make it easier for our customers to use.
Ready to take your contact center to the next level by implementing a conversational IVR? Contact us today!
In response to COVID-19 and its burden upon our customers, we wanted to do our part. As of today, LumenVox Advanced Speech Recognition (ASR) customers who have a maintenance agreement can use up to a 50% increase of their current ASR license volume, at no additional cost, for 60 days.
It’s imperative to stay connected to your customers right now. During this uncertain time, LumenVox remains flexible and committed to serving our partners and customers, making sure your service remains seamless as demand for and dependency upon remote channels increase.
“We hope that this flexibility to our licensing allows for increased automation and alleviates some of the contact center stress we are experiencing during COVID-19. This is just one way we can help come together and support one another.”
-Edward Miller, CEO of LumenVox.
This free bursting also demonstrates our commitment to both outstanding customer service and meaningful support, from speech application development to deployment and daily use.
LumenVox ASR is a software solution that converts spoken audio into text. Its ability to recognize naturally spoken language and its tuning flexibility set the technology apart as an industry standard. With LumenVox ASR, user experiences improve, and completion rates rise. To accompany this technology, we provide a wide selection of licensing options, including per-port, monthly subscription, use-based, bursting and Software as a Service (SaaS).
LumenVox is also directly addressing one of the major issues resulting from COVID-19, a spike in fraudulent attacks on contact centers, with Fraud Scanner, a voice-based fraud detection tool. Learn more about this fraud detection strategy here.
To further discuss your licensing needs, or to inquire about any of our solutions, contact a service representative here.
You can read a message from our CEO further detailing our response to COVID-19 here.
How the customer experience evolved from snail mail to Conversational IVR
LumenVox and its artificial intelligence speech technology have initiated an evolution within the world of customer service. In the good ol’ days, people went to the bank to deposit a check. They absolutely had to make the trip; there were no other options. Eventually, banks got savvy and integrated with another institution, the post office, to deliver printed account information to your doorstep, send or receive checks, etc. Eventually, with the integration of the telephone and its proliferation of use in businesses, we got phone support. If we had any questions or concerns, we could call a teller or bank representative to get the needed information. Over time, the number of incoming calls far outstripped the number of representatives available. There could never be enough reps to answer all the calls coming in each day.
With speech recognition, self-service phone options were put into place. When customers called the bank, instead of a human voice, they got automated options: “Press 1 for your balance. Press 2 for hours and location,” etc. But those menus can get quite annoying—people quickly got tired of the amount of time it took to listen and navigate. Everyone pressed zero, hoping to skip straight to a human voice.
To solve this issue, speech interactions were introduced, but only via directed dialog, i.e., “Say yes or no.” It wasn’t that transformative, really; there was still a fair amount of patience required by the customer. Eventually phrases such as “Account Balance” and “Did my check clear?” were put into place. This was considered an improvement by customers, but there were issues with the inconsistencies in customers’ speech. As a result, there was a good amount of “I’m sorry, I didn’t quite get that.”
But businesses did finally get it. They learned from all the trying and testing that the ease of the customer experience, now primarily delivered by mobile phone, is just as important to a customer as the quality or price of the items or services purchased. Essentially, time is money to everyone. This realization spurred the notion of a conversational IVR and made it instrumental in the design of self-service solutions.
The goal of Conversational IVR is to have automation mimic a conversation with the consumer. Open-ended questions, such as “Thank you for calling. How can I help you?”, have become the desired greeting. This evolution into machine-driven yet human-like interaction relies on Deep Neural Networks (DNNs), their ability to seek out, classify and order information in more complex or “intuitive” ways, and their contribution to Natural Language Understanding (NLU) and Natural Language Processing (NLP). NLU/NLP is a specific branch of artificial intelligence that enables human-computer interaction using the input of sentences in text or speech.
Integrating NLU/NLP into a customer service environment results in a customized user experience: Enterprises are now able to recognize and automatically speak to you by name when you dial in. Your phone number is matched to your account information so they can even predict what you may be calling about (e.g., “I see you have booked a flight…”).
Conversational IVR promotes a better customer experience. Frustration is reduced, just as conversation is encouraged. It’s a win-win: Customers have instant, secure access to vital information while businesses can effortlessly exceed service expectations. LumenVox’ next generation of Advanced Speech Recognition incorporates this advanced technology into a seamless solution for businesses. And our presence at Avaya Engage (Booth 300) will allow you to familiarize yourself with its capabilities in person to see exactly how it can benefit your business.
LumenVox and Pivot announced today that Pivot Technology Services (Pivot) has officially become a LumenVox Skills Certified Partner. LumenVox Partner Skills Certification demonstrates Pivot’s capability to deliver high-quality speech solutions based on the LumenVox speech automation suite. Pivot, a trusted provider and integrator of the world’s leading technology solutions and services, utilizes the complementary LumenVox Automated Speech Recognizer and Text-to-Speech solutions to improve the customer experience and employee productivity within contact center agent portal environments.
“LumenVox is excited to see the facilitation of a large joint opportunity that the evolution of our partnership with Pivot has already brought,” stated Ed Miller, LumenVox CEO, “and we look forward to supporting their development of innovative and dynamic speech-enabled solutions.”
“Becoming LumenVox Skills Certified broadens the benefits for our customers by combining Contact Center solutions design and implementation expertise with the best technologies available. The earned partnership strengthens our strategic position in an underserved niche of the market between traditional VARs on one end and large IT service providers on the other,” stated Jeff Brinckman, Pivot Director of Customer Experience Solutions.
Contact center technologies are ever-evolving. LumenVox Automated Speech Recognizer and Text-to-Speech solutions add functionality and improve capabilities. With Pivot’s expertise and LumenVox’ world-class technology, businesses and their customers can now experience efficient and customized communication.
About Pivot: Pivot Technology Solutions has created a portfolio of operating companies and partners, differentiated in their respective markets by superior competencies and an unmatched commitment to total customer satisfaction. Through its portfolio companies, Pivot has built an organization with deep knowledge of the industry, an extensive partner network, and the implementation capacity required to take on large and complex projects. Additionally, through Pivot Technology Services, the Company offers a comprehensive portfolio of services on a global scale.
The Customer Experience Solutions Practice provides insight and technology leadership that enables organizations to adapt to changing market conditions in the client care space. Recognizing that 85% of customer interactions flow through the contact center architectures, Pivot helps organizations transform their contact center platforms into a strategic corporate asset that creates competitive advantages.
Interested in learning more about how your organization can leverage LumenVox’ suite of speech and authentication solutions by becoming a LumenVox Skills Certified Partner? Contact us today!
LumenVox is excited to announce the release of LumenVox Version 17.0.200. In this release, we have:
Added support for a new short-utterance transcription (Natural Language) functionality to process audio with a maximum length of approximately 30 seconds.
Added a new Out of Service configuration option for the ASR (Automated Speech Recognizer) service. System administrators can now enter maintenance mode from the Dashboard: currently pending requests are allowed to complete, while any new requests are rejected (to be potentially handled by other ASR servers in the cluster).
Added a new feature to the ASR load-balancing mechanism to actively route ASR requests based on the language specified.
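The Out of Service and language-based routing behaviors described above can be modeled in a short sketch. This is a conceptual illustration, not LumenVox code: the `AsrServer` class, the `route` function, and the least-pending tie-break are all assumptions made for the example, since the release notes do not describe how the load balancer chooses among eligible servers.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class AsrServer:
    name: str
    languages: Set[str]
    out_of_service: bool = False  # maintenance mode flag
    pending: int = 0              # requests currently in flight

    def accept(self) -> bool:
        """Take a new request unless the server is in maintenance mode."""
        if self.out_of_service:
            return False
        self.pending += 1
        return True

    def complete_one(self) -> None:
        """Finish one in-flight request; allowed even while out of service."""
        if self.pending > 0:
            self.pending -= 1


def route(servers: List[AsrServer], language: str) -> Optional[AsrServer]:
    """Send a request to an in-service server that supports the requested language.

    Choosing the server with the fewest pending requests is an assumption of this sketch.
    """
    candidates = [s for s in servers
                  if language in s.languages and not s.out_of_service]
    if not candidates:
        return None
    chosen = min(candidates, key=lambda s: s.pending)
    chosen.accept()
    return chosen


cluster = [AsrServer("asr-a", {"en-US", "es-MX"}), AsrServer("asr-b", {"en-US"})]
first = route(cluster, "en-US")    # both idle, so the first candidate is chosen
second = route(cluster, "en-US")   # asr-a now has one pending request
cluster[0].out_of_service = True   # administrator enters maintenance mode
third = route(cluster, "es-MX")    # only asr-a speaks es-MX, and it is draining
cluster[0].complete_one()          # the pending request still completes
```

In this model a draining server finishes its existing work but is skipped by the router, so new requests go to the remaining servers that support the language, or are rejected when none do.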
Useful for situations where you do not want to be constrained by a specific grammar, or challenged by implementing a more complex and costly Statistical Language Model, the LumenVox Short-Utterance Transcription functionality utilizes a built-in, general Statistical Language Model that has been tuned for everyday use to provide a text representation of supplied audio.
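Since short-utterance transcription is intended for audio of approximately 30 seconds or less, an application might check a recording's length before submitting it. The sketch below does that for WAV data using only Python's standard library. The `MAX_SECONDS` constant and both function names are hypothetical, and the exact cutoff the product applies is not stated in the release notes.

```python
import io
import wave

MAX_SECONDS = 30.0  # approximate limit; the exact cutoff is an assumption


def wav_duration_seconds(source) -> float:
    """Return the duration of a WAV file given a path or a binary file-like object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def fits_short_utterance(source, limit: float = MAX_SECONDS) -> bool:
    """True when the audio is short enough for short-utterance transcription."""
    return wav_duration_seconds(source) <= limit


# Build two seconds of 8 kHz, 16-bit mono silence in memory to exercise the check.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(8000)
    out.writeframes(b"\x00\x00" * 8000 * 2)
buffer.seek(0)
print(fits_short_utterance(buffer))
```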
Supporting LumenVox’ commitment to making speech applications more secure and easier to administer, additional enhancements were made to our diagnostic tools and dashboard, including more robust grammar handling within the LumenVox Speech Tuner.
For a comprehensive list of improvements and features released with LumenVox Version 17.0.200, please click here.
If you’d like to watch a previously recorded webinar about the release, including participant Q&A, please click here.