The sound of silence and human speech: Being able to interpret this automatically is the key to an effective call center strategy for outbound calling success.
LumenVox Call Progress Analysis offers call center operators and predictive dialer developers the core technology to quickly and accurately classify whether calls were answered by a human or a machine. It accomplishes this with Voice Activity Detection (VAD). VAD is vital in creating effective predictive dialing solutions while complying with related regulatory restrictions governing the use of such automated systems.
Silence & Timing
There can be a varying amount of silence before someone begins speaking when a human answers a call. Call flows need to take this silence into consideration, which means allowing some silence before speech, but also limiting the amount of time the application should wait for someone to speak. For instance, if no speech is detected within 5 seconds of the start of the audio stream, it is highly unlikely that a human answered. The ability to interpret this silence intelligently is critical to an effective call center strategy.
Speech & Timing
Once human speech has been detected, the next thing to determine is the length of that speech. Our extensive research has shown that, statistically, greetings on residential lines answered by a human contain less than 1.8 seconds (1800ms) of speech. That same research concluded that business greetings by humans are likely to last between this residential threshold and 3 seconds (3000ms). By extension, we can also conclude that anything longer than this should be classified as a machine, likely a prerecorded message, possibly from an answering machine.
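The timing rules from the two sections above can be sketched as a small decision function. This is a minimal illustration only, not the LumenVox implementation: the function name, the labels, and the assumption that a VAD stage has already measured the leading silence and the greeting length in milliseconds are all hypothetical. Only the three thresholds (5 seconds, 1.8 seconds, 3 seconds) come from the text.

```python
# Thresholds taken from the text; everything else is an illustrative assumption.
MAX_INITIAL_SILENCE_MS = 5000   # no speech within 5 s: unlikely to be a human
RESIDENTIAL_MAX_MS = 1800       # residential greetings are shorter than this
BUSINESS_MAX_MS = 3000          # business greetings fall at or below this


def classify_greeting(initial_silence_ms, speech_ms):
    """Classify an answered call from two VAD measurements (hypothetical helper).

    initial_silence_ms: silence from the start of the audio stream until speech
    speech_ms: length of the greeting in milliseconds, or None if no speech
    """
    if speech_ms is None or initial_silence_ms >= MAX_INITIAL_SILENCE_MS:
        return "no_speech_timeout"
    if speech_ms < RESIDENTIAL_MAX_MS:
        return "human_residential"
    if speech_ms <= BUSINESS_MAX_MS:
        return "human_business"
    return "machine"


print(classify_greeting(700, 1200))    # short greeting after a brief pause
print(classify_greeting(400, 2500))    # mid-length greeting
print(classify_greeting(300, 6500))    # long greeting, likely prerecorded
print(classify_greeting(5000, None))   # nothing heard before the timeout
```

A dialer built along these lines would pass only the two human results to an agent and handle the rest automatically.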
With the ability to classify and make these intelligent determinations, Call Progress Analysis can minimize the wait-time between calls handled by each agent, by predicting when the next agent will become available.
Not only can this contribute to significant operational cost savings for a contact center, but it also can dramatically impact the overall competitive advantage, since calls are being handled in a more efficient manner. This effective call center strategy and commercial advantage can be reflected as increased profit margin, more competitive pricing for customers or a combination of the two.
Customer experience is often highest on the priority list for businesses. But in a post-pandemic world, organizations are looking for ways to improve the agent experience as well. By raising the importance of agent concerns to the same priority level as customer concerns, organizations can dramatically improve employee satisfaction levels, optimize the efficiency of individuals, and demonstrate best care practices. These steps can also improve agent recruitment and retention. The benefits also translate into a positive customer experience. Happy agents equal happy customers.
Here are three ways to enhance and improve the agent experience with voice technology.
Challenge 1: Authentication. A renowned technology research firm, Opus Research, has “long seen zero-effort authentication as a necessity for creating trusted links between brands and their customers.”
Solution: Voice biometrics provides a convenient and secure form of authentication for customers and an effortless task for agents. There are two types of voice biometrics, Active and Passive.
Active Voice Biometrics means that the customer enrolls by repeating a set phrase. On subsequent calls, the customer speaks their passphrase, which is compared to their stored enrollment voiceprint.
Passive Voice Biometrics means the customer is seamlessly enrolled by capturing historical or real-time audio. Enrollment is completed by recording the customer’s unique voiceprint during an initial conversation, not a specific phrase. On subsequent calls, the customer’s conversational voice is compared to their stored voiceprint. A customer speaks with an agent and is transparently verified within the agent desktop.
It’s secure and effortless, turning a common pain point for agents into a seamless experience.
Challenge 2: Fraud. Right now, attackers are shifting from spoofing to using virtual call services, since these are anonymous and untraceable. These fraudulent calls are technically legitimate calls that can be placed from many devices anywhere in the world. This allows fraudsters to bypass spoofing detection technology with numbers unrelated to a record. Fraudsters then use social engineering to persuade agents to grant them control over a customer’s record.
Solution: With Passive Voice Biometrics, agents themselves can help identify fraudulent activity in real time. If a caller’s voice does not match the voiceprint on file, the agent is alerted to the mismatch. The agent can then pass the warning signs to the fraud department to initiate an investigation while the call is still in progress. By giving these agents greater peace of mind, and the fraud department greater resources, everyone is protected and more productive.
Challenge 3: Containment. With greater reliance upon the contact center to solve customer issues, the workload of agents rises daily. Without intelligent proactive outbound communication and robust Automatic Speech Recognition and Text-to-Speech technology, customers are routed to agents for menial tasks, asking for information that could have been made accessible before they needed to ask.
Solution: LumenVox’ Automatic Speech Recognition and Text-to-Speech enable automated yet dynamic and personalized interactions with customers within the IVR. This improves containment and reduces Agent Handle Time, as customers gain immediate access to relevant information, often removing the need for a live agent. With the use of Call Progress Analysis, agents’ time can be spent solely on live interactions with customers. Call Progress Analysis is a novel predictive dialer technology which uses Voice Activity Detection (VAD) to quickly and accurately classify whether calls were answered by a human or a machine. With this technology as well as statistical analysis, adjustments and feedback, agents can achieve optimal performance.
Voice technology can be leveraged in a variety of ways to shave valuable minutes from each agent’s workload. This translates into benefits for the business, for the agents themselves and for customers.
A solid outbound contact strategy requires technology that can perform intelligent proactive outbound communication. Predictive dialers can accomplish this, with the right type of detection, boosting efficiency and cutting cost.
Predictive dialers give businesses a shortcut through a simple issue: delay. When any phone number is dialed, there is a high likelihood (over 60%) that the call will not be answered. When a human operator or agent places the call, the time spent waiting for an answer is therefore wasted.
Even calls that are answered are not answered immediately. On average, there can be 15 to 20 seconds of delay before the phone is picked up. Predictive dialers can separate productive outbound calls from unproductive ones: those that require the expense of agent interaction, and those that do not.
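To put rough numbers on this delay, the figures above can be combined into a back-of-the-envelope estimate. The sketch below is purely illustrative: the 60% unanswered rate and the 15 to 20 second pickup delay come from the text, while the 30-second ring timeout before an agent gives up on an unanswered call and the batch of 100 calls are assumptions chosen for the example.

```python
UNANSWERED_RATE = 0.60          # from the text: over 60% of calls go unanswered
PICKUP_DELAY_S = (15 + 20) / 2  # midpoint of the 15 to 20 second range in the text
RING_TIMEOUT_S = 30             # assumption: how long an agent waits before giving up


def manual_wait_seconds(calls):
    """Estimate the total seconds an agent spends waiting when dialing manually."""
    unanswered = calls * UNANSWERED_RATE
    answered = calls - unanswered
    return unanswered * RING_TIMEOUT_S + answered * PICKUP_DELAY_S


total = manual_wait_seconds(100)
print(f"Waiting time for 100 manual dials: {total / 60:.1f} minutes")
```

Under these assumptions, this waiting time is what a predictive dialer absorbs in place of the agent.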
Modern predictive dialers that can distinguish between live parties, answering machines and voicemail services fall into the category of “voice activity detection.” Predictive dialers that can only identify machines by their busy, SIT, answering machine beep or fax tones fall into the category of “tone detection.”
Voice Activity Detection (VAD) predictive dialers perform an intelligent interpretation of delay. VAD enables agents to spend more time speaking with their intended call recipients. This dramatically increases the overall productivity of the contact center.
LumenVox has moved the goalpost with this Voice Activity Detection approach. We call our technology “Call Progress Analysis.” Call Progress Analysis leverages VAD to filter out these “dead” calls, so that only “live” calls (ones that have been answered by live parties) are passed to agents.
Any contact center with any number of agents working can reap the benefits. With the use of Call Progress Analysis, agents’ time can be spent solely on live interactions with customers. With statistical analysis, adjustments and feedback, that contact center can achieve optimal performance and considerable cost savings.
Since 2009, Speech Technology Magazine has been the premier online destination for comprehensive, independent coverage of information impacting speech technologies. This year LumenVox is proud to have a white paper featured, addressing exactly how businesses can leverage speech technology to take on the Total Experience.
What’s Total Experience? Total Experience (TX) refers to the entire company experience – employee, customer, and user. It takes a 360-degree look at a business to identify gaps and fill them, and it ties the critical pieces together—people and technology.
Now that it’s 2021, reality has set in: The business environment is evolving and doing so rapidly. That means it’s critical organizations stay one step ahead of their customers and their competitors. A business’ technology must evolve in tandem—and that technology needs to address not just the Customer Experience (CX), but the Total Experience.
The truth is, though, that organizations are overwhelmed. Where do they start? How do they know which technology is really going to make an impact?
Speech technology is the ideal starting point because it can empower organizations to shift from enhancing CX to addressing and enriching the Total Experience. In this white paper, we outline exactly how speech technology can address major pain points including friction in the IVR, for employees and customers.
The International Avaya Users Group is holding its second annual virtual event, IAUG Wired, this Tuesday, February 10, 2021. We were able to attend last year in Phoenix, Arizona, and can speak to the exciting engagement it offers.
The best part about this year? You don’t have to leave the comfort of your office or home to participate. It’s completely virtual and accessible from anywhere. IAUG has a host of panelists delivering highly interactive, 3D virtual environments, breakout sessions, and interactive partner exhibits within the virtual show floor.
As a proud sponsor, LumenVox will have a virtual booth to showcase our best-in-class voice technology. We’ll have a plethora of information on the LumenVox Avaya-compliant, speech-enabled contact center applications including:
Call Progress Analysis ensures outbound message delivery and is currently beating the competition with 98.6% accuracy across more than 500 million outbound calls per month. As an enhancement to the Avaya Proactive Outreach Manager, it ensures the message is delivered successfully and clearly. No cutoff. No miscommunication.
Automatic Speech Recognition converts spoken audio into text, providing users with a more efficient way to interact with automated systems. Increase IVR containment by allowing callers to self-service.
Text-to-Speech provides the voice output for conveying information to callers – further assisting with containment. With Text-to-Speech you can speech-enable your telephony platform or software application.
Want to chat with us during the event (Feb. 10, 2021)?
Go ahead and log in before registering if you have an IAUG account. Members will also have access to the recordings. For inquiries, contact email@example.com.
Register here. By registering for this event you are opting in to have your name and email visible on the event platform while the event is live in order to participate and interact with attendees and speakers. If you are not an IAUG member already, you will receive a basic entry-level “affiliate” membership after the event.
LumenVox Luminaries is a podcast that broadcasts thought leadership pieces on the subject of voice technology. This episode features Jason Kawakami, LumenVox Senior Sales Engineer, outlining the sophisticated speech science, functions, and benefits of LumenVox Transcription Engine. You can follow Jason on LinkedIn here.
Listen to the Intersection of Speech Science and Transcription Podcast below:
Read the Transcript
So this is a dive into one particular component of our speech products. LTE or LumenVox Transcription Engine is part of the ASR component of our speech suite.
Q: What is LumenVox Transcription Engine?
What the transcription engine does, is it delivers transcribed text which is representative of the decoded speech. So we take in an utterance. We process it against an unconstrained grammar called a Statistical Language Model or SLM. We take that text and provide it to a downstream piece of technology that might be an existing trained AI model, or it might be fed into any number of things.
Q: What’s the primary use case for LumenVox Transcription Engine?
The primary use case today, in our case, has been Natural Language IVRs. The application, the SLM, all of the bits of this are focused on that middle of the road. We’re providing a supporting technology to IVRs that are providing natural language applications.
A straight-out-of-central-casting use case: speech-enabling a chatbot.
Q: How does LumenVox Transcription Engine work?
We process the audio against a Statistical Language Model, an SLM instead of a traditional grammar. The traditional grammar is a constrained search space. The Statistical Language Model is a big giant search space that is focused on the spoken language in a particular unique language whether that’s English, Spanish, or other. Now that mathematical model predicts what is going to be spoken next, and that prediction is used to narrow the search space.
This is speech science. This is really high-end science that is done on how languages are spoken, what phonemes come out for what–this is very, very high, complex computational science.
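As a rough illustration of the idea that a statistical model predicts what is likely to be spoken next, here is a toy bigram model. It is far simpler than a production SLM and is not how LTE is implemented. The tiny training sentences and the function names are made up for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows another in the training text."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            following[prev][nxt] += 1
    return following


def predict_next(following, word, top_n=3):
    """Return the most likely next words with their estimated probabilities."""
    counts = following.get(word.lower())
    if not counts:
        return []
    total = sum(counts.values())
    return [(w, c / total) for w, c in counts.most_common(top_n)]


# Hypothetical training utterances, invented for this sketch.
corpus = [
    "i want to check my balance",
    "i want to pay my bill",
    "i need to check my order",
]
model = train_bigrams(corpus)
print(predict_next(model, "to"))   # candidate words after "to", most likely first
print(predict_next(model, "my"))   # candidate words after "my"
```

A recognizer can use rankings like these to consider the likely candidates first, which is the sense in which the prediction narrows the search space.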
Q: What’s the primary use case for LTE?
Our SLM is tuned for general typical conversation. We’ve talked about how the primary use case for our LTE is to feed downstream AI-based processes. The straight-up middle of the road? We’re thinking about NLU IVR. This is some type of telephony solution that requires the decoding of a spoken utterance and that text being provided to some type of AI to determine the meaning. Taking that audio, either from the IVR or from the conversation that’s between the agent and a customer, and feeding that into an AI engine that is specifically tuned and trained to detect sentiment.
Q: What are other mainstream use cases?
Another potential here, as we start getting into the shoulders of the road, is agent-assist applications, so listening to agent conversations in real-time and processing their audio, processing both their leg and the consumer’s leg–and maybe training a model to key in and be integrated with the company’s knowledgebase; that knowledgebase is providing or prompting the agent with particular articles out of the knowledgebase that will assist them with what the consumers are asking them for.
As we move farther outside of the mainstream, potentially past that middle-of-the-road use case, there’s using LTE to support speech-to-text applications: true transcription applications, as the word transcription is used by normal people, not necessarily speech industry people. So think note-taking. Note-taking is a big deal in lots of industries. A few that stick out: the medical industry, the legal industry. The transcription engine could be applied to an application that is doing verbal note-taking in those cases, legal the same way.
The other one I was thinking about–dispatch apps. There’s lots and lots of mobile workforces these days and mobile workforces are becoming more prevalent with the world that we’re in. People are dispatching service vehicles to your home instead of you taking your car down to some type of central garage to get fixed. Every one of those activities has the standard “did I complete my thing?” and “how much time did I spend?” And there’s oftentimes notes that are associated with those trouble tickets or those service tickets. Our LTE could be used to support taking those notes spoken and pushing them into a text-based system to feed it to analytics applications. And we can use LTE to provide the words to an application that is providing the analysis and the real meanings of the words that are being spoken.