By Scott Baker | Senior Analyst, Opus Research
Voice has found new life as we enter the new year. Enterprises and their customers have high expectations for solutions that closely link accurate speech recognition, human-like text-to-speech renderings, natural language understanding (NLU), and voice biometrics to serve specific, or even multiple, use cases. ASR has grown up to become the conversational intelligence opportunity that’s bigger than the sum of its parts.
Once upon a time, automatic speech recognition (ASR) resided most comfortably in the enterprise contact center. It didn’t get out much. When Alexa arrived in 2014, ASR found a massive new audience in the home while living in the cloud. A rush of new use cases quickly gathered around ASR. Even so, the big three smart home assistants, voice authentication, and contact centers are largely where ASR remained and thrived. But as we enter 2022, we’re in the early days of the great ASR acceleration, or as Opus Research’s own Dan Miller calls it, the ASR Renaissance, born of strategic acquisitions, product innovation, and tech stack modularization, and partially in response to a worldwide pandemic.
As part of a recent two-part webinar series, Opus Research joined LumenVox Chief Product Officer, Joe Hagan, to discuss how ASR is being redefined, and the opportunities it presents for self-service and enterprise communications.
“Traditional call center IVR-type applications continue to be the bellwether for growth and utility for speech applications,” Hagan explained, “but we’re seeing an awful lot of new applications that need to be served, things in consumer and retail, and things like audio mining and virtual assistants. These are important applications that are a source of growth for ourselves and our customers.”
Consider the rich information available to a retail call center that fields and records calls about garment fit, price, inventory, shipping, and returns or refunds. ASR tech can differentiate between caller and agent voices, track interest in certain brands and sizes, and evaluate and quantify reported issues of fit or quality inconsistencies. Voice biometrics can help determine the demographics of callers enquiring about certain brands and product types. This is conversational intelligence in action, informing business decisions with a direct impact on inventory and sales.
Yet, despite the easy sell for that integrated vision, a recent Opus Research survey found that, of the 80% of companies leveraging ASR for transcription, fewer than one-third are applying their findings to improve outcomes and drive new business decisions.
There’s also the matter of how entrenched a brand is in its current technology stack. Many brands feel locked into a legacy ecosystem of custom integrations and services. When they consider their next move, cost and time weigh heavily. Historically, the road to leveling up an organization’s conversational intelligence has been precarious and potentially costly. That’s not necessarily the case anymore, as LumenVox and others modularize their services to make them more accessible.
As the technology improves and becomes more modularized and open (e.g., through APIs and containers), the playing field levels for companies providing solutions in the ASR space. This will no doubt drive new competition and more attractive price points for enterprise customers looking to realize this holistic model for conversational intelligence.
The current pace of innovation means that companies no longer need to take a giant, expensive leap of faith with a vendor who promises to deliver it all. Speed and accuracy no longer come at a steep premium, and the modularization of services means that each component can be judged on its own merits and woven into a solution that specifically addresses a company’s most critical needs.