The Age of Voice Innovation, Part II: What Keeps You Up at Night

Following “The Age of Voice Innovation, Part I: Attributes of the New Engine” webinar, we are excited to bring you Part II in the series.

Voice remains pervasive in business communications, and its impact is especially visible in this Age of Digital Transformation. However, Contact Center and CX professionals face major challenges in keeping their voice-based resources up and running, not to mention keeping the technology fresh and adding new capabilities.

Too often they feel “damned if they do” when they buy into the view that customer care is moving to chat and text. Or they feel “damned if they don’t” keep the voice channel up to date with the latest and greatest AI-infused technologies.

On October 5 at 10 AM PT / 1 PM ET, join Dan Miller and Derek Top from Opus Research, along with LumenVox’s Chief Product Officer, Joe Hagan, for the “The Age of Voice Innovation, Part II: What Keeps You Up at Night” webinar.

During this conversational session, we will discuss what keeps contact center and CX professionals up at night, including:

  • What does a move to the cloud look like, and will it affect the customer experience?
  • What is driving the increasing need for openness and modularity?
  • What is impacting the pace of adoption for speech recognition?
  • What happens if you choose to do nothing?

All this and more. You won’t want to miss this session if you are starting to strategize your digital transformation initiative or are about to make decisions on the next steps. Register now for The Age of Voice Innovation, Part II: What Keeps You Up at Night.

ALSO SEE: The on-demand session The Age of Voice Innovation, Part I: Attributes of the New Engine.

The Age of Voice Innovation, Part I: Attributes of the New Engine Webcast

Dramatic improvements in automatic speech recognition (ASR) and voice technologies have transformed the role of voice communication in the enterprise for customer and employee-facing applications.

Speech recognition has reached unprecedented levels of accuracy. Synthetic text-to-speech voices are often indistinguishable from humans. Voice biometrics detects both real and synthesized imposters reliably and at scale.

We’re excited to join Dan Miller and Derek Top of Opus Research along with Joe Hagan, Chief Product Officer at LumenVox, on Tuesday, September 14th at 10am PT/1pm ET, for a lively discussion on how speech and voice technologies are shaping next-generation customer and employee experiences, including:

  • Accuracy – how accuracy and other performance gains instill the confidence businesses need to build new voice-first applications
  • Accessibility – guidance on choosing the right technology foundation and partner to meet current and future business needs
  • Affordability – the myth of “it’s expensive” and why it no longer applies – and options for businesses where the reverse is true
  • Flexibility – deploying speech applications in any environment: on-premises, multi-cloud, or a hybrid model

“New demands have redefined the very meaning of Automated Speech Recognition,” said Dan Miller, lead analyst at Opus Research. “LumenVox’s new ASR engine provides high levels of accuracy and intelligence required to capture, recognize, and react to each customer’s intent and define what’s possible for speech and voice recognition software.”

Register now to save your seat! Can’t make it? Register to receive a link to the webinar recording!

ALSO SEE: The Age of Voice Innovation, Part II: What Keeps You Up at Night on October 5 at 10:00 AM PT / 1:00 PM ET

About Opus Research

Opus Research is a diversified advisory and analysis firm providing critical insight on software and services that support multimodal customer care. Opus Research is focused on “Conversational Commerce,” the merging of intelligent assistant technologies, conversational intelligence, intelligent authentication, enterprise collaboration and digital commerce.

Strengthening the Customer Experience with Automatic Speech Recognition

With companies rapidly evolving and seeking more voice-enabled applications to deliver powerful experiences, LumenVox was pleased to recently discuss the benefits organizations can see when utilizing an Automatic Speech Recognition (ASR) engine with extremely accurate transcription, flexibility, and high availability.

The Power of Speech

ASR’s everyday applications are vast, and it’s transforming how multiple industries do business. For example, media and entertainment companies can produce content faster when hours of audio or video files are converted into searchable transcripts.

Educational institutions can deliver accessible remote learning through real-time captioning in video conferencing software. In addition, researchers can begin analyzing qualitative data in a matter of minutes thanks to asynchronous, machine-generated transcription.

These are just a few examples of how speech-to-text technology is impacting society.

In addition to industry-leading accuracy and speed, LumenVox’s ASR engine utilizes an end-to-end Deep Neural Network (DNN) architecture to accelerate the ability to add new languages and to recognize non-native speaking accents. This enables LumenVox customers to serve a more diverse base of users.

The Value of Artificial Intelligence

With typical Machine Learning (ML) models, there are two fundamental elements: (1) the language model and (2) the creator of the language model.

The language model can ‘learn’ based upon the data it’s given. With a DNN, creators are not required to augment the code base when building or adding data, which is helpful in eliminating inherent biases.

Ultimately, more robust data sets will produce a more accurate, more broadly applicable language model.

Delivering Enhanced Customer Experiences with Speech

ASR is a programmatic way to turn voice into text. Voices come in different languages and dialects, and with varying levels of background noise.

A good ASR can turn the spoken word from a variety of languages and accents into readable, understandable text. Businesses can then use the text to strengthen decision-making and enhance customer experiences by serving a more diverse user base.

Ready to learn more about automatic speech recognition? Join Dan Miller, lead analyst at Opus Research, and Joe Hagan, chief product officer at LumenVox on September 14 at 11:00 a.m. PT / 2:00 p.m. ET as they discuss what is required to deliver meaningful employee and customer experiences through voice channels. Register now for the webinar.

Why is Good Speech Recognition so Hard to Find?

If your organization interacts with customers through speech applications, the quality of your speech recognition technology can make or break your CX.

In an ideal world, communicating with technology via speech would be as easy and natural as conversing with a human. This would make it so simple to access information and services remotely. It would also offer more independence to those who have no other option but voice user interfaces, such as young children who aren’t literate yet and people living with visual, motor or mobility impairments.    

While some speech recognition technologies have made great strides in achieving these ideals, others are still falling far below expectations. This raises the question, why do some speech recognition technologies work well, while others fail? 

The reality is: human speech is complex and constantly changing. 

The challenges faced by modern speech recognition tools

An Automatic Speech Recognition (ASR) engine’s job is to take speech and identify it as something meaningful. Some ASRs have transcription capabilities, which allow them to turn that meaning into something useful, like text.  

Getting this right is actually an incredibly challenging process. Firstly, ASRs must keep pace with the fact that language is constantly changing. In 2021, for instance, Merriam-Webster added 520 new words and definitions to its American English dictionary. 

Secondly, they must navigate the huge amount of variation that occurs within each language. This includes a diversity of accents and dialects among speakers of the same mother tongue, which is a major stumbling block for many speech applications. One study found that 66% of people cite accent or dialect recognition issues as a barrier to voice technology adoption.

Also, ASRs must be able to separate speech from background and environmental noise. This could be the sound of traffic, a busy shopping mall, or even the interference that occurs due to the quality of the microphone used.  

Unfortunately, many ASRs are simply not capable of handling these variables efficiently. 

How to solve these problems

All this considered, companies need to choose their ASR engines carefully when building or modernizing speech-enabled customer experiences. 

There are many different types of ASR engines on the market. Ideally, you want one that:

  • Supports all dialects within a given language
  • Offers advanced artificial intelligence and machine learning capabilities for maximum accuracy
  • Is able to continually learn from real-world usage and expand the language model to serve a more diverse base of users

LumenVox ASR with Transcription: Next-generation speech recognition 

Status-quo speech recognition engines don’t have the machine learning capabilities to manage all the differentials in natural human speech—certainly not with the accuracy users expect. This is where LumenVox’s new ASR engine changes the game.

The technology that sets the LumenVox ASR engine apart is its end-to-end Deep Neural Network (DNN) architecture and state-of-the-art natural language processing and understanding capabilities. This creates an ASR engine that serves a much more diverse base of users. 

Whereas other ASR engines treat different dialects as separate languages, LumenVox’s new ASR Engine with Transcription supports multiple dialects with one language model. This considers many different pronunciations in a single language, as opposed to having to train according to each individual user. The end-to-end recognizer matches audio to the written word—regardless of accent or other factors that impact pronunciation. 

Additionally, no matter where the call or audio is coming from, the LumenVox Speech Recognizer separates speech from background noise using Voice Activity Detection (VAD). This takes a range of qualities into consideration, including energy level (volume), frequency (pitch) and changes in duration, to accurately detect the actual speech.
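
As a rough illustration of the energy and duration cues mentioned above, the following Python sketch flags frames whose energy exceeds a threshold and keeps only the runs that last long enough. It is not the LumenVox VAD implementation: the function names, frame length, energy threshold and minimum duration are hypothetical values chosen for the example, and a production detector would also weigh frequency cues.

```python
import math

def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)

def detect_speech(samples, frame_len=160, energy_threshold=0.01, min_frames=3):
    """Return (start_frame, end_frame) pairs, end exclusive, for runs of at
    least `min_frames` consecutive frames above the energy threshold.
    All default values are illustrative assumptions."""
    flags = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        flags.append(frame_energy(samples[i:i + frame_len]) > energy_threshold)

    segments, start = [], None
    for idx, active in enumerate(flags + [False]):  # sentinel closes a trailing run
        if active and start is None:
            start = idx
        elif not active and start is not None:
            if idx - start >= min_frames:  # duration check drops short bursts
                segments.append((start, idx))
            start = None
    return segments

# Synthetic signal: silence, a 440 Hz tone standing in for speech, silence.
rate = 8000
silence = [0.0] * 800
tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(1600)]
segments = detect_speech(silence + tone + silence)
print(segments)  # [(5, 15)]
```

With 160-sample frames, the tone occupies frames 5 through 14, so the sketch reports a single segment and ignores the silence on either side.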

All this means that your speech solution can cater for a more diverse user base, in a broader range of scenarios, with market-leading accuracy.

Improve your speech application success rate with tuning

To get maximum value from your speech applications, LumenVox also offers an advanced tuning tool that does all the heavy lifting for you, making it far easier for you to manage tuning in-house (and avoid expensive professional service fees).

LumenVox’s Speech Tuner performs transcriptions, instant parameter and grammar-tuning, and version upgrade-testing of any speech application, in less time and with less effort. This way, you can continually enhance speech recognition accuracy and build competitive advantage. 

Looking ahead

While there is room for improvement in the speech recognition technology landscape, the demand for voice-enabled solutions continues to grow. A study by National Public Media found that 52% of voice-assistant users say they use voice tech several times a day or nearly every day, compared to 46% before the pandemic. 

If your company gets speech recognition right, you will be in a strong position to capitalize on this market growth. 

Learn how LumenVox can help you seize the future of speech recognition: request a demo today.

Everything You Didn’t Know About Speech Recognition

With smart speakers and virtual assistants like Amazon Alexa, Apple’s Siri and Google Assistant part of our everyday lives, most of us understand the concept of voice-enabled technology. But how does speech recognition fit into this landscape and, more importantly, what value can it offer your business?  

What is Speech Recognition?

The goal of speech recognition is to let people operate applications and devices, and access services, in a more natural and convenient way—using voice. This reduces reliance on clicking, tapping and typing. These manual approaches are not only more laborious but also exclude certain customers, such as those with motor disabilities who can’t use keyboards or other tactile devices.

The brain behind the modern speech recognition system is called an automatic speech recognizer (ASR) engine. This intelligent software is able to interpret spoken audio and convert it from a verbal format into a text format. This text then acts as a command to drive the next steps of your speech-enabled solution.  

Decades of Development

Speech recognition technology is by no means a new concept, but it has evolved substantially since the mid-20th century. While today you can carry voice-enabled technology in your pocket, the first documented speech recognizer, launched in 1952, involved an entire room of electronics. Made by Bell Labs, this ‘Automatic Digit Recognition Machine’ was dubbed Audrey, and it could recognize the sound of spoken digits (zero through nine) when it was ‘adapted’ to the speaker, a ground-breaking achievement at the time.

In 2021, there are a great many speech recognition applications and devices available on the market.  The more advanced ASRs, built on the foundations of artificial intelligence and deep neural networks, are able to recognize a diverse range of natural languages and dialects, spoken by millions of customers, with great accuracy. All this translates into a high-quality, friction-free automated user experience.

But the journey is far from over. Speech recognition is an ever-advancing field and the market for this technology continues to expand. Looking forward, experts predict that the global voice and speech recognition market will grow at a CAGR of 19.5% during 2021-2026.

Looking at it from another angle: in 2020, there were over 4 billion digital voice assistants being used around the world. In just four years, that number is expected to double. That means there could be more voice assistants on our planet than humans in the near future.  
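
To put these growth figures side by side, the short Python sketch below works out the annual growth rate implied by a doubling in four years, and the total growth implied by a 19.5% CAGR. It assumes that "2021-2026" means five compounding periods, which the cited forecast may define differently; the function names are hypothetical.

```python
def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate that takes start_value to end_value."""
    return (end_value / start_value) ** (1 / years) - 1

def growth_factor(cagr, years):
    """Total multiplier after compounding `cagr` for `years` periods."""
    return (1 + cagr) ** years

# Voice assistants: 4 billion doubling to 8 billion in four years.
doubling_rate = implied_cagr(4e9, 8e9, 4)

# Market forecast: 19.5% CAGR, assumed here to compound over five periods.
five_year_factor = growth_factor(0.195, 5)

print(f"{doubling_rate:.1%}")     # 18.9%
print(f"{five_year_factor:.2f}")  # 2.44
```

Under these assumptions the two figures tell a similar story: doubling in four years corresponds to roughly 19% annual growth, and a 19.5% CAGR would make the market about 2.4 times larger after five years.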

How Does This Impact Your Business?

Speech recognition technology has a wide range of use cases in the commercial world today. These offer numerous benefits for your organization.

  • Improve efficiency:
    Organizations can use speech recognition to step up productivity and performance through a wide range of services, such as voice-activated banking or apps that allow users to compose messages verbally.
  • Enhance your IVR:
    With a well-chosen ASR, you can boost accuracy and speed within your IVR, reducing agent handling times and routing calls more efficiently to improve the overall customer experience.
  • Support analytics:
    You can automatically transcribe all verbal conversations in your contact center. This makes these interactions easier to analyze, whether you’re using automated sentiment analysis tools to gauge customer satisfaction levels or flagging common call patterns and issues for swift resolution.  
  • Enable multi-tasking:
    Speech-enabled applications are hands-free, so your users can do other tasks (such as driving) while accessing your service. This improves usability and customer satisfaction.
  • Scale your reach:
    As with any automated technology, you can scale speech recognition rapidly without increasing human headcount. This makes it easier for you to expand into new markets or manage seasonal spikes in demand.

When you think about it, there are so many ways for your organization to integrate speech recognition into your solutions and services, to boost usability, save time and enhance CX.

LumenVox Automated Speech Recognizer – Speech Recognition, But Better

To harness these advantages and meet customer expectations, it’s vital that you choose a high-performing speech recognition engine. LumenVox’s new AI-driven ASR engine is unique in its ability to accurately recognize naturally spoken language and learn from real-world use for maximum ROI.

To explore what LumenVox can do for your business, request a demo.

Speech Recognition 101 Video Series 

If you’d like to dive deeper into the nuts and bolts of speech recognition technology, we have created Speech Recognition 101, a series of short video courses:

Speech Recognition 101 – Part 1

In this video, we explore the basic types of ASR, providing a technical overview and looking at the fundamental inputs. We also explain the difference between speaker-dependent and speaker-independent speech recognition software.

Speech Recognition 101 – Part 2

Part two takes an in-depth look at the grammar component of speech recognition. The number one problem developers have is building good grammars, or modeling how users speak to applications. Find out how to overcome these hurdles with LumenVox.

The Secret to Maximizing Speech Recognition ROI: Speech Tuning

The automation provided by Speech Recognition can save your business significant time and resources, with a tangible impact on your profitability. It can also revolutionize your customer experience by enabling self-service, enhancing the value offered in your contact center, and augmenting the usability of your speech applications.

All these attributes drive revenue growth. But if yours is the kind of organization that views success as a process rather than an end state, why let the advantages end there? With Speech Tuning, you can continually optimize your Speech Recognition capabilities.

Born out of the belief that no matter how good technology is, it can always get better, Speech Tuning is the process of continually improving applications, including Automatic Speech Recognition-based systems, after they have been deployed. While this may sound like a chore, Speech Tuning is better viewed as an excellent opportunity to ensure the efficacy of your applications, maintain your competitive edge and amplify your Speech Recognition ROI.

The reality is: everything around us is evolving at a continuous pace. This includes your customers, the world they live in, their language and, most importantly, their expectations. You simply can’t stand still and expect to remain relevant.

If you’re not familiar with the term, Speech Tuning is a technology-driven approach to refining the performance of your speech-enabled applications, based on data gathered from real-world use. The goal is to perpetually enhance recognition accuracy, with a direct impact on call completion rates, containment rates, user experience scores and other metrics that matter to your business.

Fast, Accurate and Powerful Tuning with the LumenVox Speech Tuner

The LumenVox Speech Tuner is a complete tuning and maintenance tool and can be used to improve the effectiveness of any LumenVox product, including Automated Speech Recognizer, Text-to-Speech and more.

This tool offers value on multiple levels. First of all, it is designed to drive a swift and seamless tuning process. This allows applications to be tuned in less time with less effort, which lowers the total cost of ownership (TCO) of your speech applications. 

There are also benefits for your users. The LumenVox Speech Tuner allows you to find and fix issues that you might otherwise have overlooked. This improves CX and strengthens your brand credibility.

How Speech Tuning for Automatic Speech Recognition Works

Speech Tuning involves assessing how users interact with the system and testing changes. The process takes time, but when it comes to speech, every millisecond counts. Even minute improvements in application performance can produce a meaningful ROI within a short period of time.

The LumenVox Speech Tuner performs transcriptions, instant parameter and grammar-tuning, and version upgrade-testing of any speech application. This reduces your workload during post-deployment application revisions. It also allows you to bring tuning in-house and thus avoid costly professional service fees.

LumenVox Speech Tuner is up and running, maximizing ROI, with just 3 easy steps:

1. Data Import

First, you import call log data into the Speech Tuner database. All information stored by the call log is available in the Speech Tuner. The Call Indexer service automatically scans remote speech applications for fresh logged calls, ensuring key data is just a click away.

2. Speech Transcription

Then, transcribers type the text of the caller’s speech directly into the Transcriber. Once the audio is transcribed, the Speech Tuner compares audio transcripts with the Speech Engine results to determine accuracy, greatly reducing errors associated with manual evaluations. The transcripts are evaluated using the actual decode grammar, producing measurements such as word error rates (WER), in- and out-of-grammar rates and semantic error rates.
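
For readers new to the metric, word error rate is conventionally defined as the number of word substitutions, deletions and insertions needed to turn the recognizer output into the human transcript, divided by the number of words in the transcript. The Python sketch below computes it with a word-level edit distance. How the Speech Tuner scores and normalizes text is not described here, so the function name and the lowercase, whitespace-split normalization are assumptions made for illustration.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count.
    `reference` is the human transcript, `hypothesis` the engine result."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

wer = word_error_rate("check my account balance please",
                      "check my count balance")
print(wer)  # 0.4
```

In this example the engine substituted one word and dropped another, giving two errors against a five-word transcript.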

3. Immediate Testing

Selecting an interaction in the Call Log automatically loads the associated audio and grammar into the Tester. The grammar can be edited, Speech Engine parameters set, and individual recognition tests generated. The Speech Tuner natively supports industry standard SRGS grammars. Once a set of possible changes is identified, users can batch test audio to evaluate performance, using those changes.
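
SRGS is the W3C Speech Recognition Grammar Specification. As a minimal sketch of what such a grammar looks like, the Python example below embeds a small made-up XML grammar, lists the phrases its root rule accepts, and computes an in-grammar rate for a handful of transcripts. The grammar, rule name and helper functions are hypothetical and use only Python's standard library; they do not represent the Speech Tuner or its Tester.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"

# A made-up SRGS XML grammar that accepts three one-word answers.
GRAMMAR = """<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" mode="voice" root="answer">
  <rule id="answer" scope="public">
    <one-of>
      <item>yes</item>
      <item>no</item>
      <item>maybe</item>
    </one-of>
  </rule>
</grammar>"""

def list_alternatives(grammar_xml, rule_id):
    """Return the text of every <item> inside the named rule."""
    root = ET.fromstring(grammar_xml)
    for rule in root.findall(f"{{{SRGS_NS}}}rule"):
        if rule.get("id") == rule_id:
            return [(item.text or "").strip()
                    for item in rule.iter(f"{{{SRGS_NS}}}item")]
    raise KeyError(f"no rule with id {rule_id!r}")

def in_grammar_rate(transcripts, phrases):
    """Fraction of transcripts that exactly match a phrase in the grammar."""
    accepted = {p.lower() for p in phrases}
    hits = sum(1 for t in transcripts if t.strip().lower() in accepted)
    return hits / len(transcripts)

phrases = list_alternatives(GRAMMAR, "answer")
rate = in_grammar_rate(["yes", "No", "yeah", "maybe"], phrases)
print(phrases)  # ['yes', 'no', 'maybe']
print(rate)     # 0.75
```

Here "yeah" is out of grammar, which is the kind of gap a grammar edit followed by a batch test is meant to reveal.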

Ready to Reduce the TCO of Your Speech Applications?

The LumenVox Speech Tuner accelerates ROI by decreasing the time spent in tuning cycles. The more efficient your tuning process is, the more you’ll be able to decrease the Total Cost of Ownership (TCO) of your speech applications. The numbers are significant, with LumenVox clients documenting hundreds of thousands of dollars in savings per year, all as a result of fast, accurate and powerful speech tuning. 

Interested in migrating off your legacy speech applications? Contact us!