As our relationships with our partners mature, we find ourselves creating and developing for their needs. We do not build something in the hope that our partners will buy it. We listen carefully to those around us and, once an opportunity has been properly qualified, develop to their needs. Recently there has been a flurry of activity in Central and South America, especially in Brazil. One of our larger partners, which has fully integrated our software into its platform, requested that we develop a Brazilian Portuguese acoustic model. Given our willingness to please and our understanding of the market potential, we agreed and added it to our ever-growing list of languages. To develop this model we used a novel approach that we have been researching for quite some time, and we think it should satisfy most speech recognition applications.
The Brazilian economy is in a state that presents a favorable opportunity to increase automation and maximize efficiency. To answer this urgent call, LumenVox has expanded its ASR offering by creating a Brazilian Portuguese language model, which will bring the number of ASR languages to 8 and the number of TTS languages to 23. Today LumenVox covers all of the Americas, from Cape Horn to Cape Columbia and everywhere in between!
We will be doing a lot more with the Asian TTS languages in the future, once we figure out how to deal with some of the double-byte issues in our Media Server. We just entered QA with our new version, so we should be able to share some details with you on this in just a few weeks.
LumenVox version 12.2, scheduled for release on Tuesday, Sept. 2, has a large number of exciting new changes. In particular, the Tuner is getting a major series of improvements, and some cool new changes have been added throughout.
From almost top to bottom, we have looked at how we can improve the usefulness of the LumenVox Speech Tuner. One of the first things we realized is that many users have trouble figuring out what they need to tune the most.
Analyzing by Menu
Loading data into the Tuner can be overwhelming, so we added a new concept to the Tuner called a menu. A menu is designed to allow you to filter data so you can tune a specific menu in an IVR or speech application.
The way this works is the Tuner analyzes the grammar files that were in use for each speech interaction. A main menu in a banking application might use the following grammars:
And a “transfer funds” menu might use:
Because the Tuner knows which grammars are active for which speech interactions, it can make logical inferences about which interactions should be grouped together. That grouping is the menu system. A new dropdown allows you to select from the various menus the Tuner recognizes and just pick the one you’d like to focus on.
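The grouping idea can be illustrated with a minimal sketch. This is not the Tuner's actual implementation; the record layout and the grammar file names below are invented for illustration. It simply treats every distinct set of active grammars as one menu:

```python
from collections import defaultdict

# Hypothetical interaction records: each lists the grammar files that were
# active for one speech interaction. All file names are invented.
interactions = [
    {"id": 1, "grammars": ["main_menu.grxml", "universals.grxml"]},
    {"id": 2, "grammars": ["universals.grxml", "main_menu.grxml"]},
    {"id": 3, "grammars": ["transfer_funds.grxml", "universals.grxml"]},
    {"id": 4, "grammars": ["main_menu.grxml", "universals.grxml"]},
]

def group_by_menu(records):
    """Group interaction ids by the set of grammars active for each one."""
    menus = defaultdict(list)
    for record in records:
        # A frozenset ignores the order in which the grammars were listed.
        key = frozenset(record["grammars"])
        menus[key].append(record["id"])
    return dict(menus)

menus = group_by_menu(interactions)
for grammars, ids in menus.items():
    print(sorted(grammars), "->", ids)
```

Interactions 1, 2, and 4 share the same grammar set and land in one group, while interaction 3 forms a second group. A real tool would likely need fuzzier matching, for example when a menu loads an optional grammar only some of the time.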
New to 12.2 are Tuner Wizards, a series of automated tools that guide you through the process of identifying problems and focusing on the relevant data. You can fire up the new Tuning Wizard, pick a menu (or all of the data), and choose from a list of options to focus on. That list includes:
Confidence Threshold Tuning
Decode Speed Tuning
Decode Failure Tuning
The Tuning Wizard will let you know whether your data exhibits any problems related to these issues and then will help you identify which interactions contribute to the particular type of issues you’re facing. It’s a great way to focus your time so that you only pay attention to the items most relevant to you.
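To make the first of those options concrete, here is a rough sketch of what confidence threshold tuning involves in general. It is not how the Tuning Wizard works internally; the 0 to 1000 score scale, the sample data, and the function names are all assumptions. The idea is to sweep candidate thresholds and count false accepts (wrong results that would be accepted) and false rejects (correct results that would be rejected):

```python
# Hypothetical transcribed results: (confidence score, was the result correct?)
results = [
    (920, True), (850, True), (610, True), (540, False),
    (480, True), (300, False), (250, False), (700, True),
]

def error_counts(data, threshold):
    """Return (false_accepts, false_rejects) at the given threshold."""
    false_accepts = sum(1 for conf, ok in data if conf >= threshold and not ok)
    false_rejects = sum(1 for conf, ok in data if conf < threshold and ok)
    return false_accepts, false_rejects

def best_threshold(data, candidates):
    """Pick the first candidate with the lowest total error count."""
    return min(candidates, key=lambda t: sum(error_counts(data, t)))

candidates = range(0, 1001, 50)
threshold = best_threshold(results, candidates)
print(threshold, error_counts(results, threshold))
```

Weighting both error types equally is a simplification. In many applications a false accept costs more than a false reject, so the totals would be weighted accordingly.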
Grammar Editor Changes
The Grammar Editor is a long-standing feature in the Tuner, giving developers an easy way to build, edit, and test their grammars. Several new features enhance the capabilities even further:
Multiple grammar parses. Previously, the Grammar Editor could only parse a sentence against a single grammar at a time. A new option allows developers to parse any combination of loaded grammars, making it easier to test how combinations of grammars will affect grammar coverage.
Pronunciation Checker. A new module called the Pronunciation Checker shows where pronunciations for grammar items come from: are they in our built-in dictionary? A user-defined lexicon? Or are they being produced by our statistical pronunciation rules? Words which don’t have good pronunciation definitions often lead to errors in recognition, so this is a useful module for troubleshooting performance.
Random Sentence Generator. This module generates 10 random sentences at a time that are allowed by the grammar. Using it, you can check grammar coverage to make sure that the words and phrases you expect to be in grammar are, while simultaneously ensuring that phrases you don’t expect to be in grammar are not.
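The idea behind random sentence generation can be sketched in a few lines. This is only an illustration and not the Grammar Editor's implementation: the toy grammar, its `$rule` notation, and the function names are invented, and real grammars would also involve optional items, repeats, and weights.

```python
import random

# A toy grammar, invented for illustration. Each rule maps to a list of
# alternatives; each alternative is a sequence of words or $rule references.
GRAMMAR = {
    "$root": [["$action", "$account"], ["$action", "my", "$account"]],
    "$action": [["check"], ["transfer", "from"]],
    "$account": [["checking"], ["savings"]],
}

def generate(symbol, rng):
    """Expand one symbol into a list of words by picking random alternatives."""
    if symbol not in GRAMMAR:
        return [symbol]  # a plain word, not a rule reference
    words = []
    for part in rng.choice(GRAMMAR[symbol]):
        words.extend(generate(part, rng))
    return words

def random_sentences(count=10, seed=None):
    """Return `count` random sentences allowed by the toy grammar."""
    rng = random.Random(seed)
    return [" ".join(generate("$root", rng)) for _ in range(count)]

sentences = random_sentences(10, seed=1)
for sentence in sentences:
    print(sentence)
```

Reading through a batch like this makes over-generation easy to spot: any sentence that looks wrong is a phrase the grammar accepts but probably should not.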
Speech tuning is often perceived as an add-on effort to deploying speech applications. Our years of experience have demonstrated that it is a vital part of the process and can contribute significantly to cost savings for any business.
Speech tuning is the process of improving speech applications after they have been deployed. It assesses how users interact with the system and tests the effect of changes. Though the process can be time-consuming, even minute improvements in application performance produce a meaningful return on investment (ROI) within a short amount of time.
The LumenVox Speech Tuner can be used to accelerate this ROI by decreasing the time spent in tuning cycles, which also decreases the Total Cost of Ownership (TCO) of a speech application.
The numbers are significant, with our clients documenting hundreds of thousands of dollars in savings per year, all as a result of speech tuning.
In an updated whitepaper, we describe the ROI of speech tuning and show exactly how to calculate it.
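As a rough illustration of the arithmetic involved, the sketch below uses the generic ROI formula (savings minus cost, divided by cost). It is not the method from the whitepaper, and every input value is a made-up placeholder rather than client data.

```python
def tuning_roi(calls_per_year, cost_per_agent_call, automation_gain, tuning_cost):
    """Return (annual_savings, roi) for one year of a tuned application.

    automation_gain is the additional fraction of calls completed by the
    speech application, instead of a live agent, after tuning.
    """
    annual_savings = calls_per_year * automation_gain * cost_per_agent_call
    roi = (annual_savings - tuning_cost) / tuning_cost
    return annual_savings, roi

# Hypothetical inputs: 1,000,000 calls per year, 5.00 per agent-handled call,
# a 2-point gain in automation, and 20,000 spent on tuning.
savings, roi = tuning_roi(1_000_000, 5.0, 0.02, 20_000.0)
print(f"annual savings: {savings:,.0f}, ROI: {roi:.0%}")
```

With these placeholder numbers, a two-point improvement in automation more than pays for the tuning effort in the first year, which is the kind of effect the paragraphs above describe.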
How well do you know speech recognition? Speech recognition is a field of computer science and computational linguistics. It is responsible for developing technology that uses computers to recognize spoken language and translate it into text. Other terms associated with speech recognition include automatic speech recognition (ASR), computer speech recognition, and speech to text.
With Siri and Alexa now incorporated into our daily lives, many of us use this innovative technology without knowing how it really works.
That’s why we’ve created a new, two-part video series that gives a basic overview of speech recognition technology, its various components, and capabilities.
It’s called Speech Recognition 101.
Speech Recognition 101 Part 1 provides an overview of the components that enable speech recognition and discusses commonly used speech recognition technologies.
Speech Recognition 101 Part 2 takes an in-depth look at one of those parts, the grammar.
LumenVox offers Automated Speech Recognizer (ASR), a software solution that converts spoken audio into text. LumenVox ASR is unique in its ability to recognize naturally spoken language and its tuning flexibility.