
Because tuning is an absolute requirement for every speech recognition solution, we created the LumenVox Speech Tuner, a complete tuning and maintenance tool.
The Speech Tuner is designed to perform tuning and transcription, as well as instant parameter, grammar, and version upgrade testing of any speech recognition application. It reduces the work involved in revising an application after deployment, and allows you to bring tuning in-house, avoiding costly professional service fees.
Tuning uses prompts, grammars, call flow, and caller data to improve the speech application as a whole. The Speech Tuner also provides the following features:
The Call Browser grants users the ability to quickly choose and listen to a specific interaction, and export the audio file -- all from one window. It displays a list of all currently loaded and filtered calls, and displays all the interactions for a selected call.
The Call Browser window is divided into three distinct sections: the Calls List at the top, the Interactions List in the middle, and the Audio Control at the bottom.
Within the Calls List, you can easily move between calls to highlight a specific interaction. The data fields provide key information such as the call time, the number of interactions within a call, the number of times the Speech Engine recognized speech, and the confidence scores for the phrases the Speech Engine interpreted correctly.
The Interactions List contains details of every interaction within a call. For each interaction, you can click the View Details button to see specifics such as the acoustic model, decode time, NBest rank, and semantic interpretation.
The Audio Control panel allows users to choose between hearing the decoded audio or the actual caller utterance, with easy volume controls. An "Export as WAV" button lets users export the call audio as a WAV file to their hard drive.
Whether you're new to tuning or just don't have the time, LumenVox can help you tune your applications.
Make changes to grammars or parameters, secure in the knowledge that those changes will make the application better, faster, and more accurate. The Speech Tuner uses historical information to validate your changes, ensuring your success.
Most 'tuning' tools are passive log viewers, requiring that changes be made in the live speech recognition application and retested over a period of time with live callers. With the Speech Tuner, we send the changes to the Speech Engine, simulating the recognition process and evaluating changes instantly. Instead of slow, non-interactive, static tuning, the Speech Tuner enables on-the-fly, highly interactive, dynamic tuning.
Speech can be evaluated against grammar sets as they are sent to the Speech Engine. The grammar can then be adjusted, re-tested, and re-scored to see whether the changes improved performance. As a result, you can determine instantly whether adding a new phrase to the grammar will improve your speech recognition accuracy.
The Speech Tuner rates performance against commonly accepted measures like WER (Word Error Rate). This helps to give an accurate picture of details such as average confidence scores, correct versus incorrect responses, and in-grammar versus out-of-grammar performance.
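For readers unfamiliar with WER, the following is a minimal Python sketch of how the measure is commonly computed: the word-level edit distance (substitutions, deletions, and insertions) between a human transcript and the recognized text, divided by the number of words in the transcript. This is an illustration only, not the Speech Tuner's own implementation; the function name and the text normalization (lowercasing and splitting on whitespace) are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    `reference` is the human transcript; `hypothesis` is the recognizer output.
    Normalization here (lowercase, whitespace split) is an assumption.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)
```

For example, scoring the transcript "check my account balance" against the recognized text "check account balance" gives one deletion over four reference words, a WER of 0.25. Because insertions count as errors, WER can exceed 1.0 when the recognizer returns many more words than were spoken.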
Setting parameters optimizes the Speech Engine's performance, further improving the caller's experience. Traditionally, changing Engine parameters has been a difficult and time-consuming task, often involving long delays between changing a parameter and evaluating its effect on performance. Our Speech Tuner changes this.
The dynamic test capability of the Speech Tuner allows the user to shorten this delay. Now, Speech Engine parameters such as search optimizations, speech end-pointing, and NBest result processing can be easily adjusted, and immediately re-tested and re-scored from within the testing component.
Good transcripts can be an important part of properly tuning a speech recognition application. The Speech Tuner's Transcriber is designed to make this process as quick and seamless as possible.
In fact, in the latest version of the Speech Tuner, we have made the transcription process 5 to 10 times faster with improved statistics, a new control panel interface, and shortcuts.
Transcribing speech is an excellent way to become familiar with how callers interact with the system.
Transcriptions are used to calculate automatic performance measurements such as in-grammar or out-of-grammar rates and recognition accuracy. Good transcripts are a key component in using the Speech Tuner to adjust your speech recognition application as needed.
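To make these measurements concrete, here is a small Python sketch of how in-grammar rate, out-of-grammar rate, and in-grammar accuracy could be derived from transcripts. It is a simplified illustration under stated assumptions, not how the Speech Tuner computes them: the `Interaction` class and `summarize` function are hypothetical, and the grammar is reduced to a flat list of accepted phrases, whereas real grammars are rule-based.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    transcript: str  # what the caller actually said, per the human transcript
    recognized: str  # what the recognizer returned ("" if nothing)


def summarize(interactions, grammar_phrases):
    """Compute simple tuning statistics from transcribed interactions.

    Assumption: an utterance is "in-grammar" when its transcript exactly
    matches one of the accepted phrases (case-insensitive).
    """
    phrases = {p.lower() for p in grammar_phrases}
    total = len(interactions)
    in_grammar = [i for i in interactions if i.transcript.lower() in phrases]
    correct = [
        i for i in in_grammar if i.recognized.lower() == i.transcript.lower()
    ]
    return {
        "in_grammar_rate": len(in_grammar) / total if total else 0.0,
        "out_of_grammar_rate": (total - len(in_grammar)) / total if total else 0.0,
        # Accuracy is measured only over utterances the grammar could match.
        "in_grammar_accuracy": len(correct) / len(in_grammar) if in_grammar else 0.0,
    }
```

A high out-of-grammar rate suggests callers are saying things the grammar does not cover, while a low in-grammar accuracy points toward recognition or parameter issues instead. Without transcripts, the two cases cannot be told apart.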
The Transcriber is used to write down every word in a call. The Grammar Tester uses these transcripts in evaluating how well a Speech Engine is interpreting what users are saying.
The Transcription Log allows for a detailed view of every single transcript during a transcription session. Interaction number, name and transcription description are all fields that the Log tracks.
The main window of the Speech Tuner is used to load call data, filter that data, and launch the other components of the Speech Tuner application.