Advanced Tuning Concepts Video



  • In this continuation of the Speech Tuning series, Advanced Tuning Concepts covers identifying and fixing problems in your speech application, such as prompt design issues and misrecognitions.
  • RUNTIME 8:43


Video Transcription

Advanced Tuning Concepts

In this video, we will talk about some advanced tuning concepts. We will cover what kinds of problems you are looking for when tuning, how to identify the problems, and how to solve them.


Tuning is about finding trends. You want to look for a large number of users having similar problems, and not focus on problems that only affect a small number of callers. When using the Speech Tuner, you will be able to listen to every single call, and some of you will be tempted to fix each error. But this is not the best way to build a strong application, as you have limited time and resources. Instead, focus your efforts on the big problems first, which means finding problems that are affecting the majority of callers.
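As a rough illustration of looking for trends rather than individual errors, the sketch below tallies how many reviewed calls show each problem and ranks the problems by how widespread they are. The call records, the problem labels, and the `rank_problems` helper are all made up for this example; they are not Speech Tuner output or part of any LumenVox API.

```python
from collections import Counter

# Hypothetical review notes: one entry per reviewed call, listing the
# problem labels a reviewer assigned (labels and data are made up).
reviewed_calls = [
    {"call_id": 1, "problems": ["digits_cut_off"]},
    {"call_id": 2, "problems": []},
    {"call_id": 3, "problems": ["digits_cut_off", "said_customer_care"]},
    {"call_id": 4, "problems": ["digits_cut_off"]},
    {"call_id": 5, "problems": ["asked_for_pizza"]},
    {"call_id": 6, "problems": ["said_customer_care"]},
]


def rank_problems(calls):
    """Return (label, share_of_calls) pairs, most widespread first."""
    counts = Counter()
    for call in calls:
        # Count each label once per call so one bad call cannot dominate.
        counts.update(set(call["problems"]))
    total = len(calls)
    return [(label, n / total) for label, n in counts.most_common()]


ranking = rank_problems(reviewed_calls)
for label, share in ranking:
    print(f"{label}: {share:.0%} of reviewed calls")
```

In this made-up data set, the cut-off digit strings affect the most calls and would be the first thing to fix, while the single pizza request would be left alone.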

There are a few types of problems you should be looking for while tuning.


One of the main categories of errors is misrecognition: the user has said something that was not recognized, or was not recognized correctly.

Unexpected response

Unexpected responses are things you don't expect callers to say, and so the response is not in the grammar or the application. For example, a caller to a banking application asks to order a pizza. Not all unexpected responses are problems you would want to fix, and that example is a type of response you wouldn't want to accommodate.

Variations on expected responses

Variations on expected responses are when callers say something similar to what you expect, but different in key ways. For instance, a caller may say "customer care" instead of "technical support." Both mean the same thing, but the words used were not the ones expected.

Callers may also say the expected words but pronounce them differently than expected. These are alternative pronunciations.

Misrecognized expected responses

A misrecognized response is a response that's contained in the grammar, but the Speech Engine misrecognized it. This is usually the rarest type of misrecognition.

Fixing misrecognitions

Depending on the type of misrecognition, the steps taken to fix it will vary.

Unexpected responses

When dealing with unexpected responses, the first decision is whether the problem needs to be fixed at all. If you modify applications, grammars, or prompts to accommodate a small number of callers, you may end up hurting a larger number of callers.

You should avoid adding in new possible responses to your application or grammar to accommodate a single call — remember, look for trends.

Variations on expected responses

These are often the easiest type of misrecognition to fix. If your grammars are designed well, all you need to do is set up alternate pronunciations, or add extra words. If you are using semantic interpretation, this is easy: all you have to do is add the word or phrase into an existing grammar rule that already has semantic interpretation. Because the output from the grammar (the semantic interpretation) has not changed, no changes are needed to your application.
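The point about semantic interpretation can be sketched with a toy model. The dictionary below is only a stand-in for a grammar, not a real grammar format: each key plays the role of a rule's semantic interpretation, and each list holds the phrases that produce it. The rule names, phrases, and the `interpret` function are assumptions made for illustration.

```python
# Toy stand-in for a grammar: each rule has a semantic value and a list
# of phrases that produce it. Names and phrases are illustrative only.
grammar = {
    "SUPPORT": ["technical support", "tech support"],
    "BILLING": ["billing", "pay my bill"],
}


def interpret(utterance, grammar):
    """Return the semantic value for an utterance, or None if out of grammar."""
    text = utterance.strip().lower()
    for semantic_value, phrases in grammar.items():
        if text in phrases:
            return semantic_value
    return None


before = interpret("customer care", grammar)  # variation not covered yet

# The tuning fix: add the variation to the existing rule.
grammar["SUPPORT"].append("customer care")

after = interpret("customer care", grammar)
print(before, after)
```

Because the new phrase returns the same semantic value as the phrases already in the rule, code that branches on that value keeps working unchanged.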

Misrecognized expected responses

Usually, misrecognitions of expected responses happen because the Speech Engine is not set up correctly. The most common culprits are the parameters related to end-pointing, which is how the engine identifies the start and end of speech. Try adjusting the timeout values first: if the end of the audio is cut off, recognition can suffer drastically. This happens frequently when callers are reading out strings of digits.

Other common causes include hardware problems, such as failing echo cancelation on your telephony card. This lets your prompts bleed back into the system, where they are mistaken for caller speech and trigger a false barge-in too early, so no valid audio is collected.
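To see why an end-of-speech timeout that is too short hurts digit strings, consider this simplified simulation. It is not how any particular Speech Engine detects speech; it assumes the audio has already been split into frames marked as speech or silence, and the `end_silence_frames` parameter is a hypothetical stand-in for a real timeout setting.

```python
def collected_frames(frames, end_silence_frames):
    """Return the frames kept before the endpointer declares end of speech.

    frames: sequence of booleans, True where a frame contains speech.
    end_silence_frames: consecutive silent frames that end the utterance.
    """
    kept = []
    silent_run = 0
    started = False
    for is_speech in frames:
        if not started:
            if not is_speech:
                continue  # leading silence before the caller starts talking
            started = True
        kept.append(is_speech)
        silent_run = 0 if is_speech else silent_run + 1
        if silent_run >= end_silence_frames:
            break  # end of speech declared; later audio is never collected
    return kept


def spoken_frames(frames):
    """Count the frames that contain speech."""
    return sum(frames)


# Hypothetical caller reading digits in two groups with a pause between
# them: 4 speech frames, 5 frames of silence, 4 more speech frames.
audio = [False] * 2 + [True] * 4 + [False] * 5 + [True] * 4 + [False] * 10

short = collected_frames(audio, end_silence_frames=3)
longer = collected_frames(audio, end_silence_frames=8)
print(spoken_frames(short), spoken_frames(longer))
```

With the short setting, the pause between the two groups of digits is treated as the end of speech and the second group is lost. With the longer setting, both groups are collected, at the cost of a longer wait after the caller finishes.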

Design issues

In addition to misrecognitions, tuning can help identify application design issues. Two common types of design issue you can identify just from listening to callers are caller frustration and caller confusion.

Caller frustration

This is where the caller loses faith in the system and becomes frustrated. You can often hear this in the tone of their voice. Callers may hang up or ask to speak to an operator. The main idea is that callers simply do not believe the system is going to help them achieve their goal.

Caller confusion

This is when callers just don't know what to do. A lot of the time caller frustration comes from caller confusion — a caller starts out confused and then becomes frustrated when the system does not help them achieve their goal.

It may also lead to unexpected responses, as callers guess what to say to get the system to work.

Fixing design issues

Caller frustration

To alleviate caller frustration, you often need to redesign your application's call flow. Strive to make it clear where the caller is in the call. For example, the system could tell the user something like, "Thanks for that response, just three more questions and we'll be done."

Callers want to feel progress towards their goal. You want to make the call flow smooth and fast so that you're not wasting their time.

Caller confusion

To reduce confusion, make prompts clearer. The aim is to give callers a clear picture of where they are in the system and what options are available at any time. A good application succinctly gives the user a mental model of the application design at each prompt, a sort of mental flow chart.

You may also want to add more global commands, such as "main menu" or "help," to assist users who become lost within the call flow.
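One way to picture global commands is as a small set of responses that is checked at every prompt, ahead of the responses specific to the current point in the call flow. The sketch below assumes a hypothetical banking call flow; the state names, commands, and action strings are invented for illustration.

```python
# Commands that should work at any prompt in the call flow.
GLOBAL_COMMANDS = {"main menu": "GOTO_MAIN_MENU", "help": "PLAY_HELP"}

# Hypothetical per-state responses for a small banking call flow.
STATE_COMMANDS = {
    "main_menu": {"balance": "GOTO_BALANCE", "transfer": "GOTO_TRANSFER"},
    "transfer": {"checking": "FROM_CHECKING", "savings": "FROM_SAVINGS"},
}


def handle(state, utterance):
    """Return the action for an utterance, checking global commands first."""
    text = utterance.strip().lower()
    if text in GLOBAL_COMMANDS:
        return GLOBAL_COMMANDS[text]
    # Fall back to the responses that are valid only in this state.
    return STATE_COMMANDS.get(state, {}).get(text, "NO_MATCH")


print(handle("transfer", "help"))
print(handle("transfer", "savings"))
```

A caller who gets lost in the transfer dialog can still say "help" or "main menu" and get a useful result, even though neither is one of that dialog's own responses.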

© 2018 LumenVox, LLC. All rights reserved.