Practical Guide to Tuning

To give an idea of how much time speech tuning can take, the speech recognition industry estimates that 40–50% of total development and deployment time should be spent on the tuning process.

Untuned speech applications do not survive contact with customers. Whether your company has live speech applications in deployment today, plans to implement one within the next three to six months, or is only beginning to consider adding speech applications, you should consider the importance of tuning. Tuning uses prompts, grammars, call flow, and caller data to improve the speech application as a whole, and is critical to the success of your deployment.

There are three ideas to keep in mind when approaching a tuning task:

  1. Make Time for Tuning. Even the best of "best practices" build on assumptions that might not hold true after deployment. Once you have callers, you must often readjust or remove these assumptions to provide the quality experience callers expect. To give an idea of how much time tuning can take, the speech industry estimates that 40–50% of total development and deployment time should be spent on the tuning process. Putting emphasis on tuning will help your application run more smoothly, keeping both your callers and your customer happy.
  2. Adapt the System to the Caller. In general, you cannot make callers do anything in a particular way. You can, and should, give callers as much guidance as possible, but ultimately the caller dictates the conversation. The trick is to provide good cues and guidelines, so callers choose the pathway you designed for the application. Remember that if the system fails to meet the caller's needs, it's not the caller who has failed; it's the speech application.
  3. Start with Small Changes. It's all too easy to get caught up in the moment and spend hours of effort on a seemingly enormous problem that really only affects a few out of several hundred callers. First identify the issues that are the easiest to resolve and provide the biggest benefit. Making small changes to improve the experience for most callers is preferable to costly changes that only benefit a few.
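
One way to make the prioritization in item 3 concrete is to rank each known issue by the share of calls it affects per hour of estimated effort. The following is only a minimal sketch: the `Issue` fields, the scoring formula, and all of the figures are hypothetical and chosen for illustration, not taken from any real deployment or tool.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    name: str
    calls_affected: int   # distinct calls in which the issue was observed
    effort_hours: float   # rough estimate of the work needed to fix it (must be > 0)


def prioritize(issues, total_calls):
    """Sort issues by share of calls affected per hour of effort, highest first."""
    def score(issue):
        return (issue.calls_affected / total_calls) / issue.effort_hours
    return sorted(issues, key=score, reverse=True)


# Hypothetical figures for illustration only.
issues = [
    Issue("rare accent misrecognition", calls_affected=3, effort_hours=40.0),
    Issue("missing 'yeah' in yes/no grammar", calls_affected=120, effort_hours=1.0),
    Issue("confusing main-menu prompt", calls_affected=200, effort_hours=8.0),
]

ranked = prioritize(issues, total_calls=500)
for issue in ranked:
    print(issue.name)
```

With these made-up numbers, the one-hour grammar fix ranks first and the costly fix that helps only a handful of callers ranks last, which is the ordering the guideline argues for.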

What you shouldn't do when tuning a speech recognition application:

  1. Don't Make Changes Based on One Instance. This should be fairly obvious, but we still see it happen. Making changes based on a single instance usually results in fixing a problem that doesn't really exist. There are numerous 'one-off' errors in speech recognition, many of which are associated with noise or transient effects that are not generally reproducible. Real issues will arise multiple times, in multiple places, with plenty of evidence to help you decide how to solve them.
  2. Don't Make Changes on Unanalyzed Reports. Treat the report with respect: analyze the call, compare it with other calls, and see what really happened. Often, the system worked as designed, but the design was flawed. Research the problem carefully so that you avoid unnecessary (and costly) changes.

Instead, try this process when tuning an application:

  1. Familiarize Yourself with the Caller's Experience. Do this by listening to the calls from start to finish. Compare the speech engine results against the audio prompts and the caller's speech. Transcribe the audio so you can analyze accuracy and performance.

    Use your Speech Platform's reporting and analytical tools to maximize your information. Above all, identify the key issues and prioritize them. Solve the easiest issues first, such as typical grammar problems. Then move to prompt and dialogue changes, and finally proceed to acoustic model training and adaptation.

  2. Test Changes Rigorously. When you make a change, you must test it. You did the transcripts, and so you have the grammar and audio data: as much as possible, test under 'real' conditions. Give yourself the assurance that any
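
Testing a change against transcribed calls can be sketched as a word error rate comparison: score the recognizer's output against the human transcripts before and after the change. This is a minimal illustration that assumes you can export transcripts and recognition results as plain strings. The function names and the sample utterances are hypothetical, and word error rate is only one common accuracy measure, not necessarily the one your speech platform reports.

```python
def word_errors(reference, hypothesis):
    """Return (word-level edit distance, number of reference words)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1], len(ref)


def word_error_rate(pairs):
    """pairs: iterable of (transcript, recognizer_result) strings."""
    errors = words = 0
    for reference, hypothesis in pairs:
        e, n = word_errors(reference, hypothesis)
        errors += e
        words += n
    return errors / words if words else 0.0


# Hypothetical data: human transcripts and recognizer output
# before and after a grammar change.
transcripts = ["check my balance", "yeah", "transfer to savings"]
before = ["check my balance", "", "transfer two savings"]
after = ["check my balance", "yeah", "transfer to savings"]

wer_before = word_error_rate(zip(transcripts, before))
wer_after = word_error_rate(zip(transcripts, after))
print(f"WER before: {wer_before:.2%}, after: {wer_after:.2%}")
```

Running the same transcribed set through both configurations shows whether a change improves results across many calls, not just for the single utterance that prompted it.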

© 2016 LumenVox, LLC. All rights reserved.