Recognition Questions

Reference Number: AA-02073

I cannot get the engine to recognize correctly, or my results have a low confidence.

A good speech recognition application depends on a well-designed grammar. A grammar that contains very similar-sounding words (like "bit" and "pit") is inefficient and hurts both accuracy and speed. The Engine takes longer because it must test each competing word against the audio, and the resulting match has a lower confidence because the similar words remain plausible alternatives.
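As a rough first pass at spotting such entries, the sketch below flags pairs of grammar phrases whose spellings are nearly identical. This is only an assumption-laden proxy: it compares letters, not sounds, so it will miss pairs that sound alike but are spelled differently, and it is not part of the Engine. The function name `find_confusable_pairs` and the 0.6 cutoff are hypothetical choices for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def find_confusable_pairs(phrases, min_ratio=0.6):
    """Return pairs of phrases whose spellings are very similar.

    Spelling similarity is only a crude stand-in for acoustic
    similarity; min_ratio is an illustrative cutoff, not a
    recommended value.
    """
    pairs = []
    for first, second in combinations(phrases, 2):
        ratio = SequenceMatcher(None, first.lower(), second.lower()).ratio()
        if ratio >= min_ratio:
            pairs.append((first, second))
    return pairs


if __name__ == "__main__":
    grammar_phrases = ["bit", "pit", "transfer me to"]
    for first, second in find_confusable_pairs(grammar_phrases):
        print(f"Possible confusion: {first!r} vs {second!r}")
```

Any pair the check reports is a candidate for rewording or for replacing with a longer, more distinctive phrase.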


What do the confidence scores mean?

The confidence score is a rough measure of how closely the speech matched the phrases in the grammar. The score ranges from 0 to 1000. The higher the score, the higher the estimated probability that the result is correct. A score of 500 indicates the Engine is 50 percent sure the result is correct. Typically, an application designer uses the confidence score to decide whether a recognition result is good enough to act on.

For instance, results of 600 or higher might always be accepted, results from 200 to 599 might trigger a confirmation, and results below 200 might be rejected outright. The right thresholds depend largely on the grammar being used. Along with the grammars themselves, an application's confidence thresholds should be among the first things to tune.
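A minimal sketch of that accept/confirm/reject decision follows. The names `Action` and `classify_result` are hypothetical and not part of any Engine API; the defaults of 600 and 200 are simply the example values from the text and would need tuning per grammar.

```python
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"
    CONFIRM = "confirm"
    REJECT = "reject"


def classify_result(score, accept_at=600, confirm_at=200):
    """Map a 0-1000 confidence score to an application action.

    accept_at and confirm_at are illustrative thresholds taken from
    the example in the text, not recommended values.
    """
    if not 0 <= score <= 1000:
        raise ValueError(f"confidence score out of range: {score}")
    if score >= accept_at:
        return Action.ACCEPT   # confident enough to act on directly
    if score >= confirm_at:
        return Action.CONFIRM  # ask the caller to confirm the result
    return Action.REJECT       # too unreliable; re-prompt the caller


if __name__ == "__main__":
    for score in (850, 600, 599, 200, 199):
        print(score, classify_result(score).value)
```

Keeping the thresholds as parameters makes it easy to use different values for different grammars, for example a stricter acceptance threshold for a grammar with many similar entries.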


What are some ways to increase the recognition accuracy?

Smaller grammars work better. The practical limit is 10,000 phrases, but the smaller the grammar, the greater the accuracy.

Longer phrases also work better. When you need to recognize a phrase like "How do I" or "transfer me to", enter it as a single phrase, not as individual words. Avoid short single words except where a single word is the whole expected response (like "Yes" or "No").

Also, try to cover all the words and phrases you expect a typical user to speak. If a word or phrase is not in the grammar, the Engine will not be able to identify it.

Another key thing to try is adjusting the various Engine parameters, such as the voice activity detection settings. See Recommended Engine Settings for more details.