I cannot get the Engine to recognize speech correctly, or my results have low confidence.
A good speech recognition application depends on a well-designed grammar. A grammar that contains very
similar-sounding words (like "bit" and "pit") is inefficient and will hurt both accuracy
and speed. The Engine takes longer because it must test the competing words against the audio, and the resulting
match has a lower confidence because of the additional similar words.
What do the confidence scores mean?
The confidence score is a rough measure of how closely the speech matched the phrases in the grammar. The
score ranges from 0 to 1000. The higher the score, the higher the estimated probability that the result is
correct. A score of 500 indicates the Engine is 50 percent sure the result is correct. Typically, an
application designer will use the confidence score to make decisions about the quality of a recognition result.
For instance, results of 600 or above might always be accepted, results between 200 and 599 might trigger a confirmation,
and results below 200 might be rejected outright. The thresholds to use depend largely on the grammar that is being
used. Along with the grammars, an application's confidence thresholds should be one of the first things to tune.
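The three-way decision described above can be sketched in a few lines of Python. This is only an illustration: the function name classify_result and the constant names are hypothetical, the threshold values are the example figures from the text rather than recommended settings, and treating exactly 600 as "accept" and exactly 200 as "confirm" is an assumption about the boundaries.

```python
# Example thresholds from the text above; they are illustrative only and
# should be tuned for each grammar.
ACCEPT_THRESHOLD = 600  # at or above: accept without confirmation (assumed boundary)
REJECT_THRESHOLD = 200  # below: reject outright


def classify_result(confidence: int) -> str:
    """Map a 0-1000 confidence score to 'accept', 'confirm', or 'reject'."""
    if not 0 <= confidence <= 1000:
        raise ValueError("confidence must be between 0 and 1000")
    if confidence >= ACCEPT_THRESHOLD:
        return "accept"
    if confidence >= REJECT_THRESHOLD:
        return "confirm"  # ask the caller to confirm what was heard
    return "reject"


if __name__ == "__main__":
    for score in (850, 600, 599, 200, 199):
        print(score, classify_result(score))
```

Keeping the thresholds as named constants makes them easy to adjust per grammar while tuning.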
What are some ways to increase the recognition accuracy?
Smaller grammars work better. The practical limit is 10,000 phrases, but the smaller the grammar, the greater the accuracy.
Longer phrases also work better. When you need to recognize a phrase like "How do I" or "transfer me
to", put it in the grammar as a single phrase, not as individual words. Except when you need to recognize a single word (like
"Yes" or "No"), avoid short single words.
Also, try to cover all the words you believe a typical user will speak. If a word or phrase is not in the grammar,
the Engine will not be able to identify it.
Another key thing to try is tuning various Engine parameters, such as the voice activity detection settings.
See Recommended Engine Settings for more details.
Why does the first result in my list of N-Best results have a low confidence score?
This is related to a bug currently in the Engine.
The results are in the order the Engine believes is best; only the displayed confidence scores are wrong.
The confidence score for the first result is correct, and the scores for all the others are
shown too high.
The Engine calculates an initial round of scores, which it then uses to sort the N-Best results (highest score is
first). Then it applies weighting on those scores, issuing penalties for certain things in the recognition. That
weighted score is what it is supposed to display. Unfortunately, there is a bug that is causing the Engine to show
the weighted score only for the first result, so all the other results are being shown with unweighted scores.
You are most likely to encounter this issue when speaking out-of-grammar words. In most cases, the penalty will be
slight enough that the first result will still have the highest confidence score.
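The effect can be reproduced with a small Python model. This is not the Engine's code: the Hypothesis class, the displayed_scores function, the sample scores, and the idea that weighting is a simple subtraction of a penalty are all assumptions made for illustration. The sketch sorts by the initial score, then shows the weighted score only for the first entry, as the bug description above says.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    raw_score: int  # initial, unweighted score used for sorting
    penalty: int    # hypothetical penalty applied during weighting


def displayed_scores(hypotheses, buggy=True):
    """Return (text, shown_score) pairs in N-Best order.

    With buggy=True only the first result shows its weighted score,
    mimicking the behavior described above; with buggy=False every
    result shows its weighted score.
    """
    ranked = sorted(hypotheses, key=lambda h: h.raw_score, reverse=True)
    shown = []
    for index, hyp in enumerate(ranked):
        weighted = hyp.raw_score - hyp.penalty  # assumed form of the weighting
        score = weighted if (index == 0 or not buggy) else hyp.raw_score
        shown.append((hyp.text, score))
    return shown


if __name__ == "__main__":
    # Made-up values with a large penalty, as might occur with out-of-grammar speech.
    n_best = [
        Hypothesis("call Rome", raw_score=600, penalty=140),
        Hypothesis("call home", raw_score=620, penalty=150),
    ]
    print(displayed_scores(n_best, buggy=True))
    print(displayed_scores(n_best, buggy=False))
```

In this made-up case the first result is shown at 470 while the second is shown at its unweighted 600, even though the ordering itself is unchanged.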