Other FAQs

Reference Number: AA-02074

Will the engine handle proper names?

Yes. The internal dictionary contains thousands of common names; roughly half of its entries are names.

If a name is not in the dictionary, the decoder falls back on a combination of statistical pronunciation and basic rules to spell the name phonetically.

If the phonetic speller cannot come up with a good pronunciation, you may enter the phonetic spelling of the name yourself. This has been shown to work in the vast majority of cases. If necessary, the phonetic spelling can be entered directly as the phrase by enclosing the phoneme characters in curly braces "{ }". See Using Phonetic Spelling.

Why do my prompts trigger barge-in?

Poor echo cancellation is a common problem for speech developers. When echo cancellation fails, prompts played to callers are heard by the speech application. If the prompt echo is loud enough, the speech application may mistake it for speech and trigger barge-in, cutting off the prompt.

This commonly happens for three reasons:

The prompt playing is too loud. If you listen to the entire audio, you can typically hear the echo cancellation start to fail. To remedy this, reduce the volume at which the prompts are played.
The prompt has leading silence. The echo cancellation hardware uses the first half-second of a prompt to establish its timing. It looks for sound markers in the prompt in order to perform echo cancellation, and it cannot find any if there is only silence. Trailing silence is fine, but silence at the start of the prompt is not.
The prompt is text to speech. Text to speech can create problems for echo cancellation because there is actual silence between the words. This can cause the echo cancellation to lose the proper timing.
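
The first two causes can be checked offline before a prompt is deployed. The following is a minimal Python sketch and is not part of the engine. It assumes the prompt is already available as a list of signed 16-bit PCM samples; the function names and the silence threshold of 200 are hypothetical choices for illustration and would need tuning for real recordings.

```python
def leading_silence_ms(samples, sample_rate, threshold=200):
    """Return the duration of leading silence in milliseconds.

    A sample counts as silence when its absolute value is below
    `threshold` (an assumed value, not taken from the engine).
    """
    for index, sample in enumerate(samples):
        if abs(sample) >= threshold:
            return index * 1000.0 / sample_rate
    # Every sample is below the threshold: the whole prompt is silence.
    return len(samples) * 1000.0 / sample_rate


def trim_leading_silence(samples, threshold=200):
    """Drop all samples before the first one at or above `threshold`."""
    for index, sample in enumerate(samples):
        if abs(sample) >= threshold:
            return samples[index:]
    return []


def scale_volume(samples, gain):
    """Multiply every sample by `gain`, clamped to the 16-bit range."""
    return [max(-32768, min(32767, int(round(s * gain)))) for s in samples]
```

A prompt whose leading silence is a large part of the first half-second could be passed through trim_leading_silence, and a prompt that is too loud could be passed through scale_volume with a gain below 1.0, before it is played to callers.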

What is n-best?

Instead of returning only one hypothesis, the engine returns several candidate sentences for what it heard. The top sentence is usually the highest scoring one. The others are the best alternatives, which scored lower. N-best results can be used to craft more intelligent confirmations. For example, if the caller rejects the top sentence, the application can offer the next alternative instead of asking the caller to repeat.
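
One way an application might use an n-best list for confirmations is sketched below in Python. This is an illustration only, not the engine's API: the Hypothesis class, the choose_confirmation function, the prompt wording, and the score scale with an acceptance threshold of 800 are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str   # one candidate sentence from the n-best list
    score: int  # higher is better (assumed scale, not the engine's)


def choose_confirmation(nbest, rejected=(), accept_score=800):
    """Return the next prompt to play, given an n-best list.

    `rejected` holds sentences the caller has already said "no" to,
    so the next best alternative is offered instead.
    """
    remaining = [h for h in nbest if h.text not in rejected]
    if not remaining:
        # Every alternative was rejected: ask the caller to repeat.
        return "Sorry, I did not get that. Please say it again."
    # The top entry is usually the best, but pick by score to be safe.
    best = max(remaining, key=lambda h: h.score)
    if best.score >= accept_score and not rejected:
        return f"OK, {best.text}."
    return f"Did you say {best.text}?"
```

With two close alternatives such as "Austin" and "Boston", the sketch first asks about the higher scoring one and, if the caller rejects it, asks about the other before falling back to a reprompt.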

Why does the API appear to cause a memory leak?

A common cause of growing memory usage is repeatedly loading grammars without unloading them. It is good practice to unload grammars that will not be used for a while.

Also, please exercise caution when using the C API. Most of the handles created by the API, such as H_SI, H_GRAMMAR, and HPORT, need to be explicitly released after you are done using them.
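
The load-then-unload discipline described above can be sketched as a small cache that keeps only the most recently used grammars loaded. This Python sketch makes no calls into the real API: the GrammarCache class is hypothetical, and the load and unload callables are placeholders for whatever load and release calls your application actually makes.

```python
from collections import OrderedDict


class GrammarCache:
    """Keep at most `capacity` grammars loaded at once.

    `load(name)` and `unload(name, handle)` are caller-supplied
    placeholders for the real load and release calls.
    """

    def __init__(self, load, unload, capacity=8):
        self._load = load
        self._unload = unload
        self._capacity = capacity
        self._loaded = OrderedDict()  # name -> handle, least recent first

    def get(self, name):
        if name in self._loaded:
            self._loaded.move_to_end(name)  # mark as most recently used
            return self._loaded[name]
        handle = self._load(name)
        self._loaded[name] = handle
        # Unload the least recently used grammars beyond the capacity.
        while len(self._loaded) > self._capacity:
            old_name, old_handle = self._loaded.popitem(last=False)
            self._unload(old_name, old_handle)
        return handle

    def close(self):
        """Unload everything that is still loaded."""
        while self._loaded:
            name, handle = self._loaded.popitem(last=False)
            self._unload(name, handle)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()  # release on every exit path, including errors
        return False
```

Used in a with block, the cache releases every remaining grammar when the block exits, which mirrors the explicit release that handles such as H_SI, H_GRAMMAR, and HPORT require in the C API.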