Asterisk Generic Speech API - Overview

Reference Number: AA-01799

Full documentation for the speech API is available at Asterisk.org. View the reference for all of the Dialplan Applications that start with the word "Speech."

Speech recognition on Asterisk consists of several steps:

Initialize a connection between Asterisk and the Speech Engine using SpeechCreate(). This creates a speech resource with which grammars and recognitions will be associated. For each such resource you initialize, you need one LumenVox license.
Load a grammar, either by using the SpeechLoadGrammar() function or by specifying the grammar to be preloaded in lumenvox.conf. SpeechLoadGrammar() takes two parameters: a label and a path to the grammar. For instance, to load our built-in boolean grammar using the label yesno, you would use SpeechLoadGrammar(yesno|/opt/lumenvox/engine/Lang/BuiltinGrammars/ABNFBoolean.gram).
Activate grammars as needed. Active grammars control what words the Engine will recognize: at any time, the Engine can only recognize words specified in the active grammars. To activate a grammar, call SpeechActivateGrammar() using the grammar's label as a parameter, e.g. SpeechActivateGrammar(yesno).
Perform a recognition by calling SpeechBackground() or SpeechStart(). Both functions tell Asterisk to start listening for speech, but SpeechBackground() also takes the name of a prompt to play while listening. It allows for barge-in, much like the Background() function.
Examine the results of a recognition using several variables: ${SPEECH_TEXT(n)} contains the text (or semantic interpretation, if applicable) of what the caller said. The n is the index of the result, in case the Engine returns more than one. The first result is number 0. ${SPEECH(results)} contains the number of results. ${SPEECH_SCORE(n)} contains the confidence score for result n.
Unload and deactivate grammars using SpeechUnloadGrammar(label) and SpeechDeactivateGrammar(label). You may then repeat the process, or destroy the speech resource using SpeechDestroy(). Destroying the resource frees up the license.
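
The steps above can be sketched as a single dialplan context. This is a minimal illustration and not a production configuration. The context name speech-yesno and the prompt name beep are assumptions chosen for the example. Arguments are separated with commas, as used by Asterisk 1.6 and later; on older versions that use the pipe separator, substitute | for the comma. Dialplan variables are written with the ${...} syntax.

```
[speech-yesno]
; Hypothetical context name; "beep" is an assumed prompt file.
exten => s,1,Answer()
; Step 1: create the speech resource (consumes one LumenVox license).
exten => s,n,SpeechCreate()
; Step 2: load the built-in boolean grammar under the label "yesno".
exten => s,n,SpeechLoadGrammar(yesno,/opt/lumenvox/engine/Lang/BuiltinGrammars/ABNFBoolean.gram)
; Step 3: activate the grammar so the Engine can recognize its words.
exten => s,n,SpeechActivateGrammar(yesno)
; Step 4: play a prompt and listen for speech (barge-in allowed).
exten => s,n,SpeechBackground(beep)
; Step 5: inspect the number of results, then the first result and its score.
exten => s,n,NoOp(Results: ${SPEECH(results)})
exten => s,n,NoOp(Text: ${SPEECH_TEXT(0)} Score: ${SPEECH_SCORE(0)})
; Step 6: deactivate and unload the grammar, then free the license.
exten => s,n,SpeechDeactivateGrammar(yesno)
exten => s,n,SpeechUnloadGrammar(yesno)
exten => s,n,SpeechDestroy()
exten => s,n,Hangup()
```

A real application would typically branch on ${SPEECH(results)} and ${SPEECH_SCORE(0)} before acting on the recognized text, for example re-prompting the caller when no result is returned or the score is low.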

More Resources

Our general Speech Engine help document contains detailed information on writing grammars that also applies to Asterisk. In particular, read through the sections in the Programmer's Guide on SRGS Grammars and Semantic Interpretation for Speech Recognition (SISR).
We have online training videos that cover speech development. In particular, our Asterisk Speech Recognition 101 series is a good starting point.
The Asterisk Speech Application Zone contains several small applications, with source code, that you can use as a starting point for your own development.
If you would like personalized, in-depth training for Asterisk speech development, we offer training classes.
Digium maintains an Asterisk Speech Recognition Mailing List that you can participate in to have more advanced questions answered.