Developing Speech Applications on Asterisk

Reference Number: AA-01143

Asterisk has native speech recognition support through the module res_speech.so. This module provides a number of dialplan applications that can be used for speech recognition. However, res_speech.so does not handle the communication between Asterisk and an automatic speech recognizer (ASR).

To get that communication working, a second Asterisk module must be loaded. LumenVox's recommended method is to use the open-source UniMRCP-Asterisk module, which uses the Media Resource Control Protocol (MRCP) to send speech recognition requests from Asterisk to the LumenVox Media Server. In this setup, a module called res_speech_unimrcp.so links the native Asterisk speech API to LumenVox, so that dialplan calls are translated into MRCP requests.

One shortcoming of this approach is that Asterisk has no native text-to-speech (TTS) support. To address this, the UniMRCP project includes another Asterisk module, res_unimrcp.so, whose dialplan applications provide TTS functionality. The res_unimrcp.so module also includes its own ASR applications, which can replace or complement the native Asterisk speech API.

In summary, there are two broad approaches to using speech software on Asterisk with UniMRCP:

Use res_speech_unimrcp.so with the native Asterisk Generic Speech API (res_speech.so). This will not give you access to TTS.
Use res_unimrcp.so for TTS and/or ASR functionality.

Developers may also choose to combine these approaches. One important benefit of using the Asterisk Generic Speech API is that recognition results are generally returned as simple strings. The res_unimrcp.so interface currently returns results in the Natural Language Semantic Markup Language (NLSML) format, which represents a complex object as XML. Application developers therefore need to parse an NLSML-formatted object to make use of the results of an ASR interaction. We have a detailed guide to how LumenVox supports NLSML that may be of use to those looking to build NLSML parsers.
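
To illustrate the kind of parsing involved, here is a minimal sketch in Python that uses only the standard library. It is not the parser used by LumenVox or UniMRCP. The names parse_nlsml and local_name and the sample NLSML document are made up for illustration, and results from a real recognizer may carry additional elements, attributes, or a different namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample result, hand-written for this sketch. A real NLSML
# document from a recognizer may look different.
SAMPLE = (
    '<result xmlns="urn:ietf:params:xml:ns:mrcpv2">'
    '<interpretation confidence="0.92">'
    '<instance>yes</instance>'
    '<input mode="speech">yes please</input>'
    '</interpretation>'
    '</result>'
)


def local_name(tag):
    """Reduce a namespaced tag such as '{urn:...}result' to 'result'."""
    return tag.rsplit("}", 1)[-1]


def parse_nlsml(nlsml_text):
    """Return one dict per <interpretation> element found in the document.

    Namespaces are ignored so the same code accepts documents with or
    without an xmlns declaration. The confidence value is kept as the raw
    attribute string because its scale can differ between setups.
    """
    root = ET.fromstring(nlsml_text)
    results = []
    for element in root.iter():
        if local_name(element.tag) != "interpretation":
            continue
        entry = {
            "confidence": element.get("confidence"),
            "input": None,
            "instance": None,
        }
        for child in element:
            name = local_name(child.tag)
            if name in ("input", "instance"):
                # A structured <instance> is flattened to its text content.
                entry[name] = "".join(child.itertext()).strip()
        results.append(entry)
    return results


if __name__ == "__main__":
    for interpretation in parse_nlsml(SAMPLE):
        print(interpretation)
```

Flattening the instance element to text keeps the sketch short. An application whose grammars return structured semantic results would more likely walk the child elements of the instance rather than join their text.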

The following information on building speech applications for Asterisk is available:

res_unimrcp.so

MRCPRecog() - speech recognition over MRCP.
MRCPSynth() - TTS over MRCP.
SynthAndRecog() - plays TTS output while listening for speech (ASR) input.

Generic Speech API

Overview