
CPA Example Applications

Reference Number: AA-02457

LumenVox 19.1 introduced two new sample applications, designed to demonstrate the features and capabilities of CPA and AMD.

These are SimpleCPAClient_cpp, the C++ version of the example, and SimpleCPAClient_c, the C language version.

Both examples demonstrate the use of CPA and AMD: they use the provided grammar and audio files to enable these features, perform streaming requests using multiple threads, and then obtain and display the results in the console window.

Since the functionality of both of these applications is essentially the same, we will describe their functionality together below. Any differences between the applications will be noted.

SimpleCPAClient Overview

The diagram below shows the basic functionality of the SimpleCPAClient applications, which require the input of a grammar file, as well as an audio file to be used.

Note that the settings used for each request are contained in the grammar file. These settings include, for example, whether to use the Call Progress Analysis (CPA) or Tone Detection algorithm when processing the audio in the specified file to generate the result. You can use a different grammar file, containing different settings, for each request, effectively tailoring how each request is processed. This is how production-scale applications can use CPA while applying different settings across applications, or across accounts in a multi-tenant installation.

The LumenVox server shown to the right of the diagram is the ASR server, which processes the requests in order to deliver results to the application. The application itself is designed to run on both Windows and Linux operating systems and links directly to the shared Speech Port library provided by LumenVox in the regular installed packages when installing a normal instance of LumenVox ASR software on your system. You should therefore be sure to install the latest version as described in our Downloading LumenVox Products article. You should also have a reasonable understanding of some of the concepts behind LumenVox CPA as detailed in our Introduction to CPA article.

See our Grammars in CPA and AMD article for a complete explanation of the various settings available for use in the grammar files for each of the detection modes when working with CPA and AMD.

The purpose of this sample application is to provide a working example of how to write your own application that uses LumenVox CPA technology. Where possible, we've included comments in the provided source code that describe important aspects and concepts used within the application.

Within the application, two independent threads are created: one for loading and activating the grammar, and the other for streaming the audio from the file into the speech port. Loading and activating the grammar does not usually take very long, so the grammar thread completes its task relatively quickly. The audio streaming thread sends chunks of audio, simulating a real-time audio stream, and monitors the status of the port callback (assigned within the sample application). The callback is notified asynchronously of events occurring on the Speech Port, such as the presence of speech, the end of speech and various timeouts.

Once both of these threads have completed their tasks, an answer can then be determined, which is then printed out to the console window.
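The two-thread flow described above can be sketched with standard C++ threads. This is only an illustration under stated assumptions, not the code shipped with the sample: RequestState, loadAndActivateGrammar, streamAudio and runRequest are hypothetical names, the 160-byte chunk size is an assumed value, and the places where a real client would call the Speech Port library are marked with comments instead of API calls.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical shared state; the real sample tracks this via the port callback.
struct RequestState {
    std::atomic<bool> grammarReady{false};
    std::atomic<std::size_t> chunksSent{0};
};

// Stand-in for loading and activating a grammar (normally quick).
void loadAndActivateGrammar(const std::string& grammarPath, RequestState& state) {
    (void)grammarPath;  // a real client would hand this to the speech port
    state.grammarReady = true;
}

// Send the audio in fixed-size chunks, pausing between chunks to mimic a
// real-time stream.
void streamAudio(const std::vector<unsigned char>& audio, std::size_t chunkBytes,
                 std::chrono::milliseconds pause, RequestState& state) {
    assert(chunkBytes > 0);
    for (std::size_t offset = 0; offset < audio.size(); offset += chunkBytes) {
        // A real client would pass this slice of audio to the port here.
        ++state.chunksSent;
        std::this_thread::sleep_for(pause);
    }
}

// Run both tasks on their own threads and wait for both before reporting.
// Returns the number of chunks sent, or 0 if the grammar never became ready.
std::size_t runRequest(const std::string& grammarPath,
                       const std::vector<unsigned char>& audio,
                       std::chrono::milliseconds pause) {
    RequestState state;
    std::thread grammarThread(loadAndActivateGrammar, std::cref(grammarPath),
                              std::ref(state));
    std::thread audioThread(streamAudio, std::cref(audio), std::size_t{160},
                            pause, std::ref(state));
    grammarThread.join();
    audioThread.join();
    // Only after both joins is it safe to determine the answer.
    return state.grammarReady ? state.chunksSent.load() : 0;
}
```

Joining both threads before reading the state mirrors the rule that an answer is only determined once both tasks have finished.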


Please note that the sample applications provided here and elsewhere are aimed at highlighting the technology and various system calls that should be used to obtain results. They specifically do not perform extensive error-checking and exception handling that should be used in production code. The assumption is that application developers will include their own debugging, error-checking and other logic that is required, which is beyond the scope of the sample.

Running the application

To run the application, open a command line (on either Windows or Linux) and type the name of the application followed by the parameters to use. If you do not provide any parameters, you will see a list of the available options, including the optional parameters, as shown here:



From the details shown here, you can see how to specify the audio file as well as the grammar file when making requests.
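As a rough illustration of how a client might read these parameters, the sketch below parses the -g (grammar) and -a (audio) options used in the examples later in this article. The Options struct and parseArgs function are hypothetical names, not taken from the sample source. The optional IP address and verbose switches are deliberately left out, because their exact spelling comes from the application's own usage text.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for the two required parameters.
struct Options {
    std::string grammarFile;
    std::string audioFile;
};

// Fills 'out' from the argument list. Returns false when either required
// value is missing, in which case a caller would print the usage text.
bool parseArgs(const std::vector<std::string>& args, Options& out) {
    Options parsed;
    for (std::size_t i = 0; i + 1 < args.size(); ++i) {
        if (args[i] == "-g") {
            parsed.grammarFile = args[++i];  // value follows the flag
        } else if (args[i] == "-a") {
            parsed.audioFile = args[++i];
        }
    }
    if (parsed.grammarFile.empty() || parsed.audioFile.empty()) {
        return false;
    }
    out = parsed;
    return true;
}
```

In a main function, the list could be built with std::vector<std::string>(argv + 1, argv + argc).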

IP Address (optional)

There is also an optional IP Address that can be used to send requests to the ASR server at the specified IP Address. If this is not provided, the application will use the SRE_SERVERS setting specified in your client_property.conf file.

Verbose Output

There is also an optional verbose mode, which shows additional status information in the console window as the audio is being streamed and the request is being processed. This may be helpful when you first run the application, as it gives you a sense of the sequence of events the application is processing, as shown below:

You can see various status and debugging statements appear, which come from the debugging lines included within the code, as well as feedback from the Speech Port callback function, when triggered to indicate changes in the port's stream state.

Without specifying the verbose output option, only the bottom portion of the output will be displayed (the result), as shown here:

SimpleCPAClient Terse Output

Sample Files Included

Several sample grammar and audio files are provided with the SimpleCPAClient application. They allow you to fully exercise the capabilities of the application and show how you could use your own audio and grammar files when you are ready.

Note: Audio format

Note that the sample application uses the G.711 8 kHz μ-law file format (ULAW_8KHZ), so the samples included here are encoded in this format. Files of this encoding type are generally saved with the .raw or .ulaw suffix. The sample application does not process the file suffix, so any or none could be used, but the audio you provide must be in this specific format to work. Also note that you can configure the speech port (and sample code) to work with other ASR Sound Formats if needed.
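Because G.711 μ-law at 8 kHz stores one byte per sample, a headerless file in this format holds 8000 bytes per second of audio. The helper functions below use that fact to estimate the duration of a raw file and to size chunks for streaming. Their names are hypothetical; they are not part of the sample application or the LumenVox API.

```cpp
#include <cstddef>

// G.711 mu-law at 8 kHz: 8000 samples per second, one byte per sample.
constexpr std::size_t kBytesPerSecond = 8000;

// Duration in milliseconds of a headerless mu-law buffer of the given size.
constexpr std::size_t durationMs(std::size_t byteCount) {
    return byteCount * 1000 / kBytesPerSecond;
}

// Number of bytes that cover the given number of milliseconds of audio.
constexpr std::size_t bytesForMs(std::size_t ms) {
    return ms * kBytesPerSecond / 1000;
}
```

For example, a 20 ms chunk is 160 bytes, and a 24000-byte file lasts three seconds.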

CPA Sample Files

The files provided for testing CPA are listed in the table below:

Sample File Name                    Comments
CallProgressAnalysis.grxml          Sample grammar not using the CPA_MAX_TIME_FROM_CONNECT option
CallProgressAnalysis_max1000.grxml  CPA_MAX_TIME_FROM_CONNECT specified to return CPA result within 1000 ms
                                    CPA_MAX_TIME_FROM_CONNECT specified to return CPA result within 5000 ms
CallProgressAnalysis.raw            Audio file containing "Hello" simulating HUMAN RESIDENCE response
CallProgressAnalysis_max1000.raw    Audio file containing "Hello" with reduced silence to work within 1000 ms

To use these with the SimpleCPAClient application, simply select the appropriate grammar file and audio file and provide these as parameters to the application, as shown here:

SimpleCPAClient -g CallProgressAnalysis.grxml -a CallProgressAnalysis.raw

When testing with the maximum 1000 ms response option (using CallProgressAnalysis_max1000.grxml), you will get two different responses from the two audio files provided. The normal file (CallProgressAnalysis.raw) does not have speech early enough in the audio stream, so the predictor responds with UNKNOWN SILENCE. The second audio file (CallProgressAnalysis_max1000.raw) has less leading silence, so there is sufficient time to detect speech before a result is returned (UNKNOWN SPEECH). This demonstrates the effect of the new CPA_MAX_TIME_FROM_CONNECT setting.
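When preparing your own audio for a short CPA_MAX_TIME_FROM_CONNECT window, it can help to check how much leading silence a file contains. The sketch below decodes μ-law bytes with the standard G.711 expansion and reports the time at which the signal first exceeds an amplitude threshold. The function names and the threshold of 500 are illustrative assumptions, and this simple level check is not how the LumenVox predictor distinguishes silence from speech.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Standard G.711 mu-law to 16-bit linear PCM expansion.
int ulawToLinear(std::uint8_t code) {
    const int kBias = 0x84;
    code = static_cast<std::uint8_t>(~code);       // mu-law bytes are stored inverted
    int magnitude = ((code & 0x0F) << 3) + kBias;  // mantissa plus bias
    magnitude <<= (code & 0x70) >> 4;              // scale by the segment number
    return (code & 0x80) ? (kBias - magnitude) : (magnitude - kBias);
}

// Milliseconds before the first sample louder than the threshold, assuming
// 8 kHz mu-law (8 samples per millisecond). Returns the full duration when
// no sample exceeds the threshold.
std::size_t leadingSilenceMs(const std::vector<std::uint8_t>& audio,
                             int threshold = 500) {
    for (std::size_t i = 0; i < audio.size(); ++i) {
        if (std::abs(ulawToLinear(audio[i])) > threshold) {
            return i / 8;
        }
    }
    return audio.size() / 8;
}
```

A file whose leading silence is close to, or longer than, the configured maximum time is a likely candidate for a silence result, although the actual outcome depends on the server's own analysis.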

AMD Sample Files

The files provided for testing AMD, or Tone Detection, are listed in the table below:

Sample File Name          Comments
ToneDetection.grxml       Sample grammar with AMD, FAX and SIT enabled (no Busy)
ToneDetection_busy.grxml  Sample grammar with AMD, FAX, SIT and BUSY enabled
                          Audio file containing an answering machine beep after a message
ToneDetection_busy.raw    Audio file containing a busy tone (US Style)
ToneDetection_fax.raw     Audio file containing a fax tone
ToneDetection_SIT.raw     Audio file containing a Special Information Tone (SIT INTERCEPT)

To use these with the SimpleCPAClient application, simply select the appropriate grammar file and audio file and provide these as parameters to the application, as shown here:

SimpleCPAClient -g ToneDetection.grxml -a ToneDetection_beep.raw

Note that in order to test the BUSY detection option, you should make sure the BUSY_CUSTOM_ENABLE option is set to "true" in the grammar (as it is in the ToneDetection_busy.grxml file), as shown here:

<meta name="BUSY_CUSTOM_ENABLE"                content="true"/>
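If you want to confirm that a grammar enables busy detection before sending it, a simple text check may be enough for well-formed files. The sketch below looks for the meta element shown above, tolerating variable whitespace between the attributes. grammarEnablesBusy is a hypothetical helper, it assumes the name attribute comes before content as in the sample grammar, and a production tool would more likely use an XML parser.

```cpp
#include <regex>
#include <string>

// Returns true when the grammar text contains a BUSY_CUSTOM_ENABLE meta
// element whose content is "true".
bool grammarEnablesBusy(const std::string& grammarText) {
    static const std::regex pattern(
        R"re(<meta\s+name="BUSY_CUSTOM_ENABLE"\s+content="true"\s*/>)re");
    return std::regex_search(grammarText, pattern);
}
```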

MRCP and Telephony Platform Developers

Note that the SimpleCPAClient application uses the Speech Port library as described above, so it is intended for users implementing their own application against either our C or C++ API when working with CPA and AMD. Users working with speech platforms do not need to use our API directly; they should instead configure their speech platform to communicate with the LumenVox Media Server, which can process similar requests using the MRCP protocol. See our Using CPA on a Voice Platform article for more details of how to do this.

The sample audio and grammar files mentioned above for both CPA and AMD can also be used with the SimpleMRCPClient application to exercise the same functionality, which may be helpful to MRCP developers, for example:

SimpleMRCPClient -g ToneDetection.grxml -a ToneDetection_beep.raw -inline_grammar

Be sure to specify and use the -inline_grammar option, which will send the specified grammar's text inline to the Media Server for processing.

-inline_grammar option

Note that the -inline_grammar option shown here was introduced to the SimpleMRCPClient in LumenVox version 19.1, so you should update your client to this version or newer in order to use it. Without this option, you will need to host your grammars somewhere the Media Server is able to reach. If your Media Server is located on the same machine, you may be able to simply use the absolute path to the grammar file instead.