The LumenVox Speech Engine is a piece of core technology that lets you load audio and grammars (lists of words and phrases to be recognized by the Engine) and get back the decoded utterance. The task of returning decoded text from audio can be broken up into three general phases:
The Engine accepts audio in one of two ways. The simplest way is in a batch (offline) mode, where you have audio files saved to disk and load them into the Engine. In this case, your application will need to follow a process like this:
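As a rough illustration of the batch flow, the sketch below reads PCM audio from a WAV file with Python's standard `wave` module and hands it to a stand-in engine object. `StubEngine` and its `load_grammar`, `load_audio`, and `decode` methods are hypothetical names invented for this example and are not the LumenVox API; the stub performs no recognition and only reports what it was given. The 8 kHz, 16-bit mono format is also an assumption.

```python
import io
import wave


def make_wav_bytes(num_frames=800, sample_rate=8000):
    """Build a small silent WAV in memory so the example needs no file on disk."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)        # mono (assumed)
        w.setsampwidth(2)        # 16-bit samples (assumed)
        w.setframerate(sample_rate)
        w.writeframes(bytes(2 * num_frames))  # silence
    return buf.getvalue()


def read_pcm(wav_file):
    """Return (raw PCM bytes, sample rate) from a WAV path or file object."""
    with wave.open(wav_file, "rb") as w:
        return w.readframes(w.getnframes()), w.getframerate()


class StubEngine:
    """Hypothetical stand-in for a speech engine; NOT the LumenVox API."""

    def __init__(self):
        self.phrases = []
        self.pcm = b""
        self.sample_rate = 0

    def load_grammar(self, phrases):
        self.phrases = list(phrases)

    def load_audio(self, pcm, sample_rate):
        self.pcm, self.sample_rate = pcm, sample_rate

    def decode(self):
        # A real engine would return the recognized utterance here.
        # The stub only reports how much 16-bit mono audio it received.
        seconds = len(self.pcm) / 2 / self.sample_rate
        return {"audio_seconds": seconds, "candidates": self.phrases}


def run_batch(wav_file, phrases):
    """Batch flow: load the grammar, load the saved audio, then decode."""
    engine = StubEngine()
    engine.load_grammar(phrases)
    pcm, rate = read_pcm(wav_file)
    engine.load_audio(pcm, rate)
    return engine.decode()


if __name__ == "__main__":
    print(run_batch(io.BytesIO(make_wav_bytes()), ["yes", "no"]))
```

In a real application, `wav_file` would be the path of an audio file saved to disk, and the stub calls would be replaced by the corresponding Engine calls covered later in this guide.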
Most applications, however, will be more complex, because they need a streaming (online) interface that loads audio directly from a live source, such as a telephone caller. In this case, your application will have two separate threads.
The first thread will be a main processing thread, similar to the simple application process spelled out above:
At the same time, you will have a streaming thread that loops:
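The two-thread design can be pictured with a minimal sketch like the one below. It shows only the threading pattern: a streaming thread loops over chunks from a live source and pushes them to a port, while the main thread waits for the stream to end before moving on to decoding. `StubPort`, `send_audio`, `end_stream`, and `fake_live_source` are hypothetical names made up for this illustration and are not the LumenVox API; the 320-byte chunk size is likewise an assumption.

```python
import threading


class StubPort:
    """Hypothetical stand-in for a speech port; NOT the LumenVox API."""

    def __init__(self):
        self._lock = threading.Lock()
        self._audio = bytearray()
        self.stream_done = threading.Event()

    def send_audio(self, chunk):
        # Called from the streaming thread, so guard the shared buffer.
        with self._lock:
            self._audio.extend(chunk)

    def end_stream(self):
        # Tell the main thread that no more audio is coming.
        self.stream_done.set()

    def audio_length(self):
        with self._lock:
            return len(self._audio)


def fake_live_source(num_chunks, chunk_bytes=320):
    """Simulates a live source (e.g. a telephone caller) with silent chunks."""
    for _ in range(num_chunks):
        yield bytes(chunk_bytes)


def streaming_loop(port, source):
    """Streaming thread: loop, forwarding each chunk to the port."""
    for chunk in source:
        port.send_audio(chunk)
    port.end_stream()


def run_session(num_chunks=50):
    """Main processing thread: start streaming, wait for it, then continue."""
    port = StubPort()
    worker = threading.Thread(
        target=streaming_loop, args=(port, fake_live_source(num_chunks))
    )
    worker.start()
    port.stream_done.wait()   # main thread blocks until the stream ends
    worker.join()
    # A real application would now decode and read back the result.
    return port.audio_length()


if __name__ == "__main__":
    print(run_session())
```

Keeping the audio loop on its own thread means the main thread never blocks on the live source; the only shared state is the port's buffer, which is protected by a lock.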
You will want to keep this overall design in mind as you get started learning about the Engine. This guide will walk you through the basic steps, providing example code along the way. To get started, continue on to Initializing a Speech Port.
If you are brand new to speech recognition, you may wish to browse through this core programming guide to get an idea of our API, and then immediately read our tutorial on writing SRGS grammars. Much of your application's success will depend on the quality of your grammars, so understanding how to write quality grammars is important.