Frequently Asked Questions
LumenVox Distributed Architecture
Though the LumenVox Speech Engine is commonly run on a single machine, it can also be run in a distributed manner, with multiple
clients and servers working together. This allows it to function well in high-volume speech recognition solutions, as the computing
demand can be split across multiple machines (this is known as load balancing, since the workload is balanced across several computers).
It also enables users to set up redundancy: if one Speech Engine server goes down in a multi-server environment, clients can switch
servers seamlessly.
The architecture is broken into four components:
- Speech Servers: The Speech Engine server is a program that performs the actual speech recognition.
It processes the incoming audio, compares a speaker's utterance to the phrases in the active grammars,
and returns the results of the audio decode to the speech client.
- Speech Clients: The LumenVox client is a piece of software that sits between the speech-enabled
application (such as a speech-enabled IVR) and the speech server. It passes the audio from the application to the
engine, and returns the decode information from the server back to the application.
- Server Monitor: The server monitor is a component of the client process that coordinates the
servers and the clients. When a client has audio it needs decoded, it asks the server monitor which speech server to use,
and the monitor answers with the server that is currently the least busy. The monitor continuously
watches the speech servers, so it can remove a server from its list of valid servers if that server goes down. If the server later
comes back online, the monitor detects this and begins sending clients to it once more.
- License Server: The License Server manages the pool of Speech Engine licenses. When a speech-enabled
application opens a new speech port, the speech client requests a license from the License Server. If there is an available
license, the License Server assigns it to that client until the speech port is closed and the client releases the license.
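The server monitor's selection and failover behavior can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not LumenVox code: the class names, the `active_decodes` load metric, and the `mark_down`/`mark_up` methods are all hypothetical, and the real monitor may measure "least busy" differently.

```python
from dataclasses import dataclass


@dataclass
class SpeechServer:
    """Hypothetical record the monitor keeps for one speech server."""
    host: str
    port: int
    active_decodes: int = 0  # assumed load metric; the real one is not documented here
    online: bool = True


class ServerMonitor:
    """Illustrative stand-in for the server monitor (not the LumenVox API)."""

    def __init__(self, servers):
        self.servers = list(servers)

    def mark_down(self, host):
        # A server that stops responding is dropped from the valid set.
        for server in self.servers:
            if server.host == host:
                server.online = False

    def mark_up(self, host):
        # A server that comes back online becomes eligible again.
        for server in self.servers:
            if server.host == host:
                server.online = True

    def pick_server(self):
        # Hand the client the least busy server among those still online.
        candidates = [s for s in self.servers if s.online]
        if not candidates:
            raise RuntimeError("no speech servers are online")
        return min(candidates, key=lambda s: s.active_decodes)
```

With two servers where the second carries less load, `pick_server()` returns the second; after `mark_down` on that host, it falls back to the first, which is the redundancy behavior described above. The addresses and port in any such setup are placeholders chosen for the example.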
By working together in the manner described above, the pieces of the LumenVox distributed architecture allow speech-enabled
applications to handle high volumes of speech recognition requests in a robust and fault-tolerant way.
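The License Server's role can be pictured as a fixed-size pool from which each open speech port holds one license. The sketch below is a minimal model of that idea, assuming one license per speech port; `LicensePool`, `acquire`, and `release` are hypothetical names, not part of any LumenVox interface.

```python
class LicensePool:
    """Hypothetical fixed-size license pool (not the LumenVox License Server)."""

    def __init__(self, total):
        self.total = total
        self.holders = set()  # IDs of speech ports currently holding a license

    @property
    def available(self):
        return self.total - len(self.holders)

    def acquire(self, port_id):
        # Grant a license only while the pool has one free.
        if port_id in self.holders:
            return True
        if self.available < 1:
            return False
        self.holders.add(port_id)
        return True

    def release(self, port_id):
        # Closing the speech port returns its license to the pool.
        self.holders.discard(port_id)
```

In this model a pool of two licenses serves two open ports, refuses a third, and can serve it once one of the first two ports is closed and its license released.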
To illustrate how the process works, here is a step-by-step example of how the pieces fit together.
- A caller calls into a speech-enabled call router and asks to speak with technical support.
- The call router application opens a speech port with the speech client.
- The client asks the License Server for a Speech Engine license.
- The License Server checks its license pool, sees an available license, and assigns it to the speech client.
- The client confirms that the speech port is open, and the call router application passes the call
audio and parameters to the speech client.
- The speech client asks the server monitor which server it should use.
- The server monitor, which has been tracking the status of the speech servers,
gives the speech client the IP address and port of an available speech server.
- The client sends the audio and parameters to the speech server.
- The speech server runs the audio and parameters through the Speech Engine and gets the results of
the decoded audio, which it passes back to the client.
- The client returns the results to the call router application, which is then able to transfer the
caller to the technical support department.
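The walkthrough above can be condensed into a single function that traces one call from license request to decode result. This is a simplified sketch, not the actual client logic: `route_call` and its parameters are hypothetical, the license count and server loads are passed in as plain values, the speech server is replaced by a caller-supplied `decode` function, and the speech port is assumed to close (releasing its license) as soon as the decode finishes.

```python
def route_call(audio, licenses_free, server_loads, decode):
    """Trace one call through the flow; returns (result, licenses_free).

    licenses_free: number of unused licenses (assumed License Server state)
    server_loads:  dict mapping "host:port" to current load, online servers only
    decode:        callable(server, audio) standing in for the speech server
    """
    # Opening a speech port requires a license from the License Server.
    if licenses_free < 1:
        raise RuntimeError("no Speech Engine license available")
    licenses_free -= 1
    try:
        # The client asks the server monitor for an available (least busy) server.
        if not server_loads:
            raise RuntimeError("no speech servers are online")
        server = min(server_loads, key=server_loads.get)
        # The client sends the audio to that server and receives the decode result.
        result = decode(server, audio)
    finally:
        # Closing the speech port releases the license back to the pool.
        licenses_free += 1
    return result, licenses_free
```

A caller would pass in the audio bytes, the current free-license count, a table of online servers, and a function that performs the decode; the returned result is what the call router would use to decide where to transfer the caller.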