Distributed Architecture



  • This video discusses the reasons why and the "how to's" of setting up a distributed architecture for speech recognition applications. Specifics include the Client-Server model, the Server Monitor, Licensing and more.
  • RUNTIME 8:11


Video Transcription

Distributed Architecture

What & Why

  • Distributed architecture is taking a single computing task and distributing it among multiple machines.
  • The primary reason for distributed architecture is redundancy. Let's say you have a critical system, such as a PBX, which handles your incoming calls. If the power supply in the PBX machine catches fire, you could lose all of your calls. In a distributed environment, when one machine fails, the others pick up the slack.
  • Load balancing is another good reason for distributed architecture. This allows you to split up tasks among multiple machines. If any one machine becomes overburdened, the others help out. The load is evenly distributed so no single computer becomes bogged down.
  • Distributed architecture is very scalable, which means it's easy to add more machines. So as the need grows, as you do more business, as traffic grows, you simply plug in more machines to the cluster of machines and the increased traffic is handled.

Client-Server Model

The LumenVox Speech Engine works well in distributed environments because internally we use a client-server model. An installation of the LumenVox Speech Engine has two parts.

  • The first part is the speech client. This is the part that talks to your application. You may have a platform that runs your IVRs, or a PBX, or any other application that communicates with the Speech Engine using the speech client. This communication could be by way of MRCP or an API integration. Audio and grammars are sent, and the LumenVox Speech Engine sends back decodes with the actual text of what a user said.
  • Internally, the speech client takes the audio and passes it to a speech server. The server application is what actually performs the decode. The client and the server speak to each other over standard TCP/IP connections. This matters because an API integration between your platform and the LumenVox engine cannot traverse a network; the API is used on the local machine only. The LumenVox client and server, by contrast, communicate over TCP/IP, so they can talk to each other across a network.
  • We can have the speech client and the speech server on the same machine if we choose to, but we can also put the speech server on a separate machine and the client and server are still able to talk to one another. The advantage of doing this is that it allows us to balance the load. Speech recognition can be very resource intensive: it can require a good deal of memory, use many CPUs, and so on. You may not want all of this happening on your PBX or IVR server, so you could install just the client on the PBX or IVR server and have the speech server perform decodes on a separate machine, which could be dedicated to speech recognition if you choose.
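
The client-server split described above can be illustrated with a minimal Python sketch. This is not LumenVox code or its wire protocol: the `DecodeHandler` class, the `start_speech_server` and `request_decode` functions, and the one-line text format are all assumptions made for illustration. It only shows the idea that a client on one machine can hand work to a server over a TCP connection.

```python
import socket
import socketserver
import threading


class DecodeHandler(socketserver.StreamRequestHandler):
    """Hypothetical speech server: reads one line of 'audio', replies with a fake decode."""

    def handle(self):
        audio = self.rfile.readline().strip()
        # A real engine would run recognition here; this placeholder just upper-cases the input.
        self.wfile.write(b"DECODE:" + audio.upper() + b"\n")


def start_speech_server(host="127.0.0.1", port=0):
    """Start the stand-in server on a background thread (port 0 lets the OS pick a free port)."""
    server = socketserver.ThreadingTCPServer((host, port), DecodeHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request_decode(address, audio: bytes) -> str:
    """Hypothetical speech client: send audio to the server over TCP and return its reply."""
    with socket.create_connection(address, timeout=5) as sock:
        sock.sendall(audio + b"\n")
        with sock.makefile("rb") as reply:
            return reply.readline().decode().strip()


if __name__ == "__main__":
    server = start_speech_server()
    print(request_decode(server.server_address, b"yes"))
    server.shutdown()
    server.server_close()
```

Because the client only needs an address to connect to, the same `request_decode` call would work whether the server runs on the same machine or on a separate one.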

Server Monitor

  • For all of this to work, the speech client contains a small routine called the server monitor. Your client application supplies the monitor with the locations of all of the speech servers so that it can routinely check each of them. Say you have three speech servers, A, B, and C, and the client is set up on the PBX with the three servers elsewhere. The server monitor will constantly be communicating with all three servers. If B suddenly becomes disabled, the server monitor will realize this and will no longer send decodes to server B; it will simply use A and C. However, if server B comes back online, the monitor will see this and will begin sending decodes to it once again. This is good for redundancy and failover.
  • The server monitor also keeps track of how busy the servers are. So if server A suddenly becomes very busy, the request for a decode would be sent to B or C. New decode requests are sent to the server that is least busy. So LumenVox automatically performs a type of load balancing if you have multiple servers set up.
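
The behavior described above (skip servers that are down, prefer the least busy one) can be sketched in a few lines of Python. The `ServerMonitor` and `ServerStatus` names, and the use of an active-decode count as the measure of "busy", are assumptions for illustration; the transcript does not say how the real monitor measures load or polls servers.

```python
from dataclasses import dataclass


@dataclass
class ServerStatus:
    name: str
    online: bool = True
    active_decodes: int = 0  # assumed load metric, not taken from the source


class ServerMonitor:
    """Hypothetical monitor: tracks server status and picks a target for each decode."""

    def __init__(self, names):
        self.servers = {name: ServerStatus(name) for name in names}

    def update(self, name, online, active_decodes=0):
        """Record the result of the latest status check for one server."""
        status = self.servers[name]
        status.online = online
        status.active_decodes = active_decodes

    def pick_server(self):
        """Return the name of the least busy server that is currently online."""
        candidates = [s for s in self.servers.values() if s.online]
        if not candidates:
            raise RuntimeError("no speech servers available")
        return min(candidates, key=lambda s: s.active_decodes).name


if __name__ == "__main__":
    monitor = ServerMonitor(["A", "B", "C"])
    monitor.update("A", online=True, active_decodes=5)
    monitor.update("B", online=False)
    monitor.update("C", online=True, active_decodes=2)
    print(monitor.pick_server())  # B is offline and A is busier, so this prints C
```

When server B is reported online again through `update`, it becomes a candidate again on the next call to `pick_server`, which mirrors the failover and recovery behavior described for servers A, B, and C.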


Licensing

  • The final piece that allows LumenVox to work in a distributed environment is our licensing, which is very convenient for this purpose. LumenVox is not concerned with how many installations of the engine and the client you have, at least from a licensing perspective. We don't have per-seat or per-server licenses. LumenVox licenses are all per decode: a license is used when a speech port is opened to perform a decode. You can install the speech engine and client on as many machines as you want, but they will not take up a license until decodes are done. When the client needs a license for a decode, it asks a license server. Licenses can be installed on the same machine as the client or on a separate machine. So you could have ten separate speech clients communicating with twenty separate speech servers, all speaking to one license server located on yet another machine.
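
A per-decode license model like the one described above can be pictured as a shared counter that is touched only while a decode is in progress. The following Python sketch uses hypothetical names (`LicenseServer`, `decode_license`) and assumes the license is returned when the decode finishes; the transcript only states when a license is taken, so the release step is an assumption.

```python
import threading
from contextlib import contextmanager


class LicenseServer:
    """Hypothetical license pool shared by any number of clients."""

    def __init__(self, total):
        self._total = total
        self._in_use = 0
        self._lock = threading.Lock()

    def acquire(self) -> bool:
        """Take one license if any is free; return False otherwise."""
        with self._lock:
            if self._in_use >= self._total:
                return False
            self._in_use += 1
            return True

    def release(self):
        """Return one license to the pool."""
        with self._lock:
            if self._in_use > 0:
                self._in_use -= 1

    @property
    def in_use(self):
        with self._lock:
            return self._in_use


@contextmanager
def decode_license(server):
    """Hold a license only for the duration of one decode (release is an assumption)."""
    if not server.acquire():
        raise RuntimeError("no license available")
    try:
        yield
    finally:
        server.release()


if __name__ == "__main__":
    licenses = LicenseServer(total=2)
    print(licenses.in_use)      # installing clients uses no licenses
    with decode_license(licenses):
        print(licenses.in_use)  # one license held while a decode runs
    print(licenses.in_use)
```

The number of clients and speech servers never appears in this sketch, which is the point of the model: only concurrent decodes count against the pool.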

The following is a graphic representation of the above:

In this graphic, one of our customers, Ontelnet, has two PBX servers that take incoming SIP traffic. The servers are set up in a cluster and perform load balancing. Since the PBX is integrated with LumenVox, our client application is on the systems that the PBXs are running on. However, the PBXs are very busy, and the customer does not want to do speech decodes on the PBX servers because they are handling so many calls. What they have chosen to do is move the speech decodes onto different servers. As you can see, there are multiple speech servers set up, so when there is a request for speech recognition, that request goes to the most appropriate speech server because, as we discussed, the server monitor on each client has a list of all the speech servers and their status. Also note that there is a license server that all the clients communicate with, and a backup license server that can be used if the primary license server fails for whatever reason.
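
The primary and backup license servers in this setup suggest a simple failover rule: ask the primary, and turn to the backup only if the primary cannot be reached. Here is a minimal Python sketch of that rule. The `acquire_with_failover` function and the use of `ConnectionError` to signal an unreachable server are assumptions for illustration, not a description of how LumenVox implements it.

```python
def acquire_with_failover(servers):
    """Ask license servers in order; fall through only when a server is unreachable.

    `servers` is a list of (name, acquire) pairs, where `acquire` is a callable
    that returns True or False, or raises ConnectionError if the server is down.
    Returns the name of the server that granted a license, or None.
    """
    for name, acquire in servers:
        try:
            granted = acquire()
        except ConnectionError:
            continue  # this server is unreachable, so try the next one
        # A reachable server's answer is final, even if it has no free license.
        return name if granted else None
    return None


if __name__ == "__main__":
    def primary_down():
        raise ConnectionError("primary license server unreachable")

    def backup_ok():
        return True

    print(acquire_with_failover([("primary", primary_down), ("backup", backup_ok)]))
```

In this sketch, a reachable primary that has run out of licenses does not trigger the backup, since the backup is meant to cover outages rather than add capacity.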

© 2018 LumenVox, LLC. All rights reserved.