Speech Recognizer Sizing

Sizing LumenVox Deployments

When building speech applications we're often asked, "How many ports or channels can I run on a given machine?"

In general, the number of simultaneous speech sessions that can be reliably run varies with the capacity of the machine and the type of speech recognition being performed. For instance, recognizing a single digit is computationally less "expensive" than recognizing a full street address from a complex grammar.

At LumenVox, we estimate the number of ports or channels a machine can support by designing an application that mirrors the most common speech application and running it on a variety of hardware configurations. Our sample application simulates a user calling into an IVR that performs both speech and DTMF interactions. The call progresses through a series of interactions, including digits, currencies, date/time, menu commands and mixed grammars.

With specialized testing tools, we generate large numbers of simultaneous calls and measure the performance of the LumenVox speech software. The resulting numbers serve as guidelines for developers building standard speech IVRs. The tests were run on Windows and Linux configurations, with the following results:

Low resource machine: 85 concurrent sessions
High resource machine: 360 concurrent sessions

These results are intended as a guide. Your application may allow for significantly higher or lower density.
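
One way to compare the two results is to normalize them by CPU core count. The sketch below is illustrative only: it uses the session counts above and the core counts listed under Test Environment, and the names `RESULTS`, `sessions_per_core` and `density_summary` are hypothetical, not part of any LumenVox tool.

```python
# Measured results from the tests described in this document.
RESULTS = {
    "low resource machine": {"sessions": 85, "cores": 4},
    "high resource machine": {"sessions": 360, "cores": 8},
}


def sessions_per_core(sessions: int, cores: int) -> float:
    """Return concurrent sessions divided by the number of CPU cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return sessions / cores


def density_summary(results: dict) -> dict:
    """Map each machine name to its sessions-per-core figure."""
    return {
        name: sessions_per_core(r["sessions"], r["cores"])
        for name, r in results.items()
    }


if __name__ == "__main__":
    for name, density in density_summary(RESULTS).items():
        print(f"{name}: {density:.2f} sessions per core")
```

With the figures above this gives 21.25 sessions per core on the low resource machine and 45.00 on the high resource machine. Density roughly doubles per core, so core count alone does not explain the difference. Treat the ratio as a description of these two tests, not as a formula for predicting other hardware.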

Test Environment

The test environment configuration:

Low resource machine
Intel Xeon CPU, 2.4 GHz (), total of 4 CPU cores
2.0 GB of RAM
32-bit operating system (Windows XP Professional SP2 32-bit and CentOS 5.4 Linux 32-bit)

High resource machine
2x Intel Xeon CPU, 2.33 GHz (E5410), quad-core, for a total of 8 CPU cores
8.0 GB of RAM
64-bit operating system (Windows Server 2008 R2 64-bit and CentOS 5.4 Linux 64-bit)

Test configuration

  • LumenVox Version 10.2
  • LumenVox stack (Media Server, Speech Engine, License Server) installed on the target machine
  • Calls driven by a separate machine on the same local area network as the LumenVox server
  • When a call finishes, the system waits 10 seconds before sending the next call

NOTE: To reach the higher session counts seen in the test, some default LumenVox settings were changed. For example, the Media Server is configured by default to use at most 200 ports, so this limit must be raised to run 360 or more simultaneous sessions.

Please contact LumenVox for more information if you plan to run sessions at higher density.

Test criteria

The LumenVox test seeks to measure the performance of the system in the same way a person would when calling into an IVR. We therefore did not measure hardware load directly; instead, we verified that the system responded as expected. Our criteria for a successful test are:

  • The system must achieve a success rate greater than 95% for all events
  • An "event" is a speech recognition, a DTMF interaction, or a new call
  • The average time to initiate a new session must be less than 1 second
  • For a speech or DTMF interaction to be successful, LumenVox must return the correct response within 5 seconds of the end of speech/DTMF
  • Interactions are split 70% speech and 30% DTMF
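
The criteria above can be sketched as a simple pass/fail check over recorded events. This is a minimal illustration, not the LumenVox test tool: the `Event` record, its field names and the function names are hypothetical, and it assumes the 95% threshold applies to all events taken together and that a new call counts as successful when the session is established.

```python
from dataclasses import dataclass

MIN_SUCCESS_RATE = 0.95       # success rate must be greater than 95%
MAX_AVG_SETUP_SECONDS = 1.0   # average session initiation must be under 1 second
MAX_RESPONSE_SECONDS = 5.0    # response due within 5 seconds of end of speech/DTMF


@dataclass
class Event:
    kind: str               # "speech", "dtmf", or "new_call"
    correct: bool           # expected result returned, or call established
    latency_seconds: float  # response time, or setup time for a new call


def event_succeeded(event: Event) -> bool:
    """Apply the per-event success rule."""
    if event.kind == "new_call":
        # Assumption: a new call succeeds if the session was established.
        return event.correct
    return event.correct and event.latency_seconds <= MAX_RESPONSE_SECONDS


def run_meets_criteria(events: list) -> bool:
    """Return True if the recorded events satisfy both thresholds."""
    if not events:
        return False
    success_rate = sum(event_succeeded(e) for e in events) / len(events)
    setups = [e.latency_seconds for e in events if e.kind == "new_call"]
    if not setups:
        return False  # no new calls recorded, so setup time cannot be checked
    avg_setup = sum(setups) / len(setups)
    return success_rate > MIN_SUCCESS_RATE and avg_setup < MAX_AVG_SETUP_SECONDS
```

A run passes only if both thresholds hold, so a system that recognizes every utterance correctly but takes longer than 1 second on average to start a session still fails.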

© 2016 LumenVox, LLC. All rights reserved.