Sizing LumenVox Deployments
When building speech applications, we are often asked, "How many ports or channels can I run on a given machine?"
In general, the number of simultaneous speech sessions that can be reliably run varies with the capacity of the machine and the type of speech recognition being performed. For instance, recognizing a single digit is computationally less "expensive" than recognizing a full street address from a complex grammar.
Grammar complexity also depends not only on the number of words in a grammar but on how the grammar rules themselves are structured. Rules that contain variable phrases or recursion, for example, are more computationally complex than rules containing only one or two words.
Many factors influence the overall performance of speech applications, and no two applications are alike, so no single standardized metric works for all applications.
The main contributors to the CPU and memory resources a speech application needs are the complexity of the grammars used, the number of concurrent calls, the ratio of speech recognition to prompt playback or DTMF processing within the application, the average call duration, and the average number of ASR operations within a call.
All of these factors affect the overall sizing calculation, so it is virtually impossible to be certain of the maximum number of channels a given system can run without testing the actual application and measuring system memory and CPU use under load.
We also understand that, from a design perspective, users need a "ballpark" idea of what a given machine can provide or, as is more often the case, what type of machine specification is needed to handle a certain type of application.
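As a rough illustration of how these factors combine, the sketch below estimates how many calls are likely to be inside a recognition at any moment. It is a back-of-the-envelope approximation only, not the method used by any LumenVox tool. The function name, its parameters, and the example figures are all hypothetical, and the average recognition length (avg_asr_seconds) is an assumed input that would have to be measured from the actual application.

```python
def estimate_concurrent_asr(concurrent_calls, avg_call_seconds,
                            asr_ops_per_call, avg_asr_seconds):
    """Estimate how many calls are performing ASR at the same moment.

    Hypothetical helper: assumes ASR operations are spread evenly over
    each call, which real traffic will only approximate.
    """
    if avg_call_seconds <= 0:
        raise ValueError("avg_call_seconds must be positive")
    # Fraction of a typical call spent in recognition (capped at 100%).
    asr_seconds_per_call = asr_ops_per_call * avg_asr_seconds
    asr_fraction = min(1.0, asr_seconds_per_call / avg_call_seconds)
    # Expected number of calls inside a recognition at any instant.
    return concurrent_calls * asr_fraction


if __name__ == "__main__":
    # Illustrative figures only: 200 calls, 3-minute calls,
    # 6 recognitions per call, about 3 seconds each.
    estimate = estimate_concurrent_asr(200, 180, 6, 3)
    print(f"Estimated simultaneous recognitions: {estimate:.1f}")
```

An estimate like this only narrows the starting point. Grammar complexity is not modeled at all, so load testing the actual application remains the way to confirm a figure.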
LumenVox Sizing Tools
To give users a reasonable idea of the performance to expect from certain machine specifications, we created an ASR Sizing Tool and a TTS Sizing Tool, which are freely available on our LVDN site. These tools provide some guidance; however, we always recommend testing against the production application to obtain actual performance metrics.
Please contact LumenVox for more information if you require specialized assistance with large or unusual server configurations, including high-performance or high-availability clusters.
Distributed and High-Availability Architecture
It is also important to remember that the LumenVox product architecture is designed to be distributed: several LumenVox servers can be configured into clusters, providing modular scalability and redundancy within designs if needed. LumenVox is also continuously developing new methods to make this scaling more efficient, so please be sure to review the latest versions of LumenVox products when sizing a new application.
When sizing a server for a speech application, bear in mind that speech applications are typically very CPU and memory intensive, so you should plan accordingly. These days, large multi-core servers with many gigabytes of memory are reasonably priced and fairly commonplace. We suggest that, whenever possible, you allow for extra CPU and memory resources above your projected needs to accommodate future application growth.
It does not cause problems if the hardware exceeds the calculated demands of your application. Indeed, any additional resources can be used to improve performance. This is a much better scenario than a machine with insufficient resources, which will struggle to keep up with the demands of the speech resources being used.
In addition to CPU and memory resources, LumenVox services typically rely heavily on network connectivity, so a well-configured network is also important.
System input/output performance, in the form of disk operations, also affects overall performance, especially when handling several hundred or several thousand simultaneous channels. For this reason, we recommend monitoring disk performance at these levels and, if necessary, reducing logging verbosity to improve throughput. Solid-state drives (SSDs) can also be used to improve performance if disk I/O becomes an issue.
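One simple way to get a feel for disk write behavior on a candidate machine is to time a series of small, synced writes. The sketch below uses only the Python standard library. It is a generic spot check, not a LumenVox utility, and the function name, sample count, and payload size are arbitrary assumptions chosen for illustration.

```python
import os
import statistics
import tempfile
import time


def sample_write_latency(directory=None, samples=20, payload_bytes=4096):
    """Return the latency in seconds of each small synced write.

    Hypothetical helper: writes to a temporary file in `directory`
    (the system temp location when None) and forces each write to disk.
    """
    payload = b"x" * payload_bytes
    latencies = []
    with tempfile.NamedTemporaryFile(dir=directory) as handle:
        for _ in range(samples):
            start = time.perf_counter()
            handle.write(payload)
            handle.flush()              # push Python's buffer to the OS
            os.fsync(handle.fileno())   # ask the OS to commit to disk
            latencies.append(time.perf_counter() - start)
    return latencies


if __name__ == "__main__":
    results = sample_write_latency()
    print(f"median write latency: {statistics.median(results) * 1000:.2f} ms")
    print(f"worst write latency:  {max(results) * 1000:.2f} ms")
```

Pass the directory where logs are written as the `directory` argument so that the same disk is measured. Running the check while the system is idle and again under load gives a rough indication of whether disk writes are becoming a bottleneck. The operating system's own performance monitoring tools will give a fuller picture.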
Important: Impact of Logging Verbosity
The default logging verbosity (1) is typically recommended when processing a large volume of simultaneous channels, to avoid I/O bandwidth issues within the host operating system. Higher levels of logging verbosity place more of a burden on the operating system's I/O subsystem and can result in latency if the system cannot handle the number of disk operations being undertaken.