Speech Recognizer More Information

Operating System

  • Windows
    • Windows 7 (32 & 64-bit)
    • Windows 8 (32 & 64-bit)
    • Windows Server 2008 (32 & 64-bit)
    • Windows Server 2012* (64-bit only)

      *Windows Server 2012 requires a 64-bit processor but can run both 32-bit and 64-bit applications.
  • Linux
    • Red Hat & CentOS 5 (32 & 64-bit)
    • Red Hat & CentOS 6 (64-bit only)
    • Red Hat & CentOS 7 (64-bit only)

Distributed Client/Server Architecture

The LumenVox suite of solutions has always been designed around a client/server architecture. Most speech applications place a heavy load on the processor and cannot afford service outages caused by hardware failures. The client/server design lets administrators grow their speech environments with the businesses they support, provide stability through redundant installations, and achieve higher performance through load balancing.
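
As an illustration of how redundancy and load balancing can work together on the client side, here is a minimal Python sketch. It is not the LumenVox client API: the class `RoundRobinClient`, the `fake_send` transport, and the server names are all hypothetical and exist only for this example. Requests rotate across the configured servers, and a server that refuses a connection is skipped in favor of the next one.

```python
from itertools import cycle
from typing import Callable, Iterable


class AllServersDown(Exception):
    """Raised when every configured server fails for a request."""


class RoundRobinClient:
    """Spreads requests across servers and skips any that fail."""

    def __init__(self, servers: Iterable[str], send: Callable[[str, bytes], str]):
        self._servers = list(servers)
        if not self._servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(self._servers)
        self._send = send  # hypothetical transport: (server, audio) -> result

    def recognize(self, audio: bytes) -> str:
        # Try each server at most once, starting from the next in rotation.
        for _ in range(len(self._servers)):
            server = next(self._rotation)
            try:
                return self._send(server, audio)
            except ConnectionError:
                continue  # failover: move on to the next server
        raise AllServersDown("no speech server accepted the request")


def fake_send(server: str, audio: bytes) -> str:
    """Stand-in transport for the sketch; 'asr-2' simulates a dead host."""
    if server == "asr-2":
        raise ConnectionError(server)
    return f"handled by {server}"


client = RoundRobinClient(["asr-1", "asr-2", "asr-3"], fake_send)
results = [client.recognize(b"audio") for _ in range(4)]
print(results)
```

In this run the failed host is never reported to the caller; its share of the requests simply lands on the next server in the rotation.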

Server-Side Speech Grammars

LumenVox supports large speech recognition grammars efficiently through server-side and client-side grammar caching. With server-side speech grammars, a large grammar is compiled once and saved, so it can be loaded again quickly without the processor and memory cost of recompiling it.
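
The compile-once idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the LumenVox server stores grammars: `GrammarCache` and `toy_compile` are hypothetical names, the cache lives in memory rather than on disk, and the "compiler" merely collects phrases into a set. The grammar text is hashed, and the compiled form is reused whenever the same text is requested again.

```python
import hashlib
from typing import Callable, Dict


class GrammarCache:
    """Compiles each distinct grammar once and reuses the result."""

    def __init__(self, compile_fn: Callable[[str], object]):
        self._compile = compile_fn          # assumed expensive compile step
        self._compiled: Dict[str, object] = {}
        self.compile_count = 0              # how many real compiles happened

    def load(self, grammar_text: str) -> object:
        # Key on the grammar's content so identical text hits the cache.
        key = hashlib.sha256(grammar_text.encode("utf-8")).hexdigest()
        if key not in self._compiled:
            self._compiled[key] = self._compile(grammar_text)
            self.compile_count += 1
        return self._compiled[key]


def toy_compile(grammar_text: str) -> frozenset:
    """Stand-in compiler: treats each non-empty line as one phrase."""
    return frozenset(
        line.strip() for line in grammar_text.splitlines() if line.strip()
    )


cache = GrammarCache(toy_compile)
grammar = "yes\nno\nmaybe\n"
first = cache.load(grammar)    # compiles
second = cache.load(grammar)   # served from the cache
print(cache.compile_count)
```

Keying on a content hash rather than a file name means an edited grammar is recompiled automatically, while an unchanged one is not.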

Voice Activity Detection

In our mobile society, we rarely make calls while sitting in a quiet room. Whether the call is coming from a crowded restaurant or from inside a speeding car, separating speech from background noise is a tricky task for speech recognition software.

The LumenVox Speech Recognizer uses a technology called Voice Activity Detection (VAD) to distinguish between actual speech and other sounds. Human speech has qualities that distinguish it from other sounds, and VAD listens to the incoming audio for them. These qualities include:

  • Energy Level (volume)
  • Frequency (pitch)
  • Changes in frequency
  • Duration
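
To make these qualities concrete, the following Python sketch implements a deliberately simple detector. It is not the LumenVox VAD algorithm, whose internals are not described here. It assumes audio samples normalized to the range -1.0 to 1.0, uses the zero-crossing rate as a rough stand-in for frequency, ignores changes in frequency entirely, and uses illustrative thresholds (`energy_min`, `zcr_max`, `min_frames`) chosen only for this example.

```python
import math
from typing import List, Sequence


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude: a simple measure of volume."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs that change sign (rough pitch cue)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)


def detect_speech(samples: Sequence[float], frame_len: int = 160,
                  energy_min: float = 0.01, zcr_max: float = 0.25,
                  min_frames: int = 3) -> List[bool]:
    """Return one flag per frame: True where the audio looks like speech."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    # Energy and frequency checks, frame by frame.
    raw = [frame_energy(f) >= energy_min and zero_crossing_rate(f) <= zcr_max
           for f in frames]
    # Duration check: keep only runs of at least min_frames active frames.
    flags = [False] * len(raw)
    start = None
    for i, active in enumerate(raw + [False]):  # sentinel closes a final run
        if active and start is None:
            start = i
        elif not active and start is not None:
            if i - start >= min_frames:
                for j in range(start, i):
                    flags[j] = True
            start = None
    return flags


def tone(n_frames: int, freq: float = 200.0, rate: int = 8000) -> List[float]:
    """Synthetic 'voice': a sine tone, n_frames frames of 160 samples."""
    return [0.5 * math.sin(2 * math.pi * freq * i / rate)
            for i in range(n_frames * 160)]


def silence(n_frames: int) -> List[float]:
    return [0.0] * (n_frames * 160)


# Silence, a sustained tone, then a one-frame blip that is too short to count.
samples = silence(5) + tone(5) + silence(2) + tone(1) + silence(2)
flags = detect_speech(samples)
print(flags)
```

The one-frame blip passes the energy and frequency checks but is rejected by the duration check, which is the kind of brief noise a duration rule is meant to filter out.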

NBest Results

Instead of returning only the top-scoring result, you can instruct the Speech Recognizer to return several of the highest-scoring, most likely answers, often called NBest results. Returning NBest results is particularly effective when callers need to spell names, street addresses, or email addresses.

Without NBest results, if a caller spells a name beginning with "N" but the recognizer returns a low confidence score, the caller is asked to repeat the letter. Given how similar "N" sounds to "M," the second attempt is likely to score just as low. With NBest results, the system can instead prompt the caller with several of the likely results, such as "Did you mean 'M,' as in 'Mary'?" When the caller responds "No," the system moves to its next option: "Perhaps you meant 'N,' as in 'Nancy'?"
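
The confirmation dialog described above can be sketched as a short Python function. This is an illustration only: the `(hypothesis, confidence)` list format, the function `confirm_from_nbest`, and the `ask` callback are assumptions made for this example and do not reflect the result format returned by the Speech Recognizer. The function walks the candidates from most to least likely and stops at the first one the caller confirms.

```python
from typing import Callable, List, Optional, Tuple

# Assumed shape of an NBest list: (hypothesis, confidence) pairs.
NBest = List[Tuple[str, float]]


def confirm_from_nbest(nbest: NBest, ask: Callable[[str], bool],
                       max_tries: int = 3) -> Optional[str]:
    """Offer the most likely hypotheses in turn until the caller says yes."""
    ranked = sorted(nbest, key=lambda item: item[1], reverse=True)
    for text, _score in ranked[:max_tries]:
        if ask(f"Did you mean '{text}'?"):
            return text
    return None  # nothing confirmed; the application should re-prompt


# Simulated caller who actually said "N" and answers each prompt truthfully.
asked: List[str] = []


def caller_who_said_n(prompt: str) -> bool:
    asked.append(prompt)
    return "'N'" in prompt


nbest = [("M", 0.48), ("N", 0.45), ("F", 0.05)]
chosen = confirm_from_nbest(nbest, caller_who_said_n)
print(chosen, asked)
```

Capping the number of prompts with `max_tries` keeps the caller from being walked through a long list of unlikely candidates.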

© 2016 LumenVox, LLC. All rights reserved.