
Speech Engine


NBest Results

Instead of returning only the top-scoring result, you can instruct the speech engine to return several of the highest-scoring, most likely answers, often called NBest results. Returning NBest results is particularly effective when callers need to spell names, street addresses, or e-mail addresses. Without NBest results, if a caller spells a name beginning with "N" but the engine returns a low confidence score, the caller is asked to repeat the letter. Given how similar "N" sounds to "M," the second answer is likely to receive a similarly low confidence score. With NBest results, the system can prompt the caller using several of the likely results, such as "Did you mean 'M,' as in 'Mary'?" If the caller responds "No," the system moves to its next option: "Perhaps you meant 'N,' as in 'Nancy'?"

Returning NBest results improves the caller's experience: instead of asking the caller to simply repeat an answer that received a low confidence score, the speech recognition system can confirm the caller's intention using several likely choices.
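The confirmation flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the LumenVox API: the `Hypothesis` class, the `confirm_from_nbest` function, and the `ask_yes_no` callback are hypothetical names, and the confidence values are assumed to lie between 0.0 and 1.0.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Hypothesis:
    """One entry in an NBest list (hypothetical structure)."""
    text: str
    confidence: float  # assumed range: 0.0 to 1.0


def confirm_from_nbest(
    nbest: List[Hypothesis],
    ask_yes_no: Callable[[str], bool],
    max_prompts: int = 3,
) -> Optional[str]:
    """Offer the most likely answers one at a time until the caller says yes.

    Returns the confirmed text, or None if every offered choice was rejected.
    """
    # Try the highest-confidence hypotheses first.
    ranked = sorted(nbest, key=lambda h: h.confidence, reverse=True)
    for hyp in ranked[:max_prompts]:
        if ask_yes_no(f"Did you mean '{hyp.text}'?"):
            return hyp.text
    return None
```

In a real application, `ask_yes_no` would play a prompt and run a yes/no recognition; here it is any function that takes the prompt text and returns `True` or `False`. If the function returns `None`, the application can fall back to asking the caller to repeat the answer.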

Server-Side Grammar

LumenVox offers more efficient support for large speech recognition grammars by allowing clients to pre-load grammars onto the server. The client sends the grammar once, before issuing any decode requests that use it.

Typically, the grammar itself accompanies each decode request. For large grammars, sending the grammar to the server once before decoding is more efficient because it reduces network traffic.
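The saving can be illustrated with a toy model that only counts the bytes a server would receive under each approach. `ToyRecognizerServer` and its methods are hypothetical names invented for this sketch; they do not correspond to the LumenVox client API, and no recognition is performed.

```python
class ToyRecognizerServer:
    """Counts bytes received; stands in for a recognition server (hypothetical)."""

    def __init__(self) -> None:
        self._grammars = {}       # grammar_id -> grammar text
        self.bytes_received = 0

    def load_grammar(self, grammar_id: str, grammar_text: str) -> None:
        # Pre-load: the full grammar crosses the network once.
        self.bytes_received += len(grammar_text.encode("utf-8"))
        self._grammars[grammar_id] = grammar_text

    def decode_inline(self, audio: bytes, grammar_text: str) -> None:
        # The grammar accompanies every decode request.
        self.bytes_received += len(audio) + len(grammar_text.encode("utf-8"))

    def decode_preloaded(self, audio: bytes, grammar_id: str) -> None:
        # Only the audio and a short identifier are sent.
        if grammar_id not in self._grammars:
            raise KeyError(f"grammar {grammar_id!r} has not been loaded")
        self.bytes_received += len(audio) + len(grammar_id.encode("utf-8"))


def compare_traffic(grammar_text: str, audio: bytes, requests: int):
    """Return (inline_bytes, preloaded_bytes) for the given number of decodes."""
    inline = ToyRecognizerServer()
    for _ in range(requests):
        inline.decode_inline(audio, grammar_text)

    preloaded = ToyRecognizerServer()
    preloaded.load_grammar("names", grammar_text)
    for _ in range(requests):
        preloaded.decode_preloaded(audio, "names")

    return inline.bytes_received, preloaded.bytes_received
```

With the inline approach the traffic grows by the full grammar size on every request, while the pre-loaded approach pays that cost once; the larger the grammar and the more requests that reuse it, the bigger the difference.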

Voice Activity Detection

Voice Activity Detection (VAD), also referred to as barge-in and/or End-Of-Speech (EOS) detection, detects when a person begins speaking, finishes speaking, or pauses while speaking.

LumenVox's VAD implementation delivers high performance despite challenging conditions: hisses, pops, abrupt changes in background noise, telephone line echo, and squawks from two-way radio communication.

The Voice Activity Detection module is highly configurable and can be adapted to work equally well within telephone, VoIP, or microphone-based applications.
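To make the idea of detecting the start and end of speech concrete, here is a minimal energy-threshold sketch. It is a textbook-style illustration and not LumenVox's VAD implementation; the function name `detect_speech` and the parameters `threshold`, `min_speech_frames`, and `hangover_frames` are assumptions chosen for this example. The input is assumed to be a list of per-frame energy values.

```python
from typing import Optional, Sequence, Tuple


def detect_speech(
    energies: Sequence[float],
    threshold: float,
    min_speech_frames: int = 3,
    hangover_frames: int = 5,
) -> Optional[Tuple[int, int]]:
    """Return (start, end) frame indices of the first utterance, or None.

    Speech starts after `min_speech_frames` consecutive frames at or above
    `threshold`, so a single pop does not trigger detection. Speech ends after
    `hangover_frames` consecutive quiet frames, so a short pause does not cut
    the utterance.
    """
    start = None        # index of the first frame of the utterance
    last_voiced = None  # index of the most recent loud frame
    run = 0             # consecutive loud frames before speech has started
    silence = 0         # consecutive quiet frames after speech has started

    for i, energy in enumerate(energies):
        voiced = energy >= threshold
        if start is None:
            run = run + 1 if voiced else 0
            if run >= min_speech_frames:
                start = i - run + 1
                last_voiced = i
        elif voiced:
            silence = 0
            last_voiced = i
        else:
            silence += 1
            if silence >= hangover_frames:
                return (start, last_voiced)

    # Audio ended while the caller was still speaking (or never spoke).
    return (start, last_voiced) if start is not None else None
```

The two frame-count parameters are the kind of settings a configurable detector might expose: a telephone application could tolerate longer pauses than a push-to-talk microphone application, for example. A fixed energy threshold is also the weakest part of this sketch, since it does not adapt to changing background noise.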

Noise Reduction Module

The waveforms below demonstrate the power of LumenVox's Noise Reduction Module. In the original audio, a truck driver is speaking on a cell phone while driving. In addition to noise from the truck engine and blowing wind, another vehicle engine starts in the middle of the recording.

Noise degrades the performance of any speech recognition system. Quality noise reduction improves the accuracy of Voice Activity Detection and Core Recognition, both essential parts of a speech recognition system.

To improve application robustness in noisy environments, LumenVox has built a Noise Reduction Module (NRM) into our Speech Recognition Engine. The NRM automatically adapts to the acoustic environment and dynamically updates its estimate of noise levels. This adaptive algorithm enables the NRM to reduce the effects of noise.
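The general idea of a dynamically updated noise estimate can be sketched as follows. This is a generic exponential-smoothing illustration and should not be read as the algorithm used inside the NRM, which the text above does not describe; the function name `track_noise_floor` and the parameters `alpha` and `speech_ratio` are assumptions made for the example. The input is assumed to be a list of per-frame energy values.

```python
from typing import List, Sequence, Tuple


def track_noise_floor(
    energies: Sequence[float],
    alpha: float = 0.9,
    speech_ratio: float = 2.0,
) -> Tuple[List[float], float]:
    """Track a running noise estimate and subtract it from each frame.

    Frames whose energy is below `speech_ratio` times the current estimate are
    treated as noise and nudge the estimate toward their level; louder frames
    are treated as speech and leave the estimate unchanged.

    Returns (cleaned_energies, final_noise_estimate).
    """
    if not energies:
        return [], 0.0

    noise = float(energies[0])  # assume the recording starts with noise only
    cleaned = []
    for energy in energies:
        if energy < speech_ratio * noise:
            # Exponential smoothing: move a fraction (1 - alpha) toward the frame.
            noise += (1.0 - alpha) * (energy - noise)
        # Remove the estimated noise, never going below zero.
        cleaned.append(max(energy - noise, 0.0))
    return cleaned, noise
```

A larger `alpha` makes the estimate change more slowly. One limitation of this simple version is that an abrupt, sustained jump in noise above `speech_ratio` times the estimate is mistaken for speech, so the estimate stops adapting; handling cases like a second engine starting mid-recording needs a more robust scheme than this sketch provides.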

Technical Requirements

Windows
  • Windows NT 4.0 with Service Pack 6a
  • Windows 2000
  • Windows XP Pro
  • Windows 2003
  • Intel Pentium 800 MHz or greater
  • 1 GB RAM
  • 200 MB HD (1-n GB for logs)
Linux
  • Red Hat Enterprise Server, Fedora Core, rPath Linux, or Debian
  • Intel Pentium 800 MHz or greater
  • 1 GB RAM
  • 200 MB HD (1-n GB for logs)
Please note that at this time we do not support any 64-bit operating systems.