Speech Recognizer

LumenVox Automated Speech Recognizer (ASR)

The LumenVox Automated Speech Recognizer (ASR) is a software solution that converts spoken audio into text, providing users with a more efficient means of input. Less time spent on data entry frees up resources that can be used more effectively elsewhere.

A Speech Recognizer compares spoken input to a list of phrases to be recognized, called a grammar. The grammar is used to constrain the search, enabling the ASR to return the text that represents the best match. This text is then used to drive the next steps of your speech-enabled application.
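To illustrate how a grammar constrains the result, here is a minimal Python sketch. It is not the LumenVox API: the `GRAMMAR` list and the `best_match` helper are hypothetical, and plain string similarity stands in for the acoustic comparison a real engine performs on audio.

```python
from difflib import SequenceMatcher

# Hypothetical grammar: the only phrases the recognizer may return.
GRAMMAR = ["sales", "technical support", "billing", "operator"]


def best_match(hypothesis, grammar):
    """Return (phrase, score) for the grammar phrase closest to the hypothesis.

    String similarity is an assumption made for illustration only;
    a real engine scores audio features, not text.
    """
    scored = [
        (phrase, SequenceMatcher(None, hypothesis.lower(), phrase).ratio())
        for phrase in grammar
    ]
    # The search never leaves the grammar, so the result is always a listed phrase.
    return max(scored, key=lambda pair: pair[1])


phrase, score = best_match("technical suport", GRAMMAR)
print(phrase, round(score, 2))
```

Because the candidates come only from the grammar, a noisy input is still resolved to one of the listed phrases, and the score gives the application a basis for accepting or rejecting the result.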

One of the most common uses for speech recognition is in Interactive Voice Response (IVR) systems, which allow computers and humans to communicate with one another through speech. If you have ever been prompted to "Say the name of the person you want to talk to...", you have used speech recognition on an IVR.

Our ASR is integrated with more than 25 voice platforms. Developers can write applications using our application programming interface (API) or a standards-based solution like Media Resource Control Protocol (MRCP). This makes installation, implementation, and deployment easy for users of those platforms.

The LumenVox Speech Recognizer also supports Natural Language Understanding (NLU) applications through development of Statistical Language Models (SLM). These advanced speech development techniques provide end users with a more natural speech interface to the ASR.
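As a rough illustration of what a statistical language model is, the Python sketch below estimates bigram probabilities from a few sample utterances. The training sentences and function names are invented for this example, and it says nothing about how LumenVox builds or trains its own models.

```python
from collections import Counter, defaultdict

# Made-up training utterances, for illustration only.
TRAINING = [
    "i want to check my balance",
    "i want to pay my bill",
    "i need to check my bill",
]


def train_bigrams(sentences):
    """Count how often each word follows another, with sentence boundary markers."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def probability(counts, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev); 0.0 if prev was never seen."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0


counts = train_bigrams(TRAINING)
print(probability(counts, "i", "want"))
print(probability(counts, "my", "bill"))
```

Unlike a fixed list of phrases, a model of this kind assigns a likelihood to any word sequence, which is what lets callers phrase a request in their own words.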

Available for 32- and 64-bit versions of Linux and Windows, the hardware-independent Speech Recognizer powers speech solutions and platforms deployed in Enterprise and SMB environments worldwide.

Languages Supported

  • American English
  • British English
  • Australian / New Zealand English
  • Indian English
  • Colombian / Latin American Spanish
  • Mexican / North American Spanish
  • Canadian French

The Speech Recognizer also supports the W3C's Speech Recognition Grammar Specification (SRGS) and Semantic Interpretation for Speech Recognition (SISR) recommendations, as well as Voice Extensible Markup Language (VXML). Developers who are familiar with industry standards like these will be able to write applications quickly and effectively.
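For readers new to these standards, the following Python sketch embeds a small SRGS XML grammar with SISR tags and lists its phrases using only the standard library. The rule name, phrases, and tag values are made up for illustration, and parsing with `xml.etree.ElementTree` is just a way to inspect the file here, not how a recognizer loads it.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"  # SRGS 1.0 XML namespace

# Illustrative grammar: rule id, phrases, and tag scripts are hypothetical.
GRAMMAR_XML = """<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" mode="voice" root="department"
         tag-format="semantics/1.0">
  <rule id="department" scope="public">
    <one-of>
      <item>sales <tag>out = "SALES";</tag></item>
      <item>technical support <tag>out = "SUPPORT";</tag></item>
      <item>billing <tag>out = "BILLING";</tag></item>
    </one-of>
  </rule>
</grammar>
"""


def list_phrases(xml_text):
    """Return (phrase, semantic script) pairs for every <item> in the grammar."""
    root = ET.fromstring(xml_text)
    pairs = []
    for item in root.iter(f"{{{SRGS_NS}}}item"):
        phrase = (item.text or "").strip()  # spoken words precede the <tag>
        tag = item.find(f"{{{SRGS_NS}}}tag")
        script = tag.text.strip() if tag is not None and tag.text else None
        pairs.append((phrase, script))
    return pairs


for phrase, script in list_phrases(GRAMMAR_XML):
    print(f"{phrase!r} -> {script}")
```

The `<item>` text is what the caller says, while the `<tag>` script is the SISR part: it defines the value handed back to the application, so several spoken variants can map to one result.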

The Speech Recognizer differs from other speech solutions in that it does not provide dictation or voice verification. Voice verification is often used in security solutions, where the software determines who you are, not what you are saying. Our technology is a speaker-independent solution: anyone can speak to our Recognizer, and it will match their spoken audio to phrases from the grammar.

How Speech Recognition Works

Speech engines use this process to figure out what a speaker said:

  1. The engine loads a list of words to be recognized. This list of words is called a grammar.
  2. Audio from a speaker is captured by a microphone or telephone. This audio is turned into a waveform, a mathematical representation of sound.
  3. The engine looks at features — distinct characteristics of sound — derived from the waveform and compares them with its own acoustic model. The engine searches its acoustic space, using the grammar to guide this search.
  4. It then determines which words in the grammar the audio most closely matches and returns a result.
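The four steps above can be sketched as a toy Python program. Everything in it is an assumption made for illustration: sine tones stand in for captured speech, two crude features (zero-crossing rate and RMS energy) stand in for real acoustic features, and `ACOUSTIC_MODEL` is a hypothetical table with one reference vector per phrase. Production engines use far richer features and statistical models.

```python
import math

SAMPLE_RATE = 8000  # samples per second, a common rate for telephone audio


def tone(freq_hz, seconds=0.25):
    """Step 2 stand-in: a sine wave plays the role of the captured waveform."""
    n = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


def features(waveform):
    """Step 3: derive two crude features, zero-crossing rate and RMS energy."""
    crossings = sum(
        1 for a, b in zip(waveform, waveform[1:]) if (a < 0) != (b < 0)
    )
    zcr = crossings / len(waveform)
    rms = math.sqrt(sum(s * s for s in waveform) / len(waveform))
    return (zcr, rms)


# Step 1: the grammar lists the phrases that may be recognized.
GRAMMAR = ["yes", "no", "operator"]

# Hypothetical acoustic model: one reference feature vector per known phrase.
ACOUSTIC_MODEL = {
    "yes": features(tone(300)),
    "no": features(tone(600)),
    "operator": features(tone(1200)),
    "goodbye": features(tone(2400)),  # known to the model, absent from GRAMMAR
}


def recognize(waveform, grammar):
    """Step 4: return the grammar phrase whose reference is nearest, with its distance."""
    observed = features(waveform)

    def distance(phrase):
        return math.dist(observed, ACOUSTIC_MODEL[phrase])

    best = min(grammar, key=distance)  # only grammar phrases are searched
    return best, distance(best)


result, dist = recognize(tone(620), GRAMMAR)
print(result, round(dist, 4))
```

Note that "goodbye" exists in the toy model but can never be returned while it is missing from the grammar, which is the sense in which the grammar guides the search.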

 

Extensive documentation on our Recognizer and its uses is available in our Resources section and our Help documentation, or directly from the Sales or Support teams.

The Speech Engine compares audio with the loaded grammars to produce recognized text.