Automatic Speech Recognition (ASR)

Powering Speech Enabled Solutions

Integrated in 25+ Voice Platforms

Developers can write applications using our application programming interface (API) or a standards-based solution such as the Media Resource Control Protocol (MRCP).


Accurately Detect Speech

Whether the call is coming from a noisy restaurant or a speeding car, the LumenVox Speech Recognizer separates speech from background noise using Voice Activity Detection.
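To illustrate the general idea of voice activity detection, here is a minimal sketch of an energy-based detector. It is not the LumenVox algorithm, which is not described here. The function names, frame size, and threshold are assumptions chosen for the example, and the audio is a synthetic signal rather than a real call.

```python
import math
import random


def frame_energies(samples, frame_size):
    """Return the mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energies.append(sum(s * s for s in frame) / frame_size)
    return energies


def detect_speech_frames(samples, frame_size=160, threshold_ratio=4.0, noise_frames=5):
    """Flag frames whose energy exceeds a multiple of the estimated noise floor.

    Assumption: the first `noise_frames` frames contain background noise only,
    so their average energy can serve as the noise floor.
    """
    energies = frame_energies(samples, frame_size)
    noise_floor = sum(energies[:noise_frames]) / noise_frames
    return [energy > threshold_ratio * noise_floor for energy in energies]


def make_test_signal(sample_rate=8000):
    """Build 0.1 s of quiet noise, 0.1 s of a loud tone, then 0.1 s of quiet noise."""
    rng = random.Random(0)  # fixed seed so the example is repeatable
    n = sample_rate // 10

    def quiet():
        return [rng.uniform(-0.05, 0.05) for _ in range(n)]

    tone = [0.8 * math.sin(2 * math.pi * 300 * t / sample_rate) for t in range(n)]
    return quiet() + tone + quiet()


if __name__ == "__main__":
    # Each flag covers one 20 ms frame at 8 kHz (160 samples).
    print(detect_speech_frames(make_test_signal()))
```

Production detectors typically adapt the noise floor over time and use richer features than raw energy; the fixed threshold here only keeps the sketch short.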

Hardware Independent

Available for 32- and 64-bit versions of Linux and Windows, the LumenVox Speech Recognizer powers speech solutions and platforms deployed in enterprise and SMB environments worldwide.

Develop Innovative & Dynamic Speech Enabled Solutions

The LumenVox Automatic Speech Recognition (ASR) engine converts spoken audio into text, giving users a more efficient means of input. Less time spent on data entry frees up resources for other work.

An ASR engine compares spoken input to a list of phrases to be recognized, called a grammar. The grammar is used to constrain the search, enabling the engine to return the text that represents the best match. This text is then used to drive the next steps of your speech-enabled applications.
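The role of a grammar can be sketched with a toy example. A real engine compares acoustic features, not text, so the string similarity used below is only a stand-in for that comparison. The grammar phrases, the `NEXT_STEP` table, and the function name are hypothetical and do not come from the LumenVox API.

```python
from difflib import SequenceMatcher

# Hypothetical grammar: the only phrases the application is prepared to accept.
GRAMMAR = ["check balance", "transfer funds", "speak to an agent"]

# Hypothetical mapping from the recognized text to the application's next step.
NEXT_STEP = {
    "check balance": "play_balance_prompt",
    "transfer funds": "start_transfer_dialog",
    "speak to an agent": "route_to_agent_queue",
}


def best_grammar_match(heard, grammar):
    """Return the grammar phrase most similar to `heard`, with its score (0..1)."""
    scored = [
        (SequenceMatcher(None, heard.lower(), phrase.lower()).ratio(), phrase)
        for phrase in grammar
    ]
    score, phrase = max(scored)
    return phrase, score


if __name__ == "__main__":
    # "chek balans" stands in for imperfect input; the grammar constrains the result.
    phrase, score = best_grammar_match("chek balans", GRAMMAR)
    print(phrase, round(score, 2), NEXT_STEP[phrase])
```

Because the search is limited to the grammar, the result is always one of the listed phrases, which is what lets the application branch on the returned text.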

Languages Supported

In addition to the languages listed, the LumenVox ASR engine also supports the W3C’s Semantic Interpretation for Speech Recognition (SISR) recommendation, the Speech Recognition Grammar Specification (SRGS), and Voice Extensible Markup Language (VXML). Developers who are familiar with industry standards like these will be able to write applications quickly and effectively.

Standard Support + Short Utterance Transcription

American English 
British English 
Mexican Spanish 
Colombian Spanish 
Brazilian Portuguese 

Languages with Standard Support

Australian / New Zealand English
Indian English
Canadian French
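For readers new to the SRGS and SISR standards mentioned in this section, the sketch below embeds a small SRGS XML grammar with SISR tags and lists its phrases using Python's standard XML parser. The grammar content (rule name, phrases, and `out.action` values) is invented for illustration and is not taken from LumenVox documentation. The code only reads the tag text and does not evaluate the SISR scripts.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SRGS 1.0 recommendation.
SRGS_NS = {"srgs": "http://www.w3.org/2001/06/grammar"}

# Illustrative grammar: two phrases, each with a SISR semantic tag.
GRAMMAR_XML = """<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" root="command"
         tag-format="semantics/1.0">
  <rule id="command" scope="public">
    <one-of>
      <item>check balance <tag>out.action = "balance";</tag></item>
      <item>transfer funds <tag>out.action = "transfer";</tag></item>
    </one-of>
  </rule>
</grammar>"""


def list_phrases(grammar_xml):
    """Return (phrase, tag_script) pairs for every <item> in the grammar."""
    root = ET.fromstring(grammar_xml)
    phrases = []
    for item in root.findall(".//srgs:item", SRGS_NS):
        tag = item.find("srgs:tag", SRGS_NS)
        phrase = (item.text or "").strip()  # text that precedes the <tag> child
        script = tag.text.strip() if tag is not None and tag.text else None
        phrases.append((phrase, script))
    return phrases


if __name__ == "__main__":
    for phrase, script in list_phrases(GRAMMAR_XML):
        print(phrase, "->", script)
```

The SRGS part of the grammar defines what may be said, while the SISR tags describe the structured result an application would receive for each phrase.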

How Speech Recognition Works

Speech Engines Use This Process to Figure out What a Speaker Said:


1. The engine loads a list of words to be recognized. This list of words is called a grammar.

2. Audio from a speaker is captured by a microphone or telephone. This audio is turned into a waveform, a mathematical representation of sound.

3. The engine looks at features (distinct characteristics of sound) derived from the waveform and compares them with its own acoustic model. The engine searches its acoustic space, using the grammar to guide this search.

4. It then determines which words in the grammar the audio most closely matches and returns a result.
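The process above can be sketched end to end with a deliberately crude toy. Everything here is an assumption made for illustration: the "words" are synthetic tones, the features are just zero-crossing rate and RMS energy, and the "acoustic model" is a table of one feature vector per word. Real engines use far richer features and statistical models, and this is not how the LumenVox engine is implemented.

```python
import math

SAMPLE_RATE = 8000


def synth_tone(freq_hz, duration_s=0.2, amplitude=0.5):
    """Stand-in for captured audio: a waveform as a list of samples."""
    n = int(SAMPLE_RATE * duration_s)
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE) for t in range(n)]


def extract_features(waveform):
    """Derive two crude features: zero-crossing rate and RMS energy."""
    crossings = sum(1 for a, b in zip(waveform, waveform[1:]) if (a < 0) != (b < 0))
    zcr = crossings / (len(waveform) - 1)
    rms = math.sqrt(sum(s * s for s in waveform) / len(waveform))
    return (zcr, rms)


# Toy acoustic model: one reference feature vector per word (tones are placeholders).
ACOUSTIC_MODEL = {
    word: extract_features(synth_tone(freq))
    for word, freq in {"yes": 200, "no": 400, "help": 800, "stop": 1200}.items()
}


def recognize(waveform, grammar):
    """Return the grammar word whose model features are closest to the audio."""
    features = extract_features(waveform)
    candidates = [word for word in grammar if word in ACOUSTIC_MODEL]
    return min(candidates, key=lambda word: math.dist(features, ACOUSTIC_MODEL[word]))


if __name__ == "__main__":
    audio = synth_tone(780)  # close to the "help" reference tone
    print(recognize(audio, ["yes", "no", "help", "stop"]))
    print(recognize(audio, ["yes", "no"]))  # a smaller grammar narrows the search
```

The second call shows the effect of the grammar: the same audio can only resolve to a word the grammar allows, so a narrower grammar changes the returned result.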

Complementary Products

Whether you’re looking to maximize your return on investment, improve the effectiveness of your speech-enabled solution, or enhance customer experience, our additional offerings will help you hit the mark.

LumenVox Speech Tuner

Catch every word! Our tuning and maintenance tool is an absolute requirement for every speech recognition solution. 



Call Progress Analysis

Get your message heard! Detect answering machine, voice mail, fax machine, and SIT tones when you send outbound messages. 


Learn More

Looking for more specific information about the LumenVox Automatic Speech Recognition engine? Further material is available, including a complete list of features, technical specifications, sizing capabilities, and additional documentation.

Still have questions? We’d be happy to help. Just contact us.

Experience Speech Recognition. Schedule a Demo Today.
