Part 1 Speech Recognition Basics Video



  • Are you new to speech recognition? This video explains what speech recognition is, the difference between speech recognition and voice recognition, and the reasons for implementing a speech recognition solution today.
  • RUNTIME 8:09


Video Transcription

Part 1 Speech Recognition Basics

Speech Recognition Basics

  • Overview
  • Speech Recognition vs. Voice Recognition
  • Reasons for Speech Recognition


  • Speech is basically just another user interface, an input method, like using a mouse or a keyboard.
  • Speech recognition recognizes what you say, not what you mean. Much like when you click the button on your mouse, the computer doesn't really know your intent. When the button is clicked, the computer responds in accordance with its programming. The same is true of speech recognition: the user says words, the engine recognizes the words, and the system responds as it has been programmed.
  • It is not artificial intelligence. You cannot carry on a viable conversation with a computer just yet. There is a misconception that speech recognition has a science fiction quality that would enable you to carry on an intelligent, natural conversation with the computer.
  • Speech recognition is only as good as the application built around it. For example, in some early Windows point-and-click applications, the icons may have been confusing and the behavior of the application unfamiliar at first. One would not blame the mouse for the operating system's behavior simply because it was the mouse that was clicked. Likewise, if you've had some negative experiences with speech recognition, the fault may lie less with the underlying technology than with an early application that wasn't really built to take advantage of its input method. Speech recognition is now a more mature technology, and we have a better idea of how to design good applications. Here you will be able to find various examples of how to design applications well.
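
The point that the system responds to recognized words, not to intent, can be illustrated with a minimal Python sketch. This is not LumenVox code: the `RESPONSES` table, the phrases in it, and the `respond` function are hypothetical names chosen for illustration, and the recognized text is assumed to arrive as a plain string from some speech engine.

```python
# Hypothetical table of programmed responses, keyed by recognized phrase.
RESPONSES = {
    "check balance": "Reading your account balance.",
    "transfer funds": "Starting a funds transfer.",
}

FALLBACK = "Sorry, I did not understand that."


def respond(recognized_text: str) -> str:
    """Return the programmed response for the words the engine recognized.

    The lookup only matches the words themselves; it has no notion of
    what the caller meant.
    """
    # Normalize case and whitespace so "Check  Balance" matches "check balance".
    key = " ".join(recognized_text.lower().split())
    return RESPONSES.get(key, FALLBACK)
```

In this sketch, a caller who says "how much money do I have" gets the fallback response even though the intent matches "check balance", because only the programmed phrases are handled. A well-designed application anticipates the phrasings callers are likely to use.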

Speech Recognition vs. Voice Recognition

Speech recognition and voice recognition are two frequently used terms, and to most people they are interchangeable. The press uses the terms interchangeably, as do people who don't have much experience in the industry. However, within the speech recognition industry and in academic circles (linguists, scholars, and computer scientists who study speech), there is a very large distinction.

Speech Recognition

This is the ability of a computer to understand the words that are spoken. It is the translation of vocal sounds into predefined words to be recognized.

Voice Recognition

This is the ability to recognize a speaker based upon that speaker's style. We all have specific characteristics in our individual styles, somewhat like a fingerprint. Voice recognition technology allows computers to recognize the distinct characteristics of our voices. It is used mainly for biometrics (authentication for security purposes) and dictation. Here at LumenVox we do speech recognition technology, so we recognize what you said, not who said it.

Reasons for Using Speech

What are some of the reasons to use speech recognition? What makes it better than or different from other input methods?

  • More natural interaction
    You don't have to be trained on how to speak, since you've been learning speech from the very moment you were born. Using a mouse or a keyboard, or learning to dial a telephone, is not as natural. It's a pleasant and natural ability to simply state what one wants, as opposed to pointing and clicking or typing keys.
  • Convenient
    If you're on the phone with your bank while driving, you may not be able to safely reach your phone or key in your account numbers, so speech in this instance is great. You simply say what you need and you don't have to worry about the location of your cell phone or taking your eyes off the road. For many people, speech is the best way to interact with systems for convenience.
  • Open-ended questions
    As an application designer, speech is great because it allows you to really open up your applications and make them easier for people to use, more user friendly. Think about certain prompts and questions you can incorporate with speech recognition that you can't have with DTMF touch tone applications on the telephone:
    • City and State. For example, a directory assistance application would need to know the city and state for the required phone number. You cannot type in a city and state. Perhaps a ZIP code can be keyed in, but there may be many ZIP codes in a single city and most people may not know the ZIP code for a particular area. With speech, the caller can be simply prompted for city and state, and respond "San Diego, California," which is a much, much simpler approach going far beyond DTMF applications.
    • Call Router. With speech applications, the menus are greatly improved. With DTMF you commonly hear "Press one for a particular function, or person, or department, press two for another." You generally have to listen to the entire menu of choices just to make sure you don't miss the one most specific to your needs. This can be frustrating and is not convenient for the user at all. With speech recognition you can simply say where you would like to go without worrying about which option is best. Speech also lets you cut down on the number of menus used. You don't have to have the caller press one and go to another menu, then press five and be taken to yet another menu, and so on. With speech you can have as many options as needed in a single menu, as opposed to the 10 or so options allowed by the number of keys on a telephone keypad.
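
The difference between nested DTMF menus and a single spoken menu can be sketched in Python. This is a minimal illustration under stated assumptions, not the way any particular platform implements call routing: the menu contents, the destination names, and the functions `route_dtmf` and `route_speech` are all hypothetical.

```python
from typing import Optional

# Hypothetical DTMF menu tree: each level is limited to the keypad keys,
# so destinations end up nested under sub-menus.
DTMF_MENU = {
    "1": "sales",
    "2": {"1": "billing", "2": "technical support"},
}

# Hypothetical speech menu: one flat menu with as many entries as needed,
# including alternate phrasings for the same destination.
SPEECH_MENU = {
    "sales": "sales",
    "billing": "billing",
    "technical support": "technical support",
    "tech support": "technical support",
}


def route_dtmf(keys: str) -> Optional[str]:
    """Walk the menu tree one key press at a time."""
    node = DTMF_MENU
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return None  # invalid key, or extra key after a destination
        node = node[key]
    # Only a leaf (a destination name) is a complete route.
    return node if isinstance(node, str) else None


def route_speech(utterance: str) -> Optional[str]:
    """Look up the recognized phrase in the single flat menu."""
    return SPEECH_MENU.get(" ".join(utterance.lower().split()))
```

Here the caller needs two key presses and two menu prompts to reach billing with DTMF, while one spoken phrase reaches the same destination, and adding a new department to the speech menu does not require another menu level.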

All of the above are just some of the reasons you'll want to use speech recognition.

© 2016 LumenVox, LLC. All rights reserved.