Speech is the most basic, common, and efficient form of communication. Its goal is to transmit information such as ideas, concepts, and values between people.
We are taught many of the rules of speech communication as children: how to recognize when it's our turn to speak, the difference between inside and outside voices, and which words and language are appropriate around others. As we enter formal education, we are taught the differences between the types of words and how to structure them to communicate information in acceptable and reproducible forms.
Delving deeper, we learn the divisions between words and the substructures within words that help us connect with an audience through stress, pace, and intonation. All of this trains us to use a common set of understood tools to encapsulate our thoughts, feelings, and perceptions and then deliver them to others in a context they can understand.
To get a computer to recognize speech, we must delve into the underlying mechanics of sound: noise reduction, signal amplification, echo cancellation, and the movement of acoustic energy. An understanding of the mathematics behind linguistics, context-free grammars, and phonemes is also required. We need to create a specialized written language for communicating grammars and semantic information, one that is easy to enter and limited in context to a domain that a computer can properly handle with existing technology.
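As a rough illustration of what a grammar limited to a small domain can look like, here is a minimal Python sketch. The grammar, the rule names (command, action, device), and the accepts helper are all hypothetical and invented for this example; they do not represent any particular product's grammar format. The sketch only checks whether a sequence of already-recognized words fits a tiny context-free grammar.

```python
# Toy context-free grammar for a hypothetical "home control" domain.
# Keys are rule names; each rule lists alternative sequences of symbols.
# Any symbol that is not a key in GRAMMAR is treated as a literal word.
GRAMMAR = {
    "command": [["action", "device"], ["please", "action", "device"]],
    "action": [["turn", "on"], ["turn", "off"]],
    "device": [["the", "light"], ["the", "fan"]],
}


def match(symbol, tokens, pos):
    """Return every position where `symbol` can end if it starts at `pos`."""
    if symbol not in GRAMMAR:
        # Literal word: it must equal the token at the current position.
        if pos < len(tokens) and tokens[pos] == symbol:
            return {pos + 1}
        return set()

    ends = set()
    for alternative in GRAMMAR[symbol]:
        positions = {pos}
        for part in alternative:
            # Advance every candidate position through the next symbol.
            positions = {end for p in positions for end in match(part, tokens, p)}
            if not positions:
                break
        ends |= positions
    return ends


def accepts(utterance):
    """True if the whole utterance is covered by the 'command' rule."""
    tokens = utterance.lower().split()
    return len(tokens) in match("command", tokens, 0)


if __name__ == "__main__":
    for text in ["turn on the light", "please turn off the fan", "turn on the radio"]:
        print(text, "->", accepts(text))
```

Because the grammar admits only a handful of phrases, anything outside the domain (such as "turn on the radio") is rejected outright, which is the kind of restriction that keeps the recognition task tractable.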
To create effective speech recognition software, we also need to characterize the differences between speakers, such as the higher-pitched voices normally associated with women and children and the lower-pitched voices of men. Differences in vowel pronunciation among speakers of a language can cause confusion between a speaker and the speech engine. Finally, we need to reconstruct a reasonable semblance of the speaker's mental model to identify which elements of their speech are actionable and which can simply be ignored.
© 2017 LumenVox, LLC. All rights reserved.