Transcripts entered with the Speech Tuner are helpful for tuning a speech recognition application. They are used to run tests on a speech application in order to pinpoint problems.
While transcribing, you should be mainly concerned with audio that is appropriate for the speech application and that is recognizable by humans. It is a subjective process and transcribers should strive to maximize their efficiency.
For instance, if the application records a caller talking to another person in the background, that speech can simply be marked as garbage and discarded while doing tests. Likewise, if a caller says something that is unintelligible to a human, there is no way the Engine can be expected to understand it and thus it can be marked as garbage as well.
What should be transcribed are utterances that are appropriate for the application. This includes intelligible out-of-grammar responses when those responses make sense as valid responses to the prompt.
For perfect transcripts, transcribe what a speaker said verbatim, without correcting grammatical errors or mispronunciations. If you need perfect, very detailed transcripts, the following rules are useful.
Guidelines for Detailed Transcripts
Try to get the words, noises, and their placement correct so that the Tuner knows which sounds correspond to which word or noise in the transcription.
Transcription Rules
Grammatical errors and mispronunciations: For transcription purposes there are no such things as grammatical or mispronunciation errors. Transcribe precisely what the caller said. If the caller says "I seen him," then transcribe "I SEEN HIM."
Standard reductions, alternate pronunciations and contractions: Transcribe as spoken.
Hyphenating: Never hyphenate.
Compound words: Unless there is an obvious pause between two words, all compound words should be transcribed as one word when such a word exists in the dictionary. "Everyday" should not be transcribed as "EVERY DAY" for instance.
Abbreviations: Never abbreviate, except when the speaker says the abbreviation. If the caller says "Doctor" then transcribe "DOCTOR" and not "Dr." However, if the caller says "Ave" instead of "Avenue" then transcribe "AVE."
Punctuation: No punctuation should be used in transcriptions. Do not put in periods, commas, question marks, etc. However, if the word is a possessive or a contraction you may use the apostrophe. Never use double quotes or the "+", "<", or ">" symbols. These symbols are used in the underlying code to analyze the gathered data.
Common misspellings: Watch for common spelling confusions. For instance, "they're," "there," and "their" all sound the same, so choose the spelling that fits the context.
Numbers: Numbers should be transcribed as words. If the caller says "Four hundred and fifty five" then the transcription should read "FOUR HUNDRED AND FIFTY FIVE" and not "455."
Letter sequences: Spell out letter sequences.
Acronyms: Transcribe acronyms as they are said. "NATO" is transcribed as "NATO" with no spaces or periods.
Initialisms: Transcribe initialisms as they are said. "CIA" is transcribed as "C I A" with spaces to denote that each letter is pronounced individually.
Possessives: Use standard punctuation rules to denote possession. "Susan's book" is transcribed simply as "SUSAN'S BOOK" and "The drivers' cars" is transcribed "THE DRIVERS' CARS."
Filler noise: Depending on the type of filler noise, it should be transcribed as either a noise tag or a word.
Yes/no sounds: For anything resembling sounds of assent or denial, transcribe them as they sound.
Gender: Pick the appropriate gender for what the speaker sounds like.
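Several of the rules above (no hyphens, no punctuation other than the apostrophe, no reserved symbols, numbers written as words) are mechanical enough to check automatically before transcripts are used for testing. The following is only a minimal sketch in Python, not part of the Speech Tuner: the function name check_transcript and the wording of its messages are hypothetical. It also knows nothing about how noise tags are written, so a tag that uses punctuation would be flagged and would need to be allowed for separately.

```python
import re

# Symbols the punctuation rule singles out as reserved by the underlying tooling.
RESERVED = set('"+<>')


def check_transcript(text):
    """Return a list of rule violations found in one transcript line.

    Hypothetical helper: it only covers rules that can be checked
    character by character. Rules that need the audio (verbatim wording,
    initialisms, compound words) cannot be verified this way.
    """
    problems = []

    # Double quotes, "+", "<" and ">" must never appear.
    for ch in sorted(set(text) & RESERVED):
        problems.append(f"reserved symbol {ch!r}")

    # Hyphenating rule: never hyphenate.
    if "-" in text:
        problems.append("hyphen found; never hyphenate")

    # Numbers rule: "FOUR HUNDRED AND FIFTY FIVE", not "455".
    if re.search(r"\d", text):
        problems.append("digits found; write numbers as words")

    # Punctuation rule: only the apostrophe is allowed, for possessives
    # and contractions. Symbols already reported above are skipped here.
    other = sorted(
        {
            ch
            for ch in text
            if not (
                ch.isalnum()
                or ch.isspace()
                or ch == "'"
                or ch == "-"
                or ch in RESERVED
            )
        }
    )
    if other:
        problems.append("punctuation found: " + " ".join(other))

    return problems
```

With this sketch, check_transcript("THE DRIVERS' CARS") returns an empty list, while check_transcript("455") reports that digits were found. A check like this only catches formatting slips; it cannot tell whether the words match what the caller actually said.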
Transcription can help improve your application tuning significantly. Depending on the level of
detail you need from your transcripts, you may wish to perfectly transcribe what callers say.
Our practical guide to tuning has more information
on tuning speech applications.