The interior of a car is a difficult environment for speech recognition, especially for the telephone-based application Gustavo Berdinas was working on. The application relies on a speakerphone, there is background noise from other passengers, the street and the engine, and the interior's surfaces make for poor acoustics.
These challenges were ruining the performance of the speech recognition engine being used by General Motors' ChevyStar system, resulting in an unacceptable recognition rate. Unhappy with this, GM asked their system designers, Berdinas and the Redmond Software team, to improve the speech recognition.
By switching to the LumenVox Speech Engine and working closely with the application developers at LumenVox, Berdinas and his team were able to increase the system's accuracy by 42%.
The two companies accomplished this accuracy increase by collecting audio from actual interactions made in a car and training the Speech Engine to better understand speech in that noisy environment.
GM had developed ChevyStar as the onboard speech-driven communication and security system for its Venezuela, Colombia and Ecuador markets. ChevyStar is the South American counterpart of OnStar in the United States and Canada. It's a subscription-based service that provides speech-enabled, hands-free cell phone, navigation, security and remote diagnostics features to drivers.
Such systems are already popular in many countries, and increasing customer demand suggests that they are coming to be regarded as a necessity. Recent studies have shown that the distraction a driver experiences while using a cellular phone is comparable to that of legal intoxication. The value of ChevyStar therefore exceeds mere convenience.
Redmond Software is a software vendor that specializes in advanced interactive voice response (IVR) systems. Redmond's pedigree is in building computer systems that speak and can be spoken to. Even with nearly 20 years of experience in the IVR and telecommunications industry, Redmond Software's developers knew they had a challenge ahead of them.
One of the largest contributors to the recognition problems was inherent to being in a car: background noise. In a small, confined space any noise is even more noticeable. The effect is made worse when you add the stereo, a talkative child or a rolled-down window. Discerning spoken commands amid the other noise becomes difficult.
Adding to the problem, drivers are not holding a microphone up to their mouths, or even speaking directly into a telephone. Spoken commands are issued to the ChevyStar system via speakerphone. So, in addition to the noisy background of the car, users are speaking into a microphone that is sometimes more than a foot away from their mouths. At this distance, the microphone is prone to picking up a considerable portion of the background noise.
The final problem that Redmond needed to overcome was with their automatic speech recognition (ASR) software provider. The company that they'd been using for several years was proving to be both expensive and unhelpful.
"In order to get any Technical Support with the previous vendor, you have to go through their process, and pay their prices," Berdinas said. "And even so, they were not really giving us any solutions."