
ASR Client Log

Reference Number: AA-01761

The client_asr.txt file contains information about the speech client (the Media Server) and its communication with the ASR server. Failures that are specific to ASR requests are indicated in this log. The log also contains information about voice activity detection. You can generally correlate RECOGNIZE events in MRCP with activity in this log. Here is an example of a typical RECOGNIZE event:

12/17/2012 11:59:22.226,INFO,SpeechPortClien,0AA0C9D0, Initializing streaming object.
12/17/2012 11:59:22.281,INFO,SpeechPortClien,0AA0C9D0, Start listening to stream!
12/17/2012 11:59:22.281,INFO,SpeechPortClien,Stream Started in Voice Activity Detection Mode
12/17/2012 11:59:22.281,INFO,SpeechPortClien,0AA0C9D0, Change stream state to: STREAM_STATUS_READY
12/17/2012 11:59:22.284,INFO,SpeechPortClien,LocalGrammarManager - Sending load grammar request...
12/17/2012 11:59:22.284,INFO,SpeechPortClien,Sending grammar request {1F36EC86-6E35-4434-B873-7BD89589ED37}
12/17/2012 11:59:22.284,INFO,SpeechPortClien,*****Send Grammar request to*****
12/17/2012 11:59:22.284,INFO,SpeechPortClien,LocalGrammarManager - Waiting [200000] ms for load request grammar response code
12/17/2012 11:59:22.285,INFO,SpeechPortClien,Received grammar request {1F36EC86-6E35-4434-B873-7BD89589ED37}
12/17/2012 11:59:22.285,INFO,SpeechPortClien,LocalGrammarManager - Load grammar request returned
12/17/2012 11:59:22.285,INFO,SpeechPortClien,0AA0C9D0, Start streaming!
12/17/2012 11:59:22.285,INFO,SpeechPortClien,Stream Started in Voice Activity Detection Mode
12/17/2012 11:59:22.285,INFO,SpeechPortClien,0AA0C9D0, Change stream state to: STREAM_STATUS_READY
12/17/2012 11:59:23.173,INFO,SpeechPortClien,0AA0C9D0, Change stream state to: STREAM_STATUS_BARGE_IN
12/17/2012 11:59:26.484,INFO,SpeechPortClien,0AA0C9D0, Change stream state to: STREAM_STATUS_END_SPEECH
12/17/2012 11:59:26.484,INFO,SpeechPortClien,clsSoundChannel::Decode() Begin
12/17/2012 11:59:26.485,INFO,SpeechPortClien,Sending decode request 3611840_0_1
12/17/2012 11:59:26.485,INFO,SpeechPortClien,*****Send Decode to*****
12/17/2012 11:59:26.934,INFO,SpeechPortClien,Received decode request 3611840_0_1
12/17/2012 11:59:26.934,INFO,SpeechPortClien,Decode Request Client MessageHandler got message from server
12/17/2012 11:59:26.934,INFO,SpeechPortClien,clsSoundChannel::ReceivedAnswers (3611840_0_1)
12/17/2012 11:59:26.941,INFO,SpeechPortClien,Closing port [11]
12/17/2012 11:59:26.941,INFO,SpeechPortClien,0AA0C9D0, Destroy streaming object.

In this example, the stream initializes and starts in Voice Activity Detection (VAD) mode; it can also start in Call Progress Analysis or Answering Machine Detection mode. A grammar is loaded while the client waits for the stream, and the stream state changes to STREAM_STATUS_BARGE_IN and then STREAM_STATUS_END_SPEECH as the client detects start-of-speech and end-of-speech. At that point, the audio is packaged and sent to the server for a decode. The client receives a response from the server containing the answer, then performs cleanup (closing the port and destroying the streaming object).
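When reviewing many RECOGNIZE events, it can help to pull the stream state changes and decode round-trip times out of client_asr.txt with a script. The following is a minimal sketch, not a supported tool. It assumes every line follows the comma-separated layout seen in the example above (timestamp, level, component, message) with MM/DD/YYYY dates. The function names (parse_line, summarize, speech_duration_ms) are hypothetical and exist only for illustration.

```python
import re
from datetime import datetime, timedelta

# A few lines copied from the example above, used as sample input.
SAMPLE = """\
12/17/2012 11:59:22.285,INFO,SpeechPortClien,0AA0C9D0, Change stream state to: STREAM_STATUS_READY
12/17/2012 11:59:23.173,INFO,SpeechPortClien,0AA0C9D0, Change stream state to: STREAM_STATUS_BARGE_IN
12/17/2012 11:59:26.484,INFO,SpeechPortClien,0AA0C9D0, Change stream state to: STREAM_STATUS_END_SPEECH
12/17/2012 11:59:26.485,INFO,SpeechPortClien,Sending decode request 3611840_0_1
12/17/2012 11:59:26.934,INFO,SpeechPortClien,Received decode request 3611840_0_1
"""

STATE_RE = re.compile(r"Change stream state to: (STREAM_STATUS_\w+)")
DECODE_RE = re.compile(r"(Sending|Received) decode request (\S+)")
ONE_MS = timedelta(milliseconds=1)


def parse_line(line):
    """Split one log line into (timestamp, level, component, message).

    Returns None for lines that do not match the assumed layout.
    """
    parts = line.strip().split(",", 3)
    if len(parts) != 4:
        return None
    stamp, level, component, message = parts
    try:
        when = datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S.%f")
    except ValueError:
        return None
    return when, level, component, message.strip()


def summarize(text):
    """Collect stream state changes and decode round-trip times (ms)."""
    states = []      # list of (timestamp, state name)
    sent = {}        # decode request id -> time the request was sent
    decode_ms = {}   # decode request id -> round trip in milliseconds
    for line in text.splitlines():
        parsed = parse_line(line)
        if parsed is None:
            continue
        when, _level, _component, message = parsed
        match = STATE_RE.search(message)
        if match:
            states.append((when, match.group(1)))
            continue
        match = DECODE_RE.search(message)
        if match:
            direction, request_id = match.groups()
            if direction == "Sending":
                sent[request_id] = when
            elif request_id in sent:
                decode_ms[request_id] = (when - sent[request_id]) // ONE_MS
    return states, decode_ms


def speech_duration_ms(states):
    """Milliseconds between the first BARGE_IN and the next END_SPEECH."""
    start = None
    for when, name in states:
        if name == "STREAM_STATUS_BARGE_IN" and start is None:
            start = when
        elif name == "STREAM_STATUS_END_SPEECH" and start is not None:
            return (when - start) // ONE_MS
    return None


states, decode_ms = summarize(SAMPLE)
print([name for _when, name in states])
print(decode_ms)                   # {'3611840_0_1': 449}
print(speech_duration_ms(states))  # 3311
```

For the sample lines, this reports about 3.3 seconds between start-of-speech and end-of-speech, and a decode round trip of 449 ms. To use it on a real log, pass the file contents to summarize instead of SAMPLE. Lines that do not fit the assumed layout are skipped rather than raising an error.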