Detailed Results

Reference Number: AA-02496

More detailed results can be obtained for a transcription. These include per-word confidence scores and per-word timing information.

This can be turned on per interaction by including the meta tag ASR_RESULT_DETAIL_ENABLE in the grammar used for the recognition.

<?xml version='1.0'?>
<grammar xml:lang="es" version="1.0" root="root" mode="voice"
         xmlns="http://www.w3.org/2001/06/grammar"
         tag-format="semantics/1.0">
  <meta name="TRANSCRIPTION_ENGINE" content="V2"/>
  <meta name="ASR_RESULT_DETAIL_ENABLE" content="1"/>
  <rule id="root" scope="public">
    <ruleref special="NULL"/>
  </rule>
</grammar>



The results interpretation will consist of a structured object including a section called "AsrDetails". This section contains an nbest list of results. Each result has utterance-level data (confidence score, duration, start time and transcript) and a words section that lists each word. For each word, the confidence, duration and start time are included as well.

Below is an example transcription with details active for the phrase "THIS IS TEST NUMBER FOUR".


<AsrDetails>
  <nbest length="1">
    <item index="0">
      <confidence>988</confidence>
      <duration>1.72</duration>
      <start_time>0.9</start_time>
      <transcript>THIS IS TEST NUMBER FOUR</transcript>
      <words length="5">
        <item index="0">
          <confidence>989</confidence>
          <duration>0.18</duration>
          <start_time>0.9</start_time>
          <word>THIS</word>
        </item>
        <item index="1">
          <confidence>985</confidence>
          <duration>0.12</duration>
          <start_time>1.12</start_time>
          <word>IS</word>
        </item>
        <item index="2">
          <confidence>16</confidence>
          <duration>0.32</duration>
          <start_time>1.3</start_time>
          <word>TEST</word>
        </item>
        <item index="3">
          <confidence>988</confidence>
          <duration>0.34</duration>
          <start_time>1.66</start_time>
          <word>NUMBER</word>
        </item>
        <item index="4">
          <confidence>869</confidence>
          <duration>0.56</duration>
          <start_time>2.06</start_time>
          <word>FOUR</word>
        </item>
      </words>
    </item>
  </nbest>
</AsrDetails>
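A client that receives this XML might want to pick out the words the recognizer was unsure about. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name low_confidence_words and the threshold of 500 are hypothetical choices made for illustration; the article does not state the confidence scale or recommend a cutoff.

```python
import xml.etree.ElementTree as ET

# Copy of the sample AsrDetails result for "THIS IS TEST NUMBER FOUR".
SAMPLE_XML = """
<AsrDetails>
  <nbest length="1">
    <item index="0">
      <confidence>988</confidence>
      <duration>1.72</duration>
      <start_time>0.9</start_time>
      <transcript>THIS IS TEST NUMBER FOUR</transcript>
      <words length="5">
        <item index="0"><confidence>989</confidence><duration>0.18</duration><start_time>0.9</start_time><word>THIS</word></item>
        <item index="1"><confidence>985</confidence><duration>0.12</duration><start_time>1.12</start_time><word>IS</word></item>
        <item index="2"><confidence>16</confidence><duration>0.32</duration><start_time>1.3</start_time><word>TEST</word></item>
        <item index="3"><confidence>988</confidence><duration>0.34</duration><start_time>1.66</start_time><word>NUMBER</word></item>
        <item index="4"><confidence>869</confidence><duration>0.56</duration><start_time>2.06</start_time><word>FOUR</word></item>
      </words>
    </item>
  </nbest>
</AsrDetails>
"""


def low_confidence_words(xml_text, threshold=500):
    """Return (word, confidence, start_time) for each word scoring below threshold.

    The threshold default is an assumption for this sketch, not a documented value.
    """
    root = ET.fromstring(xml_text)  # root is the <AsrDetails> element
    flagged = []
    # nbest/item holds the utterance-level results; words/item holds per-word entries.
    for result in root.findall("./nbest/item"):
        for entry in result.findall("./words/item"):
            confidence = int(entry.findtext("confidence"))
            if confidence < threshold:
                flagged.append(
                    (entry.findtext("word"), confidence, float(entry.findtext("start_time")))
                )
    return flagged


if __name__ == "__main__":
    for word, confidence, start in low_confidence_words(SAMPLE_XML):
        print(f"{word}: confidence {confidence} at {start}")
```

With the sample above, only TEST (confidence 16) falls below the assumed threshold.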



ASR detail results can also be enabled globally via the configuration setting ASR_RESULT_DETAIL_ENABLE.



Optionally, the meta tag INTERPRETATION_AS_JSON can be used to have the results returned as JSON.

<?xml version='1.0'?>
<grammar xml:lang="es" version="1.0" root="root" mode="voice"
         xmlns="http://www.w3.org/2001/06/grammar"
         tag-format="semantics/1.0">
  <meta name="TRANSCRIPTION_ENGINE" content="V2"/>
  <meta name="ASR_RESULT_DETAIL_ENABLE" content="1"/>
  <meta name="INTERPRETATION_AS_JSON" content="1"/>
  <rule id="root" scope="public">
    <ruleref special="NULL"/>
  </rule>
</grammar>


This grammar would return the following for the same example phrase as above, "THIS IS TEST NUMBER FOUR".

{
  "AsrDetails": {
    "nbest": [
      {
        "words": [
          { "start_time": 0.9, "duration": 0.18, "word": "THIS", "confidence": 989 },
          { "start_time": 1.12, "duration": 0.12, "word": "IS", "confidence": 985 },
          { "start_time": 1.3, "duration": 0.32, "word": "TEST", "confidence": 16 },
          { "start_time": 1.66, "duration": 0.34, "word": "NUMBER", "confidence": 988 },
          { "start_time": 2.06, "duration": 0.56, "word": "FOUR", "confidence": 871 }
        ],
        "start_time": 0.9,
        "duration": 1.72,
        "confidence": 988,
        "transcript": "THIS IS TEST NUMBER FOUR"
      }
    ]
  }
}
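The JSON form is convenient for building a word timeline, for example to align the transcript with audio. The following is a minimal sketch using Python's standard json module. The function name word_timeline is hypothetical, and the sketch assumes that the first nbest entry is the preferred result and that start_time and duration share the same unit; the article does not state either point.

```python
import json

# Copy of the sample JSON result for "THIS IS TEST NUMBER FOUR".
SAMPLE_JSON = """
{ "AsrDetails": { "nbest": [ {
    "words": [
      { "start_time": 0.9,  "duration": 0.18, "word": "THIS",   "confidence": 989 },
      { "start_time": 1.12, "duration": 0.12, "word": "IS",     "confidence": 985 },
      { "start_time": 1.3,  "duration": 0.32, "word": "TEST",   "confidence": 16 },
      { "start_time": 1.66, "duration": 0.34, "word": "NUMBER", "confidence": 988 },
      { "start_time": 2.06, "duration": 0.56, "word": "FOUR",   "confidence": 871 }
    ],
    "start_time": 0.9,
    "duration": 1.72,
    "confidence": 988,
    "transcript": "THIS IS TEST NUMBER FOUR"
} ] } }
"""


def word_timeline(json_text):
    """Return (word, start, end, confidence) tuples for the first nbest entry.

    Treating nbest[0] as the preferred result is an assumption of this sketch.
    """
    details = json.loads(json_text)["AsrDetails"]
    first = details["nbest"][0]
    timeline = []
    for entry in first["words"]:
        # End time is start plus duration, rounded to hide float noise.
        end = round(entry["start_time"] + entry["duration"], 2)
        timeline.append((entry["word"], entry["start_time"], end, entry["confidence"]))
    return timeline


if __name__ == "__main__":
    for word, start, end, confidence in word_timeline(SAMPLE_JSON):
        print(f"{start:5.2f} to {end:5.2f}  {word:<7} confidence {confidence}")
```

In the sample, the end of the last word (2.06 + 0.56 = 2.62) matches the utterance-level start_time plus duration (0.9 + 1.72 = 2.62).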