Semantic Interpretation for Speech Recognition

Improving Applications with SISR

(JAN 2007) — An important thing to keep in mind while developing speech applications is that what your caller says isn't that important. What matters is what your caller actually means. Separating exactly what was said from what was meant is called semantic interpretation, and a technology called Semantic Interpretation for Speech Recognition (SISR) is the standard way of performing this interpretation.

For instance, callers who need technical support may say "technical support," but they may also ask for "service" or "customer support." The exact words differ, but the semantic interpretation is the same for each one, so all of those requests need to go to the same location.

To deal with this, you can use SISR within a grammar. By adding SISR tags to grammar entries, you can return meanings to your application instead of the exact words the caller said.

Technical Details

Within an SRGS grammar, semantic interpretation tags are enclosed within curly braces {} in ABNF grammars and within <tag></tag> elements in grXML.

The return value of a matched rule is an ECMAScript object named $. You can manipulate the $ object within SISR tags to specify what information should be returned to your application.

As an example, suppose you were using an ABNF grammar for an application that let a user say a number. If the user says "ten" you would like that returned to your application as the integer 10 and not the text string "ten." You might do this with the following grammar rule:

$ten = ten {$ = 10};

Using our technical support example from above, you can use SISR so that several different caller utterances all return the same result to your application:

$support = (service | [technical | customer] support) {$ = "service"};
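To make the rule above concrete, here is a minimal Python sketch that lists the phrases the rule accepts and the value each one returns. It is only an illustration of the ABNF alternation and optional group, not how the Speech Engine works internally; the names `OPTIONAL_PREFIXES`, `support_phrases`, and `INTERPRETATION` are hypothetical.

```python
# Hypothetical stand-in for the $support rule; not the Speech Engine's implementation.
# [technical | customer] is optional, so an empty string represents "nothing said".
OPTIONAL_PREFIXES = ["", "technical", "customer"]


def support_phrases():
    """List every phrase matched by (service | [technical | customer] support)."""
    phrases = ["service"]
    for prefix in OPTIONAL_PREFIXES:
        # strip() removes the leading space left behind by the empty prefix
        phrases.append(f"{prefix} support".strip())
    return phrases


# Every matching phrase yields the same value, as the {$ = "service"} tag specifies.
INTERPRETATION = {phrase: "service" for phrase in support_phrases()}

for phrase, meaning in INTERPRETATION.items():
    print(f"{phrase!r} -> {meaning!r}")
```

Note that because the bracketed group is optional, the rule also accepts the single word "support" in addition to the three phrases mentioned earlier.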

For more information on SISR, see the Introduction to Semantic Interpretation in our Speech Engine help file, or read the official W3C SISR specification.

New Resources for Developers

LumenVox has completely revamped the Resources section of the Web site. The section has cleaner navigation and new articles aimed at helping speech application developers improve their applications.

We have also begun adding training videos to the Web site. We currently have videos covering speech application development, working with SRGS grammars, and Asterisk speech sales. We're adding new videos all the time, so check back often.

How to Download the Latest Release

If you would like information on downloading the latest release of the LumenVox Speech Engine, please contact us. It is a free download for users with current software maintenance packages.

Why Use SISR Within a Grammar?

You could perform all of your semantic interpretation within an application by parsing the raw text returned by the Speech Engine and then writing logic in the application to handle the results.

The problem with this approach is that you end up maintaining extra code, because the grammar already has to do some level of interpretation anyway.

You have to keep the grammar up-to-date with every word you expect callers to use in order for the Engine to recognize the words. There is no reason to duplicate this list in your application, as maintaining the same information in two places creates the possibility that the two copies will fall out of sync and cause bugs in your application.
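As a rough illustration of the duplication described above, here is a minimal Python sketch of application-side interpretation. It assumes the application receives only the raw recognized text; `SUPPORT_PHRASES`, `route_from_raw_text`, and the `"support_queue"` destination are hypothetical names, not part of any LumenVox API.

```python
# Hypothetical application-side phrase list. It has to mirror the grammar by hand.
SUPPORT_PHRASES = {"service", "support", "technical support", "customer support"}


def route_from_raw_text(raw_text):
    """Pick a destination by inspecting the recognized words themselves."""
    # Lowercase and collapse whitespace so the comparison is not format-sensitive.
    normalized = " ".join(raw_text.lower().split())
    if normalized in SUPPORT_PHRASES:
        return "support_queue"
    return "unknown"


print(route_from_raw_text("Technical Support"))
# A phrase that is missing from SUPPORT_PHRASES falls through, even if
# the grammar were later updated to recognize it.
print(route_from_raw_text("help desk"))
```

If "help desk" were later added to the grammar but not to this set, the Engine would recognize the phrase and the application would still fail to route it, which is the kind of bug that comes from keeping the list in two places.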

By keeping all interpretation within a grammar, you can return predictable, fixed values to your application without duplicating the word list in your application's code.
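With interpretation kept in the grammar, the application side can shrink to a lookup on the returned value. The following Python sketch assumes the application is handed the interpretation string (for example "service") produced by the SISR tag; `ROUTES`, `route_from_interpretation`, and `"support_queue"` are hypothetical names used only for illustration.

```python
# Hypothetical routing table keyed by the fixed values the grammar's SISR tags return.
ROUTES = {"service": "support_queue"}


def route_from_interpretation(interpretation):
    """Map a semantic interpretation value to a destination."""
    # The application never sees or compares the caller's exact words.
    return ROUTES.get(interpretation, "unknown")


print(route_from_interpretation("service"))
```

Adding a new way of asking for support then only requires a grammar change, as long as the new phrase returns the same value.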

Other News

LumenVox will be attending the Internet Telephony Conference and Expo East, held in Ft. Lauderdale, FL from January 24–26.

Please feel free to drop by our booth and meet us in person, and come to the show to learn how to speech-enable your voice-over-Internet applications.

© 2018 LumenVox, LLC. All rights reserved.