Continuous vs. Semi-Continuous Models

As of version 9.0, the LumenVox Speech Engine can use two different types of acoustic models: semi-continuous models and higher-quality continuous models. Prior to version 9.0, LumenVox used only semi-continuous acoustic models.

However, starting with version 10.1.100, the semi-continuous mode of operation has been completely dropped from the Engine. If you are using version 10.1.100 or higher, you may skip ahead directly to the Different Acoustic Model Resolutions section below.

The biggest difference between the two model types is the level of recognition accuracy and the amount of processing needed to complete the audio decodes. The semi-continuous model uses data compression to reduce the size of the acoustic model. Because the continuous model does not use compression, it has higher resolution, resulting in increased accuracy.

The continuous models have shown an accuracy increase across various domains, at the expense of approximately 15-20% more processing time. The semi-continuous decoder proved more accurate in some cases; however, the continuous decoder uses roughly one-third of the memory used by the semi-continuous decoder.

The table below shows some of the results of our accuracy testing between the two models during their development.

Test                    Continuous Model    Semi-Continuous Model
Natural Number          92.71               90.69
Digits                  88.38               82.75
YesNo                   98.95               98.72
Date                    93.90               91.53
Dollar Amount           93.66               89.56
Name at Agency          90.32               89.17
Restaurant              91.81               88.24
CityState               88.20               82.97
Call Router Untuned     85.67               86.25
Call Router Tuned       96.49               96.76
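
As a quick check on the figures in the table above, the short Python sketch below computes the per-test difference between the two models and lists the tests where the semi-continuous model scored higher. The variable and function names are illustrative only and the script is not part of the LumenVox software; the numbers are copied directly from the table.

```python
# Accuracy figures copied from the table above: (continuous, semi-continuous).
RESULTS = {
    "Natural Number": (92.71, 90.69),
    "Digits": (88.38, 82.75),
    "YesNo": (98.95, 98.72),
    "Date": (93.90, 91.53),
    "Dollar Amount": (93.66, 89.56),
    "Name at Agency": (90.32, 89.17),
    "Restaurant": (91.81, 88.24),
    "CityState": (88.20, 82.97),
    "Call Router Untuned": (85.67, 86.25),
    "Call Router Tuned": (96.49, 96.76),
}


def accuracy_gains(results):
    """Return {test: continuous - semi_continuous}, rounded to 2 decimals."""
    return {test: round(cont - semi, 2) for test, (cont, semi) in results.items()}


gains = accuracy_gains(RESULTS)
# Tests where the semi-continuous model had the higher score.
semi_wins = [test for test, gain in gains.items() if gain < 0]
mean_gain = sum(gains.values()) / len(gains)

for test, gain in gains.items():
    print(f"{test:<22}{gain:+.2f}")
print("Semi-continuous scored higher on:", ", ".join(semi_wins))
print(f"Mean gain for the continuous model: {mean_gain:.1f}")
```

On these figures the continuous model leads on eight of the ten tests, by roughly 2.3 points on average, and the semi-continuous model is slightly ahead on the two Call Router tests.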

Specifying Model Type

Not all of our languages are supported in the continuous model yet. This is because each acoustic model (language) needs to be retrained for processing in a fully continuous mode.

Currently, LumenVox only has continuous models for American English and Australian English.

This means that if you are using another language, such as either of the Spanish dialects, you must switch the decoder back to semi-continuous mode by modifying the sre_server.conf file. The file is located in /etc/lumenvox/ on Linux and in C:\Program Files\LumenVox\Engine\ on Windows.

In that file there is a parameter called HMM_TYPE. Enter SEMI if you want to use semi-continuous models, or CONT for continuous models.

You must then restart the Speech Engine service for the changes to take effect.
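
The edit described above can also be scripted. The following Python sketch is a minimal illustration that assumes the configuration file stores the setting on a line of the form HMM_TYPE = VALUE; that layout and the helper name set_hmm_type are assumptions, not taken from LumenVox documentation, so check your own file before relying on it. The function only transforms the file's text, leaving reading, writing, and the service restart to you.

```python
import re

VALID_TYPES = {"SEMI", "CONT"}  # the two values named in this article


def set_hmm_type(conf_text, hmm_type):
    """Return conf_text with the HMM_TYPE line set to hmm_type.

    Assumes a 'HMM_TYPE = VALUE' line layout (an assumption, not a
    documented format). Raises ValueError if the value is not SEMI or
    CONT, or if no HMM_TYPE line is found.
    """
    if hmm_type not in VALID_TYPES:
        raise ValueError(f"HMM_TYPE must be one of {sorted(VALID_TYPES)}")
    # Capture everything up to the value so spacing is preserved.
    pattern = re.compile(r"^([ \t]*HMM_TYPE[ \t]*=[ \t]*)\S+", flags=re.MULTILINE)
    new_text, count = pattern.subn(lambda m: m.group(1) + hmm_type, conf_text)
    if count == 0:
        raise ValueError("no HMM_TYPE line found")
    return new_text


# Hypothetical file contents, used only to demonstrate the helper.
sample = "SOME_OTHER_SETTING = 1\nHMM_TYPE = CONT\n"
print(set_hmm_type(sample, "SEMI"))
```

Rejecting anything other than SEMI or CONT before touching the text keeps a typo from being written into the file.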

In almost all cases, LumenVox recommends using the continuous model when it is available. The only time we recommend using the semi-continuous models and decoder is when fast decode time is more important than higher accuracy.

Different Acoustic Model Resolutions

In addition to the choice between semi-continuous and continuous models, as of version 9.1 LumenVox supports various resolutions for the American English model. These settings only work when the Engine is in continuous mode.

Three models are available: low, medium, and high resolution versions. Higher resolution models offer better accuracy, but use more memory and CPU time.

To choose which model to use, there are three new settings in sre_settings.conf that allow you to specify which models will be loaded: LOAD_LOW_RES_MODEL, LOAD_MED_RES_MODEL, and LOAD_HIGH_RES_MODEL. Set any of these to 1 to load the corresponding model.

You must set at least one of these to 1 in order to start the Engine. If more than one is set to 1, all the specified resolutions will be loaded. When doing a decode, the Engine will default to the lowest resolution model that it loaded; this can be changed via SetPropertyEx and the PROP_EX_ACOUSTIC_MODEL_RESOLUTION parameter.
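
The loading rules above can be summarized in a few lines of Python. This is only an illustrative model of the behavior described in this section, not LumenVox code: the function names and the dictionary input are hypothetical, while the three setting names come from the text.

```python
# Setting names from this article, ordered from lowest to highest resolution.
RESOLUTION_SETTINGS = [
    ("low", "LOAD_LOW_RES_MODEL"),
    ("medium", "LOAD_MED_RES_MODEL"),
    ("high", "LOAD_HIGH_RES_MODEL"),
]


def loaded_resolutions(settings):
    """List the resolutions whose setting is 1, lowest first."""
    return [name for name, key in RESOLUTION_SETTINGS if settings.get(key, 0) == 1]


def default_resolution(settings):
    """Return the resolution a decode would use by default."""
    loaded = loaded_resolutions(settings)
    if not loaded:
        # Mirrors the rule that the Engine needs at least one model to start.
        raise ValueError("at least one LOAD_*_RES_MODEL setting must be 1")
    return loaded[0]


example = {"LOAD_LOW_RES_MODEL": 0, "LOAD_MED_RES_MODEL": 1, "LOAD_HIGH_RES_MODEL": 1}
print(loaded_resolutions(example))
print(default_resolution(example))
```

With the medium and high models both loaded, the default is medium; a decode would use the high resolution model only after the resolution is changed through SetPropertyEx.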

© 2012 LumenVox LLC. All rights reserved.