
ASR Sound Formats

Reference Number: AA-01051

The Speech Engine accepts audio only in certain formats. The sound formats listed below are members of the SOUND_FORMAT enum, which is passed as an argument when loading a voice channel for decoding (LoadVoiceChannel or LV_SRE_LoadVoiceChannel). For streamed audio, set the format by setting STREAM_PARM_SOUND_FORMAT to one of the formats below using StreamSetParameter or LV_SRE_StreamSetParameter.

List of Supported ASR Sound Formats


ULAW_8KHZ


Format: mu-law

Sample Rate: 8 kHz

Bytes per Sample: 1 byte per sample

Memory: ~0.5 MB / min

Notes: This is the standard telephone format in North America and Japan.


PCM_8KHZ


Format: PCM

Sample Rate: 8 kHz

Bytes per Sample: 2 bytes per sample

Memory: ~1 MB / min


PCM_16KHZ


Format: PCM

Sample Rate: 16 kHz

Bytes per Sample: 2 bytes per sample

Memory: ~2 MB / min


ALAW_8KHZ


Format: a-law

Sample Rate: 8 kHz

Bytes per Sample: 1 byte per sample

Memory: ~0.5 MB / min
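
The memory figures above follow from sample rate times bytes per sample times duration. The C sketch below shows that arithmetic, assuming single-channel (mono) audio, which the article does not state explicitly. The demo_sound_format enum, the DEMO_FORMATS table, and demo_audio_bytes are hypothetical names written for this illustration; they are not part of the Speech Engine API, and the real SOUND_FORMAT enum may use different values and ordering.

```c
#include <stddef.h>

/* Hypothetical local stand-in for the engine's SOUND_FORMAT enum.
   The real header's values and ordering may differ. */
typedef enum {
    FMT_ULAW_8KHZ,
    FMT_PCM_8KHZ,
    FMT_PCM_16KHZ,
    FMT_ALAW_8KHZ
} demo_sound_format;

typedef struct {
    const char *name;
    unsigned sample_rate_hz;    /* samples per second */
    unsigned bytes_per_sample;  /* storage per sample */
} demo_format_info;

/* Table entries are in the same order as demo_sound_format,
   so an enum value can be used directly as an index. */
static const demo_format_info DEMO_FORMATS[] = {
    { "ULAW_8KHZ",  8000u, 1u },
    { "PCM_8KHZ",   8000u, 2u },
    { "PCM_16KHZ", 16000u, 2u },
    { "ALAW_8KHZ",  8000u, 1u },
};

/* Bytes needed to hold `seconds` of audio in the given format,
   assuming one channel (mono). */
unsigned long demo_audio_bytes(demo_sound_format fmt, unsigned seconds)
{
    const demo_format_info *info = &DEMO_FORMATS[fmt];
    return (unsigned long)info->sample_rate_hz
         * info->bytes_per_sample
         * seconds;
}
```

For example, demo_audio_bytes(FMT_PCM_16KHZ, 60) gives 1,920,000 bytes, which matches the "~2 MB / min" figure listed for PCM_16KHZ.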

Deprecated Sound Formats


SPX_8KHZ


Status: Deprecated and non-functional since 9.0


SPX_16KHZ


Status: Deprecated and non-functional since 9.0
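
Because the SPX formats no longer function, an application that reads the format name from configuration may want to reject them before handing audio to the engine. The C sketch below is one way to do that, assuming the format is available as a string. demo_is_supported_format is a hypothetical helper written for this illustration, not an engine function, and it only compares against the names listed in this article.

```c
#include <string.h>

/* Returns 1 if `name` is one of the formats this article lists as
   supported, and 0 otherwise (including the deprecated SPX_8KHZ and
   SPX_16KHZ formats, unknown names, and NULL). */
int demo_is_supported_format(const char *name)
{
    static const char *const supported[] = {
        "ULAW_8KHZ", "PCM_8KHZ", "PCM_16KHZ", "ALAW_8KHZ"
    };
    size_t i;

    if (name == NULL) {
        return 0;
    }
    for (i = 0; i < sizeof supported / sizeof supported[0]; ++i) {
        if (strcmp(name, supported[i]) == 0) {
            return 1;
        }
    }
    return 0;
}
```

The helper uses an allow-list rather than a deny-list, so any name not listed above is rejected along with the two deprecated formats.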