TTS Sound Formats

Reference Number: AA-01933

The TTS Server synthesizes audio only in certain formats and sampling rates. The sound formats listed below are the supported members of the enum called SYNTHESIS_SOUND_FORMAT.

These values are used to set the TTS property PROP_EX_SYNTH_SOUND_FORMAT, which controls the format in which the audio is synthesized.

The only sampling rates supported for TTS are 8000 Hz and 22050 Hz, which are set with the TTS property PROP_EX_SYNTHESIS_SAMPLING_RATE.

For most telephony applications, the sound format supported by the telephone company is either SFMT_ULAW (most common) or SFMT_ALAW, with a sampling rate of 8000 Hz.

SFMT_PCM can be used at a sampling rate of 8000 Hz, but the higher rate of 22050 Hz should provide better audio quality. The higher rate is practical only in applications that have no bandwidth constraints and can handle the larger audio stream. These are typically not telephony applications but mobile, web, or similar applications.
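The choices described above can be sketched as a small validation helper. This is only an illustration: the function build_synthesis_settings and the dictionary it returns are hypothetical and are not part of the TTS client API. Only the format names, property names, and sampling rates are taken from this article, and how the properties are actually set depends on the client API in use.

```python
# Hypothetical helper: the identifiers in the strings come from this article,
# but the function itself is NOT part of any TTS client API.
SUPPORTED_FORMATS = ("SFMT_ULAW", "SFMT_PCM", "SFMT_ALAW")
SUPPORTED_RATES_HZ = (8000, 22050)


def build_synthesis_settings(sound_format, sampling_rate_hz):
    """Validate a format/rate pair and return it keyed by property name."""
    if sound_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported sound format: {sound_format}")
    if sampling_rate_hz not in SUPPORTED_RATES_HZ:
        raise ValueError(f"unsupported sampling rate: {sampling_rate_hz} Hz")
    return {
        "PROP_EX_SYNTH_SOUND_FORMAT": sound_format,
        "PROP_EX_SYNTHESIS_SAMPLING_RATE": sampling_rate_hz,
    }


# Typical telephony choice: mu-law at 8000 Hz.
telephony = build_synthesis_settings("SFMT_ULAW", 8000)

# Higher-quality choice for applications without bandwidth constraints.
wideband = build_synthesis_settings("SFMT_PCM", 22050)
```

Rejecting an unsupported rate such as 16000 Hz before it reaches the server keeps configuration mistakes close to where they are made.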

List of Supported TTS Sound Formats

SFMT_ULAW

Format: mu-law

Bytes per Sample: 1 byte per sample

Memory: ~0.5 MB / min (at 8000 Hz)

Notes: This is the standard telephone format in North America and Japan.

SFMT_PCM

Format: PCM

Bytes per Sample: 2 bytes per sample

Memory: ~1 MB / min (at 8000 Hz)

SFMT_ALAW

Format: a-law

Bytes per Sample: 1 byte per sample

Memory: ~0.5 MB / min (at 8000 Hz)
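The memory figures in the list above follow from bytes per sample multiplied by the sampling rate and by 60 seconds. The sketch below reproduces that arithmetic. It assumes single-channel audio with no file header and uses 1 MB = 1,000,000 bytes; the function name megabytes_per_minute is illustrative only.

```python
# Bytes per sample for each format, as listed in this article.
BYTES_PER_SAMPLE = {"SFMT_ULAW": 1, "SFMT_ALAW": 1, "SFMT_PCM": 2}


def megabytes_per_minute(sound_format, sampling_rate_hz):
    """Approximate size of one minute of mono audio, in MB (10**6 bytes)."""
    bytes_per_minute = BYTES_PER_SAMPLE[sound_format] * sampling_rate_hz * 60
    return bytes_per_minute / 1_000_000


for fmt, rate in [("SFMT_ULAW", 8000), ("SFMT_ALAW", 8000),
                  ("SFMT_PCM", 8000), ("SFMT_PCM", 22050)]:
    print(f"{fmt} at {rate} Hz: {megabytes_per_minute(fmt, rate):.2f} MB/min")
```

At 8000 Hz this gives about 0.48 MB per minute for SFMT_ULAW and SFMT_ALAW and about 0.96 MB per minute for SFMT_PCM, matching the rounded figures in the list. SFMT_PCM at 22050 Hz comes to about 2.65 MB per minute, which is why the higher rate suits applications without bandwidth constraints.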

TTS Client Properties