Vendor Specific Parameters

Reference Number: AA-01100

As defined in the MRCP specifications, there is a set of headers that allows the client to adjust vendor-specific parameters. These headers may be sent in the SET-PARAMS and GET-PARAMS methods.

The following parameters are LumenVox-specific extensions to the MRCP specification. They can be controlled via the media_server.conf file, located in the config directory of the Windows LumenVox installation folder. By default, this location is C:\Program Files\Lumenvox\config\.

On Linux, the media_server.conf file is located in /etc/lumenvox/.

They may also be set with the appropriate header as part of a RECOGNIZE or SET-PARAMS method; see Specifying Vendor-Specific Properties via MRCP Headers below.

See Configuration Parameters for more information about changing various MRCP parameters.
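As a rough illustration, a deployment script might pick the default location of media_server.conf per platform as sketched below. The helper name is hypothetical and not part of any LumenVox tooling, and the paths are only the defaults quoted above; a custom installation folder would differ.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Default config directories quoted in this article; a custom install may differ.
DEFAULT_WINDOWS_CONFIG_DIR = PureWindowsPath(r"C:\Program Files\Lumenvox\config")
DEFAULT_LINUX_CONFIG_DIR = PurePosixPath("/etc/lumenvox")


def default_media_server_conf(system_name):
    """Return the default media_server.conf path for 'Windows' or 'Linux'.

    Hypothetical helper for illustration only.
    """
    if system_name == "Windows":
        return DEFAULT_WINDOWS_CONFIG_DIR / "media_server.conf"
    if system_name == "Linux":
        return DEFAULT_LINUX_CONFIG_DIR / "media_server.conf"
    raise ValueError(f"no default location known for {system_name!r}")
```

In practice the argument would typically come from platform.system(), which returns "Windows" or "Linux" on those systems.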

wind-back-time

The length of audio wound back at the beginning of voice.

This helps when speech has a weak onset. The value is specified in milliseconds. Its resolution is 40 ms, and it is rounded to the closest multiple of 40 ms: setting this value to 139 ms is the same as setting it to 120 ms, and setting it to 141 ms is the same as setting it to 160 ms.

Range: >0

See STREAM_PARM_VAD_WIND_BACK in the LumenVox API documentation for more details.

Default: 480
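The rounding described for wind-back-time can be sketched as follows. This is only an illustration of the arithmetic, not LumenVox code; the function name is hypothetical, and the handling of an exact midpoint such as 140 ms is an assumption, since the description above does not cover ties.

```python
def effective_wind_back_ms(requested_ms, resolution_ms=40):
    """Round a requested wind-back-time to the nearest multiple of the resolution.

    Assumption: an exact midpoint (for example 140 ms) rounds up; the article
    does not say how ties are handled. Expects whole milliseconds (int).
    """
    if requested_ms <= 0:
        raise ValueError("wind-back-time must be greater than 0")
    # Adding half the resolution before the floor division rounds to nearest.
    return ((requested_ms + resolution_ms // 2) // resolution_ms) * resolution_ms


print(effective_wind_back_ms(139))  # 120
print(effective_wind_back_ms(141))  # 160
```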

snr-sensitivity-lvl

This setting controls the minimum signal-to-noise ratio (SNR) that streamed audio must have before it is examined as possible speech. Audio below this threshold is automatically assumed to be silence or noise. The noise estimate used in the calculation is obtained from the initial silence specified by STREAM_PARM_VAD_STREAM_INIT_DELAY. The higher the SNR threshold, the harder it is to barge in. The default value of 50 equals 5 dB SNR, and the parameter range is mapped to between 3.5 dB and 20 dB. If the application is expected to run in a very noisy environment where speech is not much louder than the background, the threshold may need to be lowered. If speech is expected to be much louder than the surrounding noise, raising the threshold allows the VAD to ignore lower-volume background speech or babble noise that might otherwise cause barge-in.

Note that this parameter can be set in the range 0-100, with higher values (closer to 100) being more sensitive to barge-in in noisy situations with low SNR (where speech and background noise are similar).

Range: 0-100

See STREAM_PARM_VAD_SNR_SENSITIVITY in the LumenVox Core API reference documentation for more details. Note that the scale of the LumenVox API setting (0 is most sensitive) is the opposite of the snr-sensitivity-lvl scale (100 is most sensitive). This vendor-specific setting should also not be confused with the similar MRCP Sensitivity-Level header, which affects the STREAM_PARM_VAD_VOLUME_SENSITIVITY setting in the API.

Default: 50

vad-stream-init-delay

The length of audio (in milliseconds) that the VAD module uses to estimate the acoustic environment.

Accurate VAD depends on a good estimate of the acoustic environment. The VAD module uses the first few frames of audio to estimate characteristics of that environment, such as the noise level. The length of this period is defined by this parameter.

Range: >0

See STREAM_PARM_VAD_STREAM_INIT_DELAY in the LumenVox API documentation for more details.

Default: 100

vad-bargein-threshold

VAD speech sensitivity setting.

A higher value means that the VAD must be more certain that the data is speech before triggering barge-in. Raising the value will reject more false positives and noise. However, it may also mean that some borderline speech is rejected. This value should not be changed from the default without significant tuning and verification.

Range: 0 - 100 (MRCP v1 and MRCP v2)

See STREAM_PARM_VAD_BARGEIN_THRESHOLD in the LumenVox API documentation for more details.

Default: 50
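For the four VAD-related parameters described so far, a client could check values against the documented ranges before sending them. The sketch below is a minimal example under that assumption; the table and function names are hypothetical, and only the ranges and defaults come from this article.

```python
# Range checks as listed in this article for the VAD-related parameters.
VAD_PARAMETER_CHECKS = {
    "wind-back-time": lambda v: v > 0,
    "snr-sensitivity-lvl": lambda v: 0 <= v <= 100,
    "vad-stream-init-delay": lambda v: v > 0,
    "vad-bargein-threshold": lambda v: 0 <= v <= 100,
}

# Defaults as listed in this article.
VAD_PARAMETER_DEFAULTS = {
    "wind-back-time": 480,
    "snr-sensitivity-lvl": 50,
    "vad-stream-init-delay": 100,
    "vad-bargein-threshold": 50,
}


def validate_vad_parameters(params):
    """Return a list of problems; the list is empty if every value is in range."""
    problems = []
    for name, value in params.items():
        check = VAD_PARAMETER_CHECKS.get(name)
        if check is None:
            problems.append(f"{name}: not a VAD parameter covered here")
        elif not check(value):
            problems.append(f"{name}: value {value} is out of range")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every out-of-range value in a single pass.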

compatibility_mode

Enables compatibility encoding of results.

This option may need to be enabled to match the output of LumenVox decodes with those of other vendors.

Please contact LumenVox support for more specific details.

Default: 0

end-of-speech-timeout

Controls the end-of-speech timeout setting.

This value affects the underlying STREAM_STATUS_END_SPEECH_TIMEOUT of the speech port, which is used in an MRCP ASR recognition session.

After barge-in, the streaming interface will flag STREAM_STATUS_END_SPEECH_TIMEOUT if it did not detect end-of-speech within the time specified by this property. This is different from the end-of-speech delay: STREAM_PARM_END_OF_SPEECH_TIMEOUT represents the total amount of time a caller has to speak after barge-in is detected.

See STREAM_STATUS_END_SPEECH_TIMEOUT in the LumenVox Stream Properties documentation for more details.

Default: -1 (infinite)

secure_context

Enables suppression of potentially sensitive ASR data. 

When enabled, this option will prevent logging of any potentially sensitive data to either log files or callsre data files, which includes any associated audio segments. Where potentially sensitive data would have appeared, the word _SUPPRESSED will replace the potentially sensitive data to indicate that suppression occurred.

This functionality was introduced with LumenVox version 11.0.300 (November 2012) as part of our ongoing enhancements to support secure application development.

Possible Values:

  • 0 - Disabled. Normal logging will be performed
  • 1 - Secure Context mode enabled. Sensitive data will be suppressed

Default: 0

tts.secure_context

Enables suppression of potentially sensitive TTS data.

When enabled, this option will prevent logging of any potentially sensitive data to either log files or callsre data files, which includes any associated audio segments. Where potentially sensitive data would have appeared, the word _SUPPRESSED will replace the potentially sensitive data to indicate that suppression occurred.

This functionality was introduced with LumenVox version 11.0.300 (November 2012) as part of our ongoing enhancements to support secure application development.

Possible Values:

  • 0 - Disabled. Normal logging will be performed
  • 1 - Secure Context mode enabled. Sensitive data will be suppressed

Default: 0

Specifying Vendor-Specific Properties via MRCP Headers

As mentioned previously, you may specify the above parameters in an MRCP header. You must use the following format. Note that a semicolon (";") is used as the delimiter:

Vendor-Specific: com.lumenvox.wind-back-time=300;com.lumenvox.vad-stream-init-delay=200
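A client could assemble a header line in this format along the following lines. This is a minimal sketch: the function name is hypothetical, and it only joins name=value pairs with the semicolon delimiter shown above; it does not validate the names or values.

```python
def build_vendor_specific_header(params, prefix="com.lumenvox."):
    """Format name/value pairs as a single Vendor-Specific header line.

    Pairs are emitted in the dict's insertion order and joined with ';'.
    """
    pairs = [f"{prefix}{name}={value}" for name, value in params.items()]
    return "Vendor-Specific: " + ";".join(pairs)


header = build_vendor_specific_header(
    {"wind-back-time": 300, "vad-stream-init-delay": 200}
)
print(header)
# Vendor-Specific: com.lumenvox.wind-back-time=300;com.lumenvox.vad-stream-init-delay=200
```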

This header field may be specified in a RECOGNIZE, recognizer SET-PARAMS, or synthesizer SET-PARAMS method during an MRCP session. The following header field names may be used:

com.lumenvox.wind-back-time
com.lumenvox.snr-sensitivity-lvl
com.lumenvox.vad-stream-init-delay
com.lumenvox.vad-bargein-threshold
com.lumenvox.compatibility-mode
com.lumenvox.end-of-speech-timeout
com.lumenvox.secure_context
com.lumenvox.tts.secure_context
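When debugging, it can help to split such a header back into its fields and check each name against the list above. The sketch below assumes exact, case-sensitive matching of the field names as they are listed here, which is an assumption of this example rather than documented behavior; the function name is hypothetical.

```python
# Field names exactly as listed in this article.
KNOWN_FIELD_NAMES = {
    "com.lumenvox.wind-back-time",
    "com.lumenvox.snr-sensitivity-lvl",
    "com.lumenvox.vad-stream-init-delay",
    "com.lumenvox.vad-bargein-threshold",
    "com.lumenvox.compatibility-mode",
    "com.lumenvox.end-of-speech-timeout",
    "com.lumenvox.secure_context",
    "com.lumenvox.tts.secure_context",
}


def parse_vendor_specific_header(line):
    """Split a Vendor-Specific header line into a {field name: value} dict."""
    label, _, body = line.partition(":")
    if label.strip() != "Vendor-Specific":
        raise ValueError("not a Vendor-Specific header")
    fields = {}
    for pair in body.strip().split(";"):
        if not pair:
            continue  # tolerate a trailing semicolon
        name, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in {pair!r}")
        name = name.strip()
        if name not in KNOWN_FIELD_NAMES:
            raise ValueError(f"unknown field name {name!r}")
        fields[name] = value.strip()  # values are kept as strings
    return fields
```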

