Pre-10.5 Release Notes
Reference Number: AA-02008
These are the release notes for LumenVox versions prior to 10.5.
10.4.500 (May 25, 2012):
Improvements and Major Changes:
- Changed ASR behavior to allow well-formed grammars that have a zero vocabulary size. This was previously prevented to avoid the use of malformed grammars; however, such grammars have legitimate uses when referenced from other base grammars, so they are now permitted.
- Changed SSML behavior to allow say-as with or without the previously required vxml: namespace when interpret-as uses boolean, digits, currency, number or phone. This change means that the vxml: prefix is now optional for those interpret-as values
- Significant modification to synthesis STOP processing. Previously only the active SPEAK request (if any) was stopped. Now only requests specified by the client (in the optional Active-Request-Id-List) will be stopped. If the client does not specify any request-id, all active and pending SPEAK requests will be stopped and the reply Active-Request-Id-List will indicate which requests were stopped. This applies to both MRCPv1 and MRCPv2.
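The revised synthesizer STOP rule can be sketched as a small selection function. This only illustrates the behavior described above and is not Media Server code; the function name `select_requests_to_stop` and the use of plain integer request IDs are assumptions made for the example.

```python
def select_requests_to_stop(outstanding_ids, requested_ids=None):
    """Return the SPEAK request IDs that a STOP would halt, in queue order.

    outstanding_ids: IDs of the active and pending SPEAK requests.
    requested_ids:   IDs from the optional Active-Request-Id-List header,
                     or None/empty when the client did not send one.
    """
    if not requested_ids:
        # No list supplied: every active and pending request is stopped.
        return list(outstanding_ids)
    wanted = set(requested_ids)
    # List supplied: only the named requests that are still outstanding stop.
    return [rid for rid in outstanding_ids if rid in wanted]
```

With requests 1, 2 and 3 outstanding, a STOP naming request 2 stops only that request, while a STOP with no list stops all three.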
Minor Changes and Fixes:
- Added better support for recognizer STOP handling. Previously only the active recognize task would be stopped. Now a wildcard STOP of all active and pending RECOGNIZE requests is performed any time a STOP is received, even if that STOP request names specific requests only. This limitation should not affect any customers.
- Changed Media Server processing of SDP to enable c= connection information at the session level as well as allowing override at media level.
- Fixed a Media Server problem when processing synthesis STOP requests in MRCPv1 if the specified resource URL is incorrect. Technically having mismatched URLs here is invalid; however, the LumenVox Media Server applies heuristics to determine the type of resource being requested, and in this one case the detection method was incorrect.
- Fixed a minor issue in Media Server to help stability when operating under extreme load
- Fixed a minor issue in MultiThreadedStreamingCExample to help support accented characters if detected by applying locale setting
- Fixed a minor TTS synthesis issue where WaitForSynthesis would fail to indicate completion if thread timing of this call overlapped the synthesis call itself
- Fixed a TTS synthesis issue where the Media Server would fail to return SPEAK-COMPLETE if the synthesis returned a zero length audio result
- Fixed a TTS synthesis issue where attempting to synthesize an SSML document containing only a period would result in zero length audio. Now 500ms of silence is played if this is detected. This fix is for the TTS1 engine only; TTS2 already had this feature.
10.4.300 (March 19, 2012):
Minor Changes and Fixes:
- TTS Server now logs SSML event when verbosity set to 2 (verbosity 3 was previously required)
- Fixed a bug when processing TTS1 lexicon references when using SSML 1.0
- Fixed a rare bug specific to CentOS6, 64-bit Operating System, which resulted in unexpected decode results due to an incompatible memory move operation.
- Logic relating to suppression of more than 10 critical errors per hour was buggy. Now fixed.
10.4.200 (February 1, 2012):
Minor Changes and Fixes:
- Changed MultiThreadedStreamingCExample to allow running of AMD decodes.
- Re-enabled switching of voices in SSML which was a regression in 10.4.100
10.4.100 (January 20, 2012):
Improvements and New Features:
- Added new American English acoustic models with improved accuracy and performance
- Logging Verbosity settings have been implemented across all products. These settings can be set at startup within the corresponding product configuration files. At runtime, this setting can be changed using the LumenVox Dashboard product, which also has the option of making the changes temporary or persistent beyond service restart. The default value is set as LOW, which means only errors and warnings will be logged. Users are encouraged to set their desired level of logging verbosity. Note that maximum logging can adversely affect system performance when large numbers of operations are being processed on the system. In other words, reducing logging verbosity can improve throughput performance
- Added new PROP_EX_LOGGING_VERBOSITY option to SetPropertyEx and GetPropertyEx allowing users to adjust client logging verbosity at run time
- Changed LumenVox Dashboard to allow configuration of logging verbosity and viewing of service configuration settings. This can be accessed by right-clicking on the service in Summary view, and selecting the configure menu option. Selecting the 'make changes permanent' checkbox will apply configuration changes to the startup configuration files
- Added new Vendor-Specific-Parameters: com.lumenvox.compatibility-mode option, allowing compatibility mode to be selected or deselected by using SET-PARAMS (etc.) without the need to restart the Media Server.
- Added new Australian English TTS voice
- Added statistical performance tracking option to TTS Server. This information will be periodically recorded in tts_server_status.txt located in the logs folder
- Added new waveform_url_prefix configuration option to Media Server, allowing users to specify the prefix for returned Waveform-URI references. This may be useful when configurations expose these files from a local HTTP server
- Added Media Server support for wildcard option to GET-PARAMS requests for recognizer and synthesizer resources for both MRCP V1 and V2 for better specification compliance
- Added LV_SRE_SetClientPropertyExPermanent and LV_SRE_CommitClientPropertySettings to client API, along with C++ equivalents. These can be used to permanently commit config changes to configuration files on disk, so that they persist beyond service restart. The only setting that is supported by this mechanism at present is LOGGING_VERBOSITY, however more support will be added in future versions
- Windows installation now checks for required administrator privileges in Windows 7, 2008 where UAC is required and if not present, generates a suitable message in place of the cryptic message in previous versions
- Changed the User Agent string returned by Media Server to show version number information. In MRCPv1, this string is now returned in the session name header (s=)
- Changed French pronunciation rules to prevent empty word phonetic pronunciations
- When processing GrXML grammars, the xmlns attribute is now considered optional and the value of "http://www.w3.org/2001/06/grammar" will be implied if not supplied. Previously we required this value to be non-optional with all GrXML grammars
- Overall inter-component communication was improved to increase performance and reduce connectivity issues
- SimpleTTSClient now has default values for language, voice and gender, making it easier to use. The SimpleTTSClient code was also modified to clarify the process flow so that it is easier for users to follow and implement their own functionality
- SimpleTTSClient and SimpleSREClient now provide more verbose feedback, allowing users to clearly see what types of issues they have, rather than returning a numeric error code
- Improved performance of Media Server to increase throughput when large numbers of simultaneous sessions were being processed. This improves overall SRE and TTS performance and increases the speed at which RTSP and SIP sessions are created and destroyed
- Modified TTS Server to support all voices by default, without the need to configure it into one of two modes as before. The corresponding TTS_ENGINE setting in configuration no longer needs to be used and is now ignored because of this change
- Performance of TTS Server was improved by removing a single-threaded code bottleneck which would only become apparent under significant load. This portion of code was redone to avoid such a bottleneck, thus improving performance when under significant load
- Commas that were present in GrXML grammars were previously escaped with double-quotes, which was incorrect. Commas are now discarded as part of the grammar compilation process. This should have no adverse effect on customers since commas do not have any acoustic reference
- Changed Media Server to add more verbose logging for 'Busy Here' events
- Changed the Media Server behavior for recognition-timeout. Now works more according to spec, using its own timer instead of just setting the EndOfSpeechTimeout in the streaming interface. EndOfSpeechTimeout can now be set separately, defaulting to the private value in the media_server.config file.
- Note that due to this modified behavior, users may experience "no-match-maxtime" earlier than in previous LumenVox versions. In these cases, please configure maxspeechtimeout to the desired value, which will be dialog dependent.
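For the GrXML xmlns change listed above, the following sketch shows what "implied if not supplied" means in practice. It uses Python's standard `xml.etree.ElementTree` rather than the LumenVox grammar compiler, and the helper name `effective_namespace` and the two sample grammars are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace value quoted in the release note.
GRXML_NS = "http://www.w3.org/2001/06/grammar"

# Two hypothetical grammars: one declares xmlns, the other omits it.
WITH_XMLNS = (
    '<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0" root="r">'
    '<rule id="r">yes</rule></grammar>'
)
WITHOUT_XMLNS = '<grammar version="1.0" root="r"><rule id="r">yes</rule></grammar>'


def effective_namespace(grxml_text):
    """Return the root element's namespace, implying GRXML_NS when none is declared."""
    root = ET.fromstring(grxml_text)
    if root.tag.startswith("{"):
        # ElementTree stores a namespaced tag as "{namespace}localname".
        return root.tag[1:].split("}", 1)[0]
    return GRXML_NS
```

Both sample grammars resolve to the same namespace, which is the effect the change describes.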
Minor Changes and Fixes:
- The default value for the trim_silence setting in client_property.conf has been changed from 990 to 970 to correspond with revised acoustic model performance
- In Linux, the permissions applied to log and cache folders have been changed to allow better compatibility when non-root users attempt to access these. The folders are now created with 755 for the parent folders and 777 for child folders. Previously all folders were created with 755 permission, which prevented non-root users access, since execute permission is required to list and traverse folders
- Changed Media Server behavior to delay decodes until all associated grammars have been fully loaded and activated. This change avoids the recognizer returning early from a RECOGNIZE when the client had not correctly waited for completion of DEFINE-GRAMMAR requests before issuing a RECOGNIZE request, which could result in no-match
- Removed enable_logging configuration option from Media Server configuration since we now have logging verbosity which offers better control
- Removed the no longer used optional media_server_log configuration item
- Removed validation of dtmf-term-char for less strict (more flexible) performance
- Fixed a bug where the Media Server incorrectly responded to OPTIONS requests with 0= (using zero) instead of o= (using oh) for the reply headers, which was not compliant
- Project settings for the sample MultiThreadedStreamingCExample code now avoids unsupported Unicode and debug library references
- SSML processing of ampersand characters has been improved again to allow better handling of escaped and unescaped instances when they are encountered
- Fixed a problem where certain port-scanner software would cause an exception within the Media Server
- Fixed a problem in Media Server where invalid negative timestamps were being reported for some packets
- Fixed the critical log emailer application where it was not parsing the log entries correctly which resulted in the emails not being sent out.
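The SSML ampersand handling mentioned in the list above has to treat already-escaped and bare ampersands differently. One common way to do that is sketched below; it is not the TTS Server's implementation, and the function name `escape_bare_ampersands` is hypothetical.

```python
import re

# Matches "&" only when it does NOT begin a named, decimal, or hexadecimal
# character reference such as "&amp;", "&#38;" or "&#x26;".
_BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)")


def escape_bare_ampersands(text):
    """Escape bare ampersands and leave existing character references untouched."""
    return _BARE_AMP.sub("&amp;", text)
```

Because existing references are skipped, running the function twice on the same text gives the same result as running it once.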
10.3.200 (November 7, 2011):
Minor Changes and Fixes:
- Fixed SSML preparser problem when prosody rate is specified with a decimal value. This was previously not being handled correctly.
- Fixed a problem affecting Windows installations only, which prevented correct upgrade from a previous version of the Speech Engine due to incorrect InstallShield product code setting.
- Minor Fix to SimpleTTSClient application to prevent possible buffer overrun.
10.3.100 (November 1, 2011):
Improvements and New Features:
- Added new Answering Machine Detection algorithms, replacing previous versions and with significant improvements in accuracy with reduced CPU overhead.
- Added support for 8 new TTS voices across 4 new languages:
- German Male
- German Female
- European French Male
- European French Female
- Castilian Spanish Male
- Castilian Spanish Female
- North American Spanish Male
- North American Spanish Female
Licensing and use of these new voices is similar to previous languages and voices
- Added new routines that were in the C API, but not in the C++ API
- GetAvailableLanguageCount
- GetAvailableLanguageIndex
- GetCallGuid
- GetDecodeAcousticModel
- GetLogFileName
- GetPhoneticPronunciation
- GetPhoneticPronunciationCount
- SetCustomCallGuid
These are now members of LVSpeechPort class, and documentation can be found here: Core API Introduction
- Consistent with en-US, there are now definitions for the pronunciation of numerals in en-IN, en-AU and en-GB
- Added support for 64-bit RHEL6 / CentOS6 (no 32-bit support of this Operating System is planned). Removed support for RHEL4 / CentOS4 and Debian
- Changes to Media Server to accommodate clients that do not wait for grammar loads or response from RECOGNIZE before streaming audio and DTMF. Any decoder or DTMF activity needed while waiting for grammar loads to complete will be postponed to prevent unpredictable results due to incorrect grammar(s) being active
- SimpleSREclient and SimpleTTSclient now have symbolic links in /usr/bin and /etc/lumenvox on Linux systems. The data files ABNFDigits.gram and 8587070707.pcm are now copied into /etc/lumenvox, making them easier to access for customers. The symbolic links are SimpleSREClient and SimpleTTSClient. The original files can still be found in /usr/share/doc/lumenvox/client/examples/
- The Speech Engine now allows mixing of grammar tag-formats. Previously any such mixing would result in undesirable semantic interpretation and was not supported
- Minor speed improvements when processing grammars
- Changed internal VAD settings to be less sensitive to low volume speech using the default settings. This change improves false barge-in accuracy slightly
- Media Server now performs better parsing of 'Content-Type' when grammars are specified to avoid problems when optional (valid or invalid) parameters are specified
- New options were added to the Dashboard File menu to be more intuitive for new users when adding, changing or removing machines
- Dashboard now keeps track of selected log type when switching machines in log view to avoid the need to keep re-selecting it
- Dashboard now automatically refreshes the status view values every few seconds
- Centralized logging now trims or pads the third field to a fixed length for better readability
- Updated the internal architecture of the TTS server for better performance. This change should not be readily noticeable by users, but allows better flexibility for more languages and voices as they become available
- Various improvements to the speech engine, including optimizations to the memory management algorithms
- Header file comments have been improved for the API functions, correcting out of date information and including newly added functionality and options. This should be consistent with the recent website documentation updates
- The Call Indexer port is now configurable, with 7595 being the new default instead of 50800. This change coincides with similar changes to the Speech Tuner, which can now specify the port number for Call Indexers as needed
- Various changes to allow client applications a faster shutdown
- Better support when converting GrXML to ABNF grammar formats when dealing with DTMF and punctuation characters
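The fixed-length log field mentioned in the list above (centralized logging trims or pads the third field) amounts to a trim-then-pad step. A minimal sketch follows; the width of 16 characters and the name `fix_width` are assumptions, since the notes do not state the actual field length.

```python
def fix_width(field, width=16):
    """Trim or right-pad a log field with spaces so it is exactly `width` characters."""
    # Slicing trims long values; ljust pads short ones.
    return field[:width].ljust(width)
```

Every value then occupies the same number of columns, so the fields that follow line up when the log is read.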
Improvements and Major Changes:
- Changed severity of startup checks to critical from error, since these prevent ASR engine startup from taking place
- Previous TTS language packs defining es-LA (Latin America) have been changed to a more correct es-MX (Mexico) language definition to add better clarification for the various Spanish dialects now supported
- Minor changes to Dashboard menu and tooltips to have more consistent wording
- Following some internal changes to built-in grammar management, Mexican Spanish built-in grammars will now be distributed with ASR packages in addition to American English
Minor Changes and Fixes:
- Fixed a problem where the Dashboard would fail to restart the ASR engine on Linux machines due to an internal permissions issue
- Fixed a problem in the Speech Tuner when opening a grammar file from Explorer. Previously a confusing and incorrect message would ask whether you wanted to delete all current data, and the grammar would not be loaded. This fix also applies to SSML files.
- Fixed TTS processing to correctly escape ampersands in the SSML. These were previously causing synthesis problems
- Fixed Media Server to allow for optional quotes around boundary specifier for multipart/mixed grammar specifications in accordance with RFC1521
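The boundary fix above concerns two equivalent spellings of the same Content-Type parameter, with and without quotes. The sketch below shows both forms resolving to the same boundary, using Python's standard `email` package rather than the Media Server's own parser; the helper name `multipart_boundary` and the boundary value `gram-1` are made up for the example.

```python
from email.message import Message


def multipart_boundary(content_type):
    """Return the boundary parameter of a Content-Type value, with any quotes removed."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_boundary() returns the parameter value unquoted.
    return msg.get_boundary()
```

A parser that accepts only one of the two forms will fail to split the multipart body for clients that send the other, which is the problem the fix addresses.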
10.2.900 (October 13, 2011):
Improvements and Major Changes:
- This is a significant maintenance release which fixes a timer wraparound problem. The problem occurs when the tick counter wraps around, which happens when system uptime reaches 49 days 17 hours and 2 minutes. Client applications using the Speech Port could encounter an exception at this time.
- The TTS server shipping in 10.2.900 does not correctly perform an upgrade. You must uninstall any previous version before installing the newest. This is because there was a change to the Service Name in the installation package and the previous versions prevented the service from being removed from the command line. If you encounter installation errors 1923 or 1920, you should perform an uninstall before installing the newer version. Future versions should not encounter this problem, once older versions have been uninstalled
10.1.700 (October 7, 2011):
Improvements and Major Changes:
- This is a significant maintenance release which fixes a timer wraparound problem. The problem occurs when the tick counter wraps around, which happens when system uptime reaches 49 days 17 hours and 2 minutes. Client applications using the Speech Port could encounter an exception at this time.
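The 49 days 17 hours 2 minutes figure matches the point at which a 32-bit millisecond tick counter overflows (2^32 ms). The notes do not name the counter, so the 32-bit millisecond width used below is an inference from that interval. The sketch works out the arithmetic and shows one wrap-safe way to compute elapsed time; both function names are hypothetical.

```python
# Assumed counter: unsigned 32-bit, counting milliseconds since boot.
TICK_WRAP_MS = 2 ** 32


def uptime_at_wrap(wrap_ms=TICK_WRAP_MS):
    """Break the wrap interval into (days, hours, minutes, seconds)."""
    seconds, _millis = divmod(wrap_ms, 1000)
    days, rem = divmod(seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    return days, hours, minutes, secs


def elapsed_ms(start_tick, now_tick, wrap_ms=TICK_WRAP_MS):
    """Elapsed ticks between two readings, correct across a single wraparound."""
    return (now_tick - start_tick) % wrap_ms
```

`uptime_at_wrap()` gives 49 days, 17 hours, 2 minutes and 47 seconds. A plain `now_tick - start_tick` goes negative when the counter wraps between the two readings, whereas the modular form in `elapsed_ms` still returns the true interval.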
10.2.800 (September 22, 2011):
Minor Changes and Fixes:
- Minor fix to Media Server when reporting IN-PROGRESS for INTERPRET requests. This was previously sent as an event instead of a response type packet
10.2.700 (September 13, 2011):
Minor Changes and Fixes:
- Minor fix to Media Server processing where INTERPRET requests were being made quickly before a previous DEFINE-GRAMMAR request was in progress, thus causing a sequencing problem.
- Minor fix to Media Server when performing INTERPRET with DTMF grammars, which previously returned an empty success result instead of no-match. Also changed the input mode to correctly return dtmf in these cases too
10.2.600 (August 25, 2011):
Minor Changes and Fixes:
- Minor fix to number of days to/from license expiration being displayed in Dashboard (math error)
- Minor fix to configuration file generation following installation in connection with Manager product
- Fix to license type switching when used with AMD mode. Problems were seen when repeatedly entering and leaving AMD mode within the same session
- Minor coding changes to avoid valgrind warnings
10.2.500 (August 5, 2011):
Improvements and New Features:
- Minor improvement to AMD detection accuracy
10.2.400 (August 3, 2011):
Minor Changes and Fixes:
- Fixed a problem with RTP stream timestamp from TTS synthesis streams to reduce calculated jitter.
- Minor fix to correctly release cached license from a non-primary server. It is unlikely customers would encounter this issue in the field, but if they did its effect would be minimal
- Changed recent message when locales are not present from an error to a warning
10.2.200 (July 27, 2011):
Improvements and New Features:
- Added better checks for locale information during ASR startup
- Added better auto-detection of UTF-8 strings in TTS input
10.2.100 (July 22, 2011):
Improvements and New Features:
- Added a new LumenVox Dashboard application to monitor various LumenVox services, statistics and event logs (on the local machine, remote machines, or both). All license maintenance functions can be performed using the Dashboard application. This change also includes the addition of a new LumenVox Manager service, which communicates with the existing LumenVox services and acts as an interface to the Dashboard. The Dashboard is a Windows-only GUI tool but the Manager service runs in both Windows and Linux so the Dashboard can monitor and manage both Windows and Linux machines.
- As part of our ongoing effort to improve logging, we have implemented many changes to the logging mechanism, allowing logs to be remotely accessed (via Dashboard). This process also included renaming logs; see the table below for more details. Log entries now contain severity information for each event, and filtering of these events is possible when viewing them in the Dashboard. For example, reviewing only critical errors and warnings is now easily possible.
- As part of our ongoing effort to improve statistics reporting, we have implemented new statistics tracking mechanisms within the ASR Server and TTS Server. These are in addition to the statistics tracking mechanisms previously added to Media Server and License Servers. These statistics can be viewed using the new Dashboard application, or by reviewing the individual application statistics logs.
- Revised memory management improves overall performance, making the most of available system resources. This change has the effect of reducing the overall memory consumption of all LumenVox applications. In test conditions, a dramatic reduction in memory use was observed, at times using less than 40% of the memory used by previous versions for the same load tests.
- Added SSML pre-parser to the TTS server, which now attempts to determine whether any unexpected elements are present within the SSML passed in from user applications. If any are detected, an attempt will be made to determine the user's intent. If the intent can be determined, the SSML will be updated accordingly. If no sense can be made of that portion of the request, it will be ignored. This is in response to user requests for a more forgiving SSML parsing mechanism. Now the server will make an effort to interpret any SSML sent and will only respond with an error as a last resort. Note that any substitutions or deletions will be recorded in the TTS server log.
- Added the option of displaying pre-parsed and modified SSML in the Speech Tuner via the SSML properties dialog.
- Answering machine detection code has been reworked to provide significantly faster response when detecting beeps. Typical improvements seen were from 622ms in version 10.1 to a response time of 96ms in 10.2. In addition, new methods of beep detection were added to improve efficiency.
- Fax tone detection was added to the Answering Machine Detection module.
- Added support for viewing answering machine/fax detection events through the Speech Tuner.
- Acoustic models for UK English Digits, Mexican Spanish Digits and Indian English Digits were retrained and updated to provide better overall performance.
Improvements and Major Changes:
- Renamed logs in accordance with the new centralized logging mechanism. All log files should now share a common naming convention of component_name_log_type.txt. New and old log filenames are listed below:
Old Log Name | New Log Name
LVStatus_TTSServer.log | tts_server_status.txt
LVStatus_SREServer.log | asr_server_status.txt
LVStatus_LVLicenseServer.log | license_server_status.txt
LVStatus_MediaServer.log | media_server_status.txt
LVApp_SRE.log | asr_server_app.txt
SpeechServerLog.txt | asr_server_app.txt
LVApp_SpeechPort.log | client_asr.txt
--- | client_tts.txt
--- | client_license.txt
LVApp_LicenseServer.log | license_server_app.txt
LvMediaServer.txt | media_server_app.txt
LVApp_Tuner.log | tuner_app.txt
LVApp_Tuner.log | call_indexer_app.txt
LVApp_TTSServer.log | tts_server_app.txt
--- | dashboard_app.txt
--- | manager_app.txt
GrammarLoadLog.txt | asr_server_grammar.txt
LVApp_Critical.log | lumenvox_critical.txt
LVApp_MediaServer_Restart.log | media_server_restart.txt
LVApp_SREServer_Restart.log | asr_server_restart.txt
LVApp_LicenseServer_Restart.log | license_server_restart.txt
LVApp_TTSServer_Restart.log | tts_server_restart.txt
--- | manager_restart.txt
- The default wind-back-time for decodes has been changed from the previous value of 325ms to 480ms. This change is due to resolution improvements in Voice Activity Detection code.
- Both LvMediaServerMonitor and LVLicenseAdministrator applications will no longer be supported after version 10.2. All of the functionality these applications provided is now handled by the LumenVox Dashboard. Migration across to this new application is therefore encouraged.
- Media Server configuration settings are now initialized to a setting of 'default' which will use the default value. This saves users from remembering the default value, and may also be useful in future to identify non-default settings.
- Removed definitions for unused DeactivateGrammar(int index) and GetUtteranceScore from LVSpeechPort.h header file, since these functions no longer exist.
- Minor change to decoder processing when presented with an invalid language as part of a decode request. Logging will now be much clearer what exactly the problem is.
- The phonetic speller tool within Speech Tuner has been improved to maintain a history of phrases, which can now be copied and pasted into external applications as needed.
- Result strings have generally been changed from 'No Error' to 'Success' so that searching for the word 'error' in log files is more productive.
- Minor change to the response timing of internal messaging mechanism to better respond to dropped connections.
- AMD functionality is now exposed via the regular API mechanism. Previously this was not a primary method of accessing AMD features.
- The SimpleTTSClient application was modified to be more flexible and also provide more feedback to users when errors occur, such as when TTS server is not available.
- Minor change to add more post-allocation memory checks to help prevent problems when running into low memory situations.
Minor Changes and Fixes:
- A minor change to the Media Server code was made to accommodate dtmf-char=none, which was previously expecting a single DTMF character value.
- Fixed a minor bug in Speech Tuner where, when users rapidly switched between calls and then deselected all calls in the browser view, some interactions were incorrectly shown as unprocessed.
- Fixed a bug where the Speech Tuner audio display was incorrectly scaling very long audio clips, which caused the tick marks to appear too close together. There was also a minor rounding issue in the same section of code, making the ticks appear slightly offset occasionally. This too was fixed.
- Fixed a bug in Speech Tuner where flagging a number of interactions, then filtering them out and subsequently unfiltering them caused the flag marker for these to be cleared, which was incorrect behavior
- Fixed the Linux installer to auto-create the configuration file for the call indexer. Previously this would not have been created until the first run of the call indexer, which is not expected behavior.
- Fixed problem with command line installation/removal of the TTS server. Previously installing and removing of the server using the InstallShield utility was the only way to perform the operations, which is not expected behavior.
- Fixed a problem with the Call Indexer, which was recording log events to the Speech Tuner's log file instead of its own.
- Fixed a problem in Media Server when parsing a 'simple' RTSP SETUP request with transport line ending with the client port number. This was previously being rejected.
- Modified Media Server configuration to use a default of 480ms for wind back time instead of 325ms. Note that this change should have no effect on customers, and is a result of internal resolution refinement.
10.1.600 (July 13, 2011):
Minor Changes and Fixes:
- Fixes a problem loading and activating en-GB grammars that was introduced with 10.1.100 - all users running en-GB should consider upgrading to this version from other 10.1 releases
10.1.500 (June 20, 2011):
Improvements and Minor Changes:
- This is a maintenance release to resolve a minor packaging issue in 10.1.400 and should be used in place of 10.1.400. The packaging problem related to a conflict with the xulrunner package in CentOS5 64-bit only
10.1.400 (June 1, 2011):
Improvements and Minor Changes:
- Speech Tuner now better handles malformed GrXML when files are being loaded, and reports such errors.
- Removed unused 'engine' logs folder in Linux. All Speech Engine log files are now saved in the 'sre' folder.
- Improved NLSML formatting when the Media Server is in compatibility mode in order to better structure results with multiple parses and multiple n-best results.
Minor Changes and Fixes:
- Fixed rare problem on Windows where Speech Engine failed to restart correctly following a power outage which caused corruption to one of the files accessed during startup
- Fixed a Media Server problem where TTS audio streams were stopped prematurely on occasion.
- Fixed a Media Server problem where timestamps in TTS audio streams were not consistent with wall clock if different streams were started and stopped within the same session. Now the timestamps are recalculated at the beginning of each stream segment.
10.1.300 (May 24, 2011):
Do not use this version in production
10.1.200 (April 22, 2011):
Improvements and Minor Changes:
- Speech Tuner confidence histogram display was modified to display confidence at threshold when the mouse is clicked onto the histogram
- Exposed an internal Speech Engine setting, allowing large and complex grammars to be processed in specific ways. This is needed for very specific grammars that fail to compile due to their complexity. Contact LumenVox Technical Support for assistance when working with large/complex grammars for more information on this feature.
- Minor change to timing mechanism used for restarting services when installing language packs.
- Minor change to Speech Tuner TTS view which better handles audio being displayed.
- Improved Speech Engine shutting down with non-critical acoustic model loading failure.
- Improved server-side SLM caching.
- Fix for Speech Tuner when dragging and dropping many callsre files.
- Fix for Speech Tuner that caused the number of SRE interactions to not be displayed correctly in the Platforms list on the summary view when using certain operating systems, including Windows 7.
10.1.100 (April 11, 2011):
Major Improvements and New Features:
- New Voice Activity Detection (VAD 3.0) module implemented to greatly improve barge-in performance, allowing users to more accurately control the noise and speech levels when triggering barge-in. By default the new VAD 3.0 performs better at distinguishing speech from non-speech, offering better overall performance.
VAD 3.0 now uses 100ms of stream to determine noise/silence level instead of the previous value of 1000ms. This time period is also now configurable, allowing it to be tailored to individual application needs.
- The Speech Tuner has an SSML editor view, allowing text-to-speech audio to be tuned and developed within the Speech Tuner environment.
- Virtual Machine (VM) environments are now explicitly supported by the LumenVox License Server. Previous behavior when operating in a virtualized environment was unpredictable, as such environments were not officially supported. Now, with the use of a special VM license, the License Server can run in a virtualized environment provided it has outbound Internet access on port 80 to authenticate with a LumenVox license server. Please contact LumenVox Support for more details. Note that if a virtual machine is detected, the License Server will not function without this license and outbound port 80 access.
- A new acoustic model has been added, offering support for Indian English (en-IN).
Improvements and Major Changes:
- The Media Server has a revamped communications subsystem, using significantly fewer threads and offering noticeably improved throughput performance.
- The Speech Tuner has new Streaming options page, allowing various stream parameters to be configured and tried within the Speech Tuner environment.
- Added a new log called LVApp_Critical.log for tracking serious application issues that require user attention. This log should now be the first place users look when troubleshooting LumenVox issues.
- The Speech Tuner now has auto-completion of tags when editing GrXML grammars.
- Added support for Multi-Part grammars in the Media Server.
- Added option to specify Round-Robin or First-Available port allocation mode in Media Server. The new Round-Robin mode was added to offer better performance when cycling ports quickly when the system is under extreme load, avoiding CLOSE_WAIT socket issues.
- Improved performance in critical low memory situations.
- Improved Speech Engine decode throughput performance.
- Modified the default port number ranges used by Media Server to avoid problems with ephemeral ports. Here are the new recommended (default) values:
mrcp_server_port_base = 30000 (was previously 49922)
rtp_server_port_base = 35000 (was previously 50922)
monitoring_port = 29900 (was previously 39911)
- Improved License Server performance when processing large numbers of simultaneous requests.
- Modified the client licensing mechanism to offer significantly improved performance by temporarily caching released licenses, thus removing the need for a round trip to the license server if another license is requested within a short time period.
- The Speech Tuner now allows filtering and sorting in the Call Browser view, similar to that in other views.
- N-best results are now sorted by confidence score, in descending order. Previously there had been confusion surrounding the designation of the confidence scores of lower n-best results, which was caused by different methods of calculating these scores for independent results. A more unified approach to confidence scoring has been implemented for n-best results lower than the top value to eliminate this confusion.
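The difference between the two Media Server port allocation modes listed in this release can be illustrated with a small Python sketch. This is a toy model only, not LumenVox code: the PortAllocator class, its method names and the four-port range are hypothetical, and only the base value 30000 is taken from the defaults above. First-Available always hands out the lowest free port, so a port that was just released is reused immediately; Round-Robin continues from where the previous allocation left off, which gives a recently closed socket time to leave CLOSE_WAIT before its port comes around again.

```python
class PortAllocator:
    """Toy model of the two allocation modes (hypothetical, not LumenVox code)."""

    def __init__(self, base, size, mode="round-robin"):
        if mode not in ("round-robin", "first-available"):
            raise ValueError(f"unknown mode: {mode}")
        self.base = base          # first port in the range
        self.size = size          # number of ports in the range
        self.mode = mode
        self.in_use = set()
        self._next = 0            # offset where the next round-robin search starts

    def acquire(self):
        # First-available always searches from the bottom of the range;
        # round-robin resumes just past the most recently allocated port.
        start = self._next if self.mode == "round-robin" else 0
        for step in range(self.size):
            offset = (start + step) % self.size
            port = self.base + offset
            if port not in self.in_use:
                self.in_use.add(port)
                if self.mode == "round-robin":
                    self._next = (offset + 1) % self.size
                return port
        raise RuntimeError("no free ports in range")

    def release(self, port):
        self.in_use.discard(port)
```

With a range starting at 30000, acquiring, releasing and acquiring again yields 30000 twice in first-available mode, but 30000 followed by 30001 in round-robin mode.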
Minor Changes and Fixes:
10.0.1020 (Feb. 21, 2011):
Minor Changes:
- Minor release related to packaging for TTS voices. Will not affect most users.
10.0.1019 (Feb. 14, 2011):
Please note that there are a very large number of changes in the 10.0 release of LumenVox, as it represents one of our biggest releases ever. In addition to a number of fixes and improvements, we have added native 64-bit binaries to our releases, written a text-to-speech (TTS) server that is integrated with our Media Server, and made quite a few changes to our C and C++ interfaces. Customers upgrading from older versions are advised to read through these release notes carefully.
Major Improvements and New Features:
- Windows users should see our new instructions for Downloading the LumenVox products as the download process has changed.
- 64-Bit Versions of Server and Client products in both Windows and Linux are officially released. This supports full, native 64-Bit performance on supported 64-Bit operating systems (Linux Red Hat 5 64-bit/CentOS 5 64-bit, Windows Server 2008 R2, Windows 7 64-bit). Note that only Intel x64 (AMD64 mode) processors are supported (not Itanium).
When running LumenVox in 64-bit native mode, you need one of the following supported Operating Systems:
- Linux RHEL 5 / CentOS 5 (64-bit)
- Windows Server 2008 R2 (64-bit)
- Windows 7 (64-bit)
Minimum supported machine configuration:
- 8 GB Memory
- 200 GB Hard Drive
- 8 processor cores
Note that machine configurations are highly application specific, depending on things like grammar size and number of simultaneous calls. Please contact LumenVox Technical Support for assistance in determining your specific hardware needs.
- Added new LumenVox Text To Speech functionality. This has been added to the Speech Port API and also is embedded within the Media Server (both MRCPv1 and MRCPv2). The new TTS functionality is exposed in non-MRCP implementations via two new LVSpeechPort header files: LV_TTS.h (C interface) and LVTTSClient.h (C++ interface). Note that the TTS server is initially being released in 32-bit mode only. This can be run on 64-bit versions of supported versions of both Windows and Linux. The native 64-bit version will follow in the next product version. Most users should not encounter any performance limitations of this 32-bit component.
- Added TTS events to callsre response log files (when enabled). These can now be viewed in the Speech Tuner for diagnostic purposes. When using the Media Server, these log files now use the name of the active session rather than a GUID so that they can be more easily matched up later. These log files contain a combination of SRE and TTS events that happen within a single session, so the overall call flow can be visualized within the Speech Tuner and, optionally, audio from both types of events can be replayed. Correct logging of DTMF events to callsre files was also added to LumenVox Media Server.
- TTS functionality has been integrated into the LumenVox Media Server, which connects to the LumenVox TTS Server (via licensing). SPEAK, PAUSE, RESUME, SET-PARAMS and GET-PARAMS requests are serviced via MRCPv1 and MRCPv2 connectivity. MRCP CONTROL requests are not handled at this time. Of note, when plain text TTS requests are made, the TTS Server will utilize any SET-PARAMS settings that have been specified for the session (or non-standard settings from the configuration) and internally produce and handle this as a full SSML request. By contrast, SSML requests sent from external client applications (platforms) must embed any SSML settings within their SSML request string. Also note that syntax errors that are detected in the specified SSML string passed into the TTS parser are reported back to the MRCP client in the failed completion packet with details contained in a "Completion-Reason" header where possible.
- Added new GetPropertyEx API functionality to expose the values that are set using the companion SetPropertyEx API functions. This interface is available from both the C++ SpeechPort and also the C-style API interface.
- Added a new command line utility allowing grammars to be pre-loaded to the specified SRE servers via batch or script files, as was requested by some customers.
This utility is called lv_grammar_loader in Linux and GrammarLoader.exe in Windows.
Usage: GrammarLoader.exe <grammar-file> <server-ip> [timeout]
The timeout (specified in seconds) indicates the timeout for the grammar load call. It defaults to 3600 (1 hour) if none is specified.
- Added Statistical Language Model (SLM) support to the Engine. Use of SLMs requires a new "SLM" license type that includes all functionality of the Full license plus support for SLMs. Please contact LumenVox for more information about using SLM functionality.
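For the grammar pre-loading utility described above, a script might build one command line per SRE server. The following Python sketch is illustrative only: the helper names are hypothetical, it assumes one server address per invocation, and it assumes lv_grammar_loader on Linux accepts the same arguments as the GrammarLoader.exe usage line shown above. It only constructs the argument lists and does not run anything.

```python
import sys

# Documented default when no timeout is given (1 hour); shown for reference only.
DEFAULT_TIMEOUT_SECONDS = 3600


def loader_executable(platform=sys.platform):
    """Pick the utility name documented for the platform."""
    return "GrammarLoader.exe" if platform.startswith("win") else "lv_grammar_loader"


def build_loader_commands(grammar_file, server_ips, timeout=None, platform=sys.platform):
    """Return one argv list per server: <exe> <grammar-file> <server-ip> [timeout]."""
    exe = loader_executable(platform)
    commands = []
    for ip in server_ips:
        argv = [exe, grammar_file, ip]
        if timeout is not None:
            # Omitting the timeout leaves the utility on its own default.
            argv.append(str(int(timeout)))
        commands.append(argv)
    return commands
```

Each returned list could then be passed to subprocess.run() from a batch job, for example after the SRE servers have been restarted.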
Improvements and Major Changes:
- Optimized the packaging of installation packages to remove uncommonly used medium and high resolution models. Also split the installation packages into language packs, which reduces the installation download size for most customers. Please be sure to read the new installation instructions for more information on these language packages.
- The client_property.conf file has had the default value of STRICT_SISR_COMPLIANCE set to 1. This should not affect users who upgrade as your existing values should be copied into the new file, but any new installations of LumenVox will have this set. Any user who is using the older SISR tag-format making use of $ instead of out should verify the setting is set to 0. LumenVox encourages all developers to switch to the current SISR tag-format as soon as it is convenient.
- Many functions and definitions that were previously deprecated have been removed from the API. Any applications using these should be modified to avoid their use. Please consult LumenVox technical support if you require assistance with this. The following items have been removed:
Defines and typedefs-
-----------------------
H_SPT
H_SPT_PRE_ORDER_ITR
H_SPT_CHILDREN_ITR
LV_NOT_A_VALID_PROPERTY_VALUE
LV_BAD_HPORT
LV_GRAMMAR_WARNING
LV_GRAMMAR_ERROR
LV_DECODE_USE_ABNF_GRAMMAR
Functions and methods-
-----------------------
LVParseTree_GetIteratorBegin()
LVParseTree_GetIteratorEnd()
LVParseTree_GetConceptIteratorBegin()
LVParseTree_GetConceptIteratorEnd()
LVParseTree_GetTerminalIteratorBegin()
LVParseTree_GetTerminalIteratorEnd()
LVParseTree_GetTagIteratorBegin()
LVParseTree_GetTagIteratorEnd()
LVParseTree_Node_GetParent()
LVParseTree_Node_GetIteratorBegin()
LVParseTree_Node_GetIteratorEnd()
LVParseTree_Node_GetChildrenIteratorBegin()
LVParseTree_Node_GetChildrenIteratorEnd()
LVParseTree_Node_GetTerminalIteratorBegin()
LVParseTree_Node_GetTerminalIteratorEnd()
LVParseTree_Node_GetTagIteratorBegin()
LVParseTree_Node_GetTagIteratorEnd()
SI_DATA_Clone()
SI_DATA_Release()
SI_DATA_Is_Equal()
SI_DATA_Type()
SI_DATA_Print()
SI_DATA_GetBool()
SI_DATA_GetInt()
SI_DATA_GetDouble()
SI_DATA_GetString()
SI_DATA_Object_Size()
SI_DATA_Object_Property_Id()
SI_DATA_Object_Property_Value()
SI_DATA_Object_Property_Exists()
SI_DATA_Array_Size()
SI_DATA_Array_Element()
LVGrammar_SaveCompiledGrammar()
LVGrammar_LoadCompiledGrammar()
LV_SRE_GetPhonemes()
LV_SRE_SetProperty()
LV_SRE_GetUtteranceScore()
LV_SRE_SetBuiltinGrammarURI()
LV_SRE_IsGlobalGrammarLoaded()
LV_SRE_UnloadGlobalGrammars()
LV_SRE_GetParseTreeHandle()
LV_SRE_GetNumberOfConceptsReturned()
LV_SRE_GetConcept()
LV_SRE_GetPhraseDecoded()
LV_SRE_GetRawTextDecoded()
LV_SRE_GetPhonemesDecoded()
LV_SRE_GetConceptScore()
LV_SRE_AddPhrase()
LV_SRE_LoadStandardGrammar()
LV_SRE_RemoveConcept()
LV_SRE_ResetGrammar()
LV_SRE_SetConceptRepetition()
LV_SRE_LoadGrammarIdx()
LV_SRE_LoadGrammarFromBufferIdx()
LV_SRE_LoadGrammarFromObjectIdx()
LV_SRE_UnloadGrammarIdx()
LV_SRE_IsGrammarLoadedIdx()
LV_SRE_ActivateGrammarIdx()
LV_SRE_DeactivateGrammarIdx()
LV_SRE_SwitchFromHotMode()
LVSpeechPort::GetNumberOfConceptsReturned(int VoiceChannel)
LVSpeechPort::GetConcept(int VoiceChannel, int Index)
LVSpeechPort::GetPhraseDecoded(int VoiceChannel, int Index)
LVSpeechPort::GetRawTextDecoded(int VoiceChannel, int Index)
LVSpeechPort::GetPhonemes(int VoiceChannel, int Index)
LVSpeechPort::GetPhonemesDecoded(int VoiceChannel, int Index)
LVSpeechPort::GetConceptScore(int VoiceChannel, int Index)
LVSpeechPort::AddPhrase(int GrammarSet, const char* Concept, const char* Phrase)
LVSpeechPort::LoadStandardGrammar(int GrammarSet, int DefaultGrammar)
LVSpeechPort::RemoveConcept(int GrammarSet, const char* Concept)
LVSpeechPort::ResetGrammar(int GrammarSet)
LVSpeechPort::SetProperty(int property, int value)
LVSpeechPort::LoadGrammarFromBuffer(int index, const char* buffer_string)
LVSpeechPort::LoadGrammarFromObject(int index, LVGrammar& Grammar)
LVSpeechPort::IsGrammarLoaded(int index)
LVSpeechPort::UnloadGrammar(int index)
LVSpeechPort::IsGlobalGrammarLoaded(const char* label)
LVSpeechPort::UnloadGlobalGrammars()
LVSpeechPort::SetBuiltinGrammarURI(const char* Name, lv_bool DTMF, const char* URI)
LVSpeechPort::ActivateGrammar(int index)
LVSpeechPort::SwitchFromHotMode()
LVSpeechPort::LoadGrammar(int index, const char* uri)
LVGrammar::SaveCompiledGrammar(const char* filename)
LVGrammar::LoadCompiledGrammar(const char* filename)
LVParseTree_Node::Parent()
- Some API functions were deprecated. These will continue to be supported for some time and can be used by including the LV_SRE_Deprecated.h header file:
LV_SRE_OpenPort2()/LV_SRE_OpenPort() replaced by LV_SRE_CreateClient()
LV_SRE_ClosePort() replaced by LV_SRE_DestroyClient()
LVSpeechPort::OpenPort() replaced by LVSpeechPort::CreateClient()
LVSpeechPort::ClosePort() replaced by LVSpeechPort::DestroyClient()
- As noted above, CreateClient should now be called in place of OpenPort or OpenPort2. This new single function is a consolidation of the previous two. In addition, the ClosePort function call has now been renamed DestroyClient to be more consistent with other API names. The previous functions can continue to be called by including the LV_SRE_Deprecated.h header file, but this older functionality will be removed in some future product version, so we recommend moving over to the new function names as soon as is practical.
- Several unused and no longer supported settings have been removed:
PROP_EX_DECODE_OPTIMIZATION
PROP_EX_SEARCH_BEAM_WIDTH
PROP_EX_LANGUAGE
- Several settings have been deprecated, and will be removed in future versions:
PROP_EX_LIC_SERVER_HOSTNAME - replaced by new PROP_EX_LICENSE_SERVERS
PROP_EX_LIC_SERVER_PORTNUM - replaced by new PROP_EX_LICENSE_SERVERS
- All references to the long data type in the exposed API functions have been replaced with the int data type instead. This is for better 32/64-bit compatibility, since int data types are the same size in both 32- and 64-bit implementations of Windows and Linux.
- Configuration files on Windows are now located in a single common folder. This is the LVCONFIG folder (which by default is set to $LVBIN/config and defaults to Program Files\LumenVox\Engine\). On Linux, previously there were application-specific variables in lumenvox_settings.conf that would identify the full path to the respective configuration file. Now there is a global variable called LVCONFIG (under a section named "GLOBAL") in lumenvox_properties.conf that identifies the directory where all LumenVox products will look for their respective application-specific configuration file. By default, LVCONFIG points to /etc/lumenvox.
When performing an upgrade from an earlier version, the installation process will scan the old locations, making backup copies of old configuration files as needed and moving the old settings into the new files created in this folder. User settings should therefore be carried to the new locations where possible. Note that some unused settings have been removed. When one of these is encountered during an upgrade scan, the setting will be copied to the new location, but will be commented out as removed.
This change is in combination with a new mechanism for creating configuration files following installation, which was previously performed by the installation packaging process, but is now performed by a command line utility application (ConfigurationUpdater.exe on Windows, lv_configuration_updater on Linux). This utility should not need to be run by users and should only be used in conjunction with LumenVox technical support.
One noticeable effect of these various configuration changes is that now, if users wish to revert to the original, default settings at any time, they can rename or delete a configuration file and it will be recreated by the host application the next time it is started.
- Renamed Media Server configuration from file mediaserver.conf to media_server.conf in both Windows and Linux.
- Added support for custom XML lexicon files to be specified via the <lexicon> element inside of a grammar.
- Added a new command line utility allowing users to see their current client settings and configuration file locations more easily. This utility can optionally perform a simple request to determine availability of licenses with the current configuration settings (particularly useful when running in authentication mode).
This utility is called lv_show_config in Linux and LVShowConfig.exe in Windows. It can be run with no options in order to see a usage message.
- Added new LV_SRE_GetAvailableLanguageCount and LV_SRE_GetAvailableLanguageIndex API functions to expose loaded acoustic model language information to the client. Note that these are advanced functions and therefore are declared in LV_SRE_Advanced.h.
- Added new LV_SRE_SetCustomCallGuid and LV_SRE_GetCallGuid API functions that can be used to specify the callsre filename for the session (this is done automatically when running the Media Server). If these are not used, the previous functionality of using a random GUID each time will continue to work as before. Note that these are advanced functions and therefore are declared in LV_SRE_Advanced.h.
- Exposed some C++ functions that were previously only accessible using the C-style API.
Added to LVSpeechPort.h:
LVSpeechPort::ReturnGrammarErrorString
LVSpeechPort::GetGrammarVocabSize
LVSpeechPort::GetAvailableLicensesCount
LVSpeechPort::IsServerAvailable
Added to LV_SRE_Grammar.h:
LVGrammar::GetLanguage
LVGrammar::GetMode
LVGrammar::GetTagFormat
- The Speech Tuner now requires a separate Speech Tuner license in order to run. Please contact our sales or support departments for details.
- The Media Server now uses a default confidence-threshold of 0.05 in place of 0.45 due to recent confidence score changes returning a broader spectrum of results. This can still be changed as always by specifying a new value in the configuration file or as part of an MRCP/VXML setting.
- The Engine and the client have had significant improvements made to better handle very large and/or recursive grammars.
- The Speech Tuner's transcriber view has been changed to allow users to pause and resume audio playback (using the ~ key), with new options to jog forward and backward 5% through the audio using the Page Up / Page Down keys.
- Added a new upgrade analysis tool, which allows users to identify potential problems before upgrading to newer versions of LumenVox products. In particular, if any licensing issues would be encountered when moving to a newer version, these are identified so that they can be resolved before upgrading. This is an optional tool.
- Added Media Server handling of the SIP Record-Route headers if present in SIP INVITE requests. If Record-Route headers are not present, there is no effect. If one or more are present, they will be preserved in their correct order and relayed back to the client whenever appropriate.
- Changed Media Server to allow more control over Save-Waveform functionality. This includes persistent state of the Save-Waveform flag as well as a configuration file override for the default Save-Waveform flag. The Save-Waveform flag can now be set from SET-PARAMS (or other) requests in addition to just RECOGNIZE tasks. The naming convention of generated audio files has also changed to include the RECOGNITION task ID, so that each audio file within the session can easily be identified.
- Improved robustness across all products to better handle out-of-memory situations and attempt to fail gracefully, continuing operation where possible. Also improved robustness of Media Server when running under significant stress situations either under heavy load, or low resource situations to be tolerant of socket and memory failures at the operating system level.
- Improved Media Server throughput performance.
- Improved built-in time grammars to return results more compatible with other vendors.
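The reasoning behind replacing long with int in the exposed API, noted earlier in this list, can be checked on any machine with a short Python sketch. This is not part of the LumenVox API; it only uses the standard ctypes module to report the native sizes of the C types on the machine where it runs, and the function name integer_sizes is hypothetical. On the 64-bit platforms listed in these notes, int is 4 bytes on both, whereas long is typically 4 bytes on 64-bit Windows and 8 bytes on 64-bit Linux.

```python
import ctypes


def integer_sizes():
    """Return the size in bytes of C int, long and pointers for this build of Python."""
    return {
        "int": ctypes.sizeof(ctypes.c_int),
        "long": ctypes.sizeof(ctypes.c_long),
        "pointer": ctypes.sizeof(ctypes.c_void_p),
    }


if __name__ == "__main__":
    # The values printed depend on the operating system and on whether
    # the Python interpreter itself is a 32-bit or 64-bit build.
    for name, size in integer_sizes().items():
        print(f"{name}: {size} bytes")
```

A structure or function signature that uses int therefore has the same layout across these builds, while one that uses long may not.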
Minor Changes and Fixes:
- In addition to the configuration file location, all of the configuration files have had sections added to help users better identify the area or module affected by each setting.
- All of the older, now unused semi-continuous decoder functionality has now been removed to allow better future performance and remove unnecessary overhead. This change included removing some older acoustic models from the installation packages since they are no longer necessary.
- Invalid Tag-Formats that are used in grammars are now reported during grammar load, rather than after use as before.
- Added grammar display/selection option within the Grammar View and generally modified how grammars are handled within Speech Tuner. This includes a change allowing users to toggle between .callsre grammars and loaded grammars. Also, currently loaded grammars are unloaded whenever a new tuner database is opened.
- Media Server now supports "Logging-Tag" header, when specified. This tag is stored in the corresponding callsre file and can be filtered using the Speech Tuner.
- New recognizer_resource_url and synthesizer_resource_url settings were added to the Media Server configuration file to indicate the ASR/TTS resources users wish to target.
- Added new SimpleTTSClient example application to exercise the new TTS functionality. This accompanies the newly renamed SimpleSREClient example application to exercise the SRE functionality (this was previously called SimpleClient).
- Added support for builtin:grammar of non-US English languages. See the SetPropertyEx option PROP_EX_BUILTIN_GRAMMAR_LANGUAGE for more details. Note that this change is automatically implemented within the LumenVox Media Server when the grammar language is specified (via 'Speech-Language') in the MRCP RECOGNIZE or DEFINE-GRAMMAR request.
- Added new error codes in the range -51 to -68 which are used by the TTS handling code and License handling code as well as additional exception handling.
- Fixed a minor bug in Speech Tuner where zero length transcriptions were causing problems when deactivating the view (switching to another, or shutting down).
- Identified and fixed a problem in Media Server where the possibility existed in which MRCP packets could have been mishandled if multiple messages were compacted into a single transport packet by the sending machine in a certain way. This would not have been a very common problem.
- Identified and fixed a problem relating to registering and unregistering the LumenVox Media Server service from the command line (Windows only). This was tracked down to an incorrect name being used by the InstallShield packaging code. This has now been corrected. This would only have become apparent to Windows users attempting to manually remove and reinstall the Media Server service.
- Added new options to Speech Tuner, allowing users to specify the VAD streaming parameters prior to running a test. These include SNR and Volume Sensitivity settings as well as the VAD Init Mode (silence trimmed).
- Added versioning to Speech Tuner Interactions files.
- Modified Media Server SIP error detection algorithm to be more tolerant of errors before resetting the UDP socket. This may have the effect of improving overall performance when UDP receive errors persist.
- Modified Media Server port allocation algorithm to favor round robin approach instead of lowest available. This may have the effect of improving performance when cycling through large numbers of sessions in quick succession due to TCP CLOSE-WAIT timing.
- Removed previous memory check that would fail decodes indicating insufficient memory if available physical memory was below 80 MB. This helps with low resource system configurations where physical memory is not available, and is using virtual memory instead, which is monitored using other methods.
- Internally, Windows binaries are now built using Visual Studio 2008 in place of the previous Visual Studio 2005 (this change should not affect most users).
- Major product components now log out their operating system environment and location of configuration settings files they are using at startup.
- Improved logging in the License Server. Now installed licenses are reported to the log file at both application startup and following a merge operation when new licenses are added. Also, as each license is acquired, the number of used/remaining licenses of that type are reported.
- Media Server now utilizes the decode timeout specified in the configuration file when performing the final steps of a decode (internally). This was previously using a fixed 20 second timeout. This change should not be noticeable to users.
- Improved API logging of SetPropertyEx and GetPropertyEx in the speech port to give clearer indication of the meanings of values specified.
- During Windows installation, a check for MDAC (required for emailer.exe) is done, which will be installed if needed.
- Significant performance improvement in the Media Server Monitor, which now performs caching and periodic (500ms) updates to prevent backlogs of messages that couldn't be written quickly enough. Also, the auto-scroll feature has been improved to stabilize selection and viewing area when new events are being appended to the display window.
- Speech Tuner now offers better handling of .csv files and better reporting of files when they cannot be found or loaded correctly.
- Removed some diagnostic (and possibly confusing) messages from the console output when installing/removing some LumenVox services from the command line (Windows only).
- Logging was changed to remove unnecessary events being recorded that were cluttering the otherwise useful logs. Also recategorized some events that were being reported with the wrong level of severity.
- Modified example grammars to use the correct {$=""} tags instead of the :"" shorthand.
- Added the option of having the license server create Info.bts files when needed using the new optional /SYSINFO command line parameter. This can be useful as an alternative to lv_license_manager, which can also produce these files when needed.
9.5.100 (May 10, 2010):
Improvements and New Features:
- The confidence scoring mechanism used by the decoder has been almost entirely rewritten. A completely new set of algorithms is used in calculating scores, which greatly improves the reliability and meaning of the reported confidence scores. In particular, it provides a much clearer separation between correct and incorrect results.
- Due to the significantly revised confidence score calculation methods added to version 9.5, there is no backward compatibility between the 9.5 server and older clients (or vice versa). You should recompile your applications and be sure that you have updated all components prior to deploying 9.5.
- All acoustic models for the various languages have been rebuilt using the new continuous methodology introduced for American English in 9.0. This should lead to improved accuracy with each language. It also means that we are dropping support for semi-continuous mode as it is no longer needed. If you were previously using semi-continuous mode (e.g. for Spanish support) you should switch to continuous mode.
- The LumenVox software can now be configured to automatically send e-mail alerts when critical errors occur. This is disabled by default.
- On Windows installations, a new config file called lumenvox_settings.conf will be installed in %CommonProgramFiles%/LumenVox/. This will be used to hold global settings for all LumenVox products, including the new e-mail settings for reporting critical errors. (As this file already existed on Linux, it has simply had the new [GLOBAL] section added to it.)
- The jitter buffer mechanism in the Media Server was rewritten. This should now perform better when sequencing problems between audio and dtmf occur and also when dropped packets are encountered.
- New LV_SRE_GetAvailableLicensesCount and LVSpeechPort::GetAvailableLicensesCount API functions allow client applications to query the number of available licenses from a License Server.
- The speech server now has several options for controlling memory use and determining what happens when system memory becomes low. Please see the sre_server.conf documentation for the following settings: FRAME_TRACK_MODE, CRITICAL_MEMORY_THRESHOLD, LOW_MEMORY_THRESHOLD, and LIMITED_MEMORY_THRESHOLD.
- The Engine now has an optional answering machine detection mode that can be enabled with a special license type. This is useful for outbound calling, as it can very reliably detect answering machine or voicemail beeps. Please contact LumenVox for information about obtaining answering machine detection licenses.
- Linux packages now ship with a script called lvservices_restarter.sh that allows you to configure the system to automatically restart the LumenVox processes.
- Please Note: In LumenVox 10.0, we will be completely removing a number of deprecated functions, including the concept/phrase interface. If you are still using functions that are in the deprecated header, please consider changing them now.
Fixes and Minor Enhancements:
- Acoustic model file versions are now reported at Engine startup.
- Various Media Server statistics are now periodically logged to a new file called LVStatus_LVMediaServer.log. This file is located in the same directory as the Media Server log (by default C:\Program Files\Lumenvox\Engine\Logs\ on Windows or /var/log/lumenvox/mediaserver/ on Linux).
- All products now log a version number, operating system, and LumenVox environment variables at startup to aid in troubleshooting of common problems.
- Fixed a memory leak in the License Server.
- When the License Server is running in authentication mode, it will now log usernames in plain text (previously user names were logged as obfuscated hash codes).
- Linux client packages now include a compiled binary of the SimpleClient sample along with the source code (helpful when there is no compiler on the target machine).
- Changed the media server configuration file to better describe the various logging options in the responses (callsre) files.
- The LV_SRE_IsServerAvailable API function signature was changed to return an integer instead of bool. The bool was inappropriate here since this is a C type interface. Now 1 means that an SRE server is available, 0 means there is not.
- Configuration files now have a GLOBAL section where previously there were no section markers. Applications that looked for settings in unmarked locations will now look for them in the GLOBAL section.
- Added dropped packet monitoring to Media Server statistics.
- Added new error codes to LV_Error_Codes.h
- Fixed a problem with the Call Indexer which previously had the possibility of waiting for an infinite amount of time between scans. Now it will default to 1 day between scans.
- Fixed a bug in the Linux version of the Call Indexer which caused the application to look in the wrong location for its configuration file. The application will now use the correct /etc/lumenvox/ location for the configuration file.
- On Windows, the Call Indexer now expects its configuration file to be located in the new %LVCONFIG% folder, which defaults to %LVBIN%\config.
- Added the missing LVCallIndexer.ini to Linux RPM packages.
- Fixed a problem with the grammar parser that allowed for infinite recursion in grammars.
- Modified the mechanism that establishes a connection between speech clients and servers at client startup. The client will now try for up to 2 seconds to connect to a server before giving up. Previously, on very fast machines, connections could time out before this connection had been fully established.
- Modified the way in which Speech Tuner stores transcription text to allow for commas to be included without affecting the comma separated file format, which previously caused corruption of the data when reloading the files. Commas detected in transcript text are now encoded (escaped) when saved. This method is now used for Transcript, Decode, Transcript SI, Comments, Error String and ModelName strings.
- Speech Tuner now correctly calculates Word Accuracy scoring. Previously the number of words was being derived incorrectly. Word deletions are no longer used in counting the number of total words in the statistics.
- Speech Tuner was modified to update word counts displayed in the word list to include mismatches (insertions and substitutions only).
- Added correct C-style API function attributes to some deprecated functions. This change should not adversely affect users.
- Added new grammar list option to Speech Tuner, allowing grammars to be viewed and selected from within the grammar editor window.
- Added options to specify more than one active grammar for use when testing with the Speech Tuner. This makes working with grammars much easier.
- Added improved grammar loading and error processing to Speech Tuner, allowing users to more clearly see what problems are being reported for problematic grammars.
- Improved grammar loading in Speech Tuner to resolve external references wherever possible. These references are cached to local temporary files, allowing complex referenced grammars contained in callsre files to be edited and tested on the fly.
- Improved Speech Tuner prompting to save grammars when changing views or exiting the application.
- Fixed a bug in the Call Indexer which could cause a segmentation fault on Linux under certain conditions.
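The client startup change listed above (keep trying to connect for up to 2 seconds before giving up) follows a common retry-until-deadline pattern, sketched below in Python. This is not the LumenVox client code: the function name, the try_connect callable and the 50 ms retry interval are assumptions made for illustration; only the 2 second limit comes from these notes.

```python
import time


def connect_with_deadline(try_connect, deadline_seconds=2.0, retry_interval=0.05,
                          clock=time.monotonic, sleep=time.sleep):
    """Call try_connect() repeatedly until it succeeds or the deadline passes.

    try_connect is a hypothetical callable that returns True once the
    connection is established. Returns True on success, False on give-up.
    """
    give_up_at = clock() + deadline_seconds
    while True:
        if try_connect():
            return True
        if clock() >= give_up_at:
            # Deadline reached without a successful attempt.
            return False
        sleep(retry_interval)
```

Using a monotonic clock keeps the deadline unaffected by changes to the system time, and at least one connection attempt is always made, even with a zero deadline.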
9.2.400 (February 26, 2010):
- Fixed problems with custom pronunciations which caused custom pronunciations to be silently ignored in a number of cases.
- Fixed problem with new grammar caching mechanism where the original GrXML grammar text was not being correctly stored in the resulting callsre files when using the streaming interface or Media Server.
- Fixed problem with grammar parser resolving URIs that specified relative paths using the ../ syntax.
- Fixed problems with Speech Tuner loading malformed grammars into grammar editor.
- Fixed cosmetic problem in Speech Tuner when displaying decode progress with greater than 65535 interactions due to minor overflow issue.
- Fixed bug that caused grammar results to sometimes be returned in incorrect case.
- Added missing acoustic models to Speech Tuner installation that are needed for phonetic speller tool.
9.2.300 (February 16, 2010):
- Fixed bug in Media Server where an empty result is returned with 0 confidence score when a bad VAD silence is detected due to insufficient leading silence in the decoded audio. This was changed to now return no-match if recognized string length is less than 1 and confidence is set to 0.
- Fixed bug in the client where a grammar label of greater than 260 characters could create a buffer overrun situation and cause an exception.
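
The no-match rule described in the first item of this list can be sketched as follows. This is an illustration only, assuming the rule means "an empty recognized string together with a confidence of 0 is reported as no-match"; the function name and the returned labels are hypothetical, not Media Server identifiers.

```python
def classify_result(recognized_text, confidence):
    """Return "no-match" for an empty decode with zero confidence.

    Assumed reading of the release note: both conditions must hold.
    Any other result is passed through as "match".
    """
    if len(recognized_text) < 1 and confidence == 0:
        return "no-match"
    return "match"
```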
9.2.200 (February 9, 2010):
- A brand new Speech Tuner has been released! Full details are available in the Tuner documentation.
- Significant internal changes to allow better handling of grammars. This specifically addresses large grammar issues but also allows faster processing of all grammars. If you previously had trouble loading large grammars, please try upgrading to 9.2.200 or later.
- Changed default location of the Media Server configuration file in Windows (it will now default to a folder called config in the Engine installation directory). In the event that your %LVBIN% variable is set to somewhere other than the Engine installation directory, the Media Server will look in %LVBIN%\config.
- Streamlined decode processing code allowing decodes to run faster generally.
- Added more verbose logging to grammar loading and parsing.
- Better compatibility with the alaw audio format in Media Server. The Media Server is now faithful to the SDP-specified format, not the RTP packet markings. Previously, RTP packets were not processed correctly if their markings did not match the SDP specifier. Also resolved a possible ambiguity when a client requests both alaw and ulaw audio formats at the same time (the first one specified is now used).
- Fixed a bug in the internal statistical pronunciation model relating to pronunciation of words (typically nouns) that would not generally be in the language dictionary. This fix improves accuracy in these cases.
- Fixed a problem with possible non-unique Session-ID strings in MRCPv1 sessions.
- The Media Server's RTSP interface will now correctly respond with 486 Busy Here when speech port is unavailable at SETUP. Previously the SETUP would succeed, but subsequent DEFINE-GRAMMAR or other calls would fail.
- Improved the load balancing between client and multiple SRE servers using a new algorithm designed to better utilize and balance SRE resources.
- Client side grammar caching has been added. This reduces the amount of time needed for subsequent grammar load requests from each client. This can significantly speed up loading GrXML grammars, which need to be converted to ABNF prior to sending to server. Several new settings have been added to client_property.conf including: CLIENT_CACHE_ENABLE, CLIENT_CACHE_EXPIRATION, CLIENT_CACHE_MAX_NUMBER and CLIENT_CACHE_MAX_MEMORY.
- The Media Server was getting into a deadlock state under extremely specific conditions on Linux platforms. This problem was identified and fixed.
- Changed configuration file parser to better handle situations where config file is missing.
- Fixed a problem in the way threads are stopped in Linux. This allows more robust thread control.
- Fixed problem in Media Server in Linux, where locking would occur sometimes due to blocking socket call.
- Changes to Linux shutdown code improve shutdown performance and reduce potential problems.
- More robust handling of configuration values in Media Server to prevent minor rounding problems on some Linux distributions.
- Media Server changed to better handle multiple duplicate SETUP requests in the same session.
- The Engine installation package now includes a new Call Indexer service. This works in conjunction with the new Speech Tuner to index and serve callsre files as needed on Linux and Windows machines connected to Speech Tuner.
- Logging of License Authentication requests was improved to indicate licenses on a per-user basis.
- Fixed a grammar reference bug where a non-root-rule was being referred to in a referenced grammar.
- Fixed bug in server side grammar cache, when expiring grammars.
- Added fix for NAT problems with certain routers and firewalls that would automatically close license connections after a period of inactivity. The socket is now pinged more frequently so that it remains open.
- Fixed small memory leak when socket message connectivity is lost in certain conditions.
- Fixed a bug where callsre responses were being stored in the incorrect folder under certain conditions.
- License Administrator changes were made to prevent freezing of GUI when connecting to License Server that is under stress.
- Fixed bug with multiple sequential $NULL rules being parsed in a grammar.
- Fixed problem with GrXML to ABNF conversion that would not correctly assign grammar weighting to $NULL rules in the generated ABNF output. Removed unnecessary double $NULL rules.
- Added option to automatically clear cache folders one time after installing or upgrading software to avoid possible version compatibility problems. Upon installation or upgrading, a file called cached_grammars.key will be generated in the server-side grammar cache folder, and a file called cached_client_grammars.key will be generated in the client-side grammar cache folder. If one of these files is detected by the Engine or client, it will clear the appropriate cache (including the file). This means that after upgrading, large grammars may take a while to load the first time because the cache has been cleared.
- Removed medium and high resolution acoustic models from packaging to reduce shipping size (these models are still available by custom request).
- Removed some unwanted/unnecessary logging.
- Windows 2000 is no longer a supported operating system; however we have tested on Vista and Windows 7 and now support those operating systems.
- Support for Fedora Core 9 has been dropped, but support added for FC12.
- Fixed bugs when handling custom pronunciations in (default) continuous decode mode.
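
The alaw/ulaw item in this list says that when a client offers both formats, the first one specified is used. A minimal Python sketch of that selection rule is shown below. It is not the Media Server implementation: the function name is hypothetical, and the sketch only inspects the static payload types on the m=audio line (0 for PCMU and 8 for PCMA, as assigned in RFC 3551).

```python
# Static RTP payload types from RFC 3551: 0 is PCMU (ulaw), 8 is PCMA (alaw).
G711_PAYLOADS = {"0": "PCMU", "8": "PCMA"}


def pick_g711_codec(sdp_text):
    """Return the first G.711 codec listed on the m=audio line, or None."""
    for line in sdp_text.splitlines():
        line = line.strip()
        if line.startswith("m=audio "):
            # Layout: m=audio <port> <proto> <fmt> <fmt> ...
            for fmt in line.split()[3:]:
                if fmt in G711_PAYLOADS:
                    return G711_PAYLOADS[fmt]
    return None
```

For an offer containing `m=audio 49170 RTP/AVP 8 0`, this sketch selects PCMA because payload type 8 is listed first.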
9.1 (October 2009):
- The LumenVox products now support licensing authentication, allowing for the License Server to only provide licenses to clients who authenticate with a user name and password. You can use this if you wish to provide licenses to customers over the public Internet, or to take advantage of our new subscription licenses.
- American English (en-US) now has three acoustic models of different resolutions that you can use. The higher resolution models offer better accuracy, but use more memory and require more time for decodes. By default, only the lowest resolution model is loaded.
- The "-di" suffix in a language declaration, which indicated that digits-only acoustic models should be used, has been removed. The Engine will now make this switch automatically if it detects that all loaded grammars use only digit words. This should offer much greater compatibility when using the built-in digits grammars, especially with VXML applications. If you continue to specify -di in a language, it will be ignored.
- There have been a number of bug fixes and enhancements that should improve compatibility between the LumenVox Media Server and various voice platforms that use MRCP.
- We have streamlined the Speech Engine such that its memory footprint has been reduced substantially. Under light loads, you should find that the Engine uses only about 150 MB of memory.
- Be sure to note that if you upgrade to 9.1, you should download and install new license files following our new upgrade procedures that were put into place in 9.0.
- Fixed a bug that caused loads to be improperly distributed when using multiple speech servers.
- Added instructions for using VoiceGenie 7, Genesys Voice Platform 7.6, and Syntellect Communications Platform with the LumenVox Speech Engine.
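
The automatic digits-model switch described in this list (replacing the "-di" suffix) depends on checking that every loaded grammar uses only digit words. A rough Python sketch of such a check follows. The word list, the function name, and the representation of a grammar as a collection of words are all assumptions for illustration; the Engine's actual detection logic is not documented here.

```python
# Hypothetical digit vocabulary for en-US; the Engine's real list is not published here.
DIGIT_WORDS = {
    "zero", "oh", "one", "two", "three", "four",
    "five", "six", "seven", "eight", "nine",
}


def use_digits_models(grammar_vocabularies):
    """Return True only if at least one grammar is loaded and every
    grammar's vocabulary consists solely of digit words."""
    vocabularies = list(grammar_vocabularies)
    if not vocabularies:
        return False
    return all(
        {word.lower() for word in vocabulary} <= DIGIT_WORDS
        for vocabulary in vocabularies
    )
```

In this sketch a single non-digit word in any loaded grammar is enough to keep the general acoustic models in use.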
9.0 (July 2009):
- This is the first release of the LumenVox Continuous HMM Decoder for use with the Speech Engine. The Continuous model does not use compression, so it has higher resolution, resulting in increased accuracy. The continuous models have shown an accuracy increase across various domains, but at the expense of approximately 15-20% more processing time.
- There are new noise reduction options available in 9.0, allowing better recognition in noisy environments.
- The LumenVox Speech Engine version 9.0 offers improved support of the $GARBAGE rule, which allows grammars to be defined where utterances before and after the desired phrase can be ignored. This is programmatically challenging to do correctly, and this new version sets out to improve performance over previous implementations of this rule.
- The new LumenVox Media Server provides an interface to our Speech Engine via MRCPv1 (RTSP) and MRCPv2 (SIP) connectivity. These are two networking protocols commonly used among a variety of leading speech engines.
- The LumenVox License Server will no longer allow old licenses to work with the newer versions of the Speech Engine. This means that to use the latest versions of the software, you must ensure your software maintenance is up to date and download and install licenses with the newer maintenance date. For more information, see Upgrading LumenVox Software.
8.6 (January 2009):
- New configuration files have been added to the Engine. This should allow greater control over Engine settings without having to use the API. A few older configuration files, such as the old license_client.conf, have been merged into these new files. If you still have the old configuration files in place, the Engine will prefer the values from those, so it should not break backwards compatibility for any users.
- The Engine has a new startup procedure that should help catch problems earlier. There is a new startup procedure for the speech client (the SpeechPort interface) and a new startup procedure for the speech server.
- We have revamped all of our example application code. It should be better commented and more cleanly written. We have also added a few new sample applications.
8.5.100 (May 2008):
- Licenses can now be uninstalled, using the latest License Server. If you have a machine with licenses that were set up before the release of 8.5, you will need to upgrade those licenses. Users who do not need to uninstall a license should be unaffected by this change.
- The location of files on Linux has been restructured. This represents a major revamp of the LumenVox software on Linux. Existing Linux users will need to take this into account when they upgrade. Note that we have also completely dropped the environment variables ($LVBIN, $LVLIB, etc.) on our Linux installations, and we have split the Engine into three separate Linux packages: an Engine client package, an Engine server package, and a "core" package containing files shared across all products. Please see Linux Directory Structure for information about where our files are now installed.
- Short words (e.g. "back" or "stop") should now have higher confidence scores when correctly recognized.
8.0.300 (December 2007):