Introduction to Speech Application Design
This section covers LumenVox and speech recognition, along with Asterisk, and how you, as a Digium reseller, can improve your margin by selling speech recognition technology along with the Asterisk system.
We are a core speech technology company and have focused on building powerful speech engine technology since 2001. Our marketing focus has been the small and medium-sized business (SMB) community. We have been working diligently in recent years to provide a more powerful engine and a license model that is more affordable for the SMB space.
Speech recognition has been around for many years and is by now a more mature technology. There are three separate areas of speech about which you will receive many questions.
A grammar is a file of words and phrases that represent what you expect the caller to say at a given prompt. For example, in a call router type of environment you might ask a caller, "Who would you like to speak to?" A normal response would consist of the name of an employee or perhaps a department (e.g., Mark Spencer or Sales). You'll want to include the first name, the last name and even a nickname for each person, along with the names of the various departments within your call router. All of these options for selection make up your grammar.
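To make the idea of a grammar concrete, here is a minimal sketch in Python. It is not the LumenVox grammar format; production engines generally load grammars from a dedicated grammar file. The dictionary, the destination names and the route function are all hypothetical and only illustrate how a fixed set of expected phrases maps to call-router destinations.

```python
# Hypothetical call-router grammar: every phrase the caller is expected
# to say maps to the destination the application should transfer to.
# The destination identifiers are invented for illustration.
GRAMMAR = {
    "mark spencer": "ext_mark_spencer",   # full name
    "mark": "ext_mark_spencer",           # first name only
    "spencer": "ext_mark_spencer",        # last name only
    "sales": "dept_sales",                # department
    "sales department": "dept_sales",
}


def route(utterance):
    """Return the destination for a recognized utterance, or None if the
    utterance is not in the grammar."""
    # Normalize case and whitespace before looking the phrase up.
    phrase = " ".join(utterance.lower().split())
    return GRAMMAR.get(phrase)
```

With this sketch, route("Mark Spencer") and route("Spencer") both resolve to the same extension, while a phrase that is not listed returns None, which an application would typically treat as a cue to re-prompt the caller.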
Let's compare a typical TouchTone application with a speech application. In this example, let's say that you wish to rent a car to travel to San Jose, California. When you call a car rental agency, you find yourself interacting with a speech enabled car rental application. The IVR asks, "Where would you like to pick up the car?" With TouchTone you would be able to respond with a number or perhaps a ZIP code. However, most of us do not know the ZIP code of our destination when traveling. Using a speech enabled IVR we can simply respond to the prompt verbally with our destination, in this case "San Jose," which is simpler than looking up and then punching in a ZIP code.
Let's say we have an order entry speech enabled IVR application. This application allows the caller to buy jackets. The caller is prompted for the color of jacket. In a TouchTone environment a list of options may be as follows:
1 for Red
2 for Blue
3 for Green
In this case the caller would have to memorize their option before making a decision, and this becomes more challenging the longer these types of menus grow. In a speech application, the caller would simply state the color "green" without having to memorize a menu at all. Speech enabling IVR applications makes them much easier and much more efficient to use.
As you know, there are many DTMF applications currently running on millions of TouchTone ports. Most of these DTMF IVRs have not yet moved to speech. One of the reasons is that in the past speech was much more expensive, and moving from TouchTone to the speech environment was technically very difficult. With the LumenVox license model and the Asterisk infrastructure and platform, you have a very powerful instrument to take to companies that have legacy IVR applications deployed and help them migrate from TouchTone to speech. This presents you, the reseller, with an incredible business opportunity.
For more information on how to migrate TouchTone applications to speech applications, please see the LumenVox.com web site for white papers and videos on the topic.
With the advent of VoIP technology, we are no longer limited to the single tool of dial tone and long distance calling. In today's environment it's more about applications, and speech recognition technology has become a powerful driving force for application development. Now, as a reseller, you can go to your partners and clients and present the Asterisk solution as something that can address certain challenges and make business processes easier. The following is a list of the types of applications that could provide solutions to various types of business challenges:
The General section contains the most common use of speech. We foresee that within the next few years most PBX systems will be speech enabled. In its simplest form, a speech enabled Auto-Attendant makes it easy to simply say a person's name, as opposed to knowing the spelling of the person's name and having to key it in.
In the health care or pharmacy environment, prescription refill applications are very popular. There are tens of thousands of TouchTone ports currently deployed, and we expect that in the near future they will all be speech enabled.
In financial services, all of us currently interface with a banking IVR. These are great opportunities for conversion to speech.
In the retail environment, customer satisfaction surveys are a very easy way to gauge how happy customers are with the service they are receiving. Speech enables the customer to communicate their satisfaction level using a tool which provides ease of use and efficiency.
The question can be asked: how can one get to these applications? For you, as a reseller of Digium solutions, your main source of margin is professional services. With professional services there are many opportunities, such as installing the system and putting in place the infrastructure to make it all work. You also now have the additional opportunity to build the applications we spoke about.
So one opportunity, in terms of margin, is to build speech applications, and make them available to your partners and end users.
Another opportunity is to take existing solutions and resell them as a turn-key package. This can be done in the same fashion as buying a copy of Word for Windows from the local computer software store. Many solutions providers have pre-written applications for things such as setting doctor appointments and conducting surveys, along with many other pre-existing applications you can resell. So now you have two opportunities here: number one, you can use pre-existing turn-key packages, or number two, for more custom requirements, you can use your own professional services team to build these applications yourself. Lastly, to power the technology you can sell LumenVox speech ports, which also carry a margin opportunity.
The LumenVox Speech Engine is a speaker independent technology. No training of the Engine is necessary; it works right out of the box.
We also work on certifying additional distributions. So if you have a need outside of our currently supported distributions, please contact us.
Ports: The Engine is licensed on a per-port basis. Each port allows one concurrent (simultaneous) interaction with the speech engine. The application determines when a port is opened and when it is closed. This is in contrast to other companies, which tend to have a per-channel license model. LumenVox does not tie ports to the channel. For example, in a call router situation, a caller is prompted by the speech IVR and says he would like to speak with Mark Spencer. A port is opened at the engine level, the audio is decoded, and the resulting text is passed back to the application. The call is then transferred to Mark Spencer, and at this point the port can be closed. This is an example of resource-based licensing, as opposed to channel licensing, in which the port would stay open the entire time the channel is in use. Resource-based licensing makes more economical use of ports, especially in larger environments where your port population can be better optimized.
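The difference between the two license models can be illustrated with a small calculation. The sketch below is only an illustration under assumed numbers: the call timings in CALLS are invented, and peak_concurrency is a hypothetical helper, not part of any LumenVox or Asterisk API. It counts how many ports would be needed at peak if a port were held for the whole call (channel model) versus only while the caller is being recognized (resource model).

```python
def peak_concurrency(intervals):
    """Return the largest number of (start, end) intervals open at once."""
    events = []
    for start, end in intervals:
        events.append((start, 1))    # a port is opened
        events.append((end, -1))     # a port is closed
    # At the same instant, process closes (-1) before opens (+1) so a
    # port released at time t can be reused by an interaction starting at t.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# Invented example calls, in seconds:
# (call_start, call_end, recognition_start, recognition_end)
CALLS = [
    (0, 300, 5, 10),
    (20, 200, 25, 31),
    (40, 400, 45, 50),
]

# Channel model: a port is held for the full duration of each call.
channel_ports = peak_concurrency([(c[0], c[1]) for c in CALLS])
# Resource model: a port is held only while speech is being recognized.
resource_ports = peak_concurrency([(c[2], c[3]) for c in CALLS])
```

For these three overlapping calls, the channel model needs three ports at peak, while the resource model needs only one, because the short recognition windows never overlap.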
Looking back at the graphic of the Resource Requirements, if you have a single port you'll hardly have any resource requirements on the server. As the number of concurrent interactions goes up, so do the resource requirements, up to as many as 96 ports, which could potentially be put in a single server. Beyond that, you would want to consider having a machine dedicated specifically to the engine. We suggest that for up to 20 to 30 ports the engine be put on the Asterisk server along with the Asterisk PBX. Beyond that, we suggest using our distributed architecture, which is built into the core technology. This allows one or more speech servers to run with ports installed, in addition to the Asterisk server.
The ability to use VXML is very good news because VXML is a very important standard in the speech industry. A large number of existing speech applications are available that run in VXML. There are also a large number of skilled programmers available who know how to write applications in VXML.
Hopefully, the information provided has given you ideas on how to increase sales through building applications and acquiring existing turnkey solutions.
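As a rough planning aid, the sizing guidance above could be sketched as a small function. This is only an interpretation of the figures mentioned, not an official sizing tool: the cutoff of 30 ports for running on the Asterisk server and the figure of 96 ports per speech server are assumptions taken from the numbers quoted in this section, and suggest_deployment is a hypothetical name. Actual limits depend on the hardware and the application.

```python
import math

# Assumption: upper end of the 20 to 30 port guidance for sharing the
# Asterisk server with the speech engine.
COLOCATED_LIMIT = 30
# Assumption: the single-server figure of 96 ports quoted in this section.
PORTS_PER_SPEECH_SERVER = 96


def suggest_deployment(ports):
    """Return (layout, dedicated_speech_servers) for a given port count."""
    if ports < 1:
        raise ValueError("ports must be at least 1")
    if ports <= COLOCATED_LIMIT:
        # Small installs: engine runs alongside the Asterisk PBX.
        return ("colocated", 0)
    # Larger installs: spread ports across dedicated speech servers.
    return ("distributed", math.ceil(ports / PORTS_PER_SPEECH_SERVER))
```

Under these assumptions, an 8-port install stays on the Asterisk server, while a 200-port install would be spread over three dedicated speech servers.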
In the first part of this training video for Digium resellers, we discuss how you can differentiate your Asterisk offerings by adding speech recognition capabilities. We cover the basics of speech recognition, how the LumenVox Speech Engine works with Asterisk, and how adding speech to an Asterisk system provides real value for your end users.