WO2001050453A2 - Interactive voice response system - Google Patents
- Publication number
- WO2001050453A2 (PCT/US2001/000376)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- user
- voice
- anita
- information
- web
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/487—Arrangements for providing information services, e.g. recorded voice services or time announcements
- H04M3/4872—Non-interactive information services
- H04M3/4878—Advertisement messages
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/487—Arrangements for providing information services, e.g. recorded voice services or time announcements
- H04M3/493—Interactive information services, e.g. directory enquiries ; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/487—Arrangements for providing information services, e.g. recorded voice services or time announcements
- H04M3/493—Interactive information services, e.g. directory enquiries ; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
- H04M3/4938—Interactive information services, e.g. directory enquiries ; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals comprising a voice browser which renders and interprets, e.g. VoiceXML
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2201/00—Electronic components, circuits, software, systems or apparatus used in telephone systems
- H04M2201/40—Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition
Definitions
- the present invention relates to voice-based interactive user interfaces, particularly to interactive voice response systems, and more particularly to interactive voice response systems for accessing information from a computer network via remote telephony devices.
- IVR interactive voice response
- IVR systems allow a user to access audio information stored in a computer memory such as a hard disk.
- the audio information is stored in audio files created either by the user or for the user.
- Conventional IVR systems use dual-tone multi-frequency (DTMF) signalling to allow the user to interact with the server through a standard telephone keypad.
- DTMF dual-tone multi-frequency
- Pre-recorded audio information is available on IVR systems in the form of instructional phrases such as "Please type in your account number followed by the pound sign."
- Pre-recorded audio is also used for introductory phrases such as "Your account balance is . . . "
- the IVR computer may access a connected database that stores the requested account balance in numerical format, convert the numerical format to an audio format using a numerical text-to-speech engine, and state the account balance.
- This conversion from numerical format to audio format is extremely rigid and completely predefined.
- IVR systems are "closed" in that each IVR system is uniquely designed, is not connected to a computer network, and cannot be used interchangeably with other IVR systems. Also, these IVR systems are designed specifically for audio interaction.
- audio/visual information on an audio/visual server in a computer network may be accessed using a personal computer.
- a World Wide Web (Web) page on the Internet may be accessed using a computer linked through an Internet access provider, such as America On Line™ or Prodigy™, to a Web server.
- IDC International Data Corporation
- an audio interface may be useful for obtaining information from the Internet or another computer network.
- Other situations where an audio interface to a computer network may be useful include accessing an electronic calendar on a local area network (LAN) to receive or modify an itinerary, accessing E-mail on the Internet or a wide-area network (WAN) while away from a computer, and requesting a telephone number from an electronic yellow pages or white pages while at a pay phone.
- An audio interface to the Web could also be used to traverse the Internet and obtain information residing on various Web servers.
- each area code enables nearly 8 million separate telephone numbers and the total number of area codes in service has nearly doubled since 1991, growing from 119 to 215, according to the FCC.
- California Public Utilities Commission expects the number of area codes in service to increase from 13 in January 1997, to 40 by 2002. A significant portion of this growth is due to the rapid proliferation of cellular and PCS telephone service.
- the number of U.S. wireless subscribers is expected to grow to 149 million in 2003, representing a wireless market penetration of 53%.
- the global wireless penetration is expected to increase from 425 million in 1999 to 953 million in 2003.
- U.S. Patent No. 5,884,262 discloses a computer document audio access and conversion system that allows a user to access information originally formatted for audio/visual interfacing on a computer network via a simple telephone.
- files formatted specifically for audio interfacing can also be accessed by the system.
- a user can call a designated telephone number and request a file via dual-tone multi-frequency (DTMF) signaling or through voice commands.
- the system analyzes the request and accesses a predetermined document.
- the document may be in a standard document file format, such as hyper-text mark-up language (HTML) which is used on the World Wide Web.
- HTML hyper-text mark-up language
- the document is analyzed by the system, and depending on the different types of formats used in the document, information is translated from an audio/visual format to an audio format and played to the user via the telephone interface.
- the document may contain links to other documents that can be invoked to access such other documents.
- the system can have a native command capability that allows the system to act independently of the accessed document contents to replay a document or carry out functions similar to those available in conventional web browsers.
- U.S. Patent No. 5,884,262 is limited to handling information originally formatted for audio/visual interfacing to a computer network via a telephone. There is a need for flexible interactive access to information that is not originally formatted for audio interfacing to a computer network via telephony devices. There is a need for interactive telephony access to a computer network, such as the Internet, to expand and enrich usage with unique and compelling content and products.
- the present invention is directed to an interactive voice response system that permits users to access information that is not originally formatted for audio interfacing to an information exchange network, such as a computer network.
- The user's spoken utterance is analyzed and matched with an index of destinations.
- a list of valid destinations is produced and the user is then guided along the path with pre-recorded voice prompts.
- the user accessing the system can control the navigation via more speech and/or telephone keypad entry.
- the intent of the system is to arrive at a single destination from among the many offered within the system.
- the destination derived in this way is then accessed via spoken utterance and/or telephone keypad entry.
- User specific information about the destination is derived from the user profile and the current call context and is used to offer access to the facilities offered by the destination.
- the facilities offered are specific to the application provided by the destination node.
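The matching step described above can be illustrated with a minimal sketch. The index contents, the keyword-overlap scoring, and the name `match_destinations` are all assumptions made for illustration; the application does not specify how the index is built or scored.

```python
from typing import Dict, List, Set

# Hypothetical destination index: destination name -> trigger keywords.
DESTINATION_INDEX: Dict[str, Set[str]] = {
    "weather": {"weather", "forecast", "temperature"},
    "stock quotes": {"stock", "quote", "price"},
    "flight tracker": {"flight", "airline", "arrival", "departure"},
}


def match_destinations(utterance: str) -> List[str]:
    """Return destinations whose keywords overlap the utterance, best first."""
    words = set(utterance.lower().split())
    scored = [
        (len(words & keywords), name)
        for name, keywords in DESTINATION_INDEX.items()
    ]
    # Keep only destinations with at least one keyword hit.
    matches = [(score, name) for score, name in scored if score > 0]
    # Highest score first; ties are broken alphabetically for determinism.
    matches.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in matches]


candidates = match_destinations("what is the weather forecast")
```

When more than one candidate comes back, a caller would be prompted to pick among them, which is the narrowing toward a single destination that the text describes.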
- the inventive voice response system includes a number of novel functional and logical components, including without limitation a query engine, ad generator, web parser, profiler and replication engine, managed by a manager. These components may physically reside in the same or different servers.
- FIG. 1 is a schematic representation of the Anita Server Architecture.
- FIG. 2 is a schematic representation of the logical internal structure of Anita Server.
- FIG. 4 illustrates one embodiment of a "tree" structure that exemplifies how clarification questions would be asked while narrowing down a search.
- FIG. 5 is a schematic representation of the HeyAnita Operating System.
- the present invention will be described below in reference to the Internet as an example of an information exchange network.
- the present invention is applicable to other types of information network without departing from the scope and spirit of the present invention.
- HeyAnita enables individuals to surf the Internet from any phone, anywhere, anytime simply by using their voice.
- By utilizing its revolutionary HeyAnita operating system ("HeyAnita OS") technology and easy to use interface, HeyAnita establishes a comprehensive Voice Internet Portal ("VIP"), providing a voice interface to the Internet and allowing Internet and telephone users to access volumes of information, headline news, stock quotes, horoscopes, auctions, food delivery services, weather forecasts, sports scores, travel, shipping status, free integrated voice mail, and much more.
- VIP Voice Internet Portal
- HeyAnita enables e-commerce providers to add voice application (v-application) services to their existing platform and enables traditional corporations to efficiently compete in the digital arena.
- HeyAnita's unique solution increases traffic and commerce by providing access to individuals who do not use traditional Web-based browsers and also allows traditional Internet users access from locations lacking connectivity.
- HeyAnita uses its proprietary technology and easy to use interface to create an informative and entertaining environment to attract and retain a large and loyal user base. In addition to its easily brandable name and concept, HeyAnita offers the most comprehensive array of voice enabled services and allows phone users to access the Internet in multiple languages. Appendix B sets forth some of the application features possible with the inventive HeyAnita system.
- HeyAnita Voice Platform is a set of components based on Microsoft Windows DNA architecture that allows developers and power-users to rapidly develop and deploy speech applications.
- the platform is an open environment that encapsulates a speech recognition engine, audio input sources (speaker, telephone) and audio output sources (speaker, telephone). It provides a vendor independent interface to the voice application by providing a consistent interface to the various audio devices and the speech recognition engine.
- Any application written to these interfaces can be ported from one device to another or from one speech recognition vendor to another merely by creating the appropriate object. For example, developers can develop and test their voice applications using a PC speaker and a microphone and then move the application to the telephone just by creating objects that support the telephone device.
- HeyAnita Voice Platform is not tied to any specific speech recognition engine. It provides plug-and-play flexibility to switch the underlying speech recognition engine without having to modify the actual application. Developers will be able to develop applications on any shareware speech recognition engine and later deploy them on any of the popular commercial speech recognition engines such as Speechworks or Nuance.
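The device and vendor independence described above amounts to programming against abstract interfaces and swapping concrete objects. The sketch below illustrates that idea in Python rather than COM; every class and function name here is hypothetical and none of them is taken from the HeyAnita platform.

```python
from abc import ABC, abstractmethod


class AudioSource(ABC):
    """Device-independent audio input (hypothetical interface)."""

    @abstractmethod
    def read(self) -> bytes: ...


class MicrophoneSource(AudioSource):
    def read(self) -> bytes:
        return b"mic-audio"  # stand-in for samples from a PC microphone


class TelephoneSource(AudioSource):
    def read(self) -> bytes:
        return b"phone-audio"  # stand-in for samples from a phone line


class SpeechRecognizer(ABC):
    """Vendor-independent recognizer (hypothetical interface)."""

    @abstractmethod
    def recognize(self, audio: bytes) -> str: ...


class StubRecognizer(SpeechRecognizer):
    def recognize(self, audio: bytes) -> str:
        return f"recognized {len(audio)} bytes"


def run_app(source: AudioSource, recognizer: SpeechRecognizer) -> str:
    # The application only sees the abstract interfaces, so moving from a
    # microphone to a telephone means passing in a different object.
    return recognizer.recognize(source.read())


desktop_result = run_app(MicrophoneSource(), StubRecognizer())
phone_result = run_app(TelephoneSource(), StubRecognizer())
```

The application function `run_app` is unchanged between the two calls; only the object that is created differs, which mirrors the porting claim in the text.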
- HeyAnita Voice Platform does not force developers to learn a new language such as VXML.
- HeyAnita Voice Platform allows developers to write applications in a language of their choice. For instance, any COM-compliant language such as Visual Basic, Visual C++ or Java can be used to develop applications on the HeyAnita Voice Platform.
- Rich VUI: HeyAnita Voice Platform's open architecture allows developers to plug in third-party components to make their Voice User Interfaces richer. Developers do not have to settle for mediocre Voice Interfaces because of the limitations in the platform or language.
- HeyAnita Voice Platform allows developers to host their applications on any server on the Internet. All the pieces of HeyAnita Voice Platform are developed with location transparency in mind.
- HeyAnita Voice Platform has been designed to support international languages. Any application written on HeyAnita Voice Platform can be localized in any international language without any code changes.
- HeyAnita OS is a multi-threaded surrogate process that hosts all the HeyAnita components and application objects. It takes care of all the thread management, monitoring and administration so that application writers do not have to worry about issues such as thread synchronization.
- Fig. 5 shows the components of the HeyAnita OS (100).
- Speech Recognition Manager: This object encapsulates the speech recognition engine and the text-to-speech engines and provides a consistent interface to these engines in a vendor-independent fashion.
- Audio Source (AI): This object encapsulates the audio input device and provides a consistent interface in a device-independent fashion.
- Audio Destination: This object encapsulates the audio output device and provides a consistent interface in a device-independent fashion.
- Grammar Object (GO): This object provides a consistent interface to provide grammar files for speech recognition.
- the grammar files can reside anywhere on the Internet.
- the grammar object refers to the grammar files by URL.
- Prompt Object (PO): This object provides a consistent interface to provide prompts in speech applications.
- the prompts can reside anywhere on the Internet.
- the prompt object refers to the prompt files by URL.
- a typical voice application will create an SR object for speech recognition, an AI object as an audio input object, an AO object as an audio output, a GO object for recognizing speech and several PO objects for the various prompts it may require.
- the application can then play the prompts using the audio out object, accept input using the audio in object and recognize the input using the speech recognition object, while the grammar object gives context to the speech recognition object.
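The object composition described above (play a prompt, accept input, recognize it in the context of a grammar) might be sketched as follows. All class names, the `ask` helper, and the example URLs are hypothetical; the grammar is modeled as a pre-loaded phrase set instead of a file fetched from its URL.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class GrammarObject:
    url: str            # the text says grammar files are referenced by URL
    phrases: Set[str]   # assumption: phrases already loaded from that URL


@dataclass
class PromptObject:
    url: str            # the text says prompt files are referenced by URL


@dataclass
class AudioOut:
    played: List[str] = field(default_factory=list)

    def play(self, prompt: PromptObject) -> None:
        self.played.append(prompt.url)  # stand-in for real playback


@dataclass
class AudioIn:
    pending: List[str]  # canned caller utterances standing in for audio

    def listen(self) -> str:
        return self.pending.pop(0)


class SpeechRecognizer:
    def recognize(self, utterance: str, grammar: GrammarObject) -> Optional[str]:
        # The grammar supplies the context: only listed phrases are accepted.
        text = utterance.strip().lower()
        return text if text in grammar.phrases else None


def ask(prompt: PromptObject, audio_out: AudioOut, audio_in: AudioIn,
        recognizer: SpeechRecognizer, grammar: GrammarObject) -> Optional[str]:
    audio_out.play(prompt)
    return recognizer.recognize(audio_in.listen(), grammar)


grammar = GrammarObject("http://example.com/menu.grammar", {"weather", "news"})
prompt = PromptObject("http://example.com/welcome.wav")
out = AudioOut()
answer = ask(prompt, out, AudioIn(["Weather"]), SpeechRecognizer(), grammar)
```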
- HeyAnita Agent is a set of COM+ objects that allow speech applications to access data in a consistent manner. This makes speech applications transparent to the underlying data format. Applications access data in any OLE DB-compliant database, XML page, HTML page or WAP page using the same programming model.
- Speech applications are written as a set of COM+ components or VXML files. These applications can be written in any COM-compliant language such as Visual Basic, Visual C++ or Java. It is also possible to write an application using multiple languages, e.g., it is possible to make use of a VXML file inside a Visual Basic speech application. This flexibility allows developers to write voice applications faster and in the language they are most comfortable with.
- HeyAnita tools are a set of design time controls (DTCs) that allows the developers to quickly generate Speech Applications in a drag-and-drop fashion. Developers do not have to learn a new language such as VXML. All the code is generated by these design time controls. These tools are provided for all components included in the HeyAnita framework. In addition to the DTCs, add-ins are provided for Office to facilitate easy authoring of content.
- DTCs design time controls
- Many components from the HeyAnita framework have associated metadata and data elements. Tools are provided for easy management of this content. Application wizards are provided for popular functions, such as a "shopping cart", "get a stock quote" etc. In addition, since the HeyAnita wizard model is a Visual Studio DTC, developers can create their own wizards or extend existing ones.
- HeyAnita framework provides a number of plug-and-play COM+ components to facilitate rapid development and deployment of voice applications. Using these components as building blocks and writing just the code to glue them together, programmers can create voice applications in a matter of hours. All the necessary voice user interface, grammars and functionality are implemented by these components. All the components contain the necessary audio prompts and grammars. Developers, however, have the ability to override these by customizing their prompts or grammars.
- Basic Components These are basic building blocks for constructing a voice application. When developers use these components, they automatically get consistent and easy-to-use voice interfaces across all their applications.
- Data-bound components These components implement standardized voice interface on top of commonly used data elements.
- Value-added components provide all the bells and whistles for making voice user interface entertaining and fun-to-use.
- the HeyAnita framework may include the following basic components:
- Sentence: Plays back a set of sentences.
- VXML Parser: Parses and executes a W3C compatible VXML stream.
- the HeyAnita framework may include the following data-bound components:
- Store/Service Locator: Locates a store or a service.
- Status Inquiry: Checks status of an order, shipment
- Yellow page inquiries
- Developers will be able to bind these to any OLE DB provider or XML repository to retrieve the necessary data.
- the HeyAnita framework may include the following value-added components:
- AdMixer: Selects advertisements based on the user's preferences and history.
- Randomize: Randomizes selection of audio prompts (from a pre-defined set).
- Joke-of-the-day: Selects a joke of the day.
- Debug: Adds debugging trace to the voice application.
- Notifications/Alerts: Sends outbound notifications/alerts.
- Anita Server 120 (Fig. 1) implements the HeyAnita Voice Platform and consists of several components that implement the following functionality and features:
- Fig. 1 is a schematic representation of the Anita Server Architecture.
- the Anita Server 120 is a fault-tolerant, scalable, remotely manageable, multi-threaded NT Service. It comprises the following components:
- call management features such as ring and hangup detection, call switch-over, call transfer, call waiting and tromboning.
- This also implements functionality to transform computer audio files (.wav files) to audio streams that can be played on a telephone 15 and to detect user utterances on the phone line to pass them on to the Anita Speech Recognition Engine.
- This may be implemented using Dialogic system software version DNA 3.2 and Nuance Speech recognition system version 6.2.
- Anita Natural Language Engine, in conjunction with Anita Query Engine, identifies destination nodes and the applications that are available to the user. This engine serves as input to the
- An example of an application would be to obtain weather information using the Yahoo! Web site. This would provide a user of the system the capability of listening to weather information for a set of cities or zip codes.
- the Anita Query Engine does the following:
- Anita State Machine and Web Parser executes state machines written using a proprietary function library. This retrieves information from web sites and other applications that are enabled for this operation. In addition, its web-parsing function also allows Anita Query Engine to retrieve web pages from any conventional web site on the Internet and convert unstructured HTML data into meaningful structured data. It is not mandatory to make changes to existing web sites to make them work with Anita State Machine and Web Parser. An example of this would be the operations performed to pass in a zip code to the Yahoo! web site, execute the form to retrieve the results, select and format the results, and play relevant information in the form of concatenated speech fragments. In this scenario the Yahoo! web site was not modified to support the operations, nor was it aware that a voice-enabled application was using its HTML-based services.
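As an illustration of converting unstructured HTML into structured data, the sketch below uses Python's standard `html.parser` module to pull table cells into records. The sample markup, the `TableRowParser` class, and the `day`/`forecast` field names are assumptions for illustration; the proprietary function library mentioned above is not shown in the source.

```python
from html.parser import HTMLParser
from typing import Dict, List


class TableRowParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[str]] = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


def parse_forecast(html: str) -> List[Dict[str, str]]:
    """Turn two-column table rows into {'day', 'forecast'} records."""
    parser = TableRowParser()
    parser.feed(html)
    return [{"day": r[0], "forecast": r[1]} for r in parser.rows if len(r) == 2]


# Hypothetical page fragment standing in for a retrieved weather page.
SAMPLE = (
    "<table><tr><td>Mon</td><td>Sunny</td></tr>"
    "<tr><td>Tue</td><td>Rain</td></tr></table>"
)
records = parse_forecast(SAMPLE)
```

Because the parser only reads the page, the web site itself needs no changes, which matches the point made in the text.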
- Anita Query Engine transfers relevant information to Anita Profiler.
- Anita Profiler captures and filters this information to build a repository of user preferences, navigational history and usage patterns.
- Anita Profiler recognizes the phone number of the incoming caller and can work without any user registration.
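A profile keyed by the caller's phone number, built up without registration, could look roughly like the sketch below. The `Profiler` class, its methods, and the "most frequent destination" heuristic are illustrative assumptions and not the filtering logic of the Anita Profiler.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, List, Optional


class Profiler:
    """Keeps navigational history per caller phone number (hypothetical)."""

    def __init__(self) -> None:
        self._history: DefaultDict[str, List[str]] = defaultdict(list)

    def record(self, caller_number: str, destination: str) -> None:
        # No registration step: the phone number itself is the key.
        self._history[caller_number].append(destination)

    def favorite(self, caller_number: str) -> Optional[str]:
        """Return the most frequently visited destination, if any."""
        history = self._history.get(caller_number)
        if not history:
            return None
        return Counter(history).most_common(1)[0][0]


profiler = Profiler()
for destination in ["weather", "stocks", "weather"]:
    profiler.record("555-0100", destination)
```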
- Anita Prompt Generator converts text phrases to audio prompts. Unlike most other text-to-speech engines, Anita Prompt Generator implements algorithms to generate prompts in natural human voice using concatenated speech fragments rather than digitally created voice. However, in cases of completely unstructured text, Anita Prompt Generator uses Text-To-Speech software. This software may be based on Fonix Corporation TTS engine.
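One way to picture prompt generation from concatenated fragments with a text-to-speech fallback is a greedy longest-match over a library of recorded phrases. The fragment library, file names, and the per-word fallback below are assumptions; the source does not describe the actual algorithms.

```python
from typing import Dict, List, Tuple

# Hypothetical library of pre-recorded fragments: phrase -> audio file name.
FRAGMENTS: Dict[str, str] = {
    "the temperature is": "temp_is.wav",
    "seventy": "70.wav",
    "degrees": "degrees.wav",
}


def build_prompt(text: str) -> List[Tuple[str, str]]:
    """Plan playback: recorded fragments where available, TTS otherwise."""
    words = text.lower().split()
    plan: List[Tuple[str, str]] = []
    i = 0
    while i < len(words):
        # Try the longest phrase starting at word i first.
        for j in range(len(words), i, -1):
            phrase = " ".join(words[i:j])
            if phrase in FRAGMENTS:
                plan.append(("recorded", FRAGMENTS[phrase]))
                i = j
                break
        else:
            # No recorded fragment covers this word: fall back to TTS.
            plan.append(("tts", words[i]))
            i += 1
    return plan


plan = build_prompt("The temperature is seventy degrees")
```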
- Anita Telephone Interface 1 receives the call and hands it over to Anita Speech Recognition Engine 2.
- Anita Speech Recognition engine 2 converts spoken utterance into text and sends it to Anita Natural Language Engine 3 for further processing.
- Anita Natural Language Engine 3 interprets Natural Language text and sends structured commands to Anita Query Engine 4.
- Anita Query Engine 4 takes into consideration all of the governing factors such as user preferences, user context, usage patterns and history to determine an end destination node for the user's request.
- Anita Query Engine 4 generates web queries needed to fulfill user's request and sends them to the Anita State Machine and Web Parser 8.
- Anita State Machine and Web Parser 8 browses the Internet/web 11 to retrieve information requested by the user. It parses each received page to convert unstructured text into structured datasets.
- While Anita State Machine and Web Parser 8 is busy retrieving the requested information, Anita Query Engine 4 asks Anita Prompt Generator 6 to generate context-sensitive voice prompts. It also sends a request to Anita Profiler to add generated queries to the user's profile.
- Anita Prompt Generator 6 asks Anita Ad Generator 9 to create a set of entertaining commercials based on user's preferences and context.
- Anita Prompt Generator 6 creates an audio stream based on commercials and web information returned by Anita State Machine and Web Parser 8 and sends it to Anita Telephone Interface 1.
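The call flow above is essentially a chain of stages, each consuming the previous stage's output. The sketch below models that chain with stand-in functions; every function name and data shape is hypothetical, and the profiler and ad generator steps are left out for brevity.

```python
from typing import Any, Callable, Dict, List


def recognize(audio: str) -> str:
    # Stand-in for speech recognition: the "audio" is already text here.
    return audio.strip().lower()


def interpret(text: str) -> Dict[str, str]:
    # Stand-in for natural language interpretation: "<intent> <argument>".
    intent, _, argument = text.partition(" ")
    return {"intent": intent, "argument": argument}


def build_query(command: Dict[str, str]) -> str:
    # Stand-in for the query engine turning a command into a web query.
    return f"{command['intent']}?q={command['argument']}"


def fetch(query: str) -> str:
    # Stand-in for retrieving and parsing a web page.
    return f"result for {query}"


def make_prompt(result: str) -> str:
    # Stand-in for building the audio stream played back to the caller.
    return f"<audio>{result}</audio>"


STAGES: List[Callable[[Any], Any]] = [
    recognize, interpret, build_query, fetch, make_prompt,
]


def handle_call(audio: str) -> str:
    value: Any = audio
    for stage in STAGES:
        value = stage(value)
    return value


response = handle_call("Weather 94040")
```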
- Fig. 2 is a schematic representation of the logical internal structure of Anita Server 120:
- Anita Server 120 consists of three logical servers. These servers could be implemented on one physical box or multiple physical boxes based on the size and load at each Anita site. If they are implemented on multiple boxes, all the boxes are connected on a single high-bandwidth LAN segment.
- Anita Phone Server 20 implements computer telephony interface using CTI hardware 21, Anita Telephone Interface 1, Anita Speech Recognition Engine 2, and Anita Prompt Generator 6. It connects to one or more digital lines to accept telephone calls.
- Anita Application Server 30 implements Anita applications using Anita Natural Language Engine 3, Anita Query Engine 4, Anita State Machine and Web Parser 8, Anita Profiler 10 and Anita Ad Generator/Mixer 9. This server is connected to the Internet using high-bandwidth lines. It also implements smart replication using Anita Replication Engine 13.
- Anita Database Server 40 implements Anita Repository 7 database.
- the Anita Toolbox provides a comprehensive set of tools to facilitate business partners and developers to:
- 1) Voice-enable existing web sites and/or applications
- 2) Build voice-enabled v-applications. This uses the function library to build state machines that can be executed by the Anita State Machine and Web Parser
- 3) Remotely monitor and manage multiple Anita Servers
- Fig. 3 is a schematic representation of the overall HeyAnita global infrastructure that comprises Anita Servers 120 in various countries, cities, and other locales.
- the Anita Servers 120 communicate with each other via a network such as the Internet 11.
- the Anita Replication Engine 12 in the Anita Servers 120 distributes Anita Repository 7 information to other Anita Servers 120.
- Anita Monitoring Stations 122 are provided to monitor and manage the interaction between the Anita Servers 120.
- the Anita Monitoring Stations 122 may be Anita Servers 120 which are configured for monitoring as their primary function. They may be similar to the Anita Managers 13.
- Post Office Assistant: Say "stamps" to buy stamps, say "directions" to get the directions to the post office, or say "shipping" to get shipping status for parcels at the post office.
- Fig. 4 demonstrates the organized tree of information which helps to show how the clarification questions would be asked while narrowing down the search.
- Each parent node describes the set of features in the child node.
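The narrowing behaviour described for the tree can be illustrated with a minimal sketch. The `Node` class, the `descend` helper, and the example tree (built from the Post Office Assistant prompt) are hypothetical names chosen for illustration; the actual layout of Fig. 4 is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node in the information tree; children narrow the parent's topic."""
    name: str
    children: list["Node"] = field(default_factory=list)

    def prompt(self) -> str:
        # A leaf needs no clarification; a parent asks the caller to pick a child.
        if not self.children:
            return f"Here is {self.name}."
        options = ", ".join(child.name for child in self.children)
        return f"For {self.name}, say one of: {options}."


def descend(root: Node, utterances: list[str]) -> Node:
    """Follow the caller's answers down the tree, stopping at the first miss."""
    node = root
    for word in utterances:
        match = next((c for c in node.children if c.name == word.lower()), None)
        if match is None:
            break  # unrecognized answer: stay put so the question can be re-asked
        node = match
    return node


# Hypothetical tree based on the Post Office Assistant prompt.
post_office = Node("post office", [
    Node("stamps"),
    Node("directions"),
    Node("shipping"),
])
```

Each call to `prompt()` on the current node yields the next clarification question, so the search narrows by one level per recognized answer.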
- Email (PIMS modular component)
  - Read (hear) and reply to email from any phone
  - Navigation keywords: Next, Back, First, Reply, Remove, Repeat
  - Email is read using the AT&T/SpeechWorks Speechify™ Text-to-Speech engine
  - Support for the following attachments: MP3, Real Audio, Text, Wave
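One way the navigation keywords could move a cursor through a mailbox is sketched below. The `navigate` function and its behaviour at the ends of the list are assumptions made for illustration; the description only names the keywords.

```python
def navigate(messages: list[str], commands: list[str]) -> tuple[int, list[str]]:
    """Apply spoken navigation keywords; return the cursor position and remaining messages."""
    inbox = list(messages)  # work on a copy so the caller's list is untouched
    pos = 0
    for command in commands:
        word = command.lower()
        if word == "next" and pos < len(inbox) - 1:
            pos += 1
        elif word == "back" and pos > 0:
            pos -= 1
        elif word == "first":
            pos = 0
        elif word == "remove" and inbox:
            del inbox[pos]
            # Keep the cursor on a valid message after deleting the last one.
            pos = min(pos, max(len(inbox) - 1, 0))
        # "repeat" re-reads the current message and "reply" would start a
        # recording; neither moves the cursor in this sketch.
    return pos, inbox
```

In this sketch, Next and Back are ignored at the ends of the mailbox rather than wrapping around, which is an assumed design choice.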
- Flight Tracker
  - Check the status of all domestic and international flights that depart from and/or arrive at any U.S. airport
  - Search for a specific flight based on a variety of fields: airline, flight number, departure time, arrival time, arrival city, departure city
  - Application supports multiple airports per city
  - Can check for all flights with similar characteristics (e.g. approximate arrival or departure time, different airport in same city, etc.)
  - Data provided includes flight status (departed, in flight, arrived), current altitude and speed (in flight only), estimated time of arrival (in flight only), and actual time of arrival (arrived only)
- Horoscopes
  - Horoscopes for yesterday, today, and tomorrow offered
  - User can select horoscope by specifying zodiac sign or date of birth
  - Information is updated daily
- Login/Registration
  - Login and registration functionality for customers wishing to offer personalized and/or member-based voice services to their end users
  - User verification utilizes a combination of a 4-15 digit mailbox number (login) and a 4-15 digit PIN (password)
  - Login can be automatic or semi-automatic
  - Registration process can include optional personal information and/or voice surveys, depending on customer requirements
  - Can be integrated with all other applications
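The stated format of the login credentials (a 4-15 digit mailbox number and a 4-15 digit PIN) can be checked as in the following sketch. The function name is hypothetical, and only the format is validated; how the credentials are matched against stored records is not described here.

```python
import re

# 4 to 15 ASCII digits, per the description; anything else is rejected.
_DIGITS_4_TO_15 = re.compile(r"[0-9]{4,15}")


def is_well_formed_login(mailbox_number: str, pin: str) -> bool:
    """Return True when both the mailbox number and the PIN are 4-15 digits long."""
    return (
        _DIGITS_4_TO_15.fullmatch(mailbox_number) is not None
        and _DIGITS_4_TO_15.fullmatch(pin) is not None
    )
```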
- Measurement Conversion
  - Convert between metric and U.S. measurement units
  - Weight: kilogram, pound, ounce
  - Liquids: liters, pints, quarts, gallons
  - Distances: kilometers, meters, centimeters, yards, feet, inches
  - Speeds: km/h, mph
  - Temperature: Celsius, Fahrenheit
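The conversions listed above can be sketched with a table of factors to a base unit per category. The `convert` function and the unit spellings are hypothetical, and the liquid factors assume U.S. liquid measures; the description does not state which definitions the module uses.

```python
# Factors to a base unit per category (kilogram, liter, meter, km/h).
# Assumption: pints, quarts, and gallons are U.S. liquid measures.
_GALLON_IN_LITERS = 3.785411784
_POUND_IN_KG = 0.45359237

TO_BASE = {
    "weight": {"kilogram": 1.0, "pound": _POUND_IN_KG, "ounce": _POUND_IN_KG / 16},
    "liquid": {
        "liter": 1.0,
        "gallon": _GALLON_IN_LITERS,
        "quart": _GALLON_IN_LITERS / 4,
        "pint": _GALLON_IN_LITERS / 8,
    },
    "distance": {
        "kilometer": 1000.0, "meter": 1.0, "centimeter": 0.01,
        "yard": 0.9144, "foot": 0.3048, "inch": 0.0254,
    },
    "speed": {"km/h": 1.0, "mph": 1.609344},
}


def convert(value: float, category: str, src: str, dst: str) -> float:
    """Convert value from src to dst within one measurement category."""
    if category == "temperature":
        # Temperature needs an offset, so it cannot use a plain factor table.
        if src == dst:
            return float(value)
        if (src, dst) == ("celsius", "fahrenheit"):
            return value * 9 / 5 + 32
        if (src, dst) == ("fahrenheit", "celsius"):
            return (value - 32) * 5 / 9
        raise ValueError(f"unknown temperature units: {src}, {dst}")
    factors = TO_BASE[category]
    return value * factors[src] / factors[dst]
```

Routing every conversion through a base unit keeps the table linear in the number of units instead of requiring one factor per unit pair.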
- Outbound Alerts (PIMS modular component. Note: Outbound Alerts incur additional phone charges.)
  - Alerts can be set either via voice or web-based interface
  - Provision for various U.S. time zones; user can also pick one time zone as the default setting
  - The following variables can be specified per alert: time of alert, time zone, phone number to dial, custom voice message
  - Phone number confirmation (to ensure that the number to dial belongs to the user setting the alert)
  - Recurring alerts (daily, weekly, etc.) can also be specified via the web-based interface
  - Must be used in conjunction with Login/Registration module
  - Can be integrated with Calendar module for notification of appointments
  - Can also be integrated with Sprint Express module for personalized content delivery (e.g. wake up call with a daily news and stock portfolio update)
- Anita Express
  - Personalized content delivery for registered users ("my page" functionality)
  - Users can predefine which content categories to play upon entering Sprint Express (e.g. Stocks, Weather, and Sports)
  - Within each content category, users can predetermine the specific content they want to hear in Sprint Express (e.g. Weather for Dallas)
  - Content can be personalized via voice or web-based interface
  - Can be integrated with Outbound Alerts to deliver content to a user-specified phone number on a timely basis
  - Must be used in conjunction with Login/Registration module
- Stock Portfolio
  - Create and modify custom portfolios with Stock Portfolio
  - All standard information fields for stock quotes are available and customizable
  - Additional customizable information fields: purchase price (user-defined), date of purchase (user-defined), shares owned (user-defined), total portfolio value, total cost, daily dollar and percentage change in value, total dollar and percentage change in value
  - Stocks in portfolio are read in alphabetical order
  - User can skip to next stock, go back to previous stock, obtain detailed information on a particular stock, or remove a stock from the portfolio at any time while in the Stock Portfolio module
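The derived portfolio fields (total value, total cost, and the daily and total changes) follow from the user-defined fields by simple arithmetic. The sketch below assumes hypothetical names (`Holding`, `summarize`, `reading_order`) and assumes that the daily change is measured against the previous close; the description does not define the reference price.

```python
from dataclasses import dataclass


@dataclass
class Holding:
    symbol: str
    shares: float            # user-defined
    purchase_price: float    # user-defined
    last_price: float        # from the quote feed
    previous_close: float    # assumed basis for the daily change


def _percent(delta: float, base: float) -> float:
    # Guard against an empty or zero-valued base.
    return 100.0 * delta / base if base else 0.0


def summarize(holdings: list[Holding]) -> dict[str, float]:
    """Compute the portfolio-level fields named in the description."""
    value = sum(h.shares * h.last_price for h in holdings)
    cost = sum(h.shares * h.purchase_price for h in holdings)
    previous = sum(h.shares * h.previous_close for h in holdings)
    return {
        "total_value": value,
        "total_cost": cost,
        "daily_change": value - previous,
        "daily_change_pct": _percent(value - previous, previous),
        "total_change": value - cost,
        "total_change_pct": _percent(value - cost, cost),
    }


def reading_order(holdings: list[Holding]) -> list[str]:
    """Stocks are read in alphabetical order of their symbols."""
    return sorted(h.symbol for h in holdings)
```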
- Tipping Guide
  - Suggests and calculates an appropriate tip based on cost of meal, level of service, and taxes paid
  - Divides up bill based on number of diners in the party
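A tip calculation of this kind might look like the sketch below. The tip rates per service level, the choice to tip on the pre-tax meal cost, and the function name are all assumptions; the description does not give the formula.

```python
# Assumed tip rates per level of service; not taken from the description.
TIP_RATES = {"poor": 0.10, "good": 0.15, "excellent": 0.20}


def suggest_tip(meal_cost: float, service: str, tax_paid: float, diners: int = 1) -> dict[str, float]:
    """Suggest a tip on the pre-tax meal cost and split the bill evenly."""
    if diners < 1:
        raise ValueError("at least one diner is required")
    tip = round(meal_cost * TIP_RATES[service], 2)
    total = round(meal_cost + tax_paid + tip, 2)
    return {
        "tip": tip,
        "total": total,
        "per_person": round(total / diners, 2),
    }
```

For example, `suggest_tip(100.0, "good", 8.0, diners=4)` suggests a 15.00 tip under these assumed rates and divides the 123.00 total into four shares.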
- ANI Automatic Number ID
- Calendar (PIMS modular component)
  - Manage appointments and set up personal reminders with HeyAnita's Calendar module
  - Appointments can be created, updated, or deleted via voice and/or web-based interfaces
  - User can record a detailed audio message associated with each calendar event
  - Checks for conflicting and adjacent appointments and prompts the user accordingly
  - Can be integrated with Outbound Alerts module for enhanced functionality
  - Must be used in conjunction with Login/Registration module
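The check for conflicting and adjacent appointments can be sketched as an interval comparison. The function name is hypothetical, and "adjacent" is assumed here to mean back-to-back (one appointment ends exactly when the other starts); the description does not define the term.

```python
from datetime import datetime

Appointment = tuple[datetime, datetime]  # (start, end)


def check_appointment(
    new_start: datetime, new_end: datetime, existing: list[Appointment]
) -> tuple[list[Appointment], list[Appointment]]:
    """Return (conflicting, adjacent) appointments for a proposed time slot."""
    # Two half-open intervals overlap when each starts before the other ends.
    conflicting = [(s, e) for s, e in existing if new_start < e and s < new_end]
    # Assumed meaning of "adjacent": the slots touch without overlapping.
    adjacent = [(s, e) for s, e in existing if e == new_start or s == new_end]
    return conflicting, adjacent
```

A caller could prompt the user when either list is non-empty, for example by reading back the overlapping appointment before saving the new one.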
- Driving Directions
  - Provides quickest route to caller-specified destination within the US
  - Allows user to designate starting point using multiple variables
  - User can determine speed of route playback by utilizing pause command
  - Full functionality to be determined
- Locator
  - Locate business in your vicinity
  - Provides ability to connect user to business
  - Provides driving directions
- Movies
  - Obtain show times
  - Purchase tickets
  - Obtain review information and movie sound clips
  - Locate theatres
  - Provides driving directions
- Traffic
  - Obtain traffic information
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Business, Economics & Management (AREA)
- Marketing (AREA)
- Telephonic Communication Services (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Electrically Operated Instructional Devices (AREA)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU29288/01A AU2928801A (en) | 2000-01-04 | 2001-01-04 | Interactive voice response system |
US10/188,585 US20030078779A1 (en) | 2000-01-04 | 2002-07-03 | Interactive voice response system |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17437100P | 2000-01-04 | 2000-01-04 | |
US60/174,371 | 2000-01-04 | | |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/188,585 Continuation US20030078779A1 (en) | 2000-01-04 | 2002-07-03 | Interactive voice response system |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2001050453A2 (fr) | 2001-07-12 |
WO2001050453A3 (fr) | 2002-01-31 |
Family
ID=22635919
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2001/000376 WO2001050453A2 (fr) | 2000-01-04 | 2001-01-04 | Interactive voice response system |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU2928801A (fr) |
WO (1) | WO2001050453A2 (fr) |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2002089112A1 (fr) * | 2001-05-02 | 2002-11-07 | Vox Generation Limited | Adaptive learning of language models for speech recognition |
EP1283634A1 (fr) * | 2001-08-09 | 2003-02-12 | France Telecom | Proximity information service |
EP1293963A1 (fr) * | 2001-09-07 | 2003-03-19 | Sony International (Europe) GmbH | Dialogue management server architecture for dialogue systems |
FR2834167A1 (fr) * | 2001-12-26 | 2003-06-27 | France Telecom | Currency conversion service and server |
FR2835999A1 (fr) * | 2002-02-13 | 2003-08-15 | France Telecom | Editing and consultation of interactive telephone voice services |
EP1351477A1 (fr) * | 2002-04-03 | 2003-10-08 | BRITISH TELECOMMUNICATIONS public limited company | System and method for constructing a structured data representation for a voice interface |
EP1473916A1 (fr) * | 2003-04-29 | 2004-11-03 | Intervoice Limited Partnership | Web service call flow speech components |
DE10317497A1 (de) * | 2003-04-16 | 2004-11-25 | Abb Patent Gmbh | System for communication between a field device and an operating device |
US6868153B2 (en) | 2002-03-12 | 2005-03-15 | Rockwell Electronic Commerce Technologies, Llc | Customer touch-point scoring system |
EP1619663A1 (fr) * | 2004-07-13 | 2006-01-25 | Hewlett-Packard Development Company, L.P. | Speech-enabled computer application system |
EP1750253A1 (fr) * | 2005-08-04 | 2007-02-07 | Harman Becker Automotive Systems GmbH | Integrated speech dialog system |
US7260537B2 (en) | 2003-03-25 | 2007-08-21 | International Business Machines Corporation | Disambiguating results within a speech based IVR session |
US7809578B2 (en) | 2002-07-17 | 2010-10-05 | Nokia Corporation | Mobile device having voice user interface, and a method for testing the compatibility of an application with the mobile device |
GB2473894A (en) * | 2009-09-24 | 2011-03-30 | Avaya Inc | Improving the temporal order and the hierarchical order of menu items in an Interactive Voice Response system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2307619A (en) * | 1995-11-21 | 1997-05-28 | Alexander James Pollitt | Internet information access system |
EP0848373A2 (fr) * | 1996-12-13 | 1998-06-17 | Siemens Corporate Research, Inc. | Interactive communication system |
US5884262A (en) * | 1996-03-28 | 1999-03-16 | Bell Atlantic Network Services, Inc. | Computer network audio access and conversion system |
WO1999046920A1 (fr) * | 1998-03-10 | 1999-09-16 | Siemens Corporate Research, Inc. | Web browsing system using a conventional telephone |
2001
- 2001-01-04: AU application AU29288/01A (AU2928801A), status: Abandoned (not active)
- 2001-01-04: WO application PCT/US2001/000376 (WO2001050453A2), status: Application Filing (active)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2307619A (en) * | 1995-11-21 | 1997-05-28 | Alexander James Pollitt | Internet information access system |
US5884262A (en) * | 1996-03-28 | 1999-03-16 | Bell Atlantic Network Services, Inc. | Computer network audio access and conversion system |
EP0848373A2 (fr) * | 1996-12-13 | 1998-06-17 | Siemens Corporate Research, Inc. | Interactive communication system |
WO1999046920A1 (fr) * | 1998-03-10 | 1999-09-16 | Siemens Corporate Research, Inc. | Web browsing system using a conventional telephone |
Non-Patent Citations (2)
Title |
---|
ATKINS D L ET AL: "INTEGRATED WEB AND TELEPHONE SERVICE CREATION" BELL LABS TECHNICAL JOURNAL,US,BELL LABORATORIES, vol. 2, no. 1, 21 December 1997 (1997-12-21), pages 19-35, XP000659566 ISSN: 1089-7089 * |
T V RAMAN: "ASTER - Towards Modality-Independent Electronic Documents" DAGS. CONFERENCE ON ELECTRONIC PUBLISHING AND THE INFORMATION HIGHWAY, 30 May 1995 (1995-05-30), XP002087401 * |
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2391680B (en) * | 2001-05-02 | 2005-07-20 | Vox Generation Ltd | Adaptive learning of language models for speech recognition |
GB2391680A (en) * | 2001-05-02 | 2004-02-11 | Vox Generation Ltd | Adaptive learning of language models for speech recognition |
WO2002089112A1 (fr) * | 2001-05-02 | 2002-11-07 | Vox Generation Limited | Adaptive learning of language models for speech recognition |
EP1283634A1 (fr) * | 2001-08-09 | 2003-02-12 | France Telecom | Proximity information service |
FR2828612A1 (fr) * | 2001-08-09 | 2003-02-14 | France Telecom | Proximity information service |
EP1293963A1 (fr) * | 2001-09-07 | 2003-03-19 | Sony International (Europe) GmbH | Dialogue management server architecture for dialogue systems |
FR2834167A1 (fr) * | 2001-12-26 | 2003-06-27 | France Telecom | Currency conversion service and server |
FR2835999A1 (fr) * | 2002-02-13 | 2003-08-15 | France Telecom | Editing and consultation of interactive telephone voice services |
WO2003069921A1 (fr) * | 2002-02-13 | 2003-08-21 | France Telecom | Interactive telephone voice services |
US6868153B2 (en) | 2002-03-12 | 2005-03-15 | Rockwell Electronic Commerce Technologies, Llc | Customer touch-point scoring system |
EP1351477A1 (fr) * | 2002-04-03 | 2003-10-08 | BRITISH TELECOMMUNICATIONS public limited company | System and method for constructing a structured data representation for a voice interface |
US7809578B2 (en) | 2002-07-17 | 2010-10-05 | Nokia Corporation | Mobile device having voice user interface, and a method for testing the compatibility of an application with the mobile device |
US7260537B2 (en) | 2003-03-25 | 2007-08-21 | International Business Machines Corporation | Disambiguating results within a speech based IVR session |
DE10317497A1 (de) * | 2003-04-16 | 2004-11-25 | Abb Patent Gmbh | System for communication between a field device and an operating device |
DE10317497B4 (de) * | 2003-04-16 | 2013-10-17 | Abb Ag | System for communication between a field device and an operating device |
US7269562B2 (en) | 2003-04-29 | 2007-09-11 | Intervoice Limited Partnership | Web service call flow speech components |
EP1473916A1 (fr) * | 2003-04-29 | 2004-11-03 | Intervoice Limited Partnership | Web service call flow speech components |
EP1619663A1 (fr) * | 2004-07-13 | 2006-01-25 | Hewlett-Packard Development Company, L.P. | Speech-enabled computer application system |
EP1750253A1 (fr) * | 2005-08-04 | 2007-02-07 | Harman Becker Automotive Systems GmbH | Integrated speech dialog system |
GB2473894A (en) * | 2009-09-24 | 2011-03-30 | Avaya Inc | Improving the temporal order and the hierarchical order of menu items in an Interactive Voice Response system |
US8494148B2 (en) | 2009-09-24 | 2013-07-23 | Avaya, Inc. | Dynamic IVR dialog based on analytics data |
Also Published As
Publication number | Publication date |
---|---|
AU2928801A (en) | 2001-07-16 |
WO2001050453A3 (fr) | 2002-01-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20030078779A1 (en) | Interactive voice response system | |
CN101297355B (zh) | Systems and methods for responding to natural language speech utterance | |
US7103563B1 (en) | System and method for advertising with an internet voice portal | |
US6895084B1 (en) | System and method for generating voice pages with included audio files for use in a voice page delivery system | |
US8122057B2 (en) | System and method for the transformation and canonicalization of semantically structured data | |
US8521585B2 (en) | System and method for using voice over a telephone to access, process, and carry out transactions over the internet | |
US7457397B1 (en) | Voice page directory system in a voice page creation and delivery system | |
CA2400073C (fr) | System and method for voice access to internet-based information | |
US8849659B2 (en) | Spoken mobile engine for analyzing a multimedia data stream | |
US6687734B1 (en) | System and method for determining if one web site has the same information as another web site | |
US8874446B2 (en) | System and method for funneling user responses in an internet voice portal system to determine a desired item or servicebackground of the invention | |
US20100204994A1 (en) | Systems and methods for responding to natural language speech utterance | |
CN101292282A (zh) | Mobile systems and methods of supporting natural language human-machine interactions | |
US20070208564A1 (en) | Telephone based search system | |
WO2001050453A2 (fr) | Interactive voice response system | |
Pargellis et al. | An automatic dialogue generation platform for personalized dialogue applications | |
WO2001071538A2 (fr) | System and method for non-programming development of rules used in the transformation of web-based information |
Legal Events
Code | Title | Description |
---|---|---|
AK | Designated states | Kind code of ref document: A2. Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
AL | Designated countries for regional patents | Kind code of ref document: A2. Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101) | |
AK | Designated states | Kind code of ref document: A3. Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
AL | Designated countries for regional patents | Kind code of ref document: A3. Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
WWE | WIPO information: entry into national phase | Ref document number: 10188585. Country of ref document: US |
REG | Reference to national code | Ref country code: DE. Ref legal event code: 8642 |
122 | EP: PCT application non-entry in European phase | |
NENP | Non-entry into the national phase in: | Ref country code: JP |