US8601096B2 - Method and system for multi-modal communication - Google Patents
Method and system for multi-modal communication
- Publication number
- US8601096B2 (application US10/145,304 / US14530402A)
- Authority
- US
- United States
- Prior art keywords
- control command
- input
- component
- output component
- terminal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/03—Arrangements for converting the position or the displacement of a member into a coded form
- G06F3/033—Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
- G06F3/038—Control and interface arrangements therefor, e.g. drivers or device-embedded control circuitry
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/53—Network services using third party service providers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/75—Indicating network or usage conditions on the user display
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/40—Network security protocols
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72403—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
- H04M1/72445—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality for supporting Internet browser applications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/30—Definitions, standards or architectural aspects of layered protocol stacks
- H04L69/32—Architecture of open systems interconnection [OSI] 7-layer type protocol stacks, e.g. the interfaces between the data link level and the physical level
- H04L69/322—Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions
- H04L69/329—Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions in the application layer [OSI layer 7]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/26—Devices for calling a subscriber
- H04M1/27—Devices whereby a plurality of signals may be stored simultaneously
- H04M1/271—Devices whereby a plurality of signals may be stored simultaneously controlled by voice recognition
Definitions
- the invention relates generally to communication devices and methods and more particularly to communication devices and methods utilizing multi-modal communication.
- An emerging area of technology with terminal devices is the application of multi-modal information transfer.
- terminal devices such as hand-held devices, mobile phones, laptops, PDAs, internet appliances, desktop computers, or other suitable devices
- the browser is a program which allows the user to enter information fetch requests, receive requested information, and navigate through content servers via internal, e.g. intranet, or external, e.g. internet, connections.
- the browser may be a graphical browser, voice browser, JAVA®-based application, software program application, or any other suitable browser as recognized by one of ordinary skill in the art.
- Multi-modal technology allows a user to access information, such as voice information, data encryption, video information, audio information or other information, through at least one browser. More specifically, the user may submit an information fetch request in one mode, such as speaking a fetch request into a microphone, and receive the requested information in any of a plurality of modes, such as a first mode, i.e. audible output, or a second mode, i.e. graphical display.
- the browser may work in a manner similar to a standard web browser, such as NETSCAPE NAVIGATOR® resident on a computer connected to a network.
- the browser receives an information fetch request from an end user, commonly in the form of a universal resource indicator (URI), a bookmark, touch entry, key-entry, voice command, etc.
- the browser interprets the fetch request and then sends the information fetch request to the appropriate content server, such as a commercially available content server, e.g. a weather database via the internet, an intranet server etc.
- the requested information is then provided back to the browser.
- the information is encoded as mark-up language for the browser to decode, such as hypertext mark-up language (HTML), wireless mark-up language (WML), eXtensible mark-up language (XML), Voice eXtensible Mark-up Language (VoiceXML), Extensible HyperText Markup Language (XHTML), or other such mark-up languages.
- FIG. 1 illustrates a general block diagram of a multi-modal communication system receiving a control command in accordance with one embodiment of the present invention
- FIG. 2 illustrates a thin client multi-modal communication system for receiving a control command, in accordance with one embodiment of the present invention
- FIG. 3 illustrates a thick client multi-modal communication system for receiving a control command, in accordance with one embodiment of the present invention
- FIG. 4 is a flow chart illustrating a method for multi-modal communications receiving a control command, in accordance with one embodiment of the present invention.
- FIG. 5 is a flow chart illustrating a method for multi-modal communication receiving a control command, in accordance with one embodiment of the present invention.
- the method and system for multi-modal communication receives an encoded control command from a content server, such as a commercially available content server, e.g. a banking server, an internal server disposed on an intranet, or any other server accessible via the network.
- the method and system further decodes the control command to generate a decoded control command and provides the decoded control command to a control unit, wherein the control unit modifies, such as enabling or disabling, various input and output components.
- in response to the decoded command, the method and system disables at least one input component and/or output component, where the input component may be a microphone disposed within an audio subsystem, a speech detector/controller, a speech recognition engine, a keypad, an input encoder, a touch screen, a handwriting engine, etc., and where the output component may be a speaker disposed within the audio subsystem, a display, etc.
- the method and system further includes enabling at least one input component and/or at least one output component in response to the decoded control command.
- the method and system may also enable at least one input component or at least one output component without disabling any of the input or output components, or disable at least one input component or at least one output component without enabling any of the input or output components.
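The enable/disable behavior described above can be sketched in a few lines. This is an illustrative sketch only: the `Component` and `ControlUnit` classes, their names, and the dictionary form of a decoded control command are assumptions, not interfaces taken from the disclosure.

```python
class Component:
    """An input or output component (e.g. a microphone or display)
    that the control unit may enable or disable."""
    def __init__(self, name: str):
        self.name = name
        self.enabled = True


class ControlUnit:
    """Applies a decoded control command to the terminal's components."""
    def __init__(self, components: dict):
        self.components = components

    def apply(self, command: dict) -> None:
        # A command may enable some components, disable others, or do
        # only one of the two, matching the variations described above.
        for name in command.get("enable", []):
            self.components[name].enabled = True
        for name in command.get("disable", []):
            self.components[name].enabled = False


# Example: disable the microphone while simultaneously enabling the keypad.
unit = ControlUnit({"microphone": Component("microphone"),
                    "keypad": Component("keypad")})
unit.apply({"disable": ["microphone"], "enable": ["keypad"]})
print(unit.components["microphone"].enabled)  # False
print(unit.components["keypad"].enabled)      # True
```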
- prior to receiving an encoded control command, the method and system for multi-modal communication receives an information fetch request, such as a URI, which may be provided by a user or another suitable means.
- the information request is for requested information, such as personal information stored on a commercially available content server, weather information from a weather database, etc.
- the information request is provided to a dialog manager, such as a multi-modal browser, a graphical browser, a voice browser, or any other dialog manager.
- the method and system accesses the content server to retrieve the requested information, wherein the control command is encoded within the requested information.
- FIG. 1 illustrates a multi-modal communication system for receiving an encoded control command.
- the system 100 includes an input component 102 , a control unit 104 and an output component 106 coupled to the control unit 104 via connection 112 .
- the input component 102 represents at least one component used to receive or modify an input signal.
- the output component 106 represents at least one component used to provide or modify an output signal.
- the input component 102 is coupled to the control unit 104 via connection 110 and the output component is coupled to the control unit via connection 112 .
- a dialog manager 114 is coupled to the control unit 104 via connection 116 for receiving an input command, such as 118 , from an input component and coupled to the control unit 104 via connection 120 for providing an output command to the output component 106 , through the control unit 104 .
- a content server 122 is coupled to the dialog manager, via connection 124 , for receiving information requests and providing the requested information from the content server 122 back to the dialog manager 114 .
- the system 100 further contains a control server 126 coupled to the dialog manager 114 via connection 128 , wherein the control server 126 contains a plurality of control commands which may be used by the control unit 104 to modify the input component 102 and/or the output component 106 .
- FIG. 1 illustrates the content server 122 and the control server 126 as two separate servers, but as recognized by one skilled in the art, these servers may be disposed within a single content/control server.
- An end user provides an information request, such as 118 , for requested information to the input component 102 .
- the input component 102 provides this information request to the dialog manager 114 through the control unit 104 .
- the dialog manager 114 decodes the information request and then provides the information request to the content server 122 , via connection 124 .
- the content server 122 provides the requested information, with a control command encoded therein, to the dialog manager 114 .
- the dialog manager 114 decodes the control command by parsing out the control command from the requested information. This control command is then provided to the control unit 104 via connection 120 , whereupon the control unit 104 modifies the input component 102 and/or the output component 106 .
- the dialog manager 114 decodes the requested information and parses out a control command reference indicator, such as a URI.
- the dialog manager 114 then accesses the control server 126 , via connection 128 , to retrieve the control command as indicated by the reference indicator.
- the retrieved control command is then provided to the control unit 104 via connection 120 for the appropriate modification of the input component 102 and/or the output component 106 .
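The two decoding paths described above, an inline control command parsed out of the requested information, or a reference indicator resolved against the control server, can be sketched as follows. The disclosure does not fix a wire format, so the XML element names (`control`, `control-ref`) and the fetch callback are hypothetical stand-ins:

```python
import xml.etree.ElementTree as ET


def decode_control_command(markup: str, fetch_from_control_server=None):
    """Return the control command carried by the requested information:
    either embedded inline, or retrieved from the control server via a
    reference indicator (e.g. a URI)."""
    root = ET.fromstring(markup)
    inline = root.find("control")
    if inline is not None:
        return dict(inline.attrib)        # command embedded in the content
    ref = root.find("control-ref")
    if ref is not None and fetch_from_control_server is not None:
        return fetch_from_control_server(ref.get("uri"))  # resolve indicator
    return None


doc = "<page><control disable='microphone' enable='keypad'/></page>"
print(decode_control_command(doc))
# {'disable': 'microphone', 'enable': 'keypad'}
```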
- the control unit 104 in response to the control command, performs at least one of a plurality of functions. Specifically, the control unit 104 modifies, either through enabling or disabling, at least one input component 102 or at least one output component 106 . Moreover, the control unit 104 may modify multiple components, such as enabling the input component 102 while simultaneously disabling the output component 106 .
- the present invention provides for the reception of a control command, which is received by a dialog manager 114 where it is decoded, and provided to the control unit 104 .
- input component 102 and output component 106 may be enabled or disabled.
- a control command is provided with the encoded information.
- the dialog manager 114 decodes the control command and provides the control command to the control unit 104 for disabling the input component 102 , requiring the user to listen to the full disclaimer without being able to interrupt or terminate the audio transmission.
- FIG. 2 illustrates a thin client embodiment of the multi-modal communication system 100 of FIG. 1 , further illustrating input components and output components.
- the system 100 contains the terminal 102 having a terminal session control unit 138 operably coupled to a gateway control unit 140 disposed within a gateway 130 via connection 132 .
- the terminal control unit 138 and the gateway control unit 140 together correspond to the control unit 104 of FIG. 1 .
- the terminal 102 further contains a speaker 142 , a microphone 143 , a speech recognition engine 144 , a speech detector/controller 146 , a handwriting engine 148 , a keypad 150 , a touchscreen 152 , a display 154 , an input encoder 156 and a terminal dialog manager 157 .
- the touchscreen 152 and display 154 may be the same component, but provide for different forms of interaction, either as an input component or an output component and have therefore been separately illustrated.
- the terminal input and output components of FIG. 2 are illustrative, and not a conclusive list of suitable input and output components for use in the multi-modal communication system.
- the input components and output components are operably coupled to the terminal control unit 138 and each other via a bus, designated generally at 158 .
- the terminal dialog manager 157 contains a graphical browser and the dialog manager 116 disposed on the gateway contains a voice browser.
- the system 100 operates as discussed above with reference to FIG. 1 .
- the microphone 143 receives an audio information request for requested information, such as request 118 in FIG. 1 .
- the microphone 143 provides this request to the speech recognition engine 144 where it is recognized and provided as an input to the dialog manager 116 across bus 132 .
- the input may be provided through any of the input components and further provided to the dialog manager 116 , such as entered within the keypad 150 or entered on a touchscreen 152 .
- the dialog manager 116 fetches the information from the designated content server 122 via connection 164 . Contained within the requested information is the control command, which is decoded by the dialog manager 116 .
- the dialog manager 116 decodes the control command and provides the control command to the gateway control unit 140 via connection 158 .
- the dialog manager 116 parses out a reference indicator, such as a URI, and accesses the control server 126 , via connection 166 , to retrieve the control command.
- the decoded control command is provided to the terminal control unit 138 from the gateway control unit 140 , via connection 132 .
- the terminal control unit 138 modifies at least one input component and/or one output component in response to the decoded control command.
- at least one input component may be enabled or disabled and/or at least one output component may be enabled or disabled.
- a control command may be provided to restrict a user from speaking the PIN into a microphone 143 .
- the dialog manager 116 decodes the control command, which is then provided to the terminal control unit 138 , via the gateway control unit 140 , whereupon the audio subsystem is disabled so as to disable audio input.
- the keypad 150 may be enabled and the input encoder 156 may be enabled to encode the information as it is provided from the user on the keypad 150 .
- the input encoder 156 may be further enabled to provide a non-recognition character, such as an asterisk or other non-alphanumeric digit, upon the display 154 as each entry of the PIN is entered into the keypad 150 . Thereupon, the system provides for the reception of control commands that can enable or disable specific input and/or output components on a terminal device.
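The PIN-masking behavior described above can be sketched as follows. The `PinEntry` class and its method names are hypothetical; the point is that the actual keypad entries are retained for submission while only a non-recognition character (here an asterisk) is echoed to the display:

```python
class PinEntry:
    """Input encoder sketch: keeps the real PIN digits but echoes
    only a non-recognition character to the display."""
    def __init__(self, mask_char: str = "*"):
        self.mask_char = mask_char
        self._digits = []          # actual PIN, never shown on the display
        self.display = ""          # what the user sees

    def key_press(self, digit: str) -> None:
        self._digits.append(digit)
        self.display += self.mask_char   # echo one mask character per entry

    def submit(self) -> str:
        return "".join(self._digits)


entry = PinEntry()
for d in "1234":
    entry.key_press(d)
print(entry.display)   # ****
print(entry.submit())  # 1234
```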
- FIG. 3 illustrates a thick client embodiment of a communication system, in accordance with one embodiment of the present invention.
- the system 200 has a terminal 202 operably coupled to a content server 122 via connection 204 .
- the terminal 202 is further coupled to a control server 126 , via connection 206 .
- these connections 204 and 206 are dynamically made.
- the terminal 202 contains a plurality of input components and a plurality of output components and a control unit 104 .
- the terminal 202 has a dialog manager 114 operably coupled to control unit 104 , via connection 216 and further coupled to content server 122 and the control server 126 .
- the dialog manager 114 is coupled to the plurality of input and output devices, through the control unit 104 , via bus 214 .
- FIG. 3 illustrates the terminal 202 having a speech detector/controller 146 , an audio subsystem 216 , a speech recognition engine 144 , a handwriting engine 148 and a media module 218 . These elements represent several of the various input and output devices which can be utilized within the terminal 202 and in conjunction with the dialog manager 114 .
- the input and output components are an illustrative list only and not a comprehensive list, as any other suitable input and output devices may be coupled to the dialog manager 114 via the bus 214 and through the control unit 104 .
- the thick client allows a user to interact in a multi-modal system and receive control commands from a content server 122 or control command requests which are retrieved from the content server 122 .
- the terminal 202 of FIG. 3 is recognized as a thick client because most input and output components are disposed on the terminal 202 .
- an information fetch request for a requested information is provided from an input device, such as the handwriting engine 148 , to the dialog manager 114 , via bus 214
- the dialog manager 114 decodes the information request and further provides the request to the content server 122 , via connection 204 , to retrieve the requested information.
- the dialog manager 114 decodes the requested information and parses out the control command, providing the control command to the control unit 104 .
- the control unit 104 modifies at least an input component and/or an output component based on the control command.
- the dialog manager 114 parses out a control command request from the information retrieved from the content server and retrieves the proper control command from the control server 126 via connection 206 . Regardless of whether the control command is embedded in the retrieved information or is a link to the control server 126 , once the dialog manager 114 receives the control command, it is forwarded to the control unit 104 , where at least one of the input components or at least one of the output components is modified.
- FIG. 4 is a flowchart illustrating a method for multi-modal communication receiving a control command.
- the method begins, 300 , upon receiving an information fetch request for requested information 302 .
- a typical example of an information request is a URI request, such as accessing a specific server on a network.
- the next step, 304 , is accessing a content server to retrieve the requested information, wherein a control command is encoded within the requested information.
- the control command is received from the content server, in combination with the requested information.
- the requested information may contain a reference indicator wherein the dialog manager retrieves the control command from a control server based upon the reference indicator, as discussed below with reference to FIG. 5 .
- the control command is then decoded, designated at step 308 .
- decoding occurs when the control unit parses the control command from the requested information.
- the control command is then provided to a control unit, designated at step 310 .
- at least one input component and/or at least one output component is modified, designated at step 312 .
- the step of modifying further comprises enabling or disabling at least one of the following: at least one input component and at least one output component, designated at 314 .
- FIG. 5 illustrates another embodiment of the method for multi-modal communication of the present invention.
- FIG. 5 is similar to the method of FIG. 4 , wherein FIG. 5 illustrates three specific examples of input and/or output component modifications.
- the method begins, 320 , when an information fetch request for requested information is received, step 322 .
- the system then accesses a content server to retrieve the requested information, designated at step 324 .
- the control command is disposed within the requested information.
- a reference indicator is disposed in the requested information, wherein the reference indicator, such as a URI, references a control command stored in the control server.
- the system receives the control command.
- the control command is received from the content server, in conjunction with the requested information, wherein the control command is encoded therein.
- the control command is received from the control server wherein the reference indicator is encoded within the requested information and the dialog manager accesses the control server using the reference indicator to retrieve the control command. Thereupon, the control command is then provided to a control unit, step 330 .
- in response to the control command, the system enables an input encoder such that an input command is encoded in a first encoding scheme, step 332 .
- the first encoding scheme may be a security encryption encoding scheme, such as a 64-bit encoding scheme, to encrypt the input.
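The patent leaves the "first encoding scheme" unspecified, so the sketch below uses Base64 purely as a stand-in to show the input encoder being switched on by a control command; a real deployment would substitute actual encryption. The `InputEncoder` class and its flag are assumptions:

```python
import base64


class InputEncoder:
    """Encodes user input with a first encoding scheme when enabled
    by a control command; passes input through otherwise."""
    def __init__(self):
        self.encoding_enabled = False

    def process(self, user_input: str) -> str:
        if self.encoding_enabled:
            # Base64 stands in for the (unspecified) encryption scheme.
            return base64.b64encode(user_input.encode()).decode()
        return user_input


encoder = InputEncoder()
encoder.encoding_enabled = True      # set in response to the control command
print(encoder.process("1234"))       # MTIzNA==
```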
- an output on a display of at least one keypad entry is disabled and an output of non-recognition characters on the display is enabled, wherein the non-recognition character corresponds to each of the keypad entries.
- a typical non-recognition character is an asterisk, ampersand, number symbol, or any other character wherein the visible non-recognition character does not disclose the actual entered character.
- a common example of this embodiment is the manual entry of a PIN, wherein for security reasons, the actual PIN is not displayed on the screen.
- in response to the control command, a speech detector/controller is disabled, thereby limiting user input via speech while an audio output is being provided to a speaker within an audio subsystem, designated at 336 .
- a common example of this embodiment may occur when a user accesses a server which requires a disclaimer to be provided to the user.
- the system requires the user to listen to the disclaimer by not recognizing a barge-in voice command while the audio is being provided to a speaker in the audio subsystem, wherein a barge-in voice command is any user-initiated audio noise which would typically activate the speech detector/controller, such as a spoken command.
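The barge-in suppression described above reduces to gating the speech detector while a mandatory prompt plays. A minimal sketch, assuming a detector with a simple enabled flag (the class and method names are illustrative):

```python
class SpeechDetector:
    """Speech detector/controller sketch: while disabled by a control
    command, user speech (barge-in) is ignored instead of interrupting
    the audio output."""
    def __init__(self):
        self.enabled = True

    def on_audio(self, utterance: str):
        if not self.enabled:
            return None          # barge-in suppressed during the prompt
        return utterance         # would normally be sent to recognition


detector = SpeechDetector()
detector.enabled = False         # control command: disclaimer is playing
print(detector.on_audio("stop"))  # None
detector.enabled = True          # disclaimer finished; speech accepted again
print(detector.on_audio("stop"))  # stop
```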
- the present invention provides for an improved communication system, wherein a control command enables and/or disables specific input and/or output components.
- the communication system further provides for improved security and efficiency whereby a user does not have to be concerned about manually modifying the input and/or output components.
- a content server, or other services, may be provided with an added level of security for the transference of sensitive information to or from an end user by activating specific encoding techniques and disabling various output devices, thereby preventing inadvertent disclosure to a third party.
- the dialog manager may be a graphical browser having voice browser capabilities, or the content server and control server may be disposed within the same server. It is therefore contemplated to cover by the present invention any and all modifications, variations, or equivalents that fall within the spirit and scope of the basic underlying principles disclosed and claimed herein.
Abstract
Description
Claims (25)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/145,304 US8601096B2 (en) | 2002-05-14 | 2002-05-14 | Method and system for multi-modal communication |
PCT/US2003/011823 WO2003098456A1 (en) | 2002-05-14 | 2003-04-15 | Method and system for multi-modal communication |
AU2003226415A AU2003226415A1 (en) | 2002-05-14 | 2003-04-15 | Method and system for multi-modal communication |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/145,304 US8601096B2 (en) | 2002-05-14 | 2002-05-14 | Method and system for multi-modal communication |
Publications (2)
Publication Number | Publication Date |
---|---|
US20030217161A1 US20030217161A1 (en) | 2003-11-20 |
US8601096B2 true US8601096B2 (en) | 2013-12-03 |
Family
ID=29418608
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/145,304 Expired - Fee Related US8601096B2 (en) | 2002-05-14 | 2002-05-14 | Method and system for multi-modal communication |
Country Status (3)
Country | Link |
---|---|
US (1) | US8601096B2 (en) |
AU (1) | AU2003226415A1 (en) |
WO (1) | WO2003098456A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130325480A1 (en) * | 2012-05-30 | 2013-12-05 | Au Optronics Corp. | Remote controller and control method thereof |
US10600421B2 (en) | 2014-05-23 | 2020-03-24 | Samsung Electronics Co., Ltd. | Mobile terminal and control method thereof |
Families Citing this family (50)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040061717A1 (en) * | 2002-09-30 | 2004-04-01 | Menon Rama R. | Mechanism for voice-enabling legacy internet content for use with multi-modal browsers |
US9083798B2 (en) | 2004-12-22 | 2015-07-14 | Nuance Communications, Inc. | Enabling voice selection of user preferences |
US7917365B2 (en) | 2005-06-16 | 2011-03-29 | Nuance Communications, Inc. | Synchronizing visual and speech events in a multimodal application |
US8032825B2 (en) * | 2005-06-16 | 2011-10-04 | International Business Machines Corporation | Dynamically creating multimodal markup documents |
US20060287858A1 (en) * | 2005-06-16 | 2006-12-21 | Cross Charles W Jr | Modifying a grammar of a hierarchical multimodal menu with keywords sold to customers |
US20060287865A1 (en) * | 2005-06-16 | 2006-12-21 | Cross Charles W Jr | Establishing a multimodal application voice |
US8090584B2 (en) | 2005-06-16 | 2012-01-03 | Nuance Communications, Inc. | Modifying a grammar of a hierarchical multimodal menu in dependence upon speech command frequency |
US8073700B2 (en) | 2005-09-12 | 2011-12-06 | Nuance Communications, Inc. | Retrieval and presentation of network service results for mobile device using a multimodal browser |
US7848314B2 (en) | 2006-05-10 | 2010-12-07 | Nuance Communications, Inc. | VOIP barge-in support for half-duplex DSR client on a full-duplex network |
US20070274297A1 (en) * | 2006-05-10 | 2007-11-29 | Cross Charles W Jr | Streaming audio from a full-duplex network through a half-duplex device |
US9208785B2 (en) * | 2006-05-10 | 2015-12-08 | Nuance Communications, Inc. | Synchronizing distributed speech recognition |
US7676371B2 (en) | 2006-06-13 | 2010-03-09 | Nuance Communications, Inc. | Oral modification of an ASR lexicon of an ASR engine |
US8332218B2 (en) | 2006-06-13 | 2012-12-11 | Nuance Communications, Inc. | Context-based grammars for automated speech recognition |
US8145493B2 (en) | 2006-09-11 | 2012-03-27 | Nuance Communications, Inc. | Establishing a preferred mode of interaction between a user and a multimodal application |
US8374874B2 (en) | 2006-09-11 | 2013-02-12 | Nuance Communications, Inc. | Establishing a multimodal personality for a multimodal application in dependence upon attributes of user interaction |
US8073697B2 (en) * | 2006-09-12 | 2011-12-06 | International Business Machines Corporation | Establishing a multimodal personality for a multimodal application |
US7957976B2 (en) | 2006-09-12 | 2011-06-07 | Nuance Communications, Inc. | Establishing a multimodal advertising personality for a sponsor of a multimodal application |
US8086463B2 (en) | 2006-09-12 | 2011-12-27 | Nuance Communications, Inc. | Dynamically generating a vocal help prompt in a multimodal application |
KR100832534B1 (en) * | 2006-09-28 | 2008-05-27 | 한국전자통신연구원 | Apparatus and Method for providing contents information service using voice interaction |
US7827033B2 (en) | 2006-12-06 | 2010-11-02 | Nuance Communications, Inc. | Enabling grammars in web page frames |
US7861921B1 (en) * | 2007-01-11 | 2011-01-04 | Diebold Self-Service Systems Division Of Diebold, Incorporated | Cash dispensing automated banking machine system and method |
US8069047B2 (en) | 2007-02-12 | 2011-11-29 | Nuance Communications, Inc. | Dynamically defining a VoiceXML grammar in an X+V page of a multimodal application |
US8150698B2 (en) | 2007-02-26 | 2012-04-03 | Nuance Communications, Inc. | Invoking tapered prompts in a multimodal application |
US7801728B2 (en) | 2007-02-26 | 2010-09-21 | Nuance Communications, Inc. | Document session replay for multimodal applications |
US7809575B2 (en) | 2007-02-27 | 2010-10-05 | Nuance Communications, Inc. | Enabling global grammars for a particular multimodal application |
US7822608B2 (en) | 2007-02-27 | 2010-10-26 | Nuance Communications, Inc. | Disambiguating a speech recognition grammar in a multimodal application |
US9208783B2 (en) | 2007-02-27 | 2015-12-08 | Nuance Communications, Inc. | Altering behavior of a multimodal application based on location |
US20080208586A1 (en) * | 2007-02-27 | 2008-08-28 | Soonthorn Ativanichayaphong | Enabling Natural Language Understanding In An X+V Page Of A Multimodal Application |
US7840409B2 (en) | 2007-02-27 | 2010-11-23 | Nuance Communications, Inc. | Ordering recognition results produced by an automatic speech recognition engine for a multimodal application |
US8713542B2 (en) | 2007-02-27 | 2014-04-29 | Nuance Communications, Inc. | Pausing a VoiceXML dialog of a multimodal application |
US8938392B2 (en) | 2007-02-27 | 2015-01-20 | Nuance Communications, Inc. | Configuring a speech engine for a multimodal application based on location |
US8843376B2 (en) | 2007-03-13 | 2014-09-23 | Nuance Communications, Inc. | Speech-enabled web content searching using a multimodal browser |
US7945851B2 (en) * | 2007-03-14 | 2011-05-17 | Nuance Communications, Inc. | Enabling dynamic voiceXML in an X+V page of a multimodal application |
US8670987B2 (en) | 2007-03-20 | 2014-03-11 | Nuance Communications, Inc. | Automatic speech recognition with dynamic grammar rules |
US8515757B2 (en) | 2007-03-20 | 2013-08-20 | Nuance Communications, Inc. | Indexing digitized speech with words represented in the digitized speech |
US20080235029A1 (en) * | 2007-03-23 | 2008-09-25 | Cross Charles W | Speech-Enabled Predictive Text Selection For A Multimodal Application |
US8909532B2 (en) | 2007-03-23 | 2014-12-09 | Nuance Communications, Inc. | Supporting multi-lingual user interaction with a multimodal application |
US8788620B2 (en) * | 2007-04-04 | 2014-07-22 | International Business Machines Corporation | Web service support for a multimodal client processing a multimodal application |
US8725513B2 (en) | 2007-04-12 | 2014-05-13 | Nuance Communications, Inc. | Providing expressive user interaction with a multimodal application |
US8862475B2 (en) | 2007-04-12 | 2014-10-14 | Nuance Communications, Inc. | Speech-enabled content navigation and control of a distributed multimodal browser |
US8214242B2 (en) | 2008-04-24 | 2012-07-03 | International Business Machines Corporation | Signaling correspondence between a meeting agenda and a meeting discussion |
US8229081B2 (en) | 2008-04-24 | 2012-07-24 | International Business Machines Corporation | Dynamically publishing directory information for a plurality of interactive voice response systems |
US8121837B2 (en) | 2008-04-24 | 2012-02-21 | Nuance Communications, Inc. | Adjusting a speech engine for a mobile computing device based on background noise |
US8082148B2 (en) | 2008-04-24 | 2011-12-20 | Nuance Communications, Inc. | Testing a grammar used in speech recognition for reliability in a plurality of operating environments having different background noise |
US9349367B2 (en) | 2008-04-24 | 2016-05-24 | Nuance Communications, Inc. | Records disambiguation in a multimodal application operating on a multimodal device |
US8380513B2 (en) * | 2009-05-19 | 2013-02-19 | International Business Machines Corporation | Improving speech capabilities of a multimodal application |
US8290780B2 (en) | 2009-06-24 | 2012-10-16 | International Business Machines Corporation | Dynamically extending the speech prompts of a multimodal application |
US8510117B2 (en) * | 2009-07-09 | 2013-08-13 | Nuance Communications, Inc. | Speech enabled media sharing in a multimodal application |
US8416714B2 (en) * | 2009-08-05 | 2013-04-09 | International Business Machines Corporation | Multimodal teleconferencing |
US10079015B1 (en) * | 2016-12-06 | 2018-09-18 | Amazon Technologies, Inc. | Multi-layer keyword detection |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20010049603A1 (en) * | 2000-03-10 | 2001-12-06 | Sravanapudi Ajay P. | Multimodal information services |
US20020032751A1 (en) * | 2000-05-23 | 2002-03-14 | Srinivas Bharadwaj | Remote displays in mobile communication networks |
US20020035545A1 (en) * | 2000-09-08 | 2002-03-21 | Michihiro Ota | Digital contents sales method and system |
US20020194388A1 (en) * | 2000-12-04 | 2002-12-19 | David Boloker | Systems and methods for implementing modular DOM (Document Object Model)-based multi-modal browsers |
US6523061B1 (en) | 1999-01-05 | 2003-02-18 | Sri International, Inc. | System, method, and article of manufacture for agent-based navigation in a speech-based data navigation system |
US6557032B1 (en) * | 1997-06-07 | 2003-04-29 | International Business Machines Corporation | Data processing system using active tokens and method for controlling such a system |
US6567787B1 (en) * | 1998-08-17 | 2003-05-20 | Walker Digital, Llc | Method and apparatus for determining whether a verbal message was spoken during a transaction at a point-of-sale terminal |
US6574595B1 (en) * | 2000-07-11 | 2003-06-03 | Lucent Technologies Inc. | Method and apparatus for recognition-based barge-in detection in the context of subword-based automatic speech recognition |
US20030140113A1 (en) | 2001-12-28 | 2003-07-24 | Senaka Balasuriya | Multi-modal communication using a session specific proxy server |
US6631418B1 (en) * | 2000-04-05 | 2003-10-07 | Lsi Logic Corporation | Server for operation with a low-cost multimedia terminal |
US6678715B1 (en) * | 1998-08-28 | 2004-01-13 | Kabushiki Kaisha Toshiba | Systems and apparatus for switching execution of a process in a distributed system |
US6868383B1 (en) * | 2001-07-12 | 2005-03-15 | At&T Corp. | Systems and methods for extracting meaning from multimodal inputs using finite-state devices |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6253061B1 (en) * | 1997-09-19 | 2001-06-26 | Richard J. Helferich | Systems and methods for delivering information to a transmitting and receiving device |
- 2002
  - 2002-05-14: US application US10/145,304 (published as US8601096B2); not active: Expired - Fee Related
- 2003
  - 2003-04-15: WO application PCT/US2003/011823 (published as WO2003098456A1); not active: Application Discontinuation
  - 2003-04-15: AU application AU2003226415A (published as AU2003226415A1); not active: Abandoned
Non-Patent Citations (1)
Title |
---|
Maes, Stephane H., "Multi-modal Web IBM Position," W3C/WAP Workshop, IBM Human Language Technologies, pp. 1-9. |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130325480A1 (en) * | 2012-05-30 | 2013-12-05 | Au Optronics Corp. | Remote controller and control method thereof |
US10600421B2 (en) | 2014-05-23 | 2020-03-24 | Samsung Electronics Co., Ltd. | Mobile terminal and control method thereof |
Also Published As
Publication number | Publication date |
---|---|
US20030217161A1 (en) | 2003-11-20 |
WO2003098456A1 (en) | 2003-11-27 |
AU2003226415A1 (en) | 2003-12-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8601096B2 (en) | Method and system for multi-modal communication | |
US9819744B1 (en) | Multi-modal communication | |
KR100561228B1 (en) | Method for VoiceXML to XHTML+Voice Conversion and Multimodal Service System using the same | |
US9152375B2 (en) | Speech recognition interface for voice actuation of legacy systems | |
US6898567B2 (en) | Method and apparatus for multi-level distributed speech recognition | |
US7729919B2 (en) | Combining use of a stepwise markup language and an object oriented development tool | |
US7146323B2 (en) | Method and system for gathering information by voice input | |
US6834265B2 (en) | Method and apparatus for selective speech recognition | |
US6185535B1 (en) | Voice control of a user interface to service applications | |
US20100094635A1 (en) | System for Voice-Based Interaction on Web Pages | |
US20040054539A1 (en) | Method and system for voice control of software applications | |
CN1617559B (en) | Sequential multimodal input | |
US7171361B2 (en) | Idiom handling in voice service systems | |
US20040162731A1 (en) | Speech recognition conversation selection device, speech recognition conversation system, speech recognition conversation selection method, and program | |
US20060100881A1 (en) | Multi-modal web interaction over wireless network | |
US20030195751A1 (en) | Distributed automatic speech recognition with persistent user parameters | |
US9202467B2 (en) | System and method for voice activating web pages | |
KR20010025243A (en) | Method for Voice Web Browser Service in Internet | |
JP2003271376A (en) | Information providing system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: MOTOROLA, INC., ILLINOIS. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:BALASURIYA, SENAKA;REEL/FRAME:012923/0027. Effective date: 20020507 |
| AS | Assignment | Owner name: MOTOROLA MOBILITY, INC, ILLINOIS. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MOTOROLA, INC;REEL/FRAME:025673/0558. Effective date: 20100731 |
| AS | Assignment | Owner name: MOTOROLA MOBILITY LLC, ILLINOIS. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MOTOROLA MOBILITY, INC.;REEL/FRAME:028829/0856. Effective date: 20120622 |
| STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| AS | Assignment | Owner name: GOOGLE TECHNOLOGY HOLDINGS LLC, CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MOTOROLA MOBILITY LLC;REEL/FRAME:034625/0001. Effective date: 20141028 |
| FPAY | Fee payment | Year of fee payment: 4 |
| FEPP | Fee payment procedure | Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| LAPS | Lapse for failure to pay maintenance fees | Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
| FP | Lapsed due to failure to pay maintenance fee | Effective date: 20211203 |