US20140163996A1 - Controlling a set-top box via remote speech recognition - Google Patents
- Publication number
- US20140163996A1 (U.S. application Ser. No. 14/180,897)
- Authority
- US
- United States
- Prior art keywords
- stb
- information
- remote control
- speech
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/41—Structure of client; Structure of client peripherals
- H04N21/422—Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
- H04N21/42203—Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] sound input device, e.g. microphone
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
- G10L15/193—Formal grammars, e.g. finite state automata, context free grammars or word networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L21/16—Transforming into a non-visible representation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/41—Structure of client; Structure of client peripherals
- H04N21/4104—Peripherals receiving signals from specially adapted client devices
- H04N21/4122—Peripherals receiving signals from specially adapted client devices additional display device, e.g. video projector
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/41—Structure of client; Structure of client peripherals
- H04N21/422—Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
- H04N21/42204—User interfaces specially adapted for controlling a client device through a remote control device; Remote control devices therefor
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/41—Structure of client; Structure of client peripherals
- H04N21/422—Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
- H04N21/42204—User interfaces specially adapted for controlling a client device through a remote control device; Remote control devices therefor
- H04N21/42206—User interfaces specially adapted for controlling a client device through a remote control device; Remote control devices therefor characterized by hardware details
- H04N21/42222—Additional components integrated in the remote control device, e.g. timer, speaker, sensors for detecting position, direction or movement of the remote control, microphone or battery charging device
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440236—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by media transcoding, e.g. video is transformed into a slideshow of still pictures, audio is converted into text
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/443—OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/226—Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
- G10L2015/228—Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context
Description
- The present application is a continuation of U.S. patent application Ser. No. 13/447,487, filed Apr. 16, 2012, which is a divisional of U.S. patent application Ser. No. 11/781,628, filed Jul. 23, 2007, which is now U.S. Pat. No. 8,175,885, the disclosures of which are each hereby incorporated by reference herein.
- Set-top boxes (STBs) can be controlled through a remote control. The remote control may allow a user to navigate a program guide, select channels or programs for viewing, adjust display characteristics, and/or perform other interactive functions related to viewing multimedia-type content provided over a network. Typically, a user interacts with the STB using a keypad that is part of the remote control, and signals representing key depressions are transmitted to the STB via an infrared transmission.
- FIG. 1 shows an exemplary system in which concepts described herein may be implemented;
- FIG. 2 is a block diagram of an exemplary remote control of FIG. 1;
- FIG. 3 is a block diagram of an exemplary server device of FIG. 1;
- FIG. 4 is a functional block diagram of the exemplary server device of FIG. 1;
- FIG. 5 is a flow chart of an exemplary process for controlling a set-top box via remote speech recognition;
- FIG. 6 shows another exemplary system in which the concepts described herein may be implemented; and
- FIG. 7 is a diagram showing some of the components of FIG. 1 and FIG. 4.
- The following detailed description refers to the accompanying drawings. The same reference numbers in different drawings may identify the same or similar elements.
- A preferred remote control may allow a user to issue voice commands to control a set-top box (STB). A user may speak commands for a STB into a microphone of the remote control. The remote control may send the spoken audio signal to a voice application facility, which may apply speech recognition to the audio signal to identify text information in the audio signal. The text information may be used to obtain command information that may be sent to the STB. The STB may execute the command information and cause a television connected to the STB to display certain media or other information.
- As used herein, the term “audio dialog” may refer to an exchange of audio information between two entities. The term “audio dialog document,” as used herein, may refer to a document which describes how an entity may respond with audio, text, or visual information upon receiving audio information. Depending on context, “audio dialog” may refer to an “audio dialog document.”
- The term “form,” as used herein, may refer to an audio dialog document portion that specifies what information may be presented to a client and how audio information may be collected from the client.
- The term “command information,” as used herein, may refer to information that is derived by using the result of applying speech recognition to an audio signal (e.g., voice command). For example, if the word “Rome” is identified via speech recognition and “Rome” is used as a key to search a television program database, the search result may be considered “command information.”
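To make the definition concrete, the “Rome” example can be sketched in a few lines: the speech-recognition result is used as a search key, and whatever the search returns is the command information. The function name and the shape of the program database below are illustrative assumptions, not part of the patent.

```python
def command_information(recognized_text, program_db):
    """Use the speech-recognition result as a search key against a
    program database; the search result is the "command information"."""
    return [entry for entry in program_db
            if recognized_text.lower() in entry["title"].lower()]
```

For example, searching a guide for the recognized word “Rome” would return only the program entries whose titles contain that word.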
- As used herein, “set top box” or “STB” may refer to any media processing system that may receive multimedia content over a network and provide such multimedia content to an attached television.
- As used herein, “television” may refer to any device that can receive and display multimedia content for perception by users, and includes technologies such as CRT displays, LCDs, LED displays, plasma displays and any attendant audio generation facilities.
- As used herein, “television programs” may refer to any multimedia content that may be provided to an STB.
- FIG. 1 shows an exemplary environment in which concepts described herein may be implemented. As shown, system 100 may include a remote control 102, a STB 104, a television 106, a customer router 108, a network 110, a server device 112, a voice application facility 114, and content provider device 116. In other implementations, system 100 may include more, fewer, or different components. For example, system 100 may include many set-top boxes, remote controls, televisions, customer routers, and/or voice application facilities and exclude server device 112. Moreover, one or more components of system 100 may perform one or more functions of another component of system 100. For example, STB 104 may incorporate the functionalities of customer router 108.
- Remote control 102 may include a device for issuing wireless commands to and for controlling electronic devices (e.g., a television, a set-top box, a stereo system, a digital video disc (DVD) player, etc.). Some commands may take the form of digitized speech that is sent to server device 112 through STB 104 or customer router 108.
- STB 104 may include a device for receiving commands from remote control 102 and for selecting and/or obtaining content that may be shown or played on television 106 in accordance with the commands. The content may be obtained from content provider device 116 via network 110 and/or customer router 108. In some implementations, STB 104 may receive digitized speech from remote control 102 over a wireless communication channel (e.g., a Wireless Fidelity (Wi-Fi) channel) and relay the digitized speech to server device 112 over network 110. Television 106 may include a device for playing broadcast television signals and/or signals from STB 104.
- Customer router 108 may include a device for buffering and forwarding data packets toward destinations. In FIG. 1, customer router 108 may receive data packets from remote control 102 over a wireless communication channel (e.g., a Wi-Fi network) and/or from STB 104 and route them to server device 112 or voice application facility 114, one or more devices in content provider device 116, and/or other destinations in network 110 (e.g., a web server). In addition, customer router 108 may route data packets that are received from server device 112 or voice application facility 114, content provider device 116, and/or other devices in network 110 to STB 104. In some implementations, customer router 108 may be replaced with a different device, such as a Network Interface Module (NIM), a Broadband Home Router (BHR), an Optical Network Terminal (ONT), etc.
- Network 110 may include one or more nodes interconnected by communication paths or links. For example, network 110 may include any network characterized by the type of link (e.g., a wireless link), by access (e.g., a private network, a public network, etc.), by spatial distance (e.g., a wide-area network (WAN)), by protocol (e.g., a Transmission Control Protocol (TCP)/Internet Protocol (IP) network), by connection (e.g., a switched network), etc. If server device 112, voice application facility 114, and/or content provider device 116 are part of a corporate network or an intranet, network 110 may include portions of the intranet (e.g., a demilitarized zone (DMZ)).
- Server device 112 may include one or more computer systems for hosting server programs, applications, and/or data related to speech recognition. In one implementation, server device 112 may supply audio dialog documents that specify speech grammar (e.g., information that identifies different words or phrases that a user might say) to voice application facility 114.
- Voice application facility 114 may include one or more computer systems for applying speech recognition, using the result of the speech recognition to obtain command information, and sending the command information to STB 104. In performing the speech recognition, voice application facility 114 may obtain a speech grammar from server device 112, apply the speech grammar to an audio signal received from remote control 102 to identify text information, use the text information to obtain command information, and send the command information to STB 104 via content provider device 116. In some embodiments, voice application facility 114 and server device 112 may be combined in a single device.
- Content provider device 116 may include one or more devices for providing content/information to STB 104 and/or television 106 in accordance with commands that are issued from STB 104. Examples of content provider device 116 may include a headend device that provides broadcast television programs, a video-on-demand device that provides television programs upon request, and a program guide information server that provides information related to television programs available to STB 104.
- FIG. 2 is a block diagram of an exemplary remote control 102. As shown, remote control 102 may include a processing unit 202, memory 204, communication interface 206, a microphone 208, other input/output (I/O) devices 210, and/or a bus 212. Depending on implementation, remote control 102 may include additional, fewer, or different components than the ones illustrated in FIG. 2.
- Processing unit 202 may include one or more processors, microprocessors, and/or processing logic capable of controlling remote control 102. In some implementations, processing unit 202 may include a unit for applying digital signal processing (DSP) to speech signals that are received from microphone 208. In other implementations, processing unit 202 may apply DSP to speech signals via execution of a DSP software application. Memory 204 may include static memory, such as read only memory (ROM), and/or dynamic memory, such as random access memory (RAM), or onboard cache, for storing data and machine-readable instructions. In some implementations, memory 204 may also include storage devices, such as a floppy disk, CD ROM, CD read/write (R/W) disk, and/or flash memory, as well as other types of storage devices.
- Communication interface 206 may include any transceiver-like mechanism that enables remote control 102 to communicate with other devices and/or systems. For example, communication interface 206 may include mechanisms for communicating with STB 104 and/or television via a wireless signal (e.g., an infrared signal). In another example, communication interface 206 may include mechanisms for communicating with devices in a network (e.g., a Wi-Fi network, a Bluetooth-based network, etc.).
- Microphone 208 may receive audible information from a user and relay the audible information in the form of an audio signal to other components of remote control 102. The other components may process the audio signal (e.g., digitize the signal, filter the signal, etc.). Other input/output devices 210 may include a keypad, a speaker, and/or other types of devices for converting physical events or phenomena to and/or from digital signals that pertain to remote control 102. Bus 212 may provide an interface through which components of remote control 102 can communicate with one another.
- FIG. 3 is a block diagram of a device 300. Device 300 may represent STB 104, server device 112, voice application facility 114, and/or content provider device 116. As shown, device 300 may include a processing unit 302, memory 304, network interface 306, input/output devices 308, and bus 310. Depending on implementation, device 300 may include additional, fewer, or different components than the ones illustrated in FIG. 3. For example, if device 300 is implemented as STB 104, device 300 may include a digital signal processor for Motion Picture Experts Group (MPEG) decoding and voice processing. In another example, if device 300 is implemented as server device 112, device 300 may include specialized hardware for speech processing. In yet another example, if device 300 is implemented as content provider device 116, device 300 may include storage devices that can quickly process large quantities of data.
- Processing unit 302 may include one or more processors, microprocessors, and/or processing logic capable of controlling device 300. In some implementations, processing unit 302 may include a specialized processor for applying digital speech recognition. In other implementations, processing unit 302 may synthesize speech signals based on XML data. In still other implementations, processing unit 302 may include an MPEG encoder/decoder. Memory 304 may include static memory, such as read only memory (ROM), and/or dynamic memory, such as random access memory (RAM), or onboard cache, for storing data and machine-readable instructions. In some implementations, memory 304 may also include storage devices, such as a floppy disk, CD ROM, CD read/write (R/W) disk, and/or flash memory, as well as other types of storage devices.
- Network interface 306 may include any transceiver-like mechanism that enables device 300 to communicate with other devices and/or systems. For example, network interface 306 may include mechanisms for communicating with devices in a network (e.g., an optical network, a hybrid fiber-coaxial network, a terrestrial wireless network, a satellite-based network, a wireless local area network (WLAN), a Bluetooth-based network, a metropolitan area network, a local area network (LAN), etc.). In another example, network interface 306 may include radio frequency modulators/demodulators for receiving television signals.
- Input/output devices 308 may include a keyboard, a speaker, a microphone, and/or other types of devices for converting physical events or phenomena to and/or from digital signals that pertain to device 300. For example, if device 300 is STB 104, input/output device 308 may include a video interface for selecting a video source to be decoded or encoded, an audio interface for digitizing audio information, or a user interface. Bus 310 may provide an interface through which components of device 300 can communicate with one another.
- FIG. 4 is a functional block diagram of device 300. As shown, device 300 may include a web server 402, a voice browser 404, a database 406, an XML generator 408, and other applications 410. Depending on implementation, device 300 may include fewer, additional, or different types of components than those illustrated in FIG. 4. For example, if device 300 represents STB 104, device 300 may possibly exclude web server 402. In another example, if device 300 represents a voice application facility 114, device 300 may possibly exclude web server 402. In another example, if device 300 represents content provider device 116, device 300 may possibly include logic for communicating with STB 104.
- Web server 402 may include hardware and/or software for receiving information from client applications such as a browser and for sending web resources to client applications. For example, web server 402 may send XML data to a voice browser that is hosted on voice application facility 114. In exchanging information with client devices and/or applications, web server 402 may operate in conjunction with other components, such as database 406 and/or other applications 410.
- Voice browser 404 may include hardware and/or software for performing speech recognition on audio signals and/or speech synthesis. Voice browser 404 may use audio dialog documents that are generated from a set of words or phrases. In one implementation, voice browser 404 may apply speech recognition based on audio dialog documents that are generated from program guide information and may produce a voice response via speech synthesis in accordance with the audio dialog documents. The audio dialog documents may include a speech grammar (e.g., information that identifies different words or phrases that a user might say and specifies how to interpret a valid expression) and information that specifies how to produce speech.
- In some implementations, voice browser 404 may be replaced with a speech recognition system. Examples of a speech recognition system include a dynamic time warping (DTW)-based speech recognition system, a neural network-based speech recognition system, a Hidden Markov model (HMM)-based speech recognition system, etc.
- Database 406 may act as an information repository for other components of device 300. For example, web server 402 may retrieve and/or store web pages and information to/from database 406. In one implementation, database 406 may include program guide information and/or audio dialog documents. XML generator 408 may include hardware and/or software for generating the audio dialog documents in XML, based on the program guide information that is downloaded from content provider device 116 and stored in database 406. The audio dialog documents that are generated by XML generator 408 may be stored in database 406.
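A minimal sketch of such a generator, using Python's standard XML library: each program-guide title becomes an item in a grammar document. The element names loosely follow the W3C speech-grammar (SRGS) style; the patent does not specify an exact schema, so treat this structure as an assumption.

```python
import xml.etree.ElementTree as ET

def build_grammar_xml(program_titles):
    """Emit a grammar document with one <item> per program-guide title,
    so the recognizer only has to distinguish titles that actually exist."""
    grammar = ET.Element("grammar", root="program")
    rule = ET.SubElement(grammar, "rule", id="program")
    one_of = ET.SubElement(rule, "one-of")
    for title in program_titles:
        ET.SubElement(one_of, "item").text = title
    return ET.tostring(grammar, encoding="unicode")
```

Regenerating this document whenever new program guide information is downloaded keeps the recognizable vocabulary in step with what is actually available to the STB.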
- Other applications 410 may include hardware and/or software for supporting various functionalities of device 300, such as browser functions, MPEG encoding/decoding, a menu system for controlling a STB, application server functions, STB notification, text messaging, email, multimedia messaging, wireless communications, web access, file uploading and downloading, image transfer, etc.
- The above paragraphs describe system elements that are related to devices and/or components for controlling a set-top box via remote speech recognition. FIG. 5 depicts an exemplary process 500 that is capable of being performed on one or more of these devices and/or components.
- As shown in FIG. 5, process 500 may start at block 502, where speech may be received at remote control 102. The speech may include a request to play a television program, a video (e.g., a movie), or to perform a particular action on television 106. For example, a user may press a button on remote control 102 and utter “Go to channel 100.” In another example, a user may request, “Increase volume.”
- The received speech may be processed at remote control 102 (block 504). If the speech is received through microphone 208, the speech may be digitized, processed (e.g., digitally filtered), and incorporated into network data (e.g., a packet). The processed speech may be sent to voice application facility 114 through STB 104 or customer router 108, using, for example, the Session Initiation Protocol (SIP).
- The processed speech may be received at voice application facility 114 (block 506). In some implementations, voice application facility 114 may host voice browser 404 that supports SIP and/or voiceXML. In such cases, voice application facility 114 may accept messages from remote control 102 via voice browser 404 over SIP. For example, remote control 102 may use a network address and a port number of voice browser 404, which may be stored in memory 204 of remote control 102, to establish a session with voice browser 404. During the session, the processed speech and any additional information related to STB 104 may be transferred from remote control 102 to voice application facility 114. The session may be terminated after the processed speech and the information are received at voice application facility 114.
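Blocks 504 and 506 can be sketched end to end: digitize the microphone signal, split it into packets, and build a SIP INVITE toward the voice browser. Everything concrete here — the 8-bit quantization, the 2-byte sequence header, the host names, and the header values — is an illustrative assumption; only the INVITE start-line and header names follow the actual SIP message format.

```python
import struct

def digitize(samples, levels=256):
    """Quantize samples in [-1.0, 1.0] to unsigned 8-bit PCM,
    a simplification of the DSP the remote control might apply."""
    out = []
    for s in samples:
        s = max(-1.0, min(1.0, s))
        out.append(int((s + 1.0) / 2.0 * (levels - 1)))
    return bytes(out)

def packetize(pcm, payload_size=4):
    """Split digitized speech into fixed-size payloads, each prefixed
    with a 2-byte big-endian sequence number (a stand-in for real
    network framing)."""
    packets = []
    for seq, off in enumerate(range(0, len(pcm), payload_size)):
        packets.append(struct.pack(">H", seq) + pcm[off:off + payload_size])
    return packets

def build_invite(remote_id, facility_host, facility_port, call_id):
    """Assemble a minimal SIP INVITE start-line and headers, addressed
    to the voice browser's stored network address and port."""
    uri = f"sip:voicebrowser@{facility_host}:{facility_port}"
    return "\r\n".join([
        f"INVITE {uri} SIP/2.0",
        f"Via: SIP/2.0/UDP {remote_id}",
        f"From: <sip:{remote_id}>;tag=rc1",
        f"To: <{uri}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        "Content-Type: application/sdp",
        "", "",
    ])
```

A production remote would negotiate the media stream with SDP and carry the audio over RTP rather than in hand-rolled packets; the point of the sketch is only the order of operations.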
- At block 508, speech recognition may be performed to obtain text information. Performing the speech recognition may involve identifying the text information (e.g., words or phrases) in the processed speech using voice browser 404, which may apply a speech grammar specified in the audio dialog documents to the processed speech.
- The audio dialog documents may be kept up-to-date by server device 112. For example, server device 112 may periodically download information (e.g., program guide information) from content provider device 116, generate audio dialog documents that specify speech grammar, forms, or menus based on the downloaded information via XML generator 408, and store the audio dialog documents in database 406. In many implementations, the audio dialog documents may be stored in XML (e.g., voiceXML).
- The text information may be used to obtain a command intended by the user (block 510). For example, in one implementation, if the text information includes the name of a television show, voice browser 404 may interpret the text information as a command to retrieve the show's viewing date/time. In such instances, the text information may be used as a key to retrieve additional information that includes the show's schedule from database 406. In another example, if the text information includes a word related to changing a viewing channel, voice browser 404 may interpret the text information as a command that STB 104 may follow to change the viewing channel on television 106. In different implementations, a component other than voice browser 404 may be used to obtain the command.
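One way to realize block 510 is a small dispatcher: channel and volume phrases map directly to STB commands, while a recognized show title becomes a schedule lookup. The phrase patterns and command-dictionary keys below are invented for illustration; the patent does not define a concrete command format.

```python
import re

def interpret(text, guide):
    """Map recognized text to command information: direct STB commands
    first, then show titles as schedule-lookup keys."""
    m = re.match(r"go to channel (\d+)", text, re.IGNORECASE)
    if m:
        return {"command": "tune", "channel": int(m.group(1))}
    if re.match(r"increase volume", text, re.IGNORECASE):
        return {"command": "volume", "delta": +1}
    # Otherwise, treat any title appearing in the text as a schedule query.
    for title, schedule in guide.items():
        if title.lower() in text.lower():
            return {"command": "schedule", "title": title, "times": schedule}
    return {"command": "unknown", "text": text}
```

For instance, “Go to channel 100” becomes a tune command, while “Find all Sopranos shows” falls through to the title lookup and returns the show's schedule as additional information.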
- The obtained command may be sent to content provider device 116, along with any additional information that is retrieved based on the command (e.g., a show's viewing schedule) (block 512). As used herein, the term “command information” may encompass both the command and the additional information.
- The command information may be sent to STB 104 from content provider device 116 (block 514). In sending the command information from voice application facility 114 to content provider device 116 and from content provider device 116 to STB 104, any suitable communication protocol may be used (e.g., hypertext transfer protocol (HTTP), file transfer protocol (FTP), etc.). Command information may be formatted such that STB 104 will recognize the communication as command information (e.g., using predetermined coding and/or communication channels/ports/formats).
- Actions in accordance with the command information may be performed at STB 104 (block 516). For example, if the command information indicates a selection of a television program, STB 104 may momentarily display on television 106 a message that indicates a television program is being selected and STB 104 may cause television 106 to show the selected program. In another example, if the command information indicates that a volume of television 106 is to be changed, STB 104 may momentarily display the magnitude of the increased volume on television 106.
- Many changes to the components and the process for controlling a set-top box via remote speech recognition as described above may be implemented. For example, in different implementations,
server device 112 andvoice application facility 114 may be replaced by a single device. In such implementation, all components that are shown inFIG. 4 may be included in the single device. -
FIG. 6 shows a diagram of another implementation of a speech recognition system. In the system shown inFIG. 6 ,database 406 andXML generator 408 may be hosted bySTB 104. In the implementation,STB 104 may perform speech recognition and apply a speech grammar to a digitized audio signal that is received fromremote control 102. The speech grammar may be produced byXML generator 408, based on program guide information that is periodically downloaded fromcontent provider device 116. - In the system of
FIG. 6 ,STB 104 may obtain text information from the digitized voice signal, obtain a command from the text information, and perform an action in accordance with the command. In such implementations,server device 112 may be excluded. - The following example illustrates processes that may be involved in controlling a set-top box via remote speech recognition, with reference to
FIG. 7 . The example is consistent with the exemplary process described above with reference toFIG. 5 . - As illustrated in
FIG. 7 ,system 100 includes components that have been described with respect toFIG. 1 andFIG. 4 . Assume for the sake of the example thatserver device 112 includesweb server 402,database 406 andXML generator 408, thatvoice application facility 114 includesvoice browser 404, and thatremote control 102 communicates withcustomer router 108 via a Wi-Fi network. In addition, assume thatvoice application facility 114 has retrieved the program guide information for the next seven days fromcontent provider device 116. - In the example, assume that a user speaks a command to
remote control 102, intending to have a search performed to identify all instances of television shows which have the word “Sopranos” in the title. The user might indicate this desire by depressing a key on the remote control associated with voice commands and speaking a phrase such as “Find all Sopranos shows.” The remote control may also have a key dedicated to search functionality, in which case the user might depress such a key and simply say “Sopranos.” In any case, remote control 102 digitizes the received voice audio to produce a digital audio signal. Remote control 102 establishes a SIP session with voice application facility 114 and transmits the digital audio signal over the Wi-Fi network via customer router 108. Voice browser 404, which is hosted on voice application facility 114, performs speech recognition on the received audio signal by using a speech grammar that is part of the audio dialog documents stored in database 406. To perform the speech processing, voice browser 404 may request that web server 402 provide the audio dialog documents. The speech grammar may be based on the program guide information provided by content provider device 116, which may allow for better speech recognition results because the grammars are limited to text that appears in the program guide information. In the current example, the voice browser may find that the received speech most closely matches the grammar associated with the text “Sopranos.” - Upon identifying the word “Sopranos” as the most likely matching text,
voice application facility 114 may send this resulting text to content provider device 116 over HTTP. In turn, content provider device 116 may send the resulting text to STB 104 over HTTP, and may further include an indication that the STB is to perform a search of the program guide information to identify a matching set of program guide entries. Upon receiving this command information, STB 104 may interpret the command information to determine that a search has been requested using the resulting text, and cause the search to be performed. The STB may further provide a display indicating that a search has been requested, the resulting text that is being searched, and possibly an indication that the command was received through the voice interface. The search results may then be displayed by STB 104 on television 106 according to the manner in which STB 104 displays such search results. - In some embodiments, the program guide search described above may instead be performed by
voice application facility 114 and provided to content provider device 116, or be performed by content provider device 116 in response to receiving the resulting text. In some cases this may be preferable, as it reduces the processing obligations of STB 104. In such cases, the results of the program guide search would be communicated to STB 104, for example, as indications of program guide entries that are within the results (possibly with an indication of display order), which would then cause a display of the search results on television 106 similar to that described above. - The above example illustrates how a set-top box may be controlled via remote speech recognition. A remote control that can receive and forward speech to speech processing facilities may facilitate issuing voice commands to control an STB. In the above implementation, to control the STB, the user may speak commands for the STB into a microphone of the remote control. The remote control may convert the commands into audio signals and may send the audio signals to a voice application facility. Upon receiving the audio signals, the voice application facility may apply speech recognition. By using the result of the speech recognition, the voice application facility may obtain command information for the STB. The command information may be routed to the STB, which may display and/or execute the command information.
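Whichever device performs it, the program guide search in this example amounts to matching the resulting text against guide entries and delivering an ordered result set to the STB. The following is a sketch under assumed field names and an assumed JSON wire format; the patent does not specify either:

```python
import json

def search_guide(entries, query):
    """Return guide entries whose title contains the query text,
    ordered by start time. Sketch only; fields are assumptions."""
    q = query.lower()
    hits = [e for e in entries if q in e["title"].lower()]
    return sorted(hits, key=lambda e: e["start"])

def to_stb_message(hits):
    """Serialize results as the kind of ordered indication the STB
    could render. JSON is an assumed format, not the patent's."""
    return json.dumps({
        "action": "display_search_results",
        "results": [
            {"order": i, "id": e["id"], "title": e["title"]}
            for i, e in enumerate(hits)
        ],
    })

guide = [
    {"id": 1, "title": "The Sopranos", "start": "20:00"},
    {"id": 2, "title": "Evening News", "start": "18:00"},
    {"id": 3, "title": "Sopranos Marathon", "start": "19:00"},
]
print(to_stb_message(search_guide(guide, "Sopranos")))
```

Performing this search upstream and sending only the ordered entry indications is what reduces the STB's processing obligations, as noted above; the STB then only has to render the list.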
- The foregoing description of implementations provides an illustration, but is not intended to be exhaustive or to limit the implementations to the precise form disclosed. Modifications and variations are possible in light of the above teachings or may be acquired from practice of the teachings.
- For example, while a series of blocks have been described with regard to the process illustrated in
FIG. 5, the order of the blocks may be modified in other implementations. In addition, non-dependent blocks may represent blocks that can be performed in parallel. Further, certain blocks may be omitted. For example, in the implementation in which STB 104 performs speech recognition, blocks 512 and 514 may be omitted. -
- Further, certain portions of the implementations have been described as “logic” that performs one or more functions. This logic may include hardware, such as a processor, an application specific integrated circuit, or a field programmable gate array, software, or a combination of hardware and software.
- No element, block, or instruction used in the present application should be construed as critical or essential to the implementations described herein unless explicitly described as such. Also, as used herein, the article “a” is intended to include one or more items. Where only one item is intended, the term “one” or similar language is used. Further, the phrase “based on” is intended to mean “based, at least in part, on” unless explicitly stated otherwise.
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/180,897 US20140163996A1 (en) | 2007-07-23 | 2014-02-14 | Controlling a set-top box via remote speech recognition |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/781,628 US8175885B2 (en) | 2007-07-23 | 2007-07-23 | Controlling a set-top box via remote speech recognition |
US13/447,487 US8655666B2 (en) | 2007-07-23 | 2012-04-16 | Controlling a set-top box for program guide information using remote speech recognition grammars via session initiation protocol (SIP) over a Wi-Fi channel |
US14/180,897 US20140163996A1 (en) | 2007-07-23 | 2014-02-14 | Controlling a set-top box via remote speech recognition |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/447,487 Continuation US8655666B2 (en) | 2007-07-23 | 2012-04-16 | Controlling a set-top box for program guide information using remote speech recognition grammars via session initiation protocol (SIP) over a Wi-Fi channel |
Publications (1)
Publication Number | Publication Date |
---|---|
US20140163996A1 true US20140163996A1 (en) | 2014-06-12 |
Family
ID=40296140
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/781,628 Active 2031-03-07 US8175885B2 (en) | 2007-07-23 | 2007-07-23 | Controlling a set-top box via remote speech recognition |
US13/447,487 Active US8655666B2 (en) | 2007-07-23 | 2012-04-16 | Controlling a set-top box for program guide information using remote speech recognition grammars via session initiation protocol (SIP) over a Wi-Fi channel |
US14/180,897 Abandoned US20140163996A1 (en) | 2007-07-23 | 2014-02-14 | Controlling a set-top box via remote speech recognition |
Family Applications Before (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/781,628 Active 2031-03-07 US8175885B2 (en) | 2007-07-23 | 2007-07-23 | Controlling a set-top box via remote speech recognition |
US13/447,487 Active US8655666B2 (en) | 2007-07-23 | 2012-04-16 | Controlling a set-top box for program guide information using remote speech recognition grammars via session initiation protocol (SIP) over a Wi-Fi channel |
Country Status (1)
Country | Link |
---|---|
US (3) | US8175885B2 (en) |
Families Citing this family (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8175885B2 (en) * | 2007-07-23 | 2012-05-08 | Verizon Patent And Licensing Inc. | Controlling a set-top box via remote speech recognition |
US9135809B2 (en) | 2008-06-20 | 2015-09-15 | At&T Intellectual Property I, Lp | Voice enabled remote control for a set-top box |
JP2011033680A (en) * | 2009-07-30 | 2011-02-17 | Sony Corp | Voice processing device and method, and program |
CN102056021A (en) * | 2009-11-04 | 2011-05-11 | 李峰 | Chinese and English command-based man-machine interactive system and method |
CN102314218A (en) * | 2010-07-01 | 2012-01-11 | 李峰 | Man-machine interaction method on intelligentized mobile phone and system |
US8453176B2 (en) * | 2010-08-20 | 2013-05-28 | Avaya Inc. | OCAP/STB ACAP/satellite-receiver audience response/consumer application |
US8914287B2 (en) * | 2010-12-31 | 2014-12-16 | Echostar Technologies L.L.C. | Remote control audio link |
JP5694102B2 (en) * | 2011-09-22 | 2015-04-01 | 株式会社東芝 | Speech recognition apparatus, speech recognition method and program |
US8515766B1 (en) * | 2011-09-30 | 2013-08-20 | Google Inc. | Voice application finding and user invoking applications related to a single entity |
KR101467519B1 (en) * | 2011-11-21 | 2014-12-02 | 주식회사 케이티 | Server and method for searching contents using voice information |
US8793136B2 (en) | 2012-02-17 | 2014-07-29 | Lg Electronics Inc. | Method and apparatus for smart voice recognition |
KR102056461B1 (en) * | 2012-06-15 | 2019-12-16 | 삼성전자주식회사 | Display apparatus and method for controlling the display apparatus |
KR101605862B1 (en) * | 2012-06-29 | 2016-03-24 | 삼성전자주식회사 | Display apparatus, electronic device, interactive system and controlling method thereof |
KR20140004515A (en) | 2012-07-03 | 2014-01-13 | 삼성전자주식회사 | Display apparatus, interactive server and method for providing response information |
US9288421B2 (en) | 2012-07-12 | 2016-03-15 | Samsung Electronics Co., Ltd. | Method for controlling external input and broadcast receiving apparatus |
FR2996399B3 (en) | 2012-09-28 | 2015-05-15 | Samsung Electronics Co Ltd | IMAGE PROCESSING APPARATUS AND CONTROL METHOD THEREFOR, AND IMAGE PROCESSING SYSTEM |
KR20140055502A (en) * | 2012-10-31 | 2014-05-09 | 삼성전자주식회사 | Broadcast receiving apparatus, server and control method thereof |
JP2014109889A (en) * | 2012-11-30 | 2014-06-12 | Toshiba Corp | Content retrieval device, content retrieval method and control program |
JP2014126600A (en) * | 2012-12-25 | 2014-07-07 | Panasonic Corp | Voice recognition device, voice recognition method and television |
KR102009316B1 (en) * | 2013-01-07 | 2019-08-09 | 삼성전자주식회사 | Interactive server, display apparatus and controlling method thereof |
US10585568B1 (en) | 2013-02-22 | 2020-03-10 | The Directv Group, Inc. | Method and system of bookmarking content in a mobile device |
US8970792B2 (en) * | 2013-07-16 | 2015-03-03 | Browan Communications Inc. | Remote controller and remote controller set applied to display device |
KR102210933B1 (en) * | 2014-01-02 | 2021-02-02 | 삼성전자주식회사 | Display device, server device, voice input system comprising them and methods thereof |
US10089985B2 (en) | 2014-05-01 | 2018-10-02 | At&T Intellectual Property I, L.P. | Smart interactive media content guide |
KR102277259B1 (en) * | 2014-11-26 | 2021-07-14 | 엘지전자 주식회사 | Device control system, digital device and method of controlling the same |
JP6627775B2 (en) * | 2014-12-02 | 2020-01-08 | ソニー株式会社 | Information processing apparatus, information processing method and program |
CN105959761A (en) * | 2016-04-28 | 2016-09-21 | 京东方科技集团股份有限公司 | Display for supporting speech control OSD menu |
KR102614697B1 (en) * | 2016-12-08 | 2023-12-18 | 삼성전자주식회사 | Display apparatus and method for acquiring channel information of a display apparatus |
US10762903B1 (en) * | 2017-11-07 | 2020-09-01 | Amazon Technologies, Inc. | Conversational recovery for voice user interface |
CN108573703B (en) * | 2018-03-07 | 2020-08-18 | 珠海格力电器股份有限公司 | Control method of electric appliance system |
US10430125B1 (en) | 2018-06-04 | 2019-10-01 | gabi Solutions, Inc. | System, network architecture and method for accessing and controlling an electronic device |
US11211063B2 (en) * | 2018-11-27 | 2021-12-28 | Lg Electronics Inc. | Multimedia device for processing voice command |
CN109600646B (en) * | 2018-12-11 | 2021-03-23 | 未来电视有限公司 | Voice positioning method and device, smart television and storage medium |
CN111968636B (en) * | 2020-08-10 | 2021-11-12 | 湖北亿咖通科技有限公司 | Method for processing voice request text and computer storage medium |
CN111918110A (en) * | 2020-08-31 | 2020-11-10 | 中移(杭州)信息技术有限公司 | Set top box control method, server, system, electronic device and storage medium |
CN116055817B (en) * | 2023-03-23 | 2023-06-20 | 无锡威达智能电子股份有限公司 | Voice remote control method, system and storage medium |
Citations (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6192340B1 (en) * | 1999-10-19 | 2001-02-20 | Max Abecassis | Integration of music from a personal library with real-time information |
US6324512B1 (en) * | 1999-08-26 | 2001-11-27 | Matsushita Electric Industrial Co., Ltd. | System and method for allowing family members to access TV contents and program media recorder over telephone or internet |
US6330537B1 (en) * | 1999-08-26 | 2001-12-11 | Matsushita Electric Industrial Co., Ltd. | Automatic filtering of TV contents using speech recognition and natural language |
US6442523B1 (en) * | 1994-07-22 | 2002-08-27 | Steven H. Siegel | Method for the auditory navigation of text |
US6615177B1 (en) * | 1999-04-13 | 2003-09-02 | Sony International (Europe) Gmbh | Merging of speech interfaces from concurrent use of devices and applications |
US20030167174A1 (en) * | 2002-03-01 | 2003-09-04 | Koninlijke Philips Electronics N.V. | Automatic audio recorder-player and operating method therefor |
US20040226042A1 (en) * | 1998-05-19 | 2004-11-11 | United Video Properties, Inc. | Program guide system with video-on-demand browsing |
US20050132420A1 (en) * | 2003-12-11 | 2005-06-16 | Quadrock Communications, Inc | System and method for interaction with television content |
US20060039367A1 (en) * | 2004-08-18 | 2006-02-23 | Bellsouth Intellectual Property Corporation | SIP-based session control |
US20070139513A1 (en) * | 2005-12-16 | 2007-06-21 | Zheng Fang | Video telephone soft client with a mobile phone interface |
US20070140150A1 (en) * | 2005-12-15 | 2007-06-21 | Andre Beck | Method and network for providing service blending to a subscriber |
US20080139222A1 (en) * | 2006-12-08 | 2008-06-12 | General Instrument Corporation | Presence Detection and Location Update in Premise Gateways |
US20080209497A1 (en) * | 2007-02-27 | 2008-08-28 | At&T Knowledge Ventures, L.P. | Method for reestablishing presentation of a paused media program |
US20080231684A1 (en) * | 2007-03-23 | 2008-09-25 | Verizon Services Corp. | Video streaming system |
US20080313310A1 (en) * | 2007-06-15 | 2008-12-18 | Sony Ericsson Mobile Communications Ab | Method for Distributing Programs over a Communication Network |
US7779028B1 (en) * | 2006-05-02 | 2010-08-17 | Amdocs Software Systems Limited | System, method and computer program product for communicating information among devices |
US8655666B2 (en) * | 2007-07-23 | 2014-02-18 | Verizon Patent And Licensing Inc. | Controlling a set-top box for program guide information using remote speech recognition grammars via session initiation protocol (SIP) over a Wi-Fi channel |
Family Cites Families (44)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5774859A (en) * | 1995-01-03 | 1998-06-30 | Scientific-Atlanta, Inc. | Information system having a speech interface |
US5721827A (en) * | 1996-10-02 | 1998-02-24 | James Logan | System for electrically distributing personalized information |
US6523061B1 (en) * | 1999-01-05 | 2003-02-18 | Sri International, Inc. | System, method, and article of manufacture for agent-based navigation in a speech-based data navigation system |
US6314398B1 (en) * | 1999-03-01 | 2001-11-06 | Matsushita Electric Industrial Co., Ltd. | Apparatus and method using speech understanding for automatic channel selection in interactive television |
US6643620B1 (en) * | 1999-03-15 | 2003-11-04 | Matsushita Electric Industrial Co., Ltd. | Voice activated controller for recording and retrieving audio/video programs |
US6408272B1 (en) * | 1999-04-12 | 2002-06-18 | General Magic, Inc. | Distributed voice user interface |
US6901366B1 (en) * | 1999-08-26 | 2005-05-31 | Matsushita Electric Industrial Co., Ltd. | System and method for assessing TV-related information over the internet |
US7024461B1 (en) * | 2000-04-28 | 2006-04-04 | Nortel Networks Limited | Session initiation protocol enabled set-top device |
US20020019732A1 (en) * | 2000-07-12 | 2002-02-14 | Dan Kikinis | Interactivity using voice commands |
US8467502B2 (en) * | 2001-02-27 | 2013-06-18 | Verizon Data Services Llc | Interactive assistant for managing telephone communications |
US7836147B2 (en) * | 2001-02-27 | 2010-11-16 | Verizon Data Services Llc | Method and apparatus for address book contact sharing |
US20020194327A1 (en) * | 2001-06-14 | 2002-12-19 | International Business Machines Corporation | Method for sensing the status of a client from a server |
US6801604B2 (en) * | 2001-06-25 | 2004-10-05 | International Business Machines Corporation | Universal IP-based and scalable architectures across conversational applications using web services for speech and audio processing resources |
US20030061039A1 (en) * | 2001-09-24 | 2003-03-27 | Alexander Levin | Interactive voice-operated system for providing program-related sevices |
US7324947B2 (en) * | 2001-10-03 | 2008-01-29 | Promptu Systems Corporation | Global speech user interface |
US20050043948A1 (en) * | 2001-12-17 | 2005-02-24 | Seiichi Kashihara | Speech recognition method remote controller, information terminal, telephone communication terminal and speech recognizer |
US7260538B2 (en) * | 2002-01-08 | 2007-08-21 | Promptu Systems Corporation | Method and apparatus for voice control of a television control device |
US7143023B2 (en) * | 2002-03-01 | 2006-11-28 | Signal Integrity Software, Inc. | System and method of describing signal transfers and using same to automate the simulation and analysis of a circuit or system design |
US20040064839A1 (en) * | 2002-09-30 | 2004-04-01 | Watkins Daniel R. | System and method for using speech recognition control unit |
US7519534B2 (en) * | 2002-10-31 | 2009-04-14 | Agiletv Corporation | Speech controlled access to content on a presentation medium |
DE10340580A1 (en) | 2003-09-01 | 2005-03-24 | Klimek, Winfried M. | Set-top box, video recorder or television with speech recognition capability, whereby spoken commands or input are recorded locally and transmitted over a network to a central speech recognition processing unit |
WO2005024780A2 (en) * | 2003-09-05 | 2005-03-17 | Grody Stephen D | Methods and apparatus for providing services using speech recognition |
US20060028337A1 (en) * | 2004-08-09 | 2006-02-09 | Li Qi P | Voice-operated remote control for TV and electronic systems |
US9037494B2 (en) * | 2004-09-13 | 2015-05-19 | Comcast Cable Holdings, Llc | Method and system of managing subscriber access to services associated with services provider |
US7634564B2 (en) * | 2004-11-18 | 2009-12-15 | Nokia Corporation | Systems and methods for invoking a service from a plurality of event servers in a network |
JP4667085B2 (en) | 2005-03-11 | 2011-04-06 | 富士通株式会社 | Spoken dialogue system, computer program, dialogue control apparatus, and spoken dialogue method |
US7460996B2 (en) * | 2005-06-23 | 2008-12-02 | Microsoft Corporation | Using strong data types to express speech recognition grammars in software programs |
US7302273B2 (en) * | 2005-07-08 | 2007-11-27 | Soleo Communications, Inc. | System and method for providing interactive wireless data and voice based services |
US8171493B2 (en) * | 2005-09-06 | 2012-05-01 | Nvoq Incorporated | VXML browser control channel |
US8635073B2 (en) * | 2005-09-14 | 2014-01-21 | At&T Intellectual Property I, L.P. | Wireless multimodal voice browser for wireline-based IPTV services |
US7499704B1 (en) * | 2005-10-21 | 2009-03-03 | Cingular Wireless Ii, Llc | Display caller ID on IPTV screen |
US20070112571A1 (en) * | 2005-11-11 | 2007-05-17 | Murugappan Thirugnana | Speech recognition at a mobile terminal |
US7624417B2 (en) * | 2006-01-27 | 2009-11-24 | Robin Dua | Method and system for accessing media content via the internet |
US20070286360A1 (en) * | 2006-03-27 | 2007-12-13 | Frank Chu | System and Method for Providing Screen-Context Assisted Information Retrieval |
US8755335B2 (en) * | 2006-04-13 | 2014-06-17 | At&T Intellectual Property I, L.P. | System and methods for control of a set top box |
US7792675B2 (en) * | 2006-04-20 | 2010-09-07 | Vianix Delaware, Llc | System and method for automatic merging of multiple time-stamped transcriptions |
US9602512B2 (en) * | 2006-05-08 | 2017-03-21 | At&T Intellectual Property I, Lp | Methods and apparatus to distribute media delivery to mobile devices |
US20070281680A1 (en) * | 2006-06-05 | 2007-12-06 | Vish Raju | Method and system for extending services to cellular devices |
US8478310B2 (en) * | 2006-10-05 | 2013-07-02 | Verizon Patent And Licensing Inc. | Short message service (SMS) data transfer |
US8316408B2 (en) * | 2006-11-22 | 2012-11-20 | Verizon Patent And Licensing Inc. | Audio processing for media content access systems and methods |
US8219636B2 (en) * | 2006-12-18 | 2012-07-10 | Verizon Patent And Licensing Inc. | Networked media recording |
US8130917B2 (en) * | 2006-12-21 | 2012-03-06 | Verizon Data Services Llc | Method and apparatus for group messaging |
US20080208589A1 (en) * | 2007-02-27 | 2008-08-28 | Cross Charles W | Presenting Supplemental Content For Digital Media Using A Multimodal Application |
US7912963B2 (en) * | 2007-06-28 | 2011-03-22 | At&T Intellectual Property I, L.P. | Methods and apparatus to control a voice extensible markup language (VXML) session |
- 2007
  - 2007-07-23 US US11/781,628 patent/US8175885B2/en active Active
- 2012
  - 2012-04-16 US US13/447,487 patent/US8655666B2/en active Active
- 2014
  - 2014-02-14 US US14/180,897 patent/US20140163996A1/en not_active Abandoned
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150127353A1 (en) * | 2012-05-08 | 2015-05-07 | Samsung Electronics Co., Ltd. | Electronic apparatus and method for controlling electronic apparatus thereof |
WO2017040644A1 (en) * | 2015-08-31 | 2017-03-09 | Roku, Inc. | Audio command interface for a multimedia device |
US10048936B2 (en) | 2015-08-31 | 2018-08-14 | Roku, Inc. | Audio command interface for a multimedia device |
US10871942B2 (en) | 2015-08-31 | 2020-12-22 | Roku, Inc. | Audio command interface for a multimedia device |
WO2021061304A1 (en) * | 2019-09-26 | 2021-04-01 | Dish Network L.L.C. | Method and system for implementing an elastic cloud-based voice search utilized by set-top box (stb) clients |
US11303969B2 (en) | 2019-09-26 | 2022-04-12 | Dish Network L.L.C. | Methods and systems for implementing an elastic cloud based voice search using a third-party search provider |
US11317162B2 (en) | 2019-09-26 | 2022-04-26 | Dish Network L.L.C. | Method and system for navigating at a client device selected features on a non-dynamic image page from an elastic voice cloud server in communication with a third-party search service |
US11477536B2 (en) | 2019-09-26 | 2022-10-18 | Dish Network L.L.C | Method and system for implementing an elastic cloud-based voice search utilized by set-top box (STB) clients |
US11849192B2 (en) | 2019-09-26 | 2023-12-19 | Dish Network L.L.C. | Methods and systems for implementing an elastic cloud based voice search using a third-party search provider |
US11392217B2 (en) | 2020-07-16 | 2022-07-19 | Mobius Connective Technologies, Ltd. | Method and apparatus for remotely processing speech-to-text for entry onto a destination computing system |
Also Published As
Publication number | Publication date |
---|---|
US8655666B2 (en) | 2014-02-18 |
US8175885B2 (en) | 2012-05-08 |
US20090030681A1 (en) | 2009-01-29 |
US20120203552A1 (en) | 2012-08-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8175885B2 (en) | Controlling a set-top box via remote speech recognition | |
CN108063969B (en) | Display apparatus, method of controlling display apparatus, server, and method of controlling server | |
US9495969B2 (en) | Simplified decoding of voice commands using control planes | |
US9736552B2 (en) | Authoring system for IPTV network | |
JP7026449B2 (en) | Information processing device, receiving device, and information processing method | |
JP6227459B2 (en) | Remote operation method and system, and user terminal and viewing terminal thereof | |
US8745683B1 (en) | Methods, devices, and mediums associated with supplementary audio information | |
JP7020799B2 (en) | Information processing equipment and information processing method | |
US20050114141A1 (en) | Methods and apparatus for providing services using speech recognition | |
US20140006022A1 (en) | Display apparatus, method for controlling display apparatus, and interactive system | |
JP2014003610A (en) | Display device, interactive server and response information provision method | |
JP2014093778A (en) | Broadcast receiver, server, and control method thereof | |
US20210350807A1 (en) | Word correction using automatic speech recognition (asr) incremental response | |
US20090144312A1 (en) | System and method for providing interactive multimedia services | |
KR102145370B1 (en) | Media play device and method for controlling screen and server for analyzing screen | |
JP6266330B2 (en) | Remote operation system and user terminal and viewing device thereof | |
WO2019188393A1 (en) | Information processing device, information processing method, transmission device and transmission method | |
US11551722B2 (en) | Method and apparatus for interactive reassignment of character names in a video device | |
KR102160756B1 (en) | Display apparatus and method for controlling the display apparatus | |
KR101763594B1 (en) | Method for providing service for recognizing voice in broadcast and network tv/server for controlling the method | |
WO2022156246A1 (en) | Voice command processing circuit, receiving device, server, and voice command accumulation system and method | |
CN114667566A (en) | Voice instruction processing circuit, receiving apparatus, server, voice instruction accumulation system, and voice instruction accumulation method | |
WO2019188269A1 (en) | Information processing device, information processing method, transmission device and transmission method | |
KR20110067479A (en) | Digital broadcast receiver and method for providing search list |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: VERIZON DATA SERVICES INDIA PVT LTD., INDIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: SUREKA, ASHUTOSH K.; SUBRAMANIAN, SATHISH K.; BASU, SIDHARTHA; AND OTHERS; SIGNING DATES FROM 20070711 TO 20070719. REEL/FRAME: 032221/0457. Owner name: VERIZON PATENT AND LICENSING INC., NEW JERSEY. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: VERIZON DATA SERVICES INDIA PVT LTD. REEL/FRAME: 032221/0466. Effective date: 20090301 |
| STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |