WO2020089932A1 - Voice control method and device for intelligent linux set-top-box system - Google Patents

Voice control method and device for intelligent linux set-top-box system

Info

Publication number
WO2020089932A1
WO2020089932A1 PCT/IN2019/050793 IN2019050793W WO2020089932A1 WO 2020089932 A1 WO2020089932 A1 WO 2020089932A1 IN 2019050793 W IN2019050793 W IN 2019050793W WO 2020089932 A1 WO2020089932 A1 WO 2020089932A1
Authority
WO
WIPO (PCT)
Prior art keywords
client device
receiver
wireless communication
communication link
over
Prior art date
Application number
PCT/IN2019/050793
Other languages
French (fr)
Inventor
Priyabrata Das
Sam Prabakar WILLIAM
Sunny Gupta
Balaji. P
Amit Kharabanda
Ankur ROHATGI
Original Assignee
Mybox Technologies Pvt. Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mybox Technologies Pvt. Ltd.
Publication of WO2020089932A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42203 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] sound input device, e.g. microphone
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/436 Interfacing a local distribution network, e.g. communicating with another STB or one or more peripheral devices inside the home
    • H04N21/43615 Interfacing a Home Network, e.g. for connecting the client to a plurality of peripherals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4394 Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content

Definitions

  • the present invention relates to a voice control device and voice control method and display device, and more particularly, to a voice control device and voice control method and display device capable of enhancing utilization convenience in a Linux Set-Top-Box.
  • UMTS Universal Mobile Telecommunication System
  • LTE Long term Evolution
  • IEEE Institute of Electrical and Electronics Engineers
  • a television is adaptable to viewing programmes through cable or radio channel or other set top boxes and is also adaptable to images from Digital Video Disc (DVD), Blu-Ray players, game consoles, computer systems and other devices.
  • DVD Digital Video Disc
  • TV sets have made obvious progress in functionality, such as video recording, picture in picture, memory card reading, automatic brightness adjustment, and so on.
  • An object of the invention is to provide a system and method to enable wireless two-way communication with, and control of, an RF-receiver-only Linux Integrated Receiver and Decoder (IRD) through a voice interface.
  • IRD Integrated Receiver and Decoder
  • the present invention relates to a system and method to enable wireless communication for controlling a Linux Integrated Receiver and Decoder (IRD).
  • the present invention in particular pertains to the field of Voice Interface and cloud based middleware implementation for a linear one-way Linux Set-Top-Box by retro-fitting a Bluetooth Low Energy (BLE) and WiFi dongle through Universal Serial Bus (USB).
  • BLE Bluetooth Low Energy
  • USB Universal Serial Bus
  • the invention relates to a method and a system for implementing voice control in a DVB set top box.
  • Voice command is captured through a digital microphone (MIC) triggered by a PTT (Push-To-Talk) button on the remote controller and transmitted over BLE to a receiver which is connected to the set top box over USB.
  • Set Top Box running ARM Linux detects the BLE receiver as a Human interface device (HID) and receives audio and data for processing voice commands through Voice control Cloud and MYCONNECT Cloud through a middleware implementation.
  • HID Human interface device
  • An aspect of the present invention provides a system for controlling one or more client devices using voice data, comprising: a microphone contained in a remote control device for capturing voice data; a client device for receiving voice data from the remote control device via a receiver connected with the client device, the client device includes a structural communication interface for connecting the receiver with the client device for the receiver to communicate with the client device via the structural communication interface; the receiver includes: a first short range wireless communication unit for establishing a short range wireless communication link, between the receiver and the remote control device; and a second wireless communication unit for establishing a local wireless communication link for connecting the client device to a wireless local area network via the receiver; wherein, the remote control device captures the voice data from a user and sends the voice data to the receiver over the established short range wireless communication link; the receiver communicates and sends the voice data to the client device via the structural communication interface; the client device accesses a content server, using the receiver, via the established local wireless communication link and over the wireless local area network; the client device processes the voice data and transmits the
  • the client device is a linear one-way Linux Set-Top-Box.
  • the system of the present invention wherein the client device processes the voice data using asynchronous voice capture protocol and FIFO (First In First Out) sampling management.
  • FIFO First In First Out
  • Another aspect of the invention relates to a system wherein the short range wireless communication link is a Bluetooth link, and the remote control device and the receiver are Bluetooth enabled devices, and wherein the voice data is captured over Bluetooth HID over Generic Attribute Profile (GATT) profile using compressed sub band coding and voice sample serialization using a dynamic FIFO (First In First Out) using a write and read pointer mechanism.
  • GATT Generic Attribute Profile
  • the system of the present invention in which the client device queues up the voice data and manages the voice data transmission to the content server.
  • the system of the present invention wherein the client device implements noise filter and error correction to increase accuracy of PCM (pulse code modulation) sampling.
  • PCM pulse code modulation
  • One more aspect of the present invention relates to a system wherein the content server transmits the output data to the receiver over the established local wireless communication link and the receiver transmits the output data to the client device over the structural communication interface.
  • the system of the present invention wherein the client device is further connected to an output device via a wired or a wireless communication link, for outputting the output data at the output device, and the client device transmits the output data to the output device via the wired or a wireless communication link.
  • the output data is audio output data or video output data, individually or in combination
  • the output device includes an audio output device, an audio/video output device, a television, speakers, a projector, a monitor, or a display device
  • the wired communication link includes Ethernet connection, or DVB cable connection
  • the wireless communication link includes Bluetooth or Internet connection link.
  • the system of the present invention wherein the remote control device discovers, over a short range wireless communication, the receiver, and connects or pairs with the receiver over the short range wireless communication link.
  • receiver is either an integral part of the client device, or connects to the client device via the structural communication interface.
  • Yet another aspect of the present invention provides a method for controlling one or more client devices using voice data, comprising: capturing voice data from a user, using a remote control device; transmitting the voice data from the remote control device to a receiver, over a short range wireless communication link established between the remote control device and the receiver; transmitting the voice data from the receiver to a client device, where the receiver is connected to the client device via a structural communication interface included in the client device, the structural communication interface is for connecting the receiver with the client device for enabling the receiver to communicate with the client device, and the transmission of the voice data from the receiver to the client device is over the structural communication interface; accessing a content server by the client device using the receiver connected to the client device, where the receiver establishes a local wireless communication link with a wireless local area network, and the receiver enables the client device to connect to the wireless local area network and access the content server over the established local wireless communication link; processing, at the client device, the voice data; transmitting the processed voice data from the client device to the content server via the receiver and over the
  • the method of the invention involves controlling one or more client devices using voice data of the present invention, wherein the receiver includes: a first short range wireless communication unit for establishing the short range wireless communication link, between the receiver and the remote control device; and a second wireless communication unit for establishing the local wireless communication link for connecting the client device to the wireless local area network to access the content server via the receiver.
  • the receiver includes: a first short range wireless communication unit for establishing the short range wireless communication link, between the receiver and the remote control device; and a second wireless communication unit for establishing the local wireless communication link for connecting the client device to the wireless local area network to access the content server via the receiver.
  • the method of the present invention involves a client device which is a linear one-way Linux Set-Top-Box.
  • the method of present invention wherein the transmission of the output data to the client device from the content server includes transmitting the output data from the content server to the receiver over the established local wireless communication link and transmitting the output data from the receiver to the client device over the structural communication interface.
  • the method of present invention further includes outputting the output data at an output device, where the output device is connected to the client device via a wired or a wireless communication link, for outputting the output data at the output device, and the client device transmits the output data to the output device via the wired or a wireless communication link.
  • the output data is audio output data or video output data, individually or in combination
  • the output device includes an audio output device, an audio/video output device, a television, speakers, a projector, a monitor, or a display device
  • the wired communication link includes Ethernet connection, or DVB cable connection
  • the wireless communication link includes Bluetooth or Internet connection link.
  • the method of present invention involves processing of the voice data which includes the client device using asynchronous voice capture protocol and FIFO sampling management.
  • the method of present invention includes a short range wireless communication link which is a Bluetooth link, and the remote control device and the receiver are Bluetooth enabled devices, and wherein the voice data is captured over Bluetooth HID over Generic Attribute Profile (GATT) profile using compressed sub band coding and voice sample serialization using a dynamic FIFO (First In First Out) using a write and read pointer mechanism.
  • the method of present invention wherein the receiver is either an integral part of the client device, or connects to the client device via the structural interface.
  • a wireless device enabling one or more client devices to be voice controlled, comprising: a communication interface to connect with a client device, and enabling communication between the wireless device and the client device via the communication interface; a first short range wireless communication unit for establishing a short range wireless communication link, between the wireless device and a user remote control device, the establishing of the short range wireless communication link between the wireless device and the user remote control device is for controlling the client device, via the wireless device, using voice data captured by the user remote control device; and a second wireless communication unit for establishing a local wireless communication link with a wireless local area network for connecting the client device to the wireless local area network, wherein the wireless device when connected to the client device via the communication interface enables the client device to be voice controlled, over the short range wireless communication link, using the user remote control device and enables the client device to access a content server and receive output data from the content server, over the wireless local area network and via the wireless device, the output data is related, at least in part, to the voice data.
  • the wireless device of present invention wherein the wireless device enables the client device to receive the voice data from and captured by the user remote control device and be voice controlled, over the short range wireless communication link, using the user remote control device.
  • the wireless device of present invention wherein the wireless device receives the output data from the content server over the wireless local area network, the output data is related, at least in part, to the voice data received from the user remote control device, and the wireless device transmits the output data to the client device via the communication interface.
  • the wireless device of present invention wherein the wireless device is discovered, using the first short range wireless communication unit, by the user remote control device, over a short range wireless communication, for the wireless device to connect or pair with the user remote control device over the short range wireless communication link, and wherein after being paired with the user remote control device, the wireless device enables the client device to receive voice data from the user remote control device and be voice controlled, over the short range wireless communication link, using the user remote control device.
  • the wireless device of present invention wherein the wireless device is either an integral part of the client device, or connects to the client device via a structural communication interface included in the client device.
  • Yet another aspect of the present invention provides that the content server transmits the output data to the receiver over the established local wireless communication link and the receiver transmits the output data to the client device over the structural communication interface.
  • Another aspect of the present invention provides the client device is further connected to an output device via a wired or a wireless communication link, for outputting the output data at the output device, and the client device transmits the output data to the output device via the wired or a wireless communication link.
  • the output data is audio output data or video output data, individually or in combination
  • the output device includes an audio output device, an audio/video output device, a television, speakers, a projector, a monitor, or a display device
  • the wired communication link includes Ethernet connection, or DVB cable connection
  • the wireless communication link includes Bluetooth or Internet connection link.
  • the remote control device discovers, over a short range wireless communication, the receiver, and connects or pairs with the receiver over the short range wireless communication link.
  • Another aspect of the present invention provides the receiver which is either an integral part of the client device, or connects to the client device via the structural communication interface.
  • One more aspect of the present invention relates to a wireless system for Alexa implementation in a DVB set top box (STB). In this system, the AVS (Alexa Voice Service) functional flow requires the client device to implement an input interface for the audio signal processor. The microphone implementation is done by a remote and a Bluetooth/WiFi device connected to the STB via USB. The request is sent from the microphone to the audio signal processor, where the voice command is digitized, modulated and compressed, and is then transmitted to a shared audio data center as data. The data is sent to an internet wakeword engine and also to an audio input processor; the wakeword engine in turn sends recognized triggers to the audio input processor. Thereafter, the processed data is sent to an Alexa interaction manager, where it goes through an Alexa comms library which may correspond with an Alexa cloud service via the AVS protocol. Simultaneously, the processed data is sent to an Alexa Orchestrator library, thereafter to capability agents, and then to an audio output, which is processed by the audio signal processor and finally sent to an output audio device.
  • AVS Alexa voice service
  • Figure 1 illustrates a system of controlling one or more client devices, such as a set top box, using voice commands
  • Figure 1.1 depicts DVB-STB Voice control Solution Architecture of the present invention
  • Figure 2 describes the Set Top Box (STB) Hardware Interfaces
  • Figure 3 describes Transmit-Receive protocol and the Voice control Interaction model diagram
  • Figure 5 is a schematic representation of L2CAP audio frame data
  • Figure 6a depicts the Voice Data Format and Figure 6b is an illustrative representation of the audio buffer and audio frame pool.
  • Figure 7 is a flow chart describing the BLE voice protocol with SBC encoding.
  • Figure 8 is a flow chart describing the functional flow, which requires the client device to implement an input interface for the audio signal processor.
  • the invention pertains to the field of Voice Interface and cloud based middleware implementation for a linear one-way Linux Set-Top-Box by retro-fitting a BLE and WiFi dongle through USB. More particularly, the invention pertains to the method of capturing voice input through Push-To-Talk Bluetooth 4.0 remote and development of custom media skills through client and cloud middleware using Voice control SDK on a low foot-print low cost IRD (Integrated Receiver and Decoder).
  • Figure 1 illustrates a system of controlling one or more client devices, such as a set top box, using voice commands.
  • a user uses a remote control device 102 for inputting voice commands, or voice data.
  • the remote control device 102, being voice enabled (for example, a TV remote), captures the voice data from the user.
  • the remote control device 102 includes a microphone to capture the voice data from the user.
  • the client device 104 includes a structural communication interface for connecting a receiver 106 or a wireless receiving device 106. The client device 104 communicates with the receiver 106 via the structural communication interface.
  • the receiver 106 further includes a communication interface which interfaces with the structural communication interface of the client device 104 to connect with the client device 104.
  • the receiver 106 includes a first short range wireless communication unit for establishing a short range wireless communication link, between the receiver 106 and the remote control device 102.
  • the short range wireless communication link between the receiver 106 and the remote control device 102 is a Bluetooth communication link.
  • the receiver 106 also includes a second wireless communication unit for establishing a local wireless communication link for connecting the client device 104 to a wireless local area network 108 via the receiver 106.
  • the local wireless communication link includes a WiFi link to communicate with Internet over the wireless local area network.
  • the receiver 106 is attached to the client device 104 through the structural communication interface of the client device 104. Retrofitting the receiver 106 with the client device 104 provides the client device 104 with the capability to receive voice data from a user operating the remote control device 102, such as a TV remote, over the short range communication link via the receiver 106. This retrofitting also enables the client device to communicate with one or more content servers 108 over the network.
  • the client device 104 transmits the voice data of the user to the content server 108 over the network via the wireless communication link, and in turn receives output data from the content server 108, over the network via the wireless communication link, where the output data is related to the voice data.
  • the output data is audio digital content or video digital content, individually or in combination.
  • the content server 108 is a digital content server, an audio digital content server, or video digital content server, individually or in combination.
  • the content server 108 may store audio files, video files, and the content server 108 extracts the audio or video files which map with the voice data, and sends the mapped audio and/or video files to the client device 104.
  • the client device 104 is connected to an output device for outputting the output data at the output device.
  • the output device may include, but is not limited to, an audio device, a display device, a monitor, a television, wireless or wired speakers, a projector, or similar devices.
  • the client device is connected to the output device either via a wired connection or a wireless connection.
  • the wired connection is an Ethernet connection or cable connection.
  • the wireless connection includes a Bluetooth connection or a wireless network connection, such as Internet.
  • the receiver 106 when connected with the client device 104, via the interfaces, enables the client device 104 to receive voice data which is captured by the remote control unit 102 operated by a user.
  • the voice commands in turn control the client device 104.
  • the client device 104 is controlled through user voice commands via the receiver 106.
  • the receiver 106 when connected with the client device 104 enables the client device 104 to access cloud servers 108 which are residing over the Internet, by providing the client device 104 ability to connect to a wireless network, via the receiver 106.
  • Figure 1.1 describes the Architecture of DVB STB Voice control Solution.
  • the reference numeral (1) indicates the voice command of the user to the TV remote, from the TV remote to the VOICE CONTROL enabled STB, from the VOICE CONTROL enabled STB to the Amazon Voice control Cloud Service, to the online Service/Data Sources, and back to the TV.
  • Figure 1 describes a method of voice capture and processing using asynchronous voice capture protocol and FIFO sampling management.
  • Voice data is captured over BLE HID over Generic Attribute Profile (GATT) profile using compressed sub band coding and voice sample serialization using a dynamic FIFO (First In First Out) using a write and read pointer mechanism.
  • STB middleware queues up the voice samples and manages the voice command transmission to Voice control cloud.
  • the GATT client protocol is customized to restore parameters between client and server.
  • An asynchronous method is also used for the voice capture start and end events. This method is superior to the synchronous method of polling, as it increases accuracy and eliminates noise in sampling.
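The FIFO with a write and read pointer mentioned above can be pictured as a ring buffer. The following is a minimal sketch only, not the patented implementation: the class name `VoiceFifo`, the fixed capacity, and the policy of dropping frames when the queue is full are all assumptions made for illustration (the source describes its FIFO as dynamic and does not state an overflow policy).

```python
class VoiceFifo:
    """Ring buffer of audio frames with separate read and write indices.

    Hypothetical illustration: names, fixed capacity and drop-on-full
    behaviour are assumptions, not details taken from the source.
    """

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._slots = [None] * capacity
        self._capacity = capacity
        self._write = 0  # index of the next slot to fill
        self._read = 0   # index of the oldest unread frame
        self._count = 0  # number of frames currently queued

    def push(self, frame):
        """Queue a frame; return False (dropping it) if the FIFO is full."""
        if self._count == self._capacity:
            return False
        self._slots[self._write] = frame
        self._write = (self._write + 1) % self._capacity
        self._count += 1
        return True

    def pop(self):
        """Return the oldest frame, or None if the FIFO is empty."""
        if self._count == 0:
            return None
        frame = self._slots[self._read]
        self._slots[self._read] = None
        self._read = (self._read + 1) % self._capacity
        self._count -= 1
        return frame

    def __len__(self):
        return self._count
```

In this sketch the capture side would call `push` for each received voice frame while the middleware calls `pop` to forward frames in arrival order; both indices wrap around the same storage, so no data is copied when frames are consumed.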
  • It is a state-oriented protocol in which a state machine monitors the state of voice sampling, processing and response. It interfaces with the native media player and graphics engine to play a Uniform Resource Locator (URL) and render Voice control response data.
  • the native player integration is optimized to use the raw audio capture buffer in Advanced Linux Sound Architecture (ALSA), which is more efficient than the classical PortAudio wrapper implementation.
  • ALSA Advanced Linux Sound Architecture
  • A hidraw device handle is used in the capture and for device management functionalities. The hidraw driver provides a raw interface to USB and Bluetooth Human Interface Devices (HIDs).
  • the reference implementation uses a Realtek RTL8723BU/DU integrated WiFi and Bluetooth receiver and a Realtek RTL8762A RCU (Remote Control Unit). It has a customized BlueZ 5.43 stack and Linux 3.10 net module. The reference implementation is achieved using 165MB RAM and 14.4MB Flash. The footprint includes the complete Digital Video Broadcasting (DVB) middleware stack, Voice control SDK, Bluetooth stack and embedded Linux kernel.
  • Voice command is captured by the BLE STB remote and is transmitted to the BLE receiver, which is connected to the STB over USB.
  • The voice command request is then sent to the Voice control cloud for intent resolution, and the response voice stream is captured by the STB client to play back the voice response and display JavaScript Object Notation (JSON) data in the required format.
  • JSON JavaScript Object Notation
  • STB has an RF frontend for receiving signals from a DVB cable, satellite or terrestrial delivery medium. It is connected to a BLE and WiFi dongle over USB, which works as a BLE receiver after being paired with the BLE RCU. A proprietary protocol is used for command and voice between the RCU and the STB.
  • the IF module of the voice control device sends start listening command to Voice control SDK.
  • Voice control SDK starts listening for voice data from BLE voice port.
  • Voice control IF module sends stop listening command.
  • the microphone of the voice control device may be disabled after providing the command.
  • Voice control SDK goes to thinking state.
  • Voice control SDK receives JSON data and voice response.
  • Voice control IF module parses JSON and invokes middleware modules for rendering display cards and playing back music and audio file.
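The listening sequence above (start listening, stop listening, thinking, response) can be sketched as a small table-driven state machine. This is an illustrative sketch under stated assumptions: the state names, event names, `VoiceSession` class and the JSON payload shape are hypothetical and are not taken from the Voice control SDK.

```python
import json
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    LISTENING = auto()
    THINKING = auto()
    RESPONDING = auto()


# Allowed transitions: (current state, event) -> next state.
# Event names are hypothetical labels for the steps listed above.
TRANSITIONS = {
    (State.IDLE, "start_listening"): State.LISTENING,
    (State.LISTENING, "stop_listening"): State.THINKING,
    (State.THINKING, "response_received"): State.RESPONDING,
    (State.RESPONDING, "playback_done"): State.IDLE,
}


class VoiceSession:
    """Tracks one voice interaction from capture to rendered response."""

    def __init__(self):
        self.state = State.IDLE
        self.card = None  # parsed JSON of the most recent response

    def handle(self, event, payload=None):
        """Apply an event; reject events that are invalid in the current state."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not valid in state {self.state.name}")
        if event == "response_received":
            # Parse before changing state so a malformed payload leaves
            # the session in THINKING.
            self.card = json.loads(payload)
        self.state = TRANSITIONS[key]
        return self.state
```

Keeping the transitions in a single table makes out-of-order events (for example, a stop command with no capture in progress) fail loudly instead of silently corrupting the session.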
  • the System Interaction may be described by Figure 4.
  • there are 3 major system components, namely the Voice control Software Development Kit (SDK), MYBOX player (the set top box and system manufactured by the Applicant) and Audio & Video Player (AVPlayer).
  • SDK Voice control Software Development Kit
  • MYBOX player the set top box and system manufactured by the Applicant
  • AVPlayer Audio & Video Player
  • Component interaction is through a shared buffer and IPC (Inter process communication) mechanism.
  • Voice control SDK provides a media player interface for “Play”, “Pause”, “Resume” and “Stop” operations. These operations are realized through corresponding AVPlayer driver functions through the intermediary MYBOX player.
  • AVPlayer maintains two threads of execution: one Transport Stream (TS) thread for HLS stream decode and playback, and another elementary stream (ES) thread for MP3 file decode and playback. HTTP Live Streaming (HLS) is an HTTP-based adaptive bitrate streaming communications protocol implemented by Apple Inc. as part of its QuickTime, Safari, OS X, and iOS software; client implementations are also available in Microsoft Edge, Firefox and some versions of Google Chrome. This mechanism is scalable, and additional threads can be created for simultaneous stream decode and playback.
  • AVPlayer also implements buffer management through a buffer pool data structure.
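A buffer pool of the kind mentioned above can be sketched as a set of preallocated buffers that decode threads borrow and return. This is a minimal sketch, not AVPlayer's actual data structure: the `BufferPool` name, the fixed buffer size and the use of a lock to share the pool between threads are assumptions.

```python
import threading


class BufferPool:
    """Pool of preallocated, equally sized buffers (hypothetical sketch)."""

    def __init__(self, count, size):
        self._size = size
        self._free = [bytearray(size) for _ in range(count)]
        self._lock = threading.Lock()  # assumed: pool shared by several threads

    def acquire(self):
        """Return a free buffer, or None if the pool is exhausted."""
        with self._lock:
            return self._free.pop() if self._free else None

    def release(self, buf):
        """Hand a buffer back to the pool for reuse."""
        if len(buf) != self._size:
            raise ValueError("buffer does not belong to this pool")
        with self._lock:
            self._free.append(buf)

    @property
    def available(self):
        """Number of buffers currently free."""
        with self._lock:
            return len(self._free)
```

Reusing preallocated buffers avoids repeated allocation during playback, which is one plausible reason to use a pool on a device with a small memory footprint.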
  • the present invention supports variable voice data length for efficiency. This is demonstrated by Figure 6a and Figure 6b, which depict the voice data format, including the SBC (Sub-band Coding) audio.
  • Figure 7 of the present invention is drawn to the Voice and Data Protocol implementation. As an illustrative embodiment, Figure 7 discloses the use of a Realtek 8723BU/DU receiver, which is a WiFi+BLE integrated module interfaced with the Linux host (STB) over USB. It may be mapped as a USB HID device and managed by the kernel as a hidraw device. Voice and data may be accessed through the hidraw handle. The application implements the proprietary audio packet reception and data management.
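Reading reports through a hidraw handle can be sketched as follows. The audio packet format in the source is proprietary and not disclosed, so everything about the framing here is an assumption made for illustration: the report size, the report ID value and the rule that the first byte identifies audio reports are all hypothetical.

```python
import io

REPORT_SIZE = 20         # assumed maximum report length in bytes (device specific)
AUDIO_REPORT_ID = 0x5A   # hypothetical report ID marking voice payloads


def read_audio_payloads(stream, report_size=REPORT_SIZE, audio_id=AUDIO_REPORT_ID):
    """Yield the payload of every audio report read from a binary stream.

    Each read is treated as one report whose first byte is the report ID;
    reports with any other ID (for example key presses) are skipped.
    """
    while True:
        report = stream.read(report_size)
        if not report:
            break  # end of stream
        if report[0] == audio_id:
            yield report[1:]


# Demo on an in-memory stream: one audio report followed by one non-audio report.
demo = io.BytesIO(bytes([AUDIO_REPORT_ID]) + b"\x01" * 19 + bytes([0x01]) + b"\x00" * 19)
payloads = list(read_audio_payloads(demo))
```

On a Linux host the same function could be pointed at a hidraw node opened unbuffered, for example `open("/dev/hidraw0", "rb", buffering=0)`; the device path is an assumption and depends on enumeration order.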
  • Figure 8 of the present invention shows the AVS (Alexa Voice Service) functional flow, which requires the client device to implement an input interface for the audio signal processor.
  • the MIC implementation is done by the BLE RCU and the STB, which is a unique method.
  • the processed data is sent to the Alexa interaction manager wherein it goes through the Alexa comms library which may correspond to the Alexa cloud service via AVS protocol, simultaneously the processed data is sent to the Alexa Orchestrator library and thereafter to the capability agents and then to the audio output which is processed by the audio signal processor and finally sent to an output audio device.

Abstract

The present invention provides a system and method to enable wireless communication for controlling a Linux Integrated Receiver and Decoder (IRD). The present invention discloses a Voice Interface and cloud based middleware implementation for a linear one-way Linux Set-Top-Box by retro-fitting a BLE and WiFi dongle through USB.

Description

VOICE CONTROL METHOD AND DEVICE FOR INTELLIGENT LINUX SET-TOP-BOX SYSTEM
FIELD OF THE INVENTION
The present invention relates to a voice control device and voice control method and display device, and more particularly, to a voice control device and voice control method and display device capable of enhancing utilization convenience in a Linux Set-Top-Box.
BACKGROUND OF THE INVENTION
In the era of disruptive and revolutionary innovations, handheld devices such as cellular telephones have increasing functionality. For instance, voice technology today is unrecognizable compared with what it was only a few years ago. No longer just a smart gimmick, or a handy way to make a phone call without taking your hands off the steering wheel, voice technology is now becoming a serious contender to other, more traditional forms. In turn, voice technology is leading to an increase in the number and types of software applications that are available for use on these electronic devices. The modern voice technology landscape is governed chiefly by the platforms and the hardware provided by the major tech players. This means mass adoption of platforms such as Apple's Siri, Google's Assistant and Microsoft's Cortana, as well as the commercial usage of devices like Amazon Echo.
In addition, many modern electronic devices include a networking subsystem that is used to wirelessly communicate with other electronic devices. For example, these electronic devices can include a networking subsystem with a cellular network interface (Universal Mobile Telecommunication System (UMTS), Long Term Evolution (LTE), etc.), a wireless local area network interface (e.g., a wireless network such as described in the Institute of Electrical and Electronics Engineers (IEEE) standards 802.11 or Bluetooth™ from the Bluetooth Special Interest Group of Kirkland, Wash.), and/or another type of wireless interface.
However, it is difficult to seamlessly integrate the wireless communication capabilities of these electronic devices with the software applications that execute on the electronic devices. As a consequence, it can be difficult for one electronic device to provide instructions or commands to another electronic device, which are then correctly interpreted and executed by a software application executing on the other electronic device. This constraint can limit the interactions between electronic devices and software applications, which can frustrate users and degrade the user experience.
With the advancement of the entertainment and multimedia industry and complexity in travel, television and cable have become one of the major leisure appliances for most people. A television is adaptable to viewing programmes through cable or radio channel or other set top boxes and is also adaptable to images from Digital Video Disc (DVD), Blu-Ray players, game consoles, computer systems and other devices. Also, in addition to display quality, TV sets have made obvious progress in functionality, such as video recording, picture in picture, memory card reading, automatic brightness adjustment, and so on.
Traditionally, television was controlled either by directly interacting with the television set or through a remote. With the advent of cable TV and set top boxes, more often than not, the user is required to handle two remote controls, one for the cable or set top box and the other for the television, making it a cumbersome process. This is more difficult for senior citizens with failing eyesight, who need to be educated regarding the operation of the remote control.
Hence, there is a need to apply advanced technologies such as voice control to enable ease of viewing of the cable or set top box via television by users.
OBJECT OF THE INVENTION
An object of the invention is to provide a system and method to enable wireless communication for enabling two-way communication and controlling an RF-receiver-only Linux Integrated Receiver and Decoder (IRD) through a voice interface.
SUMMARY OF THE INVENTION
The present invention relates to a system and method to enable wireless communication for controlling a Linux Integrated Receiver and Decoder (IRD). The present invention in particular pertains to the field of Voice Interface and cloud based middleware implementation for a linear one-way Linux Set-Top-Box by retro-fitting a Bluetooth Low Energy (BLE) and WiFi dongle through Universal Serial Bus (USB). The invention relates to a method and a system for implementing voice control in a DVB set top box. A voice command is captured through a digital microphone (MIC) triggered by a PTT (Push-To-Talk) button on the remote controller and transmitted over BLE to a receiver which is connected to the set top box over USB. The Set Top Box, running ARM Linux, detects the BLE receiver as a Human Interface Device (HID) and receives audio and data for processing voice commands through the Voice control Cloud and the MYCONNECT Cloud through a middleware implementation.
An aspect of the present invention provides a system for controlling one or more client devices using voice data, comprising: a microphone contained in a remote control device for capturing voice data; a client device for receiving voice data from the remote control device via a receiver connected with the client device, the client device includes a structural communication interface for connecting the receiver with the client device for the receiver to communicate with the client device via the structural communication interface; the receiver includes: a first short range wireless communication unit for establishing a short range wireless communication link, between the receiver and the remote control device; and a second wireless communication unit for establishing a local wireless communication link for connecting the client device to a wireless local area network via the receiver; wherein, the remote control device captures the voice data from a user and sends the voice data to the receiver over the established short range wireless communication link; the receiver communicates and sends the voice data to the client device via the structural communication interface; the client device accesses a content server, using the receiver, via the established local wireless communication link and over the wireless local area network; the client device processes the voice data and transmits the processed voice data to the content server using the receiver and via the established local wireless communication link; and the content server transmits output data to the client device via the receiver and over the established local wireless communication link, the output data is related, at least in part, to the voice data, and wherein the receiver when connected to the client device via the structural communication interface enables the client device to be voice controlled, over the short range wireless communication link, using the remote control device and enables the client device to access the content server over the wireless local area network.
In the system of the present invention, the client device is a linear one-way Linux Set-Top-Box.
The system of the present invention wherein the client device processes the voice data using asynchronous voice capture protocol and FIFO (First In First Out) sampling management.
Another aspect of the invention relates to a system wherein the short range wireless communication link is a Bluetooth link, and the remote control device and the receiver are Bluetooth enabled devices, and wherein the voice data is captured over Bluetooth HID over Generic Attribute Profile (GATT) profile using compressed sub band coding and voice sample serialization using a dynamic FIFO (First In First Out) using a write and read pointer mechanism.
The system of the present invention in which the client device queues up the voice data and manages the voice data transmission to the content server.
The system of the present invention wherein the client device implements noise filter and error correction to increase accuracy of PCM (pulse code modulation) sampling.
One more aspect of the present invention relates to a system wherein the content server transmits the output data to the receiver over the established local wireless communication link and the receiver transmits the output data to the client device over the structural communication interface.
The system of the present invention wherein the client device is further connected to an output device via a wired or a wireless communication link, for outputting the output data at the output device, and the client device transmits the output data to the output device via the wired or a wireless communication link.
A further object of the invention provides a system wherein the output data is audio output data or video output data, individually or in combination, and wherein the output device includes an audio output device, an audio/video output device, a television, speakers, a projector, a monitor, or a display device, and wherein the wired communication link includes an Ethernet connection or a DVB cable connection, and wherein the wireless communication link includes a Bluetooth or Internet connection link.
The system of the present invention wherein the remote control device discovers, over a short range wireless communication, the receiver, and connects or pairs with the receiver over the short range wireless communication link.
The system of the present invention wherein the receiver is either an integral part of the client device, or connects to the client device via the structural communication interface.
Yet another aspect of the present invention provides a method for controlling one or more client devices using voice data, comprising: capturing voice data from a user, using a remote control device; transmitting the voice data from the remote control device to a receiver, over a short range wireless communication link established between the remote control device and the receiver; transmitting the voice data from the receiver to a client device, where the receiver is connected to the client device via a structural communication interface included in the client device, the structural communication interface is for connecting the receiver with the client device for enabling the receiver to communicate with the client device, and the transmission of the voice data from the receiver to the client device is over the structural communication interface; accessing a content server by the client device using the receiver connected to the client device, where the receiver establishes a local wireless communication link with a wireless local area network, and the receiver enables the client device to connect to the wireless local area network and access the content server over the established local wireless communication link; processing, at the client device, the voice data; transmitting the processed voice data from the client device to the content server via the receiver and over the established local wireless communication link; and transmitting output data from the content server to the client device via the receiver and over the established local wireless communication link, where the output data is related, at least in part, to the voice data, and wherein the receiver when connected to the client device via the structural communication interface enables the client device to be voice controlled, over the short range wireless communication link, using the remote control device and enables the client device to access the content server over the wireless local area network.
The method of the invention involves controlling one or more client devices using voice data of the present invention, wherein the receiver includes: a first short range wireless communication unit for establishing the short range wireless communication link, between the receiver and the remote control device; and a second wireless communication unit for establishing the local wireless communication link for connecting the client device to the wireless local area network to access the content server via the receiver.
The method of the present invention involves a client device which is a linear one-way Linux Set-Top-Box.
The method of present invention wherein the transmission of the output data to the client device from the content server includes transmitting the output data from the content server to the receiver over the established local wireless communication link and transmitting the output data from the receiver to the client device over the structural communication interface.
The method of present invention further includes outputting the output data at an output device, where the output device is connected to the client device via a wired or a wireless communication link, for outputting the output data at the output device, and the client device transmits the output data to the output device via the wired or a wireless communication link.
The method of present invention wherein the output data is audio output data or video output data, individually or in combination, and wherein the output device includes an audio output device, an audio/video output device, a television, speakers, a projector, a monitor, or a display device, and wherein the wired communication link includes an Ethernet connection or a DVB cable connection, and wherein the wireless communication link includes a Bluetooth or Internet connection link.
The method of present invention involves processing of the voice data which includes the client device using asynchronous voice capture protocol and FIFO sampling management. The method of present invention includes a short range wireless communication link which is a Bluetooth link, and the remote control device and the receiver are Bluetooth enabled devices, and wherein the voice data is captured over Bluetooth HID over Generic Attribute Profile (GATT) profile using compressed sub band coding and voice sample serialization using a dynamic FIFO (First In First Out) using a write and read pointer mechanism.
The method of present invention wherein the receiver is either an integral part of the client device, or connects to the client device via the structural interface.
Another aspect of the present invention provides a wireless device enabling one or more client devices to be voice controlled, comprising: a communication interface to connect with a client device, and enabling communication between the wireless device and the client device via the communication interface; a first short range wireless communication unit for establishing a short range wireless communication link, between the wireless device and a user remote control device, the establishing of the short range wireless communication link between the wireless device and the user remote control device is for controlling the client device, via the wireless device, using voice data captured by the user remote control device; and a second wireless communication unit for establishing a local wireless communication link with a wireless local area network for connecting the client device to the wireless local area network, wherein the wireless device when connected to the client device via the communication interface enables the client device to be voice controlled, over the short range wireless communication link, using the user remote control device and enables the client device to access a content server and receive output data from the content server, over the wireless local area network and via the wireless device, the output data is related, at least in part, to the voice data.
The wireless device of present invention wherein the wireless device enables the client device to receive the voice data from and captured by the user remote control device and be voice controlled, over the short range wireless communication link, using the user remote control device.
The wireless device of present invention wherein the wireless device receives the output data from the content server over the wireless local area network, the output data is related, at least in part, to the voice data received from the user remote control device, and the wireless device transmits the output data to the client device via the communication interface.
The wireless device of present invention wherein the wireless device is discovered, using the first short range wireless communication unit, by the user remote control device, over a short range wireless communication, for the wireless device to connect or pair with the user remote control device over the short range wireless communication link, and wherein after being paired with the user remote control device, the wireless device enables the client device to receive voice data from the user remote control device and be voice controlled, over the short range wireless communication link, using the user remote control device.
The wireless device of present invention wherein the wireless device is either an integral part of the client device, or connects to the client device via a structural communication interface included in the client device.
Yet another aspect of the present invention provides that the content server transmits the output data to the receiver over the established local wireless communication link and the receiver transmits the output data to the client device over the structural communication interface.
Another aspect of the present invention provides the client device is further connected to an output device via a wired or a wireless communication link, for outputting the output data at the output device, and the client device transmits the output data to the output device via the wired or a wireless communication link.
A further aspect of the present invention provides that the output data is audio output data or video output data, individually or in combination, and the output device includes an audio output device, an audio/video output device, a television, speakers, a projector, a monitor, or a display device, and the wired communication link includes an Ethernet connection or a DVB cable connection, and wherein the wireless communication link includes a Bluetooth or Internet connection link. A further aspect of the present invention provides that the remote control device discovers, over a short range wireless communication, the receiver, and connects or pairs with the receiver over the short range wireless communication link.
Another aspect of the present invention provides the receiver which is either an integral part of the client device, or connects to the client device via the structural communication interface.
One more aspect of the present invention relates to a wireless system for Alexa implementation in a DVB set top box (STB), wherein the AVS (Alexa Voice Service) functional flow requires the client device to implement an input interface for an audio signal processor. The microphone implementation is done by a remote and a Bluetooth/WiFi device connected to the STB via USB. The request is received and sent from a microphone to an audio signal processor, where the voice command is digitized, modulated and compressed and transmitted to a shared audio data center as data. Said data is sent to an internet wakeword engine and also to an audio input processor, and the internet wakeword engine in turn sends recognized triggers to the audio input processor. Thereafter, the processed data is sent to an Alexa interaction manager, wherein it goes through an Alexa comms library which may correspond to an Alexa cloud service via the AVS protocol; simultaneously, the processed data is sent to an Alexa Orchestrator library and thereafter to capability agents and then to an audio output, which is processed by the audio signal processor and finally sent to an output audio device.
BRIEF DESCRIPTION OF THE DRAWING
The present invention, by way of example, is described with reference to the following drawings. These drawings and the following description are added as example and merely to illustrate and understand the invention. However, the drawings and the following description should not be construed to limit the scope of the invention.
Figure 1 illustrates a system of controlling one or more client devices, such as a set top box, using voice commands
Figure 1.1 depicts DVB-STB Voice control Solution Architecture of the present invention
Figure 2 describes the Set Top Box (STB) Hardware Interfaces
Figure 3 describes Transmit-Receive protocol and the Voice control Interaction model diagram
Figure 4 describes the System Interaction
Figure 5 is a schematic representation of L2CAP audio frame data
Figure 6a depicts the Voice Data Format and Figure 6b is an illustrative representation of the audio buffer and audio frame pool.
Figure 7 is a flow chart describing the BLE voice protocol with SBC encoding.
Figure 8 is a flow chart describing the AVS (Alexa Voice Service) functional flow, which requires the client device to implement an input interface for the audio signal processor.
DETAILED DESCRIPTION OF THE INVENTION
Although specific terms are used in the following description for the sake of clarity, these terms are intended to refer only to the particular structure of the invention selected for illustration in the drawings, and are not intended to define or limit the scope of the invention.
References in the specification to "one embodiment" or "an embodiment" mean that a particular feature, structure, characteristic, or function described in connection with the embodiment is included in at least one embodiment of the invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment.
The invention pertains to the field of Voice Interface and cloud based middleware implementation for a linear one-way Linux Set-Top-Box by retro-fitting a BLE and WiFi dongle through USB. More particularly, the invention pertains to the method of capturing voice input through Push-To-Talk Bluetooth 4.0 remote and development of custom media skills through client and cloud middleware using Voice control SDK on a low foot-print low cost IRD (Integrated Receiver and Decoder).
References are made to embodiments of the invention, an example of which is given as an illustration in the accompanying figures. These figures are intended to be illustrative, not limiting. Although the invention is generally described in the context of these embodiments, it should be understood that it is not intended to limit the scope of the invention to these particular embodiments.
Figure 1 illustrates a system of controlling one or more client devices, such as a set top box, using voice commands. In Figure 1, a user uses a remote control device 102 for inputting voice commands, or voice data. The remote control device 102, being voice enabled (for example, a TV remote), captures the voice data from the user. The remote control device 102 includes a microphone to capture the voice data from the user. Further, the client device 104 includes a structural communication interface for connecting a receiver 106 or a wireless receiving device 106. The client device 104 communicates with the receiver 106 via the structural communication interface.
The receiver 106 further includes a communication interface which interfaces with the structural communication interface of the client device 104 to connect with the client device 104. The receiver 106 includes a first short range wireless communication unit for establishing a short range wireless communication link, between the receiver 106 and the remote control device 102. In an embodiment, the short range wireless communication link between the receiver 106 and the remote control device 102 is a Bluetooth communication link. The receiver 106 also includes a second wireless communication unit for establishing a local wireless communication link for connecting the client device 104 to a wireless local area network 108 via the receiver 106. In an embodiment, the local wireless communication link includes a WiFi link to communicate with Internet over the wireless local area network.
The receiver 106 is attached to the client device 104 through the structural communication interface of the client device 104. Retrofitting the receiver 106 with the client device 104 provides the client device 104 with the capability to receive voice data from a user operating the remote control device 102, such as a TV remote, over the short range communication link via the receiver 106. This retrofitting also enables the client device to communicate with one or more content servers 108 over the network. The client device 104 transmits the voice data of the user to the content server 108 over the network via the wireless communication link, and in turn receives output data from the content server 108, over the network via the wireless communication link, where the output data is related to the voice data. In an embodiment, the output data is audio digital content or video digital content, individually or in combination. In an embodiment, the content server 108 is a digital content server, an audio digital content server, or video digital content server, individually or in combination. For example, the content server 108 may store audio files, video files, and the content server 108 extracts the audio or video files which map with the voice data, and sends the mapped audio and/or video files to the client device 104.
Further, the client device 104 is connected to an output device for outputting the output data at the output device. The output device may include and is not limited to an audio device, a display device, a monitor, a television, wireless or wired speakers, or a projector or the like devices. Furthermore, the client device is connected to the output device either via a wired connection or a wireless connection. In an embodiment, the wired connection is an Ethernet connection or cable connection. In an embodiment, the wireless connection includes a Bluetooth connection or a wireless network connection, such as Internet.
Therefore, the receiver 106, when connected with the client device 104, via the interfaces, enables the client device 104 to receive voice data which is captured by the remote control unit 102 operated by a user. The voice commands in turn control the client device 104. Thus, the client device 104 is controlled through user voice commands via the receiver 106. Also, the receiver 106 when connected with the client device 104 enables the client device 104 to access cloud servers 108 which are residing over the Internet, by providing the client device 104 ability to connect to a wireless network, via the receiver 106.
Figure 1.1 describes the Architecture of the DVB STB Voice control Solution. In figure 1, the reference numeral (1) indicates the voice command of the user to the TV remote, from the TV remote to the VOICE CONTROL enabled STB, from the VOICE CONTROL enabled STB to the Amazon Voice control Cloud Service, to the online Service/Data Sources, and back to the TV. Figure 1 describes a method of voice capture and processing using an asynchronous voice capture protocol and FIFO sampling management. Voice data is captured over BLE HID over Generic Attribute Profile (GATT) using compressed sub band coding, and voice samples are serialized using a dynamic FIFO (First In First Out) with a write and read pointer mechanism. The STB middleware queues up the voice samples and manages the voice command transmission to the Voice control cloud. It implements a noise filter and error correction to increase the accuracy of PCM (pulse code modulation) sampling. It also implements an asynchronous event notification for key press and release events. The GATT client protocol is customized to restore parameters between client and server. An asynchronous method is also used for the voice capture start and end events. This method is superior to the synchronous method of polling as it increases accuracy and eliminates noise in sampling. It is a state oriented protocol in which a state machine monitors the state of voice sampling, processing and response. It interfaces with the native media player and graphics engine to play a Uniform Resource Locator (URL) and render Voice control response data. The native player integration is optimized for usage of the raw audio capture buffer in Advanced Linux Sound Architecture (ALSA), which is more efficient than the classical "port-audio" wrapper implementation. A "Hidraw" device handle (the Hidraw driver provides a raw interface to USB and Bluetooth Human Interface Devices (HIDs)) is used in the capture and for device management functionalities.
The reference implementation uses a Realtek RTL 8723 BU/DU integrated WiFi and Bluetooth receiver and a Realtek RTL 8762A RCU (Remote Control Unit). It has a customized Bluez 5.43 stack and Linux 3.10 net module. The reference implementation is achieved using 165MB RAM and 14.4MB Flash. The footprint includes the complete Digital Video Broadcasting (DVB) middleware stack, the Voice control SDK, the Bluetooth stack and the embedded Linux kernel. The voice command is captured by the BLE STB remote and is transmitted to the BLE receiver, which is connected to the STB over USB. The request voice command is then sent to the Voice control cloud for intent resolution, and the response voice stream is captured by the STB client to play back the voice response and display JavaScript Object Notation (JSON) data in the required format.
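By way of illustration only, the FIFO with a write and read pointer mechanism referred to above may be sketched as follows. This is a minimal sketch and not the reference implementation: the class name VoiceFifo, its methods, and the fixed capacity are assumptions made for the example (the specification describes a dynamic FIFO and does not disclose its internals).

```python
from typing import List, Optional


class VoiceFifo:
    """Hypothetical fixed-capacity FIFO of voice frames with write/read pointers."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._slots: List[Optional[bytes]] = [None] * capacity
        self._write = 0   # index of the next slot to fill
        self._read = 0    # index of the oldest unread frame
        self._count = 0   # number of frames currently queued

    def push(self, frame: bytes) -> bool:
        """Queue one frame; return False (frame not stored) if the FIFO is full."""
        if self._count == len(self._slots):
            return False
        self._slots[self._write] = frame
        self._write = (self._write + 1) % len(self._slots)
        self._count += 1
        return True

    def pop(self) -> Optional[bytes]:
        """Return the oldest queued frame, or None if the FIFO is empty."""
        if self._count == 0:
            return None
        frame = self._slots[self._read]
        self._slots[self._read] = None
        self._read = (self._read + 1) % len(self._slots)
        self._count -= 1
        return frame

    def __len__(self) -> int:
        return self._count
```

In such a sketch the capture side would call push() as frames arrive from the receiver, while the side transmitting to the cloud would call pop(), so that frames leave in the order in which they were captured.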
As shown in Figure 2, the STB has an RF frontend for receiving signals from a DVB cable, satellite or terrestrial delivery medium. It is connected to a BLE and WiFi dongle over USB, which works as a BLE receiver after getting paired with the BLE RCU. A proprietary protocol is used for command and voice between the RCU and the STB.
Accordingly, as shown in Figures 1 and 2, it is to be understood that the embodiments of the invention herein described are merely illustrative of the application of the principles of the invention. Reference herein to details of the illustrated embodiments is not intended to limit the scope of the claims, which themselves recite those features regarded as essential to the invention.
The method of interaction may be described by Figure 3. On enabling the microphone of the voice control device, the IF module of the voice control device sends a start listening command to the Voice control SDK. The Voice control SDK starts listening for voice data from the BLE voice port. The Voice control IF module then sends a stop listening command. The microphone of the voice control device may be disabled after providing the command. The Voice control SDK goes to the thinking state. The Voice control SDK receives JSON data and a voice response. The Voice control IF module parses the JSON and invokes middleware modules for rendering display cards and playing back music and audio files.
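By way of illustration only, the interaction sequence described with reference to Figure 3 may be modelled as a small event-driven state machine. This is a sketch under stated assumptions: only the "thinking" state is named in the specification, and the remaining state names, the event names and the VoiceSession class are hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()        # hypothetical: waiting for the microphone to be enabled
    LISTENING = auto()   # hypothetical: SDK is reading voice data
    THINKING = auto()    # named in the specification
    RESPONDING = auto()  # hypothetical: rendering cards / playing audio


class Event(Enum):
    START_LISTENING = auto()    # start listening command from the IF module
    STOP_LISTENING = auto()     # stop listening command from the IF module
    RESPONSE_RECEIVED = auto()  # JSON data and voice response received
    RESPONSE_DONE = auto()      # display cards rendered and audio played


# Allowed transitions; any other (state, event) pair is ignored.
TRANSITIONS = {
    (State.IDLE, Event.START_LISTENING): State.LISTENING,
    (State.LISTENING, Event.STOP_LISTENING): State.THINKING,
    (State.THINKING, Event.RESPONSE_RECEIVED): State.RESPONDING,
    (State.RESPONDING, Event.RESPONSE_DONE): State.IDLE,
}


class VoiceSession:
    """Tracks the current state and applies events as they are notified."""

    def __init__(self) -> None:
        self.state = State.IDLE

    def handle(self, event: Event) -> State:
        next_state = TRANSITIONS.get((self.state, event))
        if next_state is not None:
            self.state = next_state
        return self.state
```

Because handle() is invoked only when an event is notified, the sketch follows the asynchronous notification style described earlier in the specification and does not poll.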
The System Interaction may be described by Figure 4. As shown in figure 4, there are 3 major system components, namely the Voice control Software Development Kit (SDK), the MYBOX player (the set top box and system manufactured by the Applicant) and the Audio & Video Player (AVPlayer). Component interaction is through a shared buffer and an IPC (Inter process communication) mechanism. The Voice control SDK provides a media player interface for "Play", "Pause", "Resume" and "Stop" operations. These operations are realized through corresponding AVPlayer driver functions through the intermediary MYBOX player. AVPlayer maintains two threads of execution: one Transport Stream ("TS") thread for HLS stream decode and playback, and another export files (ES) thread for MP3 file decode and playback. HLS (HTTP Live Streaming) is an HTTP-based adaptive bitrate streaming communications protocol implemented by Apple Inc. as part of its QuickTime, Safari, OS X, and iOS software; client implementations are also available in Microsoft Edge, Firefox and some versions of Google Chrome. This mechanism is scalable and additional threads can be created for simultaneous stream decode and playback. AVPlayer also implements buffer management through a buffer pool data structure.
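By way of illustration only, the "Play", "Pause", "Resume" and "Stop" media player interface may be sketched as a thin layer that validates each request and forwards it to a driver. This is a sketch under stated assumptions: AVPlayerStub stands in for the AVPlayer driver, whose real functions are not disclosed, and MediaPlayerInterface collapses the SDK-facing interface and the intermediary MYBOX player into one hypothetical class.

```python
from typing import List


class AVPlayerStub:
    """Hypothetical stand-in for the AVPlayer driver; it only records calls."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def start(self, url: str) -> None:
        self.calls.append(f"start:{url}")

    def pause(self) -> None:
        self.calls.append("pause")

    def resume(self) -> None:
        self.calls.append("resume")

    def stop(self) -> None:
        self.calls.append("stop")


class MediaPlayerInterface:
    """Play/Pause/Resume/Stop surface that forwards valid requests to the driver."""

    def __init__(self, driver: AVPlayerStub) -> None:
        self._driver = driver
        self.state = "stopped"  # one of: "stopped", "playing", "paused"

    def play(self, url: str) -> bool:
        # Starting new content first stops whatever is active.
        if self.state != "stopped":
            self._driver.stop()
        self._driver.start(url)
        self.state = "playing"
        return True

    def pause(self) -> bool:
        if self.state != "playing":
            return False
        self._driver.pause()
        self.state = "paused"
        return True

    def resume(self) -> bool:
        if self.state != "paused":
            return False
        self._driver.resume()
        self.state = "playing"
        return True

    def stop(self) -> bool:
        if self.state == "stopped":
            return False
        self._driver.stop()
        self.state = "stopped"
        return True
```

Rejecting requests that do not match the current state (for example, "Pause" while stopped) keeps the driver from receiving calls it cannot act on; whether the actual implementation behaves this way is not stated in the specification.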
An illustrative depiction of the L2CAP audio frame data of the present invention is provided at Figure 5.
The present invention supports variable voice data length for efficiency. This is demonstrated by Figure 6a and Figure 6b, which depict the voice data format, including the SBC (Sub-band Coding) audio.
Figure 7 of the present invention is drawn to the Voice and Data Protocol implementation. As an illustrative embodiment, Figure 7 discloses the use of a Realtek 8723BU/DU receiver, which is a WiFi+BLE integrated module interfaced with the Linux host (STB) over USB. It may be mapped as a USB HID device and managed by the kernel as a Hidraw device. Voice and data may be accessed through the Hidraw handle. The application implements the proprietary audio packet reception and data management.
Figure 8 of the present invention shows the AVS (Alexa Voice Service) functional flow, which requires the client device to implement an input interface for the audio signal processor. The MIC implementation is done by the BLE RCU and the STB, which is a unique method. The request received at the microphone (XN) is sent to the audio signal processor, where the voice command is digitized, modulated and compressed and transmitted as data to the shared audio data center. Said data is sent to the wakeword engine and also to the audio input processor; the internet wakeword engine in turn sends recognized triggers to the audio input processor. Thereafter, the processed data is sent to the Alexa interaction manager, from which it goes through the Alexa comms library, which may correspond with the Alexa cloud service via the AVS protocol. Simultaneously, the processed data is sent to the Alexa Orchestrator library and thereafter to the capability agents and then to the audio output, which is processed by the audio signal processor and finally sent to an output audio device.
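The Hidraw access path described for Figure 7, together with variable-length voice data, can be sketched as below. The report layout used here is an assumption made purely for illustration (byte 0 as report ID, byte 1 as payload length, followed by the compressed audio); the actual format of Figures 5, 6a and 6b is proprietary and is not reproduced. The names `parse_voice_report`, `ByteFifo` and `pump` are hypothetical. The FIFO uses separate write and read pointers over a fixed ring.

```python
import os


def parse_voice_report(report: bytes) -> bytes:
    """Extract the audio payload from one report.

    ASSUMED layout (not from the patent): byte 0 = report ID,
    byte 1 = payload length N, bytes 2..2+N-1 = compressed audio.
    """
    if len(report) < 2:
        raise ValueError("report too short")
    length = report[1]
    payload = report[2 : 2 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return payload


class ByteFifo:
    """Ring buffer with separate write and read pointers."""

    def __init__(self, capacity: int):
        self._buf = bytearray(capacity)
        self._cap = capacity
        self._w = 0  # next write position
        self._r = 0  # next read position
        self._used = 0

    def write(self, data: bytes) -> None:
        if len(data) > self._cap - self._used:
            raise BufferError("FIFO full")
        for b in data:
            self._buf[self._w] = b
            self._w = (self._w + 1) % self._cap
        self._used += len(data)

    def read(self, n: int) -> bytes:
        n = min(n, self._used)
        out = bytearray()
        for _ in range(n):
            out.append(self._buf[self._r])
            self._r = (self._r + 1) % self._cap
        self._used -= n
        return bytes(out)


def pump(fd: int, fifo: ByteFifo, reports: int, report_size: int = 64) -> None:
    # On a hidraw node each read() returns one queued input report.
    # report_size is an assumed upper bound, not a value from the source.
    for _ in range(reports):
        fifo.write(parse_voice_report(os.read(fd, report_size)))
```

On a Linux host the file descriptor could be obtained with `os.open("/dev/hidraw0", os.O_RDONLY)`, where the device node name is an assumption that depends on enumeration order. A consumer such as an SBC decoder would then drain the FIFO with `read()` at its own pace, which decouples packet arrival from decoding.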

Claims

We Claim:
1. A system for controlling one or more client devices using voice data, comprising:
a microphone contained in a remote control device for capturing voice data; a client device for receiving voice data from the remote control device via a receiver connected with the client device, the client device includes a structural communication interface for connecting the receiver with the client device for the receiver to communicate with the client device via the structural communication interface; the receiver includes:
a first short range wireless communication unit for establishing a short range wireless communication link, between the receiver and the remote control device; and
a second wireless communication unit for establishing a local wireless communication link for connecting the client device to a wireless local area network via the receiver; wherein, the remote control device captures the voice data from a user and sends the voice data to the receiver over the established short range wireless communication link;
the receiver communicates and sends the voice data to the client device via the structural communication interface;
the client device accesses a content server, using the receiver, via the established local wireless communication link and over the wireless local area network; the client device processes the voice data and transmits the processed voice data to the content server using the receiver and via the established local wireless communication link; and
the content server transmits output data to the client device via the receiver and over the established local wireless communication link, the output data is related, at least in part, to the voice data, and wherein the receiver when connected to the client device via the structural communication interface enables the client device to be voice controlled, over the short range wireless communication link, using the remote control device and enables the client device to access the content server over the wireless local area network.
2. The system of claim 1, wherein the client device is a linear one-way Linux Set-Top-Box.
3. The system of claim 1, wherein the client device processes the voice data using asynchronous voice capture protocol and FIFO (First In First Out) sampling management.
4. The system of claim 3, wherein the short range wireless communication link is a Bluetooth link, and the remote control device and the receiver are Bluetooth enabled devices, and wherein the voice data is captured over Bluetooth HID over Generic Attribute Profile (GATT) profile using compressed sub band coding and voice sample serialization using a dynamic FIFO (First In First Out) using a write and read pointer mechanism.
5. The system of claim 2, wherein the client device queues up the voice data and manages the voice data transmission to the content server.
6. The system of claim 5, wherein the client device implements noise filter and error correction to increase accuracy of PCM (pulse code modulation) sampling.
7. The system of claim 1, wherein the content server transmits the output data to the receiver over the established local wireless communication link and the receiver transmits the output data to the client device over the structural communication interface.
8. The system of claim 7, wherein the client device is further connected to an output device via a wired or a wireless communication link, for outputting the output data at the output device, and the client device transmits the output data to the output device via the wired or a wireless communication link.
9. The system of claim 8, wherein the output data is audio output data or video output data, individually or in combination, and wherein the output device includes an audio output device, an audio/video output device, a television, speakers, a projector, a monitor, or a display device, and wherein the wired communication link includes Ethernet connection, or DVB cable connection, and wherein the wireless communication link includes Bluetooth or Internet connection link.
10. The system of claim 1, wherein the remote control device discovers, over a short range wireless communication, the receiver, and connects or pairs with the receiver over the short range wireless communication link.
11. The system of claim 1, wherein the receiver is either an integral part of the client device, or connects to the client device via the structural communication interface.
12. A method for controlling one or more client devices using voice data, comprising: capturing voice data from a user, using a remote control device;
transmitting the voice data from the remote control device to a receiver, over a short range wireless communication link established between the remote control device and the receiver;
transmitting the voice data from the receiver to a client device, where the receiver is connected to the client device via a structural communication interface included in the client device, the structural communication interface is for connecting the receiver with the client device for enabling the receiver to communicate with the client device, and the transmission of the voice data from the receiver to the client device is over the structural communication interface; accessing a content server by the client device using the receiver connected to the client device, where the receiver establishes a local wireless communication link with a wireless local area network, and the receiver enables the client device to connect to the wireless local area network and access the content server over the established local wireless communication link; processing, at the client device, the voice data; transmitting the processed voice data from the client device to the content server via the receiver and over the established local wireless communication link; and transmitting output data from the content server to the client device via the receiver and over the established local wireless communication link, where the output data is related, at least in part, to the voice data, and wherein the receiver when connected to the client device via the structural communication interface enables the client device to be voice controlled, over the short range wireless communication link, using the remote control device and enables the client device to access the content server over the wireless local area network.
13. The method of claim 12, wherein the receiver includes:
a first short range wireless communication unit for establishing the short range wireless communication link, between the receiver and the remote control device; and
a second wireless communication unit for establishing the local wireless communication link for connecting the client device to the wireless local area network to access the content server via the receiver.
14. The method of claim 13, wherein the client device is a linear one-way Linux Set-Top-Box.
15. The method of claim 14, wherein the transmission of the output data to the client device from the content server includes transmitting the output data from the content server to the receiver over the established local wireless communication link and transmitting the output data from the receiver to the client device over the structural communication interface.
16. The method of claim 12, further includes outputting the output data at an output device, where the output device is connected to the client device via a wired or a wireless communication link, for outputting the output data at the output device, and the client device transmits the output data to the output device via the wired or a wireless communication link.
17. The method of claim 16, wherein the output data is audio output data or video output data, individually or in combination, and wherein the output device includes an audio output device, an audio/video output device, a television, speakers, a projector, a monitor, or a display device, and wherein the wired communication link includes Ethernet connection, or DVB cable connection, and wherein the wireless communication link includes Bluetooth or Internet connection link.
18. The method of claim 12, wherein the processing of the voice data includes the client device using asynchronous voice capture protocol and FIFO (First In First Out) sampling management.
19. The method of claim 12, wherein the short range wireless communication link is a Bluetooth link, and the remote control device and the receiver are Bluetooth enabled devices, and wherein the voice data is captured over Bluetooth HID over Generic Attribute Profile (GATT) profile using compressed sub band coding and voice sample serialization using a dynamic FIFO (First In First Out) using a write and read pointer mechanism.
20. The method of claim 12, wherein the receiver is either an integral part of the client device, or connects to the client device via the structural interface.
21. A wireless device enabling one or more client devices to be voice controlled, comprising:
a communication interface to connect with a client device, and enabling communication between the wireless device and the client device via the communication interface; a first short range wireless communication unit for establishing a short range wireless communication link, between the wireless device and a user remote control device, the establishing of the short range wireless communication link between the wireless device and the user remote control device is for controlling the client device, via the wireless device, using voice data captured by the user remote control device; and
a second wireless communication unit for establishing a local wireless communication link with a wireless local area network for connecting the client device to the wireless local area network,
wherein the wireless device when connected to the client device via the communication interface enables the client device to be voice controlled, over the short range wireless communication link, using the user remote control device and enables the client device to access a content server and receive output data from the content server, over the wireless local area network and via the wireless device, the output data is related, at least in part, to the voice data.
22. The wireless device of claim 21, wherein the wireless device enables the client device to receive the voice data from and captured by the user remote control device and be voice controlled, over the short range wireless communication link, using the user remote control device.
23. The wireless device of claim 22, wherein the wireless device receives the output data from the content server over the wireless local area network, the output data is related, at least in part, to the voice data received from the user remote control device, and the wireless device transmits the output data to the client device via the communication interface.
24. The wireless device of claim 21, wherein the wireless device is discovered, using the first short range wireless communication unit, by the user remote control device, over a short range wireless communication, for the wireless device to connect or pair with the user remote control device over the short range wireless communication link, and wherein after being paired with the user remote control device, the wireless device enables the client device to receive voice data from the user remote control device and be voice controlled, over the short range wireless communication link, using the user remote control device.
25. The wireless device of claim 21, wherein the wireless device is either an integral part of the client device, or connects to the client device via a structural communication interface included in the client device.
26. A wireless system for Alexa implementation in DVB set top box (STB), said system comprising an AVS (Alexa voice service) Functional flow requires the client device to implement input interface for audio signal processor; microphone implementation is done by a remote and a Bluetooth/Wifi device connected to the STB via USB, the request received and sent from a microphone to an audio signal processor where the voice command is digitized, modulated and compressed and transmitted to a shared audio data center as data, said data is sent to an internet wakeword engine and also to an audio input processor wherein the data sent to the internet wakeword engine which in turn sends recognized triggers to the audio input processor, thereafter, the processed data is sent to an Alexa interaction manager wherein it goes through an Alexa comms library which may correspond to an Alexa cloud service via AVS protocol, simultaneously the processed data is sent to an Alexa Orchestrator library and thereafter to capability agents and then to an audio output which is processed by the audio signal processor, and finally sent to an output audio device.
PCT/IN2019/050793 2018-10-29 2019-10-29 Voice control method and device for intelligent linux set-top-box system WO2020089932A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN201811040742 2018-10-29

Publications (1)

Publication Number Publication Date
WO2020089932A1 true WO2020089932A1 (en) 2020-05-07

Family

ID=70462003

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2019/050793 WO2020089932A1 (en) 2018-10-29 2019-10-29 Voice control method and device for intelligent linux set-top-box system

Country Status (1)

Country Link
WO (1) WO2020089932A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150382047A1 (en) * 2014-06-30 2015-12-31 Apple Inc. Intelligent automated assistant for tv user interactions
US20160189711A1 (en) * 2006-10-31 2016-06-30 Sony Corporation Speech recognition for internet video search and navigation

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112423053A (en) * 2020-11-06 2021-02-26 歌尔科技有限公司 Audio sharing method, system, remote controller and computer readable storage medium
CN112423053B (en) * 2020-11-06 2024-04-09 歌尔科技有限公司 Audio sharing method, system, remote controller and computer readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19878308

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19878308

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 03-03-2022)
