CN109637534A - Voice remote control method, system, controlled device and computer readable storage medium - Google Patents

Info

Publication number
CN109637534A
Authority
CN
China
Prior art keywords
audio data
control command
cloud server
voice
controlled device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811599357.0A
Other languages
Chinese (zh)
Inventor
伍以文
许辉福
袁建强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Skyworth RGB Electronics Co Ltd
Original Assignee
Shenzhen Skyworth RGB Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Skyworth RGB Electronics Co Ltd
Priority to CN201811599357.0A
Priority to PCT/CN2019/079991 (WO2020133764A1)
Publication of CN109637534A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/34 Adaptation of a single recogniser for parallel processing, e.g. by use of multiple processors or cloud computing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42203 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] sound input device, e.g. microphone
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Selective Calling Equipment (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a voice remote control method, a system, a controlled device and a computer-readable storage medium. The method comprises: receiving first audio data sent by a remote terminal, the first audio data being obtained by the remote terminal by processing an acquired user voice remote control instruction; processing the first audio data according to a preset rule to obtain second audio data; sending the second audio data to a cloud server; and receiving a control command text issued by the cloud server, parsing the control command text to obtain a control command, and executing the control command, the control command text being obtained by the cloud server by processing the second audio data. The invention remotely controls the controlled device through voice control commands, solving the problems that traditional mechanical remote control is complicated to operate and slow to respond.

Description

Voice remote control method, system, controlled device and computer readable storage medium
Technical field
The present invention relates to the technical field of intelligent remote control, and in particular to a voice remote control method, a system, a controlled device and a computer-readable storage medium.
Background
At present, most consumer electronic devices are controlled by the user operating a mechanical remote control. For example, when watching television, the user has to operate the remote control manually to tune channels, adjust the volume, switch programs, open or close applications, adjust television picture and audio parameters, and so on. However, the user often has to press the remote control repeatedly to open multi-level menus, and then search the program list item by item for the desired program or look for the button of the parameter to be adjusted. This search is cumbersome and inconvenient for the user, and adjusting parameters such as the volume with a mechanical remote control cannot meet the requirement of real-time response.
Summary of the invention
The main purpose of the present invention is to provide a voice remote control method, a system, a controlled device and a computer-readable storage medium, which aim to remotely control a controlled device through voice control commands and to solve the problems that traditional mechanical remote control is complicated to operate and slow to respond.
To achieve the above object, the present invention provides a voice remote control method applied to a controlled device. The voice remote control method comprises the following steps:
Receiving first audio data sent by a remote terminal, the first audio data being obtained by the remote terminal by processing an acquired user voice remote control instruction;
Processing the first audio data according to a preset rule to obtain second audio data;
Sending the second audio data to a cloud server;
Receiving a control command text issued by the cloud server, parsing the control command text to obtain a control command, and executing the control command, the control command text being obtained by the cloud server by processing the second audio data.
Optionally, before the step of sending the second audio data to the cloud server, the method further comprises:
Creating a Socket connection and sending a connection request to the cloud server;
Receiving the cloud server's response to the connection request, and establishing the Socket connection with the cloud server.
Optionally, the step of processing the first audio data according to a preset rule to obtain second audio data comprises:
Obtaining a preset audio optimization standard;
Optimizing the first audio data based on the obtained audio optimization standard, and using the optimized first audio data as the second audio data.
Optionally, before the step of receiving the first audio data sent by the remote terminal, the method further comprises:
Detecting whether a preset write instruction has been received;
If so, proceeding to the step of receiving the first audio data sent by the remote terminal.
Optionally, before the step of sending the second audio data to the cloud server, the method further comprises:
Detecting whether a preset read instruction has been received;
If so, proceeding to the step of sending the second audio data to the cloud server.
Optionally, the step of receiving the first audio data sent by the remote terminal comprises:
In response to a Bluetooth pairing request sent by the remote terminal, establishing a Bluetooth connection with the remote terminal;
Receiving, over the Bluetooth connection, the first audio data sent by the remote terminal.
In addition, the present invention also provides a voice remote control system, which comprises a remote terminal, a controlled device and a cloud server;
The remote terminal is configured to acquire a user voice remote control instruction, to perform analog-to-digital conversion on the voice remote control instruction based on a preset condition to obtain first audio data, and to send the first audio data to the controlled device;
The controlled device is configured to, after receiving the first audio data sent by the remote terminal, process the first audio data according to a preset rule to obtain second audio data, and send the second audio data to the cloud server;
The cloud server is configured to, after receiving the second audio data, recognize the second audio data according to a preset recognition rule to generate a control command text, and send the control command text to the controlled device;
The controlled device is further configured to receive the control command text issued by the cloud server, parse the control command text to obtain a control command, and execute the control command.
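As a non-limiting illustration, the data flow on the controlled device side may be sketched as follows. The function name `controlled_device_pipeline` and its four callable parameters are assumptions introduced only for this sketch; the disclosure does not prescribe any particular programming interface.

```python
from typing import Callable


def controlled_device_pipeline(
    first_audio: bytes,
    preprocess: Callable[[bytes], bytes],
    recognize_in_cloud: Callable[[bytes], str],
    parse: Callable[[str], dict],
    execute: Callable[[dict], None],
) -> dict:
    """Run the four steps performed by the controlled device (hypothetical API)."""
    # Process the first audio data according to a preset rule.
    second_audio = preprocess(first_audio)
    # Send the second audio data to the cloud server and get the command text back.
    command_text = recognize_in_cloud(second_audio)
    # Parse the control command text into a control command.
    command = parse(command_text)
    # Execute the control command.
    execute(command)
    return command
```

Passing each stage in as a callable keeps the sketch independent of the actual audio processing, transport and command format, all of which the embodiments below describe as options.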
Optionally, the remote terminal comprises:
A receiving unit, configured to receive a start-recording instruction or a stop-recording instruction input by the user, and to send the received start-recording instruction or stop-recording instruction to the recording unit;
A recording unit, configured to, after receiving the start-recording instruction, detect the user voice remote control instruction and record the detected voice remote control instruction; the recording unit is further configured to, after receiving the stop-recording instruction, stop recording and save the recorded user voice remote control instruction;
A processing unit, configured to perform analog-to-digital conversion on the voice remote control instruction to obtain the first audio data;
A sending unit, configured to send the first audio data to the controlled device.
In addition, to achieve the above object, the present invention also provides a controlled device, which comprises:
A receiving module, configured to receive first audio data sent by a remote terminal, the first audio data being obtained by the remote terminal by processing an acquired user voice remote control instruction;
A processing module, configured to process the first audio data according to a preset rule to obtain second audio data;
An uploading module, configured to send the second audio data to a cloud server;
An execution module, configured to receive a control command text issued by the cloud server, parse the control command text to obtain a control command, and execute the control command, the control command text being obtained by the cloud server by processing the second audio data.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium on which a voice remote control program is stored; when executed by a processor, the voice remote control program implements the steps of the voice remote control method described above.
In the present invention, first audio data sent by a remote terminal is received, the first audio data being obtained by the remote terminal by processing an acquired user voice remote control instruction; the first audio data is processed according to a preset rule to obtain second audio data; the second audio data is sent to a cloud server; a control command text issued by the cloud server is received, the control command text is parsed to obtain a control command, and the control command is executed, the control command text being obtained by the cloud server by processing the second audio data. This effectively solves the problems that, with a traditional mechanical remote control, the user has to press the remote control repeatedly to open multi-level menus and search the program list item by item for the desired program or look for the button of the parameter to be adjusted, that the search is cumbersome, and that the adjustment cannot meet the requirement of real-time response. With the voice remote control method of the present invention, the controlled device performs operations directly based on the user's voice remote control instruction, which greatly improves the convenience of operation and meets the demand for real-time response of adjustment operations.
Brief description of the drawings
Fig. 1 is a structural schematic diagram of the hardware operating environment involved in the embodiments of the present invention;
Fig. 2 is a flow diagram of the first embodiment of the voice remote control method of the present invention;
Fig. 3 is a flow diagram of the second embodiment of the voice remote control method of the present invention;
Fig. 4 is a flow diagram of the third embodiment of the voice remote control method of the present invention;
Fig. 5 is a flow diagram of the fourth embodiment of the voice remote control method of the present invention;
Fig. 6 is a flow diagram of the fifth embodiment of the voice remote control method of the present invention.
The realization of the object, the functional features and the advantages of the present invention are further described below with reference to the accompanying drawings and in conjunction with the embodiments.
Detailed description of the embodiments
It should be understood that the specific embodiments described herein merely illustrate the present invention and are not intended to limit it.
As shown in Fig. 1, Fig. 1 is a structural schematic diagram of the hardware operating environment involved in the embodiments of the present invention.
It should be noted that Fig. 1 may be a structural schematic diagram of the hardware operating environment of voice remote control equipment. The voice remote control equipment of the embodiments of the present invention may be a terminal device such as a PC or a portable computer.
As shown in Fig. 1, the voice remote control equipment may include a processor 1001 (such as a CPU), a network interface 1004, a user interface 1003, a memory 1005 and a communication bus 1002. The communication bus 1002 provides the connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as a magnetic disk memory. The memory 1005 may optionally also be a storage device independent of the aforementioned processor 1001.
Those skilled in the art will understand that the structure of the voice remote control equipment shown in Fig. 1 does not limit the voice remote control equipment, which may include more or fewer components than illustrated, combine certain components, or use a different arrangement of components.
As shown in Fig. 1, the memory 1005, as a kind of computer storage medium, may include an operating system, a network communication module, a user interface module and a voice remote control program. The operating system is a program that manages and controls the hardware and software resources of the voice remote control equipment and supports the running of the voice remote control program and other software or programs.
In the voice remote control equipment shown in Fig. 1, the user interface 1003 is mainly used for data communication with each terminal; the network interface 1004 is mainly used for connecting to a background server and communicating data with it; and the processor 1001 may be used to call the voice remote control program stored in the memory 1005 and perform the following operations:
Receiving first audio data sent by a remote terminal, the first audio data being obtained by the remote terminal by processing an acquired user voice remote control instruction;
Processing the first audio data according to a preset rule to obtain second audio data;
Sending the second audio data to a cloud server;
Receiving a control command text issued by the cloud server, parsing the control command text to obtain a control command, and executing the control command, the control command text being obtained by the cloud server by processing the second audio data.
Further, before the step of sending the second audio data to the cloud server, the processor 1001 may also be used to call the voice remote control program stored in the memory 1005 and perform the following steps:
Creating a Socket connection and sending a connection request to the cloud server;
Receiving the cloud server's response to the connection request, and establishing the Socket connection with the cloud server.
Further, the processor 1001 may also be used to call the voice remote control program stored in the memory 1005 and perform the following steps:
Obtaining a preset audio optimization standard;
Optimizing the first audio data based on the obtained audio optimization standard, and using the optimized first audio data as the second audio data.
Further, before the step of receiving the first audio data sent by the remote terminal, the processor 1001 may also be used to call the voice remote control program stored in the memory 1005 and perform the following steps:
Detecting whether a preset write instruction has been received;
If so, proceeding to the step of receiving the first audio data sent by the remote terminal.
Further, before the step of sending the second audio data to the cloud server, the processor 1001 may also be used to call the voice remote control program stored in the memory 1005 and perform the following steps:
Detecting whether a preset read instruction has been received;
If so, proceeding to the step of sending the second audio data to the cloud server.
Further, the processor 1001 may also be used to call the voice remote control program stored in the memory 1005 and perform the following steps:
In response to a Bluetooth pairing request sent by the remote terminal, establishing a Bluetooth connection with the remote terminal;
Receiving, over the Bluetooth connection, the first audio data sent by the remote terminal.
Based on the above structure, the embodiments of the voice remote control method of the present invention are proposed.
Referring to Fig. 2, Fig. 2 is a flow diagram of the first embodiment of the voice remote control method of the present invention.
The embodiments of the present invention provide an embodiment of the voice remote control method. It should be noted that, although a logical order is shown in the flow diagram, in some cases the steps shown or described may be performed in an order different from the one shown here.
The voice remote control method of the embodiments of the present invention is applied to a controlled device. The controlled device may be a terminal device such as a smart TV or a digital TV set-top box, and is not specifically limited here.
The voice remote control method of the present embodiment includes:
Step S100: receiving first audio data sent by a remote terminal, wherein the first audio data is obtained by the remote terminal by processing an acquired user voice remote control instruction;
At present, most consumer electronic devices are controlled by the user through a mechanical remote control. For example, when watching television, the user has to operate the remote control manually to tune channels, adjust the volume, switch programs, switch signal sources, open or close applications, power the set on or off, adjust television picture and audio parameters, and so on. However, the user often has to press the remote control repeatedly to open multi-level menus, and then search the program list item by item for the desired program or look for the button of the parameter to be adjusted. This search is cumbersome and inconvenient for the user, and the adjustment cannot meet the requirement of real-time response.
In the present embodiment, as one implementation, the remote terminal has a built-in microphone input module. After the microphone input module acquires the user voice remote control instruction, the MCU (Microcontroller Unit) built into the remote terminal processes the acquired instruction. The voice remote control instruction is an analog signal, and the processing may include identifying keywords to extract the main part of the voice instruction, sampling the extracted main part, PDM (Pulse Density Modulation) modulation, MCU encoding, and so on. The analog voice remote control instruction is thereby converted into a digital signal that forms DMA (Direct Memory Access) data, that is, the first audio data, and the obtained first audio data is sent to the controlled device.
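By way of illustration only, the PDM modulation mentioned above may be sketched as a first-order delta-sigma modulator. This is a generic textbook form of pulse density modulation, not necessarily the modulation performed by the MCU of the remote terminal; the function name `pdm_encode` and the normalization of samples to the range [-1, 1] are assumptions of this sketch.

```python
def pdm_encode(samples):
    """Encode samples in [-1.0, 1.0] as a list of 0/1 bits (first-order delta-sigma)."""
    bits = []
    integrator = 0.0  # running difference between the input and the emitted pulses
    for x in samples:
        integrator += x
        # Emit +1 when the accumulated input is non-negative, otherwise -1.
        out = 1 if integrator >= 0 else -1
        # Feed the emitted pulse back so the pulse density tracks the input level.
        integrator -= out
        bits.append(1 if out == 1 else 0)
    return bits
```

In this sketch the fraction of 1 bits tracks the input amplitude: a constant input of 0.5 yields three 1 bits in every four, and a constant input of 0.0 yields alternating bits.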
In the present embodiment, the controlled device receives the first audio data sent by the remote terminal. As one implementation, the remote terminal and the controlled device establish a wireless connection, such as a Bluetooth connection, and the remote terminal transmits the first audio data to the controlled device over that wireless connection.
Step S200: processing the first audio data according to a preset rule to obtain second audio data;
Specifically, after the controlled device receives the first audio data sent by the remote terminal, it processes the first audio data. As one implementation, the first audio data is denoised through ALSA (Advanced Linux Sound Architecture) to generate a PCM file, and the generated PCM file, that is, the second audio data, is then uploaded to the cloud server. The second audio data is the processed recording file stream.
Step S300: sending the second audio data to the cloud server;
In the present embodiment, as one implementation, the second audio data is uploaded using the WebSocket mechanism. The controlled device creates a Socket connection and issues a file transmission request to the cloud server. After the cloud server receives the recording file stream processed by the controlled device, that is, the second audio data, it recognizes the second audio data through a text recognition engine server, generates a text recognition stream, that is, the control command text, according to the recognition result, and sends the generated control command text to the controlled device.
It should be noted that the WebSocket mechanism used in the present embodiment avoids the problems that arise when the controlled device sends data to the cloud through HTTP requests. Because an HTTP client must wait synchronously for the server, the network overhead is large, and data transmission by the controlled device faces many problems, for example, in an unstable network, how to guarantee that data is transmitted correctly, how to guarantee that data is not transmitted repeatedly, and how to reconnect after a connection is dropped. In the present embodiment, the controlled device and the cloud server establish a connection based on the WebSocket mechanism, which avoids these problems of HTTP transmission.
Step S400: receiving the control command text issued by the cloud server, parsing the control command text to obtain a control command, and executing the control command, wherein the control command text is obtained by the cloud server by processing the second audio data.
Further, after the controlled device receives the control command text from the cloud server, it parses the control command text to obtain a control command and executes the control command. Taking a smart TV as the controlled device, if the parsed control command is a volume adjustment command, the controlled device, after receiving the command text, calls the TV system API to perform the volume adjustment. Similar commands include power on/off (Power On/Off), mute (Mute), changing channels (Change channel), opening an application (Open YouTube), and so on.
In the present embodiment, since the control command text includes character strings and numerical values, the control command text is expressed in JSON (JavaScript Object Notation) format. It can be understood that, in other embodiments, the control command text may take other forms of expression, which are not specifically limited here.
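For illustration, parsing and executing a JSON control command text might look like the sketch below. The JSON field names (`action`, `value`), the action names and the `FakeTV` class are hypothetical; the disclosure states only that the text is in JSON format and that a TV system API is called, without defining a schema or that API.

```python
import json


class FakeTV:
    """Hypothetical stand-in for the TV system API referred to in the text."""

    def __init__(self):
        self.volume = 10
        self.muted = False
        self.channel = 1

    def set_volume(self, value):
        # Clamp to an assumed 0..100 volume range.
        self.volume = max(0, min(100, int(value)))

    def set_mute(self, muted):
        self.muted = bool(muted)

    def change_channel(self, number):
        self.channel = int(number)


def execute_command_text(tv, command_text):
    """Parse a JSON control command text and run the matching TV operation."""
    command = json.loads(command_text)
    action = command["action"]
    # Dispatch table: assumed action name -> operation on the TV.
    handlers = {
        "set_volume": lambda: tv.set_volume(command["value"]),
        "mute": lambda: tv.set_mute(True),
        "change_channel": lambda: tv.change_channel(command["value"]),
    }
    if action not in handlers:
        raise ValueError(f"unsupported action: {action}")
    handlers[action]()
    return action
```

A dispatch table keeps the mapping from command names to operations in one place, so that further commands such as power on/off or opening an application could be added without changing the parsing logic.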
In the present invention, first audio data sent by a remote terminal is received, the first audio data being obtained by the remote terminal by processing an acquired user voice remote control instruction; the first audio data is processed according to a preset rule to obtain second audio data; the second audio data is sent to a cloud server; a control command text issued by the cloud server is received, the control command text is parsed to obtain a control command, and the control command is executed, the control command text being obtained by the cloud server by processing the second audio data. Thus, the remote terminal samples and encodes the user voice remote control instruction, or digitizes it with another data processing method, to form DMA data, that is, the first audio data, and transfers the first audio data to the controlled device over the established wireless connection. The controlled device processes the first audio data again, for example generating a PCM file, that is, the second audio data, through ALSA noise reduction, and uploads the second audio data to the cloud server through the WebSocket mechanism. The cloud server performs text recognition on the second audio data and sends the recognized control command text, which may be in JSON format, to the controlled device. Finally, the controlled device parses the received JSON text and performs the corresponding action. This effectively solves the problems that, with a traditional mechanical remote control, the user has to press the remote control repeatedly to open multi-level menus and search the program list item by item for the desired program or look for the button of the parameter to be adjusted, that the search is cumbersome, and that the adjustment cannot meet the requirement of real-time response. With the voice remote control method of the present invention, the controlled device performs operations directly based on the user's voice remote control instruction, which greatly improves the convenience of operation, especially for elderly users and users with limited mobility, and also meets the demand for real-time response of adjustment operations.
Further, a second embodiment of the voice remote control method of the present invention is proposed.
Referring to Fig. 3, Fig. 3 is a flow diagram of the second embodiment of the voice remote control method of the present invention. Based on the first embodiment of the voice remote control method described above, in the present embodiment, before step S300 of sending the second audio data to the cloud server, the method further includes:
Step S201: creating a Socket connection and sending a connection request to the cloud server;
Step S202: receiving the cloud server's response to the connection request, and establishing the Socket connection with the cloud server.
Sending data over the HTTP protocol has the drawback that the HTTP client must wait synchronously for the server, which causes a large network overhead for the device. Data transmission by smart devices also faces many problems, for example, in an unstable network, how to guarantee that data is transmitted correctly, how to guarantee that data is not transmitted repeatedly, and how to reconnect after a connection is dropped. HTTP cannot solve such problems.
In the present embodiment, the recording file, that is, the second audio data, is uploaded using the WebSocket mechanism. The controlled device creates a Socket connection and issues a file transmission request to the cloud server, and the cloud server receives the recording file and recognizes it as text. The detailed process is as follows: the controlled device creates a Socket connection and sends a request to the cloud server; the cloud server creates a server-side Socket to listen for the request; and the connection between the controlled device and the cloud server is established. The controlled device sends the recording file stream, that is, the second audio data, to the cloud server. After receiving the recording file stream, the cloud server recognizes it as text through the text recognition engine server, forming a text recognition stream, that is, the control command text, and sends the text recognition stream to the controlled device. The controlled device receives the control command text issued by the cloud server, parses the control command text to obtain a control command, and executes the control command. The Socket connection is then closed and its resources are released.
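The exchange described above can be sketched with plain stream sockets from the Python standard library. Note the assumptions: this sketch does not implement the WebSocket protocol itself, the 4-byte length prefix is a framing convention invented here, and `fake_cloud_server` is a stand-in that "recognizes" any audio as a mute command so that the example is self-contained.

```python
import json
import socket
import struct
import threading


def send_frame(sock, payload):
    # Assumed framing: 4-byte big-endian length followed by the payload.
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def recv_exact(sock, n):
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed early")
        buf += chunk
    return buf


def recv_frame(sock):
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def fake_cloud_server(sock):
    """Stand-in for the cloud server and its text recognition engine."""
    with sock:
        audio = recv_frame(sock)  # the recording file stream
        reply = json.dumps({"action": "mute", "audio_bytes": len(audio)})
        send_frame(sock, reply.encode("utf-8"))  # the control command text


def upload_and_receive(second_audio):
    """Send the second audio data and return the control command text."""
    device_sock, server_sock = socket.socketpair()  # connected pair, no network needed
    server = threading.Thread(target=fake_cloud_server, args=(server_sock,))
    server.start()
    try:
        send_frame(device_sock, second_audio)
        command_text = recv_frame(device_sock).decode("utf-8")
    finally:
        device_sock.close()  # close the connection and release resources
        server.join()
    return command_text
```

In a real deployment the socket pair would be replaced by a connection to the cloud server's address, and the framing would follow whatever protocol the server actually speaks.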
The WebSocket mechanism used in the present embodiment avoids the problems that arise when the controlled device sends data to the cloud through HTTP requests. Because an HTTP client must wait synchronously for the server, the network overhead is large, and data transmission by the controlled device faces many problems, for example, in an unstable network, how to guarantee that data is transmitted correctly, how to guarantee that data is not transmitted repeatedly, and how to reconnect after a connection is dropped. In the present embodiment, the controlled device and the cloud server establish a connection based on the WebSocket mechanism, which avoids these problems of HTTP transmission.
Further, a third embodiment of the voice remote control method of the present invention is proposed.
Referring to Fig. 4, Fig. 4 is a flow diagram of the third embodiment of the voice remote control method of the present invention. Based on the second embodiment of the voice remote control method described above, in the present embodiment, step S200 of processing the first audio data according to a preset rule to obtain second audio data includes:
Step S210: obtaining a preset audio optimization standard;
Step S220: optimizing the first audio data based on the obtained audio optimization standard, and using the optimized first audio data as the second audio data.
In the present embodiment, after the controlled device receives the DMA data, that is, the first audio data, sent by the remote terminal, the main chip of the controlled device processes the first audio data. As one implementation, the microphone input module built into the remote terminal acquires the ambient noise parameters of the current scene together with the user voice remote control instruction, so that the first audio data includes both the digitized user voice remote control instruction and the ambient noise parameters of the current scene. After receiving the first audio data, the controlled device retrieves, according to the ambient noise parameters contained in the first audio data, a preset anti-phase noise signal that matches those parameters and uses it to cancel the ambient noise of the current environment, thereby denoising the first audio data. The denoised first audio data is uploaded to the cloud server as the second audio data. It can be understood that, in other embodiments, the audio optimization standard may be implemented in other ways and is not limited to the implementation described in the present embodiment.
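As a simplified illustration of the anti-phase cancellation described above, the sketch below looks up a preset anti-phase noise signal by a noise profile name and adds it to the audio samples. The preset table `ANTI_NOISE_PRESETS`, the profile name `fan_hum`, the sample values and the function name `denoise` are all hypothetical; the disclosure does not specify how noise parameters are represented or matched.

```python
# Hypothetical preset table: noise profile name -> phase-inverted noise samples.
ANTI_NOISE_PRESETS = {
    "fan_hum": [-0.25, -0.25, 0.5],
}


def denoise(samples, noise_profile):
    """Add the matching anti-phase noise signal to cancel periodic ambient noise."""
    anti_noise = ANTI_NOISE_PRESETS[noise_profile]
    # The preset is repeated cyclically so that it covers the whole recording.
    return [s + anti_noise[i % len(anti_noise)] for i, s in enumerate(samples)]
```

This assumes the ambient noise is periodic and aligned with the preset; under that assumption, adding the inverted waveform removes the noise component and leaves the voice samples.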
In the present embodiment, the controlled device receives the first audio data sent by the remote terminal, obtains a preset audio optimization standard, optimizes the first audio data based on the obtained audio optimization standard, and uses the optimized first audio data as the second audio data. It then creates a Socket connection and sends a connection request to the Cloud Server, receives the Cloud Server's response to the connection request, and establishes the Socket connection with the Cloud Server. It sends the second audio data to the Cloud Server, receives the control command text issued by the Cloud Server, parses the control command text to obtain a control command, and executes the control command. As a result, while improving convenience of operation for the user and meeting the need for real-time response to adjustment operations, the accuracy of voice control command recognition is improved and the validity of voice control is ensured.
Further, a fourth embodiment of the voice remote control method of the present invention is proposed.
Referring to Fig. 5, Fig. 5 is a flow diagram of the fourth embodiment of the voice remote control method of the present invention. Based on the third embodiment of the voice remote control method described above, in the present embodiment, before step S100, the step of receiving the first audio data sent by the remote terminal, the method further includes:
Step S101: detecting whether a preset write instruction has been received;
If so, proceeding to step S100: receiving the first audio data sent by the remote terminal.
Further, in the present embodiment, before step S300, the step of sending the second audio data to the Cloud Server, the method further includes:
Step S301: detecting whether a preset read instruction has been received;
If so, proceeding to step S300: sending the second audio data to the Cloud Server.
In the present embodiment, the controlled device uses the ALSA (Advanced Linux Sound Architecture) audio driver, and ALSA supports Bluetooth sound devices. ALSA read and write operations are triggered when a user-defined function issues the corresponding write or read instruction. In the present embodiment, after detecting that the preset write instruction has been received, the controlled device receives the first audio data sent by the remote terminal; after detecting that the preset read instruction has been received, it sends the second audio data to the Cloud Server.
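The gating by write and read instructions can be pictured with the small sketch below. It does not call the ALSA API; the class `AudioGate` and its method names are hypothetical, and clearing each flag after one use is an assumption made for the illustration rather than something stated in this application.

```python
from typing import Callable


class AudioGate:
    """Hypothetical gate: data moves only after the matching instruction."""

    def __init__(self) -> None:
        self.write_requested = False
        self.read_requested = False
        self._buffer = b""

    def on_write_instruction(self) -> None:
        """Record that the preset write instruction was received (step S101)."""
        self.write_requested = True

    def on_read_instruction(self) -> None:
        """Record that the preset read instruction was received (step S301)."""
        self.read_requested = True

    def receive(self, audio: bytes) -> bool:
        """Accept audio from the remote terminal only after a write instruction."""
        if not self.write_requested:
            return False
        self._buffer = audio
        self.write_requested = False  # assumption: one instruction per transfer
        return True

    def send(self, uploader: Callable[[bytes], None]) -> bool:
        """Upload buffered audio only after a read instruction."""
        if not self.read_requested:
            return False
        uploader(self._buffer)
        self.read_requested = False
        return True
```

Here `uploader` stands in for whatever function sends the second audio data to the Cloud Server; any processing between receiving and sending is omitted for brevity.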
Further, a fifth embodiment of the voice remote control method of the present invention is proposed.
Referring to Fig. 6, Fig. 6 is a flow diagram of the fifth embodiment of the voice remote control method of the present invention. Based on the first embodiment of the voice remote control method described above, in the present embodiment, step S100, the step of receiving the first audio data sent by the remote terminal, includes:
Step S110: in response to a Bluetooth pairing request sent by the remote terminal, establishing a Bluetooth connection with the remote terminal;
Step S120: based on the Bluetooth connection, receiving the first audio data sent by the remote terminal.
In the present embodiment, as one implementation, a first Bluetooth module is built into the remote terminal and a second Bluetooth module is built into the controlled device. The first Bluetooth module establishes a wireless connection with the second Bluetooth module through searching, scanning, and pairing. Based on the established Bluetooth connection, the remote terminal transmits the queued original audio data to the controlled device over Bluetooth; that is, the remote terminal sends the first audio data to the controlled device.
It should be noted that, in other embodiments, the wireless connection between the remote terminal and the controlled device is not limited to a Bluetooth connection and may be another type of wireless connection; the present embodiment imposes no particular limitation.
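The queued transmission of audio data can be sketched independently of the wireless link. In the sketch below, the packet size of 20 bytes is an arbitrary illustrative value and the `send` callable is a hypothetical stand-in for the link-layer write; neither comes from this application or from a specific Bluetooth API.

```python
from collections import deque
from typing import Callable, Deque


def enqueue_audio(audio: bytes, packet_size: int = 20) -> Deque[bytes]:
    """Split recorded audio into fixed-size packets held in a FIFO queue."""
    if packet_size <= 0:
        raise ValueError("packet_size must be positive")
    return deque(
        audio[i:i + packet_size] for i in range(0, len(audio), packet_size)
    )


def drain(queue: Deque[bytes], send: Callable[[bytes], None]) -> int:
    """Send packets in order over the established link; return the count."""
    count = 0
    while queue:
        send(queue.popleft())
        count += 1
    return count
```

Because packets leave the queue in the order they were recorded, the receiver can reassemble the first audio data by simple concatenation.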
In addition, an embodiment of the present invention also proposes a voice remote control system, the voice remote control system including a remote terminal, a controlled device, and a Cloud Server;
The remote terminal is configured to obtain a user voice remote control instruction based on a preset condition, to perform analog-to-digital conversion on the voice remote control instruction to obtain first audio data, and to send the first audio data to the controlled device;
The controlled device is configured to, after receiving the first audio data sent by the remote terminal, process the first audio data according to preset rules to obtain second audio data, and to send the second audio data to the Cloud Server;
The Cloud Server is configured to, after receiving the second audio data, recognize the second audio data according to a preset recognition rule, generate a control command text, and send the control command text to the controlled device;
The controlled device is further configured to receive the control command text issued by the Cloud Server, parse the control command text to obtain a control command, and execute the control command.
Preferably, the remote terminal includes:
A receiving unit, configured to receive a start-recording instruction or a stop-recording instruction input by the user, and to forward the received start-recording instruction or stop-recording instruction to the recording unit;
A recording unit, configured to, after receiving the start-recording instruction, detect the user voice remote control instruction and record the detected voice remote control instruction; the recording unit is further configured to, after receiving the stop-recording instruction, stop the recording action and save the recorded user voice remote control instruction;
A processing unit, configured to perform analog-to-digital conversion on the voice remote control instruction to obtain the first audio data;
A transmission unit, configured to send the first audio data to the controlled device.
In the present embodiment, as one implementation, the remote terminal has a physical voice key or a touch voice key that triggers capture of the user voice remote control instruction. When the user needs to record a voice remote control instruction, pressing the voice key starts recording and releasing the voice key stops recording. Only the relevant data is therefore captured, which avoids the unnecessary recognition load and transmission bandwidth load that would result if the remote terminal continuously listened for voice commands, and improves the control accuracy of voice remote control instructions.
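The press-to-record behavior of the voice key can be sketched as a small state holder. The class and method names below are hypothetical, and audio frames are represented as plain byte strings; the sketch only shows that frames arriving while the key is not held are discarded.

```python
from typing import List, Optional


class PushToTalkRecorder:
    """Hypothetical recorder: captures audio only while the voice key is held."""

    def __init__(self) -> None:
        self.recording = False
        self._frames: List[bytes] = []
        self.saved: Optional[bytes] = None

    def on_key_down(self) -> None:
        """Start-recording instruction: begin a fresh capture."""
        self._frames = []
        self.recording = True

    def on_audio_frame(self, frame: bytes) -> None:
        """Keep a microphone frame only while recording is active."""
        if self.recording:
            self._frames.append(frame)

    def on_key_up(self) -> bytes:
        """Stop-recording instruction: stop and save the captured audio."""
        self.recording = False
        self.saved = b"".join(self._frames)
        return self.saved
```

The value returned by `on_key_up` corresponds to the saved voice remote control instruction that would then be digitized and sent to the controlled device.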
When the components of the voice remote control system proposed in the present embodiment run, they implement the steps of the voice remote control method described above, which are not repeated here.
In addition, an embodiment of the present invention also proposes a controlled device, the controlled device including:
A receiving module, configured to receive first audio data sent by a remote terminal, the first audio data being obtained by the remote terminal by processing an acquired user voice remote control instruction;
A processing module, configured to process the first audio data according to preset rules to obtain second audio data;
An uploading module, configured to send the second audio data to a Cloud Server;
An execution module, configured to receive a control command text issued by the Cloud Server, parse the control command text to obtain a control command, and execute the control command; the control command text is obtained by the Cloud Server by processing the second audio data.
When the modules of the controlled device proposed in the present embodiment run, they implement the steps of the voice remote control method described above, which are not repeated here.
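The four modules of the controlled device can be sketched as one small class. This application does not specify the format of the control command text, so the JSON layout with `action` and `params` fields is an assumption made purely for illustration, as are the class name, the `cloud` callable, and the handler table.

```python
import json
from typing import Any, Callable, Dict


class ControlledDevice:
    """Hypothetical controlled device with receive/process/upload/execute steps."""

    def __init__(
        self,
        cloud: Callable[[bytes], str],
        handlers: Dict[str, Callable[..., Any]],
    ) -> None:
        self.cloud = cloud        # stand-in for the Cloud Server round trip
        self.handlers = handlers  # command name -> function that executes it

    def receive(self, first_audio: bytes) -> bytes:
        """Receiving module: accept the first audio data."""
        return first_audio

    def process(self, first_audio: bytes) -> bytes:
        """Processing module: placeholder for the preset rule (e.g. denoising)."""
        return first_audio

    def upload(self, second_audio: bytes) -> str:
        """Uploading module: send audio, get back the control command text."""
        return self.cloud(second_audio)

    def execute(self, command_text: str) -> Any:
        """Execution module: parse the (assumed JSON) text and run the command."""
        command = json.loads(command_text)
        handler = self.handlers[command["action"]]
        return handler(**command.get("params", {}))

    def handle(self, first_audio: bytes) -> Any:
        """Run the full chain for one voice instruction."""
        second_audio = self.process(self.receive(first_audio))
        return self.execute(self.upload(second_audio))
```

A caller would supply a `cloud` function that performs the real network exchange and a handler per supported command, for example a volume or channel handler on a television.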
In addition, an embodiment of the present invention also proposes a computer readable storage medium on which a voice remote control program is stored; when the voice remote control program is executed by a processor, the steps of the voice remote control method described above are implemented.
For the method implemented when the voice remote control program running on the processor is executed, reference may be made to the embodiments of the voice remote control method of the present invention, which are not repeated here.
It should be noted that, in this document, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to that process, method, article, or device. In the absence of further restrictions, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
The serial numbers of the above embodiments of the invention are for description only and do not represent the relative merits of the embodiments.
From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and can of course also be implemented by hardware, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not limit the scope of the invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and accompanying drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of protection of the present invention.

Claims (10)

1. A voice remote control method, characterized in that it is applied to a controlled device, the voice remote control method comprising the following steps:
receiving first audio data sent by a remote terminal, the first audio data being obtained by the remote terminal by processing an acquired user voice remote control instruction;
processing the first audio data according to preset rules to obtain second audio data;
sending the second audio data to a Cloud Server;
receiving a control command text issued by the Cloud Server, parsing the control command text to obtain a control command, and executing the control command; the control command text being obtained by the Cloud Server by processing the second audio data.
2. The voice remote control method according to claim 1, characterized in that, before the step of sending the second audio data to the Cloud Server, the method further comprises:
creating a Socket connection and sending a connection request to the Cloud Server;
receiving the Cloud Server's response to the connection request, and establishing the Socket connection with the Cloud Server.
3. The voice remote control method according to claim 2, characterized in that the step of processing the first audio data according to preset rules to obtain second audio data comprises:
obtaining a preset audio optimization standard;
optimizing the first audio data based on the obtained audio optimization standard, and using the optimized first audio data as the second audio data.
4. The voice remote control method according to any one of claims 1-3, characterized in that, before the step of receiving the first audio data sent by the remote terminal, the method further comprises:
detecting whether a preset write instruction has been received;
if so, proceeding to the step of receiving the first audio data sent by the remote terminal.
5. The voice remote control method according to claim 4, characterized in that, before the step of sending the second audio data to the Cloud Server, the method further comprises:
detecting whether a preset read instruction has been received;
if so, proceeding to the step of sending the second audio data to the Cloud Server.
6. The voice remote control method according to any one of claims 1-3, characterized in that the step of receiving the first audio data sent by the remote terminal comprises:
in response to a Bluetooth pairing request sent by the remote terminal, establishing a Bluetooth connection with the remote terminal;
based on the Bluetooth connection, receiving the first audio data sent by the remote terminal.
7. A voice remote control system, characterized in that the voice remote control system comprises a remote terminal, a controlled device, and a Cloud Server;
the remote terminal is configured to obtain a user voice remote control instruction based on a preset condition, to perform analog-to-digital conversion on the voice remote control instruction to obtain first audio data, and to send the first audio data to the controlled device;
the controlled device is configured to, after receiving the first audio data sent by the remote terminal, process the first audio data according to preset rules to obtain second audio data, and to send the second audio data to the Cloud Server;
the Cloud Server is configured to, after receiving the second audio data, recognize the second audio data according to a preset recognition rule, generate a control command text, and send the control command text to the controlled device;
the controlled device is further configured to receive the control command text issued by the Cloud Server, parse the control command text to obtain a control command, and execute the control command.
8. The voice remote control system according to claim 7, characterized in that the remote terminal comprises:
a receiving unit, configured to receive a start-recording instruction or a stop-recording instruction input by the user, and to forward the received start-recording instruction or stop-recording instruction to the recording unit;
a recording unit, configured to, after receiving the start-recording instruction, detect the user voice remote control instruction and record the detected voice remote control instruction, the recording unit being further configured to, after receiving the stop-recording instruction, stop the recording action and save the recorded user voice remote control instruction;
a processing unit, configured to perform analog-to-digital conversion on the voice remote control instruction to obtain the first audio data;
a transmission unit, configured to send the first audio data to the controlled device.
9. A controlled device, characterized in that the controlled device comprises:
a receiving module, configured to receive first audio data sent by a remote terminal, the first audio data being obtained by the remote terminal by processing an acquired user voice remote control instruction;
a processing module, configured to process the first audio data according to preset rules to obtain second audio data;
an uploading module, configured to send the second audio data to a Cloud Server;
an execution module, configured to receive a control command text issued by the Cloud Server, parse the control command text to obtain a control command, and execute the control command; the control command text being obtained by the Cloud Server by processing the second audio data.
10. A computer readable storage medium, characterized in that a voice remote control program is stored on the computer readable storage medium, and when the voice remote control program is executed by a processor, the steps of the voice remote control method according to any one of claims 1 to 6 are implemented.
CN201811599357.0A 2018-12-25 2018-12-25 Voice remote control method, system, controlled device and computer readable storage medium Pending CN109637534A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811599357.0A CN109637534A (en) 2018-12-25 2018-12-25 Voice remote control method, system, controlled device and computer readable storage medium
PCT/CN2019/079991 WO2020133764A1 (en) 2018-12-25 2019-03-28 Speech remote control method and system, and controlled apparatus and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811599357.0A CN109637534A (en) 2018-12-25 2018-12-25 Voice remote control method, system, controlled device and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN109637534A true CN109637534A (en) 2019-04-16

Family

ID=66077687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811599357.0A Pending CN109637534A (en) 2018-12-25 2018-12-25 Voice remote control method, system, controlled device and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN109637534A (en)
WO (1) WO2020133764A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110366018A (en) * 2019-07-12 2019-10-22 杭州任你说智能科技有限公司 A kind of two-way interactive remote control system and operating method for TV
CN111263100A (en) * 2020-01-19 2020-06-09 中移(杭州)信息技术有限公司 Video call method, device, equipment and storage medium
CN111863041A (en) * 2020-07-17 2020-10-30 东软集团股份有限公司 Sound signal processing method, device and equipment
CN112802464A (en) * 2019-11-14 2021-05-14 阿里巴巴集团控股有限公司 Voice remote control method, remote control terminal and server
CN112792439A (en) * 2020-12-30 2021-05-14 唐山松下产业机器有限公司 Speech recognition welding system
CN113205810A (en) * 2021-05-06 2021-08-03 北京汇钧科技有限公司 Voice signal processing method, device, medium, remote controller and server

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101938391A (en) * 2010-08-31 2011-01-05 中山大学 Voice processing method, system, remote controller, set-top box and cloud server
CN107635214A (en) * 2017-08-21 2018-01-26 深圳创维-Rgb电子有限公司 Response method, device, system and readable storage medium storing program for executing based on blue Tooth remote controller
CN108121528A (en) * 2017-12-06 2018-06-05 深圳市欧瑞博科技有限公司 Sound control method, device, server and computer readable storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120030712A1 (en) * 2010-08-02 2012-02-02 At&T Intellectual Property I, L.P. Network-integrated remote control with voice activation
CN106911949A (en) * 2015-12-23 2017-06-30 北京奇虎科技有限公司 A kind of method and mobile terminal based on mobile terminal control apparatus equipment
CN107566226B (en) * 2017-07-31 2020-11-17 歌尔科技有限公司 Method, device and system for controlling smart home
CN107948695A (en) * 2017-11-17 2018-04-20 浙江大学 Speech-sound intelligent remote controler and television channel selection method
CN108198549A (en) * 2017-11-22 2018-06-22 珠海格力电器股份有限公司 A kind of apparatus control method, device, storage medium, server and user terminal

Also Published As

Publication number Publication date
WO2020133764A1 (en) 2020-07-02

Similar Documents

Publication Publication Date Title
CN109637534A (en) Voice remote control method, system, controlled device and computer readable storage medium
US10650816B2 (en) Performing tasks and returning audio and visual feedbacks based on voice command
CN104618780B (en) Electrical equipment control method and system
KR101506510B1 (en) Speech Recognition Home Network System.
CN102111674B (en) System and method for playing on-line video by mobile terminal and mobile terminal
CN103369385A (en) Method for displaying set-top box program information and controlling set-top box based on intelligent terminal
US8069222B2 (en) System and method to provide services based on network
CN105206272A (en) Voice transmission control method and system
CN105933738B (en) Net cast methods, devices and systems
CN103024528A (en) Mobile terminal and method for transmitting streaming media data on mobile terminal
CN108538289A (en) The method, apparatus and terminal device of voice remote control are realized based on bluetooth
CN110992955A (en) Voice operation method, device, equipment and storage medium of intelligent equipment
CN104122979A (en) Method and device for control over large screen through voice
CN105897686A (en) Smart television user account speech management method and smart television
CN108419108A (en) Sound control method, device, remote controler and computer storage media
US11488603B2 (en) Method and apparatus for processing speech
CN105120207A (en) Sweeping robot video monitoring method and server
CN104506901B (en) Voice householder method and system based on tv scene state and voice assistant
CN110971685B (en) Content processing method, content processing device, computer equipment and storage medium
CN113765903A (en) Screen projection method and related device, electronic equipment and storage medium
KR101351264B1 (en) System and method for message translation based on voice recognition
CN104717536A (en) Voice control method and system
CN114501068B (en) Video live broadcast method, architecture, system and computer readable storage medium
CN114979386A (en) Applet voice communication method, device, electronic equipment and storage medium
CN108399918A (en) Smart machine connection method, smart machine and terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190416