WO2020042993A1 - Voice control method, device, and system - Google Patents

Voice control method, device, and system

Info

Publication number
WO2020042993A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
control device
wake
collection terminal
control
Prior art date
Application number
PCT/CN2019/101913
Other languages
English (en)
French (fr)
Inventor
孙大鹏
贾伟
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2020042993A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L12/2803 Home automation networks
    • H04L12/2816 Controlling appliance services of a home automation network by calling their functionalities
    • H04L12/282 Controlling appliance services of a home automation network by calling their functionalities based on user interaction within the home
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/14 Session management
    • H04L67/141 Setup of application sessions
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Definitions

  • the present application relates to the field of computer technology, and in particular, to a voice control method, device, and system.
  • the voice signal sent by the user may be picked up and answered by multiple AI home devices, so that the user cannot accurately and reliably control an AI device through the voice signal.
  • the embodiments of the present application provide a voice control method performed by a voice collection terminal and a control device, respectively, in order to accurately and reliably respond to a user's voice control signal to perform voice control on the Internet of Things terminal.
  • the embodiment of the present application further provides a voice control method executed by a server, which aims to assist a control device to accurately and reliably respond to a user's voice control signal to perform voice control on an IoT terminal.
  • an embodiment of the present application provides a voice control method, which is applied to a voice collection terminal, and includes:
  • upon receiving the wake-up instruction, sending a voice channel establishment request to the control device, where the voice channel establishment request includes identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction;
  • a voice signal is collected, and the voice signal is sent to the control device for the control device to perform voice control on the Internet of Things terminal based on the voice signal.
  • before sending a voice channel establishment request to the control device, the method further includes:
  • the wake-up parameter is determined according to a volume of the wake-up instruction.
  • the wakeup parameter is positively related to the volume of the wakeup instruction.
  • sending the voice signal to the control device includes:
  • the voice signal is included in the voice control information, and the voice control information is sent to the control device.
  • the voice control information further includes identification information of the voice collection terminal, and after the voice signal is sent to the control device, the method also includes:
  • the result of the voice control is displayed.
  • displaying the voice control result specifically includes:
  • the voice control result is displayed through at least one of a sound signal, a light signal, and a vibration signal.
  • an embodiment of the present application further provides a voice control method, which is applied to a control device, and includes:
  • the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to a wake-up instruction received by the voice collection terminal;
  • performing voice control on the IoT terminal based on the voice signal includes:
  • the voice signal is recognized, and voice control is performed on the IoT terminal based on the recognition result.
  • identifying the voice signal includes:
  • performing voice control on an IoT terminal based on the voice signal includes:
  • the method further includes:
  • the voice control result is returned to the voice collection terminal for display by the voice collection terminal.
  • performing voice control on an IoT terminal based on the voice signal includes at least one of the following:
  • a target IoT terminal is controlled to perform a target operation, and the target IoT terminal and the target operation correspond to a recognition result of the voice signal.
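The correspondence between a recognition result and a target IoT terminal plus target operation can be pictured as a lookup. The sketch below is only an illustration: the table `COMMANDS`, the function `resolve`, and the terminal and operation names are hypothetical, and the application does not say how the mapping is implemented.

```python
from typing import Dict, Optional, Tuple

# Hypothetical command table: recognized text -> (target terminal, target operation).
COMMANDS: Dict[str, Tuple[str, str]] = {
    "turn on the air conditioner in the living room": ("living_room_ac", "power_on"),
    "turn off the air conditioner in the living room": ("living_room_ac", "power_off"),
}


def resolve(recognition_result: str) -> Optional[Tuple[str, str]]:
    """Return (target terminal, target operation) for a recognition result, or None."""
    # Normalize case and surrounding whitespace before the lookup.
    return COMMANDS.get(recognition_result.strip().lower())
```

An unknown phrase returns `None`, which a caller could treat as a failed voice control result.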
  • the wake-up parameter is positively correlated with the volume of the wake-up instruction, and determining whether to establish a voice channel with the voice collection terminal according to the wake-up parameter then includes at least one of the following:
  • if the wake-up parameter is less than or equal to a second preset threshold, it is determined that no voice channel is established with the voice collection terminal.
  • the wakeup parameter is positively related to the volume of the wakeup instruction
  • Receive a voice channel establishment request from a voice collection terminal specifically:
  • Determining whether to establish a voice channel with the voice collection terminal according to the wake-up parameter includes:
  • the control device is any one of a plurality of IoT terminals.
  • the method further includes:
  • any one of the plurality of Internet of Things terminals other than the control device is determined as a new control device.
  • an embodiment of the present application further provides a voice control method, which is applied to a server and includes:
  • Identify the voice signal and return the recognition result to the control device for the control device to perform voice control on the IoT terminal based on the recognition result.
  • identifying the voice signal includes:
  • an embodiment of the present application further provides a voice control system, where the voice control system includes:
  • a control device configured to receive a voice channel establishment request from a voice collection terminal, where the voice channel establishment request includes identification information of the voice collection terminal and a wake-up parameter corresponding to a wake-up instruction received by the voice collection terminal; further configured to determine whether to establish a voice channel with the voice collection terminal according to the wake-up parameter; further configured to receive a voice signal collected by the voice collection terminal when the voice channel is established with the voice collection terminal; and further configured to perform voice control on the Internet of Things terminal based on the voice signal;
  • the voice collection terminal, configured to monitor for a wake-up instruction; further configured to send a voice channel establishment request to the control device when the wake-up instruction is received, where the voice channel establishment request includes identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction; and further configured to collect a voice signal when the voice channel is established with the control device, and send the voice signal to the control device for the control device to perform voice control on the Internet of Things terminal based on the voice signal.
  • the voice control system further includes:
  • the server is configured to receive a voice signal from the control device, where the voice signal is collected by a voice collection terminal that has established a voice channel with the control device; and is further configured to recognize the voice signal and return the recognition result to the control device, for the control device to perform voice control on the Internet of Things terminal based on the recognition result.
  • an embodiment of the present application further provides a voice collection terminal, where the voice collection terminal is configured to execute the method provided in the first aspect of the embodiment of the present application.
  • an embodiment of the present application further provides a control device, where the control device is configured to execute the method provided in the second aspect of the present embodiment.
  • the embodiment of the present application further provides a server, where the server is configured to execute the method provided in the third aspect of the embodiment of the present application.
  • an embodiment of the present application further provides an electronic device, which is applied to a voice collection terminal, and includes:
  • a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform the following operations:
  • the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction
  • a voice signal is collected, and the voice signal is sent to the control device for the control device to perform voice control on the Internet of Things terminal based on the voice signal.
  • an embodiment of the present application further provides a computer-readable storage medium, wherein the computer-readable storage medium stores one or more programs, and the one or more programs, when executed by an electronic device that includes multiple application programs, cause the electronic device to perform the following operations:
  • the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction
  • a voice signal is collected, and the voice signal is sent to the control device for the control device to perform voice control on the Internet of Things terminal based on the voice signal.
  • an embodiment of the present application further provides an electronic device, which is applied to a control device, and includes:
  • a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform the following operations:
  • the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to a wake-up instruction received by the voice collection terminal;
  • an embodiment of the present application further provides a computer-readable storage medium, wherein the computer-readable storage medium stores one or more programs, and the one or more programs, when executed by an electronic device that includes multiple application programs, cause the electronic device to perform the following operations:
  • the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to a wake-up instruction received by the voice collection terminal;
  • an embodiment of the present application further provides an electronic device, which is applied to a server, and includes:
  • a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform the following operations:
  • Identify the voice signal and return the recognition result to the control device for the control device to perform voice control on the IoT terminal based on the recognition result.
  • an embodiment of the present application further provides a computer-readable storage medium, wherein the computer-readable storage medium stores one or more programs, and the one or more programs, when executed by an electronic device that includes multiple application programs, cause the electronic device to perform the following operations:
  • Identify the voice signal and return the recognition result to the control device for the control device to perform voice control on the IoT terminal based on the recognition result.
  • the voice collection terminal can send a voice channel establishment request to the control device when the wake-up instruction is received, requesting to establish a voice channel with the control device. Once a voice channel is established with the control device, the voice signal is collected and sent to the control device for the control device to perform voice control on the IoT terminal based on the voice signal. Correspondingly, the control device receives the voice signal collected by a voice collection terminal only if a voice channel has been established with that terminal. Therefore, the solution provided in the embodiments of the present application avoids the false triggering and false feedback caused when the user's voice signal is collected by multiple voice collection terminals and all of them send it to the control device, so that the voice control system can respond accurately and reliably to the user's voice control signal and perform voice control on IoT terminals such as AI devices.
  • FIG. 1 is a schematic structural diagram of a voice control system according to an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a voice control method performed by a voice collection terminal according to an embodiment of the present application
  • FIG. 3 is a schematic flowchart of a multi-party interaction between a voice collection terminal and a control device according to an embodiment of the present application
  • FIG. 4 is a schematic flowchart of a voice control method performed by a control device according to an embodiment of the present application
  • FIG. 5 is a schematic flowchart of a voice control method performed by a server according to an embodiment of the present application
  • FIG. 6 is a schematic structural diagram of a voice collection terminal according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a control device according to an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a cloud service terminal according to an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • the voice signal sent by the user may be picked up by multiple AI devices, and it is difficult for the cloud to determine which AI device should respond to it. As a result, the user may be unable to control AI devices accurately and reliably by voice, or multiple AI devices may respond to the user's voice signal simultaneously.
  • the cloud can use sound source localization to locate the sound source of the voice signal, so that an AI device recognizes and responds only to voice signals arriving from a preset direction.
  • the cloud can also use the length of the sound propagation path to determine which AI device the user wishes to control. For example, the cloud can take the device that the sound reaches in the shortest time as the device closest to the user, treat that device as the one the user wishes to control, and then control it to respond to the voice signal from the user.
  • when sound source localization is used, if two devices are placed in the same direction or close to each other, such as a smart TV and a smart speaker on the same TV cabinet, the direction of the sound source received by the two is almost the same, and the cloud will struggle to make the right decision.
  • the embodiments of the present application provide a voice control system, and correspondingly provide a voice control method performed by each part of the system, which can prevent the false triggering and false feedback caused when the user's voice signal is collected by multiple voice collection terminals and all of them send it to the control device, so that the voice control system can accurately and reliably respond to the user's voice control signal and perform voice control on IoT terminals such as AI devices.
  • a voice control system provided by an embodiment of the present application includes a control device and at least one voice collection terminal.
  • the voice collection terminal is configured to collect a user's voice signal and send it to a control device;
  • the control device is configured to receive the user's voice signal and control at least one IoT terminal based on a recognition result of recognizing the voice signal.
  • the voice collection terminal in the voice control system is configured to monitor for the wake-up instruction; to send a voice channel establishment request to the control device when the wake-up instruction is received, where the voice channel establishment request includes identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction; and to collect a voice signal when the voice channel is established with the control device and send the voice signal to the control device through the voice channel, for the control device to perform voice control on the Internet of Things terminal based on the voice signal.
  • the control device in the voice control system is configured to receive a voice channel establishment request from the voice collection terminal, where the voice channel establishment request includes identification information of the voice collection terminal and a wake-up parameter corresponding to a wake-up instruction received by the voice collection terminal; to determine whether to establish a voice channel with the voice collection terminal according to the wake-up parameter; to receive the voice signal collected by the voice collection terminal when a voice channel is established with it; and to perform voice control on IoT terminals based on the voice signal.
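The voice collection terminal's side of this exchange (monitoring, requesting a channel, collecting) can be read as a small state machine. The following is a minimal sketch under stated assumptions: the state names, the event strings, and the function `next_state` are hypothetical and are not taken from the application.

```python
from enum import Enum, auto


class State(Enum):
    STANDBY = auto()      # monitoring for the wake-up instruction
    REQUESTING = auto()   # voice channel establishment request sent
    COLLECTING = auto()   # voice channel established, collecting the voice signal


# Hypothetical events: "wake", "accepted", "rejected", "done".
_TRANSITIONS = {
    (State.STANDBY, "wake"): State.REQUESTING,
    (State.REQUESTING, "accepted"): State.COLLECTING,
    (State.REQUESTING, "rejected"): State.STANDBY,
    (State.COLLECTING, "done"): State.STANDBY,
}


def next_state(state: State, event: str) -> State:
    """Return the next terminal state; events that do not apply leave it unchanged."""
    return _TRANSITIONS.get((state, event), state)
```

Because unknown events are ignored, a terminal whose request is rejected simply falls back to standby and never collects a voice signal.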
  • the voice control system preferably further includes a server.
  • the server can be used to receive the voice signal from the control device, where the voice signal is collected by a voice collection terminal that has established a voice channel with the control device; it is also used to recognize the voice signal and return the recognition result to the control device, for the control device to perform voice control on the IoT terminal based on the recognition result.
  • the control device may be a device dedicated to receiving a voice signal and performing voice control on the IoT terminal based on the voice signal, or any one of a plurality of IoT terminals, as long as it can perform the functions of the control device.
  • in the voice control system there can be only one control device at any time; that is, at any time only one device (either a dedicated device or a certain IoT terminal) plays the role of control device.
  • in order to improve the robustness of the voice control system, the control device can be switched when a preset condition is met, for example when the user issues an instruction to switch the control device, or when the current control device fails, runs abnormally, or lacks resources. Specifically, the current control device may determine any one of the IoT terminals other than itself as the new control device, to maintain the continuous and stable operation of the voice control system.
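As a minimal sketch of this switch, assuming terminals are known by string identifiers, the hypothetical function `pick_new_control_device` below returns the first IoT terminal that is not the current control device. The text allows any such terminal; choosing the first is an arbitrary choice made for the illustration.

```python
from typing import List, Optional


def pick_new_control_device(iot_terminals: List[str], current: str) -> Optional[str]:
    """Return an IoT terminal other than the current control device, or None."""
    for terminal in iot_terminals:
        if terminal != current:
            return terminal
    # No other terminal is available to take over the role.
    return None
```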
  • the voice collection terminal can send a voice channel establishment request to the control device when the wake-up instruction is received, requesting to establish a voice channel with the control device. Once a voice channel is established with the control device, the voice signal is collected and sent to the control device for the control device to perform voice control on the IoT terminal based on the voice signal. Correspondingly, the control device receives the voice signal collected by a voice collection terminal only if a voice channel has been established with that terminal. Therefore, the solution provided in the embodiments of the present application avoids the false triggering and false feedback caused when the user's voice signal is collected by multiple voice collection terminals and all of them send it to the control device, so that the voice control system can respond accurately and reliably to the user's voice control signal and perform voice control on IoT terminals such as AI devices.
  • an embodiment of the present application provides a voice control method, which is applied to a voice collection terminal.
  • the method may include:
  • the voice collection terminal may be in a standby state.
  • the voice collection terminal may continuously perform step S101 to monitor the wake-up instruction issued by the user in order to respond in time.
  • the foregoing wake-up instruction can be understood as a pre-stored sound signal in the voice collection terminal, which is used to trigger the voice collection terminal to enter a working state.
  • this sound signal, also known as the wake-up sound signal, can be set by default by the voice collection terminal (for example, "Enable Voice Control", "Little A Small A", etc.), or preset by the user according to their own needs and preferences (for example, "I'm back", "I'm gone", etc.).
  • the wake-up instruction may be one pre-stored sound signal or several, so as to match the user's needs in different scenarios. For example, when the user has just returned home, the user can speak the wake-up instruction "I'm back" to trigger the voice collection terminal to enter the working state, so as to control IoT terminals such as a smart TV and a smart air conditioner to start running. As another example, when the user is about to leave for work, the user can speak the wake-up instruction "I'm gone" to trigger the voice collection terminal to enter the working state, so as to further control the IoT terminals at home to stop running.
  • using several pre-stored sound signals as wake-up instructions lets the user state the current scene directly, without a fixed wake-up instruction (for example, "start voice control", etc.), which makes voice control more engaging and scene-aware and helps improve the user experience.
  • the voice collection terminal recognizes the monitored sound signal. If it matches a sound signal pre-stored as the wake-up instruction, the terminal determines that the wake-up instruction has been received; if it does not match, the terminal determines that the wake-up instruction has not been received and continues to perform step S101 to monitor for the wake-up instruction.
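The matching step can be sketched as follows. This is a simplification with a loud assumption: the application matches sound signals, whereas the sketch assumes the monitored sound has already been transcribed to text, and the function name `match_wake_instruction` is hypothetical.

```python
from typing import Iterable, Optional


def match_wake_instruction(heard: str, stored: Iterable[str]) -> Optional[str]:
    """Return the stored wake-up phrase that matches what was heard, else None.

    The comparison ignores case and surrounding whitespace.
    """
    normalized = heard.strip().lower()
    for phrase in stored:
        if normalized == phrase.strip().lower():
            return phrase
    # No match: the terminal would keep monitoring.
    return None
```

Returning the matched phrase, rather than a boolean, lets a caller distinguish scenes such as "I'm back" and "I'm gone".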
  • a voice channel establishment request is sent to the control device.
  • the voice channel establishment request includes identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction.
  • the voice collection terminal may determine the wake-up parameter according to the volume of the wake-up instruction, and then send the wake-up parameter to the control device in the voice channel establishment request, for the control device to determine whether to establish a voice channel with the voice collection terminal.
  • the volume of the wake-up instruction can reflect the distance between the user and the voice collection terminal. Therefore, the wake-up parameter (which can be recorded as Hx) determined according to the volume of the wake-up instruction can also reflect the distance.
  • the voice collection terminal sends a wake-up parameter capable of reflecting the volume of the wake-up instruction to the control device, and the control device can determine whether to receive a voice signal from the voice collection terminal.
  • the determined wake-up parameter and the volume of the wake-up instruction may be positively correlated. That is, the larger the volume of the wake-up command, the larger the wake-up parameter; conversely, the smaller the volume of the wake-up command, the smaller the wake-up parameter.
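One way to obtain a parameter that grows with volume is sketched below. The application only states that Hx is determined from the volume; the use of RMS amplitude as the volume measure, the 16-bit PCM sample format, the normalization to the range 0 to 1, and the name `wake_parameter` are all assumptions made for illustration.

```python
import math
from typing import Sequence


def wake_parameter(samples: Sequence[int], full_scale: int = 32768) -> float:
    """Return a hypothetical wake-up parameter Hx in [0, 1] from 16-bit PCM samples.

    Hx is the RMS amplitude divided by full scale, so a wake-up instruction
    with a larger RMS amplitude never yields a smaller Hx.
    """
    if not samples:
        return 0.0
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return min(rms / full_scale, 1.0)
```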
  • when the control device executes step S203 to determine whether to establish a voice channel with the voice collection terminal according to the wake-up parameter, if the wake-up parameter sent by a voice collection terminal is greater than or equal to the first preset threshold, the control device may determine to establish a voice channel with that voice collection terminal.
  • if the wake-up parameter is less than or equal to the second preset threshold, the control device may determine not to establish a voice channel with the voice collection terminal.
  • each voice channel establishment request will include a wake-up parameter determined by the corresponding voice acquisition terminal according to the received wake-up instruction.
  • the control device can determine the voice collection terminal with the largest wake-up parameter among the multiple voice collection terminals as the target voice collection terminal according to the wake-up parameters, and then determine to establish a voice channel with the target voice collection terminal. It can be understood that the voice collection terminal with the largest wake-up parameter among multiple voice collection terminals can be understood as the voice collection terminal closest to the user, and thus can be considered as the target voice collection terminal that the user desires to wake up.
  • when the control device executes step S203 and determines whether to establish a voice channel with the voice collection terminal according to the wake-up parameters, it may consider only the wake-up parameter of a single voice collection terminal (judged against the first preset threshold and/or the second preset threshold), only the ordering of the wake-up parameters of multiple voice collection terminals, or both together, for example by sorting the wake-up parameters that are greater than or equal to the first preset threshold.
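The combined strategy, filtering by the first preset threshold and then taking the largest remaining wake-up parameter, can be sketched as below. The function name `select_target` and the dictionary representation of pending requests are assumptions; the application does not prescribe a data structure.

```python
from typing import Dict, Optional


def select_target(requests: Dict[str, float], first_threshold: float) -> Optional[str]:
    """Pick the terminal to open a voice channel with, or None.

    requests maps terminal identification -> wake-up parameter.
    Terminals below first_threshold are filtered out, then the terminal
    with the largest remaining wake-up parameter is chosen.
    """
    eligible = {tid: hx for tid, hx in requests.items() if hx >= first_threshold}
    if not eligible:
        return None
    return max(eligible, key=lambda tid: eligible[tid])
```

For example, with requests `{"tv": 0.4, "speaker": 0.7}` and a threshold of 0.3, both terminals are eligible and `"speaker"` is selected as the one presumed closest to the user.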
  • the voice channel establishment request sent by the voice collection terminal to the control device also includes identification information of the voice collection terminal, so that the control device can identify the voice collection terminal corresponding to the wake-up parameter, and then determine whether to accept or reject that terminal's voice channel establishment request.
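A request carrying these two fields could look like the following sketch. The class name `VoiceChannelRequest`, the field names, and the JSON encoding are hypothetical; the application does not specify a wire format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class VoiceChannelRequest:
    terminal_id: str       # identification information of the voice collection terminal
    wake_parameter: float  # wake-up parameter corresponding to the wake-up instruction

    def to_json(self) -> str:
        """Serialize the request for sending to the control device."""
        return json.dumps(asdict(self), sort_keys=True)

    @staticmethod
    def from_json(payload: str) -> "VoiceChannelRequest":
        """Parse a request received by the control device."""
        data = json.loads(payload)
        return VoiceChannelRequest(str(data["terminal_id"]), float(data["wake_parameter"]))
```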
  • the control device may return the determined result to the voice collection terminal, as shown in FIG. 3.
  • step S105 may be further performed.
  • the control device can establish only one voice channel at a time and receives only the voice signal collected by one voice collection terminal, which avoids the misrecognition and false triggering caused by receiving multiple voice signals.
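The one-channel-at-a-time rule can be illustrated with a small class. This is a sketch under assumptions: the class name `ControlDevice`, its method names, and the explicit `close_channel` step are hypothetical and only show how a single active channel excludes other terminals.

```python
from typing import Optional


class ControlDevice:
    """Hypothetical control device that holds at most one voice channel."""

    def __init__(self) -> None:
        self.active_terminal: Optional[str] = None

    def open_channel(self, terminal_id: str) -> bool:
        """Accept the request only if no channel is currently open."""
        if self.active_terminal is not None:
            return False
        self.active_terminal = terminal_id
        return True

    def receive_voice(self, terminal_id: str, signal: bytes) -> bool:
        """Accept a voice signal only from the terminal that owns the channel."""
        return terminal_id == self.active_terminal

    def close_channel(self) -> None:
        """Release the channel so another terminal can be accepted."""
        self.active_terminal = None
```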
  • S105 In the case of establishing a voice channel with the control device, collect a voice signal and send the voice signal to the control device for the control device to perform voice control on the IoT terminal based on the voice signal.
  • the user sends a wake-up instruction to the voice collection terminal to wake up the voice collection terminal.
  • after the voice collection terminal establishes a voice channel with the control device, the user can be prompted to input a voice signal.
  • after the voice collection terminal collects the voice signal sent by the user, it can further send the voice signal to the control device.
  • the voice signal may be included in the voice control information, and then the voice control information is sent to the control device.
  • the control device can receive a voice signal once a voice channel is established. After receiving the voice signal, the control device may further recognize the voice signal so as to perform voice control on the IoT terminal based on the voice signal, as shown in FIG. 3.
  • the voice control information sent by the voice collection terminal to the control device through the voice channel may also include identification information of the voice collection terminal, so that the voice control result obtained by performing voice control on the IoT terminal based on the voice signal can be returned to the same voice collection terminal, as shown in FIG. 4. After the voice collection terminal receives the voice control result, it can further show it to the user so that the user knows the result of the voice control.
  • the voice collection terminal may display the voice control result through at least one of a sound signal, a light signal, and a vibration signal.
  • for example, if the voice signal input by the user is "Turn on the air conditioner in the living room", the voice collection terminal may report the voice control result to the user with the sound signal "Living room air conditioner is turned on" or "Voice control succeeded".
  • as another example, the voice collection terminal may indicate that the voice control succeeded with a green ring of light, and that it failed with a red ring of light.
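How a result might be turned into feedback on more than one channel is sketched below. The function name `feedback_for`, the dictionary layout, and the failure phrase are hypothetical; only the success phrase and the green and red colors come from the examples above.

```python
from typing import Dict


def feedback_for(success: bool) -> Dict[str, str]:
    """Map a voice control result to hypothetical sound and light feedback."""
    if success:
        return {"sound": "Voice control succeeded", "light": "green"}
    # The failure phrase is an assumption; the text only gives the red light.
    return {"sound": "Voice control failed", "light": "red"}
```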
  • Correspondingly, as shown in FIG. 4, an embodiment of the present application further provides a voice control method, which is applied to a control device.
  • The method includes:
  • S201: Receive a voice channel establishment request from a voice collection terminal. The voice channel establishment request includes identification information of the voice collection terminal and a wake-up parameter corresponding to a wake-up instruction received by the voice collection terminal.
  • The voice channel establishment request that the control device receives in step S201 corresponds to the request that the voice collection terminal sends in step S103, and is not described again here.
  • S203: Determine, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal.
  • The voice collection terminal determines the wake-up parameter from the volume of the wake-up instruction, and the wake-up parameter is positively correlated with that volume. Therefore, when the control device executes step S203, it can use at least one of the following to determine whether to establish a voice channel with the voice collection terminal:
  • if the wake-up parameter is greater than or equal to the first preset threshold, determine to establish a voice channel with the voice collection terminal;
  • if the wake-up parameter is less than or equal to the second preset threshold, determine not to establish a voice channel with the voice collection terminal;
  • according to the wake-up parameters, determine the voice collection terminal with the largest wake-up parameter among multiple voice collection terminals as the target voice collection terminal, and establish a voice channel with the target voice collection terminal.
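  • The decision rules of step S203 can be sketched as below. This is an illustrative sketch under stated assumptions, not the application's implementation: `ChannelRequest`, `accept_single`, and `pick_target` are hypothetical names, wake-up parameters are treated as plain floats, and the second threshold is assumed to be lower than the first (the text does not state their relationship).

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ChannelRequest:
    terminal_id: str   # identification information of the voice collection terminal
    wake_param: float  # wake-up parameter, positively correlated with wake-up volume


def accept_single(req: ChannelRequest, first_threshold: float,
                  second_threshold: float) -> Optional[bool]:
    """True = establish, False = refuse, None = the thresholds alone do not decide."""
    if req.wake_param >= first_threshold:
        return True
    if req.wake_param <= second_threshold:
        return False
    return None


def pick_target(requests: Iterable[ChannelRequest],
                first_threshold: Optional[float] = None) -> Optional[str]:
    """Return the id of the terminal with the largest wake-up parameter.

    If first_threshold is given, only requests at or above it are ranked,
    which combines the threshold rule with the ranking rule.
    """
    candidates = [r for r in requests
                  if first_threshold is None or r.wake_param >= first_threshold]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.wake_param).terminal_id
```

  • For example, with requests from a hallway terminal (0.9), a kitchen terminal (0.7), and a bedroom terminal (0.4), `pick_target` selects the hallway terminal, which the text treats as the one closest to the user.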
  • S205: When a voice channel is established with the voice collection terminal, receive a voice signal; the voice signal is collected by the voice collection terminal.
  • The voice signal received by the control device in step S205 corresponds to the voice signal collected by the voice collection terminal in step S105 and sent to the control device, and is not described again here.
  • S207: Perform voice control on the IoT terminal based on the voice signal.
  • When executing step S207, if the control device is able to recognize the voice signal, it can recognize the voice signal locally and perform voice control on the IoT terminal based on the recognition result. If the control device is not able to recognize the voice signal, it may send the voice signal to the server for the server to recognize, as shown in FIG. 3. Correspondingly, after receiving the voice signal from the control device, the server can recognize the voice signal and return the recognition result to the control device, as shown in FIG. 3 and FIG. 5. After receiving the recognition result returned by the server, the control device may then perform voice control on the IoT terminal based on the recognition result.
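  • The choice between local and server-side recognition in step S207 can be sketched as follows. This is only an illustration: `recognize` and the `Recognizer` type are hypothetical, and the `remote` callable stands in for the whole send-and-receive round trip with the server, whose transport the text does not specify.

```python
from typing import Callable, Optional

# A recognizer turns a raw voice signal into a recognition result (here, text).
Recognizer = Callable[[bytes], str]


def recognize(voice_signal: bytes,
              local: Optional[Recognizer],
              remote: Optional[Recognizer]) -> str:
    """Recognize locally when the control device can; otherwise defer to the server."""
    if local is not None:
        return local(voice_signal)
    if remote is not None:
        # Assumption: this call hides sending the signal and waiting for the result.
        return remote(voice_signal)
    raise RuntimeError("no recognizer available")
```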
  • It should be noted that, whether the control device recognizes the voice signal locally or sends it to the server for recognition, the voice signal can be processed with natural language processing (NLP) and natural language understanding (NLU).
  • The purpose of natural language processing and natural language understanding is to transform human language (here, the voice signal input by the user) into a machine-understandable, structured, and complete semantic representation.
  • Specifically, natural language processing may include stages such as word segmentation, lexical analysis, syntactic analysis, and semantic analysis.
  • Natural language processing and natural language understanding rest on various natural language processing data sets, for example: a corpus training set (tc-corpus-train); Chinese and English news classification corpora for text classification research; Chinese vector space models (VSM, Vector Space Model) in multi-dimensional ARFF (Attribute-Relation File Format) format, generated with feature-word selection methods such as IG and chi-square; Chinese DBLP resources of ten thousand randomly sampled papers; Chinese word segmentation lexicons for unsupervised Chinese word segmentation algorithms; UCI evaluation and ranking data; and sentiment analysis data sets with initialization instructions. Therefore, it is preferable to send the voice signal to the server for recognition, as shown in FIG. 3.
  • Optionally, when voice control is performed based on the voice signal, or more precisely based on the result of recognizing the voice signal, text-to-speech (TTS) output can be produced so that the control device can give the user spoken feedback. It is also possible to determine the target IoT terminal and target operation that correspond to the recognition result of the voice signal, and then control the target IoT terminal to perform the target operation.
  • Optionally, after performing voice control on the IoT terminal based on the voice signal, the control device may further return a voice control result to the voice collection terminal for the voice collection terminal to present to the user.
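  • Mapping a recognition result to a target IoT terminal and target operation, and producing a result to return, might look like the sketch below. Everything here is hypothetical: the `INTENTS` lookup table, the device names, and the `control` function are illustrative stand-ins for whatever NLU output and device interface a real system would use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Intent:
    target: str     # target IoT terminal, e.g. "living_room_ac"
    operation: str  # target operation, e.g. "turn_on"


# Assumption: a fixed lookup from recognized text to an intent.
INTENTS: Dict[str, Intent] = {
    "turn on the air conditioner in the living room": Intent("living_room_ac", "turn_on"),
}


def control(recognized_text: str,
            devices: Dict[str, Callable[[str], bool]]) -> Tuple[bool, str]:
    """Resolve the intent, run it on the target terminal, and build the control result."""
    intent = INTENTS.get(recognized_text.strip().lower())
    if intent is None or intent.target not in devices:
        return (False, "Voice control failed")
    ok = devices[intent.target](intent.operation)
    return (ok, "Voice control succeeded" if ok else "Voice control failed")
```

  • The returned pair is what the control device could send back to the voice collection terminal for presentation to the user.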
  • In the embodiment of the present application, the voice collection terminal sends a voice channel establishment request to the control device when the wake-up instruction is received, requesting to establish a voice channel with the control device. Only when a voice channel has been established with the control device does it collect the voice signal and send it to the control device, for the control device to perform voice control on the IoT terminal based on the voice signal. Correspondingly, the control device receives the voice signal collected by a voice collection terminal only if a voice channel has been established with that terminal. Therefore, the solution provided in the embodiment of the present application avoids the false triggering and false feedback caused when the user's voice signal is collected by multiple voice collection terminals that all send it to the control device, so that the voice control system can respond accurately and reliably to the user's voice control signal and perform voice control on IoT terminals such as AI devices.
  • Correspondingly, as shown in FIG. 5, an embodiment of the present application further provides a voice control method, which is applied to a server.
  • Specifically, the server may be a cloud server.
  • The method includes:
  • S301: Receive a voice signal from a control device; the voice signal is collected by a voice collection terminal that has established a voice channel with the control device;
  • S303: Recognize the voice signal and return the recognition result to the control device, for the control device to perform voice control on the IoT terminal based on the recognition result.
  • When the server executes step S303 to recognize the voice signal, it also performs natural language processing (NLP) and natural language understanding (NLU) on the voice signal.
  • The voice control method executed by the server works together with the voice control method executed by the control device to recognize the voice signal, so that the control device can respond correctly and promptly to the voice signal that the user inputs through the voice collection terminal, and voice control of IoT terminals is achieved.
  • the related descriptions in the foregoing embodiments are all applicable to the voice control method performed by the server, and are not repeated here.
  • an embodiment of the present application further provides a voice collection terminal, including:
  • a monitoring module 101 configured to monitor a wake-up instruction
  • the request sending module 103 is configured to send a voice channel establishment request to the control device when the wake-up instruction is received, and the voice channel establishment request includes identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction;
  • the voice signal acquisition and transmission module 105 is configured to collect a voice signal when the voice channel is established with the control device, and send the voice signal to the control device through the voice channel, for the control device to perform voice control on the IoT terminal based on the voice signal.
  • the voice collection terminal shown in FIG. 6 can implement each step of the voice control method described in FIG. 2.
  • the explanations about the voice collection terminal in the foregoing embodiments are applicable to this, and will not be repeated here.
  • an embodiment of the present application further provides a control device, including:
  • the request receiving module 201 is configured to receive a voice channel establishment request from a voice collection terminal.
  • the voice channel establishment request includes identification information of the voice collection terminal and a wake-up parameter corresponding to a wake-up instruction received by the voice collection terminal.
  • a judging module 203 configured to determine whether to establish a voice channel with the voice collection terminal according to the wake-up parameter
  • a first voice signal receiving module 205 configured to receive a voice signal when a voice channel is established with the voice collection terminal, and the voice signal is collected by the voice collection terminal;
  • the voice control module 207 is configured to perform voice control on the IoT terminal based on a voice signal.
  • the control device shown in FIG. 7 can implement each step of the voice control method described in FIG. 4.
  • the descriptions of the control device in the foregoing embodiments are applicable to this, and are not repeated here.
  • an embodiment of the present application further provides a server, including:
  • a second voice signal receiving module 301 configured to receive a voice signal from a control device, and the voice signal is collected by a voice acquisition terminal that has established a voice channel with the control device;
  • the voice signal recognition module 303 is configured to recognize a voice signal and return the recognition result to the control device for the control device to perform voice control on the IoT terminal based on the recognition result.
  • the server shown in FIG. 8 can implement each step of the voice control method described in FIG. 5.
  • the description of the server in the foregoing embodiment is applicable to this, and is not repeated here.
  • FIG. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • the electronic device includes a processor, and optionally an internal bus, a network interface, and a memory.
  • the memory may include internal memory, such as high-speed random-access memory (RAM), and may also include non-volatile memory, such as at least one disk memory.
  • the electronic device may also include hardware required for other services.
  • the processor, network interface, and memory can be connected to each other through an internal bus, which can be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, and so on.
  • the bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only a two-way arrow is used in FIG. 9, but it does not mean that there is only one bus or one type of bus.
  • the program may include program code, where the program code includes a computer operation instruction.
  • the memory may include internal memory and non-volatile memory, and provides instructions and data to the processor.
  • the processor reads the corresponding computer program from the non-volatile memory into internal memory and then runs it, forming a voice control apparatus at the logical level.
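  • the processor executes the program stored in the memory and is specifically used to perform the following operations:
  • monitor for a wake-up instruction;
  • when the wake-up instruction is received, send a voice channel establishment request to the control device, the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction;
  • when a voice channel is established with the control device, collect a voice signal and send the voice signal to the control device, for the control device to perform voice control on the IoT terminal based on the voice signal.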
  • Alternatively, the processor executes the program stored in the memory and is specifically used to perform the following operations:
  • receive a voice channel establishment request from a voice collection terminal, the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to a wake-up instruction received by the voice collection terminal;
  • determine, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal;
  • when a voice channel is established with the voice collection terminal, receive a voice signal collected by the voice collection terminal;
  • perform voice control on the IoT terminal based on the voice signal.
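  • Alternatively, the processor executes the program stored in the memory and is specifically used to perform the following operations:
  • receive a voice signal from a control device, the voice signal being collected by a voice collection terminal that has established a voice channel with the control device;
  • recognize the voice signal and return the recognition result to the control device, for the control device to perform voice control on the IoT terminal based on the recognition result.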
  • the method performed by the voice control apparatus disclosed in the foregoing corresponding embodiments of the present application may be applied to a processor, or implemented by a processor.
  • the processor may be an integrated circuit chip with signal processing capabilities.
  • each step of the above method may be completed by an integrated logic circuit of hardware in a processor or an instruction in a form of software.
  • the aforementioned processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in combination with the embodiments of the present application may be directly implemented by a hardware decoding processor, or may be performed by using a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a storage medium well established in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register.
  • the storage medium is located in a memory, and the processor reads the information in the memory and completes the steps of the foregoing method in combination with its hardware.
  • the electronic device may also execute the method performed by the corresponding voice control device and implement the functions of the voice control device in the foregoing corresponding embodiments, which are not repeatedly described in this embodiment of the present application.
  • An embodiment of the present application further provides a computer-readable storage medium that stores one or more programs. The one or more programs include instructions that, when executed by an electronic device including multiple application programs, cause the electronic device to execute the method performed by the voice control apparatus in the embodiment shown in FIG. 2, specifically:
  • monitor for a wake-up instruction;
  • when the wake-up instruction is received, send a voice channel establishment request to the control device, the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction;
  • when a voice channel is established with the control device, collect a voice signal and send the voice signal to the control device, for the control device to perform voice control on the IoT terminal based on the voice signal.
  • An embodiment of the present application further provides a computer-readable storage medium that stores one or more programs. The one or more programs include instructions that, when executed by an electronic device including multiple application programs, cause the electronic device to execute the method performed by the voice control apparatus in the embodiment shown in FIG. 4, specifically:
  • receive a voice channel establishment request from a voice collection terminal, the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to a wake-up instruction received by the voice collection terminal;
  • determine, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal;
  • when a voice channel is established with the voice collection terminal, receive a voice signal collected by the voice collection terminal;
  • perform voice control on the IoT terminal based on the voice signal.
  • An embodiment of the present application further provides a computer-readable storage medium that stores one or more programs. The one or more programs include instructions that, when executed by an electronic device including multiple application programs, cause the electronic device to execute the method performed by the voice control apparatus in the embodiment shown in FIG. 5, specifically:
  • receive a voice signal from a control device, the voice signal being collected by a voice collection terminal that has established a voice channel with the control device;
  • recognize the voice signal and return the recognition result to the control device, for the control device to perform voice control on the IoT terminal based on the recognition result.
  • the embodiments of the present invention may be provided as a method, a system, or a computer program product. Therefore, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, and the instruction device implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce a computer-implemented process; the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • Memory may include non-persistent memory, random access memory (RAM), and/or non-volatile memory in computer-readable media, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
  • Computer-readable media includes both permanent and non-persistent, removable and non-removable media.
  • Information can be stored by any method or technology.
  • Information may be computer-readable instructions, data structures, modules of a program, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cartridges, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by computing devices.
  • computer-readable media does not include transitory computer-readable media, such as modulated data signals and carrier waves.
  • this application may be provided as a method, a system, or a computer program product. Therefore, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

Abstract

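A voice control method, apparatus, and system. The method, applied to a voice collection terminal, includes: monitoring for a wake-up instruction (S101); when the wake-up instruction is received, sending a voice channel establishment request to a control device, the request including identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction (S103); and, when a voice channel is established with the control device, collecting a voice signal and sending it to the control device, for the control device to perform voice control on an IoT terminal based on the voice signal (S105). The voice control system can respond accurately and reliably to the user's voice control signal and perform voice control on IoT terminals such as AI devices.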

Description

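Voice control method, apparatus, and system
This application claims priority to Chinese patent application No. 201810997304.8, filed on August 29, 2018 and entitled "Voice control method, apparatus, and system", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of computer technology, and in particular to a voice control method, apparatus, and system.
Background
As artificial intelligence (AI) technology becomes widespread in smart homes, more and more smart home devices have a built-in ("inside") AI module to make the products more intelligent and improve the user experience.
In a limited space, there may well be multiple AI devices that the user can control by voice signal, for example a voice-controlled refrigerator, a voice-controlled television, a voice-controlled robot vacuum cleaner, a voice-controlled air conditioner, and so on. These AI home devices form an Internet of Things system.
Because sound radiates in all directions, a voice signal from the user may be received and answered by several AI home devices, so the user cannot control the AI devices accurately and reliably by voice.
A voice control method that responds accurately and reliably to the user's voice control signal is therefore urgently needed to improve the user experience.
Summary of the Invention
Embodiments of the present application provide voice control methods executed by a voice collection terminal and by a control device respectively, with the aim of responding accurately and reliably to the user's voice control signal and performing voice control on IoT terminals.
Embodiments of the present application also provide a voice control method executed by a server, with the aim of helping the control device respond accurately and reliably to the user's voice control signal and perform voice control on IoT terminals.
The embodiments of the present application adopt the following technical solutions: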
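In a first aspect, an embodiment of the present application provides a voice control method applied to a voice collection terminal, including:
monitoring for a wake-up instruction;
when the wake-up instruction is received, sending a voice channel establishment request to a control device, the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction;
when a voice channel is established with the control device, collecting a voice signal and sending the voice signal to the control device, for the control device to perform voice control on an IoT terminal based on the voice signal.
Preferably, in the method provided in the first aspect, before the voice channel establishment request is sent to the control device, the method further includes:
when the wake-up instruction is received, determining the wake-up parameter according to the volume of the wake-up instruction.
Preferably, in the method provided in the first aspect, the wake-up parameter is positively correlated with the volume of the wake-up instruction.
Preferably, in the method provided in the first aspect, sending the voice signal to the control device includes:
including the voice signal in voice control information and sending the voice control information to the control device.
Preferably, in the method provided in the first aspect, the voice control information further includes the identification information of the voice collection terminal, and after the voice signal is sent to the control device, the method further includes:
receiving a voice control result returned by the control device, the voice control result being obtained by the control device performing voice control on the IoT terminal based on the voice signal;
presenting the voice control result.
Preferably, in the method provided in the first aspect, presenting the voice control result specifically includes:
presenting the voice control result through at least one of a sound signal, a light signal, and a vibration signal.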
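In a second aspect, an embodiment of the present application further provides a voice control method applied to a control device, including:
receiving a voice channel establishment request from a voice collection terminal, the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to a wake-up instruction received by the voice collection terminal;
determining, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal;
when a voice channel is established with the voice collection terminal, receiving a voice signal, the voice signal being collected by the voice collection terminal;
performing voice control on an IoT terminal based on the voice signal.
Preferably, in the method provided in the second aspect, performing voice control on the IoT terminal based on the voice signal includes:
recognizing the voice signal, and performing voice control on the IoT terminal based on the recognition result.
Preferably, in the method provided in the second aspect, recognizing the voice signal includes:
performing natural language processing (NLP) and natural language understanding (NLU) on the voice signal.
Preferably, in the method provided in the second aspect, performing voice control on the IoT terminal based on the voice signal includes:
sending the voice signal to a server for the server to recognize the voice signal;
receiving the recognition result returned by the server, and performing voice control on the IoT terminal based on the recognition result.
Preferably, in the method provided in the second aspect, after voice control is performed on the IoT terminal based on the voice signal, the method further includes:
returning a voice control result to the voice collection terminal for the voice collection terminal to present.
Preferably, in the method provided in the second aspect, performing voice control on the IoT terminal based on the voice signal includes at least one of the following:
producing text-to-speech output based on the voice signal;
controlling, based on the voice signal, a target IoT terminal to perform a target operation, the target IoT terminal and the target operation corresponding to the recognition result of the voice signal.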
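Preferably, in the method provided in the second aspect, the wake-up parameter is positively correlated with the volume of the wake-up instruction, and determining, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal includes at least one of the following:
if the wake-up parameter is greater than or equal to a first preset threshold, determining to establish a voice channel with the voice collection terminal;
if the wake-up parameter is less than or equal to a second preset threshold, determining not to establish a voice channel with the voice collection terminal.
Preferably, in the method provided in the second aspect, the wake-up parameter is positively correlated with the volume of the wake-up instruction;
receiving a voice channel establishment request from a voice collection terminal is specifically:
receiving voice channel establishment requests from multiple voice collection terminals;
and determining, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal includes:
determining, according to the wake-up parameters, the voice collection terminal with the largest wake-up parameter among the multiple voice collection terminals as a target voice collection terminal;
determining to establish a voice channel with the target voice collection terminal.
Preferably, in the method provided in the second aspect, the control device is any one of multiple IoT terminals.
Preferably, in the method provided in the second aspect, the method further includes:
when a preset condition is met, determining any one of the multiple IoT terminals other than the control device as a new control device.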
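In a third aspect, an embodiment of the present application further provides a voice control method applied to a server, including:
receiving a voice signal from a control device, the voice signal being collected by a voice collection terminal that has established a voice channel with the control device;
recognizing the voice signal and returning the recognition result to the control device, for the control device to perform voice control on an IoT terminal based on the recognition result.
Preferably, in the method provided in the third aspect, recognizing the voice signal includes:
performing natural language processing (NLP) and natural language understanding (NLU) on the voice signal.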
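In a fourth aspect, an embodiment of the present application further provides a voice control system, including:
a control device, configured to receive a voice channel establishment request from a voice collection terminal, the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to a wake-up instruction received by the voice collection terminal; to determine, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal; to receive, when a voice channel is established with the voice collection terminal, a voice signal collected by the voice collection terminal; and to perform voice control on an IoT terminal based on the voice signal;
a voice collection terminal, configured to monitor for a wake-up instruction; to send, when the wake-up instruction is received, a voice channel establishment request to the control device, the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction; and, when a voice channel is established with the control device, to collect a voice signal and send it to the control device, for the control device to perform voice control on the IoT terminal based on the voice signal.
Preferably, in the system provided in the fourth aspect, the voice control system further includes:
a server, configured to receive a voice signal from the control device, the voice signal being collected by a voice collection terminal that has established a voice channel with the control device; and to recognize the voice signal and return the recognition result to the control device, for the control device to perform voice control on the IoT terminal based on the recognition result.
In a fifth aspect, an embodiment of the present application further provides a voice collection terminal configured to execute the method provided in the first aspect.
In a sixth aspect, an embodiment of the present application further provides a control device configured to execute the method provided in the second aspect.
In a seventh aspect, an embodiment of the present application further provides a server configured to execute the method provided in the third aspect.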
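In an eighth aspect, an embodiment of the present application further provides an electronic device applied to a voice collection terminal, including:
a processor; and
a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform the following operations:
monitoring for a wake-up instruction;
when the wake-up instruction is received, sending a voice channel establishment request to a control device, the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction;
when a voice channel is established with the control device, collecting a voice signal and sending the voice signal to the control device, for the control device to perform voice control on an IoT terminal based on the voice signal.
In a ninth aspect, an embodiment of the present application further provides a computer-readable storage medium storing one or more programs that, when executed by an electronic device including multiple application programs, cause the electronic device to perform the following operations:
monitoring for a wake-up instruction;
when the wake-up instruction is received, sending a voice channel establishment request to a control device, the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction;
when a voice channel is established with the control device, collecting a voice signal and sending the voice signal to the control device, for the control device to perform voice control on an IoT terminal based on the voice signal.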
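In a tenth aspect, an embodiment of the present application further provides an electronic device applied to a control device, including:
a processor; and
a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform the following operations:
receiving a voice channel establishment request from a voice collection terminal, the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to a wake-up instruction received by the voice collection terminal;
determining, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal;
when a voice channel is established with the voice collection terminal, receiving a voice signal, the voice signal being collected by the voice collection terminal;
performing voice control on an IoT terminal based on the voice signal.
In an eleventh aspect, an embodiment of the present application further provides a computer-readable storage medium storing one or more programs that, when executed by an electronic device including multiple application programs, cause the electronic device to perform the following operations:
receiving a voice channel establishment request from a voice collection terminal, the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to a wake-up instruction received by the voice collection terminal;
determining, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal;
when a voice channel is established with the voice collection terminal, receiving a voice signal, the voice signal being collected by the voice collection terminal;
performing voice control on an IoT terminal based on the voice signal.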
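In a twelfth aspect, an embodiment of the present application further provides an electronic device applied to a server, including:
a processor; and
a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform the following operations:
receiving a voice signal from a control device, the voice signal being collected by a voice collection terminal that has established a voice channel with the control device;
recognizing the voice signal and returning the recognition result to the control device, for the control device to perform voice control on an IoT terminal based on the recognition result.
In a thirteenth aspect, an embodiment of the present application further provides a computer-readable storage medium storing one or more programs that, when executed by an electronic device including multiple application programs, cause the electronic device to perform the following operations:
receiving a voice signal from a control device, the voice signal being collected by a voice collection terminal that has established a voice channel with the control device;
recognizing the voice signal and returning the recognition result to the control device, for the control device to perform voice control on an IoT terminal based on the recognition result.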
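At least one of the technical solutions adopted in the embodiments of the present application can achieve the following beneficial effects:
In the embodiments of the present application, the voice collection terminal sends a voice channel establishment request to the control device when the wake-up instruction is received, requesting to establish a voice channel with the control device. Only when a voice channel has been established with the control device does it collect the voice signal and send it to the control device, for the control device to perform voice control on the IoT terminal based on the voice signal. Correspondingly, the control device receives the voice signal collected by a voice collection terminal only if a voice channel has been established with that terminal. Therefore, the solution provided in the embodiments of the present application avoids the false triggering and false feedback caused when the user's voice signal is collected by multiple voice collection terminals that all send it to the control device, so that the voice control system can respond accurately and reliably to the user's voice control signal and perform voice control on IoT terminals such as AI devices.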
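Brief Description of the Drawings
The drawings described here provide a further understanding of the present application and form a part of it. The illustrative embodiments of the present application and their descriptions are used to explain the present application and do not unduly limit it. In the drawings:
FIG. 1 is a schematic architecture diagram of the voice control system provided in an embodiment of the present application;
FIG. 2 is a schematic flowchart of the voice control method executed by the voice collection terminal in an embodiment of the present application;
FIG. 3 is a schematic flowchart of the multi-party interaction among the voice collection terminal, the control device, and other parties in an embodiment of the present application;
FIG. 4 is a schematic flowchart of the voice control method executed by the control device in an embodiment of the present application;
FIG. 5 is a schematic flowchart of the voice control method executed by the server in an embodiment of the present application;
FIG. 6 is a schematic structural diagram of the voice collection terminal in an embodiment of the present application;
FIG. 7 is a schematic structural diagram of the control device in an embodiment of the present application;
FIG. 8 is a schematic structural diagram of the cloud server in an embodiment of the present application;
FIG. 9 is a schematic structural diagram of an electronic device in an embodiment of the present application.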
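Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the technical solutions are described clearly and completely below with reference to specific embodiments and the corresponding drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
In the related art, if multiple AI home devices are deployed in a limited space, a voice signal from the user may be received by several AI devices, and the cloud has difficulty judging which AI device should respond to it. As a result, the user may be unable to control the AI devices accurately and reliably by voice, or several AI devices may respond to the user's voice signal at the same time.
To solve this problem, the cloud can use sound source localization: it localizes the source of the voice signal so that an AI device recognizes and responds to voice signals only from a preset direction. The cloud can also use the sound propagation time to judge which AI device the user wants to control. For example, based on the propagation times, the cloud can take the device that the sound reaches in the shortest time as the device closest to the user, regard it as the device the user wants to control, and then have that device respond to the user's voice signal.
However, both approaches have problems.
With sound source localization, if two devices face the same direction or are placed close together, such as a smart TV and a smart speaker on the same TV cabinet, the sound source directions they receive are nearly identical, and the cloud will have difficulty making the right decision.
With sound propagation time, distinguishing propagation distances in a small space places very high demands on time synchronization and computation speed, and therefore on the latency of the cloud algorithm and the calibration of local processing clocks; otherwise the propagation times to different AI devices cannot be told apart.
Therefore, in practical scenarios, both approaches often fail to achieve the desired recognition effect; misrecognition and false triggering can still occur, which harms the user experience.
In view of this, embodiments of the present application provide a voice control system and, correspondingly, the voice control methods executed by each part of the system. They avoid the false triggering and false feedback caused when the user's voice signal is collected by multiple voice collection terminals that all send it to the control device, so that the voice control system can respond accurately and reliably to the user's voice control signal and perform voice control on IoT terminals such as AI devices.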
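As shown in FIG. 1, the voice control system provided in an embodiment of the present application includes a control device and at least one voice collection terminal. The voice collection terminal collects the user's voice signal and sends it to the control device; the control device receives the user's voice signal and controls at least one IoT terminal based on the result of recognizing the voice signal.
Specifically, the voice collection terminal in the voice control system is configured to monitor for a wake-up instruction; to send, when the wake-up instruction is received, a voice channel establishment request to the control device, the request including identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction; and, when a voice channel is established with the control device, to collect a voice signal and send it to the control device through the voice channel, for the control device to perform voice control on the IoT terminal based on the voice signal.
Specifically, the control device in the voice control system is configured to receive a voice channel establishment request from a voice collection terminal, the request including identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction received by the voice collection terminal; to determine, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal; to receive, when a voice channel is established with the voice collection terminal, the voice signal collected by the voice collection terminal; and to perform voice control on the IoT terminal based on the voice signal.
The voice control system provided in the embodiment of the present application preferably also includes a server. The server can receive a voice signal from the control device, the voice signal being collected by a voice collection terminal that has established a voice channel with the control device; it can also recognize the voice signal and return the recognition result to the control device, for the control device to perform voice control on the IoT terminal based on the recognition result.
It can be understood that the control device may physically be a device dedicated to receiving voice signals and performing voice control on IoT terminals based on them, or it may be any one of the multiple IoT terminals, as long as it can perform the functions of the control device.
It should be noted that in the voice control system provided in the embodiment of the present application, only one control device exists at any moment; that is, at any moment only one device (either a dedicated device or one of the IoT terminals) plays the role of the control device.
It should also be noted that, to make the voice control system more robust, the control device can be switched when a preset condition is met, for example when the user issues an instruction to switch the control device, or when the current control device fails, runs abnormally, or is short of resources. Specifically, the current control device can determine any one of the multiple IoT terminals other than itself as the new control device, so that the voice control system keeps running stably.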
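The switching rule described above can be sketched as follows. This is a minimal illustration under assumptions: `next_control_device` is a hypothetical name, terminals are identified by strings, the preset condition is reduced to a single boolean, and "any other IoT terminal" is resolved by simply taking the first alternative in the list.

```python
from typing import Sequence


def next_control_device(terminals: Sequence[str], current: str,
                        switch_needed: bool) -> str:
    """Return the terminal that should act as the control device.

    switch_needed stands for the preset condition, for example a user
    instruction to switch, or a fault, abnormal operation, or resource
    shortage on the current control device.
    """
    if not switch_needed:
        return current
    for terminal in terminals:
        if terminal != current:
            return terminal  # assumption: the first alternative is acceptable
    return current  # no other terminal exists, so the role cannot move
```

Because the function always returns exactly one identifier, it preserves the property that only one device plays the control device role at any moment.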
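In the embodiments of the present application, the voice collection terminal sends a voice channel establishment request to the control device when the wake-up instruction is received, requesting to establish a voice channel with the control device. Only when a voice channel has been established with the control device does it collect the voice signal and send it to the control device, for the control device to perform voice control on the IoT terminal based on the voice signal. Correspondingly, the control device receives the voice signal collected by a voice collection terminal only if a voice channel has been established with that terminal. Therefore, the solution provided in the embodiments of the present application avoids the false triggering and false feedback caused when the user's voice signal is collected by multiple voice collection terminals that all send it to the control device, so that the voice control system can respond accurately and reliably to the user's voice control signal and perform voice control on IoT terminals such as AI devices.
The technical solutions provided in the embodiments of the present application are described in detail below from several perspectives with reference to the drawings.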
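As shown in FIG. 2 and FIG. 3, an embodiment of the present application provides a voice control method applied to a voice collection terminal. The method may include:
S101: Monitor for a wake-up instruction.
It can be understood that, to reduce its power consumption, the voice collection terminal can stay in a standby state when the user does not need to control IoT terminals by voice. While in standby, the voice collection terminal can keep executing step S101 and monitor for a wake-up instruction from the user so as to respond promptly.
The wake-up instruction can be understood as a sound signal pre-stored in the voice collection terminal that triggers the terminal to enter the working state. This sound signal, also called a wake-up sound signal, can be a default set by the voice collection terminal (for example, "Start voice control" or "Xiao A, Xiao A") or can be preset by the user according to their own needs and preferences (for example, "I'm back" or "I'm leaving").
It can be understood that there may be one pre-stored wake-up sound signal or several, to match the user's needs in different scenarios. For example, on arriving home the user can say the wake-up instruction "I'm back" to put the voice collection terminal into the working state and then start IoT terminals such as the smart TV and smart air conditioner. Likewise, when about to leave for work, the user can say "I'm leaving" to put the terminal into the working state and then stop the IoT terminals at home. With several pre-stored sound signals as wake-up instructions, the user can state the current situation directly instead of using a fixed wake-up instruction (for example, "Start voice control"), which makes voice control more engaging and scenario-aware and helps improve the user experience.
While monitoring for a wake-up instruction, the voice collection terminal recognizes the sound signals it hears. If a sound signal matches a pre-stored wake-up sound signal, the terminal determines that a wake-up instruction has been received; if not, it determines that no wake-up instruction has been received and continues executing step S101.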
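The matching step can be illustrated with the sketch below. Note the swapped-in technique: the text matches incoming sound against pre-stored sound signals, whereas this sketch assumes the sound has already been transcribed to text and compares normalized strings. The function names are hypothetical.

```python
from typing import Iterable, Optional


def normalize(text: str) -> str:
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def match_wake_phrase(heard: str, stored: Iterable[str]) -> Optional[str]:
    """Return the stored wake-up phrase that matches what was heard, or None.

    A None result means no wake-up instruction was received, so the
    terminal would keep monitoring (step S101).
    """
    heard_norm = normalize(heard)
    for phrase in stored:
        if normalize(phrase) == heard_norm:
            return phrase
    return None
```

Returning the matched phrase rather than a boolean lets the caller tell "I'm back" from "I'm leaving", which matters when different wake-up instructions serve different scenarios.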
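S103: When the wake-up instruction is received, send a voice channel establishment request to the control device; the request includes identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction.
Optionally, when the wake-up instruction is received, the voice collection terminal can determine the wake-up parameter from the volume of the wake-up instruction and carry it in the voice channel establishment request sent to the control device, for the control device to determine whether to establish a voice channel with this terminal.
It can be understood that the volume of the wake-up instruction reflects the distance between the user and the voice collection terminal, so the wake-up parameter determined from that volume (denoted Hx) also reflects this distance.
In practice, when several voice collection terminals exist in a limited space, the user usually prefers to wake the nearer one. For example, when entering or leaving home, the user usually prefers to wake the terminal in the hallway or living room rather than the one in the bedroom. So when a voice collection terminal sends the control device a wake-up parameter that reflects the wake-up volume, the control device can judge whether to receive voice signals from that terminal.
Optionally, the wake-up parameter determined from the volume can be positively correlated with the volume: the louder the wake-up instruction, the larger the wake-up parameter; the quieter the wake-up instruction, the smaller the wake-up parameter.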
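One way to obtain a wake-up parameter that grows with volume is sketched below. The application does not say how volume is measured; this sketch assumes root-mean-square (RMS) amplitude over normalized samples in the range -1 to 1, and `wake_parameter` is a hypothetical name.

```python
import math
from typing import Sequence


def wake_parameter(samples: Sequence[float]) -> float:
    """Return the RMS amplitude of the wake-up audio as the wake-up parameter.

    Assumption: samples are normalized to [-1, 1]. A louder wake-up
    instruction yields a larger value, giving the positive correlation
    described in the text.
    """
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))
```

Any monotonically increasing function of measured loudness would satisfy the stated requirement; RMS is used here only because it is simple to compute.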
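In this case, as shown in FIG. 4, when the control device executes step S203 and determines from the wake-up parameter whether to establish a voice channel with a voice collection terminal, it can determine to establish the channel if the wake-up parameter sent by that terminal is greater than or equal to a first preset threshold. Optionally, if the wake-up parameter sent by a terminal is less than or equal to a second preset threshold, the control device can determine not to establish a voice channel with that terminal.
If the control device receives voice channel establishment requests from multiple voice collection terminals, each request contains the wake-up parameter that the corresponding terminal determined from the wake-up instruction it received. The control device can then take the terminal with the largest wake-up parameter as the target voice collection terminal and determine to establish a voice channel with it. The terminal with the largest wake-up parameter can be understood as the one closest to the user, and so can be regarded as the terminal the user expects to wake.
It should be noted that when executing step S203, the control device can consider only the wake-up parameter of a single voice collection terminal (judged against the first and/or second preset threshold), can consider only the ordering of the wake-up parameters of multiple terminals, or can combine both, for example by ranking the wake-up parameters that are greater than or equal to the first preset threshold.
It can be understood that the voice channel establishment request sent by the voice collection terminal to the control device also includes the terminal's identification information, so that the control device can identify which terminal a wake-up parameter belongs to and then accept or reject that terminal's request. Correspondingly, after determining whether to establish a voice channel with a terminal, the control device can return the decision to that terminal, as shown in FIG. 3.
If a terminal's voice channel establishment request is rejected by the control device, the terminal continues to monitor for a wake-up instruction in the standby state. If the request is accepted, the terminal establishes a voice channel with the control device and can proceed to step S105.
It can be understood that with this scheme of request (by the voice collection terminal) and establishment (by the control device), the control device can have only one voice channel at a time and accept voice signals from only one voice collection terminal, which avoids the misrecognition and false triggering that receiving multiple voice signal streams would cause.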
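The single-channel behaviour can be sketched as a small state holder. This is illustrative only: the class and method names are hypothetical, and `close_channel` is an assumption, since the text does not describe how or when a voice channel is torn down.

```python
from typing import Optional


class ControlDevice:
    """Holds at most one voice channel at a time."""

    def __init__(self) -> None:
        self.active_terminal: Optional[str] = None

    def open_channel(self, terminal_id: str) -> bool:
        """Accept the request only if no channel is currently established."""
        if self.active_terminal is not None:
            return False
        self.active_terminal = terminal_id
        return True

    def receive(self, terminal_id: str, voice_signal: bytes) -> Optional[bytes]:
        """Return the signal if it arrives on the active channel, else drop it."""
        if terminal_id != self.active_terminal:
            return None
        return voice_signal

    def close_channel(self) -> None:
        """Assumption: release the channel so another terminal can be served."""
        self.active_terminal = None
```

Dropping signals from terminals without an established channel is what prevents several copies of the same utterance from triggering duplicate responses.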
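S105: When a voice channel is established with the control device, collect a voice signal and send it to the control device, for the control device to perform voice control on the IoT terminal based on the voice signal.
It can be understood that the user issues a wake-up instruction to wake the voice collection terminal; if the terminal has established a voice channel with the control device, it can prompt the user to input a voice signal. After collecting the user's voice signal, the terminal can send it to the control device. Specifically, when sending the voice signal, the terminal can include it in voice control information and then send the voice control information to the control device.
Correspondingly, the control device can receive the voice signal once a voice channel has been established. After receiving the voice signal, the control device can recognize it so as to perform voice control on the IoT terminal based on the voice signal, as shown in FIG. 3.
Optionally, the voice control information sent by the voice collection terminal to the control device through the voice channel can also include the terminal's identification information. After the voice control information containing the voice signal has been sent to the control device, the control device can then return the voice control result, obtained by performing voice control on the IoT terminal based on the voice signal, to the same voice collection terminal, as shown in FIG. 4. After receiving the voice control result, the terminal can present it to the user so that the user knows the outcome of the voice control.
Optionally, the voice collection terminal can present the voice control result through at least one of a sound signal, a light signal, and a vibration signal. For example, if the voice signal input by the user is "Turn on the air conditioner in the living room", the terminal can feed the result back to the user with the sound signal "Living room air conditioner is turned on" or "Voice control succeeded". As another example, the terminal can indicate success with a green light ring and failure with a red light ring.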
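Correspondingly, as shown in FIG. 4, an embodiment of the present application further provides a voice control method applied to a control device. The method includes:
S201: Receive a voice channel establishment request from a voice collection terminal; the request includes identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction received by the voice collection terminal.
It can be understood that the voice channel establishment request that the control device receives in step S201 corresponds to the request that the voice collection terminal sends in step S103, and is not described again here.
S203: Determine, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal.
It can be understood that the voice collection terminal determines the wake-up parameter from the volume of the wake-up instruction, and the wake-up parameter is positively correlated with that volume. Therefore, when executing step S203, the control device can use at least one of the following to determine whether to establish a voice channel with the voice collection terminal:
if the wake-up parameter is greater than or equal to the first preset threshold, determine to establish a voice channel with the voice collection terminal;
if the wake-up parameter is less than or equal to the second preset threshold, determine not to establish a voice channel with the voice collection terminal;
according to the wake-up parameters, determine the voice collection terminal with the largest wake-up parameter among multiple voice collection terminals as the target voice collection terminal, and determine to establish a voice channel with the target voice collection terminal.
S205: When a voice channel is established with the voice collection terminal, receive a voice signal; the voice signal is collected by the voice collection terminal.
It can be understood that the voice signal that the control device receives in step S205 corresponds to the voice signal that the voice collection terminal collects and sends to the control device in step S105, and is not described again here.
S207: Perform voice control on the IoT terminal based on the voice signal.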
在执行步骤S207时,若控制设备具备对语音信号的识别能力,则可以直接在本地对语音信号进行识别,并基于识别结果对物联网终端进行语音控制。若控制设备不具备对语音信号的识别能力,则可将语音信号发送至服务端,供服务端对语音信号进行识别,参见图3所示。相对应的,服务端在接收到来自控制设备的语音信号后,可以对语音信号进行识别,并将识别结果返回控制设备,参见图3和图5所示。控制设备在接收到服务端返回的识别结果后,可进而基于识别结果对物联网终端进行语音控制。
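The choice between local and server-side recognition in step S207 can be sketched as a simple dispatch. The names `Recognizer` and `recognize_signal` are hypothetical, and recognizers are modelled as plain callables that turn a voice signal into a text result; a real system would involve a speech recognition engine and a network call.

```python
from typing import Callable, Optional

# A recognizer turns a raw voice signal into a recognition result (text here).
Recognizer = Callable[[bytes], str]


def recognize_signal(
    signal: bytes,
    local_recognizer: Optional[Recognizer],
    server_recognizer: Recognizer,
) -> str:
    """Recognize locally when the control device can, otherwise ask the server.

    `local_recognizer` is None when the control device has no recognition
    capability of its own; `server_recognizer` stands in for sending the
    signal to the server and waiting for the returned result.
    """
    if local_recognizer is not None:
        return local_recognizer(signal)
    return server_recognizer(signal)
```

Passing the recognizers in as arguments keeps the control logic independent of where recognition actually runs, which matches the description's treatment of the two cases as interchangeable.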
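Note that whether the control device recognizes the voice signal locally or sends it to the server for recognition, the voice signal may undergo natural language processing (NLP) and natural language understanding (NLU).
The purpose of natural language processing and natural language understanding is to convert human language (here, the voice signal input by the user) into a complete, structured semantic representation that a machine can understand. Specifically, natural language processing may include word segmentation, lexical analysis, syntactic analysis, semantic analysis and other stages.
Natural language processing and natural language understanding rest on various natural language processing datasets, for example a corpus training set (tc-corpus-train), Chinese and English news classification corpora for text classification research, a Chinese vector space model (VSM) in multi-dimensional ARFF (Attribute-Relation File Format) generated with feature-word selection methods such as IG and chi-square, a Chinese DBLP resource of ten thousand randomly sampled papers, a Chinese word segmentation lexicon for unsupervised Chinese word segmentation algorithms, UCI evaluation and ranking data, sentiment analysis datasets with initialization notes, and so on. Therefore, it is preferable to send the voice signal to the server for recognition, as shown in FIG. 3.
Optionally, when voice control is performed based on the voice signal, or more precisely based on the recognition result of the voice signal, text-to-speech (TTS) output may be produced so that the control device can give the user spoken feedback. Alternatively, a target IoT terminal and a target operation corresponding to the recognition result may be determined from the voice signal, and the target IoT terminal is then controlled to perform the target operation.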
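Mapping a recognition result to a target IoT terminal and a target operation can be sketched as a lookup table of handlers. The structured result format (a dictionary with `terminal` and `operation` keys), the class name `IoTController` and the identifiers used in the usage note are assumptions made for illustration; the source does not define the format of the recognition result.

```python
from typing import Callable, Dict, Tuple


class IoTController:
    """Dispatches a recognition result to the matching terminal operation."""

    def __init__(self) -> None:
        # (terminal name, operation name) -> action to execute
        self._handlers: Dict[Tuple[str, str], Callable[[], None]] = {}

    def register(self, terminal: str, operation: str,
                 handler: Callable[[], None]) -> None:
        """Declare that `terminal` supports `operation`."""
        self._handlers[(terminal, operation)] = handler

    def control(self, recognition_result: Dict[str, str]) -> bool:
        """Run the target operation; return True on success, False otherwise."""
        key = (recognition_result.get("terminal", ""),
               recognition_result.get("operation", ""))
        handler = self._handlers.get(key)
        if handler is None:
            return False  # unknown terminal or unsupported operation
        handler()
        return True
```

For the example utterance “turn on the living room air conditioner”, a recognizer might produce `{"terminal": "living_room_ac", "operation": "turn_on"}`; the boolean returned by `control` could then serve as the voice control result reported back to the voice collection terminal.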
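Optionally, after performing voice control on the IoT terminal based on the voice signal, the control device may further return a voice control result to the voice collection terminal for presentation to the user.
In the embodiments of this application, the voice collection terminal, upon receiving a wake-up instruction, sends a voice channel establishment request to the control device. Only after a voice channel has been established with the control device does it collect the voice signal and send it to the control device, which performs voice control on the IoT terminal based on the voice signal. Correspondingly, the control device receives the voice signal collected by a voice collection terminal only after establishing a voice channel with that terminal. The scheme therefore prevents the false triggering and false feedback that arise when a user's voice signal is collected by several voice collection terminals and all of them forward it to the control device, so that the voice control system responds accurately and reliably to the user's voice control signal and performs voice control on IoT terminals such as AI devices.
Correspondingly, referring to FIG. 5, an embodiment of this application further provides a voice control method applied to a server; specifically, the server may be a cloud server. The method includes:
S301: Receive a voice signal from a control device, the voice signal being collected by a voice collection terminal that has established a voice channel with the control device;
S303: Recognize the voice signal and return the recognition result to the control device, so that the control device performs voice control on an IoT terminal based on the recognition result.
When the server recognizes the voice signal in step S303, it likewise applies natural language processing (NLP) and natural language understanding (NLU) to the voice signal.
The voice control method performed by the server cooperates with the one performed by the control device to recognize the voice signal, so that the control device can respond correctly and promptly to the voice signal that the user inputs through the voice collection terminal, thereby achieving voice control of the IoT terminal. The descriptions in the foregoing embodiments all apply to the voice control method performed by the server and are not repeated here.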
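Referring to FIG. 6, an embodiment of this application further provides a voice collection terminal, including:
a listening module 101, configured to listen for a wake-up instruction;
a request sending module 103, configured to send, upon receiving the wake-up instruction, a voice channel establishment request to a control device, the request including the identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction;
a voice signal collecting and sending module 105, configured to collect, when a voice channel has been established with the control device, a voice signal and send it to the control device over the voice channel, so that the control device performs voice control on an IoT terminal based on the voice signal.
The voice collection terminal shown in FIG. 6 can implement each step of the voice control method described in FIG. 2; the foregoing descriptions of the voice collection terminal all apply here and are not repeated.
Referring to FIG. 7, an embodiment of this application further provides a control device, including:
a request receiving module 201, configured to receive a voice channel establishment request from a voice collection terminal, the request including the identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction received by the voice collection terminal;
a judging module 203, configured to determine, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal;
a first voice signal receiving module 205, configured to receive, when a voice channel has been established with the voice collection terminal, a voice signal collected by the voice collection terminal;
a voice control module 207, configured to perform voice control on an IoT terminal based on the voice signal.
The control device shown in FIG. 7 can implement each step of the voice control method described in FIG. 4; the foregoing descriptions of the control device all apply here and are not repeated.
Referring to FIG. 8, an embodiment of this application further provides a server, including:
a second voice signal receiving module 301, configured to receive a voice signal from a control device, the voice signal being collected by a voice collection terminal that has established a voice channel with the control device;
a voice signal recognition module 303, configured to recognize the voice signal and return the recognition result to the control device, so that the control device performs voice control on an IoT terminal based on the recognition result.
The server shown in FIG. 8 can implement each step of the voice control method described in FIG. 5; the foregoing descriptions of the server all apply here and are not repeated.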
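FIG. 9 is a schematic structural diagram of an electronic device according to an embodiment of this application. Referring to FIG. 9, at the hardware level the electronic device includes a processor and, optionally, an internal bus, a network interface and a memory. The memory may include internal memory, such as high-speed random-access memory (RAM), and may also include non-volatile memory, such as at least one disk storage device. The electronic device may of course also include hardware required by other services.
The processor, the network interface and the memory may be interconnected through the internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus or the like. The bus may be divided into an address bus, a data bus, a control bus and so on. For ease of illustration, FIG. 9 shows only one double-headed arrow, but this does not mean that there is only one bus or one type of bus.
The memory is used to store a program. Specifically, the program may include program code, and the program code includes computer operation instructions. The memory may include internal memory and non-volatile memory, and provides instructions and data to the processor.
The processor reads the corresponding computer program from the non-volatile memory into internal memory and runs it, forming a voice control apparatus at the logical level.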
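When the processor is used in a voice collection terminal, it executes the program stored in the memory and is specifically configured to perform the following operations:
listening for a wake-up instruction;
upon receiving the wake-up instruction, sending a voice channel establishment request to a control device, the voice channel establishment request including the identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction;
when a voice channel has been established with the control device, collecting a voice signal and sending the voice signal to the control device.
When the processor is used in a control device, it executes the program stored in the memory and is specifically configured to perform the following operations:
receiving a voice channel establishment request from a voice collection terminal, the voice channel establishment request including the identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction received by the voice collection terminal;
determining, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal;
when a voice channel has been established with the voice collection terminal, receiving a voice signal collected by the voice collection terminal.
When the processor is used in a server, it executes the program stored in the memory and is specifically configured to perform the following operations:
receiving a voice signal from a control device, the voice signal being collected by a voice collection terminal that has established a voice channel with the control device;
recognizing the voice signal and returning the recognition result to the control device, so that the control device performs voice control on an IoT terminal based on the recognition result.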
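The method performed by the voice control apparatus disclosed in the foregoing embodiments of this application may be applied to, or implemented by, a processor. The processor may be an integrated circuit chip with signal processing capability. During implementation, each step of the method may be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP) and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps and logical block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of this application may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may reside in a storage medium that is mature in the art, such as random-access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory or a register. The storage medium is located in the memory; the processor reads the information in the memory and completes the steps of the method in combination with its hardware.
The electronic device may also perform the method performed by the corresponding voice control apparatus described above and implement the functions of the voice control apparatus in the corresponding embodiments, which are not repeated here.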
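An embodiment of this application further provides a computer-readable storage medium storing one or more programs, the one or more programs including instructions that, when executed by an electronic device that includes multiple application programs, cause the electronic device to perform the method performed by the voice control apparatus in the embodiment shown in FIG. 2, and specifically to perform:
listening for a wake-up instruction;
upon receiving the wake-up instruction, sending a voice channel establishment request to a control device, the voice channel establishment request including the identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction;
when a voice channel has been established with the control device, collecting a voice signal and sending the voice signal to the control device.
An embodiment of this application further provides a computer-readable storage medium storing one or more programs, the one or more programs including instructions that, when executed by an electronic device that includes multiple application programs, cause the electronic device to perform the method performed by the voice control apparatus in the embodiment shown in FIG. 4, and specifically to perform:
receiving a voice channel establishment request from a voice collection terminal, the voice channel establishment request including the identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction received by the voice collection terminal;
determining, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal;
when a voice channel has been established with the voice collection terminal, receiving a voice signal collected by the voice collection terminal.
An embodiment of this application further provides a computer-readable storage medium storing one or more programs, the one or more programs including instructions that, when executed by an electronic device that includes multiple application programs, cause the electronic device to perform the method performed by the voice control apparatus in the embodiment shown in FIG. 5, and specifically to perform:
receiving a voice signal from a control device, the voice signal being collected by a voice collection terminal that has established a voice channel with the control device;
recognizing the voice signal and returning the recognition result to the control device, so that the control device performs voice control on an IoT terminal based on the recognition result.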
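Those skilled in the art will appreciate that embodiments of the present invention may be provided as a method, a system or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
In a typical configuration, a computing device includes one or more processors (CPUs), an input/output interface, a network interface and memory.
The memory may include non-permanent memory in a computer-readable medium, random-access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random-access memory (SRAM), dynamic random-access memory (DRAM), other types of random-access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms “include”, “comprise” and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element qualified by the phrase “including a ...” does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
Those skilled in the art will appreciate that embodiments of this application may be provided as a method, a system or a computer program product. Accordingly, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM and optical storage) containing computer-usable program code.
The foregoing is merely an embodiment of this application and is not intended to limit it. Various modifications and changes to this application will be apparent to those skilled in the art. Any modification, equivalent replacement or improvement made within the spirit and principles of this application shall fall within the scope of the claims of this application.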

Claims (29)

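  1. A voice control method, applied to a voice collection terminal, wherein the method comprises:
    listening for a wake-up instruction;
    upon receiving the wake-up instruction, sending a voice channel establishment request to a control device, the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction;
    when a voice channel has been established with the control device, collecting a voice signal and sending the voice signal to the control device, so that the control device performs voice control on an IoT terminal based on the voice signal.
  2. The method according to claim 1, wherein, before sending the voice channel establishment request to the control device, the method further comprises:
    upon receiving the wake-up instruction, determining the wake-up parameter according to the volume of the wake-up instruction.
  3. The method according to claim 2, wherein the wake-up parameter is positively correlated with the volume of the wake-up instruction.
  4. The method according to claim 1, wherein sending the voice signal to the control device comprises:
    including the voice signal in voice control information, and sending the voice control information to the control device.
  5. The method according to claim 4, wherein the voice control information further contains the identification information of the voice collection terminal, and, after sending the voice signal to the control device, the method further comprises:
    receiving a voice control result returned by the control device, the voice control result being obtained by the control device performing voice control on the IoT terminal based on the voice signal;
    presenting the voice control result.
  6. The method according to claim 5, wherein presenting the voice control result specifically comprises:
    presenting the voice control result through at least one of a sound signal, a light signal and a vibration signal.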
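  7. A voice control method, applied to a control device, wherein the method comprises:
    receiving a voice channel establishment request from a voice collection terminal, the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to a wake-up instruction received by the voice collection terminal;
    determining, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal;
    when a voice channel has been established with the voice collection terminal, receiving a voice signal, the voice signal being collected by the voice collection terminal;
    performing voice control on an IoT terminal based on the voice signal.
  8. The method according to claim 7, wherein performing voice control on the IoT terminal based on the voice signal comprises:
    recognizing the voice signal, and performing voice control on the IoT terminal based on the recognition result.
  9. The method according to claim 8, wherein recognizing the voice signal comprises:
    applying natural language processing (NLP) and natural language understanding (NLU) to the voice signal.
  10. The method according to claim 7, wherein performing voice control on the IoT terminal based on the voice signal comprises:
    sending the voice signal to a server, so that the server recognizes the voice signal;
    receiving the recognition result returned by the server, and performing voice control on the IoT terminal based on the recognition result.
  11. The method according to claim 7, wherein, after performing voice control on the IoT terminal based on the voice signal, the method further comprises:
    returning a voice control result to the voice collection terminal for presentation by the voice collection terminal.
  12. The method according to claim 7, wherein performing voice control on the IoT terminal based on the voice signal comprises at least one of the following:
    producing text-to-speech output based on the voice signal;
    controlling, based on the voice signal, a target IoT terminal to perform a target operation, the target IoT terminal and the target operation corresponding to the recognition result of the voice signal.
  13. The method according to claim 7, wherein the wake-up parameter is positively correlated with the volume of the wake-up instruction, and determining, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal comprises at least one of the following:
    if the wake-up parameter is greater than or equal to a first preset threshold, determining to establish a voice channel with the voice collection terminal;
    if the wake-up parameter is less than or equal to a second preset threshold, determining not to establish a voice channel with the voice collection terminal.
  14. The method according to claim 7, wherein the wake-up parameter is positively correlated with the volume of the wake-up instruction;
    receiving the voice channel establishment request from the voice collection terminal is specifically:
    receiving voice channel establishment requests from multiple voice collection terminals;
    and determining, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal comprises:
    determining, according to the wake-up parameters, the voice collection terminal with the largest wake-up parameter among the multiple voice collection terminals as a target voice collection terminal;
    determining to establish a voice channel with the target voice collection terminal.
  15. The method according to any one of claims 7 to 14, wherein the control device is any one of multiple IoT terminals.
  16. The method according to claim 15, wherein the method further comprises:
    when a preset condition is met, determining any one of the multiple IoT terminals other than the control device as a new control device.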
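  17. A voice control method, applied to a server, wherein the method comprises:
    receiving a voice signal from a control device, the voice signal being collected by a voice collection terminal that has established a voice channel with the control device;
    recognizing the voice signal and returning the recognition result to the control device, so that the control device performs voice control on an IoT terminal based on the recognition result.
  18. The method according to claim 17, wherein recognizing the voice signal comprises:
    applying natural language processing (NLP) and natural language understanding (NLU) to the voice signal.
  19. A voice control system, wherein the voice control system comprises:
    a control device, configured to receive a voice channel establishment request from a voice collection terminal, the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to a wake-up instruction received by the voice collection terminal; further configured to determine, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal; further configured to receive, when a voice channel has been established with the voice collection terminal, a voice signal collected by the voice collection terminal; and further configured to perform voice control on an IoT terminal based on the voice signal;
    a voice collection terminal, configured to listen for a wake-up instruction; further configured to send, upon receiving the wake-up instruction, a voice channel establishment request to the control device, the voice channel establishment request including the identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction; and further configured to collect, when a voice channel has been established with the control device, a voice signal and send the voice signal to the control device, so that the control device performs voice control on the IoT terminal based on the voice signal.
  20. The system according to claim 19, wherein the voice control system further comprises:
    a server, configured to receive a voice signal from the control device, the voice signal being collected by a voice collection terminal that has established a voice channel with the control device; and further configured to recognize the voice signal and return the recognition result to the control device, so that the control device performs voice control on the IoT terminal based on the recognition result.
  21. A voice collection terminal, wherein the voice collection terminal is configured to perform the method according to any one of claims 1 to 6.
  22. A control device, wherein the control device is configured to perform the method according to any one of claims 7 to 16.
  23. A server, wherein the server is configured to perform the method according to claim 17 or 18.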
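  24. An electronic device, applied to a voice collection terminal, comprising:
    a processor; and
    a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform the following operations:
    listening for a wake-up instruction;
    upon receiving the wake-up instruction, sending a voice channel establishment request to a control device, the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction;
    when a voice channel has been established with the control device, collecting a voice signal and sending the voice signal to the control device, so that the control device performs voice control on an IoT terminal based on the voice signal.
  25. A computer-readable storage medium, wherein the computer-readable storage medium stores one or more programs that, when executed by an electronic device that includes multiple application programs, cause the electronic device to perform the following operations:
    listening for a wake-up instruction;
    upon receiving the wake-up instruction, sending a voice channel establishment request to a control device, the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to the wake-up instruction;
    when a voice channel has been established with the control device, collecting a voice signal and sending the voice signal to the control device, so that the control device performs voice control on an IoT terminal based on the voice signal.
  26. An electronic device, applied to a control device, comprising:
    a processor; and
    a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform the following operations:
    receiving a voice channel establishment request from a voice collection terminal, the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to a wake-up instruction received by the voice collection terminal;
    determining, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal;
    when a voice channel has been established with the voice collection terminal, receiving a voice signal, the voice signal being collected by the voice collection terminal;
    performing voice control on an IoT terminal based on the voice signal.
  27. A computer-readable storage medium, wherein the computer-readable storage medium stores one or more programs that, when executed by an electronic device that includes multiple application programs, cause the electronic device to perform the following operations:
    receiving a voice channel establishment request from a voice collection terminal, the voice channel establishment request including identification information of the voice collection terminal and a wake-up parameter corresponding to a wake-up instruction received by the voice collection terminal;
    determining, according to the wake-up parameter, whether to establish a voice channel with the voice collection terminal;
    when a voice channel has been established with the voice collection terminal, receiving a voice signal, the voice signal being collected by the voice collection terminal;
    performing voice control on an IoT terminal based on the voice signal.
  28. An electronic device, applied to a server, comprising:
    a processor; and
    a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform the following operations:
    receiving a voice signal from a control device, the voice signal being collected by a voice collection terminal that has established a voice channel with the control device;
    recognizing the voice signal and returning the recognition result to the control device, so that the control device performs voice control on an IoT terminal based on the recognition result.
  29. A computer-readable storage medium, wherein the computer-readable storage medium stores one or more programs that, when executed by an electronic device that includes multiple application programs, cause the electronic device to perform the following operations:
    receiving a voice signal from a control device, the voice signal being collected by a voice collection terminal that has established a voice channel with the control device;
    recognizing the voice signal and returning the recognition result to the control device, so that the control device performs voice control on an IoT terminal based on the recognition result.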
PCT/CN2019/101913 2018-08-29 2019-08-22 Voice control method, apparatus and system WO2020042993A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810997304.8 2018-08-29
CN201810997304.8A CN110875041A (zh) 2018-08-29 2018-08-29 Voice control method, apparatus and system

Publications (1)

Publication Number Publication Date
WO2020042993A1 true WO2020042993A1 (zh) 2020-03-05

Family

ID=69643935

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/101913 WO2020042993A1 (zh) 2018-08-29 2019-08-22 Voice control method, apparatus and system

Country Status (2)

Country Link
CN (1) CN110875041A (zh)
WO (1) WO2020042993A1 (zh)

Also Published As

Publication number Publication date
CN110875041A (zh) 2020-03-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19853304

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19853304

Country of ref document: EP

Kind code of ref document: A1