CN112420043A - Intelligent awakening method and device based on voice, electronic equipment and storage medium - Google Patents

Intelligent awakening method and device based on voice, electronic equipment and storage medium

Info

Publication number
CN112420043A
Authority
CN
China
Prior art keywords
voice
intelligent
information
intelligent voice
equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011408983.4A
Other languages
Chinese (zh)
Inventor
何海亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Oribo Technology Co Ltd
Original Assignee
Shenzhen Oribo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Oribo Technology Co Ltd filed Critical Shenzhen Oribo Technology Co Ltd
Priority to CN202011408983.4A priority Critical patent/CN112420043A/en
Publication of CN112420043A publication Critical patent/CN112420043A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/28 Constructional details of speech recognition systems
    • G10L 15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 12/00 Data switching networks
    • H04L 12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L 12/2803 Home automation networks
    • H04L 12/2816 Controlling appliance services of a home automation network by calling their functionalities
    • H04L 12/282 Controlling appliance services of a home automation network by calling their functionalities based on user interaction within the home
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 2015/223 Execution procedure of a spoken command
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 12/00 Data switching networks
    • H04L 12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L 12/2803 Home automation networks
    • H04L 2012/284 Home automation networks characterised by the type of medium used
    • H04L 2012/2841 Wireless

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Automation & Control Theory (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Telephone Function (AREA)

Abstract

The embodiments of the present application disclose a voice-based intelligent wake-up method and apparatus, an electronic device, and a storage medium, and relate to the field of speech recognition. The method is applied to a first intelligent voice device in a local area network and includes: receiving voice information input by a user; acquiring voice parameters of the voice information; and when the voice parameters meet a preset condition, responding to the voice information and sending a stop response instruction to a second intelligent voice device in the local area network, where the stop response instruction is used to control the second intelligent voice device not to respond to the voice information when it receives the voice information. According to the embodiments of the present application, when the voice parameters of the voice information meet the preset condition, the first intelligent voice device responds to the voice information and sends the stop response instruction to the second intelligent voice device in the local area network, which solves the problem of multiple intelligent voice devices responding to the same voice information at the same time.

Description

Intelligent awakening method and device based on voice, electronic equipment and storage medium
Technical Field
The present application relates to the field of speech recognition technologies, and in particular, to a speech-based intelligent wake-up method and apparatus, an electronic device, and a storage medium.
Background
With the progress of science and technology, smart homes have become part of people's daily lives, and more and more users choose smart home products to improve their quality of life. A user can control smart home devices by voice interaction with an intelligent voice device. However, when there are multiple intelligent voice devices in the user's environment, two or more of them may be woken up simultaneously by the voice input by the user, leaving the user confused about which intelligent voice device to interact with and degrading the interaction experience. Therefore, how to optimize the voice interaction process of intelligent voice devices and prevent multiple intelligent voice devices from being woken up at the same time is an urgent problem to be solved.
Disclosure of Invention
In view of the foregoing, the present application provides a voice-based smart wake-up method, apparatus, electronic device, and storage medium.
In a first aspect, an embodiment of the present application provides a voice-based intelligent wake-up method, which is applied to a first intelligent voice device in a local area network, and the method includes: receiving voice information input by a user; acquiring voice parameters of the voice information; and when the voice parameters meet preset conditions, responding to the voice information, and sending a response stopping instruction to a second intelligent voice device in the local area network, wherein the response stopping instruction is used for controlling the second intelligent voice device not to respond to the voice information when receiving the voice information.
Further, the sending a stop response instruction to the second intelligent voice device in the local area network includes: and sending a stop response instruction to a server in the local area network, so that the server sends a signal with the stop response instruction to the second intelligent voice device after receiving the stop response instruction.
Further, the responding the voice message and sending a response stopping instruction to the second intelligent voice device in the local area network includes: recognizing the voice information to acquire a control command corresponding to the voice information; sending the stop response instruction to the second intelligent voice equipment in the local area network; monitoring whether feedback information sent by the second intelligent voice device is acquired within preset time, wherein the feedback information is information sent by the second intelligent voice device after the second intelligent voice device receives the response stopping instruction; and if the feedback information is acquired, executing the control command.
Further, before the sending the stop response instruction to the second smart voice device, the method further comprises: acquiring resources required for executing the control command; judging whether the state of the resource of the first intelligent voice equipment is an idle state or not; if yes, sending the stop response instruction to the second intelligent voice device; if not, the control command is sent to the second intelligent voice device, so that the second intelligent voice device executes the control command.
In a second aspect, an embodiment of the present application provides a voice-based intelligent wake-up method, which is applied to a server, and the method includes: receiving, from a first intelligent voice device, information indicating that the first intelligent voice device is responding to voice information input by a user, together with a stop response instruction; and when request information sent by a second intelligent voice device requesting to respond to the voice information is received, sending a signal carrying the stop response instruction to the second intelligent voice device according to the stop response instruction, so that the second intelligent voice device does not respond to the voice information after receiving the signal.
Further, after the receiving the information that the first intelligent voice device responds to the voice information input by the user and the stop response instruction, which are sent by the first intelligent voice device, the method further includes: acquiring intelligent voice equipment awakened by the user by intention selected from a plurality of intelligent voice equipment in the local area network; if the first intelligent voice equipment is not the intelligent voice equipment which is intended to be awakened by the user, marking the voice information as false awakening data; and updating the preset condition of the first intelligent voice device according to the false awakening data, wherein the preset condition is used for responding to the voice information by the first intelligent voice device when the voice parameters of the voice information accord with the preset condition.
In a third aspect, an embodiment of the present application provides an intelligent wake-up apparatus based on voice, which is applied to a first intelligent voice device in a local area network, and the apparatus includes: the voice receiving module is used for receiving voice information input by a user; the parameter acquisition module is used for acquiring the voice parameters of the voice information; and the first processing module is used for responding to the voice information and sending a response stopping instruction to the second intelligent voice equipment in the local area network when the voice parameters meet preset conditions, wherein the response stopping instruction is used for controlling the second intelligent voice equipment not to respond to the voice information when receiving the voice information.
In a fourth aspect, an embodiment of the present application provides an intelligent wake-up apparatus based on voice, which is applied to a server, and the apparatus includes: the instruction receiving module is used for receiving information which is sent by first intelligent voice equipment and used for responding to voice information input by a user by the first intelligent voice equipment and a response stopping instruction; and the second processing module is used for sending a signal with a response stopping instruction to the second intelligent voice equipment according to the response stopping instruction when receiving request information sent by the second intelligent voice equipment for responding to the voice information, so that the second intelligent voice equipment does not respond to the voice information after receiving the signal.
In a fifth aspect, the present application provides an electronic device, comprising: one or more processors; a memory; one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications being configured to perform the method of the first aspect.
In a sixth aspect, the present application provides a computer-readable storage medium having program code stored therein, the program code being invoked by a processor to perform the method of the first aspect.
The embodiments of the present application disclose a voice-based intelligent wake-up method and apparatus, an electronic device, and a storage medium, and relate to the field of speech recognition. The method includes: receiving voice information input by a user; acquiring voice parameters of the voice information; and when the voice parameters meet a preset condition, responding to the voice information and sending a stop response instruction to a second intelligent voice device in the local area network, where the stop response instruction is used to control the second intelligent voice device not to respond to the voice information when it receives the voice information. In the embodiments of the present application, when the voice parameters of the voice information meet the preset condition, the first intelligent voice device responds to the voice information and sends the stop response instruction to the second intelligent voice device in the local area network, so that multiple devices in the local area network can be prevented from responding to the user's voice information at the same time.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 shows a schematic diagram of an application environment suitable for the embodiment of the present application.
Fig. 2 shows a flowchart of a method for a voice-based smart wake-up method according to an embodiment of the present application.
Fig. 3 is a flowchart illustrating a method of a voice-based smart wake-up method according to another embodiment of the present application.
Fig. 4 is a flowchart illustrating a method for a voice-based smart wake-up method according to another embodiment of the present application.
Fig. 5 shows a block diagram of a voice-based smart wake-up apparatus according to an embodiment of the present application.
Fig. 6 shows a block diagram of another voice-based smart wake-up apparatus according to an embodiment of the present application.
Fig. 7 shows a block diagram of an electronic device for performing a voice-based smart wake-up method according to an embodiment of the present application.
Fig. 8 illustrates a storage unit for storing or carrying program codes for implementing the voice-based smart wake-up method according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
A smart home uses the home as a platform and integrates home-related facilities by means of integrated wiring, network communication, security, automatic control, and multimedia information technologies, building an efficient management system for home facilities and household affairs that improves the safety, convenience, comfort, and aesthetics of the home and realizes an environmentally friendly and energy-saving living environment. With the popularization of smart homes and the development of speech recognition technology, many smart home devices have a voice interaction function: a user can control smart home devices through them, and by speaking a preset wake-up word the user can wake a device up and then control it with voice instructions. However, intelligent voice devices from the same manufacturer often share the same wake-up word. When several such devices are present in the user's environment, the sound of the user speaking the preset wake-up word may be detected by all of them, so that two or more intelligent voice devices are woken up at the same time. The user then does not know which intelligent voice device to interact with, which affects the user experience.
To solve the above problems, the inventors, after long study, propose the voice-based intelligent wake-up method, apparatus, electronic device, and storage medium of the embodiments of the present application. In the embodiments of the present application, a first intelligent voice device in a local area network receives voice information input by a user, acquires voice parameters of the voice information, and, when the voice parameters meet a preset condition, responds to the voice information and sends a stop response instruction to a second intelligent voice device in the local area network, where the stop response instruction is used to control the second intelligent voice device not to respond to the voice information when it receives the voice information. In this way, the problem of multiple intelligent voice devices being woken up and responding to the user's voice information at the same time is solved, the voice instruction input by the user is prevented from being executed repeatedly, and the user experience is improved.
In order to better understand the voice-based intelligent wake-up method, apparatus, electronic device, and storage medium provided in the embodiments of the present application, an application environment suitable for the embodiments of the present application is described below.
Referring to fig. 1, fig. 1 is a schematic diagram illustrating an application environment suitable for the embodiment of the present application.
The voice-based intelligent awakening method provided by the embodiment of the application can be applied to a multi-device scenario shown in fig. 1, where the multi-device scenario includes a first intelligent voice device 101, a second intelligent voice device 102 and a server 103, the first intelligent voice device 101 and the second intelligent voice device 102 are located in the same local area network and are in communication connection with the server 103, respectively, where the server 103 may be an individual server, a server cluster, a local server, or a cloud server, and is not limited specifically herein.
The first smart voice device 101 and the second smart voice device 102 may be networked in various manners, for example, Wireless Fidelity (WIFI), ZigBee, bluetooth, hotspot, and the like, and may be networked in different manners according to application scenarios of different smart voice devices. Alternatively, first smart voice device 101 and second smart voice device 102 may communicate directly or through a server.
First smart voice device 101 and second smart voice device 102 may be various electronic devices with voice interaction means, including but not limited to smart home devices, smart gateways, smart stereos, smart phones, tablets, laptop portable computers, desktop computers, wearable electronic devices, and the like. Specifically, the first smart voice device 101 and the second smart voice device 102 may include a voice input module such as a microphone, a voice output module such as a speaker, and a processor. The voice interaction device may be built into the apparatus or may be an independent module, which communicates with the apparatus through an API or other means.
Wherein the first smart voice device 101 and the second smart voice device 102 are electronic devices having the same wake-up word. Alternatively, the first smart voice device 101 and the second smart voice device 102 may be the same electronic device or different electronic devices. For example, the first smart voice device 101 and the second smart voice device 102 may be smart home control panels with the same wake-up word, or may be smart home control panels and smart televisions with the same wake-up word.
As a mode, the first smart voice device 101 and the second smart voice device 102 may be connected to at least one controlled device. For example, the first smart voice device 101 and the second smart voice device 102 may be control devices such as a smart speaker, a smart gateway, or a smart home control panel, and the controlled devices may include, but are not limited to, smart home devices such as an air conditioner, a floor heating device, a fresh air system, a curtain, a lamp, a television, a refrigerator, and an electric fan; the smart voice devices and the smart home devices may be connected through Bluetooth, WIFI, or ZigBee. The type of smart voice device is not specifically limited herein.
In some embodiments, the first intelligent voice device may be the device with the highest priority for responding to voice information within the local area network. Optionally, the first intelligent voice device may be any intelligent voice device in the local area network designated by the user, or an intelligent voice device selected according to a preset rule in the local area network.
As one mode, the preset rule may be to select, from the plurality of intelligent voice devices, the device with the best processing capability as determined from its information parameters, and use it as the first intelligent voice device, where the information parameters may include at least one of the model of the intelligent voice device, the software version of the intelligent voice device, and the quality of the network to which the intelligent voice device is connected. It can be understood that the newer the model or software version of an intelligent voice device and the better the quality of its network connection, the better its processing performance. For example, the intelligent voice device with the best network quality may be set as the first intelligent voice device, and the other intelligent voice devices may be set as second intelligent voice devices.
Alternatively, the preset rule may be to determine, according to multi-modal information of the user, the device the user intends to interact with among the plurality of smart voice devices and use it as the first smart voice device. For example, an image acquisition device may be configured on a smart voice device to capture images such as the user's motions or gestures and determine whether the device is being looked at or pointed at; when the smart voice device is gazed at by the user or pointed at by a gesture, it is regarded as the first smart voice device.
As yet another way, the preset rule may be to treat the device in the active state as the first smart voice device. The active state is used for representing that the intelligent voice equipment is in a state of playing videos, music and the like.
In some embodiments, the priorities of the first smart voice device and the second smart voice device for responding to the voice message are the same, and the second smart voice device may be a smart voice device other than the first smart voice device in the local area network, that is, the method applied to the first smart voice device in the embodiment of the present application may also be applied to the second smart voice device. In other embodiments, the second smart voice device may respond to voice information with a lower priority than the first smart voice device.
In some embodiments, the server 103 may analyze the voice information received by the first and second intelligent voice devices 101 and 102 by using an Automatic Speech Recognition (ASR) technology to determine a control command corresponding to the voice information, and return the control command to the sending device to execute the control command, so as to provide a service for the user.
In other embodiments, the first smart voice device 101 and the second smart voice device 102 may be respectively provided with a device for processing information input by a user, so that the first smart voice device 101 and the second smart voice device 102 can realize interaction with the user without relying on establishing communication with the server 103, and at this time, the multi-device scenario may only include the first smart voice device 101 and the second smart voice device 102.
It can be understood that fig. 1 is only an example of a multi-device scenario, and the embodiment of the present application does not limit the number of the intelligent voice devices in the multi-device scenario, and does not limit the wakeup words preset in the intelligent voice devices. The above application environments are only examples for facilitating understanding, and it is to be understood that the embodiments of the present application are not limited to the above application environments.
Embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Referring to fig. 2, a flowchart of a method for a voice-based intelligent wake-up method according to an embodiment of the present application is applied to a first intelligent voice device in a local area network, and the method includes steps S210 to S240.
S210: and receiving voice information input by a user.
When the voice wake-up function of the first intelligent voice device is enabled, the device can collect the sound of the surrounding environment in real time through a sound collection module such as a microphone. When the user speaks within the pickup range of the first intelligent voice device, the device receives the voice information input by the user.
In some embodiments, the state of the first smart voice device receiving the voice information input by the user may be an offline state, where the offline state refers to the first smart voice device being in a semi-dormant state in which offline voice recognition may be performed in the background.
In other embodiments, the state of the first intelligent voice device receiving the voice information input by the user may be an awake state, where the awake state refers to that the first intelligent voice device is in a standby operating state, and the control command corresponding to the voice information may be executed at any time according to the voice information of the user.
S220: and acquiring voice parameters of the voice information.
Optionally, after the voice information is obtained, a certain preprocessing operation may be performed on the voice information, and then the voice parameters of the preprocessed voice information are obtained. The preprocessing operation may include noise suppression processing, echo cancellation processing, signal enhancement processing, and the like, and more accurate speech parameters may be obtained through the preprocessing operation.
As one way, the voice parameter may be the acoustic feature similarity between the voice information and a preset wake-up word. Specifically, the similarity between the voice information and the preset wake-up word may be calculated by a wake-up word detection model that is pre-trained on a large amount of audio data; the input of the model may be the voice information, and the output may be the acoustic feature similarity between the voice information and the preset wake-up word. The acoustic feature similarity can be used to characterize the wake-up confidence. Optionally, the output of the wake-up word detection model may instead be a wake-up flag, i.e. a flag that allows or prohibits the first smart voice device from being woken up, and this flag may be used as the voice parameter of the voice information.
Alternatively, the voice parameter may be an energy value of the voice information, which can be used to characterize the signal-to-noise ratio of the voice information received by the first intelligent voice device. Specifically, when the acoustic similarity between the voice information and the preset wake-up word is greater than a preset similarity threshold, the voice information can be considered to contain the preset wake-up word, and the energy value of the voice information can be used as the voice parameter. Optionally, the energy value of the wake-up word segment of the voice information may be obtained as the voice parameter; it characterizes the amount of voice energy received during the time period in which the wake-up word occurs. The energy value may be at least one of the direct-path audio signal ratio of the microphone, the sound intensity, and the audio signal intensity. Illustratively, the direct-path audio signal intensity differs from the audio signal intensity of the voice information as a whole, so it can characterize the signal-to-noise ratio of the voice information more accurately.
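The patent leaves the concrete definition of the energy value open. As a rough, non-authoritative illustration, the following Python sketch computes the RMS energy of the wake-up word segment and uses it as the voice parameter; the segment boundaries and the 16 kHz sample rate are assumptions, not details taken from the patent.
```python
import math

def frame_energy(samples):
    """Root-mean-square energy of a window of PCM samples (floats in [-1, 1])."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def wake_word_energy(samples, start_s, end_s, sample_rate=16000):
    """Energy of the segment (in seconds) in which the wake-up word was detected.

    The segment boundaries would come from the wake-up word detector; the
    patent does not specify how they are obtained, so they are parameters here.
    """
    lo = int(start_s * sample_rate)
    hi = int(end_s * sample_rate)
    return frame_energy(samples[lo:hi])
```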
S230: and when the voice parameters meet the preset conditions, responding to the voice information and sending a response stopping instruction to the second intelligent voice equipment in the local area network.
The preset condition is a condition corresponding to the voice parameter, and may be used to represent the wakeup confidence, that is, how likely the first intelligent voice device is to respond to the voice information, and the preset condition may be a different condition according to a different voice parameter in step S220.
In some embodiments, when the voice parameter is an acoustic feature similarity between the voice information and a preset wake-up word, the preset condition may be a preset similarity threshold. Specifically, when the acoustic similarity between the voice information and the preset awakening word is greater than the preset similarity threshold, the voice information can be considered to include the preset awakening word, and it can be determined that the voice parameter meets the preset condition. Optionally, when the voice parameter is a wake-up identifier, an identifier that allows the first smart voice device to be woken up may be used as the preset condition.
In other embodiments, when the voice parameter is an energy value of the voice information, the preset condition may include a preset similarity threshold and a preset energy threshold. When the acoustic similarity between the voice information and the preset wake-up word is greater than the preset similarity threshold, it can be further judged whether the energy value of the voice information is greater than the preset energy threshold; if so, the voice parameter is judged to meet the preset condition. Making this further judgment after the acoustic feature similarity check allows a more accurate decision on whether to respond. It can be understood that, in general, the smaller the distance between the first intelligent voice device and the user who utters the voice information, the larger the energy value of the voice information, so using the preset energy threshold as the preset condition realizes nearby wake-up of the first intelligent voice device, i.e. the voice parameter is judged to meet the preset condition when the first intelligent voice device is the one closest to the user.
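A minimal Python sketch of this two-stage check follows; the threshold values are purely illustrative, since the patent does not fix them.
```python
SIMILARITY_THRESHOLD = 0.8   # illustrative values; the patent does not specify them
ENERGY_THRESHOLD = 0.05

def meets_preset_condition(similarity, energy,
                           similarity_threshold=SIMILARITY_THRESHOLD,
                           energy_threshold=ENERGY_THRESHOLD):
    """Two-stage preset condition: the wake-up word must be detected
    (similarity above the threshold) and the energy value must exceed the
    preset energy threshold, approximating 'wake up the nearest device'."""
    if similarity <= similarity_threshold:
        return False              # wake-up word not detected
    return energy > energy_threshold
```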
Optionally, the preset conditions of different intelligent voice devices in the local area network may be the same or different.
In some embodiments, different smart voice devices within the local area network may have the same preset conditions. As one mode, the preset condition may be a condition preset by each smart voice device at the time of factory shipment. Alternatively, the preset condition may be a condition determined according to default wake-up data. Specifically, the default wake-up data may be voice information that each intelligent voice device in the local area network responds when being used alone, and the preset condition of each intelligent voice device may be determined according to a voice parameter of the default wake-up data. For example, an average value of default wake-up data or a value near a boundary may be taken as a preset condition.
In other embodiments, different intelligent voice devices in the local area network may have different preset conditions, with each device having a preset condition corresponding to it. Specifically, the intelligent voice devices may support a server-arbitration mode: the plurality of intelligent voice devices in the local area network report the acquired voice information to the server, and the server decides, according to the voice parameters of the reported voice information, which target intelligent voice device should respond, the target device being the one closest to the user who uttered the voice information, thereby realizing nearby wake-up. In this server-arbitration mode, the historical voice information acquired by each intelligent voice device within a preset period can be collected, and the server takes the historical voice information for which the device was the target intelligent voice device as sample speech, so that the preset condition of each intelligent voice device is determined from the voice parameters of its sample speech.
As one mode, the awakening prediction model may be trained according to the speech parameters of the sample speech, and based on the awakening prediction model, the value range of the speech parameter corresponding to the preset confidence level is obtained, and the value range is used as the preset condition. It can be understood that the smaller the value range of the speech parameter, the higher the accuracy of the wake-up prediction model. By the mode, each intelligent voice device can determine the preset condition according to the voice parameter of the historical voice information which is responded by the device in the actual application scene, so that whether the voice information is responded or not can be determined more accurately.
For example, the direct-path audio signal ratio of the microphone may be used as the voice parameter. The direct-path audio signal ratios of the device's sample speech over the last 30 days may be used as training sample data, an acceptance ratio range corresponding to a preset confidence level may be obtained through a statistical or clustering method such as a Poisson fit, and this acceptance ratio range may be used as the preset condition of the intelligent voice device. When the voice parameter of new voice information is acquired, it is judged whether the parameter falls within the acceptance ratio range; if so, the voice parameter is judged to meet the preset condition.
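The statistical fit is not specified beyond the example above. The sketch below derives a value range from historical sample speech with a simple central percentile interval, which stands in for whatever distribution fit or clustering method is actually used; it is an assumption, not the patent's algorithm.
```python
def acceptance_ratio_range(sample_ratios, confidence=0.95):
    """Derive the preset value range of the direct-path audio signal ratio
    from the device's own sample speech (e.g. the last 30 days).

    A central percentile interval is used here purely for illustration; the
    embodiment mentions a statistical fit such as a Poisson-style model.
    """
    ratios = sorted(sample_ratios)
    if not ratios:
        raise ValueError("no sample data")
    tail = (1.0 - confidence) / 2.0
    lo = ratios[int(tail * (len(ratios) - 1))]
    hi = ratios[int((1.0 - tail) * (len(ratios) - 1))]
    return lo, hi

def in_preset_range(ratio, value_range):
    """Check whether a newly measured ratio falls inside the preset range."""
    lo, hi = value_range
    return lo <= ratio <= hi
```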
When the voice parameters meet the preset condition, the first intelligent voice device responds to the voice information and sends the stop response instruction to the second intelligent voice device in the local area network. Optionally, when the voice parameters do not meet the preset condition, as one mode the first intelligent voice device makes no response at all, and if the second intelligent voice device determines that the voice information it received meets its own preset condition, the second intelligent voice device may respond to the voice information; as another mode, the voice parameters may be sent to a server, and the server determines whether the first intelligent voice device should respond to the voice information.
Responding to the voice information may involve different operations depending on the state of the first intelligent voice device.
In some embodiments, when the state of the first smart voice device receiving the voice information input by the user is an offline state, responding to the voice information may be switching the first smart voice device to an awake state. Optionally, after the first intelligent voice device is switched from the offline state to the awake state, the control command in the voice information may be recognized in response to the voice information, or the control command in the voice information is recognized and the control command is executed.
In other embodiments, when the first smart voice device is in the awake state when it receives the voice information input by the user, responding to the voice information may be recognizing the voice information to obtain the corresponding control command, or recognizing the voice information to obtain the corresponding control command and then executing that command.
And the response stopping instruction is used for controlling the second intelligent voice equipment not to respond to the voice information when receiving the voice information. Similar to the first intelligent voice device, the response to the voice message may be different operations according to the state of the second intelligent voice device, and details are not repeated herein.
In some embodiments, the first smart voice device may send the stop response instruction directly to the second smart voice device. As a mode, when the intelligent voice devices are networked, the address of each intelligent voice device may be recorded, and the address of the device with the routing function is set as a multicast address, so as to obtain a correspondence between the multicast address and the address of each intelligent voice device, and a stop response instruction may be forwarded by the device with the routing function, so that the device with the routing function sends the stop response instruction to the second intelligent voice device. Optionally, the first intelligent voice device may be a device with a routing function, and directly send the stop response instruction to the second intelligent voice device; the first intelligent voice device can also send the stop response instruction to the device with the routing function, so that the device forwards the stop response instruction to the second intelligent voice device.
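The patent does not specify the transport used for the stop response instruction. Purely as an assumption for illustration, the sketch below sends a small JSON message over UDP to the recorded addresses of the other devices in the local area network; the port number and message fields are invented.
```python
import json
import socket

STOP_RESPONSE_PORT = 50000   # assumed port, not taken from the patent

def send_stop_response(peer_addresses, utterance_id):
    """Send a stop response instruction directly to the other intelligent
    voice devices in the local area network."""
    message = json.dumps({"type": "stop_response",
                          "utterance_id": utterance_id}).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for address in peer_addresses:
            sock.sendto(message, (address, STOP_RESPONSE_PORT))
```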
In other embodiments, the first smart voice device may send a stop response instruction to a server in the local area network, so that the server sends a signal with the stop response instruction to the second smart voice device after receiving the stop response instruction. Specifically, please refer to the following embodiments in detail.
In some embodiments, the second smart voice device may respond to voice information with a lower priority than the first smart voice device.
As one mode, if the voice parameter of the voice information acquired by the second intelligent voice device satisfies the preset condition of the device, the second intelligent voice device may send request information requesting to respond to the voice information to the server, and monitor whether to acquire an instruction including a stop response sent by the server, and if not, respond to the voice information.
As another mode, if the voice parameter of the voice message acquired by the second intelligent voice device satisfies the preset condition of the device, the second intelligent voice device may monitor whether to acquire the stop response instruction sent by the first intelligent voice device within a preset time duration, and if not, respond to the voice message.
In other embodiments, the second smart voice device has the same priority for responding to voice information as the first smart voice device. However, because different intelligent voice devices are at different distances from the sound source of the voice information, the first intelligent voice device, being closest to the user, acquires the voice information first and is the earliest to evaluate the preset condition and obtain a judgment result, so that nearby wake-up of the first intelligent voice device is achieved.
According to the voice-based intelligent awakening method provided by the embodiment of the application, the voice parameters of the voice information are obtained by receiving the voice information input by the user, when the voice parameters meet the preset conditions, the voice information is responded, and a response stopping instruction is sent to the second intelligent voice equipment in the local area network, wherein the response stopping instruction is used for controlling the second intelligent voice equipment not to respond to the voice information when the voice information is received. By responding to the voice information and sending the response stopping instruction to other equipment, the first intelligent voice equipment can only respond, so that the situation that a plurality of intelligent voice equipment in the local area network respond to the voice information of the user at the same time can be avoided.
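To summarize steps S210 to S230, a minimal Python sketch of the flow on the first intelligent voice device is given below; all helper names on `device` are assumptions introduced for illustration rather than terms from the patent.
```python
def handle_voice_input(device, audio_frames):
    """Sketch of steps S210-S230 on the first intelligent voice device."""
    # S210: the voice information has already been captured as audio_frames.
    # S220: acquire the voice parameters of the voice information.
    similarity = device.wake_word_similarity(audio_frames)
    energy = device.wake_word_energy(audio_frames)

    # S230: when the preset condition is met, respond and suppress the others.
    if device.meets_preset_condition(similarity, energy):
        device.send_stop_response_to_peers()   # directly or via the server
        device.respond(audio_frames)           # e.g. switch to awake state, recognize the command
```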
Referring to fig. 3, a flowchart of a method for a voice-based intelligent wake-up method according to an embodiment of the present application is shown, where the method is applied to a first intelligent voice device in a local area network, and the method includes steps S310 to S360.
S310: and receiving voice information input by a user.
S320: and acquiring voice parameters of the voice information.
S330: and when the voice parameters meet the preset conditions, recognizing the voice information to acquire a control command corresponding to the voice information.
The preset condition is a condition corresponding to the voice parameter, please refer to the content of step S230 in the above embodiment, which is not described herein again.
When the voice parameters meet the preset conditions, the state of the first intelligent voice device can be switched to be an awakening state, and the voice information is recognized to obtain the control command corresponding to the voice information.
In some implementations, the first smart voice device may recognize voice information through an acoustic model to obtain the control command. Specifically, the first intelligent voice device may be provided with a corresponding relationship between a preset keyword and a control command, and may extract an acoustic feature in the voice information based on the acoustic model, calculate a similarity between the acoustic feature of the voice information and the acoustic feature of the preset keyword, and when the similarity is greater than a preset acoustic feature similarity threshold, use the control command corresponding to the preset keyword as the control command corresponding to the voice information.
For example, mel-frequency cepstrum coefficients extracted from the speech signal may be used as the acoustic features, and the maximum likelihood ratio of the acoustic features between the speech information and the preset keywords may be used as the acoustic feature similarity. Specifically, each feature point of the acoustic features in the speech information may be obtained, similarity comparison may be performed on each feature point of the acoustic features corresponding to the preset keyword, and then the similarities of all the feature points are integrated to obtain a maximum likelihood value as the acoustic feature similarity.
In other embodiments, the first intelligent speech device may convert the speech information into text through an Automatic Speech Recognition (ASR) technique, perform Natural Language Understanding (NLU) on the text to analyze the speech information, and determine the control command according to the analysis result.
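A hypothetical keyword-matching table, only to illustrate how recognized text could be mapped to a control command; the entries and command identifiers are invented, and a real device would use the acoustic model or ASR + NLU pipeline described above.
```python
# Invented example entries, for illustration only.
KEYWORD_COMMANDS = {
    "turn on the light": "light.on",
    "play music": "audio.play",
    "open the curtain": "curtain.open",
}

def recognize_control_command(transcript):
    """Map recognized text to a control command by keyword matching."""
    text = transcript.lower()
    for keyword, command in KEYWORD_COMMANDS.items():
        if keyword in text:
            return command
    return None   # no known command in the voice information
```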
S340: and sending a stop response instruction to a second intelligent voice device in the local area network.
And the response stopping instruction is used for controlling the second intelligent voice equipment not to respond to the voice information when receiving the voice information.
In some embodiments, the first smart voice device may send the stop response instruction directly to the second smart voice device. In other embodiments, the first smart voice device may send a stop response instruction to a server in the local area network, so that the server sends a signal with the stop response instruction to the second smart voice device after receiving the stop response instruction. Specifically, please refer to the content of step S230 in the foregoing embodiment, which is not described herein again.
In some embodiments, before step S340, it may also be performed to acquire the resource required for executing the control command and determine whether the state of the resource of the first smart voice device is an idle state. Specifically, resources required for executing the control command may be acquired; judging whether the state of the resource of the first intelligent voice equipment is an idle state or not; if so, sending a response stopping instruction to the second intelligent voice equipment; if not, the control command is sent to the second intelligent voice device, so that the second intelligent voice device executes the control command.
The resources occupied by executing the control command are the interactive interfaces of the device, including components such as a camera, a microphone, an indicator light, and a loudspeaker; specifically, the camera and the microphone can serve as input devices, and the indicator light and the loudspeaker as output devices. The states of a resource include an occupied state and an idle state. For example, when the first smart voice device is playing music, the loudspeaker resource is occupied.
By monitoring the state of the resources of the first intelligent voice device, when the resources required for executing the control command are in the idle state, the first intelligent voice device may send the stop response instruction to the second intelligent voice device, that is, perform step S340. When the required resources of the first intelligent voice device are occupied, or the device does not have the required resources, then as one mode the first intelligent voice device may send no instruction at all and allow the second intelligent voice device to respond to the voice information; as another mode, the first intelligent voice device may send the control command to the second intelligent voice device so that the second intelligent voice device executes it. For example, when the voice processing capability of the first intelligent voice device is better than that of the second intelligent voice device, sending the control command of the voice information directly to the second intelligent voice device saves the second device the time and resources needed for voice analysis, improves resource utilization, and allows the voice information input by the user to be responded to more flexibly.
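A sketch of the resource check that precedes step S340, under the assumption that the device exposes a `resource_state()` helper returning "idle" or "occupied"; neither the helper nor the resource names come from the patent.
```python
def dispatch_command(first_device, second_device, command, required_resources):
    """Send the stop response instruction only if the needed resources are idle;
    otherwise hand the already recognized control command to the second device."""
    idle = all(first_device.resource_state(r) == "idle" for r in required_resources)
    if idle:
        first_device.send_stop_response(second_device)
        return "execute_locally"
    # Resources occupied (e.g. the loudspeaker is playing music): delegate.
    second_device.execute(command)
    return "delegated"
```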
S350: and monitoring whether feedback information sent by the second intelligent voice equipment is acquired within preset time.
The feedback information is information sent by the second intelligent voice device after receiving the stop response instruction, and can be used for representing that the second intelligent voice device does not respond to the voice information after receiving the stop response instruction.
In some embodiments, the second intelligent voice device and the first intelligent voice device have the same priority for responding to the voice information. If, when it receives the stop response instruction, the second intelligent voice device has not yet recognized the voice information, or has recognized the control command in the voice information but not executed it, the second intelligent voice device stops its current processing of the voice information and sends the feedback information. If, when it receives the stop response instruction, the second intelligent voice device has already recognized the control command in the voice information and executed it, it sends information indicating that it has already responded to the first intelligent voice device.
In some embodiments, the second smart voice device may respond to voice information with a lower priority than the first smart voice device. As a mode, if the voice parameter of the voice information acquired by the second intelligent voice device satisfies the preset condition of the device, the second intelligent voice device may monitor whether to acquire the stop response instruction sent by the first intelligent voice device within a preset time period; if so, the second intelligent voice device does not respond to the voice information and sends feedback information to the first intelligent voice device; and if not, the second intelligent voice equipment responds to the voice information.
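A sketch of steps S350-S360, assuming the network layer pushes incoming feedback messages onto a queue; the timeout value and message fields are illustrative assumptions.
```python
import queue

def wait_for_feedback(feedback_queue, timeout_s=1.0):
    """Wait up to timeout_s seconds for the second device's feedback (S350)."""
    try:
        return feedback_queue.get(timeout=timeout_s)
    except queue.Empty:
        return None

def execute_if_confirmed(command, feedback_queue, execute):
    """Execute the control command only after feedback is received (S360)."""
    feedback = wait_for_feedback(feedback_queue)
    if feedback is not None and feedback.get("type") == "will_not_respond":
        execute(command)
        return True
    return False   # no feedback, or the second device already responded
```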
In some embodiments, the feedback information sent by the second intelligent voice device may further include voice parameters of the voice information acquired by the second intelligent voice device. Specifically, the preset conditions of the first smart voice device may include a first preset condition and a second preset condition, where the wakeup confidence coefficient represented by the first preset condition is greater than the wakeup confidence coefficient represented by the second preset condition; when a first preset condition is met, the first intelligent voice equipment executes a first operation; and when a second preset condition is met, the first intelligent voice equipment executes a second operation.
The first operation can be responding to voice information and sending a response stopping instruction to the second intelligent voice device, wherein the responding to voice information can be identifying a control instruction corresponding to the voice information and executing the control instruction; the second operation may be to recognize a control instruction corresponding to the voice information, send a stop response instruction, and determine whether to execute the control command according to feedback information sent by the second intelligent voice device. Specifically, it may be determined whether the control command is executed by the first smart voice device by comparing voice parameters of voice information of the second smart voice device and the first smart voice device. Whether the voice information is responded or not is judged by combining the voice parameters of the second intelligent voice device under the condition that the awakening confidence coefficient is low, and the awakening accuracy of the first intelligent voice device can be further improved.
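A minimal sketch of the second operation described above: when only the lower-confidence preset condition is met, the first device compares its own voice parameter with the one reported in the feedback and executes the command only if its own value is at least as large; the `energy` field name is an assumption.
```python
def should_execute_after_comparison(own_energy, feedback):
    """Second-operation arbitration: the device with the larger voice
    parameter (here an energy value) executes the control command."""
    peer_energy = feedback.get("energy", 0.0)
    return own_energy >= peer_energy
```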
S360: and if the feedback information is acquired, executing the control command.
In some embodiments, if the first smart voice device obtains the feedback information, the control command is executed; and if the feedback information is not acquired or the responded information sent by the second intelligent voice equipment is received, the first intelligent voice equipment does not execute the control command.
As one mode, when the control command is an instruction to control the first smart voice device that acquired the voice information, the first smart voice device may execute the control command directly. For example, if the first smart voice device is a smart speaker and the control command is to play music, the first smart voice device directly executes the control command and plays music.
As another way, when the control command is an instruction to control a controlled device connected to the first smart voice device, the first smart voice device may send the control command to the controlled device corresponding to the control command and instruct the controlled device to execute the control command. The controlled device can be a device which is locally connected with the first intelligent voice device in a Bluetooth, WIFI or ZigBee mode or the like, and can also be a WIFI device which is connected with the first intelligent voice device under the same WIFI. For example, when the first smart voice device is a smart home control panel, the control command may be a command executable by a smart home device controlled by the smart home control panel.
In the embodiment of the present application, steps S310 to S320 may refer to the contents of the above embodiments, and are not described herein again.
The voice-based intelligent awakening method provided by the embodiment of the application receives voice information input by a user; acquiring a voice parameter of the voice information; when the voice parameters meet preset conditions, recognizing the voice information to acquire a control command corresponding to the voice information; sending a response stopping instruction to second intelligent voice equipment in the local area network; monitoring whether feedback information sent by the second intelligent voice equipment is acquired within preset time, wherein the feedback information is information sent by the second intelligent voice equipment after receiving a response stopping instruction; and if the feedback information is acquired, executing the control command. By monitoring the feedback information after the response stopping instruction is sent and executing the control command when the feedback information is acquired, the control command can be executed under the condition that other equipment does not respond to the voice information, so that the condition that the control command is executed by a plurality of equipment is avoided, and the accuracy of the first intelligent equipment for responding to the voice information is improved.
Referring to fig. 4, a flowchart of a method for a voice-based smart wake-up method according to an embodiment of the present application is shown, where the method is applied to a server and includes steps S410 to S420.
S410: and receiving information which is sent by the first intelligent voice equipment and is used for responding to the voice information input by the user by the first intelligent voice equipment and a response stopping instruction.
The first intelligent voice equipment can acquire the voice parameters of the voice information after receiving the voice information input by the user, and when the voice parameters meet the preset conditions, the first intelligent voice equipment can respond to the voice information and send response information and a response stopping instruction to the server, wherein the response information is information of the first intelligent voice equipment responding to the voice information input by the user, and the response stopping instruction is used for controlling the second intelligent voice equipment not to respond to the voice information when receiving the voice information. Specifically, please refer to the contents of the above embodiments, which are not described herein again.
In some embodiments, before receiving the response information and the stop response instruction sent by the first intelligent voice device, the server may further receive request information sent by the first intelligent voice device, where the request information is sent after the first intelligent voice device determines that the voice parameter meets the preset condition and requests permission for the first intelligent voice device to respond to the voice information. If the first intelligent voice device is the first device requesting to respond to the voice information, the server sends a response permission instruction to the first intelligent voice device, where the response permission instruction is used to allow the intelligent voice device that receives the instruction to respond to the voice information.
S420: when request information sent by the second intelligent voice device requesting to respond to the voice information is received, sending a signal carrying the stop response instruction to the second intelligent voice device according to the stop response instruction, so that the second intelligent voice device does not respond to the voice information after receiving the signal.
As one mode, after the server receives the response information and the stop response instruction sent by the first intelligent voice device, if the server then receives request information from a second intelligent voice device requesting to respond to the voice information, the server sends a signal carrying the stop response instruction to that second intelligent voice device according to the stop response instruction, so that the second intelligent voice device does not respond to the voice information after receiving the signal.
As another mode, the server may send a signal with a stop response instruction to all second intelligent voice devices in the local area network, where the second intelligent voice devices are devices in the local area network except the first intelligent voice device.
In some embodiments, if the server has not yet received the response information and the stop response instruction sent by the first intelligent voice device when it receives request information from a second intelligent voice device requesting to respond to the voice information, the server may send a response permission instruction to that second intelligent voice device, where the response permission instruction is used to allow the intelligent voice device that receives the instruction to respond to the voice information.
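Purely as an illustrative sketch of this arbitration, the server-side behaviour of steps S410 to S420 could be organised as below; the class name, the per-event identifier and the message strings are assumptions made for the sketch rather than part of the disclosure.

class WakeArbiter:
    """Tracks, per wake-up event, whether a stop response instruction is in force."""

    def __init__(self) -> None:
        self._stopped_events: set[str] = set()

    def on_first_device_response(self, event_id: str) -> None:
        # S410: the first intelligent voice device has responded and sent the
        # stop response instruction for this wake-up event.
        self._stopped_events.add(event_id)

    def on_request_to_respond(self, event_id: str, device) -> None:
        # S420: another intelligent voice device asks to respond to the same voice information.
        if event_id in self._stopped_events:
            device.send("STOP_RESPONSE")   # signal carrying the stop response instruction
        else:
            device.send("ALLOW_RESPONSE")  # first requester: response permission instruction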
In some embodiments, after step S410, the server may further acquire the intelligent voice device that the user intended to wake up from among the plurality of intelligent voice devices in the local area network; if the first intelligent voice device is not the intelligent voice device that the user intended to wake up, mark the voice information as false wake-up data; and update the preset condition of the first intelligent voice device according to the false wake-up data.
The preset condition is the condition under which the first intelligent voice device responds to the voice information and sends the stop response instruction to the server, namely when the voice parameter of the voice information acquired by the first intelligent voice device meets the preset condition. Specifically, please refer to the contents of the above embodiments, which are not described herein again.
It can be understood that, when two intelligent voice devices in the local area network are installed close to each other, or the openings of their microphones are oriented differently, the voice parameters of the voice information acquired by the two devices may be quite similar. It may then happen that the voice parameter of the first intelligent voice device satisfies the preset condition even though the user actually intended to wake up the other intelligent voice device. By marking such cases as false wake-up data and updating the preset condition accordingly, the preset condition is made to better fit the actual situation of the plurality of intelligent voice devices in the local area network, and the first intelligent voice device judges whether to respond based on the updated preset condition, so that more accurate wake-up can be achieved.
In some embodiments, after acquiring the intelligent voice device that the user intended to wake up from among the plurality of intelligent voice devices in the local area network, the server may further calculate a first difference between the voice parameter of the first intelligent voice device and a specified voice parameter, and if the first difference is smaller than a first preset threshold, the voice information may be marked as false wake-up data. The specified voice parameter is the voice parameter generated by the intelligent voice device that the user intended to wake up, based on the voice information acquired by that device; optionally, the voice parameter may be an energy value of the voice information. For example, the difference between the signal-to-noise ratios of two intelligent voice devices installed in a user's home is generally higher than 5%; when the difference between the signal-to-noise ratios of the first intelligent voice device and the intended device is less than 5%, the wake-up may have been a false wake-up caused by the orientations of the microphone openings of the two devices, and the voice information may be marked as false wake-up data.
It can be understood that if the first difference is large, the first intelligent voice device may be considered to be responding to the voice information normally; only when the first difference is small is the wake-up likely to be a false wake-up caused by the microphone opening directions or the positions of the different devices. In this way, the voice information is marked only when the voice parameter difference between the device that the user intended to wake up and the first intelligent voice device is smaller than the preset value, which prevents indiscriminate labeling by the user from affecting the accuracy of the preset-condition judgment of a single intelligent voice device.
Further, the server may also calculate a second difference between the voice parameter corresponding to the first intelligent voice device and the historical data of the device, and if the second difference is greater than a second preset threshold, the voice information may be marked as false wake-up data, where the historical data comprises the voice parameters of the voice information acquired by the device when the first intelligent voice device responded to voice information over a past period of time. In this way, the data is marked as false wake-up data only when the voice parameter of the voice information acquired this time by the first intelligent voice device differs greatly from the historical data of the device, which likewise prevents indiscriminate labeling by the user from affecting the accuracy of the preset-condition judgment of a single intelligent voice device.
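As a rough, non-limiting sketch, the two checks could be combined as follows; the relative thresholds (0.05, mirroring the 5% example above, and 0.20 for the history check), the use of a mean over past responses, and the function name are assumptions made for the sketch.

FIRST_THRESHOLD = 0.05    # first difference: gap to the intended device's voice parameter
SECOND_THRESHOLD = 0.20   # second difference: gap to the device's own history (assumed value)

def should_mark_false_wakeup(first_param: float,
                             intended_param: float,
                             history: list[float]) -> bool:
    # First check: the parameters of the two devices are nearly identical, which
    # suggests a wake-up caused by microphone orientation or device placement.
    first_diff = abs(first_param - intended_param) / max(intended_param, 1e-9)
    if first_diff < FIRST_THRESHOLD:
        return True
    # Second check: the parameter deviates strongly from the device's own past
    # responses, so this response is unusual for that device.
    if history:
        mean_hist = sum(history) / len(history)
        second_diff = abs(first_param - mean_hist) / max(mean_hist, 1e-9)
        if second_diff > SECOND_THRESHOLD:
            return True
    return False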
According to the voice-based intelligent awakening method provided by the embodiment of the application, the server receives response information, sent by the first intelligent voice device, indicating that the first intelligent voice device responds to the voice information input by the user, together with a stop response instruction; when request information sent by the second intelligent voice device requesting to respond to the voice information is received, the server sends a signal carrying the stop response instruction to the second intelligent voice device according to the stop response instruction, so that the second intelligent voice device does not respond to the voice information after receiving the signal. By forwarding the stop response instruction to the second intelligent voice device after it has been received, the situation in which multiple intelligent voice devices in the local area network respond to the same voice information at the same time is avoided.
Referring to fig. 5, a block diagram of a voice-based intelligent wake-up apparatus 500 according to an embodiment of the present application is shown, where the apparatus 500 is applied to a first intelligent voice device in a local area network, and the apparatus 500 may include: a voice receiving module 510, a parameter obtaining module 520 and a first processing module 530. The voice receiving module 510 is configured to receive voice information input by a user; the parameter obtaining module 520 is configured to acquire a voice parameter of the voice information; and the first processing module 530 is configured to respond to the voice information when the voice parameter meets a preset condition, and to send a stop response instruction to a second intelligent voice device in the local area network, where the stop response instruction is used to control the second intelligent voice device not to respond to the voice information when receiving the voice information.
Further, the first processing module 530 may further include a server communication sub-module, where the server communication sub-module is configured to send a stop response instruction to a server in the local area network when the voice parameter meets a preset condition, so that the server sends a signal with the stop response instruction to the second intelligent voice device after receiving the stop response instruction.
Further, the first processing module 530 may further include a voice recognition sub-module, an instruction sending sub-module, an information monitoring sub-module, and a command execution sub-module. The voice recognition sub-module is configured to recognize the voice information to acquire a control command corresponding to the voice information; the instruction sending sub-module is configured to send the stop response instruction to the second intelligent voice device in the local area network; the information monitoring sub-module is configured to monitor whether feedback information sent by the second intelligent voice device is acquired within a preset time, where the feedback information is sent after the second intelligent voice device receives the stop response instruction; and the command execution sub-module is configured to execute the control command if the feedback information is acquired.
Further, the first processing module 530 may further include a resource obtaining sub-module, a state judging sub-module, a first execution sub-module and a second execution sub-module, which operate before the instruction sending sub-module sends the stop response instruction to the second intelligent voice device in the local area network. The resource obtaining sub-module is configured to acquire the resource required for executing the control command; the state judging sub-module is configured to judge whether the state of that resource on the first intelligent voice device is an idle state; the first execution sub-module is configured to send the stop response instruction to the second intelligent voice device if the state is idle; and the second execution sub-module is configured, if the state is not idle, to send the control command to the second intelligent voice device, so that the second intelligent voice device executes the control command.
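A brief sketch of this decision path follows; the names (arbitrate_by_resource, is_resource_idle) and the string return values are illustrative assumptions and are not part of the disclosed apparatus.

def arbitrate_by_resource(command, is_resource_idle, second_device) -> str:
    """Decide whether to keep the command or delegate it before silencing peers."""
    if is_resource_idle(command):
        # The resource needed by the command (e.g. the speaker) is idle, so the
        # first intelligent voice device keeps the command and silences its peers.
        second_device.send("STOP_RESPONSE")
        return "execute_locally"
    # The resource is busy, so the command is handed to the second intelligent
    # voice device, which executes it instead.
    second_device.send(command)
    return "delegated"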
Referring to fig. 6, a block diagram of a voice-based intelligent wake-up apparatus 600 according to an embodiment of the present application is provided, where the apparatus 600 is applied to a server, and the apparatus 600 includes an instruction receiving module 610 and a second processing module 620. The instruction receiving module 610 is configured to receive response information, sent by a first intelligent voice device, indicating that the first intelligent voice device responds to voice information input by a user, and a stop response instruction; the second processing module 620 is configured to, when request information sent by a second intelligent voice device requesting to respond to the voice information is received, send a signal carrying the stop response instruction to the second intelligent voice device according to the stop response instruction, so that the second intelligent voice device does not respond to the voice information after receiving the signal.
Further, the apparatus 600 may further include an intention acquisition module, a data marking module, and a condition updating module. The intention acquisition module is configured to acquire the intelligent voice device that the user intended to wake up from among the plurality of intelligent voice devices in the local area network; the data marking module is configured to mark the voice information as false wake-up data if the first intelligent voice device is not the intelligent voice device that the user intended to wake up; and the condition updating module is configured to update the preset condition of the first intelligent voice device according to the false wake-up data, where the preset condition is the condition under which the first intelligent voice device responds to the voice information, namely when the voice parameter of the voice information meets the preset condition.
It should be noted that, as will be clear to those skilled in the art, for convenience and brevity of description, the specific working processes of the above-described apparatuses and modules may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, the coupling between the modules may be electrical, mechanical or another form of coupling. In addition, the functional modules in the embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module.
Referring to fig. 7, based on the foregoing voice-based intelligent wake-up method and apparatus, an embodiment of the present application further provides an electronic device 700 capable of executing the foregoing voice-based intelligent wake-up method. The electronic device 700 in the present application may include one or more of the following components: a processor 710, a memory 720, and one or more applications, where the one or more applications may be stored in the memory 720 and configured to be executed by the one or more processors 710, and the one or more applications are configured to perform the method described in the foregoing method embodiments.
Processor 710 may include one or more processing cores. The processor 710 connects various parts of the entire electronic device 700 using various interfaces and lines, and performs various functions of the electronic device 700 and processes data by running or executing the instructions, programs, code sets or instruction sets stored in the memory 720 and by invoking the data stored in the memory 720. Optionally, the processor 710 may be implemented in hardware using at least one of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA) and Programmable Logic Array (PLA). The processor 710 may integrate one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like, where the CPU mainly handles the operating system, the user interface, application programs and the like; the GPU is responsible for rendering and drawing display content; and the modem is used to handle wireless communication. It can be understood that the modem may also not be integrated into the processor 710 and may instead be implemented by a separate communication chip.
The memory 720 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory 720 may be used to store instructions, programs, code, code sets or instruction sets. The memory 720 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function or an image playing function), instructions for implementing the foregoing method embodiments, and the like. The data storage area may store data created by the electronic device 700 during use (such as a phone book, audio and video data, and chat record data) and the like.
It will be understood by those skilled in the art that the structure shown in fig. 7 is merely an illustration and is not intended to limit the structure of the electronic device. For example, electronic device 700 may also include more or fewer components than shown in FIG. 7, or have a different configuration than shown in FIG. 7.
Referring to fig. 8, a block diagram of a computer-readable storage medium according to an embodiment of the present application is shown. The computer-readable storage medium 800 stores program code that can be called by a processor to execute the methods described in the foregoing method embodiments.
The computer-readable storage medium 800 may be an electronic memory such as a flash memory, an EEPROM (Electrically Erasable Programmable Read-Only Memory), an EPROM, a hard disk, or a ROM. Optionally, the computer-readable storage medium 800 includes a non-volatile computer-readable storage medium. The computer-readable storage medium 800 has storage space for program code 810 for performing any of the method steps described above. The program code can be read from or written into one or more computer program products. The program code 810 may be compressed, for example, in a suitable form.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.
While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments described above, which are meant to be illustrative and not restrictive, and that various changes may be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A voice-based intelligent awakening method is applied to a first intelligent voice device in a local area network, and comprises the following steps:
receiving voice information input by a user;
acquiring voice parameters of the voice information;
and when the voice parameters meet preset conditions, responding to the voice information, and sending a stop response instruction to a second intelligent voice device in the local area network, wherein the stop response instruction is used for controlling the second intelligent voice device not to respond to the voice information when receiving the voice information.
2. The method of claim 1, wherein the sending of the stop response instruction to the second intelligent voice device in the local area network comprises:
sending the stop response instruction to a server in the local area network, so that the server sends a signal carrying the stop response instruction to the second intelligent voice device after receiving the stop response instruction.
3. The method of claim 1, wherein the responding to the voice information and sending the stop response instruction to the second intelligent voice device in the local area network comprises:
recognizing the voice information to acquire a control command corresponding to the voice information;
sending the stop response instruction to the second intelligent voice equipment in the local area network;
monitoring whether feedback information sent by the second intelligent voice device is acquired within preset time, wherein the feedback information is information sent by the second intelligent voice device after the second intelligent voice device receives the response stopping instruction;
and if the feedback information is acquired, executing the control command.
4. The method of claim 3, wherein before the sending of the stop response instruction to the second intelligent voice device in the local area network, the method further comprises:
acquiring resources required for executing the control command;
judging whether the state of the resource of the first intelligent voice equipment is an idle state or not;
if yes, sending the stop response instruction to the second intelligent voice device;
if not, sending the control command to the second intelligent voice device, so that the second intelligent voice device executes the control command.
5. A voice-based intelligent awakening method is applied to a server and comprises the following steps:
receiving response information, sent by a first intelligent voice device, indicating that the first intelligent voice device responds to voice information input by a user, and a stop response instruction;
when request information sent by a second intelligent voice device requesting to respond to the voice information is received, sending a signal carrying the stop response instruction to the second intelligent voice device according to the stop response instruction, so that the second intelligent voice device does not respond to the voice information after receiving the signal.
6. The method of claim 5, wherein after the receiving of the response information, sent by the first intelligent voice device, indicating that the first intelligent voice device responds to the voice information input by the user, and the stop response instruction, the method further comprises:
acquiring the intelligent voice device that the user intends to wake up from among a plurality of intelligent voice devices in the local area network;
if the first intelligent voice device is not the intelligent voice device that the user intends to wake up, marking the voice information as false wake-up data;
and updating the preset condition of the first intelligent voice device according to the false wake-up data, wherein the preset condition is the condition under which the first intelligent voice device responds to the voice information, namely when the voice parameter of the voice information meets the preset condition.
7. A voice-based intelligent wake-up apparatus, applied to a first intelligent voice device in a local area network, the apparatus comprising:
the voice receiving module is used for receiving voice information input by a user;
the parameter acquisition module is used for acquiring the voice parameters of the voice information;
and the first processing module is used for responding to the voice information and sending a stop response instruction to a second intelligent voice device in the local area network when the voice parameters meet preset conditions, wherein the stop response instruction is used for controlling the second intelligent voice device not to respond to the voice information when receiving the voice information.
8. A voice-based intelligent wake-up apparatus, applied to a server, the apparatus comprising:
the instruction receiving module is used for receiving response information, sent by a first intelligent voice device, indicating that the first intelligent voice device responds to voice information input by a user, and a stop response instruction;
and the second processing module is used for, when request information sent by a second intelligent voice device requesting to respond to the voice information is received, sending a signal carrying the stop response instruction to the second intelligent voice device according to the stop response instruction, so that the second intelligent voice device does not respond to the voice information after receiving the signal.
9. An electronic device, comprising:
one or more processors;
a memory;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications being configured to perform the method of any one of claims 1 to 6.
10. A computer-readable storage medium, having stored thereon program code that can be invoked by a processor to perform the method according to any one of claims 1 to 6.
CN202011408983.4A 2020-12-03 2020-12-03 Intelligent awakening method and device based on voice, electronic equipment and storage medium Pending CN112420043A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011408983.4A CN112420043A (en) 2020-12-03 2020-12-03 Intelligent awakening method and device based on voice, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011408983.4A CN112420043A (en) 2020-12-03 2020-12-03 Intelligent awakening method and device based on voice, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112420043A true CN112420043A (en) 2021-02-26

Family

ID=74830348

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011408983.4A Pending CN112420043A (en) 2020-12-03 2020-12-03 Intelligent awakening method and device based on voice, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112420043A (en)

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106469040A (en) * 2015-08-19 2017-03-01 华为终端(东莞)有限公司 Communication means, server and equipment
CN105654949A (en) * 2016-01-07 2016-06-08 北京云知声信息技术有限公司 Voice wake-up method and device
CN107146611A (en) * 2017-04-10 2017-09-08 北京猎户星空科技有限公司 A kind of voice response method, device and smart machine
CN107134279A (en) * 2017-06-30 2017-09-05 百度在线网络技术(北京)有限公司 A kind of voice awakening method, device, terminal and storage medium
US20200265838A1 (en) * 2017-10-17 2020-08-20 Samsung Electronics Co., Ltd. Electronic device and operation method therefor
CN108335696A (en) * 2018-02-09 2018-07-27 百度在线网络技术(北京)有限公司 Voice awakening method and device
CN109377987A (en) * 2018-08-31 2019-02-22 百度在线网络技术(北京)有限公司 Exchange method, device, equipment and the storage medium of intelligent sound equipment room
CN109658927A (en) * 2018-11-30 2019-04-19 北京小米移动软件有限公司 Wake-up processing method, device and the management equipment of smart machine
CN111369988A (en) * 2018-12-26 2020-07-03 华为终端有限公司 Voice awakening method and electronic equipment
CN110136714A (en) * 2019-05-14 2019-08-16 北京探境科技有限公司 Natural interaction sound control method and device
CN110471296A (en) * 2019-07-19 2019-11-19 深圳绿米联创科技有限公司 Apparatus control method, device, system, electronic equipment and storage medium
CN110727410A (en) * 2019-09-04 2020-01-24 上海博泰悦臻电子设备制造有限公司 Man-machine interaction method, terminal and computer readable storage medium
CN110610711A (en) * 2019-10-12 2019-12-24 深圳市华创技术有限公司 Full-house intelligent voice interaction method and system of distributed Internet of things equipment
CN111128157A (en) * 2019-12-12 2020-05-08 珠海格力电器股份有限公司 Wake-up-free voice recognition control method for intelligent household appliance, computer readable storage medium and air conditioner
CN111276139A (en) * 2020-01-07 2020-06-12 百度在线网络技术(北京)有限公司 Voice wake-up method and device
CN111613221A (en) * 2020-05-22 2020-09-01 云知声智能科技股份有限公司 Nearby awakening method, device and system

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114222255A (en) * 2021-12-24 2022-03-22 珠海格力电器股份有限公司 Method and device for device ad hoc network, electronic device and storage medium
CN114222255B (en) * 2021-12-24 2023-02-17 珠海格力电器股份有限公司 Method and device for device ad hoc network, electronic device and storage medium
CN115297478A (en) * 2022-07-27 2022-11-04 四川虹美智能科技有限公司 Method for simultaneously distributing network to multiple voice devices through voice

Similar Documents

Publication Publication Date Title
KR102543693B1 (en) Electronic device and operating method thereof
US9953648B2 (en) Electronic device and method for controlling the same
CN108351872B (en) Method and system for responding to user speech
CN112201246B (en) Intelligent control method and device based on voice, electronic equipment and storage medium
CN111045639B (en) Voice input method, device, electronic equipment and storage medium
CN110689889B (en) Man-machine interaction method and device, electronic equipment and storage medium
US20190221208A1 (en) Method, user interface, and device for audio-based emoji input
US20200219384A1 (en) Methods and systems for ambient system control
CN110808044B (en) Voice control method and device for intelligent household equipment, electronic equipment and storage medium
JP6619488B2 (en) Continuous conversation function in artificial intelligence equipment
US10540973B2 (en) Electronic device for performing operation corresponding to voice input
CN113132193B (en) Control method and device of intelligent device, electronic device and storage medium
US11393490B2 (en) Method, apparatus, device and computer-readable storage medium for voice interaction
KR20230141950A (en) Voice query qos based on client-computed content metadata
CN112420043A (en) Intelligent awakening method and device based on voice, electronic equipment and storage medium
CN109240641B (en) Sound effect adjusting method and device, electronic equipment and storage medium
US11636867B2 (en) Electronic device supporting improved speech recognition
CN115810356A (en) Voice control method, device, storage medium and electronic equipment
KR20230118164A (en) Combining device or assistant-specific hotwords into a single utterance
CN116582382B (en) Intelligent device control method and device, storage medium and electronic device
CN109658924B (en) Session message processing method and device and intelligent equipment
WO2019242415A1 (en) Position prompt method, device, storage medium and electronic device
CN113678119A (en) Electronic device for generating natural language response and method thereof
WO2019228140A1 (en) Instruction execution method and apparatus, storage medium, and electronic device
CN113990312A (en) Equipment control method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination