CN112233676A - Intelligent device awakening method and device, electronic device and storage medium - Google Patents

Intelligent device awakening method and device, electronic device and storage medium

Info

Publication number
CN112233676A
Authority
CN
China
Prior art keywords
awakening
audio data
wake
result
identified
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011311387.4A
Other languages
Chinese (zh)
Inventor
宋汉冲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Oribo Technology Co Ltd
Original Assignee
Shenzhen Oribo Technology Co Ltd
Application filed by Shenzhen Oribo Technology Co Ltd
Priority to CN202011311387.4A
Publication of CN112233676A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4394 Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/443 OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB
    • H04N21/4432 Powering on the client, e.g. bootstrap loading using setup parameters being stored locally or received from the server
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Telephone Function (AREA)

Abstract

The application discloses a smart device wake-up method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring audio data to be identified; obtaining a first wake-up result of the audio data to be identified according to a first false wake-up algorithm; obtaining a second wake-up result of the audio data to be identified according to a second false wake-up algorithm; and waking up the smart device when the first wake-up result and the second wake-up result conform to a preset wake-up relationship. In this scheme, the audio data to be identified is processed separately by the first false wake-up algorithm and the second false wake-up algorithm, and the two wake-up results are combined to decide whether to wake up the smart device, which reduces the false wake-up rate of the smart device.

Description

Intelligent device awakening method and device, electronic device and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method and an apparatus for waking up an intelligent device, an electronic device, and a storage medium.
Background
Traditional human-machine interaction is usually started by pressing a physical button, which is inconvenient when the user's hands are occupied or the device is far away. With the continuous development of artificial intelligence technology, more and more devices provide a voice wake-up function. A smart device can capture voice data through a sound collector such as a microphone and execute tasks according to the instructions spoken by the user. Voice wake-up frees the user's hands and makes interaction with the smart device more convenient.
In actual use, a wake-up mechanism must be configured for the smart device: when the captured voice data meets the wake-up condition, the smart device is woken up to analyze the user's request; otherwise, it remains in a standby state.
However, when the user is chatting or watching television, or when other sounds without wake-up intent are present, the smart device is often woken up unexpectedly, which annoys the user.
Disclosure of Invention
In view of this, embodiments of the present application provide a method and an apparatus for waking up an intelligent device, an electronic device, and a storage medium to solve the above problem.
In a first aspect, an embodiment of the present application provides a method for waking up an intelligent device, where the method includes:
acquiring audio data to be identified;
obtaining a first awakening result of the audio data to be identified according to a first false awakening algorithm;
obtaining a second awakening result of the audio data to be identified according to a second false awakening algorithm;
and when the first awakening result and the second awakening result accord with a preset awakening relation, awakening the intelligent equipment.
In a second aspect, an embodiment of the present application provides an apparatus for waking up a smart device, where the apparatus includes:
an audio data acquisition module, used for acquiring the audio data to be identified;
a first wake-up result acquisition module, used for obtaining a first wake-up result of the audio data to be identified according to a first false wake-up algorithm;
a second wake-up result acquisition module, used for obtaining a second wake-up result of the audio data to be identified according to a second false wake-up algorithm;
and a wake-up module, used for waking up the smart device when the first wake-up result and the second wake-up result conform to a preset wake-up relationship.
In a third aspect, an embodiment of the present application further provides an electronic device, where the electronic device includes:
one or more processors;
a memory;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, and the one or more applications are configured to perform the smart device wake-up method provided in the first aspect above.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where a program code is stored in the computer-readable storage medium, and the program code may be called by a processor to execute the smart device wake-up method provided in the first aspect.
According to the scheme provided by the embodiments of the present application, audio data to be identified is acquired, a first wake-up result of the audio data is obtained according to a first false wake-up algorithm, a second wake-up result of the audio data is obtained according to a second false wake-up algorithm, and the smart device is woken up when the first wake-up result and the second wake-up result conform to a preset wake-up relationship. Because the audio data is processed separately by the two false wake-up algorithms and the two wake-up results are combined to decide whether to wake up the smart device, the false wake-up rate of the smart device is reduced.
These and other aspects of the embodiments of the present application will be more readily apparent from the following description of the embodiments.
Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 illustrates an application scenario diagram of an intelligent device wake-up system according to an embodiment of the present application;
Fig. 2 is a schematic flowchart illustrating a method for waking up a smart device according to an embodiment of the present application;
Fig. 3 is a flowchart illustrating a method for waking up a smart device according to another embodiment of the present application;
Fig. 4 is a flowchart illustrating a method for waking up a smart device according to another embodiment of the present application;
Fig. 5 is a flowchart illustrating a smart device wake-up method according to yet another embodiment of the present application;
Fig. 6 shows a block diagram of an electronic device according to an embodiment of the present application;
Fig. 7 shows a block diagram of a control apparatus of an intelligent device according to an embodiment of the present application;
Fig. 8 shows a block diagram of a computer-readable storage medium according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
Voice wake-up is widely applicable, and any smart device with a voice wake-up function can serve as the electronic device in the embodiments of the present application, for example a smart control panel, a smart household appliance, a smart wearable device, a smart voice navigation device, or a smart robot. The user issues a voice instruction to the smart device, and when the wake-up condition is met, the smart device is woken from the standby state and makes a specified response, such as playing music.
When the user wakes the smart device by voice, the smart device receives the user's voice data, processes and recognizes it, and wakes up when the voice data contains a preset wake-up word.
To decide whether received audio is wake-up audio, the smart device can process the received audio, match it against the wake-up audio, and obtain a matching degree. A threshold for comparison with the matching degree is configured in advance and is referred to as the preset threshold; whether a wake-up instruction has been received, and whether to perform the wake-up operation, is determined from the matching degree and the preset threshold. If the matching degree is greater than the preset threshold, the received audio is close to the wake-up audio, so it is determined that a wake-up instruction has been received and the wake-up operation is performed. If the matching degree is less than or equal to the preset threshold, the received audio is not close to the wake-up audio, so it is determined that the received audio is not a wake-up instruction and the wake-up operation is not performed. However, some voice data without wake-up intent, such as the user's chat conversation or the sound of a television program, is close to the wake-up audio, so the smart device is easily woken up by mistake.
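The single-threshold check described above can be sketched as follows. This is a minimal illustration only; the function name `should_wake` and the example values are assumptions, not part of the application.

```python
def should_wake(matching_degree: float, preset_threshold: float) -> bool:
    """Return True only when the matching degree is strictly greater
    than the preset threshold; equality does not trigger a wake-up."""
    return matching_degree > preset_threshold


# Hypothetical values chosen for illustration.
print(should_wake(0.91, 0.85))  # True
print(should_wake(0.85, 0.85))  # False
```

Because the decision rests on one score, any non-wake audio that happens to score above the threshold wakes the device, which is the weakness the following embodiments address.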
To effectively reduce the false wake-up rate of smart devices, the inventor, after long-term research, proposes the smart device wake-up method, apparatus, electronic device, and storage medium of the embodiments of the present application. The method acquires audio data to be identified, obtains a first wake-up result of the audio data according to a first false wake-up algorithm, obtains a second wake-up result of the audio data according to a second false wake-up algorithm, and wakes up the smart device when the first wake-up result and the second wake-up result conform to a preset wake-up relationship. Because the audio data is processed separately by the two false wake-up algorithms and the two wake-up results are combined to decide whether to wake up the smart device, the false wake-up rate of the smart device is reduced.
For convenience of detailed description, an application scenario to which the embodiments of the present application are applied is described below with reference to the accompanying drawings.
Referring to Fig. 1, Fig. 1 is a schematic diagram of an application scenario of the smart device wake-up method provided in an embodiment of the present application. The application scenario includes the smart device wake-up system provided in the embodiment, which includes a smart device 100 and a server 200.
The smart device 100 may be, but is not limited to, a smart control panel, a smart household appliance, a smart wearable device, a smart voice navigation device, a smart robot, a mobile phone, a tablet computer, an MP3 player (Moving Picture Experts Group Audio Layer III), an MP4 player (MPEG-4 Part 14), a personal computer, or the like. The embodiments of the present application do not limit the specific type of smart device.
In this embodiment, the smart device 100 is provided with an audio collector, such as a microphone, and can collect audio data through the audio collector.
The server 200 may be a traditional server, a cloud server, a server cluster composed of a plurality of servers, or a cloud computing service center.
In some possible embodiments, the device for processing the input audio data may be disposed in the server 200. After the smart device 100 obtains the input audio data, it may send the data to the server 200; the server 200 processes the data and returns the processing result to the smart device 100, so that the smart device 100 can perform subsequent operations according to the processing result.
The device for processing the input audio data may be a false wake-up calculation device. In some embodiments, it may also be a wake-up result matching device.
As one embodiment, the false wake-up calculation device may be disposed in the server 200 and the wake-up result matching device in the smart device 100. The server 200 then returns the wake-up result to the smart device 100, and the smart device 100 determines whether the wake-up result conforms to the preset wake-up relationship.
As another embodiment, the positions of the two devices may be interchanged: the false wake-up calculation device may be disposed in the smart device 100 and the wake-up result matching device in the server 200. The smart device 100 then processes the audio data with the false wake-up calculation device and sends the wake-up result to the server 200, instructing the server 200 to determine whether the wake-up result conforms to the preset wake-up relationship. The server 200 returns the processing result to the smart device 100, so that the smart device 100 can decide whether to wake up based on it.
In yet another embodiment, both the false wake-up calculation device and the wake-up result matching device may be disposed in the server 200, and the server 200 returns the result to the smart device 100 so that the smart device 100 can decide whether to wake up based on it.
In other possible embodiments, the device for processing the input audio data may be disposed on the smart device 100 itself, so that the smart device 100 can process the input audio data and obtain a processing result without establishing communication with the server 200. In that case, the smart device wake-up system may include only the smart device 100.
It is to be understood that the embodiments described are only some, not all, of the embodiments of the present application. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort fall within the protection scope of the present application.
Embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Referring to Fig. 2, Fig. 2 is a schematic flowchart of a smart device wake-up method according to an embodiment of the present application. As can be seen from Fig. 2, the method includes steps S110 to S140. The embodiment shown in Fig. 2 is explained in detail below; the method may specifically include the following steps:
Step S110, acquiring audio data to be identified.
In the embodiment of the present application, the smart device 100 may have a built-in audio collector or may be connected to an external audio collector. The connection may be wireless or wired and is not limited herein. In some embodiments, if the connection is wireless, the smart device may be provided with a wireless communication module, such as a Wireless Fidelity (WiFi) module or a Bluetooth module, and may obtain the audio data to be identified collected by the audio collector through that module.
In some embodiments, the smart device 100 may collect sound through an audio collector, such as a microphone, to obtain the audio data to be identified. Because picking up sound with the audio collector consumes little power, the audio collector can remain switched on and pick up sound continuously. In some embodiments, the audio collector may buffer the collected audio at regular intervals and send the buffered audio to the processor, which processes it as the audio data to be identified.
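The periodic buffering described above might look like the following sketch. The class name `AudioBuffer`, the frame representation as `bytes`, and the idea that a timer calls `flush` at each interval are all assumptions made for illustration; the application does not specify a buffer design.

```python
from collections import deque


class AudioBuffer:
    """Bounded buffer of audio frames (hypothetical helper).

    A timer in the collector is assumed to call flush() at a regular
    interval and hand the returned bytes to the processor.
    """

    def __init__(self, max_frames: int) -> None:
        # When the buffer is full, the oldest frame is dropped.
        self._frames = deque(maxlen=max_frames)

    def push(self, frame: bytes) -> None:
        self._frames.append(frame)

    def flush(self) -> bytes:
        """Return the buffered audio as one chunk and empty the buffer."""
        data = b"".join(self._frames)
        self._frames.clear()
        return data


buf = AudioBuffer(max_frames=2)
for frame in (b"a", b"b", b"c"):
    buf.push(frame)
print(buf.flush())  # b'bc'
```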
Step S120, obtaining a first wake-up result of the audio data to be identified according to the first false wake-up algorithm.
Step S130, obtaining a second wake-up result of the audio data to be identified according to the second false wake-up algorithm.
In the embodiment of the present application, steps S120 and S130 process the audio data to be identified with two different false wake-up algorithms and obtain the wake-up result of each. It can be understood that every false wake-up algorithm has some error. For example, if a false wake-up algorithm that identifies whether the audio data is noise has a noise recognition accuracy of 80%, then 20% of the noise audio is not recognized by that algorithm and can cause a false wake-up. Other false wake-up algorithms have the same problem. The embodiment therefore processes the audio data with two different false wake-up algorithms. Compared with a single algorithm, two different algorithms judge false wake-up from different angles: some consider whether the audio contains words similar to the wake-up word, others consider whether the audio is noise. Audio that one algorithm fails to recognize as a false wake-up can thus be recognized by the other, which can greatly reduce the false wake-up rate of the smart device.
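The benefit of combining two algorithms can be illustrated with a small calculation. The sketch below assumes that the two algorithms miss false wake-up audio independently of each other and that the second algorithm also has a 20% miss rate; neither assumption is stated in the application, so the resulting figure is illustrative only.

```python
def combined_miss_rate(miss_rate_first: float, miss_rate_second: float) -> float:
    """Fraction of false wake-up audio that slips past both algorithms,
    under the (assumed) condition that their misses are independent."""
    return miss_rate_first * miss_rate_second


# 80% accuracy means a 20% miss rate for each algorithm (second value assumed).
print(round(combined_miss_rate(0.20, 0.20), 4))  # 0.04
```

If the misses of the two algorithms are correlated, the combined rate lies between this product and the smaller of the two individual miss rates.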
Step S140, waking up the smart device when the first wake-up result and the second wake-up result conform to a preset wake-up relationship.
In the embodiment of the present application, the audio data to be identified is processed by two different false wake-up algorithms. Suppose the first false wake-up algorithm gives a false wake-up probability of a% for the audio data, where 0 ≤ a ≤ 100, and the second false wake-up algorithm gives a false wake-up probability of b%, where 0 ≤ b ≤ 100. Under the first algorithm, audio data whose false wake-up probability is less than A%, where 0 ≤ A ≤ 100, is not regarded as a false wake-up; under the second algorithm, audio data whose false wake-up probability is less than B%, where 0 ≤ B ≤ 100, is not regarded as a false wake-up. The smart device is woken up only when the first wake-up result satisfies a < A and the second wake-up result satisfies b < B. It can be understood that the preset wake-up relationship may be adjusted to suit different application scenarios. For example, if the first false wake-up algorithm is a noise estimation algorithm and the scenario places strict requirements on noise in the audio data, the false wake-up probability threshold A of the first algorithm may be lowered, for example from 20% to 10%, so that the smart device is woken up only if the false wake-up probability produced by the first algorithm is below 10%.
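The preset wake-up relationship a < A and b < B can be sketched as below. The class name `WakeRelation`, its field names, and the example values are hypothetical; only the comparison itself and the 20%-to-10% adjustment come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WakeRelation:
    """Preset wake-up relationship with thresholds A and B, in percent."""

    threshold_a: float  # A: limit for the first false wake-up probability
    threshold_b: float  # B: limit for the second false wake-up probability

    def is_satisfied(self, a: float, b: float) -> bool:
        """Return True when a < A and b < B (all values in percent)."""
        for value in (a, b, self.threshold_a, self.threshold_b):
            if not 0 <= value <= 100:
                raise ValueError("percentages must lie in [0, 100]")
        return a < self.threshold_a and b < self.threshold_b


default = WakeRelation(threshold_a=20, threshold_b=20)
strict_noise = WakeRelation(threshold_a=10, threshold_b=20)  # A lowered from 20 to 10

print(default.is_satisfied(a=15, b=5))       # True
print(strict_noise.is_satisfied(a=15, b=5))  # False
```

Keeping the thresholds in one immutable object makes it straightforward to swap in a stricter relationship for a given scenario without touching the algorithms themselves.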
In summary, in the scheme of this embodiment, audio data to be identified is acquired, a first wake-up result is obtained according to the first false wake-up algorithm, a second wake-up result is obtained according to the second false wake-up algorithm, and the smart device is woken up when the two wake-up results conform to the preset wake-up relationship. Combining the two wake-up results to decide whether to wake up the smart device reduces its false wake-up rate.
Referring to Fig. 3, Fig. 3 is a schematic flowchart of a smart device wake-up method according to another embodiment of the present application. The embodiment shown in Fig. 3 is explained in detail below; the method may specifically include the following steps:
Step S210, acquiring audio data to be identified.
For a detailed description of step S210, please refer to step S110; it is not repeated here.
In an embodiment of the present application, the first false wake-up algorithm is a keyword algorithm: the audio data to be identified is processed to determine whether it contains a specific keyword. Obtaining the first wake-up result of the audio data to be identified according to the first false wake-up algorithm may specifically include:
Step S220, extracting a keyword from the audio data to be identified.
In the embodiment of the present application, the smart device can process the audio data to be identified after acquiring it. Specifically, as an implementation manner, Automatic Speech Recognition (ASR) may be applied to the audio data to convert it into text, and the smart device then performs Natural Language Understanding (NLU) on the converted text to extract the keyword of the audio data.
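The ASR-then-NLU flow of step S220 might be wired together as in the sketch below. The speech recognizer is passed in as a callable because the application names no ASR engine; the stop-word filter is a deliberately crude stand-in for a real NLU step. `extract_keywords`, `STOP_WORDS`, and the sample sentence are all hypothetical.

```python
from typing import Callable, List

# Hypothetical stop-word list standing in for a real NLU component.
STOP_WORDS = frozenset({"please", "the", "a", "to", "and"})


def extract_keywords(audio: bytes, transcribe: Callable[[bytes], str]) -> List[str]:
    """Convert audio to text with the supplied ASR callable, then keep
    the tokens that are not stop words."""
    text = transcribe(audio)
    tokens = [token.strip(".,!?").lower() for token in text.split()]
    return [token for token in tokens if token and token not in STOP_WORDS]


def fake_asr(_audio: bytes) -> str:
    # Placeholder recognizer used only to make the example runnable.
    return "Hello device, please play music"


print(extract_keywords(b"\x00\x01", fake_asr))
# ['hello', 'device', 'play', 'music']
```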
Step S230, calculating the keyword similarity between the keyword and the target keyword.
As an embodiment of the present application, after obtaining the keyword, the smart device may calculate the similarity between the keyword and the target keyword. The target keyword is a wake-up word preset by the user or by the smart device. As an implementation manner of the present application, a target keyword matching model may be trained in advance with a machine learning model, that is, trained on a large amount of training text together with the target keywords, to obtain a model that can calculate the similarity to the target keyword.
It can be understood that calculating the similarity between the keyword and the target keyword yields the probability that the user intends to interact with the smart device through the audio data to be identified. When the keyword similarity is lower than a preset threshold, the audio to be identified can be judged to be false wake-up audio. Calculating the similarity between the keyword and the target keyword therefore makes it possible to judge whether the audio data is false wake-up audio from the perspective of whether there is an intention to interact with the smart device.
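As a rough illustration of step S230, the sketch below scores similarity with the character-level ratio from Python's standard `difflib` module. This is a substitute chosen for simplicity, not the trained target keyword matching model described above; the function names and the wake word "hello device" are assumptions.

```python
from difflib import SequenceMatcher
from typing import Iterable


def keyword_similarity(keyword: str, target_keyword: str) -> float:
    """Similarity in [0, 1] between a keyword and the target keyword,
    using difflib's matching ratio as a stand-in for a trained model."""
    return SequenceMatcher(None, keyword.lower(), target_keyword.lower()).ratio()


def best_similarity(keywords: Iterable[str], target_keyword: str) -> float:
    """Highest similarity among the extracted keywords; 0.0 if none."""
    return max((keyword_similarity(k, target_keyword) for k in keywords), default=0.0)


target = "hello device"  # hypothetical wake word
print(keyword_similarity("Hello Device", target))  # 1.0
print(best_similarity([], target))                 # 0.0
```

A character-level ratio rewards spelling overlap only, so a trained model that accounts for pronunciation or context would normally be preferable in practice.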
In the embodiment of the application, the second wake-up algorithm adopts a noise estimation algorithm, that is, the audio data to be recognized is processed, the noise estimation value of the audio data to be recognized is calculated, and specifically, the second wake-up result for obtaining the audio data to be recognized according to the second false wake-up algorithm may specifically include:
and step S240, calculating the noise estimation value of the audio data to be identified.
As an implementation manner of the application, the intelligent device further calculates a noise estimation value of the audio data to be identified after acquiring the audio data to be identified. It will be appreciated that the environment in which the smart device is located has many sounds that are not user-directed, such as vehicle traffic sounds, foreign noise sounds, or collision sounds, walking sounds, etc., which may also cause false wake-up of the smart device. In order to eliminate false awakening caused by noise, the noise estimation value of the audio data to be identified is calculated, and the probability that the audio data to be identified is noise can be obtained. And when the noise estimation value is higher than the preset threshold value, the audio to be identified can be judged to be the false awakening audio. Therefore, whether the audio data to be identified is the false wake-up audio data can be judged from the perspective of whether the audio data is the noise or not by calculating the noise estimation value.
As an embodiment of the present application, a noise estimation model may be trained in advance by machine learning, that is, trained on a large amount of noise audio data, to obtain a model that can compute the noise estimation value. As another embodiment of the present application, the noise estimation value of the audio data to be identified may also be calculated by using a Minimum Statistics (MS) noise estimation algorithm or a Minima Controlled Recursive Averaging (MCRA) algorithm, which is not limited in the present application.
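To illustrate the idea of tracking a noise floor, the sketch below smooths per-frame energy recursively and takes the minimum of the smoothed values as the noise floor. It is a much-simplified, time-domain stand-in and not the full MS or MCRA algorithm, which operate per frequency bin. Mapping the result to a value in [0, 1] by dividing the noise floor by the average energy is an assumption, as are the names `frame_energies` and `noise_estimate` and the default parameters.

```python
def frame_energies(samples, frame_len=160):
    """Mean-square energy of each full, non-overlapping frame."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def noise_estimate(samples, frame_len=160, alpha=0.8):
    """Return a value in [0.0, 1.0]: tracked noise floor / average energy.

    Values near 1 suggest stationary, noise-like audio; values near 0
    suggest audio with clear pauses, such as speech. This mapping is an
    illustrative assumption, not a definition taken from the application.
    """
    energies = frame_energies(samples, frame_len)
    if not energies:
        return 0.0
    smoothed = []
    prev = energies[0]
    for e in energies:
        # First-order recursive smoothing of the frame energy.
        prev = alpha * prev + (1.0 - alpha) * e
        smoothed.append(prev)
    mean_energy = sum(energies) / len(energies)
    if mean_energy == 0.0:
        return 0.0
    # The minimum of the smoothed energy serves as the noise floor.
    return min(1.0, min(smoothed) / mean_energy)
```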
Step S250, when the keyword similarity is greater than the preset keyword threshold and the noise estimation value is less than the preset noise threshold, waking up the smart device.
Optionally, the keyword threshold ranges from 70% to 90%; the noise threshold ranges from 10% to 30%.
It can be understood that the preset keyword threshold and the preset noise threshold can be set according to the requirements of the actual usage scenario. For example, the preset keyword threshold may be set to 80% and the preset noise threshold to 20%; when the keyword similarity of the audio data to be identified is greater than 80% and the noise estimation value is less than 20%, the smart device is woken up. These values are examples only, and the application is not limited thereto.
In the embodiment of the application, the keyword similarity of the audio data to be identified is calculated by the keyword similarity algorithm, and the noise estimation value is calculated by the noise estimation algorithm. By combining the keyword similarity and the noise estimation value, whether the audio data to be identified is false wake-up audio is judged comprehensively from the two perspectives of user interaction and noise, which reduces the false wake-up rate of the smart device.
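The decision of step S250 can be sketched as a simple conjunction of two threshold checks. The defaults below are the example values of 80% and 20% given above; the names `NoiseWakeConfig` and `should_wake` are hypothetical, and scores are assumed to be expressed as fractions in [0, 1].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NoiseWakeConfig:
    # Example values from the description; the stated ranges are
    # 0.70-0.90 for the keyword threshold and 0.10-0.30 for noise.
    keyword_threshold: float = 0.80
    noise_threshold: float = 0.20


def should_wake(keyword_similarity: float,
                noise_estimate: float,
                config: NoiseWakeConfig = NoiseWakeConfig()) -> bool:
    """Wake only if the keyword matches AND the audio is not noise-like."""
    return (keyword_similarity > config.keyword_threshold
            and noise_estimate < config.noise_threshold)
```

Both comparisons are strict, matching the wording "greater than" and "less than"; a score exactly equal to a threshold does not wake the device.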
Referring to fig. 4, fig. 4 is a schematic flowchart illustrating a smart device wake-up method according to another embodiment of the present application. The embodiment shown in fig. 4 is explained in detail below; the method may specifically include the following steps:
Step S310, acquiring audio data to be identified.
For detailed description of step S310, please refer to step S110, which is not described herein again.
Step S320, extracting keywords in the audio data to be identified.
Step S330, calculating the keyword similarity between the keywords and the target keywords.
In an embodiment of the present application, the first false wake-up algorithm is the keyword similarity algorithm; for detailed descriptions of steps S320 to S330, refer to steps S220 to S230.
In the embodiment of the application, the second false wake-up algorithm is an audio category judgment algorithm; that is, the audio data to be identified is processed to determine its background similarity and user similarity. Specifically, obtaining the second wake-up result of the audio data to be identified according to the second false wake-up algorithm may include:
Step S340, calculating the background similarity between the audio data to be identified and the background audio.
In the embodiment of the application, after acquiring the audio data to be identified, the smart device further calculates the background similarity between the audio data to be identified and the background audio. It will be appreciated that some background audio is likely to include target keywords. For example, when a user plays a song, the song may include target keywords; likewise, when the user plays a television program, the audio of the program is likely to include target keywords. Although such audio data includes the target keywords, the user does not intend to wake up the smart device. If the background audio is not excluded, it is likely to wake up the smart device while the user is playing music or watching television, which disturbs normal listening and viewing. Therefore, the embodiment of the application also calculates the background similarity between the audio data to be identified and the background audio.
Step S350, calculating the user similarity between the audio data to be identified and the user audio.
In the embodiment of the application, after acquiring the audio data to be identified, the smart device further calculates the user similarity between the audio data to be identified and the user audio. It can be understood that the higher the user similarity between the audio data to be identified and the user audio, the lower the probability that the audio data is background audio. Therefore, in the embodiment of the application, the audio category of the audio data to be identified is determined comprehensively from its background similarity and user similarity, so as to determine whether it is false wake-up audio.
As an embodiment of the present application, the audio category determination model may be trained in advance through a machine learning model, that is, a large amount of training audio is used to train the model, so as to obtain an audio category determination model capable of calculating the background similarity and the user similarity.
Step S360, when the keyword similarity is greater than the keyword threshold, the background similarity is less than the background threshold, and the user similarity is greater than the user threshold, waking up the smart device.
Optionally, the keyword threshold ranges from 70% to 90%; the background threshold ranges from 10% to 30%; the user threshold ranges from 80% to 95%.
It is understood that the keyword threshold, the background threshold, and the user threshold may be set according to the needs of the actual usage scenario. For example, the keyword threshold may be set to 80%, the background threshold to 20%, and the user threshold to 90%; when the keyword similarity of the audio data to be identified is greater than 80%, the background similarity is less than 20%, and the user similarity is greater than 90%, the smart device is woken up. These values are examples only, and the application is not limited thereto.
In the embodiment of the application, the keyword similarity of the audio data to be identified is calculated by the keyword similarity algorithm, and the background similarity and the user similarity are calculated by the audio category judgment algorithm. By combining the keyword similarity, the background similarity, and the user similarity, whether the audio data to be identified is false wake-up audio is judged comprehensively from the two perspectives of user interaction and audio category, which reduces the false wake-up rate of the smart device.
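The three-condition decision of step S360 can be sketched as below. The sketch also reports which checks failed, which is an addition for illustration and not part of the described method; the names `WakeDecision` and `decide_wake` are hypothetical, and the defaults are the example thresholds of 80%, 20%, and 90%.

```python
from typing import List, NamedTuple


class WakeDecision(NamedTuple):
    wake: bool
    failed_checks: List[str]


def decide_wake(keyword_sim: float,
                background_sim: float,
                user_sim: float,
                keyword_threshold: float = 0.80,
                background_threshold: float = 0.20,
                user_threshold: float = 0.90) -> WakeDecision:
    """Wake only when all three conditions of step S360 hold."""
    failed = []
    if not keyword_sim > keyword_threshold:
        failed.append("keyword")      # no clear intention to interact
    if not background_sim < background_threshold:
        failed.append("background")   # too similar to background audio
    if not user_sim > user_threshold:
        failed.append("user")         # not similar enough to user audio
    return WakeDecision(wake=not failed, failed_checks=failed)
```

Recording the failed checks can help when tuning the thresholds for a particular usage scenario.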
Referring to fig. 5, fig. 5 is a schematic flowchart illustrating a smart device wake-up method according to yet another embodiment of the present application. The embodiment shown in fig. 5 is explained in detail below; the method may specifically include the following steps:
Step S410, acquiring audio data to be identified.
For detailed description of step S410, please refer to step S110, which is not described herein again.
Step S420, acquiring the current state of the intelligent device, wherein the current state comprises a working state and a dormant state.
In an embodiment of the present application, the current state of the smart device includes a working state and a dormant state. As an implementation manner, when the smart device has not been woken up within a preset time, it enters the dormant state, and when it is woken up again, it enters the working state. As another embodiment, a working period of the smart device may also be set; that is, the smart device is in the working state during the working period and in the dormant state outside it.
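The first implementation manner above, in which the device becomes dormant after a preset time without being woken up, can be sketched as a small state tracker. The class and method names, the 300-second default, and the choice to start in the working state are assumptions; the clock is injectable so the behaviour can be checked without waiting.

```python
import time
from enum import Enum


class DeviceState(Enum):
    WORKING = "working"
    DORMANT = "dormant"


class StateTracker:
    """Derives the current state from the time since the last wake-up."""

    def __init__(self, idle_timeout_s: float = 300.0, clock=time.monotonic):
        self._idle_timeout_s = idle_timeout_s
        self._clock = clock
        # Assumption: the device counts as freshly woken when created.
        self._last_wake = clock()

    def mark_woken(self) -> None:
        """Record a successful wake-up; the device returns to WORKING."""
        self._last_wake = self._clock()

    def current_state(self) -> DeviceState:
        idle = self._clock() - self._last_wake
        if idle >= self._idle_timeout_s:
            return DeviceState.DORMANT
        return DeviceState.WORKING
```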
Step S430, calculating the volume value of the audio data to be identified.
As an implementation mode of the application, the intelligent device is provided with an audio acquisition device, and the audio acquisition device can calculate the volume value of the audio data to be identified according to the amplitude and the like of the audio data to be identified.
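One common way to derive a volume value from amplitude is the root-mean-square level expressed in decibels relative to full scale. The application only says the volume is calculated "according to the amplitude and the like", so the choice of RMS, the dBFS unit, and the name `volume_dbfs` are assumptions made for this sketch.

```python
import math


def volume_dbfs(samples, full_scale: float = 1.0, floor_db: float = -120.0) -> float:
    """RMS level of the samples in dB relative to full scale.

    Silence or empty input returns floor_db instead of negative infinity.
    """
    if not samples:
        return floor_db
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    if rms <= 0.0:
        return floor_db
    return max(floor_db, 20.0 * math.log10(rms / full_scale))
```

A full-scale signal yields 0 dBFS, and each tenfold drop in amplitude lowers the value by 20 dB.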
Step S440, judging whether the current state is the working state. If so, executing step S441; if not, executing step S443.
Step S441, judging whether the volume value of the audio data to be identified is smaller than a first preset volume value. If so, executing step S442; if not, executing step S450.
Step S442, exiting the wake-up process.
Step S443, judging whether the current state is the dormant state. If so, executing step S444.
Step S444, judging whether the volume value of the audio data to be identified is smaller than a second preset volume value. If so, executing step S445 and exiting the wake-up process; if not, executing step S450.
As an implementation manner of the application, regardless of whether the smart device is in the working state or the dormant state, the volume value of the audio data to be identified must reach the corresponding preset volume value before the smart device can be woken up, because audio with too low a volume is more likely to be false wake-up audio. As an embodiment of the present application, the period in which the smart device has not been woken up within the preset time may be a period when the user is out or resting at night. To prevent noise from falsely waking up the smart device during such a period, the volume value required to wake up the smart device in the dormant state is higher than that required in the working state; that is, the second preset volume value is greater than the first preset volume value.
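The state-dependent volume gate of steps S440 to S444 can be sketched as follows. The numeric defaults and their unit are hypothetical, since the application gives no concrete volume values; only the ordering (first preset value smaller than second) comes from the description. The name `passes_volume_gate` is also hypothetical.

```python
def passes_volume_gate(is_working: bool,
                       volume: float,
                       first_preset_volume: float = 40.0,
                       second_preset_volume: float = 55.0) -> bool:
    """Return True if the wake-up process should continue to step S450.

    The working state uses the first (lower) preset volume value and the
    dormant state uses the second (higher) one.
    """
    if first_preset_volume >= second_preset_volume:
        raise ValueError("first preset volume must be smaller than the second")
    threshold = first_preset_volume if is_working else second_preset_volume
    # A volume smaller than the threshold exits the wake-up process.
    return volume >= threshold
```

With these example values, audio at volume 45 continues to the false wake-up checks when the device is working but is rejected when it is dormant.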
Step S450, obtaining a first wake-up result of the audio data to be identified according to the first false wake-up algorithm.
Step S460, obtaining a second wake-up result of the audio data to be identified according to the second false wake-up algorithm.
Step S470, when the first wake-up result and the second wake-up result conform to the preset wake-up relationship, waking up the smart device.
For the detailed description of steps S450 to S470, refer to steps S120 to S140, which are not described herein again.
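The overall flow of fig. 5 can be sketched as a function that takes the individual stages as callables. This is an illustrative arrangement, not the claimed implementation: the name `run_wake_pipeline` and its parameters are hypothetical, and the preset wake-up relationship is modelled as an arbitrary predicate over the two results.

```python
from typing import Callable, Sequence, TypeVar

Audio = Sequence[float]
R1 = TypeVar("R1")
R2 = TypeVar("R2")


def run_wake_pipeline(audio: Audio,
                      volume_gate: Callable[[Audio], bool],
                      first_algorithm: Callable[[Audio], R1],
                      second_algorithm: Callable[[Audio], R2],
                      wake_relation: Callable[[R1, R2], bool]) -> bool:
    """Return True if the smart device should be woken up."""
    # Steps S430-S444: exit early when the audio is too quiet.
    if not volume_gate(audio):
        return False
    # Step S450: first wake-up result, e.g. a keyword similarity.
    first_result = first_algorithm(audio)
    # Step S460: second wake-up result, e.g. a noise estimation value.
    second_result = second_algorithm(audio)
    # Step S470: wake only if both results satisfy the preset relationship.
    return wake_relation(first_result, second_result)
```

Placing the volume gate first means the two false wake-up algorithms, which are likely the more expensive stages, are skipped for audio that is too quiet.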
Referring to fig. 6, fig. 6 shows an electronic device 500 provided in an embodiment of the present application, which includes a memory 510, a processor 520, and a computer program stored in the memory 510 and executable on the processor 520, and when the computer program is executed by the processor 520, the method described in the foregoing method embodiment is implemented.
Processor 520 may include one or more processing cores. The processor 520 connects the various parts of the electronic device 500 through various interfaces and lines, and performs the functions of the electronic device 500 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 510 and by invoking data stored in the memory 510. Alternatively, the processor 520 may be implemented in hardware using at least one of a Digital Signal Processor (DSP), a Field-Programmable Gate Array (FPGA), and a Programmable Logic Array (PLA). The processor 520 may integrate one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, user interface, application programs, and the like; the GPU renders and draws display content; the modem handles wireless communications. It is understood that the modem may also not be integrated into the processor 520 and may instead be implemented by a separate communication chip.
The memory 510 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory 510 may be used to store instructions, programs, code sets, or instruction sets. The memory 510 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (e.g., fetch, select, fetch, control, etc.), instructions for implementing the foregoing method embodiments, and the like. The data storage area may also store data created by the electronic device 500 in use, such as user input information, current state information, preset task rules, task execution information, and the like.
Referring to fig. 7, fig. 7 is a block diagram illustrating a control apparatus 600 of an intelligent device according to an embodiment of the present disclosure. As will be explained below with reference to the block diagram shown in fig. 7, the control apparatus 600 of the smart device of the present embodiment is applied to a smart device, and includes:
the audio data to be identified acquiring module 610 is configured to acquire audio data to be identified;
a first wake-up result obtaining module 620, configured to obtain a first wake-up result of the audio data to be identified according to a first false wake-up algorithm;
a second wake-up result obtaining module 630, configured to obtain a second wake-up result of the audio data to be identified according to a second false wake-up algorithm;
and the waking module 640 is configured to perform waking on the smart device when the first waking result and the second waking result conform to a preset waking relationship.
It can be clearly understood by those skilled in the art that the control apparatus 600 of the smart device provided in the embodiment of the present application can implement each process implemented by the smart device in the method embodiment of fig. 1. For convenience and brevity of description, for the specific working processes of the apparatus and modules described above, refer to the corresponding processes in the foregoing method embodiments, which are not described herein again.
In some embodiments, the first wake up result obtaining module 620 may include:
a keyword extraction unit, configured to extract keywords in the audio data to be identified;
and a similarity calculation unit, configured to calculate the keyword similarity between the keywords and the target keywords.
In some embodiments, the second wake up result obtaining module 630 may include:
and the noise estimation value calculation unit is used for calculating the noise estimation value of the audio data to be identified.
In some embodiments, the wake-up module 640 may be further configured to wake up the smart device when the keyword similarity is greater than a preset keyword threshold and the noise estimation value is less than a preset noise threshold.
In some embodiments, the second wake up result obtaining module 630 may include:
the background similarity calculation unit is used for calculating the background similarity between the audio data to be identified and the background audio;
and the user similarity calculation unit is used for calculating the user similarity between the audio data to be identified and the user audio.
In some embodiments, the wake module 640 may be further configured to perform a wake on the smart device when the keyword similarity is greater than the keyword threshold, the background similarity is less than the background threshold, and the user similarity is greater than the user threshold.
In some embodiments, the control apparatus 600 of the smart device further includes:
a current state acquisition module, configured to acquire the current state of the smart device, wherein the current state includes a working state and a dormant state;
a volume value calculation module, configured to calculate the volume value of the audio data to be identified;
the first judgment module is used for judging whether the volume value of the audio data to be identified is smaller than a first preset volume value or not when the current state is the working state, and if so, exiting the awakening process;
the second judging module is used for judging whether the volume value of the audio data to be identified is smaller than a second preset volume value or not when the current state is the dormant state, and if so, exiting the awakening process;
wherein the first preset volume value is smaller than the second preset volume value.
Referring to fig. 8, fig. 8 is a block diagram illustrating a computer-readable storage medium according to an embodiment of the present disclosure. The computer readable storage medium has stored therein a program code 810, said program code 810 being invokable by the processor for performing the method described in the above method embodiments.
The computer-readable storage medium may be an electronic memory such as a flash memory, an electrically-erasable programmable read-only memory (EEPROM), an erasable programmable read-only memory (EPROM), a hard disk, or a ROM. Optionally, the computer-readable storage medium includes a non-volatile computer-readable storage medium. The computer readable storage medium has storage space for program code for performing any of the method steps of the above-described method. The program code can be read from or written to one or more computer program products. The program code may, for example, be compressed in a suitable form.
Alternatively, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with a computer program embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. The computer program embodied on the computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
To sum up, according to the smart device wake-up method and apparatus, the electronic device, and the storage medium provided by the embodiments of the application, audio data to be identified is acquired, a first wake-up result of the audio data to be identified is obtained according to a first false wake-up algorithm, a second wake-up result of the audio data to be identified is obtained according to a second false wake-up algorithm, and the smart device is woken up when the first wake-up result and the second wake-up result conform to a preset wake-up relationship. Therefore, in the scheme provided by the embodiments of the application, the audio data to be identified is processed by the first false wake-up algorithm and the second false wake-up algorithm respectively to obtain the first wake-up result and the second wake-up result, and whether to wake up the smart device is judged by combining the two results, which reduces the false wake-up rate of the smart device.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not necessarily depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. A smart device wake-up method, the method comprising:
acquiring audio data to be identified;
obtaining a first awakening result of the audio data to be identified according to a first false awakening algorithm;
obtaining a second awakening result of the audio data to be identified according to a second false awakening algorithm;
and when the first awakening result and the second awakening result accord with a preset awakening relation, awakening the intelligent equipment.
2. The method of claim 1, wherein the first false wake algorithm is a keyword similarity algorithm;
the step of obtaining the first wake-up result of the audio data to be identified according to the first false wake-up algorithm specifically includes:
extracting keywords in the audio data to be identified,
and calculating the similarity between the keywords and the target keywords.
3. The method according to claim 2, wherein the second false wake-up algorithm is a noise estimation algorithm, and the step of obtaining the second wake-up result of the audio data to be recognized according to the second false wake-up algorithm specifically includes:
and calculating a noise estimation value of the audio data to be identified.
4. The method according to claim 3, wherein the step of performing the wake-up on the smart device when the first wake-up result and the second wake-up result conform to a preset wake-up relationship specifically comprises:
and when the similarity of the keywords is greater than a keyword threshold value and the noise estimation value is smaller than a noise threshold value, awakening the intelligent equipment.
5. The method of claim 2, wherein the second false wake-up algorithm is an audio class determination algorithm;
the step of obtaining a second wake-up result of the audio data to be identified according to a second false wake-up algorithm specifically includes:
calculating the background similarity between the audio data to be identified and background audio;
and calculating the user similarity of the audio data to be identified and the user audio.
6. The method of claim 5, wherein when the first wake-up result and the second wake-up result conform to a preset wake-up relationship, the step of performing wake-up on the smart device comprises:
and when the keyword similarity is greater than a keyword threshold, the background similarity is less than a background threshold, and the user similarity is greater than a user threshold, performing awakening on the intelligent device.
7. The method according to claim 1, wherein the step of obtaining the first wake-up result of the audio data to be identified according to the first false wake-up algorithm is preceded by the method further comprising the steps of:
acquiring the current state of the intelligent equipment, wherein the current state comprises a working state and a dormant state;
calculating a volume value of the audio data to be identified,
when the current state is a working state, judging whether the volume value of the audio data to be identified is smaller than a first preset volume value, if so, exiting the awakening process;
when the current state is a dormant state, judging whether the volume value of the audio data to be identified is smaller than a second preset volume value, if so, exiting the awakening process;
wherein the first preset volume value is smaller than the second preset volume value.
8. An apparatus for waking up a smart device, the apparatus comprising:
the audio data to be identified acquisition module is used for acquiring the audio data to be identified;
the first awakening result acquisition module is used for acquiring a first awakening result of the audio data to be identified according to a first false awakening algorithm;
the second awakening result acquisition module is used for acquiring a second awakening result of the audio data to be identified according to a second false awakening algorithm;
and the awakening module is used for executing awakening on the intelligent equipment when the first awakening result and the second awakening result accord with a preset awakening relation.
9. An electronic device, characterized in that the electronic device comprises:
one or more processors;
a memory;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications configured to perform the method of any of claims 1-7.
10. A computer-readable storage medium, having stored thereon program code that can be invoked by a processor to perform the method according to any one of claims 1 to 7.
Application number: CN202011311387.4A
Priority date: 2020-11-20
Filing date: 2020-11-20
Publication: CN112233676A (en), published 2021-01-15
Status: Pending
Family ID: 74124542
Country: CN

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination