CN110265036A

CN110265036A - Voice awakening method, system, electronic equipment and computer readable storage medium

Info

Publication number: CN110265036A
Application number: CN201910492994.6A
Authority: CN
Inventors: 李波; 夏波; 詹昌寿
Original assignee: Hunan Guosheng Acoustics Technology Co Ltd Shenzhen Branch; Hunan National Sound Pal Acoustics Technology Ltd
Current assignee: Hunan Guosheng Acoustics Technology Co Ltd Shenzhen Branch; Hunan National Sound Pal Acoustics Technology Ltd
Priority date: 2019-06-06
Filing date: 2019-06-06
Publication date: 2019-09-20
Also published as: WO2020244257A1

Abstract

The embodiments of the invention disclose a voice wake-up method, system, electronic device and computer-readable storage medium, and relate to the technical field of intelligent devices. The method includes: if a voice activity detection process detects that a voice signal collected by a microphone meets a first preset condition, triggering a processor to read a data signal collected by a gravity sensor; judging, according to the data signal, whether the voice signal is generated by the speech of a user wearing the electronic device; if not, ignoring the voice signal; or, if so, starting a keyword recognition process and, if the voice signal is recognized to contain a preset voice instruction keyword, controlling the electronic device to execute a corresponding function. The embodiments of the invention can save energy consumption of the electronic device and avoid false wake-up of the electronic device; in addition, no dedicated bone conduction microphone or other contact microphone is needed, so the cost is low, and the algorithm is simple and practical, has high accuracy and consumes few resources.

Description

Voice wake-up method, system, electronic device and computer readable storage medium

Technical Field

The present invention relates to the field of voice processing technologies, and in particular, to a voice wake-up method, system, electronic device, and computer-readable storage medium.

Background

With the development of science and technology, electronic devices commonly provide a voice wake-up function: a wake-up word is preset in the device or its software, and when a user utters the corresponding voice instruction, the device is woken from a sleep state.

In a traditional voice wake-up scheme, the audio signal collected by a microphone is monitored through Voice Activity Detection (VAD), the voice energy of the audio signal is computed, and when the voice energy is greater than a preset threshold, a processor is triggered to start keyword recognition so as to judge whether the audio signal is a voice instruction issued by a user. This scheme does not consider whether the audio signal collected by the microphone is caused by the wearer speaking, so the device may be woken up by mistake; that is, the device is also triggered when people nearby unintentionally speak the keyword. Moreover, in a noisy environment the VAD is triggered continuously, causing the digital signal processor to perform keyword recognition repeatedly, so that power consumption is high.

To address the above defects of the traditional voice wake-up scheme, existing voice wake-up schemes generally determine whether the audio signal collected by the microphone is caused by the wearer speaking before triggering the processor to perform keyword recognition. However, the prior art generally uses a dedicated bone conduction microphone or another contact microphone to extract an audio signal for this determination, and the high cost of such microphones makes the overall device expensive. In addition, some prior art judges through a software algorithm whether the microphone audio signal is caused by the wearer speaking, but the judgment algorithm is generally complex, so that the judgment itself consumes considerable resources.

In summary, traditional and existing voice wake-up schemes are prone to false wake-up of the device, have high power consumption and high device cost, and rely on complex judgment algorithms that themselves consume considerable resources.

Disclosure of Invention

In view of this, embodiments of the present invention provide a voice wake-up method, a system, an electronic device, and a computer-readable storage medium, so as to solve the problems that existing voice wake-up schemes may cause a device to be woken up by mistake, have high power consumption and high device cost, and use complex determination algorithms that consume relatively many resources.

A first aspect of an embodiment of the present invention provides a voice wake-up method, which is applied to an electronic device, where the electronic device includes a processor, a gravity sensor and a microphone, where the gravity sensor and the microphone are electrically connected to the processor, respectively, and the voice wake-up method includes executing, by the processor, the following steps:

if the voice activity detection process monitors that the voice signals collected by the microphone meet a first preset condition, triggering the processor to read the data signals collected by the gravity sensor;

judging whether the voice signal is generated by the speech of a user wearing the electronic equipment according to the data signal;

if the voice signal is not generated by the user speaking, ignoring the voice signal; or,

and if the voice signal is generated by the user speaking, starting a keyword recognition process, and if the voice signal is recognized to contain a preset voice instruction keyword, controlling the electronic equipment to execute a corresponding function.

Wherein the determining whether the voice signal is generated by a user wearing the electronic device speaking according to the data signal comprises:

performing time-frequency conversion on the data signals, and screening out the data signals with the frequency within a preset frequency range;

counting the signal energy of the data signal with the frequency within the preset frequency range on a frequency domain;

judging whether the signal energy is greater than a first preset energy threshold value or not;

if the signal energy is larger than the first preset energy threshold, the voice signal is generated by the speech of a user wearing the electronic equipment;

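if the signal energy is less than or equal to the first preset energy threshold, it is indicated that the voice signal is not generated by the speech of the user wearing the electronic device.

Alternatively, the determining whether the voice signal is generated by a user wearing the electronic device speaking according to the data signal comprises: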

performing band-pass filtering processing on the data signals, and screening out the data signals with the frequency within a preset frequency range;

counting the signal energy of the data signal with the frequency within the preset frequency range on a time domain;

if the energy value is larger than the first preset energy threshold value, the voice signal is generated by the speech of a user wearing the electronic equipment;

Wherein, if the voice activity detection process running on the processor monitors that the voice signal collected by the microphone accords with a first preset condition, the method further comprises the following steps:

judging whether to start the keyword recognition process in advance according to the word loss degree allowed by the keyword recognition process and the microphone starting speed;

if the word loss degree allowed by the keyword recognition process and the microphone starting speed meet a second preset condition, the keyword recognition process is started in advance, and the keyword recognition process and the process for detecting whether the voice signal is generated by the speech of the user wearing the electronic equipment are synchronously carried out at the moment;

if the voice signal contains a preset voice instruction keyword and the voice signal is generated by the user speaking, controlling the electronic equipment to execute a corresponding function; or,

and if the voice signal does not contain a preset voice instruction keyword or the voice signal is not generated by the user speaking, ignoring the voice signal.

The second preset condition is that the word loss degree allowed by the keyword recognition process is smaller than a preset word loss degree threshold value and the microphone starting speed is smaller than a preset starting speed threshold value.

The first preset condition is that the voice energy of the voice signal is larger than a second preset energy threshold.
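The two threshold tests described in this first aspect can be restated compactly. The notation below is introduced here purely for illustration and does not appear in the original text: $x_{\mathrm{mic}}$ is the microphone signal, $X_g[k]$ is the time-frequency transform of the gravity sensor data signal, $f_k$ is the frequency of bin $k$, $B$ is the preset frequency range, and $T_1$, $T_2$ are the first and second preset energy thresholds. Taking "energy" as a sum of squares is also an assumption, since the text does not fix a formula.

```latex
% First preset condition (voice activity detection on the microphone signal)
E_{\mathrm{mic}} = \sum_{n} x_{\mathrm{mic}}[n]^{2} > T_{2}

% Wearer-speech decision (frequency-domain variant on the gravity sensor signal)
E_{g} = \sum_{k \,:\, f_{k} \in B} \left| X_{g}[k] \right|^{2},
\qquad
\text{wearer speaking} \iff E_{g} > T_{1}
```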

A second aspect of an embodiment of the present invention provides a voice wake-up system, which is applied to an electronic device, where the electronic device includes a processor, a gravity sensor and a microphone, the gravity sensor and the microphone are electrically connected to the processor, respectively, and the voice wake-up system includes:

the voice activity detection unit is used for triggering the processor to read the data signal acquired by the gravity sensor if the voice activity detection process monitors that the voice signal acquired by the microphone conforms to a first preset condition;

the first judging unit is used for judging whether the voice signal is generated by the speech of a user wearing the electronic equipment according to the data signal;

an execution unit, configured to ignore the voice signal if the voice signal is not generated by the user speaking; or, configured to start a keyword recognition process if the voice signal is generated by the user speaking, and to control the electronic device to execute a corresponding function if the voice signal is recognized to include a preset voice instruction keyword.

The first judging unit is specifically configured to:

if the signal energy is less than or equal to the first preset energy threshold, it is indicated that the voice signal is not generated by the speech of the user wearing the electronic equipment;

or, the first judging unit is specifically configured to:

A third aspect of an embodiment of the present invention provides an electronic device, including a gravity sensor, a microphone, a memory, a processor, and a computer program stored in the memory and executable on the processor, where the gravity sensor, the microphone, and the memory are all electrically connected to the processor, and the processor implements the steps of the voice wake-up method according to any one of the embodiments of the first aspect when executing the computer program.

A fourth aspect of embodiments of the present invention provides a computer-readable storage medium, which stores a computer program, and the computer program, when executed by a processor, implements the steps of the voice wake-up method according to any one of the embodiments of the first aspect.

Whereas wake-up schemes in the prior art may cause false wake-up of the device, high power consumption and high device cost, the voice wake-up method, system, electronic device and computer-readable storage medium provided by the embodiments of the invention further judge, when the voice activity detection process detects that the voice signal collected by the microphone meets the first preset condition, whether the voice signal is generated by the speech of the user wearing the electronic device, and start the keyword recognition process when it is. This saves energy consumption of the electronic device and avoids false wake-up of the electronic device. In addition, whether the voice signal is generated by the speech of the user wearing the electronic device is judged from the data signal collected by the gravity sensor already carried by the electronic device, so no dedicated bone conduction microphone or other contact microphone is needed; the cost is low, and the algorithm is simple and practical, has high accuracy and consumes few resources.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without inventive effort.

Fig. 1 is a block diagram of an electronic device according to an embodiment of the present invention;

fig. 2 is a schematic flowchart of a specific implementation of the voice wake-up method according to a first embodiment of the present invention;

fig. 3 is a schematic flowchart of a specific implementation of the voice wake-up method according to a second embodiment of the present invention;

fig. 4 is a schematic structural diagram of a voice wake-up system according to a third embodiment of the present invention;

fig. 5 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention.

Detailed Description

In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, interface switching devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.

In order to explain the technical means of the present invention, the following description will be given by way of specific examples.

Fig. 1 is a block diagram of an electronic device according to an embodiment of the present invention. Only the portions related to the present embodiment are shown for convenience of explanation.

Referring to fig. 1, an electronic device 100 according to an embodiment of the present invention includes a processor 103, a gravity sensor 101, and a microphone 102, where the gravity sensor 101 and the microphone 102 are electrically connected to the processor 103 respectively.

The electronic device 100 includes, but is not limited to, a smart wearable device such as a headset. The microphone 102 is a conventional, low-cost microphone built into the electronic device 100. The gravity sensor 101 is a sensor provided in the electronic device 100 for determining the wearing state and implementing single-click and double-click functions.

Based on the structure of the electronic device 100 described above, the following embodiments of the present invention are proposed.

Example one

Fig. 2 is a flowchart illustrating a specific implementation of the voice wake-up method according to an embodiment of the present invention, where the method is applied to the electronic device 100 shown in fig. 1, and an execution subject of the method is the processor 103 in the electronic device 100 shown in fig. 1. Referring to fig. 2, the voice wake-up method provided in this embodiment may include the following steps:

Step S201, if the voice activity detection process monitors that the voice signal collected by the microphone 102 meets a first preset condition, the processor 103 is triggered to read the data signal collected by the gravity sensor 101.

In a specific implementation manner, the first preset condition is that the voice energy of the voice signal is greater than a second preset energy threshold. The voice activity detection process and the microphone 102 remain on while the electronic device 100 is in a standby state. When the microphone 102 collects a voice signal, the voice signal is transmitted to the voice activity detection process, which detects the voice energy of the voice signal; when the voice energy is greater than the second preset energy threshold, the processor 103 is triggered to read the data signal collected by the gravity sensor 101.

In this embodiment, the processor 103 is triggered to read the data signal collected by the gravity sensor 101 only when the detected voice energy of the voice signal is greater than the second preset energy threshold. This prevents the processor 103 from being repeatedly triggered, in a noisy environment, to run the procedure of judging whether the voice signal is generated by the speech of the user wearing the electronic device 100, and thus further saves power consumption of the terminal.
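The energy gate of step S201 can be sketched as follows. This is a minimal illustration, not the patented implementation: the text does not say how voice energy is computed, so mean squared amplitude per frame is assumed here, and the function names, sample values and threshold value are all hypothetical.

```python
from typing import Sequence


def frame_energy(samples: Sequence[float]) -> float:
    """Mean squared amplitude of one microphone frame (assumed energy measure)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def meets_first_preset_condition(
    samples: Sequence[float], second_energy_threshold: float
) -> bool:
    """First preset condition: voice energy greater than the second preset threshold."""
    return frame_energy(samples) > second_energy_threshold


# Hypothetical values, chosen only to show both branches.
SECOND_ENERGY_THRESHOLD = 0.01
quiet_frame = [0.01, -0.02, 0.015, -0.01]
loud_frame = [0.4, -0.5, 0.45, -0.35]

if meets_first_preset_condition(loud_frame, SECOND_ENERGY_THRESHOLD):
    print("trigger processor to read the gravity sensor")
```

Only frames that pass this gate would lead to a gravity sensor read, which is what keeps the later stages idle in a quiet environment.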

Step S202, judging whether the voice signal is generated by the speech of the user wearing the electronic equipment 100 according to the data signal; if the voice signal is not generated by the user speaking, step S203 is entered; if the voice signal is generated by the user speaking, go to step S204.

In this embodiment, the electronic device 100 is worn on the head of the user. The vibration frequency and amplitude of the maxilla and the mandible differ depending on whether or not the user is speaking, so the data signals collected by the gravity sensor 101 differ accordingly. Therefore, whether the voice signal is generated by the speech of the user wearing the electronic device 100 can be determined by analyzing the data signal collected by the gravity sensor 101.

In a specific implementation manner, the determining whether the voice signal is generated by a speech of a user wearing the electronic device 100 according to the data signal includes:

if the signal energy is greater than the first preset energy threshold, it indicates that the voice signal is generated by a user wearing the electronic device 100 speaking;

if the signal energy is less than or equal to the first preset energy threshold, it indicates that the voice signal is not generated by the speech of the user wearing the electronic device 100.
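The frequency-domain variant of this judgment (time-frequency conversion, selection of the preset frequency range, energy comparison) might look like the sketch below. It is an illustration under stated assumptions: a naive discrete Fourier transform stands in for whatever transform the device uses, the 1/n scaling of the energy is an arbitrary choice, and the sample rate, frame length, threshold and all function names are hypothetical. The band limits default to the 300-3000 Hz range this embodiment states.

```python
import cmath
import math
from typing import List, Sequence


def one_sided_dft(signal: Sequence[float]) -> List[complex]:
    """Naive O(n^2) DFT returning bins 0..n//2 (adequate for short frames)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def band_energy(
    signal: Sequence[float], sample_rate_hz: float, f_lo_hz: float, f_hi_hz: float
) -> float:
    """Energy of the one-sided spectrum inside [f_lo_hz, f_hi_hz], scaled by 1/n."""
    n = len(signal)
    if n == 0:
        return 0.0
    energy = 0.0
    for k, x in enumerate(one_sided_dft(signal)):
        freq_hz = k * sample_rate_hz / n
        if f_lo_hz <= freq_hz <= f_hi_hz:
            energy += abs(x) ** 2
    return energy / n


def is_wearer_speaking(
    data_signal: Sequence[float],
    sample_rate_hz: float,
    first_energy_threshold: float,
    f_lo_hz: float = 300.0,
    f_hi_hz: float = 3000.0,
) -> bool:
    """True when the in-band energy is greater than the first preset threshold."""
    return band_energy(data_signal, sample_rate_hz, f_lo_hz, f_hi_hz) > first_energy_threshold


def tone(freq_hz: float, sample_rate_hz: float = 8000.0, n: int = 64) -> List[float]:
    """Synthetic test signal standing in for gravity sensor data."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate_hz) for t in range(n)]


in_band = is_wearer_speaking(tone(1000.0), 8000.0, first_energy_threshold=1.0)
out_of_band = is_wearer_speaking(tone(125.0), 8000.0, first_energy_threshold=1.0)
```

With these synthetic inputs the 1000 Hz tone falls inside the band and the 125 Hz tone outside it, so only the first exceeds the hypothetical threshold. A real sensor would need a sampling rate of at least twice the upper band limit for the band to be observable at all.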

In another specific implementation, the determining whether the voice signal is generated by a user wearing the electronic device 100 speaking according to the data signal includes:

if the energy value is greater than the first preset energy threshold value, it indicates that the voice signal is generated by the speech of the user wearing the electronic device 100;

It should be noted that, in both of the above implementation manners, the first preset energy threshold is an energy threshold obtained in advance through extensive training and used for distinguishing whether a voice signal is generated by the speech of the user wearing the electronic device. When the user wearing the electronic device speaks, the vibration frequency and amplitude of the maxilla and the mandible are large, so the signal energy of the data signal collected by the gravity sensor 101 is large; therefore, a signal energy greater than the first preset energy threshold indicates that the voice signal is generated by the speech of the user wearing the electronic device. Conversely, a signal energy less than or equal to the first preset energy threshold indicates that the vibration frequency and amplitude of the maxilla and the mandible are both small, and thus that the voice signal is not generated by the speech of the user wearing the electronic device.

The preset frequency range is the frequency range of the voice signal when a person speaks; specifically, it is 300-3000 Hz. Because the frequency band of human speech differs from that of environmental noise, this embodiment counts only the signal energy within the preset frequency range, that is, within the band in which speech lies. This filters out the influence of other noise energy on the judgment result, so that the judgment result is more accurate.
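The time-domain variant (band-pass filtering followed by an energy count on the filtered samples) can be sketched in the same spirit. The text does not specify a filter design, so a single second-order band-pass section with the widely used audio "cookbook" coefficients is assumed here; its skirts are gentle, and a real design would likely use a steeper filter. Mean squared amplitude is again assumed as the energy measure, and the sample rate, threshold and function names are hypothetical.

```python
import math
from typing import List, Sequence


def bandpass_biquad(
    signal: Sequence[float], sample_rate_hz: float, f_lo_hz: float, f_hi_hz: float
) -> List[float]:
    """Second-order band-pass (unity gain at the geometric centre of the band)."""
    f0 = math.sqrt(f_lo_hz * f_hi_hz)      # centre frequency
    q = f0 / (f_hi_hz - f_lo_hz)           # quality factor from the bandwidth
    w0 = 2.0 * math.pi * f0 / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    b0, b1, b2 = alpha, 0.0, -alpha
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    x1 = x2 = y1 = y2 = 0.0
    out: List[float] = []
    for x in signal:
        # Direct form I difference equation.
        y = (b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def time_domain_energy(samples: Sequence[float]) -> float:
    """Mean squared amplitude (assumed energy measure)."""
    return sum(s * s for s in samples) / len(samples) if samples else 0.0


def is_wearer_speaking_time_domain(
    data_signal: Sequence[float],
    sample_rate_hz: float,
    first_energy_threshold: float,
    f_lo_hz: float = 300.0,
    f_hi_hz: float = 3000.0,
) -> bool:
    filtered = bandpass_biquad(data_signal, sample_rate_hz, f_lo_hz, f_hi_hz)
    return time_domain_energy(filtered) > first_energy_threshold


def sine(freq_hz: float, sample_rate_hz: float = 8000.0, n: int = 800) -> List[float]:
    """Synthetic test signal standing in for gravity sensor data."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate_hz) for t in range(n)]


speech_like = is_wearer_speaking_time_domain(sine(1000.0), 8000.0, 0.1)
slow_motion = is_wearer_speaking_time_domain(sine(20.0), 8000.0, 0.1)
```

The 20 Hz input stands in for slow head movement: it is strongly attenuated by the filter, so its residual energy stays under the hypothetical threshold while the 1000 Hz input passes almost unchanged. Compared with the frequency-domain variant, this form needs no transform and processes one sample at a time, which may suit a small processor.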

Step S203, ignoring the speech signal.

In this embodiment, if the voice signal is not caused by the speech of the user wearing the electronic device 100, it indicates that the voice signal is generated by ambient noise or speech of another person, and is not a voice control instruction input by the user wearing the electronic device 100, so that the voice signal is ignored, and keyword recognition is not further performed on the voice signal, which can save power consumption of the electronic device 100.

Step S204, starting a keyword recognition process, and if it is recognized that the voice signal includes a preset voice instruction keyword, controlling the electronic device 100 to execute a corresponding function.

In this embodiment, if the voice signal is caused by the speech of the user wearing the electronic device 100, the voice signal may be a voice control instruction input by the user. A keyword recognition process is therefore started to recognize whether the voice signal includes a preset voice instruction keyword. If it does, the electronic device 100 is controlled to execute the corresponding voice control function; if it does not, the voice signal was generated by the user speaking but is not a voice control instruction, so the voice signal is ignored and the electronic device 100 is not woken up.

As can be seen from the above, in the voice wake-up method provided in this embodiment, when the voice activity detection process monitors that the voice signal acquired by the microphone 102 meets the first preset condition, it is further determined whether the voice signal is generated by the speech of the user wearing the electronic device 100, and when the voice signal is generated by the speech of the user, the keyword recognition process is started, so that the energy consumption of the electronic device 100 can be saved, and the electronic device 100 can be prevented from being woken up by mistake; in addition, whether the voice signal is generated by the speech of the user wearing the electronic device 100 is judged by the data signal collected by the gravity sensor 101 carried by the electronic device 100, so that a special bone conduction microphone or other contact microphones are not needed, the cost is low, the algorithm is simple and practical, the accuracy is high, and the consumed resources are few.
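The ordering of steps S201 to S204 can be sketched as a small control function. The detectors are passed in as callables so the sketch stays independent of any particular signal processing; every name below (`voice_wake_up`, `Outcome`, the parameter names) is hypothetical and chosen only for illustration.

```python
from enum import Enum
from typing import Callable, Optional, Sequence


class Outcome(Enum):
    NO_VOICE_ACTIVITY = "no voice activity"
    IGNORED_NOT_WEARER = "ignored: not the wearer"
    IGNORED_NO_KEYWORD = "ignored: no keyword"
    FUNCTION_EXECUTED = "function executed"


def voice_wake_up(
    mic_frame: Sequence[float],
    vad_triggered: Callable[[Sequence[float]], bool],
    read_gravity_sensor: Callable[[], Sequence[float]],
    wearer_is_speaking: Callable[[Sequence[float]], bool],
    recognize_keyword: Callable[[Sequence[float]], Optional[str]],
    execute_function: Callable[[str], None],
) -> Outcome:
    # S201: the sensor is read only after the first preset condition is met.
    if not vad_triggered(mic_frame):
        return Outcome.NO_VOICE_ACTIVITY
    data_signal = read_gravity_sensor()

    # S202 -> S203: not the wearer, so keyword recognition never starts.
    if not wearer_is_speaking(data_signal):
        return Outcome.IGNORED_NOT_WEARER

    # S204: keyword recognition runs only for the wearer's own speech.
    keyword = recognize_keyword(mic_frame)
    if keyword is None:
        return Outcome.IGNORED_NO_KEYWORD
    execute_function(keyword)
    return Outcome.FUNCTION_EXECUTED
```

The early returns mirror the power-saving argument of this embodiment: each more expensive stage is reached only when the cheaper stage before it has passed.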

Example two

Fig. 3 is a flowchart illustrating a specific implementation of the voice wake-up method according to a second embodiment of the present invention, where the method is applied to the electronic device 100 shown in fig. 1, and an execution subject of the method is the processor 103 in the electronic device 100 shown in fig. 1. Referring to fig. 3, the voice wake-up method provided in this embodiment may include the following steps:

Step S301, determining whether the voice activity detection process monitors that the voice signal acquired by the microphone 102 meets a first preset condition, and if the voice signal meets the first preset condition, entering step S302-1 and step S302-2 at the same time. The specific implementation manner of this step is the same as that of the first embodiment, and is not described herein again.

Step S302-1, judging whether to start the keyword recognition process in advance according to the word loss degree allowed by the keyword recognition process and the starting speed of the microphone 102; if the word loss degree allowed by the keyword recognition process and the starting speed of the microphone 102 meet a second preset condition, the process goes to step S303-1.

Step S303-1, starting the keyword recognition process, and recognizing whether the voice signal contains preset voice instruction keywords.

The first preset condition is that the voice energy of the voice signal is larger than a second preset energy threshold. The voice activity detection process and the microphone 102 remain on while the electronic device 100 is in a standby state. When the microphone 102 collects a voice signal, the voice signal is transmitted to the voice activity detection process, which detects the voice energy of the voice signal; when the voice energy is greater than the second preset energy threshold, the processor 103 is triggered to read the data signal collected by the gravity sensor 101.

The second preset energy threshold is an energy threshold preset to prevent the processor from being repeatedly triggered, in a noisy environment, to run the procedure of judging whether the voice signal is generated by the speech of the user wearing the electronic device. Because the processor 103 is triggered to read the data signal collected by the gravity sensor 101 only when the detected voice energy of the voice signal is greater than the second preset energy threshold, such repeated triggering of the processor 103 in a noisy environment is avoided, which further saves power consumption of the terminal.

The second preset condition is that the word loss degree allowed by the keyword recognition process is smaller than a preset word loss degree threshold, and the starting speed of the microphone 102 is smaller than a preset starting speed threshold.

In this embodiment, when the word loss degree allowed by the keyword recognition process is smaller than the preset word loss degree threshold and the starting speed of the microphone 102 is smaller than the preset starting speed threshold, the method proceeds to step S303-1 and starts the keyword recognition process in advance. If the keyword recognition process were not started in advance in this case, the slow start of the microphone 102 could prevent the voice signal issued by the user from being captured in time, so that too many words would be lost and a complete voice control instruction could not be recognized. Conversely, if the word loss degree allowed by the keyword recognition process is greater than or equal to the preset word loss degree threshold, or the starting speed of the microphone 102 is greater than or equal to the preset starting speed threshold, a voice control instruction is unlikely to be lost, so the keyword recognition process is not started in advance; the wake-up procedure in this case is the same as the voice wake-up procedure provided in the first embodiment and is not described here again.
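The second preset condition reduces to two strict comparisons, sketched below. The text gives no units or scales for the word loss degree or the starting speed, so the numeric values here are placeholders, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EarlyStartInputs:
    allowed_word_loss: float        # word loss degree tolerated by keyword recognition
    mic_start_speed: float          # starting speed of the microphone
    word_loss_threshold: float      # preset word loss degree threshold
    start_speed_threshold: float    # preset starting speed threshold


def should_start_keyword_recognition_early(inputs: EarlyStartInputs) -> bool:
    """Second preset condition: both values strictly below their thresholds."""
    return (
        inputs.allowed_word_loss < inputs.word_loss_threshold
        and inputs.mic_start_speed < inputs.start_speed_threshold
    )


# Placeholder numbers: low tolerance for lost words and a slow microphone.
strict_and_slow = EarlyStartInputs(0.1, 2.0, word_loss_threshold=0.5, start_speed_threshold=5.0)
# Equal to a threshold counts as "not smaller", so no early start.
at_threshold = EarlyStartInputs(0.5, 2.0, word_loss_threshold=0.5, start_speed_threshold=5.0)
```

When the function returns False, the flow falls back to the sequential procedure of the first embodiment.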

Step S302-2, triggering the processor 103 to read the data signal acquired by the gravity sensor 101;

step S303-2, determining whether the voice signal is generated by a user wearing the electronic device 100 speaking according to the data signal. It should be noted that, since the implementation manners of step S302-2 and step S303-2 are the same as the implementation manners of the corresponding steps in the first embodiment, no further description is provided herein.

Step S304, if the voice signal includes a preset voice command keyword and the voice signal is generated by the user speaking, controlling the electronic device 100 to execute a corresponding function.

Step S305, if the voice signal does not include a preset voice command keyword or the voice signal is not generated by the user speaking, ignoring the voice signal.

In this embodiment, the electronic device 100 is controlled to execute the corresponding voice control function only when the voice signal satisfies both conditions, namely that it includes the preset voice instruction keyword and that it is generated by the speech of the user wearing the electronic device 100. When the voice signal fails to satisfy either of the two conditions, it is ignored, so that the electronic device 100 can be prevented from being woken up by mistake.
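The synchronous branch of this embodiment (steps S303-1 and S303-2 running at the same time, then combined in S304 and S305) can be sketched with two worker threads. Threads are only a stand-in for whatever concurrency the device provides, and the function and parameter names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional, Sequence


def parallel_wake_up(
    mic_frame: Sequence[float],
    data_signal: Sequence[float],
    recognize_keyword: Callable[[Sequence[float]], Optional[str]],
    wearer_is_speaking: Callable[[Sequence[float]], bool],
    execute_function: Callable[[str], None],
) -> bool:
    """Run both checks concurrently; act only if both succeed."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        keyword_future = pool.submit(recognize_keyword, mic_frame)       # S303-1
        wearer_future = pool.submit(wearer_is_speaking, data_signal)     # S303-2
        keyword = keyword_future.result()
        is_wearer = wearer_future.result()

    if keyword is not None and is_wearer:   # S304
        execute_function(keyword)
        return True
    return False                            # S305: ignore the voice signal
```

Unlike the sequential flow, keyword recognition here always runs once voice activity is detected, which trades some power for not missing the start of the instruction.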

As can be seen from the above, the voice wake-up method provided in this embodiment can also avoid false wake-up of the electronic device 100. Because the data signal collected by the gravity sensor 101 of the electronic device 100 is used to determine whether the voice signal is generated by the speech of the user wearing the electronic device 100, no dedicated bone conduction microphone or other contact microphone is required, so the cost is low, and the algorithm is simple and practical, has high accuracy and consumes few resources. In addition, compared with the previous embodiment, the voice wake-up method provided in this embodiment starts the keyword recognition process in advance when the word loss degree allowed by the keyword recognition process is smaller than the preset word loss degree threshold and the starting speed of the microphone 102 is smaller than the preset starting speed threshold. This avoids the situation in which, because the microphone 102 starts too slowly and the keyword recognition process is not started in advance, too many words are lost and a complete voice control instruction cannot be recognized.

Example three

Fig. 4 is a schematic structural diagram of a voice wake-up system according to a third embodiment of the present invention, where the system is applied to the electronic device 100 shown in fig. 1 and runs in the processor 103 of the electronic device 100 shown in fig. 1. Only the portions related to the present embodiment are shown for convenience of explanation.

Referring to fig. 4, the voice wake-up system 4 provided in this embodiment includes:

the voice activity detection unit 41 is configured to trigger the processor 103 to read a data signal acquired by the gravity sensor 101 if the voice activity detection process monitors that the voice signal acquired by the microphone 102 meets a first preset condition; the first preset condition is that the voice energy of the voice signal is larger than a second preset energy threshold.

A first judging unit 42, configured to judge whether the voice signal is generated by a user wearing the electronic device 100 speaking according to the data signal;

an execution unit 43, configured to ignore the voice signal if the voice signal is not generated by the user speaking; or, configured to start a keyword recognition process if the voice signal is generated by the user speaking, and to control the electronic device 100 to execute a corresponding function if the voice signal is recognized to include a preset voice instruction keyword.

Optionally, the first judging unit 42 is specifically configured to:

if the signal energy is less than or equal to the first preset energy threshold, it indicates that the voice signal is not generated by the speech of the user wearing the electronic device 100;

the first preset energy threshold is an energy threshold which is obtained through a large amount of training learning in advance and is used for distinguishing whether a voice signal is generated by speaking of a user wearing the electronic equipment. When the user wearing the electronic device speaks, the vibration frequency and amplitude of the maxilla and the mandible of the head of the user are large, and the signal energy of the data signal acquired by the gravity sensor 101 is large, so that when the signal energy of the data signal is larger than the first preset energy threshold, the voice signal is generated by the user wearing the electronic device speaking; on the contrary, if the signal energy is less than or equal to the first preset energy threshold, it indicates that the vibration frequency and amplitude of the maxilla and the mandible of the head of the user are both small, and thus indicates that the voice signal is not generated by the speech of the user wearing the electronic device.

Alternatively, the first judging unit 42 is specifically configured to:

Optionally, the voice wake-up system further includes:

a second judging unit 44, configured to judge whether to start the keyword recognition process in advance according to the word loss degree allowed by the keyword recognition process and the starting speed of the microphone 102;

the execution unit 43 is further configured to:

if the word loss degree allowed by the keyword recognition process and the starting speed of the microphone 102 meet a second preset condition, start the keyword recognition process in advance, in which case the keyword recognition process runs concurrently with the process of detecting whether the voice signal is generated by the user wearing the electronic device 100 speaking;

if the voice signal contains a preset voice instruction keyword and the voice signal is generated by the user speaking, controlling the electronic device 100 to execute a corresponding function; or,

Optionally, the second preset condition is that the word loss degree allowed by the keyword recognition process is smaller than a preset word loss degree threshold, and the starting speed of the microphone 102 is smaller than a preset starting speed threshold.
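The second preset condition is a conjunction of two strict comparisons, which may be sketched as below. The function name and parameter names are hypothetical, and the description does not give units or value ranges for the word loss degree or the starting speed, so the values are treated here as plain numbers.

```python
def should_start_keyword_recognition_early(
    allowed_word_loss: float,
    mic_start_speed: float,
    word_loss_threshold: float,
    start_speed_threshold: float,
) -> bool:
    """Evaluate the second preset condition.

    Both comparisons are strict ("smaller than"), and both must hold for the
    keyword recognition process to be started in advance.
    """
    return (
        allowed_word_loss < word_loss_threshold
        and mic_start_speed < start_speed_threshold
    )
```

When this predicate holds, keyword recognition would be launched alongside the wearer-speech judgment rather than after it.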

Optionally, the first preset condition is that the voice energy of the voice signal is greater than a second preset energy threshold. The second preset energy threshold is preset so that, in a noisy environment, the processor is not repeatedly triggered to judge whether the voice signal is generated by the user wearing the electronic device speaking. Because the processor 103 is triggered to read the data signal acquired by the gravity sensor 101 only when the voice energy of the detected voice signal is greater than the second preset energy threshold, repeated triggering of this judgment in a noisy environment is avoided, which further reduces the power consumption of the electronic device 100.
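The gating effect of the second preset energy threshold may be illustrated by counting how many frames of a microphone stream would trigger the processor. This is a sketch under stated assumptions: the description does not define how the voice energy is computed, so the mean squared amplitude of a frame is assumed here, and the names `voice_energy` and `count_processor_triggers` are hypothetical.

```python
from typing import Iterable, Sequence


def voice_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one microphone frame (assumed energy measure)."""
    if not frame:
        return 0.0
    return sum(s * s for s in frame) / len(frame)


def count_processor_triggers(
    frames: Iterable[Sequence[float]],
    second_energy_threshold: float,
) -> int:
    """Count frames whose energy is strictly greater than the threshold.

    Each counted frame corresponds to one occasion on which the processor
    would be triggered to read the gravity sensor; all other frames are
    dropped without waking the sensor path.
    """
    return sum(1 for f in frames if voice_energy(f) > second_energy_threshold)
```

For example, with a stream of three frames in which only one exceeds the threshold, the processor would be triggered once instead of three times, which is the power saving the paragraph above describes.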

It should be noted that, since each unit of the above-mentioned system provided in the embodiment of the present invention is based on the same concept as that of the embodiment of the method of the present invention, the technical effect thereof is the same as that of the embodiment of the method of the present invention, and specific contents thereof may be referred to the description in the embodiment of the method of the present invention, and are not described herein again.

It will be understood by those of ordinary skill in the art that all or some of the steps of the disclosed methods of the present embodiments may be implemented as software, firmware, hardware, or any suitable combination thereof.

Example four

Fig. 5 is a schematic structural diagram of an electronic device 100 according to a fourth embodiment of the present invention. Only the portions related to the present embodiment are shown for convenience of explanation.

Referring to fig. 5, the electronic device 100 provided in this embodiment includes a gravity sensor 101, a microphone 102, a memory 104, a processor 103, and a computer program 105 stored in the memory 104 and executable on the processor 103, wherein the gravity sensor 101, the microphone 102, and the memory 104 are all electrically connected to the processor 103, and the processor 103 implements the steps of the voice wake-up method according to the first embodiment or the second embodiment when executing the computer program 105. The electronic device 100 includes, but is not limited to, a smart wearable device such as a headset.

The electronic device 100 of this embodiment and the voice wake-up method of the first or second embodiment belong to the same concept, and specific implementation processes thereof are described in detail in the method embodiments, and technical features in the method embodiments are correspondingly applicable in the device embodiments, which are not described herein again.

Example five

The fifth embodiment of the present invention provides a computer-readable storage medium in which a computer program is stored; when the computer program is executed by a processor, the steps of the voice wake-up method according to the first embodiment or the second embodiment are implemented.

The computer-readable storage medium of this embodiment and the voice wake-up method of the first or second embodiment belong to the same concept. The specific implementation processes are detailed in the method embodiments, and the technical features in the method embodiments are correspondingly applicable to the storage medium embodiment, and are not described herein again.

In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor 103, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as integrated circuits, such as application specific integrated circuits. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as known to those skilled in the art.

The preferred embodiments of the present invention have been described above with reference to the accompanying drawings, and are not to be construed as limiting the scope of the invention. Any modifications, equivalents and improvements which may occur to those skilled in the art without departing from the scope and spirit of the present invention are intended to be within the scope of the claims.

Claims

1. A voice wake-up method applied to an electronic device, wherein the electronic device comprises a processor, a gravity sensor and a microphone, the gravity sensor and the microphone are each electrically connected to the processor, and the voice wake-up method comprises the following steps:

2. The voice wake-up method according to claim 1, wherein the determining from the data signal whether the voice signal is generated by a user wearing the electronic device speaking comprises:

3. The voice wake-up method according to claim 1, wherein the determining from the data signal whether the voice signal is generated by a user wearing the electronic device speaking comprises:

4. The voice wake-up method according to claim 1, wherein the step of, if the voice activity detection process running on the processor detects that the voice signal collected by the microphone meets the first predetermined condition, further comprises:

5. The voice wake-up method according to claim 4, wherein the second predetermined condition is that the word loss degree allowed by the keyword recognition process is less than a predetermined word loss degree threshold and the microphone activation speed is less than a predetermined activation speed threshold.

6. The voice wake-up method according to claim 1, wherein the first predetermined condition is that the voice energy of the voice signal is greater than a second predetermined energy threshold.

7. A voice wake-up system applied to an electronic device, wherein the electronic device comprises a processor, a gravity sensor and a microphone, the gravity sensor and the microphone are each electrically connected to the processor, characterized in that the voice wake-up system comprises:

8. The voice wake-up system according to claim 7, wherein the first determining unit is specifically configured to:

or, the first determining unit is specifically configured to:

9. An electronic device comprising a gravity sensor, a microphone, a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the gravity sensor, the microphone and the memory are all electrically connected to the processor, and wherein the processor implements the steps of the voice wake-up method according to any of claims 1 to 6 when executing the computer program.

10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the voice wake-up method according to any one of claims 1 to 6.