CN105261368A - Voice wake-up method and apparatus - Google Patents

Publication number
CN105261368A
Authority
CN
China
Prior art keywords
moment
threshold
threshold value
noise energy
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510549435.6A
Other languages
Chinese (zh)
Other versions
CN105261368B (en)
Inventor
马涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Gaohang Intellectual Property Operation Co ltd
Nanjing Advanced Biomaterials And Process Equipment Research Institute Co ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201510549435.6A
Publication of CN105261368A
Application granted
Publication of CN105261368B
Legal status: Expired - Fee Related
Anticipated expiration

Abstract

Embodiments of the invention provide a voice wake-up method and apparatus. The method comprises: periodically sampling an audio signal, where sampling at time t_i yields a sampled signal; calculating the audio energy of the sampled signal; when the audio energy is greater than or equal to the first threshold at time t_i, waking up a digital signal processor (DSP) to perform voice activity detection (VAD); and, when the VAD fails, VAD has failed n consecutive times before time t_i, and the difference between a first noise energy and the first threshold at time t_i is greater than a first preset limit, generating a second threshold from the first noise energy and using it as the first threshold at time t_{i+1}. The first noise energy is obtained by decimating the sampled signal at a first decimation rate 1/x and applying slow tracking filtering to the decimated sample points. The embodiments reduce the number of VAD runs and thereby reduce the power consumption of the terminal in a noisy environment.

Description

Voice wake-up method and apparatus
Technical field
Embodiments of the present invention relate to voice wake-up technology, and in particular to a voice wake-up method and apparatus.
Background
With the development of technology, terminals commonly provide a voice wake-up function: the user wakes the terminal by voice and then controls it with voice commands.
Current voice wake-up schemes wake the terminal through two cooperating stages: a microphone activity detection (MAD) circuit and a digital signal processor (DSP). If the MAD circuit detects that the energy of the current audio signal is greater than a preset threshold, it wakes the DSP to perform voice activity detection (VAD), which identifies whether the audio signal is the user's voice. If it is, the terminal is woken up; if it is not, the DSP wake-up was an invalid (false) wake-up. Specifically, VAD compares the features of the audio signal with the features of the user's voice to decide whether the signal is the user's voice.
With this scheme, when the terminal moves between environments, for example from a quiet environment to a noisy one, invalid or false wake-ups occur frequently because the preset threshold is fixed, so the terminal consumes more power in a noisy environment.
Summary of the invention
Embodiments of the present invention provide a voice wake-up method and apparatus to reduce the power consumption of a terminal in a noisy environment.
In a first aspect, an embodiment of the present invention provides a voice wake-up method, comprising:
periodically sampling an audio signal, where sampling at time t_i yields a sampled signal y_i, and i is a positive integer;
calculating the audio energy T_i of the sampled signal y_i;
when the audio energy T_i is greater than or equal to the first threshold A_0 at time t_i, performing voice activity detection (VAD);
when the VAD fails, VAD has failed n consecutive times before time t_i, and the difference between a first noise energy S_0 and the first threshold A_0 at time t_i is greater than a first preset limit M_0, generating a second threshold A_1 from the first noise energy S_0 and using the second threshold A_1 as the first threshold A_0 at time t_{i+1}, where the first noise energy S_0 is obtained by decimating the sampled signal y_i at a first decimation rate 1/x and applying slow tracking filtering to the decimated sample points y_s, x is a natural number greater than 1, and n is a positive integer less than i.
With reference to the first aspect, in a first possible implementation of the first aspect, the generating of the second threshold A_1 from the first noise energy S_0 comprises:
using the first noise energy S_0 as the second threshold A_1;
or, using the sum of the first noise energy S_0 and a preset first correction value N_0 as the second threshold A_1;
or, using the product of the first noise energy S_0 and a preset first coefficient a_0 as the second threshold A_1.
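As a minimal sketch of the three alternatives above, the second threshold A_1 could be derived from the first noise energy S_0 as follows. The function name, the rule enumeration, and the example numbers are illustrative assumptions and do not appear in the patent.

```python
from enum import Enum


class ThresholdRule(Enum):
    """Hypothetical labels for the three alternatives listed above."""
    NOISE = 1        # A_1 = S_0
    NOISE_PLUS = 2   # A_1 = S_0 + N_0
    NOISE_TIMES = 3  # A_1 = S_0 * a_0


def generate_threshold(noise_energy, rule, correction=0.0, coefficient=1.0):
    """Return a new threshold derived from a tracked noise energy."""
    if rule is ThresholdRule.NOISE:
        return noise_energy
    if rule is ThresholdRule.NOISE_PLUS:
        return noise_energy + correction
    if rule is ThresholdRule.NOISE_TIMES:
        return noise_energy * coefficient
    raise ValueError(f"unknown rule: {rule!r}")


# Example values are made up for illustration only.
print(generate_threshold(60.0, ThresholdRule.NOISE_PLUS, correction=5.0))
```

Because the third and fourth thresholds are defined with the same three forms (using F_0 with N_1 and a_1, and S_0 with N_2 and a_2), the same helper would cover them as well.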
With reference to the first aspect, in a second possible implementation of the first aspect, after the calculating of the audio energy T_i of the sampled signal y_i, the method further comprises:
when the audio energy T_i is less than the first threshold A_0 at time t_i, and the difference between the first threshold A_0 and a second noise energy F_0 at each instant from t_{i-m} to t_i is greater than a second preset limit M_1, performing VAD, where m is a positive integer less than i;
when the VAD succeeds, generating a third threshold A_2 from the second noise energy F_0 and using the third threshold A_2 as the first threshold A_0 at time t_{i+1}, where the second noise energy F_0 is obtained by decimating the sampled signal y_i at a second decimation rate 1/z and applying fast tracking filtering to the decimated sample points y_f, and z is a natural number greater than x.
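The threshold-lowering path just described can be sketched as below. This is an illustration under assumptions: the function names are hypothetical, the difference is read as A_0 minus F_0, F_0 is assumed to be available per instant, and the simplest rule A_2 = F_0 is used for the new threshold.

```python
def should_probe_with_vad(audio_energy, thresholds, fast_noise, limit_m1):
    """Decide whether to run VAD although T_i is below A_0.

    thresholds -- A_0 at t_{i-m} .. t_i (the last entry is the current one)
    fast_noise -- F_0 at the same instants (assumed to be tracked per instant)
    limit_m1   -- second preset limit M_1
    """
    if not thresholds or len(thresholds) != len(fast_noise):
        raise ValueError("need one F_0 value per A_0 value")
    if audio_energy >= thresholds[-1]:
        return False  # the ordinary T_i >= A_0 path applies instead
    # The threshold must have stayed well above the fast noise estimate
    # at every instant in the window.
    return all(a0 - f0 > limit_m1 for a0, f0 in zip(thresholds, fast_noise))


def threshold_after_probe(current_threshold, fast_noise_now, vad_succeeded):
    """Return A_0 for t_{i+1}: lowered to F_0 only if the probing VAD succeeded."""
    return fast_noise_now if vad_succeeded else current_threshold
```

Requiring the condition to hold over the whole window from t_{i-m} to t_i keeps a single quiet instant from triggering an extra VAD run.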
With reference to the second possible implementation of the first aspect, in a third possible implementation of the first aspect, the generating of the third threshold A_2 from the second noise energy F_0 comprises:
using the second noise energy F_0 as the third threshold A_2;
or, using the sum of the second noise energy F_0 and a preset second correction value N_1 as the third threshold A_2;
or, using the product of the second noise energy F_0 and a preset second coefficient a_1 as the third threshold A_2.
With reference to the second or third possible implementation of the first aspect, in a fourth possible implementation of the first aspect, before the third threshold A_2 is used as the first threshold A_0 at time t_{i+1}, the method further comprises:
recording time t_i as a threshold-reduction instant;
when the interval between time t_i and the previous threshold-reduction instant is greater than a preset value T_time, performing the step of using the third threshold A_2 as the first threshold A_0 at time t_{i+1}; otherwise, not performing that step.
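The interval check above acts as a hold-off on threshold reductions. The sketch below shows one way it might look; the class name is hypothetical, and two details are assumptions because the text leaves them open: the very first reduction is always allowed, and only reductions that are actually applied update the recorded instant.

```python
class ThresholdReductionGate:
    """Allow a threshold reduction only if enough time has passed."""

    def __init__(self, min_interval):
        self.min_interval = min_interval   # preset value T_time
        self.last_reduction_time = None    # previous threshold-reduction instant

    def apply(self, now, current_threshold, candidate_threshold):
        """Return the threshold to use at t_{i+1}."""
        allowed = (
            self.last_reduction_time is None
            or now - self.last_reduction_time > self.min_interval
        )
        if not allowed:
            return current_threshold       # keep A_0 unchanged
        self.last_reduction_time = now     # record t_i as a reduction instant
        return candidate_threshold         # A_2 (or A_3) becomes the new A_0


gate = ThresholdReductionGate(min_interval=5.0)
print(gate.apply(0.0, 70.0, 50.0))  # first reduction is accepted
print(gate.apply(3.0, 50.0, 40.0))  # too soon, threshold is kept
```

The same gate would serve the fourth threshold A_3, which is subject to an identical interval condition.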
With reference to the first aspect, in a fifth possible implementation of the first aspect, after the calculating of the audio energy T_i of the sampled signal y_i, the method further comprises:
when the audio energy T_i is less than the first threshold A_0 at time t_i, and the difference between the first threshold A_0 at time t_i and the first noise energy S_0 is greater than a third preset limit M_2, generating a fourth threshold A_3 from the first noise energy S_0 and using the fourth threshold A_3 as the first threshold A_0 at time t_{i+1}.
With reference to the fifth possible implementation of the first aspect, in a sixth possible implementation of the first aspect, the generating of the fourth threshold A_3 from the first noise energy S_0 comprises:
using the first noise energy S_0 as the fourth threshold A_3;
or, using the sum of the first noise energy S_0 and a preset third correction value N_2 as the fourth threshold A_3;
or, using the product of the first noise energy S_0 and a preset third coefficient a_2 as the fourth threshold A_3.
With reference to the fifth or sixth possible implementation of the first aspect, in a seventh possible implementation of the first aspect, before the fourth threshold A_3 is used as the first threshold A_0 at time t_{i+1}, the method further comprises:
recording time t_i as a threshold-reduction instant;
when the interval between time t_i and the previous threshold-reduction instant is greater than a preset value T_time, performing the step of using the fourth threshold A_3 as the first threshold A_0 at time t_{i+1}; otherwise, not performing that step.
In a second aspect, an embodiment of the present invention provides a voice wake-up apparatus, comprising:
a sample rate converter (SRC), configured to periodically sample an audio signal, where sampling at time t_i yields a sampled signal y_i, and i is a positive integer;
a calculation circuit, configured to calculate the audio energy T_i of the sampled signal y_i;
a threshold decision circuit, configured to determine whether the audio energy T_i is greater than or equal to the first threshold A_0 at time t_i and, when it is, to trigger an interrupt processing circuit to output an interrupt pulse signal to an interrupt control circuit, which enables a digital signal processor (DSP) or a processor to perform voice activity detection (VAD);
a first decimator, whose input is coupled to the output of the SRC, configured to decimate the sampled signal y_i at a first decimation rate 1/x to obtain sample points y_s, where x is a natural number greater than 1;
a slow tracking filter (STF), whose input is coupled to the output of the first decimator, configured to apply slow tracking filtering to the decimated sample points y_s to obtain a first noise energy S_0;
a comparator, whose inputs are coupled to the output of the STF and to the threshold decision circuit, configured to compare whether the difference between the first noise energy S_0 and the first threshold A_0 at time t_i is greater than a first preset limit M_0;
a configurator, configured to, when the VAD fails, VAD has failed n consecutive times before time t_i, and the difference between the first noise energy S_0 and the first threshold A_0 at time t_i is greater than the first preset limit M_0, generate a second threshold A_1 from the first noise energy S_0 and deliver the second threshold A_1 to the threshold decision circuit as the first threshold A_0 at time t_{i+1}, where n is a positive integer less than i.
With reference to the second aspect, in a first possible implementation of the second aspect, the configurator is specifically configured to:
use the first noise energy S_0 as the second threshold A_1;
or, use the sum of the first noise energy S_0 and a preset first correction value N_0 as the second threshold A_1;
or, use the product of the first noise energy S_0 and a preset first coefficient a_0 as the second threshold A_1.
With reference to the second aspect, in a second possible implementation of the second aspect, the apparatus further comprises:
a second decimator, whose input is coupled to the output of the SRC, configured to decimate the sampled signal y_i at a second decimation rate 1/z to obtain sample points y_f, where z is a natural number greater than x;
a fast tracking filter (FTF), whose input is coupled to the output of the second decimator, configured to apply fast tracking filtering to the decimated sample points y_f to obtain a second noise energy F_0;
the comparator is further coupled to the output of the FTF and is further configured to, when the audio energy T_i is less than the first threshold A_0 at time t_i, compare whether the difference between the first threshold at each instant and the second noise energy F_0 is greater than a second preset limit M_1; and, when the difference between the first threshold A_0 and the second noise energy F_0 at each instant from t_{i-m} to t_i is greater than the second preset limit M_1, trigger the interrupt processing circuit to output an interrupt pulse signal to the interrupt control circuit, which enables the DSP or the processor to perform VAD, where m is a positive integer less than i;
the configurator is further configured to, when the VAD succeeds, generate a third threshold A_2 from the second noise energy F_0 and deliver the third threshold A_2 to the threshold decision circuit as the first threshold A_0 at time t_{i+1}.
With reference to the second possible implementation of the second aspect, in a third possible implementation of the second aspect, the configurator is specifically configured to:
use the second noise energy F_0 as the third threshold A_2;
or, use the sum of the second noise energy F_0 and a preset second correction value N_1 as the third threshold A_2;
or, use the product of the second noise energy F_0 and a preset second coefficient a_1 as the third threshold A_2.
With reference to the second or third possible implementation of the second aspect, in a fourth possible implementation of the second aspect, the configurator is further configured to:
record time t_i as a threshold-reduction instant;
when the interval between time t_i and the previous threshold-reduction instant is greater than a preset value T_time, perform the step of using the third threshold A_2 as the first threshold A_0 at time t_{i+1}; otherwise, not perform that step.
With reference to the second aspect, in a fifth possible implementation of the second aspect, the configurator is further configured to:
when the audio energy T_i is less than the first threshold A_0 at time t_i, and the difference between the first threshold A_0 at time t_i and the first noise energy S_0 is greater than a third preset limit M_2, generate a fourth threshold A_3 from the first noise energy S_0 and use the fourth threshold A_3 as the first threshold A_0 at time t_{i+1}.
With reference to the fifth possible implementation of the second aspect, in a sixth possible implementation of the second aspect, the configurator is specifically configured to:
use the first noise energy S_0 as the fourth threshold A_3;
or, use the sum of the first noise energy S_0 and a preset third correction value N_2 as the fourth threshold A_3;
or, use the product of the first noise energy S_0 and a preset third coefficient a_2 as the fourth threshold A_3.
With reference to the fifth or sixth possible implementation of the second aspect, in a seventh possible implementation of the second aspect, the configurator is further configured to:
record time t_i as a threshold-reduction instant;
when the interval between time t_i and the previous threshold-reduction instant is greater than a preset value T_time, perform the step of using the fourth threshold A_3 as the first threshold A_0 at time t_{i+1}; otherwise, not perform that step.
Embodiments of the present invention provide a voice wake-up method and apparatus. The audio energy T_i of the sampled signal y_i obtained at time t_i is calculated, and VAD is performed when T_i is greater than or equal to the first threshold A_0 at time t_i. When the VAD fails, VAD has failed n consecutive times before time t_i, and the difference between the first noise energy S_0 and the first threshold A_0 at time t_i is greater than the first preset limit M_0, the first threshold A_0 is adjusted to obtain the first threshold A_0 for time t_{i+1}: a second threshold A_1 is generated from the first noise energy S_0 and used as the first threshold A_0 at time t_{i+1}. The first noise energy S_0 is obtained by decimating the sampled signal y_i at a first decimation rate 1/x and applying slow tracking filtering to the decimated sample points y_s. In other words, the first threshold A_0 at time t_{i+1} is derived from the first noise energy S_0 at time t_i, so the terminal can adjust the first threshold A_0 for the next instant according to the current ambient noise. The first threshold A_0 at each instant therefore matches the environment, which reduces the number of VAD runs and lowers the power consumption of the terminal in a noisy environment.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of Embodiment 1 of the voice wake-up method of the present invention;
Fig. 2 is an example diagram of the first threshold of the voice wake-up method of the present invention in different environments;
Fig. 3 is a flowchart of Embodiment 2 of the voice wake-up method of the present invention;
Fig. 4 is a flowchart of Embodiment 3 of the voice wake-up method of the present invention;
Fig. 5 is a schematic structural diagram of Embodiment 1 of the voice wake-up apparatus of the present invention;
Fig. 6 is a schematic structural diagram of Embodiment 2 of the voice wake-up apparatus of the present invention;
Fig. 7 is a schematic structural diagram of Embodiment 3 of the voice wake-up apparatus of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions are described below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the present invention.
Voice wake-up means that, in any state, the terminal can be activated by a predefined wake word and made to run a specific application. It is analogous to the user pressing a key to light the screen and activate the phone. The advantage of voice wake-up is that it frees the user's hands.
In one smartphone voice wake-up scheme, the standby power consumption is about 2.2 mA × 3.8 V in a quiet environment and about 5.5 mA × 3.8 V in a noisy environment. The difference between the two is therefore about 12.5 mW: (5.5 − 2.2) mA × 3.8 V ≈ 12.5 mW.
According to the power estimation model, average power = quiet power × 70% + noisy power × 30%. The power consumption in noisy environments therefore deserves attention, and the embodiments of the present invention focus on optimizing it.
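The estimation model can be evaluated with the standby figures quoted above. The snippet below is only a worked calculation; the function names are hypothetical, and the 70%/30% split is the one stated in the model.

```python
SUPPLY_VOLTAGE_V = 3.8  # supply voltage from the smartphone example above


def standby_power_mw(current_ma, voltage_v=SUPPLY_VOLTAGE_V):
    """Power in milliwatts from current in milliamperes and voltage in volts."""
    return current_ma * voltage_v


def average_power_mw(quiet_mw, noisy_mw, quiet_share=0.7):
    """Average power = quiet power * 70% + noisy power * 30%."""
    return quiet_mw * quiet_share + noisy_mw * (1.0 - quiet_share)


quiet = standby_power_mw(2.2)    # about 8.36 mW
noisy = standby_power_mw(5.5)    # about 20.9 mW
average = average_power_mw(quiet, noisy)
noisy_part = noisy * 0.3 / average
print(f"quiet={quiet:.2f} mW, noisy={noisy:.2f} mW, average={average:.2f} mW")
print(f"share of the average due to noisy environments: {noisy_part:.0%}")
```

With these figures the average is about 12.1 mW, and the noisy 30% of the time contributes roughly half of it, which is consistent with the focus on noisy environments.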
Embodiments of the present invention provide a method and apparatus for waking a digital signal processor in a terminal by voice, so as to reduce the number of times the DSP in the terminal is woken up to perform VAD and thereby reduce the power consumption of the terminal in a noisy environment.
Fig. 1 is a flowchart of Embodiment 1 of the voice wake-up method of the present invention. The method may be performed by a voice wake-up apparatus, which may be implemented in hardware. The apparatus may be integrated in a terminal such as a tablet computer, a smartphone, or a personal digital assistant (PDA). As shown in Fig. 1, the voice wake-up method comprises:
S101: Periodically sample an audio signal, where sampling at time t_i yields a sampled signal y_i, and i is a positive integer.
Similarly, the sampled signal at time t_{i-1} may be denoted y_{i-1}, the sampled signal at time t_{i+1} may be denoted y_{i+1}, and so on.
In any embodiment of the present invention, the audio signal may be a signal collected by a sound collection device such as a microphone. A sample rate converter (SRC) periodically samples the audio signal collected by the sound collection device. Alternatively, the collected audio signal is first processed by a filter, such as a band-pass filter, and then periodically sampled by the SRC; the embodiments of the present invention are not limited in this respect.
S102: Calculate the audio energy T_i of the sampled signal y_i.
Note that the audio energy of a sampled signal may be calculated as soon as the sampled signal is obtained. For example, after the sampled signal y_{i-1} is obtained at time t_{i-1}, the corresponding audio energy T_{i-1} may also be calculated.
Those skilled in the art will understand that, because the sampled signal y_i is determined, its audio energy T_i can be obtained by calculation.
Specifically, let x(j) denote the amplitude of the j-th sample point of the sampled signal y_i, so that x(j) × x(j) is the energy of the signal at the j-th sample point; j is an integer from 0 to M−1, M is the total number of sample points, the coefficient a_j is the weight of each sample point, and T_i is the audio energy of y_i. For example, the following formula is a normalized calculation in which a_j expresses the percentage that each sample point contributes to the overall energy:
$$T_i = \sum_{j=0}^{M-1} a_j \, x(j) \, x(j), \quad \text{where} \quad \sum_{j=0}^{M-1} a_j = 1$$
This is only one example of calculating the audio energy T_i of the sampled signal y_i and is not limiting. T_i may also be obtained by root mean square (RMS) or a similar method, for example without normalization.
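A minimal sketch of both calculations follows. The function names are illustrative, and defaulting to uniform weights a_j = 1/M is an assumption made here for convenience, not a choice stated in the patent.

```python
from math import sqrt


def weighted_energy(samples, weights=None):
    """T_i = sum_j a_j * x(j) * x(j), with the weights a_j summing to 1."""
    if not samples:
        raise ValueError("samples must not be empty")
    if weights is None:
        # Assumed default: every sample point contributes equally.
        weights = [1.0 / len(samples)] * len(samples)
    if len(weights) != len(samples):
        raise ValueError("need one weight per sample point")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(a * x * x for a, x in zip(weights, samples))


def rms(samples):
    """Root mean square of the sample amplitudes, the alternative measure."""
    if not samples:
        raise ValueError("samples must not be empty")
    return sqrt(sum(x * x for x in samples) / len(samples))


frame = [1.0, -1.0, 1.0, -1.0]  # made-up amplitudes
print(weighted_energy(frame), rms(frame))
```

With uniform weights the weighted energy is the mean square of the frame, so the RMS value is simply its square root.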
S103: When the audio energy T_i is greater than or equal to the first threshold A_0 at time t_i, perform VAD.
The VAD may be performed by an element of the terminal such as a DSP or a processor.
S104: When the VAD fails, VAD has failed n consecutive times before time t_i, and the difference between the first noise energy S_0 and the first threshold A_0 at time t_i is greater than the first preset limit M_0, generate a second threshold A_1 from the first noise energy S_0 and use the second threshold A_1 as the first threshold A_0 at time t_{i+1}, where the first noise energy S_0 is obtained by decimating the sampled signal y_i at a first decimation rate 1/x and applying slow tracking filtering to the decimated sample points y_s, x is a natural number greater than 1, and n is a positive integer less than i.
Note that "the VAD fails and VAD has failed n consecutive times before time t_i" means that the VAD performed at time t_i fails and that every VAD performed from time t_{i-n} to time t_{i-1} also failed. For example, if n is 2, the VAD at t_i fails and the VADs at the two preceding instants, t_{i-2} and t_{i-1}, both failed. To illustrate a VAD failure: suppose the current sound is a car engine. Because the audio energy of this sound is greater than the first threshold A_0 at the current instant, VAD must be performed, but VAD determines that the sound is not the user's voice, so the VAD fails. In other words, if the terminal is in a high-noise environment, the energy of the ambient noise is correspondingly high; once it exceeds the first threshold A_0 at the current instant, VAD is started, but because ambient noise is unstructured, VAD finds no valid voice signal in it and fails. The first noise energy S_0 represents the energy level of the steady-state noise of the terminal's environment. The first preset limit M_0 is a preset parameter that can be determined by tuning.
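The bookkeeping behind S104 can be sketched as follows. This is an illustration, not the patented circuit: the names are hypothetical, the difference is read as S_0 minus A_0, the simplest rule A_1 = S_0 is used, and the failure counter is left running after a raise because the text does not say whether it resets.

```python
from dataclasses import dataclass


@dataclass
class WakeState:
    threshold: float               # first threshold A_0
    consecutive_failures: int = 0  # VAD failures in a row before the current instant


def update_after_vad(state, vad_succeeded, slow_noise, n, limit_m0):
    """Update A_0 for t_{i+1} from the VAD result at t_i."""
    if vad_succeeded:
        state.consecutive_failures = 0
        return state
    # The VAD at t_i failed; check the n failures before t_i and the noise margin.
    failed_n_times_before = state.consecutive_failures >= n
    noise_above_threshold = slow_noise - state.threshold > limit_m0
    state.consecutive_failures += 1
    if failed_n_times_before and noise_above_threshold:
        state.threshold = slow_noise  # A_1 = S_0 becomes the next A_0
    return state


state = WakeState(threshold=50.0)
for _ in range(3):  # three failed VAD runs in 70-unit steady noise, n = 2
    update_after_vad(state, False, slow_noise=70.0, n=2, limit_m0=5.0)
print(state.threshold)
```

With n = 2, the first two failures only increment the counter; the third failure is the one that raises the threshold.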
Note also that, in any embodiment of the present invention, "first" and "second" merely distinguish items that share a name. For example, the "first" in "first threshold" and the "second" in "second threshold" are only naming conventions for telling the thresholds apart and do not imply any order between them.
In practice, noise levels differ between application scenarios. For example, in a quiet environment the noise is about 30 to 35 decibels (dB). In noisy environments, typical figures are: shopping mall about 60 dB, road about 70 dB, aircraft cabin about 70 dB, bus about 80 dB, and subway about 90 dB. In addition, the noise level at the same place varies with time; for example, daytime and nighttime noise at the same place may differ by 10 to 15 dB.
Moreover, when a user makes a call or talks in a noisy environment, the user subconsciously speaks louder, which improves the signal-to-noise ratio (SNR) and makes voice wake-up feasible.
Therefore, a voice wake-up scheme that uses a single uniform noise gate, that is, a preset threshold, cannot treat quiet and noisy environments differently when waking the terminal. If the preset threshold is set too high, voice is missed; if it is set too low, the processor is woken frequently and power consumption rises.
In the embodiments of the present invention, the first threshold A_0 at each instant is adjusted in a timely manner.
Specifically, S101 to S103 obtain the audio energy T_i of the sampled signal y_i sampled at time t_i and compare it with the first threshold A_0 at time t_i. When T_i is greater than or equal to A_0, VAD is performed, so that the DSP or processor performs VAD and decides, according to the VAD result, whether to wake the terminal. If the VAD succeeds, that is, the element performing VAD (such as the DSP or processor) detects the user's voice in the sampled signal y_i, the terminal is woken up. Otherwise the VAD fails, that is, the element performing VAD does not detect the user's voice in y_i, and the terminal is not woken up.
In S104, a difference between the first noise energy S_0 and the first threshold A_0 at time t_i that is greater than the first preset limit M_0 indicates that the terminal may currently be in an environment with high background noise. In that case, a second threshold A_1 is generated from the first noise energy S_0 and used as the first threshold A_0 at time t_{i+1}. The first noise energy S_0 is obtained by decimating the sampled signal y_i at a first decimation rate 1/x and applying slow tracking filtering to the decimated sample points y_s, where x is a natural number greater than 1 and n is a positive integer less than i. In practice, the sampled signal y_i may contain both the user's voice and the ambient noise at time t_i, or only the ambient noise at time t_i. The first threshold A_0 for time t_{i+1} is obtained at time t_i; it is the first threshold that the terminal uses when performing S103 and S104 at time t_{i+1}.
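The text does not specify the structure of the slow tracking filter. Purely as an assumption, the sketch below models it as a first-order recursive (exponential) smoother applied to the energy of every x-th sample point, where a small smoothing factor gives slow tracking and a larger one gives fast tracking. All names and parameter values are hypothetical.

```python
def decimate(samples, factor):
    """Keep every factor-th sample point (decimation rate 1/factor)."""
    if factor <= 1:
        raise ValueError("factor must be a natural number greater than 1")
    return list(samples[::factor])


def track_noise_energy(samples, factor, alpha, initial=0.0):
    """Track the noise energy of the decimated sample points.

    Assumed model: level += alpha * (x*x - level) for each kept sample x.
    A small alpha follows the input slowly; a large alpha follows it quickly.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = initial
    for x in decimate(samples, factor):
        level += alpha * (x * x - level)
    return level


steady = [2.0] * 8  # made-up steady noise with energy 4 per sample point
slow = track_noise_energy(steady, factor=2, alpha=0.125)
fast = track_noise_energy(steady, factor=2, alpha=0.5)
print(slow, fast)   # the slow estimate lags behind the fast one
```

Under this model a slowly tracking estimate is barely disturbed by short bursts of speech, which matches the description of S_0 as the steady-state noise level of the environment.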
If the voice wake-up at time t_i is the first voice wake-up, the first threshold A_0 at time t_i may be a preset value. The preset first threshold A_0 can be regarded as an optimization parameter corresponding to one possible application scenario; for example, a first threshold A_0 preset to 50 dB can be regarded as the background-noise gate for a quiet environment. Fig. 2 illustrates the first threshold in a quiet environment and in a noisy environment. As shown in Fig. 2, in a quiet environment the first threshold exceeds the ambient noise by a first preset value; in a noisy environment the first threshold exceeds the ambient noise by a second preset value. In addition, the first threshold in the noisy environment is higher than the first threshold in the quiet environment.
In addition, S103 may alternatively be: 1) performing VAD when the difference between the audio energy T_i and the audio energy T_(i-1) at instant t_(i-1) is greater than or equal to the differential threshold A_00 at instant t_i; or 2) performing VAD when the audio energy T_i is greater than or equal to the first threshold A_0 at instant t_i and, in addition, the difference between the audio energy T_i and the audio energy T_(i-1) at instant t_(i-1) is greater than or equal to the differential threshold A_00 at instant t_i; or 3) performing VAD when either of the two conditions is met, that is, the audio energy T_i is greater than or equal to the first threshold A_0 at instant t_i, or the difference between the audio energy T_i and the audio energy T_(i-1) at instant t_(i-1) is greater than or equal to the differential threshold A_00 at instant t_i. The audio energy T_(i-1) at instant t_(i-1) is cached in the terminal and was obtained at instant t_(i-1) by calculating the audio energy of the sampled signal y_(i-1).
In case 1), the differential threshold A_00 at instant t_i is adjusted in a manner similar to the adjustment of the first threshold A_0 at instant t_i. In case 2), both the first threshold A_0 and the differential threshold A_00 at instant t_i are adjusted in that manner. In case 3), either the first threshold A_0 or the differential threshold A_00 at instant t_i is adjusted in that manner.
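The trigger condition of S103 and its three alternatives can be sketched as one predicate. This is only an illustrative sketch: the function name `should_run_vad` and the mode labels are hypothetical and do not appear in the specification.

```python
def should_run_vad(mode: str, energy: float, prev_energy: float,
                   first_threshold: float, diff_threshold: float) -> bool:
    """Decide whether VAD is triggered at instant t_i.

    energy is T_i, prev_energy is the cached T_(i-1),
    first_threshold is A_0 and diff_threshold is A_00.
    """
    above = energy >= first_threshold                  # T_i >= A_0
    jumped = (energy - prev_energy) >= diff_threshold  # T_i - T_(i-1) >= A_00
    if mode == "level":    # original S103
        return above
    if mode == "delta":    # alternative 1)
        return jumped
    if mode == "both":     # alternative 2)
        return above and jumped
    if mode == "either":   # alternative 3)
        return above or jumped
    raise ValueError(f"unknown mode: {mode}")
```

For instance, with T_i = 55, T_(i-1) = 40, A_0 = 60 and A_00 = 10, the "delta" and "either" modes trigger VAD while "level" and "both" do not.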
In this embodiment of the present invention, the audio energy T_i of the sampled signal y_i obtained by sampling at instant t_i is calculated, and VAD is performed when the audio energy T_i is greater than or equal to the first threshold A_0 at instant t_i. When VAD fails, detection has failed n consecutive times before instant t_i, and the difference between the first noise energy S_0 and the first threshold A_0 at instant t_i is greater than the preset first threshold value M_0, the first threshold A_0 is adjusted to obtain the first threshold A_0 for instant t_(i+1): a second threshold A_1 is generated according to the first noise energy S_0 and is used as the first threshold A_0 at instant t_(i+1). The first noise energy S_0 is obtained by extracting the sampled signal y_i at the first extraction rate 1/x and applying slow tracking filtering to the extracted sample points ys. In other words, the first threshold A_0 at instant t_(i+1) is derived from the first noise energy S_0 at instant t_i. The terminal can therefore adjust the first threshold A_0 for the next instant according to the current ambient noise, so that the first threshold A_0 at each instant matches the environment. This reduces the number of VAD runs and lowers the power consumption of the terminal in a noisy environment.
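A minimal sketch of one pass through S103 and S104 follows, assuming the simplest generation rule A_1 = S_0. The names `WakeState` and `step_embodiment_one` are hypothetical, and two details are assumptions that the text does not settle: the failure counter restarts after the threshold is raised, and instants at which VAD is not run leave the counter unchanged.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class WakeState:
    first_threshold: float          # A_0 in force at the current instant
    consecutive_failures: int = 0   # VAD failures in a row before this instant


def step_embodiment_one(state: WakeState, audio_energy: float,
                        slow_noise_energy: float,
                        run_vad: Callable[[], bool],
                        n: int, m0: float) -> bool:
    """Process one instant t_i; return True if the terminal should wake."""
    if audio_energy < state.first_threshold:
        return False                      # T_i < A_0: VAD is not run
    if run_vad():
        state.consecutive_failures = 0
        return True                       # VAD succeeded: wake the terminal
    # VAD failed. Raise the threshold only if the n preceding runs also
    # failed and the slow noise estimate S_0 exceeds A_0 by more than M_0.
    if (state.consecutive_failures >= n
            and slow_noise_energy - state.first_threshold > m0):
        state.first_threshold = slow_noise_energy  # A_1 = S_0 is A_0 at t_(i+1)
        state.consecutive_failures = 0             # assumption: restart count
    else:
        state.consecutive_failures += 1
    return False
```

With A_0 = 50, S_0 = 60, M_0 = 5 and n = 2, three failed VAD runs in a row raise the threshold to 60, after which an audio energy of 55 no longer triggers VAD.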
In the above embodiment, generating the second threshold A_1 according to the first noise energy S_0 may comprise: using the first noise energy S_0 as the second threshold A_1; or using the sum of the first noise energy S_0 and a preset first correction N_0 as the second threshold A_1, that is, A_1 = S_0 + N_0; or using the product of the first noise energy S_0 and a preset first coefficient a_0 as the second threshold A_1, that is, A_1 = a_0 × S_0.
A larger first correction N_0 means that the second threshold A_1 rises quickly above the first noise energy S_0; a smaller first correction N_0 means that it rises slowly. The rate of rise can be set according to actual requirements, and the size of the first correction N_0 can be set according to the actual scenario; the embodiment of the present invention does not limit it. Likewise, a larger first coefficient a_0 means that the second threshold A_1 rises quickly above the first noise energy S_0, and a smaller first coefficient a_0 means that it rises slowly. The size of the first coefficient a_0 can also be set according to the actual scenario and is not limited by the embodiment of the present invention.
Optionally, the product of the first noise energy S_0 and the preset first coefficient a_0, plus the preset first correction N_0, may also be used as the second threshold A_1, that is, A_1 = a_0 × S_0 + N_0.
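The generation rules listed above are all special cases of the combined form A_1 = a_0 × S_0 + N_0, which can be sketched as a single helper. The function name `make_threshold` and its default arguments are hypothetical; the defaults reproduce the plain case A_1 = S_0.

```python
def make_threshold(noise_energy: float, coefficient: float = 1.0,
                   correction: float = 0.0) -> float:
    """Return coefficient * noise_energy + correction.

    coefficient = 1, correction = 0  ->  A_1 = S_0
    coefficient = 1                  ->  A_1 = S_0 + N_0
    correction = 0                   ->  A_1 = a_0 * S_0
    """
    return coefficient * noise_energy + correction
```

The same helper would apply unchanged to the later thresholds A_2 and A_3 with their own coefficients and corrections.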
Fig. 3 is a flowchart of embodiment two of the voice wake-up method of the present invention. As shown in Fig. 3, the method may comprise:
S301: Perform periodic sampling on an audio signal, wherein a sampled signal y_i is obtained by sampling at instant t_i, and i is a positive integer.
S302: Calculate the audio energy T_i of the sampled signal y_i.
S303: Perform VAD when the audio energy T_i is less than the first threshold A_0 at instant t_i and, at every instant from t_(i-m) through t_i, the difference between the respective first threshold A_0 and the second noise energy F_0 is greater than a preset second threshold value M_1, wherein m is a positive integer less than i.
For example, if m = 2, VAD is performed when the audio energy T_i is less than the first threshold A_0 at instant t_i, the difference between the first threshold A_0 at instant t_(i-2) and the second noise energy F_0 is greater than the second threshold value M_1, the difference between the first threshold A_0 at instant t_(i-1) and the second noise energy F_0 is greater than the second threshold value M_1, and the difference between the first threshold A_0 at instant t_i and the second noise energy F_0 is greater than the second threshold value M_1.
S304: When VAD succeeds, generate a third threshold A_2 according to the second noise energy F_0, and use the third threshold A_2 as the first threshold A_0 at instant t_(i+1), wherein the second noise energy F_0 is obtained by extracting the sampled signal y_i at a second extraction rate 1/z and applying fast tracking filtering to the extracted sample points yf, and z is a natural number greater than x.
For details of S301 and S302, refer to the embodiment shown in Fig. 1; they are not repeated here.
Regarding S303: when the audio energy T_i is less than the first threshold A_0 at instant t_i, a prior-art voice wake-up scheme does not perform VAD, so the user's voice may be missed. For example, the first threshold A_0 at instant t_i may suit a noisy environment while the terminal is actually in a relatively quiet environment (for example, one with low background noise), so that the user's voice in the sampled signal y_i goes undetected. Through S303 and S304, the embodiment of the present invention changes the first threshold A_0 for instant t_(i+1) so that it matches the current environment.
When, at every instant from t_(i-m) through t_i, the difference between the respective first threshold A_0 and the second noise energy F_0 is greater than the preset second threshold value M_1, that is, this condition has occurred m+1 times in total, the terminal is in a quiet environment (one with low background noise), and the current first threshold A_0 is too large and needs to be lowered to match the quiet environment. The second threshold value M_1 is a preset parameter that can be obtained through tuning.
Regarding S304: a successful VAD indicates that the sampled signal y_i contains the user's voice. To avoid missing the user's voice, a third threshold A_2 is generated according to the second noise energy F_0 and is used as the first threshold A_0 at instant t_(i+1). Because the second noise energy F_0 is obtained by extracting the sampled signal y_i at the second extraction rate 1/z and applying fast tracking filtering to the extracted sample points yf, the second noise energy F_0 reflects, to some extent, the energy level of the transient noise in the terminal's environment.
In this embodiment of the present invention, the audio energy T_i of the sampled signal y_i obtained by sampling at instant t_i is calculated, and VAD is performed when the audio energy T_i is less than the first threshold A_0 at instant t_i and, at every instant from t_(i-m) through t_i, the difference between the respective first threshold A_0 and the second noise energy F_0 is greater than the preset second threshold value M_1. When VAD succeeds, a third threshold A_2 is generated according to the second noise energy F_0 and is used as the first threshold A_0 at instant t_(i+1). The second noise energy F_0 is obtained by extracting the sampled signal y_i at the second extraction rate 1/z and applying fast tracking filtering to the extracted sample points yf. In other words, the first threshold A_0 at instant t_(i+1) is derived from the second noise energy F_0 at instant t_i. The terminal can therefore adjust the first threshold A_0 for the next instant according to the current ambient noise, so that the first threshold A_0 at each instant matches the environment. While the number of VAD runs is reduced and power consumption in a noisy environment is lowered, missed detection of the user's voice in the sampled signal y_i is also avoided.
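The m+1 consecutive checks of S303 and the threshold update of S304 can be sketched with a fixed-length history. This is a sketch under stated assumptions: the names `QuietCheck` and `next_threshold_embodiment_two` are hypothetical, the generation rule A_2 = F_0 + N_1 is one of the options the description allows, and the condition is treated as unmet until m+1 instants have been observed.

```python
from collections import deque
from typing import Callable


class QuietCheck:
    """Tracks the gap A_0 - F_0 over the instants t_(i-m) .. t_i."""

    def __init__(self, m: int, m1: float):
        self.m1 = m1                      # preset second threshold value M_1
        self.gaps = deque(maxlen=m + 1)   # one gap per instant, m + 1 in total

    def update(self, first_threshold: float, fast_noise_energy: float) -> bool:
        """Record the current gap; return True if all m + 1 gaps exceed M_1."""
        self.gaps.append(first_threshold - fast_noise_energy)
        full = len(self.gaps) == self.gaps.maxlen
        return full and all(gap > self.m1 for gap in self.gaps)


def next_threshold_embodiment_two(check: QuietCheck, first_threshold: float,
                                  audio_energy: float, fast_noise_energy: float,
                                  run_vad: Callable[[], bool],
                                  n1: float = 0.0) -> float:
    """Return the first threshold A_0 to use at instant t_(i+1)."""
    too_high = check.update(first_threshold, fast_noise_energy)
    # S303: VAD runs only when T_i < A_0 and the threshold looks too high.
    if audio_energy < first_threshold and too_high and run_vad():
        return fast_noise_energy + n1     # S304: A_2 = F_0 + N_1
    return first_threshold
```

With m = 2, M_1 = 10, A_0 = 60, F_0 = 40 and N_1 = 5, the threshold stays at 60 for the first two instants and drops to 45 at the third, provided VAD succeeds.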
In the above embodiment, generating the third threshold A_2 according to the second noise energy F_0 may specifically comprise: using the second noise energy F_0 as the third threshold A_2; or using the sum of the second noise energy F_0 and a preset second correction N_1 as the third threshold A_2, that is, A_2 = F_0 + N_1; or using the product of the second noise energy F_0 and a preset second coefficient a_1 as the third threshold A_2, that is, A_2 = a_1 × F_0.
A larger second correction N_1 means that the third threshold A_2 rises quickly above the second noise energy F_0; a smaller second correction N_1 means that it rises slowly. The rate of rise can be set according to actual requirements, and the size of the second correction N_1 can be set according to the actual scenario; the embodiment of the present invention does not limit it. Likewise, a larger second coefficient a_1 means that the third threshold A_2 rises quickly above the second noise energy F_0, and a smaller second coefficient a_1 means that it rises slowly. The size of the second coefficient a_1 can also be set according to the actual scenario and is not limited by the embodiment of the present invention.
Optionally, the product of the second noise energy F_0 and the preset second coefficient a_1, plus the preset second correction N_1, may also be used as the third threshold A_2, that is, A_2 = a_1 × F_0 + N_1.
Fig. 4 is a flowchart of embodiment three of the voice wake-up method of the present invention. As shown in Fig. 4, the method may comprise:
S401: Perform periodic sampling on an audio signal, wherein a sampled signal y_i is obtained by sampling at instant t_i, and i is a positive integer.
S402: Calculate the audio energy T_i of the sampled signal y_i.
S403: When the audio energy T_i is less than the first threshold A_0 at instant t_i, and the difference between the first threshold A_0 at instant t_i and the first noise energy S_0 is greater than a preset third threshold value M_2, generate a fourth threshold A_3 according to the first noise energy S_0, and use the fourth threshold A_3 as the first threshold A_0 at instant t_(i+1).
For details of S401 and S402, refer to the embodiment shown in Fig. 1; they are not repeated here.
Regarding S403: when the audio energy T_i is less than the first threshold A_0 at instant t_i, a prior-art voice wake-up scheme does not perform VAD, so the user's voice may be missed. For example, the first threshold A_0 at instant t_i may suit a noisy environment while the terminal is actually in a relatively quiet environment, so that the user's voice in the sampled signal y_i goes undetected. Through S403, the embodiment of the present invention changes the first threshold A_0 for instant t_(i+1) so that it matches the current environment.
When the difference between the first threshold A_0 at instant t_i and the first noise energy S_0 is greater than the preset third threshold value M_2, that is, the first threshold A_0 at instant t_i is considerably larger than the first noise energy S_0, the terminal is in a relatively quiet environment, and the first threshold A_0 at instant t_i is too large and needs to be lowered to match the environment. The third threshold value M_2 is a preset parameter that can be obtained through tuning.
Because the first noise energy S_0 is obtained by extracting the sampled signal y_i at the first extraction rate 1/x and applying slow tracking filtering to the extracted sample points ys, the first noise energy S_0 reflects the steady energy of the environment. S403 therefore does not need to compare, as S303 does, whether the difference between the first threshold A_0 and the noise energy exceeds the threshold value at multiple instants. When the difference between the first threshold A_0 at instant t_i and the first noise energy S_0 is greater than the preset third threshold value M_2, the sampled signal y_i may contain the user's voice; to avoid missing it, a fourth threshold A_3 is generated according to the first noise energy S_0 and is used as the first threshold A_0 at instant t_(i+1).
In this embodiment of the present invention, the audio energy T_i of the sampled signal y_i obtained by sampling at instant t_i is calculated and, when the audio energy T_i is less than the first threshold A_0 at instant t_i and the difference between the first threshold A_0 at instant t_i and the first noise energy S_0 is greater than the preset third threshold value M_2, a fourth threshold A_3 is generated according to the first noise energy S_0 and is used as the first threshold A_0 at instant t_(i+1). The first noise energy S_0 is obtained by extracting the sampled signal y_i at the first extraction rate 1/x and applying slow tracking filtering to the extracted sample points ys. In other words, the first threshold A_0 at instant t_(i+1) is derived from the first noise energy S_0 at instant t_i. The terminal can therefore adjust the first threshold A_0 for the next instant according to the current ambient noise, so that the first threshold A_0 at each instant matches the environment. While the number of VAD runs is reduced and power consumption in a noisy environment is lowered, missed detection of the user's voice in the sampled signal y_i is also avoided.
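S403 needs only a single comparison per instant, which a short sketch makes explicit. The function name `next_threshold_embodiment_three` is hypothetical, and the generation rule A_3 = S_0 + N_2 is chosen here as one of the permitted options.

```python
def next_threshold_embodiment_three(first_threshold: float, audio_energy: float,
                                    slow_noise_energy: float, m2: float,
                                    n2: float = 0.0) -> float:
    """Return the first threshold A_0 to use at instant t_(i+1)."""
    below = audio_energy < first_threshold               # T_i < A_0
    too_high = first_threshold - slow_noise_energy > m2  # A_0 - S_0 > M_2
    if below and too_high:
        return slow_noise_energy + n2                    # A_3 = S_0 + N_2
    return first_threshold
```

With A_0 = 60, T_i = 50, S_0 = 40, M_2 = 10 and N_2 = 3, the threshold for the next instant becomes 43; if S_0 were 55 instead, the gap of 5 would not exceed M_2 and the threshold would stay at 60.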
In the above embodiment, generating the fourth threshold A_3 according to the first noise energy S_0 may comprise: using the first noise energy S_0 as the fourth threshold A_3; or using the sum of the first noise energy S_0 and a preset third correction N_2 as the fourth threshold A_3, that is, A_3 = S_0 + N_2; or using the product of the first noise energy S_0 and a preset third coefficient a_2 as the fourth threshold A_3, that is, A_3 = a_2 × S_0.
A larger third correction N_2 means that the fourth threshold A_3 rises quickly above the first noise energy S_0; a smaller third correction N_2 means that it rises slowly. The rate of rise can be set according to actual requirements, and the size of the third correction N_2 can be set according to the actual scenario; the embodiment of the present invention does not limit it. Likewise, a larger third coefficient a_2 means that the fourth threshold A_3 rises quickly above the first noise energy S_0, and a smaller third coefficient a_2 means that it rises slowly. The size of the third coefficient a_2 can also be set according to the actual scenario and is not limited by the embodiment of the present invention.
Optionally, the product of the first noise energy S_0 and the preset third coefficient a_2, plus the preset third correction N_2, may also be used as the fourth threshold A_3, that is, A_3 = a_2 × S_0 + N_2.
It should be added that the second correction N_1 and the third correction N_2 each reflect, under different conditions, the amount by which the first threshold A_0 is raised relative to the noise energy: the first threshold A_0 exceeds the second noise energy F_0 by the second correction N_1, and the first threshold A_0 exceeds the first noise energy S_0 by the third correction N_2. In addition, because the first noise energy S_0 comes from slow tracking filtering and the second noise energy F_0 comes from fast tracking filtering, the third correction N_2 is optionally greater than the second correction N_1, so that the threshold matches the environment quickly.
Further, the embodiment of the present invention may also record each change of the first threshold. An instant at which the first threshold is raised may be recorded as a threshold-raising instant; an instant at which the first threshold is lowered may be recorded as a threshold-lowering instant.
Specifically, before the third threshold A_2 is used as the first threshold A_0 at instant t_(i+1), the method may further comprise: recording instant t_i as a threshold-lowering instant; when the interval between instant t_i and the previous threshold-lowering instant is greater than a preset value T_time, performing the step of using the third threshold A_2 as the first threshold A_0 at instant t_(i+1); otherwise, not performing that step.
Similarly, before the fourth threshold A_3 is used as the first threshold A_0 at instant t_(i+1), the method may further comprise: recording instant t_i as a threshold-lowering instant; when the interval between instant t_i and the previous threshold-lowering instant is greater than the preset value T_time, performing the step of using the fourth threshold A_3 as the first threshold A_0 at instant t_(i+1); otherwise, not performing that step.
These two implementations prevent the first threshold A_0 from switching back and forth (ping-pong switching) without affecting the reliability of voice detection, and reduce the probability of missed voice detection.
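The interval check that guards each lowering of the threshold can be sketched as a small gate object. The class name `LoweringGate` is hypothetical. Two behaviours are assumptions rather than statements from the description: the first lowering is always permitted because there is no previous threshold-lowering instant, and a rejected request does not update the recorded instant.

```python
from typing import Optional


class LoweringGate:
    """Permits a threshold lowering only if more than T_time has elapsed."""

    def __init__(self, t_time: float):
        self.t_time = t_time                          # preset value T_time
        self.last_lowering: Optional[float] = None    # previous lowering instant

    def allow(self, now: float) -> bool:
        """Return True and record `now` if lowering is permitted at this instant."""
        if self.last_lowering is None or now - self.last_lowering > self.t_time:
            self.last_lowering = now
            return True
        return False
```

With T_time = 10, requests at instants 0, 5, 11 and 21 are permitted, rejected, permitted and rejected respectively; the last is rejected because the interval since instant 11 equals T_time rather than exceeding it.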
The embodiment of the present invention continuously monitors and tracks the ambient background noise, adaptively adjusts the first threshold A_0 according to the level of the background noise, and adjusts the first threshold A_0 gradually, raising or lowering it slowly, thereby reducing the probability of missed voice detection. In addition, the dynamic adjustment of the first threshold A_0 brings the power consumption in quiet and noisy environments close to each other, which improves user experience and product competitiveness.
Fig. 5 is a schematic structural diagram of embodiment one of the voice wake-up apparatus of the present invention. The voice wake-up apparatus may be implemented in hardware and may be integrated in a terminal such as a tablet computer, a smartphone, or a PDA. As shown in Fig. 5, the voice wake-up apparatus 10 comprises: an SRC 11, a computing circuit 12, a threshold decision circuit 13, a first extractor 14, a slow tracking filter (Slow Tracking Filter, STF for short) 15, a comparator 16, a configurator 17, and an interrupt processing circuit 18.
The SRC 11 is configured to perform periodic sampling on an audio signal, wherein a sampled signal y_i is obtained by sampling at instant t_i, and i is a positive integer. The computing circuit 12 is configured to calculate the audio energy T_i of the sampled signal y_i. The threshold decision circuit 13 is configured to determine whether the audio energy T_i is greater than or equal to the first threshold A_0 at instant t_i and, when it is, to trigger the interrupt processing circuit 18 to output an interrupt pulse signal to an interrupt control circuit 20, which enables a DSP or processor 30 to perform VAD. The input of the first extractor 14 is coupled to the output of the SRC 11; the first extractor 14 is configured to extract the sampled signal y_i at the first extraction rate 1/x to obtain and output sample points ys, where x is a natural number greater than 1. The input of the STF 15 is coupled to the output of the first extractor 14; the STF 15 is configured to apply slow tracking filtering to the extracted sample points ys to obtain the first noise energy S_0. The input of the comparator 16 is coupled to the output of the STF 15 and to the threshold decision circuit 13; the comparator 16 is configured to compare whether the difference between the first noise energy S_0 and the first threshold A_0 at instant t_i is greater than the preset first threshold value M_0. The configurator 17 is configured to, when VAD fails, detection has failed n consecutive times before instant t_i, and the difference between the first noise energy S_0 and the first threshold A_0 at instant t_i is greater than the preset first threshold value M_0, generate a second threshold A_1 according to the first noise energy S_0 and deliver the second threshold A_1 to the threshold decision circuit 13 as the first threshold A_0 at instant t_(i+1), where n is a positive integer less than i.
Referring to Fig. 5, the configurator 17 configures parameters for the voice wake-up apparatus 10, such as the first threshold A_0 described above. A person skilled in the art will understand that the configurator 17 receives configuration parameters from the terminal and converts them into the corresponding control signals for each logic module in the voice wake-up apparatus 10, the logic modules including the computing circuit 12, the threshold decision circuit 13, the interrupt processing circuit 18, and so on. The SRC 11 may specifically sample the audio signal by downsampling, for example by converting 32 kilohertz (kHz) data to 16 kHz.
The flow of the sampled signal y_i in Fig. 5 is:
SRC 11 -> computing circuit 12 -> threshold decision circuit 13 -> interrupt processing circuit 18 (optional) -> interrupt control circuit 20 (optional) -> DSP or processor 30 (optional).
When the audio energy T_i is greater than or equal to the first threshold A_0 at instant t_i, the flow of the sampled signal y_i includes the optional parts above; when the audio energy T_i is less than the first threshold A_0 at instant t_i, the flow does not include them.
The first extractor 14, the STF 15, and the comparator 16 do not affect the normal voice wake-up path; together with the configurator 17 they serve only to change the first threshold A_0 used in voice wake-up.
The apparatus of this embodiment may be used to carry out the technical solution of the method embodiment shown in Fig. 1. Its implementation principle and technical effect are similar and are not repeated here.
In the above embodiment, the configurator 17 may be specifically configured to: use the first noise energy S_0 as the second threshold A_1; or use the sum of the first noise energy S_0 and the preset first correction N_0 as the second threshold A_1, that is, A_1 = S_0 + N_0; or use the product of the first noise energy S_0 and the preset first coefficient a_0 as the second threshold A_1, that is, A_1 = a_0 × S_0; and so on. The embodiment of the present invention does not limit this.
Fig. 6 is a schematic structural diagram of embodiment two of the voice wake-up apparatus of the present invention. The voice wake-up apparatus may be implemented in hardware and may be integrated in a terminal such as a tablet computer, a smartphone, or a PDA. As shown in Fig. 6, the voice wake-up apparatus 100 comprises: an SRC 110, a computing circuit 120, a threshold decision circuit 130, a second extractor 140, a fast tracking filter (Fast Tracking Filter, FTF for short) 150, a comparator 160, a configurator 170, and an interrupt processing circuit 180.
The SRC 110 is configured to perform periodic sampling on an audio signal, wherein a sampled signal y_i is obtained by sampling at instant t_i, and i is a positive integer. The computing circuit 120 is configured to calculate the audio energy T_i of the sampled signal y_i. The threshold decision circuit 130 is configured to determine whether the audio energy T_i is greater than or equal to the first threshold A_0 at instant t_i. The input of the second extractor 140 is coupled to the output of the SRC 110; the second extractor 140 is configured to extract the sampled signal y_i at the second extraction rate 1/z to obtain sample points yf, where z is a natural number greater than x. The input of the FTF 150 is coupled to the output of the second extractor 140; the FTF 150 is configured to apply fast tracking filtering to the extracted sample points yf to obtain the second noise energy F_0. The input of the comparator 160 is coupled to the output of the FTF 150; the comparator 160 is configured to, when the audio energy T_i is less than the first threshold A_0 at instant t_i, compare whether the difference between the first threshold at each instant and the second noise energy F_0 is greater than the preset second threshold value M_1 and, when at every instant from t_(i-m) through t_i the difference between the respective first threshold A_0 and the second noise energy F_0 is greater than the preset second threshold value M_1, trigger the interrupt processing circuit 180 to output an interrupt pulse signal to an interrupt control circuit 200, which enables a DSP or processor 300 to perform VAD, where m is a positive integer less than i. The configurator 170 is configured to, when VAD succeeds, generate a third threshold A_2 according to the second noise energy F_0 and deliver the third threshold A_2 to the threshold decision circuit 130 as the first threshold A_0 at instant t_(i+1).
The apparatus of this embodiment may be used to carry out the technical solution of the method embodiment shown in Fig. 3. Its implementation principle and technical effect are similar and are not repeated here.
On the basis of the above embodiment, the configurator 170 may be specifically configured to: use the second noise energy F_0 as the third threshold A_2; or use the sum of the second noise energy F_0 and the preset second correction N_1 as the third threshold A_2, that is, A_2 = F_0 + N_1; or use the product of the second noise energy F_0 and the preset second coefficient a_1 as the third threshold A_2, that is, A_2 = a_1 × F_0; and so on. The embodiment of the present invention does not limit this.
Optionally, the configurator 170 may be further configured to: record instant t_i as a threshold-lowering instant; when the interval between instant t_i and the previous threshold-lowering instant is greater than the preset value T_time, perform the step of using the third threshold A_2 as the first threshold A_0 at instant t_(i+1); otherwise, not perform that step. This prevents ping-pong switching of the first threshold A_0 without affecting the reliability of voice detection, and reduces the probability of missed voice detection.
Referring to Fig. 5, the configurator 17 may be further configured to: when the audio energy T_i is less than the first threshold A_0 at instant t_i, and the difference between the first threshold A_0 at instant t_i and the first noise energy S_0 is greater than the preset third threshold value M_2, generate a fourth threshold A_3 according to the first noise energy S_0 and use the fourth threshold A_3 as the first threshold A_0 at instant t_(i+1).
In this case, the apparatus of this embodiment may be used to carry out the technical solution of the method embodiment shown in Fig. 4. Its implementation principle and technical effect are similar and are not repeated here.
Further, the configurator 17 may be specifically configured to: use the first noise energy S_0 as the fourth threshold A_3; or use the sum of the first noise energy S_0 and the preset third correction N_2 as the fourth threshold A_3, that is, A_3 = S_0 + N_2; or use the product of the first noise energy S_0 and the preset third coefficient a_2 as the fourth threshold A_3, that is, A_3 = a_2 × S_0; and so on. The embodiment of the present invention does not limit this.
Further, the configurator 17 may also be configured to: record instant t_i as a threshold-lowering instant; when the interval between instant t_i and the previous threshold-lowering instant is greater than the preset value T_time, perform the step of using the fourth threshold A_3 as the first threshold A_0 at instant t_(i+1); otherwise, not perform that step. This prevents ping-pong switching of the first threshold A_0 without affecting the reliability of voice detection, and reduces the probability of missed voice detection.
Referring to Fig. 5 and Fig. 6, the first extractor 14 and the second extractor 140 perform long-period and short-period data extraction, respectively. The STF 15 is a slowly converging filter used to track changes in ambient noise stably. The FTF 150 is a fast-converging filter used to track changes in ambient noise quickly. The STF 15 and the FTF 150 track the energy of the current calculation window and use a structure similar to that of the computing circuit 12 or the computing circuit 120. The STF 15 and the FTF 150 differ in filter order and parameters, which are set according to actual tuning results. The FTF 150 performs short-period filtering: recent changes in the data quickly affect the filter output. The STF 15 performs long-period filtering: recent changes in the data affect the filter output only slightly and slowly.
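The contrast between slow and fast tracking can be illustrated with a first-order exponential smoother. This is purely an assumed filter form: the description leaves the filter order and parameters to tuning, and the names `extract` and `track_energy`, the smoothing factors, and the extraction factors used here are all hypothetical.

```python
from typing import List, Sequence


def extract(samples: Sequence[float], factor: int) -> List[float]:
    """Keep every factor-th sample, i.e. extract at rate 1/factor."""
    return list(samples[::factor])


def track_energy(samples: Sequence[float], alpha: float,
                 initial: float = 0.0) -> float:
    """First-order tracker of sample energy (assumed filter form).

    A small alpha converges slowly (STF-like behaviour);
    a large alpha converges quickly (FTF-like behaviour).
    """
    level = initial
    for sample in samples:
        level += alpha * (sample * sample - level)
    return level


# Steady background (amplitude 1) followed by a short loud burst (amplitude 4).
signal = [1.0] * 64 + [4.0] * 8
slow = track_energy(extract(signal, 2), alpha=0.01, initial=1.0)  # S_0-like
fast = track_energy(extract(signal, 4), alpha=0.5, initial=1.0)   # F_0-like
```

In this example the fast estimate moves most of the way toward the burst energy of 16, while the slow estimate stays close to the background energy of 1, which mirrors the short-period and long-period behaviour described above.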
Optionally, combining Fig. 6 with Fig. 5 yields the structure shown in Fig. 7. Fig. 7 is a schematic structural diagram of embodiment three of the voice wake-up apparatus of the present invention. As shown in Fig. 7, the voice wake-up apparatus 1000 comprises: the SRC 11, the computing circuit 12, the threshold decision circuit 13, the first extractor 14, the second extractor 140, the STF 15, the FTF 150, the comparator 16, the configurator 17, and the interrupt processing circuit 18.
Here, the threshold decision circuit 13 also performs the role and functions of the threshold decision circuit 130; the comparator 16 also performs those of the comparator 160; the configurator 17 also performs those of the configurator 170; and the interrupt processing circuit 18 also performs those of the interrupt processing circuit 180. The specific principles are the same as in the above embodiments and are not repeated here.
The embodiment of the present invention continuously monitors and tracks the ambient background noise, adaptively adjusts the first threshold A_0 according to the level of the background noise, and adjusts the first threshold A_0 gradually, raising or lowering it slowly, thereby reducing the probability of missed voice detection. In addition, the dynamic adjustment of the first threshold A_0 brings the power consumption in quiet and noisy environments close to each other, which improves user experience and product competitiveness.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units or modules is only a division by logical function, and other divisions are possible in an actual implementation. For example, multiple units or modules may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection between devices or modules may be electrical, mechanical, or in another form.
The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, persons of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some or all of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (16)

1. A voice wake-up method, characterized by comprising:
performing periodic sampling on an audio signal, wherein a sampled signal y_i is obtained by sampling at a moment t_i, and i is a positive integer;
calculating an audio energy T_i of the sampled signal y_i;
when the audio energy T_i is greater than or equal to a first threshold A_0 at the moment t_i, performing voice activity detection (VAD);
when the VAD fails, detection has failed n consecutive times before the moment t_i, and the difference between a first noise energy S_0 and the first threshold A_0 at the moment t_i is greater than a preset first threshold value M_0, generating a second threshold A_1 according to the first noise energy S_0, and using the second threshold A_1 as the first threshold A_0 at a moment t_{i+1}, wherein the first noise energy S_0 is obtained by extracting the sampled signal y_i at a first extraction rate 1/x and performing slow tracking filtering on the extracted sampling points ys, x is a natural number greater than 1, and n is a positive integer less than i.
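For illustration only (this sketch is not part of the claims), the threshold-raising condition of claim 1 can be written as a small decision function. The function name, the argument names, and the reading of "difference" as S_0 minus A_0 are assumptions made for this example; the `generate` callback stands in for whichever rule produces the second threshold A_1.

```python
def next_first_threshold_on_vad_failure(a0, s0, m0, failures_before, n,
                                        generate=lambda s: s):
    """Return the first threshold to use at t_{i+1} after VAD fails at t_i.

    a0: first threshold A_0 at t_i
    s0: first noise energy S_0 (slow-tracked estimate)
    m0: preset first threshold value M_0
    failures_before: consecutive VAD failures observed before t_i
    n: required number of consecutive failures
    generate: hypothetical rule mapping S_0 to the second threshold A_1
    """
    # Assumption: "difference" is read as S_0 - A_0, i.e. the noise
    # estimate has risen above the current wake-up threshold.
    if failures_before >= n and (s0 - a0) > m0:
        return generate(s0)  # raise: A_1 becomes the new A_0
    return a0                # otherwise keep the current threshold
```

With `a0=10.0`, `s0=20.0`, `m0=5.0`, and three prior failures against `n=3`, the function returns the raised threshold; with fewer failures or a smaller gap it leaves A_0 unchanged.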
2. The method according to claim 1, characterized in that the generating a second threshold A_1 according to the first noise energy S_0 comprises:
using the first noise energy S_0 as the second threshold A_1;
or, using the sum of the first noise energy S_0 and a preset first correction N_0 as the second threshold A_1;
or, using the product of the first noise energy S_0 and a preset first coefficient a_0 as the second threshold A_1.
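For illustration only, the three alternatives of claim 2 can be sketched as one helper. The function name and the `mode` labels ("direct", "offset", "scale") are hypothetical names chosen for this example; the same shape would apply to the third and fourth thresholds, which use the same three options with their own corrections and coefficients.

```python
def generate_threshold(noise_energy, mode="direct", correction=0.0, coefficient=1.0):
    """Derive a threshold from a noise energy using one of three rules.

    mode="direct": threshold = noise energy
    mode="offset": threshold = noise energy + preset correction (e.g. N_0)
    mode="scale":  threshold = noise energy * preset coefficient (e.g. a_0)
    """
    if mode == "direct":
        return noise_energy
    if mode == "offset":
        return noise_energy + correction
    if mode == "scale":
        return noise_energy * coefficient
    raise ValueError(f"unknown mode: {mode!r}")
```

The offset and scale variants leave some headroom above the tracked noise level, which is one plausible reason to prefer them over the direct variant.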
3. The method according to claim 1, characterized in that, after the calculating an audio energy T_i of the sampled signal y_i, the method further comprises:
when the audio energy T_i is less than the first threshold A_0 at the moment t_i, and the differences between the respective first thresholds A_0 from a moment t_{i-m} to the moment t_i and a second noise energy F_0 are all greater than a preset second threshold value M_1, performing VAD, wherein m is a positive integer less than i;
when the VAD succeeds, generating a third threshold A_2 according to the second noise energy F_0, and using the third threshold A_2 as the first threshold A_0 at the moment t_{i+1}, wherein the second noise energy F_0 is obtained by extracting the sampled signal y_i at a second extraction rate 1/z and performing fast tracking filtering on the extracted sampling points yf, and z is a natural number greater than x.
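For illustration only, the threshold-lowering path of claim 3 can be sketched as two steps: a check that decides whether to run VAD although the audio energy is below the threshold, and an update applied once VAD reports success. All names are hypothetical, and pairing each first threshold with a fast noise estimate from the same moment is an assumption about how the history over t_{i-m} to t_i is stored.

```python
def should_probe_with_vad(t_energy, thresholds, fast_noise, m1):
    """Decide whether to run VAD even though the audio energy is below A_0.

    t_energy:   audio energy T_i at the moment t_i
    thresholds: first thresholds A_0 for t_{i-m} .. t_i, oldest first
    fast_noise: second noise energies F_0 for the same moments (assumed pairing)
    m1:         preset second threshold value M_1
    """
    if t_energy >= thresholds[-1]:
        return False  # the normal wake-up path of claim 1 handles this case
    # Every moment in the window must show the threshold well above the noise.
    return all(a - f > m1 for a, f in zip(thresholds, fast_noise))


def next_first_threshold_on_vad_success(a0, f0, vad_succeeded, generate=lambda f: f):
    """Return A_0 for t_{i+1}: the third threshold A_2 if VAD succeeded, else A_0."""
    return generate(f0) if vad_succeeded else a0
```

Requiring the gap to hold over the whole window, rather than at a single moment, keeps one quiet sample from lowering the threshold.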
4. The method according to claim 3, characterized in that the generating a third threshold A_2 according to the second noise energy F_0 comprises:
using the second noise energy F_0 as the third threshold A_2;
or, using the sum of the second noise energy F_0 and a preset second correction N_1 as the third threshold A_2;
or, using the product of the second noise energy F_0 and a preset second coefficient a_1 as the third threshold A_2.
5. The method according to claim 3 or 4, characterized in that, before the using the third threshold A_2 as the first threshold A_0 at the moment t_{i+1}, the method further comprises:
recording the moment t_i as a threshold-lowering moment;
when the interval between the moment t_i and the previous threshold-lowering moment is greater than a preset value T_time, performing the step of using the third threshold A_2 as the first threshold A_0 at the moment t_{i+1}; otherwise, not performing the step of using the third threshold A_2 as the first threshold A_0 at the moment t_{i+1}.
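For illustration only, the interval check of claim 5 can be sketched as a small gate object. The class and method names are hypothetical. Two points are assumptions of this example rather than statements from the claim: the first request is allowed because no previous threshold-lowering moment exists, and the moment t_i is recorded even when the lowering step is then skipped.

```python
class ThresholdLoweringGate:
    """Allow a threshold-lowering step only if enough time has passed."""

    def __init__(self, min_interval):
        self.min_interval = min_interval  # preset value T_time
        self.last_lowering = None         # previous threshold-lowering moment

    def allow(self, t_i):
        """Record t_i as a threshold-lowering moment and report whether
        the lowering step should actually be performed."""
        previous = self.last_lowering
        self.last_lowering = t_i
        # Assumption: with no previous moment recorded, lowering is allowed.
        return previous is None or (t_i - previous) > self.min_interval
```

A caller would apply the third threshold A_2 only when `allow(t_i)` returns `True`, and otherwise keep the current first threshold A_0.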
6. The method according to claim 1, characterized in that, after the calculating an audio energy T_i of the sampled signal y_i, the method further comprises:
when the audio energy T_i is less than the first threshold A_0 at the moment t_i, and the difference between the first threshold A_0 at the moment t_i and the first noise energy S_0 is greater than a preset third threshold value M_2, generating a fourth threshold A_3 according to the first noise energy S_0, and using the fourth threshold A_3 as the first threshold A_0 at the moment t_{i+1}.
7. The method according to claim 6, characterized in that the generating a fourth threshold A_3 according to the first noise energy S_0 comprises:
using the first noise energy S_0 as the fourth threshold A_3;
or, using the sum of the first noise energy S_0 and a preset third correction N_2 as the fourth threshold A_3;
or, using the product of the first noise energy S_0 and a preset third coefficient a_2 as the fourth threshold A_3.
8. The method according to claim 6 or 7, characterized in that, before the using the fourth threshold A_3 as the first threshold A_0 at the moment t_{i+1}, the method further comprises:
recording the moment t_i as a threshold-lowering moment;
when the interval between the moment t_i and the previous threshold-lowering moment is greater than a preset value T_time, performing the step of using the fourth threshold A_3 as the first threshold A_0 at the moment t_{i+1}; otherwise, not performing the step of using the fourth threshold A_3 as the first threshold A_0 at the moment t_{i+1}.
9. A voice wake-up apparatus, characterized by comprising:
a sampling rate converter (SRC), configured to perform periodic sampling on an audio signal, wherein a sampled signal y_i is obtained by sampling at a moment t_i, and i is a positive integer;
a computing circuit, configured to calculate an audio energy T_i of the sampled signal y_i;
a threshold decision circuit, configured to determine whether the audio energy T_i is greater than or equal to a first threshold A_0 at the moment t_i, and, when the audio energy T_i is greater than or equal to the first threshold A_0 at the moment t_i, to trigger an interrupt processing circuit to output an interrupt pulse signal to an interrupt control circuit, so that the interrupt control circuit enables a digital signal processor (DSP) or a processor to perform voice activity detection (VAD);
a first extractor, an input end of which is coupled to an output end of the SRC, configured to extract the sampled signal y_i at a first extraction rate 1/x to obtain sampling points ys, wherein x is a natural number greater than 1;
a slow tracking filter (STF), an input end of which is coupled to an output end of the first extractor, configured to perform slow tracking filtering on the extracted sampling points ys to obtain a first noise energy S_0;
a comparator, an input end of which is coupled to the output ends of the first extractor and the threshold decision circuit, configured to compare whether the difference between the first noise energy S_0 and the first threshold A_0 at the moment t_i is greater than a preset first threshold value M_0;
a configurator, configured to: when the VAD fails, detection has failed n consecutive times before the moment t_i, and the difference between the first noise energy S_0 and the first threshold A_0 at the moment t_i is greater than the preset first threshold value M_0, generate a second threshold A_1 according to the first noise energy S_0, and deliver the second threshold A_1 to the threshold decision circuit as the first threshold A_0 at a moment t_{i+1}, wherein n is a positive integer less than i.
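For illustration only, the extractor and tracking-filter chain described in claim 9 can be modelled in software. The claims do not specify the filter structure, so the first-order exponential smoothing of squared samples used here is an assumption, as are the function names and the default smoothing factor; a small `alpha` gives slow tracking and a larger `alpha` gives fast tracking.

```python
def extract(samples, factor):
    """Keep every `factor`-th sample, i.e. an extraction rate of 1/factor."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return samples[::factor]


def track_energy(points, alpha, initial=0.0):
    """First-order smoothing of instantaneous energy (assumed filter form).

    Each step moves the estimate toward the squared sample by a fraction
    `alpha`, so a small alpha follows changes slowly.
    """
    energy = initial
    for p in points:
        energy += alpha * (p * p - energy)
    return energy


def first_noise_energy(samples, x, alpha=0.01):
    """Model of the first extractor followed by the slow tracking filter."""
    return track_energy(extract(samples, x), alpha)
```

For the same input, `track_energy` with a larger `alpha` moves toward the input energy in fewer samples, which is the behavioural difference this sketch assumes between the slow and fast tracking filters.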
10. The apparatus according to claim 9, characterized in that the configurator is specifically configured to:
use the first noise energy S_0 as the second threshold A_1;
or, use the sum of the first noise energy S_0 and a preset first correction N_0 as the second threshold A_1;
or, use the product of the first noise energy S_0 and a preset first coefficient a_0 as the second threshold A_1.
11. The apparatus according to claim 9, characterized by further comprising:
a second extractor, an input end of which is coupled to the output end of the SRC, configured to extract the sampled signal y_i at a second extraction rate 1/z to obtain sampling points yf, wherein z is a natural number greater than x;
a fast tracking filter (FTF), an input end of which is coupled to an output end of the second extractor, configured to perform fast tracking filtering on the extracted sampling points yf to obtain a second noise energy F_0;
wherein the comparator, an input end of which is coupled to an output end of the FTF, is further configured to: when the audio energy T_i is less than the first threshold A_0 at the moment t_i, compare whether the difference between the first threshold at each moment and the second noise energy F_0 is greater than a preset second threshold value M_1; and, when the differences between the respective first thresholds A_0 from a moment t_{i-m} to the moment t_i and the second noise energy F_0 are all greater than the preset second threshold value M_1, trigger the interrupt processing circuit to output an interrupt pulse signal to the interrupt control circuit, so that the interrupt control circuit enables the DSP or the processor to perform VAD, wherein m is a positive integer less than i;
and the configurator is further configured to: when the VAD succeeds, generate a third threshold A_2 according to the second noise energy F_0, and deliver the third threshold A_2 to the threshold decision circuit as the first threshold A_0 at the moment t_{i+1}.
12. The apparatus according to claim 11, characterized in that the configurator is specifically configured to:
use the second noise energy F_0 as the third threshold A_2;
or, use the sum of the second noise energy F_0 and a preset second correction N_1 as the third threshold A_2;
or, use the product of the second noise energy F_0 and a preset second coefficient a_1 as the third threshold A_2.
13. The apparatus according to claim 11 or 12, characterized in that the configurator is further configured to:
record the moment t_i as a threshold-lowering moment;
when the interval between the moment t_i and the previous threshold-lowering moment is greater than a preset value T_time, perform the step of using the third threshold A_2 as the first threshold A_0 at the moment t_{i+1}; otherwise, not perform the step of using the third threshold A_2 as the first threshold A_0 at the moment t_{i+1}.
14. The apparatus according to claim 9, characterized in that the configurator is further configured to:
when the audio energy T_i is less than the first threshold A_0 at the moment t_i, and the difference between the first threshold A_0 at the moment t_i and the first noise energy S_0 is greater than a preset third threshold value M_2, generate a fourth threshold A_3 according to the first noise energy S_0, and use the fourth threshold A_3 as the first threshold A_0 at the moment t_{i+1}.
15. The apparatus according to claim 14, characterized in that the configurator is specifically configured to:
use the first noise energy S_0 as the fourth threshold A_3;
or, use the sum of the first noise energy S_0 and a preset third correction N_2 as the fourth threshold A_3;
or, use the product of the first noise energy S_0 and a preset third coefficient a_2 as the fourth threshold A_3.
16. The apparatus according to claim 14 or 15, characterized in that the configurator is further configured to:
record the moment t_i as a threshold-lowering moment;
when the interval between the moment t_i and the previous threshold-lowering moment is greater than a preset value T_time, perform the step of using the fourth threshold A_3 as the first threshold A_0 at the moment t_{i+1}; otherwise, not perform the step of using the fourth threshold A_3 as the first threshold A_0 at the moment t_{i+1}.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510549435.6A CN105261368B (en) 2015-08-31 2015-08-31 Voice wake-up method and apparatus

Publications (2)

Publication Number Publication Date
CN105261368A true CN105261368A (en) 2016-01-20
CN105261368B CN105261368B (en) 2019-05-21

Family

ID=55101027

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510549435.6A Expired - Fee Related CN105261368B (en) Voice wake-up method and apparatus 2015-08-31 2015-08-31

Country Status (1)

Country Link
CN (1) CN105261368B (en)

Also Published As

Publication number Publication date
CN105261368B (en) 2019-05-21

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20191211

Address after: 211200 No.1, Xiushan Middle Road, Lishui Economic Development Zone, Nanjing City, Jiangsu Province

Patentee after: NANJING ADVANCED BIOMATERIALS AND PROCESS EQUIPMENT RESEARCH INSTITUTE Co.,Ltd.

Address before: 510000 unit 2414-2416, building, No. five, No. 371, Tianhe District, Guangdong, China

Patentee before: GUANGDONG GAOHANG INTELLECTUAL PROPERTY OPERATION Co.,Ltd.

Effective date of registration: 20191211

Address after: 510000 unit 2414-2416, building, No. five, No. 371, Tianhe District, Guangdong, China

Patentee after: GUANGDONG GAOHANG INTELLECTUAL PROPERTY OPERATION Co.,Ltd.

Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Patentee before: HUAWEI TECHNOLOGIES Co.,Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190521

Termination date: 20200831
