CN1175398C - Sound activation detection method for identifying speech and music from noise environment - Google Patents


Publication number
CN1175398C
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB001274945A
Other languages
Chinese (zh)
Other versions
CN1354455A (en)
Inventor
黎家力
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp

Landscapes

  • User Interface Of Digital Computer (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The present invention discloses a voice activity detection method for identifying speech and music in a noisy environment. The method uses the signal-to-noise ratio (SNR) as the decision criterion for voice activity detection. It comprises the following steps: sampled data is converted to the frequency domain by FFT and divided nonlinearly into subbands in the frequency domain; the energy and the SNR metric of each subband are calculated; the update of the subband noise energy and the calculation of the subband SNR metric are carried out separately in a foreground and a background process, which take control alternately; the SNR metric is used as the criterion for distinguishing noise from speech and music. The method can accurately detect speech and music in a noisy environment, giving the system strong resistance to ambient noise and strong adaptability to various valid sound signals.

Description

A voice activity detection method for identifying speech and music in a noisy environment
Technical field
The present invention relates to voice activity detection in digital communication systems, and more specifically to a voice activity detection (VAD) method that can accurately identify speech and music signals in an input signal mixed with ambient noise.
Background art
Voice activity detection is widely used in communication systems. In a mobile communication system, for example, it can increase the traffic-handling capacity of the system. As another example, in the audio mixing module of the multipoint control unit of a video conference, VAD allows only the audio streams in which someone is detected speaking to take part in mixing, which reduces the number of terminals participating in the mix and improves its quality.
Conventional VAD methods use short-time parameters such as energy, zero-crossing rate, pitch period or other speech-signal parameters as the basis for deciding whether someone is speaking. When the background noise is strong, these methods produce misjudgments. Moreover, these parameters are all based on the human speech production model and are therefore unsuitable for music. In multimedia communication systems, however, music is often used as an important medium; conventional VAD methods are suited only to detecting human speech and do not adapt to a non-stationary process such as music.
Summary of the invention
The purpose of the invention is to provide a voice activity detection method that works in noisy environments and accurately detects both speech and music, so that the system has strong resistance to ambient noise and at the same time adapts well to various valid sound signals. The method is particularly suitable for multimedia communication systems such as video conferencing systems.
To achieve this purpose, the voice activity detection method for identifying speech and music in a noisy environment comprises the following steps:
1. Convert the acquired sampled data to the frequency domain by fast Fourier transform (FFT);
2. In the frequency domain, divide the spectrum nonlinearly into subbands, calculate the energy and the foreground SNR of each subband, and compute the foreground SNR metric from the foreground SNR;
3. If the current frame is the first frame, set the current state to the foreground state;
4. Control the foreground and background work according to the statistics of the current SNR metric;
5. If the current state is the foreground state, compare the current foreground SNR metric with the selected thresholds, then judge and process accordingly;
6. If the current state is the background state, start the background subband noise energy update, calculate the background SNR and the background SNR metric, then judge and process according to the statistics of the SNR metric;
7. If the current state is the transition state, enter transition-state processing, make a further judgment from the statistics of the SNR metric, and finally decide whether to enter the foreground state or the background state;
8. According to the requirements of the external module, output the foreground SNR metric, or output the silence flag derived from the foreground SNR metric as the control flag for voice activity detection (VAD);
9. According to the requirements of the external module, calculate and output the total energy of the subbands of this frame (this step is optional);
10. Return to step 1 and process the next frame.
In step 8 of the above method, the sound flag is set if the foreground SNR metric is greater than threshold one; otherwise the silence flag is set.
As the above scheme shows, the method relies on the signal-to-noise ratio, a physical quantity of general applicability. Compared with other methods it therefore has the clear advantage of wide adaptability: it detects both speech and music, has strong noise immunity, suits a variety of noise environments, and adapts to hardware with different input gains and different SNRs. It is particularly suitable for multimedia communication systems.
Description of drawings
The invention will be further described below in conjunction with drawings and Examples.
Fig. 1 is the process flow diagram of the method for the invention.
Fig. 2 is a flowchart of the method applied in a system.
Embodiment
The method is described in detail below with reference to Fig. 1:
The present invention bases the voice activity detection criterion on the physical quantity of signal-to-noise ratio. Because the spectrum perceptible to the human ear is mainly concentrated below 4 kHz, and in order to reduce the amount of computation, the invention takes 8 kHz sampling as an example; for other sampling rates the method applies equally once some parameters are changed. In the first step, the acquired sampled data is converted to the frequency domain by fast Fourier transform (FFT):
The input speech is denoted $s(n)$. The frame length of the algorithm is 10 ms, i.e. 80 samples per frame ($L = 80$), and inter-frame overlap is used with $D = 24$ overlapping samples. The input frame buffer $d(m,n)$ therefore holds $L + D = 104$ samples, of which the first $D$ samples are the last $D$ samples of the previous frame:
$$d(m,n) = d(m-1,\,L+n),\quad 0 \le n < D$$
Here $m$ is the index of the current frame.
Pre-emphasis is applied to the input speech $s(n)$, giving
$$d(m,\,D+n) = s(n) + \xi_p\, s(n-1),\quad 0 \le n < L$$
where $\xi_p = -0.8$ is the pre-emphasis factor.
The pre-emphasized input data $d(m,n)$ is windowed with a smoothed trapezoidal window and then zero-padded to form the $M = 128$ point discrete Fourier transform (DFT) input data $g(n)$, that is:
[formula not recoverable from the source text]
A discrete Fourier transform is applied to $g(n)$ to obtain the spectrum $G(k)$ of the input signal:
$$G(k) = \frac{2}{M}\sum_{n=0}^{M-1} g(n)\, e^{-j 2\pi n k / M},\quad 0 \le k < M$$
In practice, because $g(n)$ is real, an $M/2$-point complex FFT can be used to compute the $M$-point real FFT quickly.
For 16 kHz sampling, a frame is 160 samples ($L = 160$) with inter-frame overlap of $D = 48$ samples. The input frame buffer $d(m,n)$ then holds $L + D = 208$ samples, and a 256-point FFT is performed.
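The framing, pre-emphasis and transform of the first step can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the names `build_frame` and `spectrum` are hypothetical, the trapezoidal window is omitted because its formula is not given in this text, `prev_sample` is an assumed way of supplying s(-1) across the frame boundary, and a direct DFT stands in for the FFT for brevity.

```python
import cmath

L, D, M = 80, 24, 128   # frame length, overlap, DFT size (8 kHz case)
XI_P = -0.8             # pre-emphasis factor

def build_frame(prev_frame, samples, prev_sample=0.0):
    """Build d(m, .) from the previous buffer d(m-1, .) and L new samples s(n).

    prev_sample stands in for s(-1), the last sample of the previous frame
    (an assumption about how the frame boundary is handled).
    """
    frame = list(prev_frame[L:L + D])      # d(m, n) = d(m-1, L+n), 0 <= n < D
    last = prev_sample
    for s in samples:                      # d(m, D+n) = s(n) + xi_p * s(n-1)
        frame.append(s + XI_P * last)
        last = s
    return frame

def spectrum(frame):
    """Zero-pad to M points and evaluate G(k) = (2/M) * sum_n g(n) e^{-j2*pi*n*k/M}.

    Windowing is skipped here; a direct O(M^2) DFT replaces the FFT.
    """
    g = list(frame) + [0.0] * (M - len(frame))
    return [(2.0 / M) * sum(g[n] * cmath.exp(-2j * cmath.pi * n * k / M)
                            for n in range(M))
            for k in range(M)]
```

For the 16 kHz case the same sketch would use L = 160, D = 48 and M = 256.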
In the second step, the spectrum is divided nonlinearly into subbands, the energy and the foreground SNR of each subband are calculated, and the foreground SNR metric is computed from the foreground SNR:
(1) The energy $E_{ch}(m,i)$ of each subband of the current frame is calculated as follows:
$$E_{ch}(m,i) = \max\left\{E_{min},\ \alpha_{ch}(m)\,E_{ch}(m-1,i) + \bigl(1-\alpha_{ch}(m)\bigr)\,\frac{1}{f_H(i)-f_L(i)+1}\sum_{k=f_L(i)}^{f_H(i)} |G(k)|^2\right\},\quad 0 \le i < N_C$$
where $N_C = 16$ is the number of subbands, $E_{min} = 0.0625$ is the minimum subband energy, and $\alpha_{ch}(m)$ is the subband energy smoothing factor. The smoothing factor $\alpha_{ch}(m)$ is defined as
[formula not recoverable from the source text]
$f_L(i)$ and $f_H(i)$ are the start and end positions of the $i$-th subband, where $f_L$ and $f_H$ are defined as follows:
f_L = {2,4,6,8,10,12,14,17,20,23,27,31,36,42,49,56},
f_H = {3,5,7,9,11,13,16,19,22,26,30,35,41,48,55,63}
For 16 kHz sampling:
f_L = {2,4,6,8,10,12,14,17,20,23,27,31,36,42,49,57,66,77,90,106},
f_H = {3,5,7,9,11,13,16,19,22,26,30,35,41,48,56,65,76,89,105,127}
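A minimal sketch of the subband energy calculation for the 8 kHz case is given below. The function name `subband_energies` is hypothetical, and the smoothing factor is passed in as a plain argument `alpha` because the definition of the smoothing factor is not reproduced in this text.

```python
# Subband edges (DFT bin indices) for the 8 kHz case, copied from the text.
F_L = [2, 4, 6, 8, 10, 12, 14, 17, 20, 23, 27, 31, 36, 42, 49, 56]
F_H = [3, 5, 7, 9, 11, 13, 16, 19, 22, 26, 30, 35, 41, 48, 55, 63]
E_MIN = 0.0625  # minimum subband energy

def subband_energies(power, prev_energy, alpha):
    """Smoothed subband energies E_ch(m, i).

    power       -- |G(k)|^2 for at least bins 0..63
    prev_energy -- E_ch(m-1, i) for the 16 subbands
    alpha       -- smoothing factor for this frame (assumed to be supplied
                   by the caller; its definition is not given in the text)
    """
    energies = []
    for i, (lo, hi) in enumerate(zip(F_L, F_H)):
        mean_power = sum(power[lo:hi + 1]) / (hi - lo + 1)
        smoothed = alpha * prev_energy[i] + (1.0 - alpha) * mean_power
        energies.append(max(E_MIN, smoothed))
    return energies
```

The bands are contiguous: each subband starts one bin after the previous one ends, and the band width grows with frequency, which is the nonlinear division the text describes.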
(2) Subband SNR estimation
The SNR $\sigma_q(i)$ of each subband is calculated as follows:
$$\sigma_q(i) = \max\left\{0,\ \min\left\{89,\ \mathrm{round}\left\{10\log_{10}\!\left(\frac{E_{ch}(m,i)}{E_n(m,i)}\right) \Big/\, 0.375\right\}\right\}\right\},\quad 0 \le i < N_C$$
where $E_n(m,i)$ is the estimated noise energy of the $i$-th subband of the current frame and 0.375 is the quantization step of the SNR. $\sigma_q(i)$ is quantized to an integer and limited to the range 0 to 89.
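The quantized subband SNR can be sketched as below. The function name `quantized_snr` is hypothetical, and rounding half up via `math.floor(x + 0.5)` is an assumption, since the text does not say how ties are rounded. Both energies are assumed to be positive, which the minimum-energy floors in the text guarantee.

```python
import math

SNR_STEP = 0.375  # quantization step of the SNR in dB

def quantized_snr(e_ch, e_n):
    """Quantized subband SNR: an integer index limited to 0..89."""
    snr_db = 10.0 * math.log10(e_ch / e_n)
    index = math.floor(snr_db / SNR_STEP + 0.5)   # assumed tie-breaking rule
    return max(0, min(89, index))
```

With a step of 0.375 dB, the top index 89 corresponds to roughly 33.4 dB, so any subband SNR above that saturates.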
(3) Calculation of the SNR metric
The SNR metric $v(m)$ describes, on the basis of the subband SNRs, how closely the current frame resembles speech; it is the criterion for deciding whether the current frame is speech or noise:
$$v(m) = \sum_{i=0}^{N_C-1} V\bigl(\sigma_q(i)\bigr)$$
where $V(k)$ is the $k$-th value of the SNR metric table {V}. {V} has 90 elements, defined as
V={2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,4,4,4,5,5,5,6,6,7,7,7,8,8,9,9,10,10,11,12,12,13,13,14,15,15,16,17,17,18,19,20,20,21,22,23,24,24,25,26,27,28,28,29,30,31,32,33,34,35,36,37,37,38,39,40,41,42,43,44,45,46,47,48,49,50,50,50,50,50,50,50,50,50,50}.
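The table lookup and summation can be sketched as follows; the function name `snr_metric` is hypothetical, while the table values are copied from the text.

```python
# SNR metric table {V}: 90 entries, indexed by the quantized subband SNR.
V = [
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 5,
    5, 5, 6, 6, 7, 7, 7, 8, 8, 9, 9, 10, 10, 11, 12, 12, 13, 13, 14, 15,
    15, 16, 17, 17, 18, 19, 20, 20, 21, 22, 23, 24, 24, 25, 26, 27, 28, 28, 29, 30,
    31, 32, 33, 34, 35, 36, 37, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49,
    50, 50, 50, 50, 50, 50, 50, 50, 50, 50,
]

def snr_metric(sigma_q):
    """Sum the table entries selected by the quantized subband SNRs."""
    return sum(V[s] for s in sigma_q)
```

With 16 subbands, a frame whose quantized SNR is 0 in every subband gives a metric of 16 × 2 = 32, and a frame saturated at 89 in every subband gives 16 × 50 = 800.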
In the third step, if this frame is the first frame, the current state is set to the foreground state.
In the fourth step, the foreground and background work is controlled according to the statistics of the SNR metric.
In the fifth step, if the current state is the foreground state, the following judgments and processing are carried out:
1) When the foreground SNR metric is below threshold one, the frame is considered noise and the foreground noise energy update is started;
2) If the current state is the foreground state and the foreground SNR metric stays above threshold one for 2 consecutive seconds, the transition state is to be entered: the current subband energies are taken as the background subband noise energy and the current state is set to the transition state;
3) When the foreground SNR metric stays above threshold two for 2 consecutive seconds, the signal is considered music; the foreground subband noise energy update is disabled and the current state is set to the background state;
4) Go to the eighth step.
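The foreground-state decisions of the fifth step can be sketched as a small per-frame routine. Everything here beyond the rules quoted in the text is an assumption made for illustration: the names `VadState` and `foreground_step`, the use of run-length counters, the figure of 200 frames for 2 seconds (from the 10 ms frame length), and checking the music rule before the transition rule when both are met on the same frame.

```python
FOREGROUND, TRANSITION, BACKGROUND = "foreground", "transition", "background"
FRAMES_2S = 200  # 2 seconds of 10 ms frames

class VadState:
    """Hypothetical container for the state the fifth step needs."""
    def __init__(self):
        self.state = FOREGROUND
        self.run_above_t1 = 0        # consecutive frames with metric > threshold one
        self.run_above_t2 = 0        # consecutive frames with metric > threshold two
        self.fg_update_enabled = True
        self.bg_noise = None         # background subband noise energy

def foreground_step(st, metric, subband_energy, t1, t2):
    """Process one frame in the foreground state.

    Returns True when the foreground noise energy should be updated
    for this frame (metric below threshold one).
    """
    st.run_above_t1 = st.run_above_t1 + 1 if metric > t1 else 0
    st.run_above_t2 = st.run_above_t2 + 1 if metric > t2 else 0
    update_fg_noise = st.fg_update_enabled and metric < t1

    if st.run_above_t2 >= FRAMES_2S:
        # Treated as music: stop foreground updates, hand over to background.
        st.fg_update_enabled = False
        st.state = BACKGROUND
    elif st.run_above_t1 >= FRAMES_2S:
        # Seed the background noise estimate and enter the transition state.
        st.bg_noise = list(subband_energy)
        st.state = TRANSITION
    return update_fg_noise
```

The counters reset as soon as a frame falls back to or below the corresponding threshold, which is one way of reading "2 consecutive seconds".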
In the sixth step, if the current state is the background state, the background subband noise energy update is started, the background SNR and the background SNR metric are calculated, and the following judgments and processing are carried out:
1) Calculate the background SNR and the background SNR metric;
2) If the background SNR metric stays above threshold one for 6 consecutive seconds, take the current subband energies as the background subband noise energy;
3) When the statistics of the background SNR metric over a period of time satisfy the specified conditions, or the foreground SNR metric is found to stay below threshold two for 1 consecutive second, take the background subband noise energy as the foreground subband noise energy, set the current state to the foreground state, stop the background process, restart the foreground noise energy update process, and go to the eighth step;
4) When the background SNR metric is below threshold one, start the background noise energy update;
5) Go to the eighth step.
In the seventh step, if the current state is the transition state, the following judgments are carried out:
1) When the foreground SNR metric stays above threshold two for 2 consecutive seconds, the signal is considered music and the current state is set to the background state;
2) Calculate the background SNR and the background SNR metric;
3) If the background SNR metric stays above threshold one for 6 consecutive seconds, take the current subband energies as the background subband noise energy;
4) When the statistics of the background SNR metric over a period of time satisfy the specified conditions, or the foreground SNR metric is found to stay below threshold two for 1 consecutive second, take the background subband noise energy as the foreground subband noise energy, set the current state to the foreground state, and go to 6);
5) When the background SNR metric is below threshold one, start the background noise update;
6) When the foreground SNR metric is below threshold one, the frame is considered noise and the foreground noise update is started;
7) Go to the eighth step.
In the eighth step, according to the requirements of the external module, the foreground SNR metric is output, or the silence flag derived from the foreground SNR metric is output as the control flag for VAD.
In the ninth step, the total energy of the subbands of this frame is calculated and output according to the requirements of the external module (this step is optional).
In the tenth step, the first step is repeated to process the next frame of data.
In the described voice activity detection method, threshold one ranges from 35 to 40, and threshold two ranges from threshold one plus five to threshold one plus ten.
In the eighth step, the sound flag is set if the foreground SNR metric is greater than threshold one; otherwise the silence flag is set.
In the sixth and seventh steps, the statistic of the background SNR metric is computed as follows: 20 subframes form one multiframe (200 ms); for each subframe, if the background SNR metric of the subframe is greater than threshold one, the statistic is decreased by 1; otherwise the statistic is increased by 1.
The specified conditions are satisfied in either of the following cases:
1. The statistic is greater than zero for 30 consecutive multiframes;
2. The ratio of the number of multiframes whose statistic is greater than zero to the number of multiframes whose statistic is less than zero exceeds 35 to 7.
Both conditions indicate that the current sound has pronounced noise characteristics.
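The multiframe statistic and the two conditions can be sketched as below. The names `multiframe_statistics` and `noise_like` are hypothetical. Two points are assumptions because the text leaves them open: the ratio in the second condition is taken over all multiframes passed in, and the condition is treated as unmet while no multiframe has a negative statistic.

```python
def multiframe_statistics(metrics, threshold_one, frames_per_multiframe=20):
    """Per-multiframe statistic: -1 for each subframe whose background SNR
    metric exceeds threshold one, +1 otherwise. Incomplete trailing
    multiframes are ignored."""
    stats = []
    last_start = len(metrics) - frames_per_multiframe
    for start in range(0, last_start + 1, frames_per_multiframe):
        block = metrics[start:start + frames_per_multiframe]
        stats.append(sum(-1 if v > threshold_one else 1 for v in block))
    return stats

def noise_like(stats, run_length=30, ratio=(35, 7)):
    """True when either condition from the text holds for the given statistics."""
    # Condition 1: the statistic is positive for 30 consecutive multiframes
    # (checked here on the most recent ones).
    cond1 = len(stats) >= run_length and all(s > 0 for s in stats[-run_length:])
    # Condition 2: positive-to-negative multiframe count ratio exceeds 35:7,
    # compared by cross-multiplication to avoid division.
    positive = sum(1 for s in stats if s > 0)
    negative = sum(1 for s in stats if s < 0)
    cond2 = negative > 0 and positive * ratio[1] > negative * ratio[0]
    return cond1 or cond2
```

A caller would feed the background SNR metric of each 10 ms frame into `multiframe_statistics` and test the result with `noise_like` when deciding whether to return to the foreground state.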
In the described voice activity detection method, the background noise energy for the next frame is updated as follows:
$$E_n(m+1,i) = \max\bigl\{E_{min},\ \alpha_n E_n(m,i) + (1-\alpha_n)\,E_{ch}(m,i)\bigr\},\quad 0 \le i < N_C$$
where $E_{min} = 0.00625$ is the minimum subband energy allowed and $\alpha_n = 0.9$ is the subband noise energy smoothing factor, which directly affects how quickly the subband noise energy estimate is updated. Usually, the subband energy of each of the first four frames is used as the initial value of the subband noise energy:
$$E_n(m,i) = \max\bigl\{E_{init},\ E_{ch}(m,i)\bigr\},\quad 1 \le m \le 4,\ 0 \le i < N_C$$
where $E_{init} = 16$.
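The noise energy initialization and update can be sketched directly from the two formulas; the function names `init_noise` and `update_noise` are hypothetical, and the constants are those quoted in the text.

```python
E_MIN_NOISE = 0.00625  # minimum subband noise energy allowed
ALPHA_N = 0.9          # subband noise energy smoothing factor
E_INIT = 16.0          # floor used while initializing in the first four frames

def init_noise(e_ch):
    """Initial noise estimate for one of the first four frames."""
    return [max(E_INIT, e) for e in e_ch]

def update_noise(e_n, e_ch):
    """Noise estimate for frame m+1 from the estimate and energies of frame m."""
    return [max(E_MIN_NOISE, ALPHA_N * n + (1.0 - ALPHA_N) * c)
            for n, c in zip(e_n, e_ch)]
```

With a smoothing factor of 0.9, each update moves the estimate one tenth of the way toward the current subband energy.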
The application of the invention in a complete system is described below with reference to Fig. 2:
The compressed bitstream of each voice channel is input and decoded, and the decoded signal is analyzed and processed with this method. The SNR metric (SNR) and the total subband energy (tce) of each channel are then output to the audio mixing module, which finally selects the top n channels to take part in the mix according to the magnitudes of SNR and tce. Because the method requires very little computation, it can be implemented on the same digital signal processing (DSP) chip as the decoder, or on the same DSP chip as the mixing algorithm.
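The channel selection in the mixing module might look like the sketch below. The text only says that the top n channels are chosen according to the magnitudes of SNR and tce; ranking by SNR metric first and using tce to break ties is an assumption, as are the function name `select_channels` and the tuple layout.

```python
def select_channels(channels, n):
    """Pick the n channels that take part in the mix.

    channels -- list of (channel_id, snr_metric, tce) tuples
    Ranking rule (assumed): higher SNR metric first, then higher tce.
    """
    ranked = sorted(channels, key=lambda c: (c[1], c[2]), reverse=True)
    return [c[0] for c in ranked[:n]]
```

For example, with channels `("a", 30, 5.0)`, `("b", 80, 1.0)` and `("c", 80, 9.0)` and n = 2, this rule selects "c" and then "b".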

Claims (6)

1. A voice activity detection method for identifying speech and music in a noisy environment, characterized in that it comprises the following steps:
1) converting the acquired sampled data to the frequency domain by fast Fourier transform;
2) in the frequency domain, dividing the spectrum nonlinearly into subbands, calculating the energy and the foreground SNR of each subband, and computing the foreground SNR metric from the foreground SNR;
3) if the current frame is the first frame, setting the current state to the foreground state;
4) controlling the foreground and background work according to the statistics of the current SNR metric;
5) if the current state is the foreground state: when the foreground SNR metric is below threshold one, considering the frame noise and starting the foreground noise energy update; if the foreground SNR metric stays above threshold one for 2 consecutive seconds, setting the current state to the transition state; when the foreground SNR metric stays above threshold two for 2 consecutive seconds, considering the signal music, disabling the foreground subband noise energy update, and setting the current state to the background state;
6) if the current state is the background state: starting the background subband noise energy update and calculating the background SNR and the background SNR metric; if the background SNR metric stays above threshold one for 6 consecutive seconds, taking the current subband energies as the background subband noise energy; when the statistics of the background SNR metric over a period of time satisfy the specified conditions, or the foreground SNR metric is found to stay below threshold two for 1 consecutive second, taking the background subband noise energy as the foreground subband noise energy, setting the current state to the foreground state, stopping the background process, restarting the foreground noise energy update process, and going to step 8); when the background SNR metric is below threshold one, starting the background noise energy update;
7) if the current state is the transition state, entering transition-state processing, making a further judgment from the statistics of the SNR metric, and finally deciding whether to enter the foreground state or the background state:
(1) when the foreground SNR metric stays above threshold two for 2 consecutive seconds, considering the signal music and setting the current state to the background state;
(2) calculating the background SNR and the background SNR metric;
(3) if the background SNR metric stays above threshold one for 6 consecutive seconds, taking the current subband energies as the background subband noise energy;
(4) when the statistics of the background SNR metric over a period of time satisfy the specified conditions, or the foreground SNR metric is found to stay below threshold two for 1 consecutive second, taking the background subband noise energy as the foreground subband noise energy, setting the current state to the foreground state, and going to step (6);
(5) when the background SNR metric is below threshold one, starting the background noise update;
(6) when the foreground SNR metric is below threshold one, considering the frame noise and starting the foreground noise update;
(7) going to step 8);
8) according to the requirements of the external module, outputting the foreground SNR metric, or outputting the silence flag derived from the foreground SNR metric as the control flag for voice activity detection;
9) returning to step 1) and processing the next frame.
2. The voice activity detection method of claim 1, characterized in that the following may be added between steps 8) and 9): calculating and outputting the total energy of the subbands of this frame according to the requirements of the external module.
3. The voice activity detection method of claim 1, characterized in that in steps 6) and 7) the statistic of the background SNR metric is computed as follows: 20 subframes form one multiframe; for each subframe, if the background SNR metric of the subframe is greater than threshold one, the statistic is decreased by 1; otherwise the statistic is increased by 1.
4. The voice activity detection method of claim 1, characterized in that in step 8) the sound flag is set if the foreground SNR metric is greater than threshold one; otherwise the silence flag is set.
5. The voice activity detection method of any one of claims 1 to 4, characterized in that threshold one ranges from 35 to 40.
6. The voice activity detection method of any one of claims 1 to 4, characterized in that threshold two lies between the value of threshold one plus five and the value of threshold one plus ten.
CNB001274945A 2000-11-18 2000-11-18 Sound activation detection method for identifying speech and music from noise environment Expired - Fee Related CN1175398C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB001274945A CN1175398C (en) 2000-11-18 2000-11-18 Sound activation detection method for identifying speech and music from noise environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB001274945A CN1175398C (en) 2000-11-18 2000-11-18 Sound activation detection method for identifying speech and music from noise environment

Publications (2)

Publication Number Publication Date
CN1354455A CN1354455A (en) 2002-06-19
CN1175398C true CN1175398C (en) 2004-11-10

Family

ID=4592509

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB001274945A Expired - Fee Related CN1175398C (en) 2000-11-18 2000-11-18 Sound activation detection method for identifying speech and music from noise environment

Country Status (1)

Country Link
CN (1) CN1175398C (en)

Legal Events

Date Code Title Description
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C06 Publication
PB01 Publication
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20041110

Termination date: 20141118

EXPY Termination of patent right or utility model