CN101853665A - Method for eliminating noise in voice - Google Patents


Info

Publication number
CN101853665A
CN101853665A CN200910087086A CN200910087086A CN101853665A CN 101853665 A CN101853665 A CN 101853665A CN 200910087086 A CN200910087086 A CN 200910087086A CN 200910087086 A CN200910087086 A CN 200910087086A CN 101853665 A CN101853665 A CN 101853665A
Authority
CN
China
Prior art keywords
noise
signal
voice
voice signal
ratio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN200910087086A
Other languages
Chinese (zh)
Inventor
Inventor name withheld (不公告发明人)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BOSHIJIN (BEIJING) INFORMATION TECHNOLOGY Co Ltd
Original Assignee
BOSHIJIN (BEIJING) INFORMATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BOSHIJIN (BEIJING) INFORMATION TECHNOLOGY Co Ltd
Priority to CN200910087086A
Publication of CN101853665A
Legal status: Pending

Landscapes

  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention relates to a method for eliminating noise in speech. It extracts a speech signal contaminated by noise and can be widely applied to front-end processing for voice communication, hearing aids, speech recognition and voiceprint authentication. The method comprises the following steps: 1) framing and windowing the contaminated noisy speech signal; 2) transforming it to the frequency domain to obtain the frequency-domain speech signal; 3) estimating the statistical characteristic parameters of the noise in real time; 4) designing a gain function from a statistical model method and the estimated noise statistics; 5) applying the gain to the frequency-domain speech signal and inverse-transforming the result into a time-domain signal; and 6) performing overlap-add to obtain the final noise-reduced signal. Because the statistical characteristic parameters of the noise are estimated in real time and a statistical model of speech is used, a large amount of noise is eliminated while damage to the speech signal is avoided, giving a better noise reduction effect.

Description

Method for eliminating noise in voice
Technical field: The invention belongs to the field of signal processing, in particular speech signal processing. Specifically, it relates to a method for eliminating noise from a speech signal, also called a speech enhancement method.
Background: Noise elimination in speech, also called speech enhancement, is a selective processing technique. Its purpose is to extract the clean original speech signal from a signal contaminated by noise and so improve the clarity and intelligibility of speech. It can be widely used wherever speech quality needs to be improved, such as voice monitoring, voice communication, hearing aids and video conferencing. In other important speech applications, such as speech recognition and voiceprint authentication, background noise has long delayed wide practical adoption and seriously hinders the use of these technologies. Applying noise elimination sensibly, so that the speech signal is extracted as cleanly as possible from the contaminated noisy speech signal, can significantly improve the performance of speech recognition and voiceprint authentication systems. Because of its practical importance, speech enhancement has long been an active international research topic.
Speech enhancement techniques are classified, by the number of channels used to pick up the speech signal, into single-channel and multi-channel techniques. A multi-channel speech enhancement system usually needs two or more microphones forming a microphone array, and removes background noise by processing such as beamforming and post-filtering. Because it obtains not only the time-domain and frequency-domain information of the speech signal but also its spatial information, it usually achieves better results. A single-channel speech enhancement system uses only one microphone, so its hardware requirements are low, its algorithmic complexity is correspondingly lower, it is easy to implement, and its range of application is wider. The present invention is a single-channel speech enhancement method.
A typical single-channel speech enhancement system comprises the following steps:
1) frame and window the noisy speech signal y;
2) transform to the frequency domain, giving Y(Ω);
3) estimate the statistical characteristic parameters of the noise;
4) design the gain function and process the noisy signal;
5) inverse-transform the processed signal back to the time domain;
6) obtain the enhanced signal by the overlap-add method.
Step 3) estimates the statistical characteristic parameters of the noise. The silent segment at the beginning can be taken directly as noise frames from which the noise statistics are extracted. Alternatively, real-time voice activity detection can track whether the current frame is a noise frame or a speech frame: if it is a noise frame, the noise statistical parameters are updated; otherwise tracking continues with the next frame. The resulting noise spectrum is denoted $P_{nn}(\Omega)$.
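The first option described above, taking the leading silent frames as noise, can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `initial_noise_spectrum`, the parameter `num_noise_frames` and the sample values are hypothetical, and each frame is assumed to be already available as a list of per-bin power values.

```python
def initial_noise_spectrum(power_spectra, num_noise_frames):
    """Average the power spectra of the leading frames, assumed to contain noise only."""
    frames = power_spectra[:num_noise_frames]
    # zip(*frames) groups the values of each frequency bin across the chosen frames.
    return [sum(bin_values) / len(frames) for bin_values in zip(*frames)]

# Hypothetical data: two quiet frames followed by one loud (speech) frame, two bins each.
power_spectra = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
p_nn = initial_noise_spectrum(power_spectra, 2)
```

The estimate is fixed after the leading frames, which is why it suits stationary noise only.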
Step 4), the design of the gain function, is usually where the many single-channel speech enhancement methods differ most. A typical example is the Wiener filter gain based on the MMSE criterion:
$$G(\Omega) = \frac{P_{ss}(\Omega)}{P_{ss}(\Omega) + P_{nn}(\Omega)} \qquad (1)$$
Figure B2009100870865D00013
This method is widely used in practice and gives good results when the noise is stationary, but it has shortcomings, mainly the following. 1) In a non-stationary noise environment, taking the silent segment at the beginning as noise frames usually cannot capture the statistics of the later noise accurately; voice activity detection is not only harder to implement but is also difficult to make accurate in a non-stationary environment. 2) The gain function is linear and applies the same proportion of gain in voiced and silent segments alike, which is clearly unreasonable: when the signal-to-noise ratio is high, removing too much noise loses information from the speech signal itself, and when the signal-to-noise ratio is low, more noise remains. This is also why a linear gain function can produce strong "musical noise". A more reasonable noise tracking algorithm and a more reasonable gain function are therefore needed.
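The Wiener gain of equation (1) can be sketched per frequency bin as below. The function name `wiener_gain`, the small constant `eps` (added here only to avoid division by zero) and the sample spectra are assumptions for illustration.

```python
def wiener_gain(p_ss, p_nn, eps=1e-12):
    """Per-bin Wiener gain G = P_ss / (P_ss + P_nn), following equation (1)."""
    return [s / (s + n + eps) for s, n in zip(p_ss, p_nn)]

# Hypothetical per-bin power spectra (illustrative values only).
speech_power = [4.0, 1.0, 0.0]
noise_power = [1.0, 1.0, 1.0]
gains = wiener_gain(speech_power, noise_power)
```

The gain depends only on the ratio of the two spectra, so an inaccurate noise spectrum translates directly into too much or too little attenuation, which is the weakness the text describes.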
Summary of the invention
The object of the invention is to design a more reasonable method that tracks the noise in real time, so that the speech enhancement system can work in non-stationary noise, and to design a more effective gain function that avoids damaging the speech as far as possible while removing as much residual noise as possible, remedying the speech damage and residual noise caused by the linear gain function.
The method of the invention comprises the following steps:
1) frame and window the noisy speech signal y;
2) transform to the frequency domain, giving Y(Ω);
3) estimate the statistical characteristic parameters of the noise;
4) design the gain function and process the noisy signal;
5) inverse-transform the processed signal back to the time domain
Figure B2009100870865D00021
6) obtain the enhanced signal by the overlap-add method.
Step 3), estimating the statistical characteristic parameters of the noise, comprises the following steps:
a) compute the power spectrum of the current frame and, using the noise spectrum estimated for the previous frame, estimate the current noise spectrum by smoothing; the tracking speed is adjusted by the smoothing factor;
b) compute the a priori signal-to-noise ratio from the signal power spectrum and noise spectrum obtained in a);
c) using the signal-to-noise ratio estimated for the previous frame, estimate the a posteriori signal-to-noise ratio of the current frame by smoothing; the tracking speed is adjusted by the smoothing factor.
Step 4), designing the gain function and processing the noisy signal, comprises the following steps:
a) select a statistical model of speech and model the speech data with a parametric method;
b) based on the MMSE criterion, obtain the gain function G(Ω) from the estimated noise spectrum, a priori signal-to-noise ratio and a posteriori signal-to-noise ratio obtained in step 3);
c) apply the formula
Figure B2009100870865D00022
to obtain the estimated speech spectrum.
Because the invention tracks changes in the noise spectrum, the a priori signal-to-noise ratio and the a posteriori signal-to-noise ratio in real time, it obtains the noise statistics of the current signal frame more accurately and suits more complex non-stationary noise environments. Because it models the speech data statistically and obtains a nonlinear gain function, it can remove more noise while avoiding excessive damage to the speech.
Description of drawings
Fig. 1 is a block diagram of the speech enhancement system implemented by the invention.
Embodiment
The invention is described in further detail below with reference to the drawing and a specific embodiment.
The speech signal of the current frame, x(t) = s(t) + n(t), is windowed and transformed by FFT to obtain its frequency-domain form:
$$Y(k) = \mathrm{FFT}\big(x(t)\,h(t)\big) \qquad (3)$$
where the window function h(t) can be a Hamming window or a Hanning window, and the frame length can be 10 to 30 ms.
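The framing, windowing and transform of equation (3) can be sketched as follows. This is an illustration under stated assumptions: the 8 kHz sampling rate, the 20 ms frame, the 50% overlap and the test tone are chosen here and do not come from the patent, and a naive DFT stands in for the FFT so that the example needs only the standard library.

```python
import cmath
import math

def hamming(n):
    """Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def dft(frame):
    """Naive O(N^2) discrete Fourier transform; an FFT computes the same values faster."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def analyze(signal, frame_len, hop):
    """Split the signal into overlapping frames, window each frame and transform it."""
    window = hamming(frame_len)
    spectra = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        spectra.append(dft(frame))
    return spectra

# Assumed parameters: 8 kHz sampling, 20 ms frames (160 samples), 50% overlap.
fs = 8000
frame_len = fs * 20 // 1000
hop = frame_len // 2
signal = [math.sin(2 * math.pi * 400 * t / fs) for t in range(fs // 10)]  # 100 ms, 400 Hz tone
spectra = analyze(signal, frame_len, hop)
peak_bin = max(range(frame_len // 2), key=lambda k: abs(spectra[0][k]))
```

With these values each bin spans 50 Hz, so the 400 Hz tone peaks in bin 8 of every frame.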
Using the information estimated for the previous frame, namely the noise spectrum $N_p(k)$ and the signal spectrum, estimate the noise spectrum of the current frame:
Figure B2009100870865D00032
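The exact update formula is not recoverable from the text here, so the following sketch assumes the common first-order recursive smoothing form, in which the new noise spectrum is a weighted mix of the previous noise estimate and the current frame's power spectrum. The function name `update_noise_spectrum` and the smoothing factor `alpha` are hypothetical; a larger `alpha` tracks more slowly.

```python
def update_noise_spectrum(noise_prev, power_cur, alpha=0.9):
    """Assumed recursive smoothing: N(k) = alpha * N_p(k) + (1 - alpha) * |Y(k)|^2."""
    return [alpha * n + (1.0 - alpha) * p for n, p in zip(noise_prev, power_cur)]

# Illustrative run: the estimate drifts from its initial value towards the observed power.
one_step = update_noise_spectrum([1.0], [2.0], alpha=0.9)
noise = [1.0, 1.0]
for _ in range(50):
    noise = update_noise_spectrum(noise, [2.0, 0.5], alpha=0.9)
```

The smoothing factor plays the role the text assigns to it: it sets how quickly the estimate follows changes in the noise.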
Compute the a priori signal-to-noise ratio of the current frame:
$$\gamma(k) = \frac{Y(k)}{N(k)} - 1 \qquad (5)$$
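Using the a posteriori signal-to-noise ratio of the previous frame, compute the a posteriori signal-to-noise ratio of the current frame:
$$\eta(k) = \max\{0,\ \beta(k)\,\eta(k) + (1-\beta(k))\,\gamma(k)\} \qquad (6)$$
where $\eta(k)$ on the right-hand side is the value from the previous frame and $\beta(k)$ is the smoothing factor.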
Once the current noise statistics have been estimated, they are passed to the gain function.
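Equations (5) and (6) can be sketched together as below, keeping the document's naming of the two ratios. The function name `track_snr`, the use of one scalar `beta` for all bins (the text allows a per-bin β(k)) and the floor on the noise value are assumptions made for the illustration.

```python
def track_snr(power_cur, noise_cur, eta_prev, beta=0.98):
    """Return (gamma, eta) for the current frame following equations (5) and (6)."""
    # Equation (5): ratio of signal power to noise power, minus one.
    gamma = [p / max(n, 1e-12) - 1.0 for p, n in zip(power_cur, noise_cur)]
    # Equation (6): smooth with the previous frame's value and clamp at zero.
    eta = [max(0.0, beta * e + (1.0 - beta) * g) for e, g in zip(eta_prev, gamma)]
    return gamma, eta

# Illustrative values: one bin above the noise level, one below it.
gamma, eta = track_snr([4.0, 0.5], [1.0, 1.0], [0.0, 0.0], beta=0.5)
```

The clamp at zero matters for the second bin, where the raw ratio is negative because the frame power is below the noise estimate.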
The gain function design of the invention is a parameterized optimal estimation problem. The frequency-domain form of the signal to be estimated is:
Figure B2009100870865D00034
For brevity, the frequency index k is left implicit below; the right-hand side of the above expression can be expanded as:
Figure B2009100870865D00035
According to Bayes' theorem, one obtains:
where p(·) denotes a probability density function. If the speech data are modeled with a super-Gaussian model, the Laplacian or the gamma probability density function can be chosen; they are, respectively:
$$p(S) = \frac{1}{\sigma_s}\exp\!\left(-\frac{2\,|S|}{\sigma_s}\right) \qquad (10)$$
$$p(S) = \frac{\sqrt[4]{3}}{2\sqrt{\pi\sigma_s}\,\sqrt[4]{2}}\,|S|^{-\frac{1}{2}}\exp\!\left(-\frac{\sqrt{3}\,|S|}{\sqrt{2}\,\sigma_s}\right) \qquad (11)$$
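Here $\sigma_s$ denotes the variance of the speech signal. Taking the Laplacian probability density function as an example, let:
$$L_+ = \frac{\sigma_n}{\sigma_s} + \frac{Y}{\sigma_n} = \frac{1}{\eta} + \frac{Y}{\sigma_n}$$
$$L_- = \frac{\sigma_n}{\sigma_s} - \frac{Y}{\sigma_n} = \frac{1}{\eta} - \frac{Y}{\sigma_n} \qquad (12)$$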
where η is the a posteriori signal-to-noise ratio of the signal. According to the optimal MMSE criterion, minimizing
Figure B2009100870865D00043
gives:
Figure B2009100870865D00044
Figure B2009100870865D00046
Figure B2009100870865D00047
where erfc(·) denotes the complementary error function and p(Y) is:
Figure B2009100870865D00048
Figure B2009100870865D00049
Figure B2009100870865D000411
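The Laplacian prior of equation (10) and the auxiliary quantities of equation (12) can be evaluated numerically as a sanity check. This sketch covers only those two pieces and does not reproduce the full estimator; the function names and the numeric values are hypothetical, and the integration range and step are chosen only so that a simple midpoint rule is accurate.

```python
import math

def laplace_pdf(s, sigma_s):
    """Laplacian density of equation (10)."""
    return math.exp(-2.0 * abs(s) / sigma_s) / sigma_s

def l_plus_minus(y, sigma_n, sigma_s):
    """Auxiliary quantities L+ and L- of equation (12)."""
    ratio = sigma_n / sigma_s
    return ratio + y / sigma_n, ratio - y / sigma_n

# Midpoint-rule integral of the density over [-20, 20] with sigma_s = 1.5 (illustrative).
step = 0.001
area = sum(laplace_pdf(-20.0 + (i + 0.5) * step, 1.5) * step for i in range(40000))
l_plus, l_minus = l_plus_minus(2.0, 1.0, 2.0)
```

The integral comes out very close to 1, which confirms that the density in equation (10) is normalized.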
This finally yields the gain function G, which is then applied to obtain the processed frequency-domain signal:
Figure B2009100870865D000412
The time-domain signal is then obtained by the IFFT:
Figure B2009100870865D000413
Finally, the complete enhanced signal is obtained by the overlap-add method.
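The chain of windowing, transform, gain, inverse transform and overlap-add can be sketched end to end as follows. This is a minimal illustration under assumptions, not the patent's implementation: a periodic Hann window with 50% overlap is chosen because its shifted copies sum to one, a naive DFT stands in for the FFT and IFFT, and the gain is supplied through a hypothetical callback `gain_fn` (here a unity gain, so the output can be compared with the input).

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) DFT or inverse DFT; stands in for FFT and IFFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def enhance(signal, frame_len, gain_fn):
    """Window, transform, apply a per-bin gain, inverse-transform and overlap-add."""
    hop = frame_len // 2
    # Periodic Hann window: copies shifted by half a frame sum to exactly 1.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len) for i in range(frame_len)]
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        spectrum = dft(frame)
        gains = gain_fn(spectrum)
        processed = dft([g * y for g, y in zip(gains, spectrum)], inverse=True)
        for i in range(frame_len):
            out[start + i] += processed[i].real  # overlap-add into the output buffer
    return out

# With a unity gain, every sample covered by two frames is reconstructed unchanged.
signal = [math.sin(0.3 * t) for t in range(64)]
restored = enhance(signal, 16, lambda spectrum: [1.0] * len(spectrum))
max_error = max(abs(restored[i] - signal[i]) for i in range(8, 56))
```

Only the first and last half frames are covered by a single window and are therefore attenuated; a real gain function would replace the unity callback.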

Claims (3)

1. A method for eliminating noise in voice, used to extract a speech signal contaminated by noise, comprising the following steps:
1) framing and windowing the contaminated noisy speech signal;
2) transforming to the frequency domain to obtain the frequency-domain speech signal;
3) estimating the statistical characteristic parameters of the noise in real time;
4) designing a gain function according to a statistical model method and the estimated noise statistics;
5) applying the gain to the frequency-domain speech signal and inverse-transforming it into a time-domain signal;
6) obtaining the complete speech signal by the overlap-add method.
2. The real-time estimation of the statistical characteristic parameters of the noise according to claim 1, characterized in that step 3) comprises:
1) computing the spectrum of the current frame of the speech signal;
2) computing the a priori signal-to-noise ratio from the noise spectrum estimated for the previous frame;
3) smoothing the computed a priori signal-to-noise ratio and the noise spectrum estimated for the previous frame to estimate the noise spectrum and the a posteriori signal-to-noise ratio of the current frame;
4) passing the estimated parameters to the gain function.
3. The design of the gain function from the statistical model method and the estimated noise statistics according to claim 1, characterized in that step 4) comprises:
1) choosing a statistical model method and an optimality criterion and designing the gain function;
2) using the statistical characteristic parameters estimated in step 3) of claim 1, applying the gain to the frequency-domain speech signal to obtain the denoised frequency-domain speech signal.
CN200910087086A 2009-06-18 2009-06-18 Method for eliminating noise in voice Pending CN101853665A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200910087086A CN101853665A (en) 2009-06-18 2009-06-18 Method for eliminating noise in voice

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200910087086A CN101853665A (en) 2009-06-18 2009-06-18 Method for eliminating noise in voice

Publications (1)

Publication Number Publication Date
CN101853665A true CN101853665A (en) 2010-10-06

Family

ID=42805120

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200910087086A Pending CN101853665A (en) 2009-06-18 2009-06-18 Method for eliminating noise in voice

Country Status (1)

Country Link
CN (1) CN101853665A (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102426837A (en) * 2011-12-30 2012-04-25 中国农业科学院农业信息研究所 Robustness method used for voice recognition on mobile equipment during agricultural field data acquisition
CN102426837B (en) * 2011-12-30 2013-10-16 中国农业科学院农业信息研究所 Robustness method used for voice recognition on mobile equipment during agricultural field data acquisition
CN104103278A (en) * 2013-04-02 2014-10-15 北京千橡网景科技发展有限公司 Real time voice denoising method and device
CN106796802A (en) * 2014-09-03 2017-05-31 马维尔国际贸易有限公司 Method and apparatus for eliminating music noise via nonlinear attenuation/gain function
CN106796802B (en) * 2014-09-03 2021-06-18 马维尔亚洲私人有限公司 Method and apparatus for eliminating musical noise via a non-linear attenuation/gain function
CN105931649A (en) * 2016-03-31 2016-09-07 欧仕达听力科技(厦门)有限公司 Ultra-low time delay audio processing method and system based on spectrum analysis
CN106328155A (en) * 2016-09-13 2017-01-11 广东顺德中山大学卡内基梅隆大学国际联合研究院 Speech enhancement method of correcting priori signal-to-noise ratio overestimation
CN108074582A (en) * 2016-11-10 2018-05-25 电信科学技术研究院 A kind of noise suppressed signal-noise ratio estimation method and user terminal
US20190019526A1 (en) * 2017-07-13 2019-01-17 Gn Hearing A/S Hearing device and method with non-intrusive speech intelligibility
US11164593B2 (en) * 2017-07-13 2021-11-02 Gn Hearing A/S Hearing device and method with non-intrusive speech intelligibility
US11676621B2 (en) 2017-07-13 2023-06-13 Gn Hearing A/S Hearing device and method with non-intrusive speech intelligibility
CN109063165A (en) * 2018-08-15 2018-12-21 深圳市诺信连接科技有限责任公司 A kind of ERP file polling management system
CN109063165B (en) * 2018-08-15 2022-04-19 深圳市诺信连接科技有限责任公司 ERP file query management system
CN110648681A (en) * 2019-09-26 2020-01-03 腾讯科技(深圳)有限公司 Voice enhancement method and device, electronic equipment and computer readable storage medium
CN110648681B (en) * 2019-09-26 2024-02-09 腾讯科技(深圳)有限公司 Speech enhancement method, device, electronic equipment and computer readable storage medium
CN113286047A (en) * 2021-04-22 2021-08-20 维沃移动通信(杭州)有限公司 Voice signal processing method and device and electronic equipment

Similar Documents

Publication Publication Date Title
CN101853665A (en) Method for eliminating noise in voice
US10580430B2 (en) Noise reduction using machine learning
CN106340292B (en) A kind of sound enhancement method based on continuing noise estimation
CN101976566B (en) Voice enhancement method and device using same
US6173258B1 (en) Method for reducing noise distortions in a speech recognition system
US8010355B2 (en) Low complexity noise reduction method
US8712074B2 (en) Noise spectrum tracking in noisy acoustical signals
EP2031583B1 (en) Fast estimation of spectral noise power density for speech signal enhancement
CN101154383B (en) Method and device for noise suppression, phonetic feature extraction, speech recognition and training voice model
CN103021420A (en) Speech enhancement method of multi-sub-band spectral subtraction based on phase adjustment and amplitude compensation
CN104464728A (en) Speech enhancement method based on Gaussian mixture model (GMM) noise estimation
CN112735456A (en) Speech enhancement method based on DNN-CLSTM network
CN105489226A (en) Wiener filtering speech enhancement method for multi-taper spectrum estimation of pickup
CN102969000A (en) Multi-channel speech enhancement method
CN106373559A (en) Robustness feature extraction method based on logarithmic spectrum noise-to-signal weighting
Islam et al. Speech enhancement based on a modified spectral subtraction method
CN102314883B (en) Music noise judgment method and voice noise elimination method
CN107895582A (en) Towards the speaker adaptation speech-emotion recognition method in multi-source information field
Jakati et al. A noise reduction method based on modified LMS algorithm of real time speech signals
Alam et al. Robust feature extraction for speech recognition by enhancing auditory spectrum
CN107045874A (en) A kind of Non-linear Speech Enhancement Method based on correlation
CN112233657A (en) Speech enhancement method based on low-frequency syllable recognition
CN106997766B (en) Homomorphic filtering speech enhancement method based on broadband noise
Saruwatari et al. Semi-blind speech extraction for robot using visual information and noise statistics
Swamy et al. Dual threshold log spectral distance voice activity detector based effective statistical speech enhancement

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20101006