CN108696791A - A kind of combination perception gain function sound enhancement method of single microphone - Google Patents
A kind of combination perception gain function sound enhancement method of single microphone
- Publication number
- CN108696791A (application number CN201710227956.9A)
- Authority
- CN
- China
- Prior art keywords
- voice
- gain function
- noise
- combination
- single microphone
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2430/00—Signal processing covered by H04R, not provided for in its groups
Abstract
The present invention provides a single-microphone speech enhancement method using a combined perceptual gain function. First, the a priori signal-to-noise ratio (SNR) is estimated in the DFT domain with the decision-directed method. Second, the speech is enhanced with a combined gain function derived from a generalized Gamma prior and a weighted Euclidean distortion measure; because this gain function has no closed-form solution, it is represented by a combination of its numerical solutions. Finally, the inverse DFT is applied to the spectral components of the speech to obtain the enhanced speech in the time domain. In this way, the clean speech signal is recovered from the noisy speech.
Description
Technical field
The present invention relates to the field of speech enhancement, and more particularly to a single-microphone speech enhancement method using a combined perceptual gain function.
Background
In a speech processing system, the speech signal becomes noisy speech after being corrupted by noise of various kinds. The noisy speech passes through a speech enhancement module to produce enhanced speech, which can then undergo further processing. In practice, speech signals are subjected to various kinds of processing, for example in speech coding, speech recognition, and in-vehicle systems, and background noise can seriously degrade the performance of these systems. Speech enhancement based on statistical models has been a research focus for many years. Single-microphone speech enhancement is widely used because of its low complexity and simple hardware requirements. As the preprocessing module of a speech processing system, speech enhancement is an effective means of combating noise pollution, with the aim of suppressing noise and improving speech quality. The quality of the gain function directly affects the performance of speech enhancement. Compared with a Gaussian prior, a Gamma prior better matches the distribution of speech DFT amplitude coefficients. The auditory masking effect refers to the fact that the auditory system does not easily perceive quantization noise near formants; this property can be exploited to shape the error spectrum. A method that models the speech DFT amplitude coefficients with a generalized Gamma model and also takes the auditory masking effect into account is therefore of great value.
Summary of the invention
The present invention provides a single-microphone speech enhancement method using a combined perceptual gain function, which can recover the clean speech signal from noisy speech.
To achieve the above technical effect, the technical solution of the present invention is as follows:
A single-microphone speech enhancement method using a combined perceptual gain function, comprising the following steps:
S1: Obtain the noise power spectrum using MMSE-based unbiased noise power spectrum estimation;
S2: Estimate the a priori SNR using the decision-directed method;
S3: Compute the gain function according to the perceptual MMSE criterion with a generalized Gamma prior;
S4: Enhance the speech using the gain function.
Further, in step S1, under the additive noise model, S(k, i) and N(k, i) denote the speech signal and the noise signal, respectively, of the i-th spectral component of the k-th frame. After the discrete Fourier transform, the noisy speech signal is represented in the frequency domain as X(k, i) = S(k, i) + N(k, i). Let the power spectral density of the speech be σ_S²(k, i) = E[|S(k, i)|²] and that of the noise be σ_N²(k, i) = E[|N(k, i)|²]. The a priori SNR and the a posteriori SNR are then defined as ξ(k, i) = σ_S²(k, i) / σ_N²(k, i) and γ(k, i) = |X(k, i)|² / σ_N²(k, i), respectively, where E[·] is the expectation operator and the noise power spectrum σ_N²(k, i) is estimated using MMSE.
Further, in step S2 the a priori SNR is estimated using the decision-directed (DD) method, where P[·] denotes half-wave rectification, the speech term is the speech spectrum estimate of the previous frame, and β = 0.98.
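For orientation, the decision-directed estimator is commonly written in the form below. This is the textbook form and is given here as an assumption, because the original equation is not reproduced in this text; the symbols Â (speech amplitude estimate of the previous frame) and σ_N² (noise power spectrum) are notation chosen for this sketch.

```latex
\hat{\xi}(k,i) = \beta\,\frac{\hat{A}^{2}(k-1,i)}{\sigma_N^{2}(k-1,i)}
               + (1-\beta)\,P\big[\gamma(k,i) - 1\big],
\qquad P[x] = \max(x,\,0), \qquad \beta = 0.98
```

With β close to 1 the estimate leans heavily on the previous frame, which smooths the a priori SNR over time.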
Further, in step S3:
In the amplitude-frequency domain, X(k, i) = S(k, i) + N(k, i) is expressed in polar coordinates as R exp(jθ) = A exp(jφ) + D exp(jψ), where R, A, and D are the amplitudes of X, S, and N, respectively. The goal of amplitude-domain speech enhancement is to obtain an estimate Â of A.
The distribution of the speech DFT amplitude coefficients is modeled with a one-sided generalized Gamma model, where Γ(·) denotes the Gamma function, τ and v are the shape parameters of the Gamma distribution, and β is the scale parameter, whose expression is given for the case τ = 1.
The noise DFT coefficients are modeled with a Gaussian model, where I0(·) is the zero-order modified Bessel function of the first kind;
A perceptually weighted Euclidean distortion measure is defined, together with the corresponding risk function. Minimizing the risk function yields the amplitude estimator. When γ = 1, the resulting expression has no closed-form solution, so the Bessel function is approximated in order to solve it, where Υx(·) is the parabolic cylinder function of order x:
1) At low SNR, the Taylor series expansion of I0 at w = 0 is used;
2) At high SNR, the approximation of I0 for very large arguments is used.
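The two approximations named in items 1) and 2) can be compared against the exact function as sketched below. Truncating the Taylor series after the quadratic term and keeping only the leading asymptotic term are assumptions, since the exact expressions are not reproduced in this text; the function names are illustrative.

```python
import math

def bessel_i0(w, terms=60):
    """Modified Bessel function of the first kind, order zero, via its power series."""
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        # Ratio of consecutive series terms: (w/2)^2 / (m+1)^2
        term *= (w / 2.0) ** 2 / ((m + 1) ** 2)
    return total

def i0_small(w):
    """Taylor expansion around w = 0, truncated after the quadratic term (assumed order)."""
    return 1.0 + w * w / 4.0

def i0_large(w):
    """Leading-order approximation for large w."""
    return math.exp(w) / math.sqrt(2.0 * math.pi * w)

if __name__ == "__main__":
    for w in (0.1, 1.0, 5.0, 20.0):
        print(w, bessel_i0(w), i0_small(w), i0_large(w))
```

The small-argument form is accurate only near w = 0 and the large-argument form only for large w, which is why a gain built from them has to combine a low-SNR branch and a high-SNR branch.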
Compared with the prior art, the beneficial effects of the technical solution of the present invention are as follows:
The present invention first estimates the a priori SNR in the DFT domain with the decision-directed method. Second, it enhances the speech with a combined gain function derived from a generalized Gamma prior and a weighted Euclidean distortion measure; because this gain function has no closed-form solution, it is represented by a combination of its numerical solutions. Finally, the inverse DFT is applied to the spectral components of the speech to obtain the enhanced speech in the time domain. In this way, the clean speech signal can be effectively recovered from the noisy speech.
Brief description of the drawings
Fig. 1 is a block diagram of the single-microphone speech enhancement system in the DFT domain;
Fig. 2 shows the single-microphone speech enhancement process in the DFT domain;
Fig. 3 is a flow chart of the present invention;
Fig. 4 shows the perceptual MMSE gain function as a function of the instantaneous SNR.
Detailed description
The drawings are for illustrative purposes only and shall not be construed as limiting this patent;
To better illustrate the embodiment, certain components in the drawings may be omitted, enlarged, or reduced, and do not represent the dimensions of the actual product;
Those skilled in the art will understand that certain well-known structures and their descriptions may be omitted from the drawings.
The technical solution of the present invention is further described below with reference to the drawings and the embodiment.
Embodiment 1
Under the additive noise model, S(k, i) and N(k, i) denote the speech signal and the noise signal, respectively, of the i-th spectral component of the k-th frame. After the discrete Fourier transform, the noisy speech signal is represented in the frequency domain as X(k, i) = S(k, i) + N(k, i). Let the power spectral density of the speech be σ_S²(k, i) = E[|S(k, i)|²] and that of the noise be σ_N²(k, i) = E[|N(k, i)|²]. The a priori SNR and the a posteriori SNR are then defined as ξ(k, i) = σ_S²(k, i) / σ_N²(k, i) and γ(k, i) = |X(k, i)|² / σ_N²(k, i), respectively, where E[·] is the expectation operator and the noise power spectrum σ_N²(k, i) is estimated using MMSE.
The a priori SNR is estimated using the DD method, where P[·] denotes half-wave rectification and the speech term is the speech spectrum estimate of the previous frame. In general, β = 0.98.
For simplicity of notation, the frame index k and the frequency index i are omitted below. In the amplitude-frequency domain, X(k, i) = S(k, i) + N(k, i) is expressed in polar coordinates as R exp(jθ) = A exp(jφ) + D exp(jψ), where R, A, and D are the amplitudes of X, S, and N, respectively. The goal of amplitude-domain speech enhancement is to obtain an estimate Â of A.
The distribution of the speech DFT amplitude coefficients is modeled with a one-sided generalized Gamma model, where Γ(·) denotes the Gamma function, τ and v are the shape parameters of the Gamma distribution, and β is the scale parameter, whose expression is given for the case τ = 1.
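As an illustration of such a prior, the sketch below uses the commonly cited one-sided generalized Gamma density p(A) = τ β^v / Γ(v) · A^(τv − 1) · exp(−β A^τ) for A > 0. Both this form and the τ = 1 scale rule β = sqrt(v (v + 1) / σ_S²), which follows from requiring E[A²] = σ_S², are assumptions, since the source equations are not reproduced in this text; the function names are hypothetical.

```python
import math

def gen_gamma_pdf(a, tau, v, beta):
    """One-sided generalized Gamma density (assumed standard form), valid for a > 0."""
    norm = tau * beta ** v / math.gamma(v)
    return norm * a ** (tau * v - 1.0) * math.exp(-beta * a ** tau)

def beta_for_tau1(v, speech_psd):
    """Scale parameter for tau = 1, chosen so that E[A^2] equals speech_psd."""
    return math.sqrt(v * (v + 1.0) / speech_psd)

if __name__ == "__main__":
    b = beta_for_tau1(0.7, 1.0)
    print(b, gen_gamma_pdf(0.5, 1.0, 0.7, b))
```

For v < 1 and τ = 1 the density is unbounded near A = 0, which places much of the prior mass on small speech amplitudes.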
The noise DFT coefficients are modeled with a Gaussian model, where I0(·) is the zero-order modified Bessel function of the first kind.
A perceptually weighted Euclidean distortion measure is defined, together with the corresponding risk function. Minimizing the risk function yields the amplitude estimator. When γ = 1, the resulting expression has no closed-form solution, so the Bessel function is approximated in order to solve it, where Υx(·) is the parabolic cylinder function of order x:
1) At low SNR, the Taylor series expansion of I0 at w = 0 is used;
2) At high SNR, the approximation of I0 for very large arguments is used.
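Because the gain has no closed form, it can also be evaluated by direct numerical integration. The sketch below is not the patent's formula. It assumes the weighted Euclidean distortion d(A, Â) = A^p (A − Â)², whose risk is minimized by Â = E[A^(p+1) | R] / E[A^p | R], a generalized Gamma prior with τ = 1, a Rician likelihood for R given A, and the normalization σ_N² = 1 (so R² = γ and σ_S² = ξ). All function names are hypothetical.

```python
import math

def i0_scaled(x):
    """exp(-x) * I0(x) for x >= 0: power series for small x, asymptotic series otherwise."""
    if x < 20.0:
        total, term, m = 1.0, 1.0, 0
        while term > 1e-17 * total:
            m += 1
            term *= (x / 2.0) ** 2 / (m * m)
            total += term
        return math.exp(-x) * total
    # Large-argument expansion of I0, truncated after three terms.
    return (1.0 + 1.0 / (8.0 * x) + 9.0 / (128.0 * x * x)) / math.sqrt(2.0 * math.pi * x)

def perceptual_gain(xi, gamma, p=-0.1, v=0.7, n=4000):
    """Gain G = A_hat / R by midpoint-rule integration (requires xi > 0, gamma > 0)."""
    r = math.sqrt(gamma)                    # noisy amplitude when sigma_N^2 = 1
    beta = math.sqrt(v * (v + 1.0) / xi)    # prior scale for tau = 1
    h = (r + 6.0) / n                       # likelihood is negligible beyond r + 6
    num = den = 0.0
    for j in range(n):
        a = (j + 0.5) * h                   # midpoints avoid the singularity at a = 0
        prior = a ** (v - 1.0) * math.exp(-beta * a)
        like = math.exp(-(r - a) ** 2) * i0_scaled(2.0 * a * r)
        weight = prior * like               # unnormalized posterior of A given R
        num += a ** (p + 1.0) * weight
        den += a ** p * weight
    return num / (den * r)

if __name__ == "__main__":
    print(perceptual_gain(100.0, 100.0), perceptual_gain(0.01, 1.0))
```

The constants that do not depend on A cancel in the ratio, so neither the prior nor the likelihood needs to be normalized. The midpoint rule converges slowly near A = 0 when v < 1, so this is adequate for illustration rather than for tabulating production gains.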
Fig. 1 is a block diagram of the single-microphone speech enhancement system in the DFT domain. Fig. 2 shows the details of the per-frame, per-frequency-bin processing in Fig. 1, that is, the single-microphone speech enhancement process in the DFT domain. Fig. 3 is the flow chart of a specific implementation of the present invention.
First, the noisy speech signal is sampled (sampling frequency 8000 Hz), divided into frames (140*129), windowed (50% overlap), and transformed to the frequency domain by the DFT. The unbiased noise power spectrum is estimated with the MMSE method.
Second, the a posteriori SNR γ and the a priori SNR ξ are computed from their respective formulas, with β = 0.98.
Next, the amplitude and the phase are separated, and the gain function in the amplitude domain is computed from the a posteriori SNR and the a priori SNR. To do this, the a priori SNR ξ and the a posteriori SNR γ are each taken over a range of values (−40 dB to 50 dB in 1 dB steps), and the gain function is computed in advance and stored in a table (91*91). In a specific case, the gain value corresponding to a given pair of a priori and a posteriori SNRs is obtained by table lookup. Here p = −0.1, and the recommended value of v is 0.7.
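The table-lookup scheme can be sketched as below. The grid (−40 dB to 50 dB in 1 dB steps, giving 91 x 91 entries) follows the text; the nearest-dB rounding, the clamping of out-of-range SNRs, and the function names are assumptions. The Wiener gain used in the usage example is only a stand-in for the perceptual gain function.

```python
import math

def build_gain_table(gain_fn, lo_db=-40, hi_db=50):
    """Tabulate gain_fn(xi, gamma) on a 1 dB grid; rows index xi, columns index gamma."""
    grid = [10.0 ** (db / 10.0) for db in range(lo_db, hi_db + 1)]
    return [[gain_fn(xi, gamma) for gamma in grid] for xi in grid]

def lookup_gain(table, xi, gamma, lo_db=-40, hi_db=50):
    """Nearest-dB lookup, clamping SNRs that fall outside the tabulated range."""
    def index(snr):
        db = 10.0 * math.log10(max(snr, 1e-12))
        return min(max(int(round(db)) - lo_db, 0), hi_db - lo_db)
    return table[index(xi)][index(gamma)]

if __name__ == "__main__":
    # Stand-in gain (Wiener), used here only to exercise the table.
    table = build_gain_table(lambda xi, gamma: xi / (1.0 + xi))
    print(len(table), len(table[0]), lookup_gain(table, 1.0, 1.0))
```

Precomputing the table moves the expensive gain evaluation offline, so the per-bin cost at run time is two logarithms and one indexed read.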
Fig. 4 shows the gain function as a function of the instantaneous SNR.
Then, the spectral gain is applied to the noisy speech signal, and the amplitude and the phase are recombined to obtain the frequency-domain representation of the speech.
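For a single frequency bin this step can be sketched as follows, assuming the usual convention that the gain scales the noisy amplitude R and the noisy phase θ is reused unchanged; the source equation is not reproduced in this text, and the function name is hypothetical.

```python
import cmath

def apply_gain(noisy_bin, gain):
    """Scale the noisy amplitude by the gain and reattach the noisy phase."""
    amplitude, phase = cmath.polar(noisy_bin)    # R and theta of X = R exp(j theta)
    return cmath.rect(gain * amplitude, phase)   # (G R) exp(j theta)

if __name__ == "__main__":
    print(apply_gain(3 + 4j, 0.5))
```

Because the phase is left untouched, the operation is equivalent to multiplying the complex bin by the real-valued gain.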
Finally, the inverse Fourier transform is applied to the resulting speech spectrum, the windowing is removed, and the frames are merged (17967*1). The time-domain representation of the speech can then be output, and subjective and objective listening tests can be performed on it.
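Merging the frames is typically done by overlap-add, sketched below. The approach is an assumption, since the source does not detail how the window is removed; with a periodic Hann analysis window and 50% overlap the shifted windows sum to one, so plain addition restores the interior samples. The function name is hypothetical.

```python
def overlap_add(frames, hop, length):
    """Sum frames placed at multiples of hop and trim to the original signal length."""
    frame_len = len(frames[0])
    out = [0.0] * ((len(frames) - 1) * hop + frame_len)
    for k, frame in enumerate(frames):
        for n, value in enumerate(frame):
            out[k * hop + n] += value
    return out[:length]

if __name__ == "__main__":
    print(overlap_add([[1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0]], hop=2, length=6))
```

The first and last half-frames are covered by only one window and are therefore attenuated; padding the signal before framing is a common way to avoid this.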
The same or similar reference numerals correspond to the same or similar components;
The positional relationships described in the drawings are for illustrative purposes only and shall not be construed as limiting this patent;
Obviously, the above embodiment of the present invention is merely an example given to illustrate the invention clearly and is not a limitation on the embodiments of the present invention. Those of ordinary skill in the art may make other variations or modifications in different forms on the basis of the above description. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the claims of the present invention.
Claims (4)
1. A single-microphone speech enhancement method using a combined perceptual gain function, characterized by comprising the following steps:
S1: Obtain the noise power spectrum using MMSE-based unbiased noise power spectrum estimation;
S2: Estimate the a priori SNR using the decision-directed method;
S3: Compute the gain function according to the perceptual MMSE criterion with a generalized Gamma prior;
S4: Enhance the speech using the gain function.
2. The single-microphone speech enhancement method using a combined perceptual gain function according to claim 1, characterized in that, in step S1, under the additive noise model, S(k, i) and N(k, i) denote the speech signal and the noise signal, respectively, of the i-th spectral component of the k-th frame. After the discrete Fourier transform, the noisy speech signal is represented in the frequency domain as X(k, i) = S(k, i) + N(k, i). Let the power spectral density of the speech be σ_S²(k, i) = E[|S(k, i)|²] and that of the noise be σ_N²(k, i) = E[|N(k, i)|²]. The a priori SNR and the a posteriori SNR are then defined as ξ(k, i) = σ_S²(k, i) / σ_N²(k, i) and γ(k, i) = |X(k, i)|² / σ_N²(k, i), respectively, where E[·] is the expectation operator and the noise power spectrum σ_N²(k, i) is estimated using MMSE.
3. The single-microphone speech enhancement method using a combined perceptual gain function according to claim 2, characterized in that, in step S2, the a priori SNR is estimated using the DD method, where P[·] denotes half-wave rectification, the speech term is the speech spectrum estimate of the previous frame, and β = 0.98.
4. The single-microphone speech enhancement method using a combined perceptual gain function according to claim 3, characterized in that, in step S3:
In the amplitude-frequency domain, X(k, i) = S(k, i) + N(k, i) is expressed in polar coordinates as R exp(jθ) = A exp(jφ) + D exp(jψ), where R, A, and D are the amplitudes of X, S, and N, respectively. The goal of amplitude-domain speech enhancement is to obtain an estimate Â of A.
The distribution of the speech DFT amplitude coefficients is modeled with a one-sided generalized Gamma model, where Γ(·) denotes the Gamma function, τ and v are the shape parameters of the Gamma distribution, and β is the scale parameter, whose expression is given for the case τ = 1.
The noise DFT coefficients are modeled with a Gaussian model, where I0(·) is the zero-order modified Bessel function of the first kind;
A perceptually weighted Euclidean distortion measure is defined, together with the corresponding risk function. Minimizing the risk function yields the amplitude estimator. When γ = 1, the resulting expression has no closed-form solution, so the Bessel function is approximated in order to solve it, where Υx(·) is the parabolic cylinder function of order x:
1) At low SNR, the Taylor series expansion of I0 at w = 0 is used;
2) At high SNR, the approximation of I0 for very large arguments is used.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710227956.9A CN108696791A (en) | 2017-04-10 | 2017-04-10 | A kind of combination perception gain function sound enhancement method of single microphone |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710227956.9A CN108696791A (en) | 2017-04-10 | 2017-04-10 | A kind of combination perception gain function sound enhancement method of single microphone |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108696791A true CN108696791A (en) | 2018-10-23 |
Family
ID=63843280
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710227956.9A Pending CN108696791A (en) | 2017-04-10 | 2017-04-10 | A kind of combination perception gain function sound enhancement method of single microphone |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108696791A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112767962A (en) * | 2021-03-01 | 2021-05-07 | 北京电信易通信息技术股份有限公司 | Voice enhancement method and system |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102568491A (en) * | 2010-12-14 | 2012-07-11 | 联芯科技有限公司 | Noise suppression method and equipment |
CN105338450A (en) * | 2015-09-23 | 2016-02-17 | 苏州科达科技股份有限公司 | Residual echo inhibition method and device |
-
2017
- 2017-04-10 CN CN201710227956.9A patent/CN108696791A/en active Pending
Non-Patent Citations (1)
Title |
---|
赵改华等: "修正的基于广义Gamma语音模型语音增强算法", 《计算机工程与应用》 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112767962A (en) * | 2021-03-01 | 2021-05-07 | 北京电信易通信息技术股份有限公司 | Voice enhancement method and system |
CN112767962B (en) * | 2021-03-01 | 2021-08-03 | 北京电信易通信息技术股份有限公司 | Voice enhancement method and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109643554B (en) | Adaptive voice enhancement method and electronic equipment | |
Mittal et al. | Signal/noise KLT based approach for enhancing speech degraded by colored noise | |
US7313518B2 (en) | Noise reduction method and device using two pass filtering | |
CN108831499A (en) | Utilize the sound enhancement method of voice existing probability | |
CN105489226A (en) | Wiener filtering speech enhancement method for multi-taper spectrum estimation of pickup | |
CN106875938A (en) | A kind of improved nonlinear adaptive sound end detecting method | |
Yang et al. | A noise reduction method based on LMS adaptive filter of audio signals | |
CN110808057A (en) | Voice enhancement method for generating confrontation network based on constraint naive | |
CN108711432A (en) | A kind of sound enhancement method of the perception gain function of single microphone | |
CN108696791A (en) | A kind of combination perception gain function sound enhancement method of single microphone | |
Hamid et al. | Speech enhancement using EMD based adaptive soft-thresholding (EMD-ADT) | |
Rao et al. | Speech enhancement using sub-band cross-correlation compensated Wiener filter combined with harmonic regeneration | |
Li et al. | Noisy speech enhancement based on discrete sine transform | |
CN113066483B (en) | Sparse continuous constraint-based method for generating countermeasure network voice enhancement | |
Wei et al. | A novel prewhitening subspace method for enhancing speech corrupted by colored noise | |
Surendran et al. | Perceptual subspace speech enhancement with variance normalization | |
CN113593590A (en) | Method for suppressing transient noise in voice | |
Zheng et al. | SURE-MSE speech enhancement for robust speech recognition | |
Yan et al. | A signal subspace speech enhancement method for various noises | |
Khalil et al. | Enhancement of speech signals using multiple statistical models | |
Badiezadegan et al. | A wavelet-based data imputation approach to spectrogram reconstruction for robust speech recognition | |
Zehtabian et al. | Optimized singular vector denoising approach for speech enhancement | |
Indumathi et al. | Noise estimation using standard deviation of the frequency magnitude spectrum for mixed non-stationary noise | |
Guo et al. | Speaker recognition based on wavelet packet decomposition and Volterra adaptive model | |
Chen | Research on Single Channel Speech Noise Reduction Algorithm Based on Signal Processing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20181023 |