CN108696791A - Single-microphone speech enhancement method using a combined perceptual gain function - Google Patents

Single-microphone speech enhancement method using a combined perceptual gain function

Info

Publication number
CN108696791A
Authority
CN
China
Prior art keywords
voice
gain function
noise
combination
single microphone
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710227956.9A
Other languages
Chinese (zh)
Inventor
谭洪舟
李竺珊
李宇
农革
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
SYSU CMU Shunde International Joint Research Institute
National Sun Yat Sen University
Original Assignee
SYSU CMU Shunde International Joint Research Institute
National Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SYSU CMU Shunde International Joint Research Institute and National Sun Yat Sen University
Priority to CN201710227956.9A
Publication of CN108696791A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2430/00 Signal processing covered by H04R, not provided for in its groups

Abstract

The present invention provides a single-microphone speech enhancement method using a combined perceptual gain function. First, the a priori signal-to-noise ratio (SNR) is estimated in the DFT domain with the decision-directed method. Second, the speech is enhanced with a combined gain function derived from a generalized Gamma prior and a weighted Euclidean distortion measure; in this case the gain function has no closed-form solution, so a combination of its numerical solutions is used. Finally, the inverse DFT is applied to the spectral components of the speech to obtain the enhanced speech in the time domain. In this way, the clean speech signal is recovered from noisy speech.

Description

Single-microphone speech enhancement method using a combined perceptual gain function
Technical field
The present invention relates to the field of speech enhancement, and more particularly to a single-microphone speech enhancement method using a combined perceptual gain function.
Background
In a speech processing system, the speech signal becomes noisy speech after being corrupted by various kinds of noise. The noisy speech passes through a speech enhancement module to produce enhanced speech, which can then undergo further processing. In practice, speech signals are used in many applications, such as speech coding, speech recognition, and in-vehicle systems, and background noise can seriously degrade the performance of these systems. For many years, speech enhancement based on statistical models has been a research focus. Single-microphone speech enhancement is widely used because of its low complexity and simple hardware requirements. As a preprocessing module of a speech processing system, speech enhancement is an effective means of combating noise contamination, with the aim of suppressing noise and improving speech quality. The quality of the gain function directly affects the performance of speech enhancement. Compared with a Gaussian prior, a Gamma prior better matches the distribution of the speech DFT amplitude coefficients. The auditory masking effect means that the auditory system does not easily perceive quantization noise near the formants; this property can be used to shape the error spectrum. Therefore, a method that models the speech DFT amplitude coefficients with a generalized Gamma model and also takes the auditory masking effect into account is of great value.
Summary of the invention
The present invention provides a single-microphone speech enhancement method using a combined perceptual gain function; the method recovers the clean speech signal from noisy speech.
To achieve the above technical effect, the technical solution of the present invention is as follows:
A single-microphone speech enhancement method using a combined perceptual gain function, comprising the following steps:
S1: Estimate the noise power spectrum using MMSE-based unbiased noise power estimation;
S2: Estimate the a priori SNR using the decision-directed (DD) method;
S3: Compute the gain function according to the perceptual MMSE criterion with a generalized Gamma prior;
S4: Enhance the speech using the gain function.
Further, in step S1, under the additive noise model, S(k, i) and N(k, i) denote the speech signal and the noise signal, respectively, of the i-th spectral component of the k-th frame. After the discrete Fourier transform, the noisy speech signal is represented in the frequency domain as X(k, i) = S(k, i) + N(k, i). The power spectral densities of the speech and of the noise are defined with the expectation operator E[·], and the a priori SNR and the a posteriori SNR are defined from them. The noise power spectrum is estimated using MMSE.
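The formulas referred to in this paragraph are not present in this text. As a hedged reference, the definitions below are the ones conventionally used in statistical speech enhancement; the symbols ξ (a priori SNR) and γ (a posteriori SNR) match the embodiment, while the σ notation for the power spectral densities is an assumption.

```latex
\sigma_S^2(k,i) = E\!\left[\,|S(k,i)|^2\,\right], \qquad
\sigma_N^2(k,i) = E\!\left[\,|N(k,i)|^2\,\right]

\xi(k,i) = \frac{\sigma_S^2(k,i)}{\sigma_N^2(k,i)}, \qquad
\gamma(k,i) = \frac{|X(k,i)|^2}{\sigma_N^2(k,i)}
```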
Further, in step S2 the a priori SNR is estimated using the DD method,
where P[·] denotes half-wave rectification, the estimate uses the speech spectrum estimate of the previous frame, and β = 0.98.
Further, in step S3:
In the amplitude-frequency domain, X(k, i) = S(k, i) + N(k, i) is expressed in polar coordinates as
R exp(jθ) = A exp(jφ) + D exp(jψ), where R, A, and D are the amplitude coefficients of X, S, and N, respectively. The goal of amplitude-domain speech enhancement is to obtain an estimate of A.
The distribution of the speech DFT amplitude coefficients is modeled with a one-sided generalized Gamma model,
where Γ(·) denotes the Gamma function, τ and v are the shape parameters of the Gamma distribution, and β is the scale parameter, for which an expression is given for the case τ = 1.
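For orientation, the one-sided generalized Gamma density is usually written in the form below. This is an assumption based on the standard form of that prior, since the patent's own formula is missing from this text; the symbols τ, v, and β follow the paragraph above, and σ_S² denotes the speech power spectral density. The expression for β when τ = 1 follows from requiring E[A²] = σ_S².

```latex
p(A) = \frac{\tau\,\beta^{v}}{\Gamma(v)}\, A^{\tau v - 1}
       \exp\!\left(-\beta A^{\tau}\right), \qquad A \ge 0,\ \beta > 0,\ \tau > 0,\ v > 0

\tau = 1:\quad E[A^2] = \frac{v(v+1)}{\beta^2} = \sigma_S^2
\;\;\Longrightarrow\;\; \beta = \frac{\sqrt{v(v+1)}}{\sigma_S}
```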
The noise DFT coefficients are modeled with a Gaussian model,
where I0(·) is the zeroth-order Bessel function;
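Under a complex Gaussian noise model, the density of the noisy amplitude R given the speech amplitude A takes the well-known Rician form sketched below, which is where I0 enters. This is the standard result rather than a transcription of the patent's formula; here I0 is the modified Bessel function of the first kind of order zero, and σ_N² is the noise power spectral density (notation assumed).

```latex
p(R \mid A) = \frac{2R}{\sigma_N^2}
  \exp\!\left(-\frac{R^2 + A^2}{\sigma_N^2}\right)
  I_0\!\left(\frac{2AR}{\sigma_N^2}\right), \qquad R \ge 0
```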
The perceptual distortion measure is the weighted Euclidean distortion measure, from which the risk function is formed.
Minimizing the risk function gives the estimate of A.
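A common form of the weighted Euclidean distortion measure and of the resulting estimator is sketched below. It is an assumption that the patent uses exactly this form; the exponent p is the perceptual weighting parameter (the embodiment uses p = −0.1), and the estimator follows by setting the derivative of the risk with respect to the estimate to zero.

```latex
d_{\mathrm{WE}}(A,\hat{A}) = A^{p}\,(A - \hat{A})^{2}, \qquad
\mathcal{R} = E\!\left[\,A^{p}\,(A - \hat{A})^{2} \,\middle|\, R\,\right]

\frac{\partial \mathcal{R}}{\partial \hat{A}} = 0
\;\;\Longrightarrow\;\;
\hat{A} = \frac{E\!\left[A^{p+1} \mid R\right]}{E\!\left[A^{p} \mid R\right]}
```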
It then follows that, when γ = 1, the estimator has no closed-form solution, so the Bessel function is approximated in order to obtain a solution, where Υx(·) denotes the parabolic cylinder function of order x:
1) At low SNR, the Taylor series expansion of I0 about w = 0 is used;
2) At high SNR, the approximation of I0 for large arguments is used.
Compared with the prior art, the technical solution of the present invention has the following advantageous effects:
The present invention estimates the a priori SNR in the DFT domain with the decision-directed method. It then enhances the speech with a combined gain function derived from a generalized Gamma prior and a weighted Euclidean distortion measure; in this case the gain function has no closed-form solution, so a combination of its numerical solutions is used. Finally, the inverse DFT is applied to the spectral components of the speech to obtain the enhanced speech in the time domain. In this way, the clean speech signal can be effectively recovered from noisy speech.
Description of the drawings
Fig. 1 shows the single-microphone speech enhancement system in the DFT domain;
Fig. 2 shows the single-microphone speech enhancement processing procedure in the DFT domain;
Fig. 3 is the flow chart of the present invention;
Fig. 4 shows the perceptual MMSE gain function as a function of the instantaneous SNR.
Detailed description
The drawings are for illustrative purposes only and shall not be construed as limiting this patent;
To better illustrate the embodiment, some components in the drawings may be omitted, enlarged, or reduced, and do not represent the dimensions of the actual product;
Those skilled in the art will understand that some well-known structures and their descriptions may be omitted from the drawings.
The technical solution of the present invention is further described below with reference to the drawings and an embodiment.
Embodiment 1
Under the additive noise model, S(k, i) and N(k, i) denote the speech signal and the noise signal, respectively, of the i-th spectral component of the k-th frame. After the discrete Fourier transform, the noisy speech signal is represented in the frequency domain as
X(k, i) = S(k, i) + N(k, i). The power spectral densities of the speech and of the noise are defined with the expectation operator E[·], and the a priori SNR and the a posteriori SNR are defined from them. The noise power spectrum is estimated using MMSE.
The a priori SNR is estimated using the DD method, where P[·] denotes half-wave rectification and the estimate uses the speech spectrum estimate of the previous frame. In general, β = 0.98.
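The decision-directed rule is commonly written as ξ(k, i) = β·|Ŝ(k−1, i)|²/σ_N² + (1−β)·P[γ(k, i) − 1]. The sketch below assumes that standard form, because the patent's own formula is missing from this text. The function name, the argument names, the use of the current frame's noise estimate in both terms, and the lower bound `xi_min` are illustrative assumptions.

```python
import numpy as np

def decision_directed_xi(noisy_power, noise_psd, prev_clean_power,
                         beta=0.98, xi_min=1e-3):
    """Estimate the a priori SNR for one frame (all inputs are per-bin arrays).

    noisy_power      : |X(k, i)|^2 of the current frame
    noise_psd        : estimated noise power spectrum
    prev_clean_power : |S_hat(k-1, i)|^2, enhanced speech power of the previous frame
    xi_min           : hypothetical floor to keep the estimate positive
    """
    gamma = noisy_power / noise_psd                 # a posteriori SNR
    rectified = np.maximum(gamma - 1.0, 0.0)        # half-wave rectification P[.]
    xi = beta * prev_clean_power / noise_psd + (1.0 - beta) * rectified
    return np.maximum(xi, xi_min), gamma
```

With β = 0.98 the estimate leans heavily on the previous frame, which smooths the a priori SNR across frames at the cost of slower tracking of speech onsets.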
For brevity, the frame index k and the frequency index i are omitted below. In the amplitude-frequency domain, X(k, i) = S(k, i) + N(k, i) is expressed in polar coordinates as R exp(jθ) = A exp(jφ) + D exp(jψ), where R, A, and D are the amplitude coefficients of X, S, and N, respectively. The goal of amplitude-domain speech enhancement is to obtain an estimate of A.
The distribution of the speech DFT amplitude coefficients is modeled with a one-sided generalized Gamma model,
where Γ(·) denotes the Gamma function, τ and v are the shape parameters of the Gamma distribution, and β is the scale parameter, for which an expression is given for the case τ = 1.
The noise DFT coefficients are modeled with a Gaussian model,
where I0(·) is the zeroth-order Bessel function.
The perceptual distortion measure is the weighted Euclidean distortion measure, from which the risk function is formed. Minimizing the risk function gives the estimate of A. It then follows that, when γ = 1, the estimator has no closed-form solution, so the Bessel function is approximated in order to obtain a solution, where Υx(·) denotes the parabolic cylinder function of order x:
1) At low SNR, the Taylor series expansion of I0 about w = 0 is used;
2) At high SNR, the approximation of I0 for large arguments is used.
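The two approximations of I0 can be checked numerically. The sketch below assumes the textbook forms, namely the Taylor series truncated after the quadratic term, I0(w) ≈ 1 + w²/4, and the leading large-argument term, I0(w) ≈ e^w / √(2πw); the truncation order actually used in the patent is not recoverable from this text. `np.i0` is NumPy's modified Bessel function of the first kind of order zero, and the helper names are illustrative.

```python
import numpy as np

def i0_small(w):
    """Taylor expansion of I0 about w = 0, truncated after the quadratic term."""
    return 1.0 + w ** 2 / 4.0

def i0_large(w):
    """Leading-order asymptotic form of I0 for large arguments."""
    return np.exp(w) / np.sqrt(2.0 * np.pi * w)

# Relative error of each approximation over a few arguments.
for w in (0.1, 0.5, 2.0, 5.0, 20.0):
    exact = np.i0(w)
    print(f"w={w:5.1f}  small-arg error={abs(i0_small(w) / exact - 1):.2e}  "
          f"large-arg error={abs(i0_large(w) / exact - 1):.2e}")
```

Each form is accurate only in its own regime, which is the motivation for combining a low-SNR and a high-SNR gain expression instead of using one of them everywhere.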
Fig. 1 is a block diagram of the single-microphone speech enhancement system in the DFT domain. Fig. 2 shows the details of the per-frame, per-frequency-bin processing in Fig. 1, i.e. the single-microphone speech enhancement processing procedure in the DFT domain. Fig. 3 is the flow chart of a specific implementation of the present invention.
First, the noisy speech signal is sampled (sampling frequency 8000 Hz), divided into frames (140×129), windowed (50% overlap), and transformed to the frequency domain by the DFT. The unbiased noise power spectrum is estimated with the MMSE method.
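The analysis stage can be sketched as follows. The frame length of 256 samples is an inference from the 129 frequency bins (129 = 256/2 + 1), and the Hann window is an assumption, since the text gives only the 50% overlap; the function and constant names are illustrative.

```python
import numpy as np

FS = 8000                # sampling frequency in Hz (from the text)
FRAME_LEN = 256          # assumed: 129 one-sided bins imply a 256-point DFT
HOP = FRAME_LEN // 2     # 50% overlap

def stft_frames(x, frame_len=FRAME_LEN, hop=HOP):
    """Split x into overlapping windowed frames and return their one-sided DFTs."""
    window = np.hanning(frame_len + 1)[:-1]      # periodic Hann window (assumed)
    n_frames = int(np.ceil(max(len(x) - frame_len, 0) / hop)) + 1
    padded = np.zeros((n_frames - 1) * hop + frame_len)
    padded[:len(x)] = x                          # zero-pad the final frame
    frames = np.stack([padded[i * hop:i * hop + frame_len]
                       for i in range(n_frames)])
    return np.fft.rfft(frames * window, axis=1)  # shape (n_frames, frame_len//2 + 1)
```

Under these assumptions a 17967-sample signal, the length quoted for the output of the embodiment, yields a 140×129 spectrum, which agrees with the frame dimensions given above.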
Second, the a posteriori SNR γ and the a priori SNR ξ are each computed from their respective formulas, with β = 0.98.
Third, the amplitude and the phase are separated, and the gain function in the amplitude-frequency domain is computed from the a posteriori SNR and the a priori SNR.
The a priori SNR ξ and the a posteriori SNR γ are first taken over a range of values (−40 dB to 50 dB in 1 dB steps), and the gain function is precomputed and stored as a table (91×91). For any particular pair of a priori and a posteriori SNRs, the corresponding gain value is then obtained by table lookup. Here p = −0.1, and the recommended value of v is 0.7.
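The table-lookup scheme can be sketched as below. The grid (−40 dB to 50 dB in 1 dB steps, 91×91 entries) follows the text; the nearest-grid-point lookup is an assumption. The patent's combined perceptual gain is not reproduced here, so `wiener_gain` is only a stand-in, explicitly not the patented gain, used to make the example runnable. All function names are hypothetical.

```python
import numpy as np

SNR_DB = np.arange(-40, 51, 1)          # 91 grid points, 1 dB spacing

def build_gain_table(gain_fn):
    """Precompute gain_fn(xi, gamma) on the 91x91 dB grid (rows: xi, columns: gamma)."""
    snr_lin = 10.0 ** (SNR_DB / 10.0)
    xi_grid, gamma_grid = np.meshgrid(snr_lin, snr_lin, indexing="ij")
    return gain_fn(xi_grid, gamma_grid)

def lookup_gain(table, xi, gamma):
    """Nearest-grid-point lookup; SNRs outside the grid are clipped to its edges."""
    def to_index(snr):
        snr_db = 10.0 * np.log10(np.maximum(snr, 1e-12))
        return np.clip(np.rint(snr_db).astype(int) + 40, 0, len(SNR_DB) - 1)
    return table[to_index(xi), to_index(gamma)]

def wiener_gain(xi, gamma):
    """Placeholder gain (Wiener filter); it ignores gamma and is NOT the patent's gain."""
    return xi / (1.0 + xi) + 0.0 * gamma

table = build_gain_table(wiener_gain)
```

Precomputing the table moves the evaluation of special functions such as the parabolic cylinder function out of the per-frame loop, so the run-time cost per bin is a single array lookup.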
Fig. 4 shows the gain function as a function of the instantaneous SNR.
Then, the spectral gain is applied to the noisy speech signal, and the amplitude and the phase are recombined to obtain the frequency-domain expression of the speech.
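A minimal sketch of this step, assuming the usual convention that the gain scales the noisy amplitude while the noisy phase is reused unchanged; the function name is illustrative.

```python
import numpy as np

def apply_gain(noisy_spec, gain):
    """Scale the noisy amplitude by the gain and recombine it with the noisy phase."""
    amplitude = np.abs(noisy_spec)       # R
    phase = np.angle(noisy_spec)         # theta
    enhanced_amplitude = gain * amplitude
    return enhanced_amplitude * np.exp(1j * phase)
```

Because the gain is real and non-negative, this is numerically the same as multiplying the complex spectrum by the gain; the explicit split mirrors the amplitude and phase separation described in the text.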
Finally, the inverse Fourier transform is applied to the processed signal, the window is removed, and the frames are merged (17967×1), which yields the time-domain expression of the speech; subjective and objective listening tests can then be performed on it.
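The synthesis stage can be sketched with plain overlap-add. This assumes an analysis window that sums to one at 50% overlap (for example a periodic Hann window), so that no separate window-removal step is needed; how the patent actually removes the window is not specified, and the function name is illustrative.

```python
import numpy as np

def overlap_add(spec, hop, out_len):
    """Inverse one-sided DFT of each frame, then overlap-add into one signal."""
    frames = np.fft.irfft(spec, axis=1)          # back to time-domain frames
    frame_len = frames.shape[1]
    out = np.zeros((len(frames) - 1) * hop + frame_len)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + frame_len] += frame
    return out[:out_len]                         # trim padding, e.g. to 17967 samples
```

With unit gain, the samples covered by two overlapping frames are reconstructed exactly; the first and last half-frames are attenuated by the window unless the signal is padded at both ends.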
Identical or similar reference numerals correspond to identical or similar components;
The positional relationships depicted in the drawings are for illustration only and shall not be construed as limiting this patent;
Obviously, the above embodiment is merely an example given to clearly illustrate the present invention and does not limit the embodiments of the present invention. Those of ordinary skill in the art may make other changes or variations in different forms on the basis of the above description. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the claims of the present invention.

Claims (4)

1. A single-microphone speech enhancement method using a combined perceptual gain function, characterized by comprising the following steps:
S1: Estimate the noise power spectrum using MMSE-based unbiased noise power estimation;
S2: Estimate the a priori SNR using the decision-directed (DD) method;
S3: Compute the gain function according to the perceptual MMSE criterion with a generalized Gamma prior;
S4: Enhance the speech using the gain function.
2. The single-microphone speech enhancement method using a combined perceptual gain function according to claim 1, characterized in that, in step S1, under the additive noise model, S(k, i) and N(k, i) denote the speech signal and the noise signal, respectively, of the i-th spectral component of the k-th frame. After the discrete Fourier transform, the noisy speech signal is represented in the frequency domain as X(k, i) = S(k, i) + N(k, i). The power spectral densities of the speech and of the noise are defined with the expectation operator E[·], and the a priori SNR and the a posteriori SNR are defined from them. The noise power spectrum is estimated using MMSE.
3. The single-microphone speech enhancement method using a combined perceptual gain function according to claim 2, characterized in that, in step S2, the a priori SNR is estimated using the DD method,
where P[·] denotes half-wave rectification, the estimate uses the speech spectrum estimate of the previous frame, and β = 0.98.
4. The single-microphone speech enhancement method using a combined perceptual gain function according to claim 3, characterized in that, in step S3:
In the amplitude-frequency domain, X(k, i) = S(k, i) + N(k, i) is expressed in polar coordinates as
R exp(jθ) = A exp(jφ) + D exp(jψ), where R, A, and D are the amplitude coefficients of X, S, and N, respectively. The goal of amplitude-domain speech enhancement is to obtain an estimate of A.
The distribution of the speech DFT amplitude coefficients is modeled with a one-sided generalized Gamma model,
where Γ(·) denotes the Gamma function, τ and v are the shape parameters of the Gamma distribution, and β is the scale parameter, for which an expression is given for the case τ = 1.
The noise DFT coefficients are modeled with a Gaussian model,
where I0(·) is the zeroth-order Bessel function;
The perceptual distortion measure is the weighted Euclidean distortion measure, from which the risk function is formed. Minimizing the risk function gives the estimate of A. It then follows that, when γ = 1, the estimator has no closed-form solution, so the Bessel function is approximated in order to obtain a solution, where Υx(·) denotes the parabolic cylinder function of order x:
1) At low SNR, the Taylor series expansion of I0 about w = 0 is used;
2) At high SNR, the approximation of I0 for large arguments is used.
CN201710227956.9A 2017-04-10 2017-04-10 Single-microphone speech enhancement method using a combined perceptual gain function Pending CN108696791A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710227956.9A CN108696791A (en) 2017-04-10 2017-04-10 Single-microphone speech enhancement method using a combined perceptual gain function

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710227956.9A CN108696791A (en) 2017-04-10 2017-04-10 Single-microphone speech enhancement method using a combined perceptual gain function

Publications (1)

Publication Number Publication Date
CN108696791A (en) 2018-10-23

Family

ID=63843280

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710227956.9A Single-microphone speech enhancement method using a combined perceptual gain function Pending CN108696791A (en) 2017-04-10 2017-04-10

Country Status (1)

Country Link
CN (1) CN108696791A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112767962A (en) * 2021-03-01 2021-05-07 北京电信易通信息技术股份有限公司 Voice enhancement method and system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102568491A (en) * 2010-12-14 2012-07-11 联芯科技有限公司 Noise suppression method and equipment
CN105338450A (en) * 2015-09-23 2016-02-17 苏州科达科技股份有限公司 Residual echo inhibition method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
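赵改华 (Zhao Gaihua) et al.: "修正的基于广义Gamma语音模型语音增强算法" (Modified speech enhancement algorithm based on a generalized Gamma speech model), 《计算机工程与应用》 (Computer Engineering and Applications) *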

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112767962A (en) * 2021-03-01 2021-05-07 北京电信易通信息技术股份有限公司 Voice enhancement method and system
CN112767962B (en) * 2021-03-01 2021-08-03 北京电信易通信息技术股份有限公司 Voice enhancement method and system

Similar Documents

Publication Publication Date Title
CN109643554B (en) Adaptive voice enhancement method and electronic equipment
Mittal et al. Signal/noise KLT based approach for enhancing speech degraded by colored noise
US7313518B2 (en) Noise reduction method and device using two pass filtering
CN108831499A (en) Utilize the sound enhancement method of voice existing probability
CN105489226A (en) Wiener filtering speech enhancement method for multi-taper spectrum estimation of pickup
CN106875938A (en) A kind of improved nonlinear adaptive sound end detecting method
Yang et al. A noise reduction method based on LMS adaptive filter of audio signals
CN110808057A (en) Voice enhancement method for generating confrontation network based on constraint naive
CN108711432A (en) A kind of sound enhancement method of the perception gain function of single microphone
CN108696791A (en) A kind of combination perception gain function sound enhancement method of single microphone
Hamid et al. Speech enhancement using EMD based adaptive soft-thresholding (EMD-ADT)
Rao et al. Speech enhancement using sub-band cross-correlation compensated Wiener filter combined with harmonic regeneration
Li et al. Noisy speech enhancement based on discrete sine transform
CN113066483B (en) Sparse continuous constraint-based method for generating countermeasure network voice enhancement
Wei et al. A novel prewhitening subspace method for enhancing speech corrupted by colored noise
Surendran et al. Perceptual subspace speech enhancement with variance normalization
CN113593590A (en) Method for suppressing transient noise in voice
Zheng et al. SURE-MSE speech enhancement for robust speech recognition
Yan et al. A signal subspace speech enhancement method for various noises
Khalil et al. Enhancement of speech signals using multiple statistical models
Badiezadegan et al. A wavelet-based data imputation approach to spectrogram reconstruction for robust speech recognition
Zehtabian et al. Optimized singular vector denoising approach for speech enhancement
Indumathi et al. Noise estimation using standard deviation of the frequency magnitude spectrum for mixed non-stationary noise
Guo et al. Speaker recognition based on wavelet packet decomposition and Volterra adaptive model
Chen Research on Single Channel Speech Noise Reduction Algorithm Based on Signal Processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20181023)