CN105679330A - Digital hearing aid noise reduction method based on improved sub-band signal-to-noise ratio estimation - Google Patents



Publication number
CN105679330A
CN105679330A
Authority
CN
China
Prior art keywords: signal, subband, frame, noise, noise ratio
Prior art date
Legal status (assumed, not a legal conclusion): Granted
Application number
CN201610150663.0A
Other languages: Chinese (zh)
Other versions: CN105679330B (en)
Inventor
姜涛
梁瑞宇
王青云
陈姝
季昌华
汪潇文
蔡毅杰
吴振飞
李冠霖
Current Assignee (the listed assignees may be inaccurate): Nanjing Institute of Technology
Original Assignee
Nanjing Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Institute of Technology filed Critical Nanjing Institute of Technology
Priority to CN201610150663.0A priority Critical patent/CN105679330B/en
Publication of CN105679330A publication Critical patent/CN105679330A/en
Application granted granted Critical
Publication of CN105679330B publication Critical patent/CN105679330B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R25/00 Deaf-aid sets, i.e. electro-acoustic or electro-mechanical hearing aids; Electric tinnitus maskers providing an auditory perception
    • H04R25/50 Customised settings for obtaining desired overall acoustical characteristics
    • H04R25/505 Customised settings for obtaining desired overall acoustical characteristics using digital signal processing

Abstract

The invention provides a digital hearing aid noise reduction method based on improved subband signal-to-noise ratio (SNR) estimation. The method comprises the steps of: decomposing the original signal into a plurality of subbands with an analysis filter bank; computing the cross-correlation function and the respective mean-square values of two adjacent frame signals in each subband to estimate the SNR of that subband; computing the gain of each subband from the estimated SNR and multiplying the gain by the subband signal to obtain the corrected subband signal; and finally synthesizing the corrected subband signals to obtain the noise-reduced speech. The more accurate subband SNR estimation gives better background noise suppression and reduces the auditory fatigue of hearing aid users. The method is simple and efficient: the inverse Fourier transform is avoided, so the delay performance is greatly improved, by 60.6% over traditional spectral subtraction and by 40.7% over the traditional Wiener filtering method.

Description

Digital hearing aid noise reduction method based on improved subband signal-to-noise ratio (SNR) estimation
Technical field
The present invention relates to a digital hearing aid noise reduction method based on improved subband signal-to-noise ratio (SNR) estimation.
Background technology
Hearing loss is a serious condition that affects daily life. Long-term hearing impairment not only hampers everyday communication but can also cause psychological problems, placing a heavy burden on society and families. Wearing a hearing aid is the most effective auditory intervention and rehabilitation measure for patients with mild to severe hearing loss. Difficulty understanding speech in noisy environments is the most frequent complaint of hearing aid users. To improve speech intelligibility and listening comfort, hearing aid speech enhancement techniques fall broadly into two classes: directional microphone methods and noise reduction methods. The former exploit the spatial separation of speech and noise, using a directional microphone or beamforming to enhance the speech signal from a particular direction. However, such methods are limited by the number and size of the microphones, offer limited performance improvement, and are unsuitable for deep-canal hearing aids. The second class exploits the differences between speech and noise in time and spectrum to separate speech from the noisy signal. However, speech and noise may overlap in both time and spectrum, and many researchers have studied this problem in depth.
In speech enhancement, researchers have proposed many methods with significant noise reduction effect. Patent US2012/8204263B2, "Method of estimating weighting function of audio signals in a hearing aid", is a hearing aid audio signal enhancement method that uses multiple microphones to jointly collect ambient sound and achieve speech enhancement; however, for patients wearing small custom hearing aids in or near the ear canal, multi-microphone enhancement of this kind is not applicable. Patent US2014/13198739.8, "Enhanced dynamics processing of streaming audio by source separation and remixing", enhances streaming audio using source separation and remixing; it adopts non-negative matrix factorization theory and learns speech and noise models from clean speech data and noise data, but the method is complex and places high demands on the hardware. Patent CN200510086877.8, "Speech enhancement method for hearing aids", uses the auditory masking curve to adjust the filter parameters, but it requires a deeper understanding of the physiological model of the human auditory system and typically involves a large number of complex operations such as exponential and logarithmic computations. Although the above methods achieve a certain speech enhancement effect, they are not suitable for hearing aids with very high real-time and power-consumption requirements.
Summary of the invention
An object of the present invention is to provide a digital hearing aid noise reduction method based on improved subband SNR estimation, solving the problems of prior-art hearing aid speech enhancement techniques: they are limited by microphone number or size, offer limited performance improvement and are unsuitable for deep-canal hearing aids, or they must cope with speech and noise that may overlap in time and spectrum.
To achieve the above object, the present invention adopts the following technical scheme:
A digital hearing aid noise reduction method based on improved subband SNR estimation, comprising:
using an analysis filter bank to decompose the original signal into several subbands, then computing the cross-correlation function and the respective mean-square values of two adjacent frame signals in each subband to estimate the SNR of that subband;
next, computing the gain of each subband from the estimated SNR, and multiplying the gain by the subband signal to obtain the corrected subband signal;
finally, synthesizing the corrected subband signals to obtain the noise-reduced speech.
Further, the method specifically comprises the following steps:
S1. The microphone input signal x(n) of the digital hearing aid is A/D converted, and the converted digital signal is divided into frames x_i(n) (n = 0, 1, 2, …, N−1), where N is the frame length;
S2. Decomposition filtering: each frame is passed through the multichannel 6th-order IIR analysis filter bank h_i(n) (i = 0, 1, …, M−1), where M is the number of channels and the passband ripple is 0.5 dB, and is downsampled with downsampling factor c_i, giving the M subband signals x(i, m) (m = 0, 1, 2, …, M−1), with i the frame index;
S3. Subband SNR estimation: the SNR estimate of the i-th frame of the m-th subband is

$$\mathrm{SNR}(i,m) = 10\log\!\left(\frac{\alpha P(i,m) + (1-\alpha)\,P(i-1,m)}{\sigma^2(m)}\right) \qquad (12)$$

where P(i, m) and P(i−1, m) are the clean speech signal powers of the i-th and (i−1)-th frames in the m-th subband, σ²(m) is the noise variance in the m-th subband, and α is a smoothing factor, 0 < α < 1.
S4. Gain computation: the estimated SNR(i, m) is substituted into the gain function g_dB(i, m), and g_dB(i, m) is transformed to the amplitude-domain gain g(i, m);
S5. Subband noise suppression: the original speech signal is attenuated with the gain of each subband, giving the enhanced speech signal y_im(n) = g(i, m)·x(i, m);
S6. Subband synthesis output: the output signals y_im(n) (m = 0, 1, …, M−1) of the M channels are first upsampled with upsampling factor c_i, and finally synthesized into the output speech signal y(n).
Further, in step S3, the subband SNR estimation is specifically as follows:
S31. The i-th and (i−1)-th frame signals at the k-th frequency point are defined as the adjacent-signal vector X_adjoin(i, k), and the autocorrelation matrix R(i, k) of the adjacent-signal vector is computed;
S32. The L spectral points in each subband are used to estimate E[X²(i, m)], E[X²(i−1, m)] and E[X(i−1, m)X(i, m)] in the correlation matrix R(i, m), where E[X²(i, m)] and E[X²(i−1, m)] are the mean-square values of the i-th and (i−1)-th frames in the m-th subband, and E[X(i, m)X(i−1, m)] is the cross-correlation function of the i-th and (i−1)-th frames;
S33. The clean speech signal powers P(i, m) and P(i−1, m) of the two adjacent frames i and i−1 in the m-th subband and the noise variance σ²(m) of the m-th subband are solved, to obtain the SNR estimate SNR(i, m) of the i-th frame of the m-th subband.
Further, in step S31, the autocorrelation matrix of the adjacent-signal vector is computed as follows:

$$R(i,k) = E\!\left[X_{\mathrm{adjoin}}(i,k)\,X_{\mathrm{adjoin}}(i,k)^{H}\right] = \begin{bmatrix} E\big(X^2(i,k)\big) & E\big(X(i,k)X(i-1,k)\big) \\ E\big(X(i,k)X(i-1,k)\big) & E\big(X^2(i-1,k)\big) \end{bmatrix} \qquad (1)$$

where the adjacent-signal vector is X_adjoin(i, k) = {X(i, k), X(i−1, k)}^T, E(X²(i, k)) and E(X²(i−1, k)) are the mean-square values of the i-th and (i−1)-th frames at the k-th frequency point, and E(X(i, k)X(i−1, k)) is the cross-correlation function of the i-th and (i−1)-th frames.
Further, in step S32, after the signal has passed through the analysis filter bank to give M subbands, the consecutive-frame correlation matrix R(i, m) of the m-th subband is estimated from the L = N/M spectral points contained in each subband:

$$R(i,m) = \begin{bmatrix} E\big[X^2(i,m)\big] & E\big[X(i,m)X(i-1,m)\big] \\ E\big[X(i,m)X(i-1,m)\big] & E\big[X^2(i-1,m)\big] \end{bmatrix} \qquad (2)$$

In formula (2), E[X²(i, m)] and E[X²(i−1, m)] are the mean-square values of the i-th and (i−1)-th frames in the m-th subband, and E[X(i, m)X(i−1, m)] is the cross-correlation function of the i-th and (i−1)-th frames.
Further, in step S4, the gain function is

$$g_{dB}(i,m) = \lambda_{im}(\mathrm{SNR})\,f\big(x_{dB}(n)\big) \qquad (13)$$

where g_dB(i, m) denotes the gain of the i-th frame speech signal in the m-th subband.
Further, in step S4, the gain g_dB(i, m) of the i-th frame speech signal in the m-th subband is transformed to the amplitude domain g(i, m):

$$g(i,m) = 10^{\,g_{dB}(i,m)/20} \qquad (14)$$
The beneficial effects of the invention are as follows. This digital hearing aid noise reduction method based on improved subband SNR estimation estimates the SNR of each subband signal and computes each subband gain from the subband SNR state, so that quiet signals and noisy signals with a high SNR are attenuated only slightly, while noisy signals with a low SNR receive greater attenuation, relieving the user's auditory fatigue; it also overcomes the difficulty that prior-art methods, owing to their high computational complexity, are hard to apply in hearing aids with very high real-time and power-consumption requirements. The more accurate subband SNR estimation gives better background noise suppression and relieves the auditory fatigue of hearing aid users. The method is simple and efficient: by avoiding the inverse Fourier transform (IFFT), the delay performance is greatly improved, by 60.6% over traditional spectral subtraction and by 40.7% over the traditional Wiener filtering method.
Brief description of the drawings
Fig. 1 is a schematic flow chart of the digital hearing aid noise reduction method based on improved subband SNR estimation according to an embodiment of the present invention;
Fig. 2 is an explanatory schematic diagram of the digital hearing aid noise reduction model in the embodiment;
Fig. 3 shows the modified linear proportional gain function in the embodiment;
Fig. 4 compares the delay performance of the algorithms in the embodiment.
Detailed description of the embodiments
Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Current digital hearing aids mostly adopt multichannel algorithms: the speech signal is divided into several subbands by frequency, and noise suppression is performed independently in each subband. According to the band-division method, these fall into uniform-width and non-uniform-width schemes. Because the human ear perceives sound differently in different frequency ranges, a non-uniform band division is generally adopted, as it better simulates the auditory characteristics of the human cochlea. In addition, critically sampled filter banks perform poorly in stopband attenuation, so over-sampled filter banks, which have better noise suppression performance and greater design flexibility, are generally adopted. The embodiment therefore uses a non-uniform band division, and the filter bank is an over-sampled filter bank.
Embodiment
A digital hearing aid noise reduction method based on improved subband SNR estimation, as shown in Fig. 1: a 6th-order IIR analysis filter bank decomposes the original signal into 16 subbands, then the cross-correlation function and the respective mean-square values of two adjacent frame signals in each subband are computed to estimate the SNR of that subband. Next, the gain of each subband is computed from the estimated SNR and multiplied by the subband signal to obtain the corrected subband signal. Finally, the corrected subband signals are synthesized to obtain the noise-reduced speech.
The method specifically comprises the following steps:
S1. The microphone input signal x(n) of the digital hearing aid is A/D converted, and the converted digital signal is divided into frames x_i(n) (n = 0, 1, 2, …, N−1), where N is the frame length;
S2. Decomposition filtering: each frame is passed through the multichannel 6th-order IIR analysis filter bank h_i(n) (i = 0, 1, …, M−1), where M is the number of channels and the passband ripple is 0.5 dB, and is downsampled with downsampling factor c_i, giving the M subband signals x(i, m) (m = 0, 1, 2, …, M−1), with i the frame index;
S3. Subband SNR estimation: the SNR estimate of the i-th frame of the m-th subband is

$$\mathrm{SNR}(i,m) = 10\log\!\left(\frac{\alpha P(i,m) + (1-\alpha)\,P(i-1,m)}{\sigma^2(m)}\right) \qquad (12)$$

where P(i, m) and P(i−1, m) are the clean speech signal powers of the i-th and (i−1)-th frames in the m-th subband, σ²(m) is the noise variance in the m-th subband, and α is a smoothing factor, taken as 0.5 in the present embodiment.
S4. Gain computation: the estimated SNR(i, m) is substituted into the gain function g_dB(i, m), and g_dB(i, m) is transformed to the amplitude-domain gain g(i, m);
S5. Subband noise suppression: the original speech signal is attenuated with the gain of each subband, giving the enhanced speech signal y_im(n) = g(i, m)·x(i, m);
S6. Subband synthesis output: the output signals y_im(n) (m = 0, 1, …, M−1) of the M channels are first upsampled with upsampling factor c_i, and finally synthesized into the output speech signal y(n).
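As an illustration only, the per-frame flow of steps S3 to S6 can be sketched in plain Python. The SNR estimate follows formula (12) of step S3, and the dB-to-amplitude conversion is the standard 10^(g_dB/20); the gain rule, threshold values and function names below are simplified stand-ins of our own, not the patent's exact functions:

```python
import math

def denoise_frame(subbands, prev_subbands, noise_var, alpha=0.5):
    """Skeleton of steps S3-S6 for one frame: per-subband SNR (formula 12),
    an illustrative placeholder gain rule, dB-to-amplitude conversion,
    attenuation (S5), and summation as a stand-in for the synthesis
    filter bank (S6). Assumes the analysis filter bank (S1-S2) has
    already produced the per-subband sample lists."""
    out = []
    for x, xp, s2 in zip(subbands, prev_subbands, noise_var):
        p = sum(v * v for v in x) / len(x)        # current-frame power
        pp = sum(v * v for v in xp) / len(xp)     # previous-frame power
        snr = 10 * math.log10(max((alpha * p + (1 - alpha) * pp) / s2, 1e-12))
        g_db = 0.0 if snr > 10 else -0.5 * (10 - snr)   # illustrative rule only
        g = 10 ** (g_db / 20.0)                   # dB gain to amplitude gain
        out.append([g * v for v in x])            # S5: attenuate subband
    n = len(out[0])
    return [sum(band[k] for band in out) for k in range(n)]  # S6 stand-in
```

A high-SNR subband passes through unattenuated, while a subband at 0 dB SNR is attenuated by 5 dB under this placeholder rule.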
The digital hearing aid noise reduction method based on improved subband SNR estimation of the embodiment is shown in Fig. 2, where M in the digital hearing aid noise reduction model is the total number of channels. The embodiment divides the frequency band into 16 subbands according to auditory perception characteristics and the cochlear filter characteristic, where f_i denotes the lower band-edge frequency of the i-th subband and the corresponding barred symbol the upper band-edge frequency; the band-edge frequencies of each subband are computed, with the results shown in Table 1.
Table 1. Frequency band division results
In Fig. 2, x_0(n), x_1(n), …, x_{M−1}(n) are the M subband signals obtained by decomposing the original signal through h_i(n); they keep the original sampling rate, and h_i(n) simultaneously acts as an anti-aliasing filter. The embodiment uses a 3rd-order analog Chebyshev type I prototype to design the 6th-order IIR decomposition filters h_i(n) (i = 0, 1, …, 15), with a passband ripple of 0.5 dB. c_i is the down/up-sampling factor of the i-th subband: when the downsampling factor c_i = M, the filter bank is critically sampled; when c_i < M, it is over-sampled. The present embodiment uses over-sampling. u_0(n), u_1(n), …, u_{M−1}(n) are the M subband signals obtained by downsampling with factors c_0, c_1, …, c_{M−1}; the downsampling coefficients are shown in Table 2.
Table 2. Downsampling coefficients
In Fig. 2, g_0(n), g_1(n), …, g_{M−1}(n) are the gain values of each subband; multiplying them by u_0(n), u_1(n), …, u_{M−1}(n) respectively gives the output signals v_0(n), v_1(n), …, v_{M−1}(n) after the subband speech enhancement processing. The processed signals v_0(n), v_1(n), …, v_{M−1}(n) are then upsampled with factors c_0, c_1, …, c_{M−1}. Finally the subband signals are synthesized and the estimated clean speech signal is output.
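The rate changes around the gain stage (the c_i factors in Fig. 2) can be sketched as follows. This shows only the sample-rate change itself: the synthesis filter that would smooth the inserted zeros after upsampling is omitted, and the function names are ours:

```python
def downsample(x, c):
    """Decimate by factor c: keep every c-th sample. With c < M this
    corresponds to the over-sampled branch described above; c = M
    would be critical sampling."""
    return x[::c]

def upsample(x, c):
    """Interpolate by factor c via zero-insertion (the 'raise sampling'
    step before the synthesis filter, which this sketch omits)."""
    y = []
    for v in x:
        y.append(v)
        y.extend([0.0] * (c - 1))
    return y
```

Chaining `upsample(downsample(x, c), c)` restores the original length, which is why the synthesized output keeps the input sampling rate.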
Subband signal-to-noise ratio (SNR) estimation
Since speech and noise are time-varying signals and the frequency-domain distribution of speech and noise energy within each frame is unbalanced, the SNR of the signal must be estimated in real time. The present invention proposes an improved SNR estimation method that uses the autocorrelation matrix of two adjacent frames of the noisy speech signal to estimate the SNR of each subband.
The subband SNR estimation is specifically as follows:
Assume the i-th frame signal is

$$x(i,n) = s(i,n) + n(i,n) \qquad (2)$$

where x(i, n) is the noisy speech signal, s(i, n) the original (clean) speech signal, and n(i, n) the noise signal.
Its Fourier transform is

$$X(i,k) = S(i,k) + N(i,k) \qquad (3)$$
The i-th and (i−1)-th frame signals at the k-th frequency point are defined as the adjacent-signal vector

$$X_{\mathrm{adjoin}}(i,k) = \{X(i,k),\,X(i-1,k)\}^{T}$$

The autocorrelation matrix of the adjacent-signal vector is computed:

$$R(i,k) = E\!\left[X_{\mathrm{adjoin}}(i,k)\,X_{\mathrm{adjoin}}(i,k)^{H}\right] = \begin{bmatrix} E\big(X^2(i,k)\big) & E\big(X(i,k)X(i-1,k)\big) \\ E\big(X(i,k)X(i-1,k)\big) & E\big(X^2(i-1,k)\big) \end{bmatrix} \qquad (4)$$
If only X(i, k) and X(i−1, k) were used to estimate the correlation matrix at each frequency, a large estimation error would result. After the signal has passed through the analysis filter bank to give M subbands, the consecutive-frame correlation matrix R(i, m) of the m-th subband can instead be estimated from the L_m spectral points contained in each subband:

$$R(i,m) = \begin{bmatrix} E\big[X^2(i,m)\big] & E\big[X(i,m)X(i-1,m)\big] \\ E\big[X(i,m)X(i-1,m)\big] & E\big[X^2(i-1,m)\big] \end{bmatrix} \qquad (5)$$

In formula (5), E[X²(i, m)] and E[X²(i−1, m)] are the mean-square values of the i-th and (i−1)-th frames in the m-th subband, and E[X(i, m)X(i−1, m)] is the cross-correlation function of the i-th and (i−1)-th frames.
Assuming that the clean speech signal s(t) of any frame is statistically independent of the noise signal of any frame, and that the noise of consecutive frames is uncorrelated, each entry of formula (5) simplifies to

$$E\big[X^2(i,m)\big] = P(i,m) + \sigma^2(m) \qquad (6)$$
$$E\big[X^2(i-1,m)\big] = P(i-1,m) + \sigma^2(m) \qquad (7)$$

where P(i, m) and P(i−1, m) are the clean speech signal powers of the i-th and (i−1)-th frames in the m-th subband, and σ²(m) is the noise variance in the m-th subband.
$$E\big[X(i,m)X(i-1,m)\big] = E\big[S(i,m)S(i-1,m)\big]$$

Transforming the above formula gives

$$E\big[X(i,m)X(i-1,m)\big] = E\!\left[\frac{S(i,m)}{S(i-1,m)}\,S^2(i-1,m)\right] = \lambda(i-1,m)\,E\big[S^2(i-1,m)\big] = \lambda(i-1,m)\,P(i-1,m) \qquad (8)$$

where

$$\lambda(i-1,m) = \frac{S(i,m)}{S(i-1,m)} = \frac{X(i,m)-N(i,m)}{X(i-1,m)-N(i-1,m)}$$
Assuming the noise varies slowly, N(i, m) and N(i−1, m) take the noise spectrum N_silence(m) estimated during silence.
R(i, m) can thus finally be expressed as

$$R(i,m) = \begin{bmatrix} P(i,m) + \sigma^2(m) & \lambda(i-1,m)\,P(i-1,m) \\ \lambda(i-1,m)\,P(i-1,m) & P(i-1,m) + \sigma^2(m) \end{bmatrix} \qquad (9)$$
The L_m spectral points in each subband are used to estimate E[X²(i, m)], E[X²(i−1, m)] and E[X(i−1, m)X(i, m)] in formulas (6)-(8):

$$E\big[X^2(i,m)\big] = \frac{1}{L_m}\sum_{k=(m-1)L+1}^{mL} X^2(i,k) \qquad (10)$$
$$E\big[X^2(i-1,m)\big] = \frac{1}{L_m}\sum_{k=(m-1)L+1}^{mL} X^2(i-1,k) \qquad (11)$$
$$E\big[X(i,m)X(i-1,m)\big] = \frac{1}{L_m}\sum_{k=(m-1)L+1}^{mL} X(i,k)\,X(i-1,k) \qquad (12)$$
Substituting these estimates into formulas (6)-(9), P(i, m), σ²(m) and P(i−1, m) can be uniquely determined.
The SNR estimate of the i-th frame of the m-th subband is then

$$\mathrm{SNR}(i,m) = 10\log\!\left(\frac{\alpha P(i,m) + (1-\alpha)\,P(i-1,m)}{\sigma^2(m)}\right) \qquad (13)$$
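The closed-form solution of formulas (6)-(9) is not spelled out above. One way to close the system, under the additional assumption P(i, m) = λ²(i−1, m)·P(i−1, m) (which follows from modelling S(i, m) = λ·S(i−1, m)), is sketched below; the function and variable names are ours, not the patent's:

```python
import math

def solve_subband(ex2_i, ex2_im1, exx, alpha=0.5):
    """Given the three moment estimates of formulas (10)-(12) for one
    subband, recover P(i,m), P(i-1,m) and sigma^2(m) from (6)-(9), and
    return them with the SNR estimate of formula (13).
    Closure assumption (ours): P(i,m) = lam^2 * P(i-1,m). Then
      ex2_i   = lam^2 * p_prev + s2
      ex2_im1 = p_prev + s2
      exx     = lam * p_prev
    which yields a quadratic in lam; we take the positive root."""
    d = ex2_i - ex2_im1
    lam = (d + math.sqrt(d * d + 4 * exx * exx)) / (2 * exx)
    p_prev = exx / lam                 # P(i-1, m) from formula (8)
    s2 = ex2_im1 - p_prev              # sigma^2(m) from formula (7)
    p_cur = lam * lam * p_prev         # P(i, m) under the closure assumption
    snr = 10 * math.log10((alpha * p_cur + (1 - alpha) * p_prev) / s2)
    return p_cur, p_prev, s2, snr
```

For example, moments generated from λ = 2, P(i−1, m) = 1 and σ²(m) = 0.5 (i.e. ex2_i = 4.5, ex2_im1 = 1.5, exx = 2) are recovered exactly. The sketch assumes a non-zero cross term exx.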
Construction of the gain function
The basic idea of the Wiener filtering noise reduction model is to design a linear filter H(w) such that the output after this linear filter reaches minimum mean-square error, i.e. the output is the optimal estimate of the clean speech S(w):

$$\hat S(w) = H(w)\,X(w) \qquad (14)$$

where $H(w) = \frac{P_s(w)}{P_s(w) + P_d(w)}$, P_s(w) is the power spectral density of the clean speech signal, and P_d(w) is the noise power spectral density. Noise estimation generally adopts noise spectrum estimation algorithms based on the a priori SNR and the a posteriori SNR, and the filter transfer function H(w) is derived by introducing a smoothing parameter.
The Wiener filtering method multiplies the noisy speech by a gain function, or passes it through a filter, with the aim of attenuating the noise to a sound pressure level comfortable for the patient. The embodiment therefore applies no attenuation, or only slight attenuation, to quiet signals and to noisy signals with a high SNR, and applies greater attenuation to noisy signals with a low SNR. The gain function is defined as

$$g_{dB}(i,m) = \lambda_{im}(\mathrm{SNR})\,f\big(x_{dB}(n)\big) \qquad (15)$$
$$g(i,m) = 10^{\,g_{dB}(i,m)/20} \qquad (16)$$

where g_dB(i, m) denotes the gain (in the dB domain) of the i-th frame speech signal in the m-th subband, and g(i, m) is g_dB(i, m) transformed to the amplitude domain. λ_im(SNR) denotes the attenuation value for the i-th frame speech signal in the m-th subband as a function of the SNR, and is limited to [−1, 0]. When λ_im(SNR) = 0, the gain after transformation to the amplitude domain is 1, representing no attenuation; the closer λ_im(SNR) is to −1, the larger the attenuation in the amplitude domain.
Although existing noise reduction algorithms perform noise reduction using the relation between the attenuation function λ_im(SNR) and the subband SNR, too much background noise remains at low SNR, causing auditory fatigue in users. The embodiment therefore proposes an improvement: the gain proportion function is modified in the low-SNR section, as shown in Fig. 3. The attenuation curve is steep in the low-SNR section [0, B0], the aim being to strongly attenuate the background noise at low SNR. But this strategy is disadvantageous for weak quiet signals, so the gain function should also depend on the environmental noise level. When the patient is in a quiet room, for example, computing the gain solely from the SNR would strongly attenuate weak quiet signals; the maximum noise attenuation function f(N_dB(n)) is therefore introduced to limit the signal attenuation amplitude. In a quiet environment N_dB(n) is very low, and after setting the maximum attenuation amplitude f(N_dB(n)) = N_dB(n), signals need hardly be attenuated even at a very low SNR.
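A minimal sketch of the gain shape described above: steep attenuation on the low-SNR section [0, B0], no attenuation at high SNR, and the attenuation depth capped by the noise level as in f(N_dB(n)) = N_dB(n). The breakpoint value b0 and the clipping below are illustrative choices of ours, not the parameters of the patent's Fig. 3:

```python
def attenuation_lambda(snr_db, b0=5.0):
    """lambda_im(SNR) in [-1, 0]: -1 at SNR = 0, rising steeply to 0 at B0."""
    if snr_db >= b0:
        return 0.0
    return -min(1.0, (b0 - max(snr_db, 0.0)) / b0)

def gain_db(snr_db, noise_db, b0=5.0):
    """Formula (15) with f(N_dB) = N_dB: in a quiet room (low N_dB) even
    low-SNR signals are barely attenuated."""
    return attenuation_lambda(snr_db, b0) * max(0.0, noise_db)

def gain_amp(g_db):
    """Formula (16): dB-domain gain to amplitude-domain gain."""
    return 10 ** (g_db / 20.0)
```

At high SNR the gain is 0 dB (amplitude 1, no attenuation); at 0 dB SNR in loud noise the full noise-level attenuation is applied; in a quiet room nothing is attenuated regardless of SNR.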
Performance test analysis
The experiments compare the performance of the proposed algorithm with the modulation-depth method of patent US2004/6757395B1, and compare the processing delay with basic spectral subtraction and the basic Wiener filtering method. All clean test speech is taken from the TIMIT speech database, with a sampling rate of 16 kHz. Pure noise is cut from the Noise92 noise library; the noise types are White, Pink, Tank and Speech Babble, and the input SNR is set to 0 dB, 5 dB and 10 dB.
Speech perceptual quality is evaluated
The embodiment adopts the log-spectral distortion (LSD) and the perceptual evaluation of speech quality (PESQ) as objective evaluation indices.
LSD reflects the speech distortion, while PESQ reflects the overall speech quality; both indices correlate strongly with subjective evaluation. In general, the larger the LSD reduction (ΔLSD), the smaller the log-spectral distortion and the less the algorithm damages the speech; the larger the PESQ improvement (ΔPESQ), the better the speech quality after processing. LSD and PESQ are computed as follows:
$$LSD = \frac{1}{J}\sum_{l=0}^{J-1}\left\{\frac{1}{N/2+1}\sum_{k=0}^{N/2}\Big[10\log_{10}X(k,l) - 10\log_{10}\hat X(k,l)\Big]^2\right\}^{1/2} \qquad (17)$$
$$PESQ = 4.5 - 0.1\,d_{SYM} - 0.0309\,d_{ASYM} \qquad (18)$$

In formula (17), X(k, l) and the corresponding hatted quantity are the short-time Fourier transforms of the clean speech and the enhanced speech respectively, N is the frame length, and J is the number of frames. In formula (18), d_SYM and d_ASYM are correlation quantities from the cognitive model. The results are shown in Table 3.
Table 3. Comparison of the speech perceptual quality evaluation of the modulation-depth method and the proposed method
Table 3 compares the LSD reduction and the PESQ improvement of the modulation-depth method and the method proposed in the embodiment under different noise types, at input SNRs of 0 dB, 5 dB and 10 dB. The proposed method performs excellently under White and Pink noise: the improvement of LSD and PESQ is clearly better than that of the modulation-depth method. In particular, under White noise the proposed method exceeds the modulation-depth method by 36.7% on average in log-spectral distance improvement and by 19% on average in PESQ improvement. Under Tank and Speech Babble noise, however, both the proposed method and the modulation-depth method perform poorly, with an average PESQ improvement of only 0.1; the linear proportional attenuation model used by the invention is unsuitable for Tank and Speech Babble noise and needs further improvement, and the subjectively perceived performance is thus affected by the noise scenario.
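As a check on formula (17), the LSD index can be computed directly from per-frame spectral power values. This sketch assumes strictly positive spectral magnitudes and omits PESQ, whose cognitive-model terms d_SYM and d_ASYM come from the ITU-T P.862 toolchain:

```python
import math

def lsd(clean_frames, enhanced_frames):
    """Log-spectral distortion, formula (17): per frame, the RMS over
    bins of the difference of the 10*log10 spectral values, then
    averaged over frames. Inputs are lists of frames, each a list of
    positive spectral power values."""
    total = 0.0
    for X, Xh in zip(clean_frames, enhanced_frames):
        acc = sum((10 * math.log10(a) - 10 * math.log10(b)) ** 2
                  for a, b in zip(X, Xh))
        total += math.sqrt(acc / len(X))
    return total / len(clean_frames)
```

Identical spectra give an LSD of 0; a tenfold power difference in a single bin gives 10 dB.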
Algorithm complex and time delay analysis
The processed noisy speech has a duration of 2.26 s; in the experiments the frame length is 64, 128 or 256, with a frame shift of 50%. Although multiband algorithms all need an analysis filter to split the signal into subbands, the computation of the analysis and synthesis filters is outside the scope considered by the present invention; the processing delay measured in the embodiment therefore covers the processing from after the decomposition filter to before the synthesis filter. As Fig. 4 shows, the delays of basic spectral subtraction and the basic Wiener filtering method are significantly larger than that of the method proposed by the invention, because both of these models need the fast Fourier transform (FFT) for frequency-domain analysis and finally revert to a time-domain signal via the inverse fast Fourier transform (IFFT), which greatly increases the computation. The present invention uses the FFT for frequency-domain analysis only in the subband SNR estimation, where the FFT serves only the estimator (13), and the computation of the cross-correlation functions and mean-square values involved is very low; moreover, the invention does not need to revert to a time-domain signal via the IFFT. The computational complexity of the invention is therefore lower than that of basic spectral subtraction and the basic Wiener filtering method. Engineering practice generally takes a frame length of 2^8 = 256 points; at this length the processing delay of the invention is 60.6% better than basic spectral subtraction and 40.7% better than the Wiener filtering method.
The above is only a preferred embodiment of the present invention. It should be pointed out that those skilled in the art can make improvements and variations without departing from the technical principle of the present invention, and these improvements and variations should also be regarded as falling within the protection scope of the present invention.

Claims (7)

1. A digital hearing aid noise reduction method based on improved subband signal-to-noise ratio (SNR) estimation, characterized by comprising:
using an analysis filter bank to decompose the original signal into several subbands, then computing the cross-correlation function and the respective mean-square values of two adjacent frame signals in each subband to estimate the SNR of that subband;
next, computing the gain of each subband from the estimated SNR, and multiplying the gain by the subband signal to obtain the corrected subband signal;
finally, synthesizing the corrected subband signals to obtain the noise-reduced speech.
2. The digital hearing aid noise reduction method based on improved subband SNR estimation as claimed in claim 1, characterized by comprising the following steps:
S1. The microphone input signal x(n) of the digital hearing aid is A/D converted, and the converted digital signal is divided into frames x_i(n) (i = 0, 1, 2, …, N−1), where N is the frame length;
S2. Analysis filtering: each frame of the signal is passed through an M-channel 6th-order IIR analysis filter bank h_i(n) (i = 0, 1, …, M−1), where M is the number of channels, and downsampled with downsampling factor c_i, yielding M subband signals x(i, m) (i = 0, 1, 2, …, N−1; m = 0, 1, 2, …, M−1);
S3. Subband SNR estimation: the SNR estimate of the i-th frame of the m-th subband is

SNR(i, m) = 10·log[(α·P(i, m) + (1 − α)·P(i−1, m)) / σ²(m)]    (12)

where P(i, m) and P(i−1, m) are the clean speech powers of the i-th and (i−1)-th frames in the m-th subband, σ²(m) is the noise variance in the m-th subband, and α is a smoothing factor with 0 < α < 1;
S4. Gain computation: the estimated SNR(i, m) is substituted into the gain function g_dB(i, m), and g_dB(i, m) is transformed to the amplitude-domain gain g(i, m);
S5. Subband noise suppression: the original speech signal in each subband is attenuated by the gain obtained for that subband, giving the enhanced speech signal y_im(n) = g(i, m)·x(i, m);
S6. Subband synthesis and output: the output signals y_im(n) (i = 0, 1, …, M−1) of the M channels are first upsampled with upsampling factor c_i and finally synthesized into the output speech signal y(n).
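A compact single-frame sketch of steps S2 through S6 follows (Python with NumPy). It is a loose illustration, not the patented implementation: an FFT split stands in for the 6th-order IIR analysis filter bank, the clean power is approximated by subband power minus an assumed-known noise variance rather than solved from the correlation matrix, and a Wiener-like gain rule replaces the unspecified gain function of equation (13):

```python
import numpy as np

def denoise_frame(frame, prev_subbands, noise_var, num_bands=4, alpha=0.7):
    """Process one frame through stand-ins for steps S2-S6.

    prev_subbands: subband spectra of the previous frame (for eq. (12)).
    noise_var: assumed-known noise variance per subband, sigma^2(m).
    Returns the enhanced time-domain frame and this frame's subbands."""
    # S2 stand-in: split the frame spectrum into equal subbands via an
    # FFT instead of the IIR analysis filter bank with downsampling.
    subbands = np.array_split(np.fft.rfft(frame), num_bands)
    enhanced = []
    for m, (cur, prev) in enumerate(zip(subbands, prev_subbands)):
        # S3, eq. (12): smoothed clean-power-to-noise ratio in dB.
        # Clean power is approximated as subband power minus the noise
        # variance (the patent solves it from the adjacent-frame
        # correlation matrix instead).
        p_cur = max(np.mean(np.abs(cur) ** 2) - noise_var[m], 1e-12)
        p_prev = max(np.mean(np.abs(prev) ** 2) - noise_var[m], 1e-12)
        snr_db = 10.0 * np.log10(
            (alpha * p_cur + (1.0 - alpha) * p_prev) / noise_var[m])
        # S4 placeholder: a Wiener-like gain in place of eq. (13).
        snr_lin = 10.0 ** (snr_db / 10.0)
        gain = snr_lin / (1.0 + snr_lin)
        # S5: attenuate the subband signal by its gain.
        enhanced.append(gain * cur)
    # S6 stand-in: concatenate subbands and invert the FFT (the patent
    # upsamples each channel and runs a synthesis filter bank).
    return np.fft.irfft(np.concatenate(enhanced), n=len(frame)), subbands

# Usage: two 64-sample frames of a tone in white noise.
rng = np.random.default_rng(0)
n = np.arange(64)
make = lambda: np.sin(2 * np.pi * 4 * n / 64) + 0.1 * rng.standard_normal(64)
prev = np.array_split(np.fft.rfft(make()), 4)
frame = make()
y, _ = denoise_frame(frame, prev, noise_var=np.full(4, 0.01))
```

Since every gain lies in (0, 1), the enhanced frame never gains energy relative to the input, which is the behavior expected of a pure attenuation stage.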
3. The digital hearing aid noise reduction method based on improved subband SNR estimation as claimed in claim 2, characterized in that the subband SNR estimation in step S3 is specifically:
S31. Define the i-th and (i−1)-th frame signals at the k-th frequency point as the adjacent-signal vector X_adjoin(i, k), and compute the autocorrelation matrix R(i, k) of the adjacent-signal vector;
S32. Use the L_m spectral points in each subband to estimate the entries E[X²(i, m)], E[X²(i−1, m)] and E[X(i, m)X(i−1, m)] of the correlation matrix R(i, m), where E[X²(i, m)] and E[X²(i−1, m)] are the mean-square values of the i-th and (i−1)-th frames in the m-th subband respectively, and E[X(i, m)X(i−1, m)] denotes the cross-correlation function of the i-th and (i−1)-th frames;
S33. Solve for the clean speech powers P(i, m) and P(i−1, m) of the two adjacent frames in the m-th subband, together with the noise variance σ²(m) of the m-th subband, in order to obtain the SNR estimate SNR(i, m) of the i-th frame of the m-th subband.
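One way to read S31 through S33 concretely (an interpretation, since the solving step is not spelled out in this excerpt): if the noise is assumed uncorrelated between adjacent frames while the clean speech is strongly correlated, the off-diagonal cross term of R(i, m) carries only the speech power, and the diagonal carries speech power plus noise variance. The sketch below builds R(i, m) per equation (2) and extracts P and σ²(m) under that assumption:

```python
import numpy as np

def correlation_matrix(cur, prev):
    """Eq. (2): 2x2 adjacent-frame correlation matrix of one subband,
    averaged over its L_m spectral points."""
    e_cur = np.mean(np.abs(cur) ** 2)              # E[X^2(i, m)]
    e_prev = np.mean(np.abs(prev) ** 2)            # E[X^2(i-1, m)]
    cross = np.mean(np.real(cur * np.conj(prev)))  # E[X(i,m)X(i-1,m)]
    return np.array([[e_cur, cross], [cross, e_prev]])

def powers_from_r(R):
    """S33 under the stated assumption: the cross term approximates the
    clean speech power; each diagonal entry is speech power plus noise
    variance."""
    p_cur = max(abs(R[0, 1]), 1e-12)      # P(i, m) estimate
    sigma2 = max(R[0, 0] - p_cur, 1e-12)  # sigma^2(m) estimate
    p_prev = max(R[1, 1] - sigma2, 1e-12) # P(i-1, m) estimate
    return p_cur, p_prev, sigma2

def subband_snr_db(cur, prev, alpha=0.7):
    """Eq. (12) built on the estimates above."""
    p_cur, p_prev, sigma2 = powers_from_r(correlation_matrix(cur, prev))
    return 10.0 * np.log10((alpha * p_cur + (1.0 - alpha) * p_prev) / sigma2)
```

Because only averages over the L_m subband points are needed, this estimator avoids a full FFT/IFFT round trip, matching the complexity claim in the description.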
4. The digital hearing aid noise reduction method based on improved subband SNR estimation as claimed in claim 3, characterized in that in step S31 the autocorrelation matrix of the adjacent-signal vector is computed as

R(i, k) = E[X_adjoin(i, k)·X_adjoin(i, k)^H]
        = [ E(X²(i, k))            E(X(i, k)X(i−1, k)) ]
          [ E(X(i, k)X(i−1, k))    E(X²(i−1, k))       ]    (1)

where the adjacent-signal vector X_adjoin(i, k) = {X(i, k), X(i−1, k)}^T, E(X²(i, k)) and E(X²(i−1, k)) are the mean-square values of the i-th and (i−1)-th frames at the k-th frequency point respectively, and E(X(i, k)X(i−1, k)) denotes the cross-correlation function of the i-th and (i−1)-th frames.
5. The digital hearing aid noise reduction method based on improved subband SNR estimation as claimed in claim 3, characterized in that in step S32, after the signal has been decomposed into M subbands by the analysis filter, the adjacent-frame correlation matrix R(i, m) of the m-th subband is estimated from the L_m spectral points contained in each subband:

R(i, m) = [ E[X²(i, m)]            E[X(i, m)X(i−1, m)] ]
          [ E[X(i, m)X(i−1, m)]    E[X²(i−1, m)]       ]    (2)

In formula (2), E[X²(i, m)] and E[X²(i−1, m)] are the mean-square values of the i-th and (i−1)-th frames in the m-th subband respectively, and E[X(i, m)X(i−1, m)] denotes the cross-correlation function of the i-th and (i−1)-th frames.
6. The digital hearing aid noise reduction method based on improved subband SNR estimation as claimed in any one of claims 1 to 5, characterized in that in step S4 the gain function is

g_dB(i, m) = λ_im(SNR)·f(x_dB(n))    (13)

where g_dB(i, m) denotes the gain of the i-th frame of the speech signal in the m-th subband.
7. The digital hearing aid noise reduction method based on improved subband SNR estimation as claimed in any one of claims 1 to 5, characterized in that in step S4 the gain g_dB(i, m) of the i-th frame of the speech signal in the m-th subband is transformed to the amplitude-domain gain g(i, m):

g(i, m) = 10^(g_dB(i, m)/20)    (14).
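The conversion in equation (14), reconstructed here as the standard 20·log10 amplitude convention since the published formula is garbled in this copy, amounts to a one-liner:

```python
def db_to_amplitude(g_db):
    """Eq. (14) as reconstructed: amplitude-domain gain from a dB gain,
    assuming the usual 20*log10 amplitude convention."""
    return 10.0 ** (g_db / 20.0)

# A 0 dB gain passes the subband through unchanged;
# a -20 dB gain attenuates the amplitude tenfold.
```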
CN201610150663.0A 2016-03-16 2016-03-16 Based on the digital deaf-aid noise-reduction method for improving subband signal-to-noise ratio (SNR) estimation Active CN105679330B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610150663.0A CN105679330B (en) 2016-03-16 2016-03-16 Based on the digital deaf-aid noise-reduction method for improving subband signal-to-noise ratio (SNR) estimation


Publications (2)

Publication Number Publication Date
CN105679330A true CN105679330A (en) 2016-06-15
CN105679330B CN105679330B (en) 2019-11-29

Family

ID=56310634

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610150663.0A Active CN105679330B (en) 2016-03-16 2016-03-16 Based on the digital deaf-aid noise-reduction method for improving subband signal-to-noise ratio (SNR) estimation

Country Status (1)

Country Link
CN (1) CN105679330B (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101582264A (en) * 2009-06-12 2009-11-18 瑞声声学科技(深圳)有限公司 Method and voice collecting system for speech enhancement
CN102004264A (en) * 2010-10-18 2011-04-06 中国石油化工股份有限公司 Quantitative analysis and evaluation method for quality of acquired seismic data
US20110222372A1 (en) * 2010-03-12 2011-09-15 University Of Maryland Method and system for dereverberation of signals propagating in reverberative environments
CN103607363A (en) * 2013-12-04 2014-02-26 北京邮电大学 Blind estimation method of signal to noise ratio
CN104917706A (en) * 2014-03-10 2015-09-16 联想(北京)有限公司 Signal-to-Noise Ratio estimation method and electronic equipment
CN105390142A (en) * 2015-12-17 2016-03-09 广州大学 Digital hearing aid voice noise elimination method


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wang Hongling: "Research on Signal-to-Noise Ratio Estimation Methods for Seismic Records", China Master's Theses Full-text Database, Basic Sciences Series *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106301553A (en) * 2016-08-15 2017-01-04 北京邮电大学 Determine interferometer both arms time delayed difference value method, OSNR Monitoring Method and device
CN107680610A (en) * 2017-09-27 2018-02-09 安徽硕威智能科技有限公司 A kind of speech-enhancement system and method
CN107889006A (en) * 2017-10-11 2018-04-06 恒玄科技(上海)有限公司 A kind of active noise reduction system of flexible modulation de-noising signal delay
CN108257617A (en) * 2018-01-11 2018-07-06 会听声学科技(北京)有限公司 A kind of noise scenarios identifying system and method
CN108257617B (en) * 2018-01-11 2021-01-19 会听声学科技(北京)有限公司 Noise scene recognition system and method
CN108281154A (en) * 2018-01-24 2018-07-13 成都创信特电子技术有限公司 A kind of noise-reduction method of voice signal
CN108281154B (en) * 2018-01-24 2021-05-18 成都创信特电子技术有限公司 Noise reduction method for voice signal
CN110473567A (en) * 2019-09-06 2019-11-19 上海又为智能科技有限公司 Audio-frequency processing method, device and storage medium based on deep neural network
CN110473567B (en) * 2019-09-06 2021-09-14 上海又为智能科技有限公司 Audio processing method and device based on deep neural network and storage medium
CN112885375A (en) * 2021-01-08 2021-06-01 天津大学 Global signal-to-noise ratio estimation method based on auditory filter bank and convolutional neural network

Also Published As

Publication number Publication date
CN105679330B (en) 2019-11-29

Similar Documents

Publication Publication Date Title
CN105679330A (en) Digital hearing aid noise reduction method based on improved sub-band signal-to-noise ratio estimation
CN105741849B (en) The sound enhancement method of phase estimation and human hearing characteristic is merged in digital deaf-aid
CN110085249B (en) Single-channel speech enhancement method of recurrent neural network based on attention gating
CN108831499B (en) Speech enhancement method using speech existence probability
CN101976566B (en) Voice enhancement method and device using same
CN110600050B (en) Microphone array voice enhancement method and system based on deep neural network
CN101901602B (en) Method for reducing noise by using hearing threshold of impaired hearing
CN108831495A (en) A kind of sound enhancement method applied to speech recognition under noise circumstance
US8880396B1 (en) Spectrum reconstruction for automatic speech recognition
CN105390142B (en) A kind of digital deaf-aid voice noise removing method
CN102157156B (en) Single-channel voice enhancement method and system
WO2011137258A1 (en) Multi-microphone robust noise suppression
CN106340292A (en) Voice enhancement method based on continuous noise estimation
CN108986832B (en) Binaural voice dereverberation method and device based on voice occurrence probability and consistency
CN102347028A (en) Double-microphone speech enhancer and speech enhancement method thereof
TW201248613A (en) System and method for monaural audio processing based preserving speech information
CN110310656A (en) A kind of sound enhancement method
CN113129918B (en) Voice dereverberation method combining beam forming and deep complex U-Net network
CN109961799A (en) A kind of hearing aid multicenter voice enhancing algorithm based on Iterative Wiener Filtering
CN109147808A (en) A kind of Speech enhancement hearing-aid method
CN113160845A (en) Speech enhancement algorithm based on speech existence probability and auditory masking effect
CN115424627A (en) Voice enhancement hybrid processing method based on convolution cycle network and WPE algorithm
CN105869649A (en) Perceptual filtering method and perceptual filter
CN115312073A (en) Low-complexity residual echo suppression method combining signal processing and deep neural network
Xu et al. Adaptive speech enhancement algorithm based on first-order differential microphone array

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant