CN103824563A

CN103824563A - Hearing aid denoising device and method based on module multiplexing

Info

Publication number: CN103824563A
Application number: CN201410059379.3A
Authority: CN
Inventors: 薛风杰
Original assignee: Shenzhen Micro & Nano Integrated Circuit And System Application Institute
Current assignee: Shenzhen Micro & Nano Integrated Circuit And System Application Institute
Priority date: 2014-02-21
Filing date: 2014-02-21
Publication date: 2014-05-28

Abstract

The invention provides a hearing aid denoising device and method based on module multiplexing. The hearing aid denoising device comprises a spectrum estimation and endpoint detection module, a spectrum averaging module, a Wiener filtering module and an inverse fast Fourier transform module. The spectrum estimation and endpoint detection module obtains the power spectrum of an input frame and judges whether the current frame is a speech frame or a noise frame, so that spectrum estimation and endpoint detection share one module; the spectrum averaging module averages the power spectral densities of two adjacent frames to obtain a time-smoothed power spectrum; the Wiener filtering module performs Wiener filtering on the current frame; and the inverse fast Fourier transform module converts the processed signal from the frequency domain back to the time domain to obtain the denoised signal. With this device and method, spectrum estimation and voice activity detection share one module, which reduces both the amount of computation and the hardware power consumption.

Description

A hearing aid denoising device and method based on module reuse

Technical field

The present invention relates to a denoising device and method for hearing aids, and more particularly to a hearing aid denoising device and method based on module reuse.

Background technology

With the development of modern science and technology, digital hearing aids, with their powerful signal processing capability, are gradually being accepted by more and more people who are deaf or hard of hearing. In noisy environments, however, the speech intelligibility experienced by a hearing aid wearer drops sharply. A noise suppression circuit module is therefore particularly important for speech processing in hearing aids. Commonly used denoising methods currently include spectral subtraction, Wiener filtering and subspace speech enhancement. Of these, Wiener filtering is the most widely used; it can remove background noise, white noise and some musical noise in the environment.

One commonly used Wiener filtering algorithm is that of the European Telecommunications Standards Institute (ETSI) standard ETSI ES 202 050 V1.1.5 (2007-01). Fig. 1 is a flow block diagram of the two-stage Wiener filtering of the prior art. As shown in Fig. 1, two-stage Mel-warped Wiener filtering applies a Mel-domain triangular filter bank to convert the Wiener filter coefficients to the Mel domain, which is related to speech perception, and then filters the signal.

In the first stage, the spectrum estimation module computes the spectrum of the input frame, the spectrum averaging module obtains a time-smoothed power spectrum by averaging the power spectra of adjacent frames, and the voice activity detection (Voice Activity Detection, VADNest) module (also referred to below as endpoint detection) judges whether the current frame is a speech frame or a pure noise frame. After the Wiener filtering module has computed the linear-frequency filter coefficients, the Mel filter bank module smooths them to obtain the Mel-warped Wiener filter coefficients, and a Mel IDCT operation then yields the time-domain impulse response of the Mel-warped Wiener filter. The filtering module then convolves the filter impulse response with the input speech signal, thereby carrying out the Wiener filtering.

Spectrum estimation requires a 256-point fast Fourier transform (FFT), after which the frequency-domain result is squared to obtain the spectrum. Fig. 2 is a flow block diagram of the endpoint detection (VADNest) in the two-stage Wiener filtering of the prior art, where the formula for FRAME_EN is:

As shown in Fig. 1, the second-stage Wiener filtering differs from the first stage in that, after the Mel filter bank module has obtained the filter coefficient of each Mel-scale sub-band, the gain adjustment module applies gain processing to the coefficients. Gain processing applies deeper noise removal to signal frames with a lower signal-to-noise ratio (SNR); for signal frames with a higher SNR, it lowers the filter coefficient gain to reduce the influence of the filter, thereby reducing the depth of noise removal. This processing further reduces the amplitude of the noise signal while retaining as much of the speech signal as possible, which helps improve recognition accuracy. In addition, a DC removal module eliminates the DC component.

As can be seen from Fig. 1 and Fig. 2, spectrum estimation requires a 256-point FFT to obtain the spectrum, and VADNest requires squaring every sample and taking a logarithm. This is not much of a burden for software, but in hardware both the FFT and the squaring operations take a long time and consume a great deal of power. For hearing aid products powered by a button cell in particular, such a cost is hardly worthwhile and can even be fatal to the product.

Summary of the invention

An embodiment of the present invention provides a hearing aid denoising device based on module reuse, comprising: a spectrum estimation and endpoint detection module for obtaining the power spectrum of an input frame and judging whether the current frame is a speech frame or a noise frame, so that spectrum estimation and endpoint detection share one module; a spectrum averaging module, coupled to the spectrum estimation and endpoint detection module, for averaging the power spectral densities of two adjacent frames to obtain a time-smoothed power spectrum; a Wiener filtering module, coupled to the spectrum averaging module, for performing Wiener filtering on the current frame; and an inverse fast Fourier transform module, coupled to the Wiener filtering module, for converting the processed signal from the frequency domain back to the time domain to obtain the denoised signal.

An embodiment of the present invention also provides a hearing aid denoising method based on module reuse, comprising: obtaining the power spectrum of an input frame and judging whether the current frame is a speech frame or a noise frame, so that spectrum estimation and endpoint detection share one module; averaging the power spectral densities of two adjacent frames to obtain a time-smoothed power spectrum; performing Wiener filtering on the current frame; and converting the processed signal from the frequency domain back to the time domain to obtain the denoised signal.

The hearing aid denoising device and method based on module reuse provided by the present invention allow spectrum estimation and voice activity detection to share one module, so that the function of the algorithm is achieved with less computation and lower hardware power consumption, and the algorithm becomes both simpler and more effective.

Brief description of the drawings

To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flow block diagram of the two-stage Wiener filtering of the prior art.

Fig. 2 is a flow block diagram of the endpoint detection in the two-stage Wiener filtering of the prior art.

Fig. 3 is a structural diagram of a hearing aid denoising device based on module reuse provided by one embodiment of the present invention.

Fig. 4 is a flow diagram of a hearing aid denoising method based on module reuse provided by one embodiment of the present invention.

Detailed description of the embodiments

To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described herein serve only to explain the present invention and are not intended to limit it.

Fig. 3 is a structural diagram of a hearing aid denoising device 300 based on module reuse provided by one embodiment of the present invention. The hearing aid denoising device 300 comprises a spectrum estimation and endpoint detection module 302, a spectrum averaging module 304, a Wiener filtering module 306 and an inverse fast Fourier transform (IFFT) module 308.

The spectrum estimation and endpoint detection module 302 allows spectrum estimation and voice activity detection to share one module. On the one hand, the module 302 can divide the input signal into overlapping frames, transform each frame from the time domain to the frequency domain by a fast Fourier transform (FFT), and obtain the power spectrum of the input frame. On the other hand, the module 302 can also judge whether the current frame (for example, a 20 ms frame) is a speech frame or a noise frame. In one embodiment, the module 302 makes this judgment with the same algorithm as spectrum estimation; that is, it uses a 256-point FFT for both spectrum estimation and endpoint detection. Specifically, when performing spectrum estimation, the module 302 takes the FFT result directly and squares it to obtain the spectrum; when performing endpoint detection, it combines several frequency components, computes the mean of the spectrum, and from that obtains the variance of the frequency bands. Because noise usually fluctuates only slightly, speech frames and noise frames can be distinguished by comparing the band variance with a preset threshold. If the band variance is greater than the preset threshold, the current frame is judged to be a speech frame (such as speech, music or signal tones). Conversely, if the band variance is not greater than the preset threshold, the current frame is judged to be a noise frame (silent frame).
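The shared-FFT idea described above can be illustrated with a minimal Python sketch. It assumes that the "band variance" is the variance across the mean energies of equal-width groups of FFT bins; the function names, the number of bands (`bands=8`) and the `threshold` value are hypothetical choices for illustration and are not specified in the patent.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def spectrum_and_vad(frame, bands=8, threshold=1.0):
    """Return (power_spectrum, is_speech) from ONE FFT of the frame.

    Assumption: bands and threshold are illustrative values only.
    """
    spec = fft(frame)                      # single FFT shared by both tasks
    half = len(frame) // 2
    power = [abs(c) ** 2 for c in spec[:half]]   # squared FFT magnitude
    width = half // bands
    # Combine neighbouring frequency components into bands (mean energy).
    band_energy = [sum(power[b * width:(b + 1) * width]) / width
                   for b in range(bands)]
    mean = sum(band_energy) / bands
    variance = sum((e - mean) ** 2 for e in band_energy) / bands
    # Flat spectra (small variance) are treated as noise frames.
    return power, variance > threshold
```

Under these assumptions, a pure tone concentrates its energy in one band and yields a large band variance, whereas a frame with a flat spectrum (for example a single impulse) yields a variance close to zero and is classified as a noise frame. In practice the threshold would have to be tuned to the signal scale.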

The spectrum averaging module 304 averages the power spectral densities of two adjacent frames to obtain a time-smoothed power spectrum. In one embodiment, the Wiener filtering module 306 can perform two-stage Wiener filtering. In the noise estimation of the first filtering stage, the estimate is updated during non-speech segments according to the detection result of the spectrum estimation and endpoint detection module 302. In the noise estimation of the second filtering stage, the estimate is updated using the correlation between speech and noise. The IFFT module 308 converts the processed signal from the frequency domain back to the time domain to obtain the denoised signal. Although the more specific operations of the spectrum averaging module 304, the Wiener filtering module 306 and the IFFT module 308 are not described in detail herein, those skilled in the art will understand that these modules can be implemented with any known suitable technique or combination of techniques.

Fig. 4 is a flow diagram of a hearing aid denoising method 400 based on module reuse provided by one embodiment of the present invention. Fig. 4 is described below with reference to Fig. 3.

In step S402, the input signal is divided into frames, the power spectrum of the input frame is obtained, and endpoint detection (also called voice activity detection) is performed. For example, the spectrum estimation and endpoint detection module 302 in Fig. 3 uses a 256-point FFT for both spectrum estimation and endpoint detection. Specifically, when performing spectrum estimation, the module 302 takes the FFT result directly and squares it to obtain the spectrum; when performing endpoint detection, it combines several frequency components, computes the mean of the spectrum, and from that obtains the variance of the frequency bands. Because noise usually fluctuates only slightly, speech frames and noise frames can be distinguished by comparing the band variance with a preset threshold. If the band variance is greater than the preset threshold, the current frame is judged to be a speech frame (such as speech, music or signal tones). Conversely, if the band variance is not greater than the preset threshold, the current frame is judged to be a noise frame (silent frame).
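The framing part of step S402 might look like the following Python sketch. The frame length of 256 samples matches the 256-point FFT mentioned in the text, but the 50% overlap (`hop=128`), the function name and the decision to drop trailing samples that do not fill a whole frame are assumptions made for illustration.

```python
def split_into_frames(signal, frame_len=256, hop=128):
    """Split a sample sequence into overlapping frames.

    Assumptions: hop=128 (50% overlap) is illustrative; trailing samples
    that do not fill a complete frame are dropped.
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    last_start = len(signal) - frame_len
    # range() is empty when the signal is shorter than one frame.
    return [list(signal[s:s + frame_len])
            for s in range(0, last_start + 1, hop)]
```

With these defaults, a 512-sample input yields three frames starting at samples 0, 128 and 256, and each frame can then be passed to the shared FFT for spectrum estimation and endpoint detection.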

In step S404, an averaging operation is performed to obtain a time-smoothed power spectrum. For example, the spectrum averaging module 304 in Fig. 3 averages the power spectral densities of two adjacent frames to obtain the time-smoothed power spectrum.
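A minimal sketch of this averaging step, assuming that each power spectrum is stored as a list of per-bin values and that the two frames are weighted equally (the function name is hypothetical):

```python
def smooth_power_spectrum(prev_power, curr_power):
    """Average the power spectra of two adjacent frames bin by bin."""
    if len(prev_power) != len(curr_power):
        raise ValueError("both frames must have the same number of bins")
    return [(p + c) / 2.0 for p, c in zip(prev_power, curr_power)]
```

For example, `smooth_power_spectrum([1.0, 3.0], [3.0, 5.0])` returns `[2.0, 4.0]`.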

In step S406, Wiener filtering is performed. For example, the Wiener filtering module 306 in Fig. 3 can perform two-stage Wiener filtering. In the noise estimation of the first filtering stage, the estimate is updated during non-speech segments according to the detection result of the spectrum estimation and endpoint detection module 302. In the noise estimation of the second filtering stage, the estimate is updated using the correlation between speech and noise.
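The patent does not give the filter equations, so the following Python sketch only illustrates the first-stage idea under stated assumptions: the noise estimate is updated by recursive averaging on frames that endpoint detection flags as noise, and the per-bin gain uses the textbook Wiener form H = SNR / (1 + SNR) with the SNR estimated as max(P/N - 1, 0). The smoothing factor `alpha`, the gain `floor` and the function names are hypothetical, and the second-stage update based on speech/noise correlation is not modelled.

```python
def update_noise(noise_power, frame_power, is_speech, alpha=0.9):
    """Update the noise estimate only on frames flagged as noise.

    Assumption: recursive averaging with factor alpha (illustrative).
    """
    if is_speech:
        return list(noise_power)           # keep the estimate unchanged
    return [alpha * n + (1.0 - alpha) * p
            for n, p in zip(noise_power, frame_power)]


def wiener_gain(frame_power, noise_power, floor=0.1):
    """Per-bin Wiener gain H = SNR / (1 + SNR), limited below by floor."""
    gains = []
    for p, n in zip(frame_power, noise_power):
        if n > 0.0:
            snr = max(p / n - 1.0, 0.0)    # estimated SNR, never negative
            h = snr / (1.0 + snr)
        else:
            h = 1.0                        # no noise estimated: pass through
        gains.append(max(h, floor))
    return gains
```

For instance, with a smoothed frame power of `[4.0, 1.0]` and a noise estimate of `[1.0, 1.0]`, the first bin gets a gain of 0.75 and the second bin is limited to the floor of 0.1.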

In step S408, the processed signal is converted from the frequency domain back to the time domain to obtain the denoised signal. For example, the IFFT module 308 in Fig. 3 converts the processed signal from the frequency domain back to the time domain, thereby obtaining the denoised signal.
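As a rough illustration of this last step, the Python sketch below applies per-bin gains to a frame's spectrum and returns to the time domain with an inverse FFT built on the conjugation identity ifft(X) = conj(fft(conj(X))) / N. The function names and the convention that `gains` holds one value for each of the bins 0 to N/2 are assumptions; how the resulting frames are recombined into a continuous output is not described in the text and is not shown.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(spectrum):
    """Inverse FFT via conj(fft(conj(X))) / N."""
    n = len(spectrum)
    y = fft([c.conjugate() for c in spectrum])
    return [v.conjugate() / n for v in y]


def denoise_frame(frame, gains):
    """Scale each frequency bin by its gain and return the real time-domain frame.

    Assumption: gains has N/2 + 1 entries (bins 0..N/2); each gain is mirrored
    onto the matching negative-frequency bin so the output stays real.
    """
    n = len(frame)
    if len(gains) != n // 2 + 1:
        raise ValueError("gains must have len(frame) // 2 + 1 entries")
    spec = fft(frame)
    for k in range(n):
        spec[k] *= gains[min(k, n - k)]
    return [v.real for v in ifft(spec)]
```

With all gains equal to 1.0 the frame is reproduced up to rounding error, which is a convenient sanity check before plugging in real filter gains.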

Advantageously, the hearing aid denoising device and method based on module reuse provided by the present invention allow spectrum estimation and voice activity detection to share one module, so that the function of the algorithm is achieved with less computation and lower hardware power consumption, and the algorithm becomes both simpler and more effective.

The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A hearing aid denoising device based on module reuse, comprising:

a spectrum estimation and endpoint detection module for obtaining the power spectrum of an input frame and judging whether the current frame is a speech frame or a noise frame, so that spectrum estimation and endpoint detection share one module;

a spectrum averaging module, coupled to said spectrum estimation and endpoint detection module, for averaging the power spectral densities of two adjacent frames to obtain a time-smoothed power spectrum;

a Wiener filtering module, coupled to said spectrum averaging module, for performing Wiener filtering on said current frame; and

an inverse fast Fourier transform module, coupled to said Wiener filtering module, for converting the processed signal from the frequency domain back to the time domain to obtain the denoised signal.

2. The hearing aid denoising device as claimed in claim 1, characterized in that said spectrum estimation and endpoint detection module uses a 256-point fast Fourier transform for both spectrum estimation and endpoint detection.

3. The hearing aid denoising device as claimed in claim 2, characterized in that said spectrum estimation and endpoint detection module takes the result of the fast Fourier transform directly and squares it to obtain the spectrum.

4. The hearing aid denoising device as claimed in claim 2, characterized in that said spectrum estimation and endpoint detection module distinguishes speech frames from noise frames by comparing the variance of the frequency bands with a preset threshold.

5. The hearing aid denoising device as claimed in claim 4, characterized in that, if the variance of said frequency bands is greater than said preset threshold, said current frame is judged to be a speech frame, and if the variance of said frequency bands is not greater than said preset threshold, said current frame is judged to be a noise frame.

6. A hearing aid denoising method based on module reuse, comprising:

obtaining the power spectrum of an input frame and judging whether the current frame is a speech frame or a noise frame, so that spectrum estimation and endpoint detection share one module;

averaging the power spectral densities of two adjacent frames to obtain a time-smoothed power spectrum;

performing Wiener filtering on said current frame; and

converting the processed signal from the frequency domain back to the time domain to obtain the denoised signal.

7. The hearing aid denoising method as claimed in claim 6, characterized in that said step of obtaining the power spectrum of the input frame and judging whether the current frame is a speech frame or a noise frame comprises using a 256-point fast Fourier transform for both spectrum estimation and endpoint detection.

8. The hearing aid denoising method as claimed in claim 7, characterized in that said step of obtaining the power spectrum of the input frame comprises taking the result of the fast Fourier transform directly and squaring it to obtain the spectrum.

9. The hearing aid denoising method as claimed in claim 7, characterized in that said step of judging whether the current frame is a speech frame or a noise frame comprises comparing the variance of the frequency bands with a preset threshold.

10. The hearing aid denoising method as claimed in claim 9, characterized in that, if the variance of said frequency bands is greater than said preset threshold, said current frame is judged to be a speech frame, and if the variance of said frequency bands is not greater than said preset threshold, said current frame is judged to be a noise frame.