CN1212608C - A multichannel speech enhancement method using postfilter - Google Patents
A multichannel speech enhancement method using postfilter
- Publication number
- CN1212608C (application number CNB031570747A / CN03157074A)
- Authority
- CN
- China
- Prior art keywords
- power spectrum
- signal
- noise
- auto
- signals
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Abstract
The invention discloses a speech enhancement method that uses a postfilter to enhance multichannel speech signals. The method comprises the following steps: 1. the time delay of the speech signal in each channel is calculated; 2. the channel signals are aligned in the time domain by delay compensation; 3. a beamformer performs beamforming on the channel signals; 4. the auto-power spectrum of the clean speech signal and the auto-power spectrum of the noisy signal are estimated and the frequency response function of a Wiener filter is obtained, wherein the clean-speech auto-power spectrum estimate is obtained by removing the noise cross-power spectrum estimate from the noisy-signal cross-power spectrum estimate; 5. the Wiener postfilter filters the output beam of the beamformer, enhancing the speech. Because the invention takes the correlation between the channel noises into account, it matches actual conditions more closely, removes noise effectively, particularly in the low-frequency band, and improves the speech enhancement result.
Description
Technical field
The present invention relates to the field of computer speech signal processing and, more particularly, to a multichannel speech enhancement method that uses a postfilter.
Background technology
Speech enhancement is a selective signal-processing technique. Its main task is to extract a target speech signal that is as clean as possible from speech contaminated in various ways. Its purpose is to improve the perceived quality and intelligibility of speech, and it is used in communications, hearing aids, monitoring, audio-visual conferencing and similar fields. In addition, with the development of speech recognition technology, very high recognition rates can be reached in quiet environments, but recognition rates degrade severely in noisy environments. Speech enhancement as a front-end processing step for speech recognition is therefore a very active and important research direction worldwide.
According to the number of microphones that pick up the speech signal, speech enhancement is divided into single-channel and multichannel types. A single-channel speech enhancement system needs only one microphone and has low hardware requirements and low algorithmic complexity, but its noise-reduction performance is limited. A multichannel speech enhancement system uses a microphone array; the multichannel signals contain rich spatial and temporal information, which leaves more room for performance improvement. Microphone-array speech enhancement has therefore been a research focus since the 1990s.
The typical workflow of a multichannel speech enhancement method using a microphone array can be summarized as follows:
1) First, a time-delay estimation algorithm (such as the generalized cross-correlation function or an adaptive time-delay estimation algorithm) is used to obtain the time delay of the speech signal between channels. Accurate estimation of the signal delay is the basis of multichannel speech enhancement.
2) Then the channel signals are aligned in the time domain by delay compensation.
3) A beamformer performs beamforming on the channel signals.
4) A postfilter (a Wiener filter) filters the output beam of the beamformer, enhancing the speech.
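Steps 1) to 3) can be sketched in a few lines. This is a minimal illustration rather than the patent's implementation: it assumes integer-sample delays, uses a plain unweighted cross-correlation instead of a particular generalized cross-correlation weighting, and realizes the beamformer as a simple delay-and-sum average. The function names `estimate_delay` and `delay_and_sum` and the parameter `max_lag` are hypothetical.

```python
import numpy as np

def estimate_delay(ref, sig, max_lag):
    """Integer lag (in samples) by which `sig` trails `ref`.

    Found as the peak of an FFT-based cross-correlation (no GCC weighting).
    `max_lag` must be at least 1.
    """
    n = len(ref) + len(sig)                      # zero-pad to avoid circular wrap
    cross = np.fft.rfft(sig, n) * np.conj(np.fft.rfft(ref, n))
    cc = np.fft.irfft(cross, n)
    # Reorder so that index 0 corresponds to lag -max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(cc)) - max_lag

def delay_and_sum(channels, max_lag=64):
    """Align every channel to channel 0, then average (delay-and-sum)."""
    ref = channels[0]
    aligned = []
    for ch in channels:
        d = estimate_delay(ref, ch, max_lag)
        # Circular shift for brevity; samples at the frame edge wrap around.
        aligned.append(np.roll(ch, -d))
    return np.mean(aligned, axis=0), aligned
```

A real system would typically estimate fractional delays and handle frame edges explicitly; the circular shift here only keeps the sketch short.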
In step 4) above, filtering the output beam of the beamformer requires the frequency response function of the Wiener filter.
First, the microphone signals $x_i(t)$ and $x_j(t)$ before time-delay removal are modeled as the combination of the sound source $s(t)$ and additive noise $n(t)$:
$$x_i(t) = s(t - \tau_i) + n_i(t) \qquad (1)$$
$$x_j(t) = s(t - \tau_j) + n_j(t) \qquad (2)$$
where $i$ and $j$ are the microphone/channel indices, and $\tau_i$, $\tau_j$ are the propagation times (that is, the time delays) from the sound source to the microphones.
The frequency response function of the Wiener filter has the form:
$$H(f) = \frac{\phi_{ss}(f)}{\phi_{xx}(f)} \qquad (3)$$
where $\phi_{ss}(f)$ is the auto-power spectrum of the ideal clean speech signal $s(t)$ and $\phi_{xx}(f)$ is the auto-power spectrum of the noisy signal $s(t) + n(t)$. The auto-power spectrum of the noisy signal can be computed directly from the measured microphone signals, but the auto-power spectrum of the clean speech signal cannot be known a priori, particularly because speech is a non-stationary signal whose power spectrum changes constantly. The key to the Wiener filter is therefore to estimate, as accurately as possible, the power spectrum of the clean speech contained in the noisy speech of each channel, and to derive the Wiener filter frequency response function from that power spectrum. Zelinski solved this problem fairly well by exploiting multichannel information. He first assumed:
1. The signal and the background noise are uncorrelated.
2. The noise recorded in different channels is mutually uncorrelated.
3. The noise power spectrum recorded in every channel is the same.
Thus, after neglecting the correlation between the signal and the background noise and the cross-correlation between the noises, one obtains
$$\phi_{x_i x_j}(f) = \phi_{ss}(f) \qquad (4)$$
where $\phi_{x_i x_j}(f)$ is the cross-power spectrum of the noisy signals $x_i$ and $x_j$. Substituting formula (4) into formula (3) gives the Wiener filter frequency response function. Averaging the spectral densities over all possible microphone combinations gives a more accurate estimate:
$$H(f) = \frac{\dfrac{2}{N(N-1)} \displaystyle\sum_{i=1}^{N-1} \sum_{j=i+1}^{N} R\{\phi_{x_i x_j}(f)\}}{\dfrac{1}{N} \displaystyle\sum_{i=1}^{N} \phi_{x_i x_i}(f)} \qquad (5)$$
where $N$ is the number of channels/microphones and the operator $R\{\cdot\}$ takes the real part, because a signal auto-power spectrum must be real.
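The Zelinski post-filter described above (the real part of the pair-averaged cross-power spectrum divided by the channel-averaged auto-power spectrum) can be sketched as follows. The sketch assumes time-aligned STFT frames stored in an array of shape (channels, frames, bins) and estimates the power spectra by plain frame averaging. The name `zelinski_gain`, the regularizer `eps` and the clipping of the gain to [0, 1] are illustrative assumptions, not part of the source.

```python
import numpy as np
from itertools import combinations

def zelinski_gain(X, eps=1e-12):
    """Zelinski-style Wiener gain from aligned STFT frames.

    X: complex array of shape (channels, frames, bins).
    """
    n_ch = X.shape[0]
    # Noisy auto-power spectrum, averaged over channels and frames.
    auto = np.mean(np.abs(X) ** 2, axis=(0, 1))
    # Real part of the cross-power spectrum, averaged over all channel pairs.
    pairs = list(combinations(range(n_ch), 2))
    cross = sum(
        np.mean(np.real(X[i] * np.conj(X[j])), axis=0) for i, j in pairs
    ) / len(pairs)
    # Assumption: clip to [0, 1] so the result is a valid attenuation.
    return np.clip(cross / (auto + eps), 0.0, 1.0)
```

With identical channels the gain is close to 1, and with independent noise in every channel it approaches 0, which is the behaviour the three assumptions predict.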
This method, however, rests on the assumption that the noise recorded in different channels is uncorrelated. The cross-correlation of the channel noises can be neglected only at high frequencies; at low frequencies it is significant and cannot be ignored, so the method is not practical there. An algorithm suited to the low-frequency case is therefore needed.
Summary of the invention
The object of the invention is to overcome the shortcoming that existing multichannel speech enhancement methods are suitable only for high frequencies, and to provide a multichannel speech enhancement method using a postfilter that takes the cross-correlation of the inter-channel noise signals into account.
To achieve this object, the invention provides a speech enhancement method using a postfilter, for enhancing multichannel speech signals, comprising the following steps:
1) Calculate the time delay of the speech signal in each channel.
2) Align the channel signals in the time domain by delay compensation.
3) Perform beamforming on the channel signals with a beamformer.
4) Estimate the auto-power spectrum of the clean speech signal and the auto-power spectrum of the noisy signal, and obtain the frequency response function of the Wiener filter.
The auto-power spectrum of the clean speech signal is obtained as follows:
a) From all speech channels, choose any two channels as a combination;
b) Estimate the noisy-signal cross-power spectrum and the noise cross-power spectrum between the two channels of the combination;
c) Remove the noise cross-power spectrum estimate from the inter-channel noisy-signal cross-power spectrum estimate to obtain the inter-channel clean-speech auto-power spectrum estimate;
d) Perform operations b) and c) for every possible channel combination in a), then average all the resulting inter-channel clean-speech auto-power spectrum estimates; this average serves as the clean-speech auto-power spectrum estimate in step 4).
The noisy-signal auto-power spectrum is the average of the noisy-signal auto-power spectra of all channels.
5) Filter the output beam of the beamformer with the Wiener filter placed after it, thereby enhancing the speech.
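Step 5) amounts to multiplying the spectrum of each beamformer output frame by the Wiener gain and transforming back. The sketch below shows this for a single real-valued frame; windowing and overlap-add, which a practical system would need, are omitted, and the name `apply_postfilter` is hypothetical.

```python
import numpy as np

def apply_postfilter(beam_frame, gain):
    """Filter one real-valued beamformer output frame in the frequency domain.

    gain: real array with len(beam_frame) // 2 + 1 entries (one per rfft bin).
    """
    spectrum = np.fft.rfft(beam_frame)
    return np.fft.irfft(gain * spectrum, n=len(beam_frame))
```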
The multichannel speech signal comprises at least two channels of speech signals.
To reduce the computational load, this speech enhancement method may be used only to enhance the low-frequency part of the speech signal, while the high-frequency part is still processed with an existing speech enhancement method, for example the Zelinski algorithm.
Because the invention takes the correlation between the channel noises into account when obtaining the auto-power spectrum of the clean speech signal, it matches actual conditions more closely, removes noise effectively, particularly in the low-frequency band, and improves the speech enhancement result.
Description of drawings
Fig. 1 shows an example of enhancing a segment of noisy speech with the speech enhancement methods: (a) is the original noisy speech, (b) is the enhancement result obtained with Zelinski post-filtering, and (c) is the enhancement result obtained with the method of the invention.
Embodiment
The invention is described in further detail below with reference to the drawing and a specific embodiment.
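Removing the time delays $\tau_i$, $\tau_j$ from the signal models $x_i(t)$ and $x_j(t)$ given by formulas (1) and (2) and then taking the Fourier transform gives
$$\hat{X}_i(f) = S(f) + N_i(f)\, e^{j 2\pi f \tau_i / W} \qquad (6)$$
$$\hat{X}_j(f) = S(f) + N_j(f)\, e^{j 2\pi f \tau_j / W} \qquad (7)$$
where $\hat{X}_i(f)$ and $\hat{X}_j(f)$ are the Fourier transforms of $x_i(t + \tau_i)$ and $x_j(t + \tau_j)$ after time-delay removal, and the hat (^) indicates that the signal delay has been removed; $S(f)$ is the Fourier transform of the clean signal; $N_i(f)$ and $N_j(f)$ are the Fourier transforms of the noise; and $W$ is the frame length. From formulas (6) and (7), the cross-power spectrum of the noisy signals is
$$\hat{\phi}_{x_i x_j}(f) = \phi_{ss}(f) + \hat{\phi}_{n_i n_j}(f) \qquad (8)$$
where
$$\hat{\phi}_{n_i n_j}(f) = \phi_{n_i n_j}(f)\, e^{j 2\pi f \tau_{ij} / W} \qquad (9)$$
Here $\hat{\phi}_{x_i x_j}(f)$ is the cross-power spectrum of the noisy signals $x_i(t + \tau_i)$ and $x_j(t + \tau_j)$, $\phi_{ss}(f)$ is the auto-power spectrum of the clean signal, and $\phi_{n_i n_j}(f)$ and $\hat{\phi}_{n_i n_j}(f)$ are the noise cross-power spectra before and after delay removal, respectively. $\tau_{ij} = \tau_i - \tau_j$ is the time delay between the signals of channels $i$ and $j$.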
It is easy to see from formula (8) that obtaining the auto-power spectrum $\phi_{ss}(f)$ of the clean signal first requires estimating the noise cross-power spectrum term in the formula, which the prior art neglects. Formula (9) shows that the noise cross-power spectrum $\hat{\phi}_{n_i n_j}(f)$ changes as the time delay $\tau_{ij}$ changes, which is also why a simple delay-and-sum plus Wiener filtering algorithm cannot handle a moving sound source. From the above analysis, the noise cross-power spectrum can be obtained from:
$$\hat{\phi}'_{n_i n_j}(f) = \phi'_{n_i n_j}(f)\, e^{j 2\pi f \tau_{ij} / W} \qquad (10)$$
where $\hat{\phi}'_{n_i n_j}(f)$ is the noise cross-power spectrum estimate after delay removal and $\phi'_{n_i n_j}(f)$ is the original noise cross-power spectrum estimate, which can be obtained during speech pauses. The prime (′) denotes an estimated value. From formulas (8) and (10), the clean-signal power spectrum estimate is
$$\phi'_{ss}(f) = \hat{\phi}'_{x_i x_j}(f) - \hat{\phi}'_{n_i n_j}(f) \qquad (11)$$
$\phi'_{ss}(f)$ can also be estimated from the auto-power spectrum of the noisy signal. From formula (1),
$$\phi_{x_i x_i}(f) = \phi_{ss}(f) + \phi_{n_i n_i}(f) \qquad (12)$$
and therefore
$$\phi'_{ss}(f) = \phi'_{x_i x_i}(f) - \phi'_{n_i n_i}(f) \qquad (13)$$
where $\phi'_{n_i n_i}(f)$ is the noise power spectrum estimate. Averaging the $\phi'_{ss}(f)$ obtained from formulas (11) and (13) over all microphone combinations improves the estimate of the clean-signal auto-power spectrum and gives the Wiener filter estimate of formula (14), in which $R\{\cdot\}$ denotes taking the real part. Because the signal power spectrum $\phi_{ss}(f)$ can only be a non-negative real number, it is also half-wave rectified to remove any negative values that may occur.
In a specific implementation, every power spectrum is updated with the recursive formula (15), in which $x$ denotes the signal or the noise; $\phi_{x_i x_j}(k+1, f)$ is the power spectrum estimate of frame $k+1$ and $\phi_{x_i x_j}(k, f)$ is the power spectrum estimate of frame $k$; $X(f)$ is the Fourier transform of the signal $x(k)$; and $\alpha$ is a number between 0 and 1 that reflects how fast the power spectrum is updated.
The cross-correlation of the channel noises is significant only in the low-frequency part and can largely be neglected in the high-frequency part. To reduce the computational load sensibly, formula (14) can therefore be used to filter the low-frequency part of the signal below 1 kHz, while the high-frequency part is still processed with Zelinski's algorithm, as shown in formula (5).
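The band split can be expressed as a per-bin selection between two gain curves. The sketch below assumes both gains are given on the rfft frequency grid of an `n_fft`-point transform; the function name `combine_gains` and its parameters are hypothetical, and only the 1 kHz split frequency comes from the text.

```python
import numpy as np

def combine_gains(gain_low, gain_high, sample_rate, n_fft, split_hz=1000.0):
    """Use `gain_low` below `split_hz` and `gain_high` at or above it.

    Both gains must have n_fft // 2 + 1 entries (one per rfft bin).
    """
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    return np.where(freqs < split_hz, gain_low, gain_high)
```

The costlier noise cross-power spectrum estimation is then needed only for the bins below the split frequency.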
Fig. 1 shows the processing result for a segment of noisy speech: (a) is the original noisy speech, (b) is the enhancement result obtained with Zelinski post-filtering, and (c) is the enhancement result obtained with the method of the invention. As the figure shows, the Zelinski post-filtering algorithm cannot effectively remove the low-frequency noise contained in the speech; because this noise lies within 1 kHz, it cannot be removed by high-pass filtering either. The method of the invention essentially removes the low-frequency noise.
Claims (3)
1. A speech enhancement method using a postfilter, for enhancing a multichannel speech signal, the multichannel speech signal comprising at least two channels of speech signals, the method comprising the following steps:
1) calculating the time delay of the speech signal in each channel;
2) aligning the channel signals in the time domain by delay compensation;
3) performing beamforming on the channel signals with a beamformer;
4) estimating the auto-power spectrum of the clean speech signal and the auto-power spectrum of the noisy signal, and obtaining the frequency response function of a Wiener filter;
5) filtering the output beam of the beamformer with the Wiener filter placed after it, thereby enhancing the speech;
characterized in that, in step 4), the auto-power spectrum of the clean speech signal is obtained as follows:
a) from all speech channels, choosing any two channels as a combination;
b) estimating the noisy-signal cross-power spectrum and the noise cross-power spectrum between the two channels of the combination;
c) removing the noise cross-power spectrum estimate from the inter-channel noisy-signal cross-power spectrum estimate to obtain the inter-channel clean-speech auto-power spectrum estimate;
d) performing operations b) and c) for every possible channel combination in a), then averaging all the resulting inter-channel clean-speech auto-power spectrum estimates, the average serving as the clean-speech auto-power spectrum estimate in step 4).
2. The speech enhancement method using a postfilter according to claim 1, characterized in that the noisy-signal auto-power spectrum in step 4) is the average of the noisy-signal auto-power spectra of all channels.
3. The speech enhancement method using a postfilter according to claim 1 or 2, characterized in that the speech enhancement method is used only to enhance the low-frequency part of the speech signal.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB031570747A CN1212608C (en) | 2003-09-12 | 2003-09-12 | A multichannel speech enhancement method using postfilter |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB031570747A CN1212608C (en) | 2003-09-12 | 2003-09-12 | A multichannel speech enhancement method using postfilter |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1523573A | 2004-08-25 |
CN1212608C | 2005-07-27 |
Family
ID=34287137
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB031570747A Expired - Fee Related CN1212608C (en) | 2003-09-12 | 2003-09-12 | A multichannel speech enhancement method using postfilter |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN1212608C (en) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1851866B1 (en) * | 2005-02-23 | 2011-08-17 | Telefonaktiebolaget LM Ericsson (publ) | Adaptive bit allocation for multi-channel audio encoding |
CN100535993C (en) * | 2005-11-14 | 2009-09-02 | 北京大学科技开发部 | Speech enhancement method applied to deaf-aid |
DE602007003220D1 (en) * | 2007-08-13 | 2009-12-24 | Harman Becker Automotive Sys | Noise reduction by combining beamforming and postfiltering |
KR102079000B1 (en) * | 2010-07-02 | 2020-02-19 | 돌비 인터네셔널 에이비 | Selective bass post filter |
CN103813251B (en) * | 2014-03-03 | 2017-01-11 | 深圳市微纳集成电路与系统应用研究院 | Hearing-aid denoising device and method allowable for adjusting denoising degree |
CN105100338B (en) * | 2014-05-23 | 2018-08-10 | 联想(北京)有限公司 | The method and apparatus for reducing noise |
CN105244036A (en) * | 2014-06-27 | 2016-01-13 | 中兴通讯股份有限公司 | Microphone speech enhancement method and microphone speech enhancement device |
CN104202098B (en) * | 2014-08-08 | 2016-06-15 | 中国科学院上海微系统与信息技术研究所 | A kind of broadband power Power estimation method based on multichannel compression sampling |
CN106653048B (en) * | 2016-12-28 | 2019-10-15 | 云知声(上海)智能科技有限公司 | Single channel sound separation method based on voice model |
CN108957392A (en) * | 2018-04-16 | 2018-12-07 | 深圳市沃特沃德股份有限公司 | Sounnd source direction estimation method and device |
CN108615535B (en) * | 2018-05-07 | 2020-08-11 | 腾讯科技(深圳)有限公司 | Voice enhancement method and device, intelligent voice equipment and computer equipment |
CN109243476B (en) * | 2018-10-18 | 2021-09-03 | 电信科学技术研究院有限公司 | Self-adaptive estimation method and device for post-reverberation power spectrum in reverberation voice signal |
CN109538946B (en) * | 2018-11-13 | 2020-11-10 | 阎兆立 | Urban tap water pipeline leakage detection positioning method |
Also Published As
Publication number | Publication date |
---|---|
CN1523573A (en) | 2004-08-25 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN1212608C (en) | A multichannel speech enhancement method using postfilter | |
Li et al. | Embedding and beamforming: All-neural causal beamformer for multichannel speech enhancement | |
CN1763846A (en) | Voice gain factor estimating device and method | |
CN1967658A (en) | Small scale microphone array speech enhancement system and method | |
CN103827967B (en) | Voice signal restoring means and voice signal restored method | |
CN1356014A (en) | System and method for dual microphone signal noise reduction using spectral substraction | |
JP2013527493A (en) | Robust noise suppression with multiple microphones | |
CN1625920A (en) | Audio apparatus and its reproduction program | |
CN1805278A (en) | Method and apparatus to record a signal using a beam forming algorithm | |
CN1746974A (en) | Method of enhancing quality of speech and apparatus thereof | |
WO2013009949A1 (en) | Microphone array processing system | |
CN104021798A (en) | Method for soundproofing an audio signal by an algorithm with a variable spectral gain and a dynamically modulatable hardness | |
CN102982808A (en) | Speech denoising device and method based on wavelet transform | |
EP3731227A1 (en) | Voice signal enhancing method and device | |
CN101587712A (en) | A kind of directional speech enhancement method based on minitype microphone array | |
CN108962276B (en) | Voice separation method and device | |
CN1819720A (en) | Crosstalk eliminator and elimination thereof | |
CN1805011A (en) | Adaptive filter method and apparatus for improving speech quality of mobile communication apparatus | |
CN1180602C (en) | Method and apparatus for space-time echo cancellation | |
CN111225317B (en) | Echo cancellation method | |
CN101039486A (en) | Method for suppressing voice noise of mobile phone | |
Ngo et al. | Variable speech distortion weighted multichannel wiener filter based on soft output voice activity detection for noise reduction in hearing aids | |
CN1212603C (en) | Non linear spectrum reduction and missing component estimation method | |
CN108074580B (en) | Noise elimination method and device | |
CN1201287C (en) | Hiaden Markov model edge decipher data reconstitution method f speech sound identification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C19 | Lapse of patent right due to non-payment of the annual fee | ||
CF01 | Termination of patent right due to non-payment of annual fee |