CN101447190A - Voice enhancement method employing combination of nesting-subarray-based post filtering and spectrum-subtraction - Google Patents


Info

Publication number
CN101447190A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA200810068000XA
Other languages
Chinese (zh)
Inventor
邹月娴
赵璟
万波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University Shenzhen Graduate School
Original Assignee
Peking University Shenzhen Graduate School

Abstract

The invention discloses a speech enhancement method that combines nested-subarray-based post-filtering with spectral subtraction. It is suitable for enhancing multi-channel speech signals in indoor environments, including in-vehicle environments. The method addresses three problems: the speech signal is a non-stationary wideband signal, the frequency response of microphone-array-based multi-channel speech enhancement is not uniform across the speech band, and the noise in different channels is correlated in real noise fields. Speech is collected with a microphone array formed by nesting subarrays of different element spacing. The beamformed outputs of the subarrays are divided into a high-frequency band and a low-frequency band, and a different speech enhancement algorithm is applied to each band. The strengths of the two algorithms complement each other, which improves the speech enhancement effect.

Description

Speech enhancement method combining post-filtering and spectral subtraction based on nested subarrays
Technical field
The present invention relates to the field of computer speech signal processing and, more particularly, to a speech enhancement method that combines post-filtering and spectral subtraction based on nested subarrays. It is particularly suitable for enhancing speech signals in noisy indoor environments.
Background art
Speech enhancement (Speech Enhancement) is the processing of noisy speech to extract the original speech as cleanly as possible, in order to improve the speech quality at the receiving end, improve the clarity, intelligibility and comfort of the speech, and make it easier for listeners to accept or improve the performance of speech processing systems. It is commonly used in automatic speech recognition systems, in-vehicle hands-free telephones, multimedia conferencing, wireless communication, on-site recording, military eavesdropping, hearing aids and intelligent robots. Speech enhancement has been researched and developed for more than forty years. Traditional methods are all based on single-microphone systems, which are limited in pickup range, directivity and noise suppression capability. Adaptive speech enhancement based on microphone arrays combines several key technologies, including array signal processing, speech processing and multi-channel signal acquisition. Its technical advantage is that it can exploit not only the time-domain and frequency-domain characteristics of the speech signal but also its spatial information to cancel noise, thereby enhancing and purifying the speech. The typical workflow of a microphone-array-based speech enhancement method is shown in Fig. 1 and is described as follows:
1) design the microphone array structure according to the application requirements;
2) using the time, frequency and spatial information of the multi-channel speech signal received by the microphone array, detect the start and end points of the speech signal, and at the same time estimate the time delays between channels and the direction information of the signal;
3) process the multi-channel signal with a speech enhancement algorithm to enhance the speech signal.
The microphone array structure design in step 1) is a critical step. Traditional array structures include uniform linear arrays, non-uniform linear arrays, uniform circular arrays and spherical arrays. The design of the array structure is closely related to the choice of multi-channel signal model.
Array signal models are divided into near-field and far-field models. The main difference is that in the far-field model the signal amplitudes received by all array elements are regarded as equal and the signals differ only in phase, whereas the near-field model must also account for the amplitude attenuation caused by the different propagation paths. That is, besides the direction of arrival of the source, the near-field model must also consider the distance from the source to each microphone. In the near-field case a spherical wavefront model is usually used instead of the far-field plane wavefront model.
Similar to the time-domain sampling theorem, spatial sampling with a microphone array must also satisfy a condition to prevent spatial aliasing. This condition, called the spatial sampling theorem, is given by formula (1):
$$ d \le \lambda/2 \tag{1} $$
where $d$ is the straight-line distance between adjacent microphone elements and $\lambda$ is the wavelength of the sound wave. Spatial aliasing is avoided only if the spatial sampling rate is high enough. However, if the element spacing is too small the array oversamples, and using more microphones provides no additional spatial information.
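As a worked illustration of formula (1), the following Python sketch computes the largest element spacing that avoids spatial aliasing at a given frequency, and the highest alias-free frequency for a given spacing. The function names and the speed of sound of 343 m/s are assumptions made for this example and do not come from the original text.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def max_element_spacing(f_max_hz, c=SPEED_OF_SOUND):
    """Largest spacing d (metres) that satisfies d <= lambda/2 at f_max_hz."""
    wavelength = c / f_max_hz
    return wavelength / 2.0


def max_alias_free_frequency(d_m, c=SPEED_OF_SOUND):
    """Highest frequency (Hz) for which spacing d_m still satisfies d <= lambda/2."""
    return c / (2.0 * d_m)


if __name__ == "__main__":
    for f in (1000.0, 4000.0, 8000.0):
        print(f"{f:6.0f} Hz -> d <= {100 * max_element_spacing(f):.2f} cm")
```

Under these assumptions a spacing of 5 cm satisfies formula (1) up to about 3.4 kHz, which is why a single fixed spacing cannot serve the whole speech band equally well.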
In addition, the distance between the signal source and the microphone array also affects the choice of signal model. Let $r$ be the straight-line distance from the sound source to the center of the microphone array, and $L$ the total length of the linear microphone array. If formula (2) is satisfied, the far-field condition holds; otherwise the near-field model must be used.
$$ |r| \gg 2L^2/\lambda \tag{2} $$
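The far-field test of formula (2) can be sketched as follows. The text only says that $|r|$ must be much larger than $2L^2/\lambda$; the factor `margin=10.0` used here to stand for "much larger", the function names and the 343 m/s speed of sound are all assumptions of this example.

```python
def far_field_threshold(array_length_m, freq_hz, c=343.0):
    """Return 2*L^2/lambda in metres for a linear array of total length L."""
    wavelength = c / freq_hz
    return 2.0 * array_length_m ** 2 / wavelength


def is_far_field(source_distance_m, array_length_m, freq_hz, margin=10.0, c=343.0):
    """True when |r| exceeds the threshold of formula (2) by the assumed factor `margin`."""
    threshold = far_field_threshold(array_length_m, freq_hz, c)
    return abs(source_distance_m) > margin * threshold
```

For example, with L = 0.343 m at 1 kHz the threshold is about 0.69 m, so with the assumed margin a source 10 m away is treated as far-field and a source 5 m away is not.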
For a uniform linear microphone array under the far-field plane wave model, the discrete output signal of the m-th microphone can be expressed as:
$$ x_m[n] = s[n - \Delta n_m] + \eta_m[n] \tag{3} $$
where $s[n]$ is the source signal, $\Delta n_m$ is the delay in samples between the signal received by the m-th microphone and the source signal, and $\eta_m[n]$ is the noise signal received by the m-th microphone.
Let $\Delta\tau_m$ be the time delay between the signal received by the m-th microphone and the source signal; then the following relation holds:
$$ \Delta n_m = f_s \cdot \Delta\tau_m = f_s \cdot r / c \tag{4} $$
In formula (4), $f_s$ is the sampling frequency and $c$ is the propagation speed of sound.
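A minimal sketch of how the sample delays of formula (4) might be computed for a uniform linear array is given below. It assumes that, for a far-field source at angle phi to the array axis, the extra path length to the m-th element relative to the first is (m-1)·d·cos(phi), as suggested by the exponent of the weight in formula (16). The function name, the angle convention and the value of c are assumptions of this example.

```python
import math


def sample_delays(num_mics, spacing_m, angle_deg, fs_hz, c=343.0):
    """Delay of each microphone relative to the first, in samples (may be fractional).

    Assumes a far-field plane wave arriving at angle_deg measured from the array axis.
    """
    cos_phi = math.cos(math.radians(angle_deg))
    return [fs_hz * (m * spacing_m * cos_phi) / c for m in range(num_mics)]
```

A broadside source (90 degrees) gives delays of essentially zero, while an endfire source (0 degrees) gives the largest delays; fractional values would need rounding or interpolation before delay compensation.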
The speech endpoint detection (Voice Activity Detection, VAD) step in step 2) may be included or omitted depending on the speech enhancement algorithm. A robust VAD method plays an important role both in estimating the statistics of the noise signal and in the performance of the subsequent speech enhancement algorithm. Common single-channel approaches include VAD based on short-time energy, VAD based on zero-crossing rate and VAD based on linear prediction coefficients. Commonly used endpoint detection methods based on the array structure include beamformer-based VAD, phase-vector-based VAD and GSC-based spatial VAD.
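To make the short-time-energy approach concrete, here is a minimal single-channel sketch. It is not the detector used by the inventors: the frame length, hop size, the threshold factor and the assumption that the first few frames contain only noise are all illustrative choices.

```python
def frame_energies(signal, frame_len, hop):
    """Short-time energy of each full frame of the signal."""
    return [sum(x * x for x in signal[start:start + frame_len])
            for start in range(0, len(signal) - frame_len + 1, hop)]


def energy_vad(signal, frame_len=256, hop=128, noise_frames=5, factor=3.0):
    """Flag a frame as speech when its energy exceeds factor * noise floor.

    The noise floor is the mean energy of the first `noise_frames` frames,
    which are assumed to be speech-free.
    """
    energies = frame_energies(signal, frame_len, hop)
    noise_floor = sum(energies[:noise_frames]) / noise_frames
    return [e > factor * noise_floor for e in energies]
```

The frames flagged as non-speech are the ones from which a noise spectrum can later be estimated.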
In step 3), speech enhancement techniques fall into methods based on a single microphone and methods based on a microphone array. Among the single-microphone methods, the most mature, simplest and most effective is the spectral subtraction algorithm. Widely adopted microphone-array methods currently include: a) the fixed beamformer (Fixed Beamforming, FBF); b) the adaptive beamformer (Adaptive Beamforming, ABF); c) beamforming with an adaptive post-filter (Microphone Arrays with Adaptive Postfiltering); d) the generalized sidelobe canceller (Generalized Sidelobe Canceller, GSC); and so on. In addition, improved and combined algorithms continue to emerge. Commonly used ones include speech enhancement combining spectral subtraction with a fixed beamformer; speech enhancement combining a fixed beamformer with adaptive post-filtering; and generalized sidelobe cancelling speech enhancement based on the spatial transfer function. Beamforming based on delay-and-sum is also commonly used.
Spectral subtraction (Spectral Subtraction, SS) is one of the classical single-channel speech enhancement methods. It was proposed in 1979 by Professor Steven F. Boll of the University of Utah, USA, and is widely used for single-channel speech corrupted by additive noise. As shown in Fig. 2, the method subtracts the estimated short-time magnitude spectrum of the noise from that of the contaminated speech signal to obtain the clean speech signal; its effect is equivalent to applying a form of equalization to the noisy speech in the transform domain. In practice, however, the noise spectrum follows a Gaussian distribution, and the power spectrum of a noise frame varies over a wide range: in the frequency domain the ratio of its maximum to its minimum often reaches several orders of magnitude, and the ratio of its maximum to its mean reaches 6 to 8. Therefore, after the noise spectrum is subtracted, the larger power spectrum components leave residues that appear as randomly occurring spikes in the spectrum and form audible residual noise. This noise has a certain rhythmic fluctuation and is called "musical noise". In addition, different parts of speech are affected differently by spectral subtraction. Fricatives resemble noise and are suppressed together with the noise during processing. Nasals have low energy and a power spectrum magnitude close to that of the noise, so they are enhanced far less effectively than voiced sounds. The attenuation introduced by spectral subtraction weakens the unvoiced parts and the high-frequency parts of speech, which is why the intelligibility of the enhanced speech decreases.
The delay-and-sum beamformer (Delay-and-Sum Beamformer, DSBF) is a typical fixed beamformer and consists of two parts: delay compensation and weighted summation. As shown in Fig. 3, the far-field model is adopted and the noise is assumed to be additive. Taking the signal received by the m-th channel as an example, its expression is:
$$ x'_m[n] = s[n - \Delta n_m] + \eta_m[n] \tag{5} $$
A time delay estimation algorithm is used to obtain the time delay of the speech signal in each channel, and delay compensation then aligns the channel signals in the time domain, giving:
$$ x_m[n] = x'_m[n + \Delta n_m] \tag{6} $$
The channel signals are weighted and summed to obtain the beamformer output signal:
$$ y[n] = \sum_{m=1}^{M} x_m[n] \cdot w_m[n] \tag{7} $$
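Formulas (5) to (7) can be sketched in a few lines of Python. This version is deliberately simplified: it assumes integer sample delays that are already known, constant weights (uniform 1/M by default) rather than time-varying ones, and zero padding at the signal edges. The function name is hypothetical.

```python
def delay_and_sum(channels, delays, weights=None):
    """Align each channel by its integer sample delay and form the weighted sum.

    channels: list of equal-length sample lists x'_m[n]
    delays:   integer delays Delta n_m (in samples) of the source in each channel
    weights:  constant weights w_m, defaulting to uniform 1/M (an assumption)
    """
    num_ch = len(channels)
    if weights is None:
        weights = [1.0 / num_ch] * num_ch
    length = len(channels[0])
    out = [0.0] * length
    for x, dn, w in zip(channels, delays, weights):
        for n in range(length):
            k = n + dn                  # x_m[n] = x'_m[n + Delta n_m], formula (6)
            if 0 <= k < length:         # samples shifted past the ends count as zero
                out[n] += w * x[k]      # weighted sum, formula (7)
    return out
```

With noise-free channels that are pure delayed copies of one source, the output reproduces the source; with independent noise in each channel, the aligned speech adds coherently while the noise does not.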
In beamforming algorithms, accurate time delay estimation is the basis of multi-channel speech enhancement. The delay-and-sum beamformer has the advantages of a simple system, a robust algorithm and a small amount of computation, and can be applied in real systems. In theory the algorithm achieves a signal-to-noise ratio improvement of $10\log_{10}M$. Obtaining better speech enhancement therefore requires more microphone elements. The algorithm also has implicit preconditions: an accurate time delay estimate $\Delta n_m$ must be available, the incident signal must be a narrowband signal, and there must be no spatial loss, reflected signals or reverberation. The main shortcomings of the algorithm are that its performance degrades rapidly when there is more than one speech source in the space or when directional noise or reverberation interference is severe, and that its response differs for different frequency components of the signal: the spatial resolution is usually poor at low frequencies and relatively better at high frequencies.
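The theoretical gain quoted above is easy to evaluate numerically. The helper below is only an illustration with a hypothetical name; the figure it returns holds under the idealized condition of spatially uncorrelated noise of equal power in every channel.

```python
import math


def array_gain_db(num_mics):
    """Theoretical SNR improvement 10*log10(M) in dB for M microphones."""
    return 10.0 * math.log10(num_mics)
```

For M = 5 this gives about 7.0 dB, and doubling the number of microphones adds only about 3 dB, which shows why large gains from delay-and-sum alone require many elements.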
In 1988, R. Zelinski proposed adding an adaptive Wiener filter (Wiener Filter) at the output of the delay-and-sum beamformer, forming the classical post-filter speech enhancement algorithm (Delay-and-Sum Beamforming with an Additional Postfiltering). The adaptive post-filtering method combines a linear adaptive beamformer (ABF) with a post-filter (Postfilter). By using the spatial filtering property of the linear ABF and the incoherent-noise suppression property of the post-filter, it achieves spatial filtering and frequency-domain filtering at the same time and further improves the output signal-to-noise ratio.
The role of adaptive post-filtering is to apply adaptive Wiener filtering to the signal obtained by the delay-and-sum method in order to further estimate the target speech. It rests on the following assumptions:
1) the speech signal and the noise signal received by each channel are uncorrelated;
2) the noise signals received by different microphones in the array are uncorrelated;
3) the power spectral density of the noise signal received by every microphone is the same.
As shown in Fig. 4, after delay compensation a Fourier transform converts the signals to the frequency domain. The signal of each microphone channel contains the target speech signal and a noise signal; after weighting:
$$ Y_m(f) = W_m(f)[S(f) + \eta_m(f)] \tag{8} $$
$$ Y(f) = \sum_{m=1}^{M} Y_m(f) \tag{9} $$
Based on the three assumptions above, the auto-spectral density of each channel and the cross-spectral density between channels are calculated as:
$$ \begin{aligned}
\Phi_{y_i y_i}(f) &= E\{[W_i(f)(S(f)+\eta_i(f))][W_i^*(f)(S(f)+\eta_i(f))]\} \\
&= E\{[W_i(f)S(f)+W_i(f)\eta_i(f)][W_i^*(f)S(f)+W_i^*(f)\eta_i(f)]\} \\
&= |W_i(f)|^2\Phi_{ss}(f)+|W_i(f)|^2\Phi_{\eta_i\eta_i}(f) \\
&= |W_i(f)|^2[\Phi_{ss}(f)+\Phi_{\eta_i\eta_i}(f)]
\end{aligned} \tag{10} $$
$$ \begin{aligned}
\Phi_{y_i y_j}(f) &= E\{[W_i(f)(S(f)+\eta_i(f))][W_j^*(f)(S(f)+\eta_j(f))]\} \\
&= E\{[W_i(f)S(f)+W_i(f)\eta_i(f)][W_j^*(f)S(f)+W_j^*(f)\eta_j(f)]\} \\
&= E\{W_i(f)W_j^*(f)S(f)S(f)+W_i(f)W_j^*(f)S(f)\eta_j(f)+W_i(f)W_j^*(f)\eta_i(f)S(f)+W_i(f)W_j^*(f)\eta_i(f)\eta_j(f)\} \\
&= W_i(f)W_j^*(f)\Phi_{ss}(f)
\end{aligned} \tag{11} $$
According to the expression for the optimal transfer function of the Wiener filter:
$$ H(f) = \frac{\Phi_{ss}(f)}{\Phi_{ss}(f) + \Phi_{\eta\eta}(f)} \tag{12} $$
the numerator and denominator of the transfer function can be obtained from the cross-spectral densities and the auto-spectral densities of the input channels, respectively.
From formulas (11) and (10), $\Phi_{ss}(f)$ and $\Phi_{ss}(f)+\Phi_{\eta\eta}(f)$ can be obtained respectively, that is:
$$ \Phi_{ss}(f) = \frac{\Phi_{y_i y_j}(f)}{W_i(f) W_j^*(f)} \tag{13} $$
$$ \Phi_{ss}(f) + \Phi_{\eta\eta}(f) = \frac{\Phi_{y_i y_i}(f)}{|W_i(f)|^2} \tag{14} $$
Thus the estimate of the transfer function of the adaptive Wiener post-filter is obtained:
[Formula (15): available only as image A200810068000D00091 in the source]
where $M$ is the number of channels, $\mathrm{Re}\{\cdot\}$ takes the real part, $*$ denotes complex conjugation, and $W_i(f)$ is the delay-and-sum weight of each microphone channel signal, namely:
$$ W_i(f) = \frac{1}{4} e^{j\frac{-2\pi f}{c}(i-1) d \cos\phi} \tag{16} $$
The estimate of the target speech signal output by the adaptive Wiener filter is then:
$$ Z(f) = \hat{H}(f) Y(f) \tag{17} $$
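Because formula (15) is not legible in this text, the following is only a hedged sketch of a Zelinski-type estimate built from formulas (12) to (14): the numerator averages the real part of the cross-spectra over all channel pairs, and the denominator averages the auto-spectra over all channels. It assumes the channel spectra are already time-aligned with the weights $W_i(f)$ divided out, uses a single frame in place of the expectation, and adds a clamp to [0, 1] that is not in the source. The function name is hypothetical.

```python
def postfilter_gain(channel_spectra, floor=1e-12):
    """Estimate a Wiener post-filter gain per frequency bin from one frame.

    channel_spectra: list of M lists of complex bins (time-aligned channels).
    Numerator:   real part of the cross-spectra averaged over all pairs i < j.
    Denominator: auto-spectra averaged over all M channels.
    """
    num_ch = len(channel_spectra)
    num_bins = len(channel_spectra[0])
    num_pairs = num_ch * (num_ch - 1) / 2
    gains = []
    for k in range(num_bins):
        cross = sum((channel_spectra[i][k] * channel_spectra[j][k].conjugate()).real
                    for i in range(num_ch - 1)
                    for j in range(i + 1, num_ch)) / num_pairs
        auto = sum(abs(x[k]) ** 2 for x in channel_spectra) / num_ch
        gain = cross / max(auto, floor)
        gains.append(min(max(gain, 0.0), 1.0))  # clamp to [0, 1]; an added safeguard
    return gains
```

When every channel carries the same spectrum the gain is 1 (nothing is treated as noise); when the channels are uncorrelated or opposed the cross-spectrum average falls and the gain drops toward 0.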
As the above formulas show, the adaptive Wiener post-filter method is not limited by the number of noise sources. However, the method relies on assumption 2), namely that the noise signals received by different microphones in the array are uncorrelated. In practice, the cross-correlation of the noise signals received by the channels of a microphone array is negligible only at high frequencies; at low frequencies it is significant and cannot be ignored. Like the fixed beamforming algorithm, this method therefore enhances the high-frequency part of the signal well but the low-frequency part poorly.
It can be seen that spectral subtraction and post-filtering each have strengths and weaknesses, and neither achieves the desired speech enhancement on its own. An approach that is suitable for both the low-frequency and the high-frequency parts of the speech signal is needed.
Summary of the invention
The object of the invention is to solve two problems of current multi-channel speech enhancement techniques: the frequency response of a uniform array is inconsistent across a wideband speech signal, and traditional speech enhancement methods have difficulty handling both the high and the low frequency bands.
To solve these technical problems, the invention proposes a speech enhancement method that combines post-filtering based on nested subarrays with spectral subtraction. The technical solution adopted by the invention is:
Step 1: design a microphone array consisting of two nested uniform subarrays for acquiring the multi-channel signal; the multi-channel speech signal based on the nested subarrays comprises at least five channels of speech signals;
Step 2: detect the start and end points of the speech signal and estimate the power spectrum of the pure noise signal;
Step 3: estimate the time delay of the speech signal in each channel;
Step 4: apply delay compensation to each channel so that the channel speech signals are aligned in the time domain;
Step 5: transform each channel signal from the time domain to the frequency domain with the Fourier transform;
Step 6: estimate the auto-power spectrum of the clean speech signal and the auto-power spectrum of the noisy speech signal to obtain the frequency response function of the Wiener filter;
Step 7: for the signals of the two subarrays, beamform the channel signals of each subarray separately with a fixed beamformer;
Step 8: apply low-pass filtering and high-pass filtering to the beamformer outputs of the two subarrays, respectively;
Step 9: apply spectral subtraction or Wiener post-filtering to the filtered beamformer outputs of the two subarrays to enhance the speech;
Step 10: overlap-add the two enhanced beams and apply the inverse Fourier transform to obtain the enhanced speech signal in the time domain.
The invention has the following advantages:
1) the nested subarrays have a good frequency response to wideband spatial speech signals;
2) the array structure is simple, shared elements reduce the size of the array, and the computational complexity of the algorithm is low;
3) the multi-channel post-filtering speech enhancement algorithm is applied only to the high-frequency part of the target speech signal, which avoids the degraded performance of post-filtering on the low-frequency band of the speech signal;
4) the algorithm is easy to implement, requires little computation and is suitable for PC platforms and embedded platforms.
Description of drawings
Fig. 1. Block diagram of a typical speech enhancement method
Fig. 2. Flow chart of the magnitude spectral subtraction speech enhancement method
Fig. 3. Flow chart of the delay-and-sum beamformer
Fig. 4. Flow chart of the adaptive Wiener post-filter speech enhancement method
Fig. 5. Flow chart of the speech enhancement method combining post-filtering and spectral subtraction based on nested subarrays
Fig. 6. Design of the nested subarrays
Embodiment
The flow chart of the speech enhancement method combining post-filtering and spectral subtraction based on nested subarrays is shown in Fig. 5. It consists of four parts: multi-channel signal acquisition, delay compensation, beamforming and adaptive post-filtering. The invention is described in further detail below with reference to the drawings and a specific embodiment. The embodiment does not limit the invention; those skilled in the art may make improvements and variations without departing from the principle of the invention, and such improvements and variations shall also be regarded as falling within the scope of protection of the invention.
This embodiment runs on an ordinary PC with the following configuration:
CPU: 2.80GHz
Memory: 1GHz
Operating system: Windows XP Professional Edition
Running environment: MATLAB R2006b
In this embodiment of the invention, according to the characteristics of the sound source and the noise field in an indoor environment, the diffuse noise field (Diffuse Noise Field) model and the harmonically nested subarray (Harmonically Nested Subarrays, HNSA) model are used to model the multi-channel noisy speech signal in a real environment. The speech signal in the space is collected by an array of two nested subarrays built from 7 omnidirectional microphones. Each subarray contains 5 elements, so M=5. $x'_{Si}[n]$ and $x'_{Lj}[n]$ denote the signal of one channel of the small subarray (Small) and of the large subarray (Large) respectively, with i=1,...,5 and j=1,...,5. Because of the nesting, some microphone channels are shared:
$$ x'_{S1}[n] = x'_{L2}[n], \quad x'_{S3}[n] = x'_{L3}[n], \quad x'_{S5}[n] = x'_{L4}[n] \tag{18} $$
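The shared channels of formula (18) follow directly from the geometry if the two subarrays are placed on a common grid with a common centre, which is what this sketch assumes. Positions are expressed in units of the small-subarray spacing d; the function names are hypothetical and an odd number of elements per subarray is assumed.

```python
def nested_subarrays(num_elements=5):
    """Element positions of both subarrays, in units of the small spacing d."""
    large = [2 * j for j in range(num_elements)]               # spacing 2d
    half = num_elements // 2
    centre = large[half]                                       # shared centre element
    small = [centre - half + i for i in range(num_elements)]   # spacing d
    return small, large


def shared_channels(small, large):
    """1-based pairs (i, j) where small element i and large element j coincide."""
    return [(i + 1, large.index(pos) + 1)
            for i, pos in enumerate(small) if pos in large]


small, large = nested_subarrays()
microphones = sorted(set(small) | set(large))
print(len(microphones), shared_channels(small, large))
```

This prints `7 [(1, 2), (3, 3), (5, 4)]`: seven physical microphones serve two five-element subarrays, and the three shared pairs are exactly those listed in formula (18).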
For the signal model given by formulas (5) and (6), after delay compensation and a Fourier transform, the frequency-domain expressions for one channel of each of the two subarrays are:
$$ X_{Si}(f) = S(f) + \eta_{Si}(f) e^{j\frac{2\pi}{N} f \tau_{Si}} \tag{19} $$
$$ X_{Lj}(f) = S(f) + \eta_{Lj}(f) e^{j\frac{2\pi}{N} f \tau_{Lj}} \tag{20} $$
where $S(f)$ is the Fourier transform of the clean speech signal, $\eta_{Si}(f)$ and $\eta_{Lj}(f)$ are the Fourier transforms of the noise in channel i and channel j of the two subarrays respectively, and $N$ is the frame length.
Summing beamforming is applied to the small and large subarrays separately:
$$ Y_S(f) = \frac{1}{5}\sum_{i=1}^{5} X_{Si}(f) \tag{21} $$
$$ Y_L(f) = \frac{1}{5}\sum_{j=1}^{5} X_{Lj}(f) \tag{22} $$
The beamformer outputs $Y_S(f)$ and $Y_L(f)$ are passed through a high-pass (HP) FIR filter and a low-pass (LP) FIR filter respectively, giving $Y'_S(f)$ and $\hat{Y}'_L(f)$. The wideband speech signal is thus divided into two frequency bands, which are processed with different speech enhancement algorithms.
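The text does not give the FIR designs or the crossover frequency, so the following is only one possible sketch of a complementary low-pass and high-pass pair: a Hamming-windowed sinc low-pass filter and a high-pass filter derived from it by spectral inversion. The tap count, the cutoff passed in and the function names are assumptions of this example.

```python
import math


def lowpass_fir(cutoff_hz, fs_hz, num_taps=101):
    """Hamming-windowed sinc low-pass FIR (odd num_taps, linear phase)."""
    fc = cutoff_hz / fs_hz                      # normalised cutoff, cycles per sample
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [t / gain for t in taps]             # normalise to unity gain at DC


def highpass_from_lowpass(lp_taps):
    """Complementary high-pass by spectral inversion: delta[n - mid] - lp[n]."""
    hp = [-t for t in lp_taps]
    hp[(len(lp_taps) - 1) // 2] += 1.0
    return hp
```

A useful property of this particular choice is that the two impulse responses add up to a pure delay, so summing the two bands again does not colour the signal; other crossover designs are equally possible.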
For the low-frequency signal, spectral subtraction as shown in Fig. 2 is used for denoising and enhancement:
$$ |\hat{S}_L(f)| = |\hat{Y}'_L(f)| - \zeta(f) \tag{23} $$
where $\hat{S}_L(f)$ is the estimate of the target speech signal after spectral subtraction denoising, and $\zeta(f)$ is the mean magnitude of the noise signal, estimated in non-speech segments using voice activity detection.
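Formula (23) operates bin by bin on magnitude spectra, which the sketch below illustrates. The clamp at `floor` (half-wave rectification) is an added safeguard against negative magnitudes and is not stated in the text; the function names are hypothetical, and the noise frames are assumed to have been selected beforehand by a VAD.

```python
def mean_noise_magnitude(noise_frames):
    """zeta[k]: mean magnitude per bin over frames labelled as non-speech."""
    num = len(noise_frames)
    return [sum(frame[k] for frame in noise_frames) / num
            for k in range(len(noise_frames[0]))]


def spectral_subtract(noisy_mag, noise_mag, floor=0.0):
    """|S_hat[k]| = max(|Y[k]| - zeta[k], floor), following formula (23)."""
    return [max(y - z, floor) for y, z in zip(noisy_mag, noise_mag)]
```

The enhanced magnitude is usually recombined with the phase of the noisy signal before the inverse transform.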
For the high-frequency signal, the adaptive Wiener post-filter method shown in Fig. 4 is used for speech enhancement. For any two channels i and j of the subarray, i ≠ j, the auto-power spectral density and the cross-power spectral density of the noisy speech signal are respectively:
$$ \Phi_{X_i X_i}(f) = \Phi_{ss}(f) + \Phi_{\eta_i\eta_i}(f) \tag{24} $$
$$ \Phi_{X_i X_j}(f) = E\{X_i(f) X_j^*(f)\} = \Phi_{ss}(f) + \Phi_{s\eta_i}(f) + \Phi_{s\eta_j}(f) + \Phi_{\eta_i\eta_j}(f) \tag{25} $$
Based on the three assumptions of the adaptive Wiener post-filter method above, the noise signals of the channels are mutually uncorrelated and also uncorrelated with the source signal, so:
$$ \Phi_{s\eta_i}(f) = \Phi_{s\eta_j}(f) = \Phi_{\eta_i\eta_j}(f) = 0 \tag{26} $$
and the power spectral density of the noise signal received by every microphone is the same, defined as:
$$ \Phi_{\eta_i\eta_i}(f) = \Phi_{\eta_j\eta_j}(f) = \Phi_{\eta\eta}(f) \tag{27} $$
Formulas (24) and (25) can then be rewritten as:
$$ \Phi_{X_i X_i}(f) = \Phi_{ss}(f) + \Phi_{\eta\eta}(f) \tag{28} $$
$$ \Phi_{X_i X_j}(f) = \Phi_{ss}(f) \tag{29} $$
where
$$ \hat{\Phi}_{X_i X_i}(f) = \frac{1}{M}\sum_{i=1}^{M} |X_i(f)|^2 \tag{30} $$
$$ \hat{\Phi}_{X_i X_j}(f) = \frac{2}{M(M-1)} \sum_{i=1}^{M-1}\sum_{j=i+1}^{M} X_i(f) X_j^*(f) \tag{31} $$
In practice, because the signal is only short-time stationary, the FFT length L is finite, so the last three terms in formula (25) are not exactly 0 but a complex number close to 0. Since the signal power spectrum $\Phi_{ss}(f)$ can only be a positive real number, formula (32) is obtained:
[Formula (32) is missing from the source text]
In addition, the signal of each channel is processed by an iterative smoothing scheme. For a frequency bin k, define a smoothing interval [k-p, k+p] of length 2p+1; then
$$ \hat{\Phi}_{ss}[k] = \frac{1}{2p+1}\sum_{l=-p}^{p} \Phi_{ss}[k+l] \tag{33} $$
$$ \hat{\Phi}_{X_i X_i}[k] = \frac{1}{2p+1}\sum_{l=-p}^{p} \Phi_{YY}[k+l] \tag{34} $$
$$ \hat{H}[k] = \frac{\hat{\Phi}_{ss}[k]}{\hat{\Phi}_{X_i X_i}[k]} \tag{35} $$
Balancing accuracy against computational cost, p is usually set to 1 or 2.
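The frequency smoothing of formulas (33) and (34) and the gain of formula (35) can be sketched as below. The text does not say how bins near the band edges are handled; this example truncates the window there, which is an assumption, as are the function names and the small `floor` that guards against division by zero.

```python
def smooth(values, p):
    """Moving average over [k-p, k+p]; the window is truncated at the edges (assumed)."""
    n = len(values)
    out = []
    for k in range(n):
        lo, hi = max(0, k - p), min(n - 1, k + p)
        window = values[lo:hi + 1]
        out.append(sum(window) / len(window))
    return out


def wiener_gain(phi_ss, phi_xx, p=1, floor=1e-12):
    """H_hat[k] = smoothed speech spectrum / smoothed noisy-speech spectrum."""
    num = smooth(phi_ss, p)
    den = smooth(phi_xx, p)
    return [a / max(b, floor) for a, b in zip(num, den)]
```

A larger p gives a steadier gain curve at the cost of frequency resolution and extra additions per bin, which is the trade-off behind the usual choice of 1 or 2.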
The output signal $Y'_S(f)$ of the high-pass filter is then passed through the adaptive Wiener filter to obtain the enhanced speech signal of the high band:
$$ \hat{S}_S(f) = Y'_S(f) \cdot \hat{H}(f) \tag{36} $$
The speech signals of the high and low frequency bands are combined by overlap-add Fourier synthesis (Fourier Synthesis Overlap-Add) and converted into the enhanced speech signal in the time domain.
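The final recombination can be sketched in two small steps: the enhanced low-band and high-band spectra of a frame are added bin by bin, and the time-domain frames obtained after the inverse Fourier transform are overlap-added. The inverse transform itself is left to an FFT routine and is not shown; the function names and the hop size are assumptions of this example.

```python
def combine_bands(low_band, high_band):
    """Add the enhanced low-band and high-band spectra of one frame, bin by bin."""
    return [lo + hi for lo, hi in zip(low_band, high_band)]


def overlap_add(frames, hop):
    """Overlap-add equal-length time-domain frames spaced `hop` samples apart."""
    frame_len = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for idx, frame in enumerate(frames):
        start = idx * hop
        for n, value in enumerate(frame):
            out[start + n] += value
    return out
```

For the overlapping regions to sum without amplitude ripple, the analysis and synthesis windows must be chosen to match the hop size.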

Claims (5)

1. A speech enhancement method combining post-filtering and spectral subtraction using nested subarrays, for enhancing multi-channel speech signals in an indoor environment, characterized in that the method comprises:
1) designing a microphone array consisting of two nested uniform subarrays for acquiring the multi-channel signal;
2) detecting the start and end points of the speech signal and estimating the power spectrum of the pure noise signal;
3) estimating the time delay of the speech signal in each channel;
4) applying delay compensation to each channel so that the channel speech signals are aligned in the time domain;
5) transforming each channel signal from the time domain to the frequency domain with the Fourier transform;
6) estimating the auto-power spectrum of the clean speech signal and the auto-power spectrum of the noisy speech signal to obtain the frequency response function of the Wiener filter;
7) for the signals of the two subarrays, beamforming the channel signals of each subarray separately with a fixed beamformer;
8) applying low-pass filtering and high-pass filtering to the beamformer outputs of the two subarrays, respectively;
9) applying spectral subtraction or Wiener post-filtering to the filtered beamformer outputs of the two subarrays to enhance the speech;
10) overlap-adding the two enhanced beams and applying the inverse Fourier transform to obtain the enhanced speech signal in the time domain.
2. The microphone array structure of nested subarrays according to claim 1, characterized in that, in step 1), each subarray is a uniform linear array with fixed spacing, the spacing of the large subarray is 2 times the spacing of the small subarray, and some array elements are shared.
3. The low-pass or high-pass filtering of the speech signals after beamforming of the two subarrays according to claim 1 or 2, characterized in that, in step 8), the beamformed speech signal of the large subarray is low-pass filtered and the beamformed speech signal of the small subarray is high-pass filtered, so that the speech signal has a good frequency response over the whole frequency band.
4. The enhancement of the beamformer outputs of the two subarrays with spectral subtraction and a Wiener post-filter respectively according to claim 1 or 3, characterized in that, in step 9), power spectral subtraction is applied to the low-pass filtered beamformer output to enhance the low-frequency part of the speech signal, and the Wiener post-filter is applied to the high-pass filtered beamformer output to enhance the high-frequency part of the speech signal.
5. The speech enhancement method combining post-filtering and spectral subtraction using nested subarrays according to claim 1 or 2, characterized in that the multi-channel speech signal comprises at least five channels of speech signals.
CNA200810068000XA 2008-06-25 2008-06-25 Voice enhancement method employing combination of nesting-subarray-based post filtering and spectrum-subtraction Pending CN101447190A (en)


Publications (1)

Publication Number Publication Date
CN101447190A true CN101447190A (en) 2009-06-03

Family

ID=40742829

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA200810068000XA Pending CN101447190A (en) 2008-06-25 2008-06-25 Voice enhancement method employing combination of nesting-subarray-based post filtering and spectrum-subtraction

Country Status (1)

Country Link
CN (1) CN101447190A (en)

Cited By (49)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102074241B (en) * 2011-01-07 2012-03-28 蔡镇滨 Method for realizing voice reduction through rapid voice waveform repairing
CN102074241A (en) * 2011-01-07 2011-05-25 蔡镇滨 Method for realizing voice reduction through rapid voice waveform repairing
EP2608197A4 (en) * 2011-09-05 2015-04-08 Goertek Inc Method, device, and system for noise reduction in multi-microphone array
CN102306496A (en) * 2011-09-05 2012-01-04 歌尔声学股份有限公司 Noise elimination method, device and system of multi-microphone array
WO2013033991A1 (en) * 2011-09-05 2013-03-14 歌尔声学股份有限公司 Method, device, and system for noise reduction in multi-microphone array
EP2608197A1 (en) * 2011-09-05 2013-06-26 Goertek Inc. Method, device, and system for noise reduction in multi-microphone array
KR101519768B1 (en) * 2011-09-05 2015-05-12 고어텍 인크 Method, device and system for eliminating noises with multi-microphone array
CN102306496B (en) * 2011-09-05 2014-07-09 歌尔声学股份有限公司 Noise elimination method, device and system of multi-microphone array
CN102800325A (en) * 2012-08-31 2012-11-28 厦门大学 Ultrasonic-assisted microphone array speech enhancement device
WO2014071788A1 (en) * 2012-11-08 2014-05-15 广州市锐丰音响科技股份有限公司 Sound receiving system
US9736562B2 (en) 2012-11-08 2017-08-15 Guangzhou Ruifeng Audio Technology Corporation Ltd. Sound receiving system
WO2015196729A1 (en) * 2014-06-27 2015-12-30 中兴通讯股份有限公司 Microphone array speech enhancement method and device
CN105355210A (en) * 2015-10-30 2016-02-24 百度在线网络技术(北京)有限公司 Preprocessing method and device for far-field speech recognition
CN106935246A (en) * 2015-12-31 2017-07-07 芋头科技(杭州)有限公司 Voice acquisition method and electronic device based on microphone array
CN105702261A (en) * 2016-02-04 2016-06-22 厦门大学 Sound focusing microphone array long distance sound pickup device having phase self-correcting function
CN105702261B (en) * 2016-02-04 2019-08-27 厦门大学 Sound focusing microphone array long range sound pick up equipment with phase self-correcting function
CN105869651A (en) * 2016-03-23 2016-08-17 北京大学深圳研究生院 Two-channel beam forming speech enhancement method based on noise mixed coherence
CN105869651B (en) * 2016-03-23 2019-05-31 北京大学深圳研究生院 Binary channels Wave beam forming sound enhancement method based on noise mixing coherence
CN107734412A (en) * 2016-08-11 2018-02-23 Gn 奥迪欧有限公司 Signal processor, signal processing method, earphone and computer-readable medium
CN107734412B (en) * 2016-08-11 2020-07-03 Gn 奥迪欧有限公司 Signal processor, signal processing method, headphone, and computer-readable medium
CN107895580B (en) * 2016-09-30 2021-06-01 华为技术有限公司 Audio signal reconstruction method and device
CN107895580A (en) * 2016-09-30 2018-04-10 华为技术有限公司 Audio signal reconstruction method and device
CN106782590A (en) * 2016-12-14 2017-05-31 南京信息工程大学 Microphone array beamforming method in a reverberant environment
CN108877828B (en) * 2017-05-16 2020-12-08 福州瑞芯微电子股份有限公司 Speech enhancement method/system, computer-readable storage medium, and electronic device
CN108877828A (en) * 2017-05-16 2018-11-23 福州瑞芯微电子股份有限公司 Speech enhancement method/system, computer-readable storage medium, and electronic device
CN107180642A (en) * 2017-07-20 2017-09-19 北京华捷艾米科技有限公司 Audio signal correction method, device and equipment
CN111226278A (en) * 2017-08-17 2020-06-02 塞伦妮经营公司 Low complexity voiced speech detection and pitch estimation
CN111226278B (en) * 2017-08-17 2023-08-25 塞伦妮经营公司 Low complexity voiced speech detection and pitch estimation
CN109493877B (en) * 2017-09-12 2022-01-28 清华大学 Voice enhancement method and device of hearing aid device
CN109493877A (en) * 2017-09-12 2019-03-19 清华大学 Voice enhancement method and device of hearing aid device
CN107749305A (en) * 2017-09-29 2018-03-02 百度在线网络技术(北京)有限公司 Speech processing method and device
CN107895582A (en) * 2017-10-16 2018-04-10 中国电子科技集团公司第二十八研究所 Speaker-adaptive speech emotion recognition method for the multi-source information field
CN107797096A (en) * 2017-10-20 2018-03-13 电子科技大学 Whistle detection and localization method based on a planar microphone array
CN108172229A (en) * 2017-12-12 2018-06-15 天津津航计算技术研究所 Identity authentication and reliable control method based on speech recognition
CN107863110A (en) * 2017-12-14 2018-03-30 西安Tcl软件开发有限公司 Safety reminder method based on an intelligent earphone, intelligent earphone and storage medium
CN108156545A (en) * 2018-02-11 2018-06-12 北京中电慧声科技有限公司 Array microphone
CN108156545B (en) * 2018-02-11 2024-02-09 北京中电慧声科技有限公司 Array microphone
CN111010649A (en) * 2018-10-08 2020-04-14 阿里巴巴集团控股有限公司 Sound pickup and microphone array
CN109741759B (en) * 2018-12-21 2020-07-31 南京理工大学 Acoustic automatic detection method for specific bird species
CN109741759A (en) * 2018-12-21 2019-05-10 南京理工大学 Acoustic automatic detection method for specific bird species
CN110197671A (en) * 2019-06-17 2019-09-03 深圳壹秘科技有限公司 Directional sound recording method, sound recording device and storage medium
CN110415720A (en) * 2019-07-11 2019-11-05 湖北工业大学 Quaternary differential microphone array super-directivity frequency-invariant beam forming method
CN110415720B (en) * 2019-07-11 2020-05-12 湖北工业大学 Quaternary differential microphone array super-directivity frequency-invariant beam forming method
CN110428851B (en) * 2019-08-21 2022-02-18 浙江大华技术股份有限公司 Beam forming method and device based on microphone array and storage medium
CN110428851A (en) * 2019-08-21 2019-11-08 浙江大华技术股份有限公司 Beam forming method and device based on microphone array and storage medium
CN111954121A (en) * 2020-08-21 2020-11-17 云知声智能科技股份有限公司 Microphone array directional pickup method and system
CN112462464A (en) * 2020-11-25 2021-03-09 上海思量量子科技有限公司 Cascadable filtering system and photon filtering method thereof
CN114639398A (en) * 2022-03-10 2022-06-17 电子科技大学 Broadband DOA estimation method based on microphone array
CN114639398B (en) * 2022-03-10 2023-05-26 电子科技大学 Broadband DOA estimation method based on microphone array

Similar Documents

Publication Publication Date Title
CN101447190A (en) Voice enhancement method employing combination of nesting-subarray-based post filtering and spectrum-subtraction
US10504539B2 (en) Voice activity detection systems and methods
CN106251877B (en) Voice sound source direction estimation method and device
Grenier A microphone array for car environments
CN106504763A (en) Microphone array multi-target speech enhancement method based on blind source separation and spectral subtraction
CN110148420A (en) Speech recognition method suitable for noisy environments
CN106782590A (en) Microphone array beamforming method in a reverberant environment
US20050249038A1 (en) System and process for time delay estimation in the presence of correlated noise and reverberation
CN110517701B (en) Microphone array speech enhancement method and implementation device
CN108198568B (en) Method and system for positioning multiple sound sources
CN103180900A (en) Systems, methods, and apparatus for voice activity detection
CN103339961A (en) Apparatus and method for spatially selective sound acquisition by acoustic triangulation
Roman et al. Binaural segregation in multisource reverberant environments
CN108447499B (en) Double-layer circular-ring microphone array speech enhancement method
CN108337605A (en) The hidden method for acoustic formed based on Difference Beam
CN111312275B (en) On-line sound source separation enhancement system based on sub-band decomposition
Priyanka A review on adaptive beamforming techniques for speech enhancement
López-Espejo et al. Dual-channel spectral weighting for robust speech recognition in mobile devices
Niwa et al. Optimal microphone array observation for clear recording of distant sound sources
CN107360497A (en) Method and device for estimating reverberation components
Guo et al. Underwater target detection and localization with feature map and CNN-based classification
CN110838303A (en) Voice sound source positioning method using microphone array
Pfeifenberger et al. Blind source extraction based on a direction-dependent a-priori SNR.
CN109243476A (en) Method and device for adaptive estimation of the late reverberation power spectrum in reverberant speech signals
Firoozabadi et al. Combination of nested microphone array and subband processing for multiple simultaneous speaker localization

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20090603