CN108694957A

CN108694957A - Echo cancellation design method based on circular microphone array beamforming

Info

Publication number: CN108694957A
Application number: CN201810304397.1A
Authority: CN
Inventors: 张正文; 陈卓; 黄翔; 巩朋成; 李婕; 涂斯纯
Original assignee: Hubei University of Technology
Current assignee: Hubei University of Technology
Priority date: 2018-04-08
Filing date: 2018-04-08
Publication date: 2018-10-23
Anticipated expiration: 2038-04-08
Also published as: CN108694957B

Abstract

The present invention proposes an echo cancellation design method based on circular microphone array beamforming, which mainly addresses the inability of the prior art to accurately apply gain to the signal source while suppressing the noise source. It is implemented as follows: (1) the sound field impulse response of the room is obtained with an adaptive filter; (2) the loudspeaker plays a calibration sound and the corresponding steering vector is designed, so that the circular microphone array identifies the direction of the interference source; (3) the weight coefficients of the spatial filter are designed from the resulting steering vector, and weighting yields the optimal directivity pattern of the circular microphone array; (4) in the time domain, a closed-loop subband adaptive filter is used to cancel the loudspeaker echo signal. The method combines spatial-domain beamforming with time-domain adaptive subband filtering; the resulting design cancels echo effectively and improves both speech signal quality and echo cancellation speed.

Description

Echo cancellation design method based on circular microphone array beamforming

Technical field

The invention belongs to the field of speech processing technology, and in particular relates to an echo cancellation design method based on circular microphone array beamforming, used to accurately apply gain to the signal source and suppress signals arriving from the noise source direction.

Background art

Microphone array signal processing is an emerging technology that has become a research hotspot in the field of speech signal processing. The signal received by a single microphone is a superposition of multiple sound sources and ambient noise, so the individual sources are difficult to separate, and sound source localization and separation cannot be achieved. To overcome these limitations of a single microphone, speech processing methods based on microphone arrays have emerged. A microphone array consists of a group of microphones arranged in a certain geometry (commonly linear or circular). It performs space-time processing on the sound signals collected from different spatial directions to achieve noise suppression, dereverberation, suppression of interfering speech, sound source direction finding, sound source tracking, and array gain, thereby improving speech signal processing quality and the speech recognition rate in real environments. Research on echo cancellation based on microphone arrays is therefore of great significance.

At present, research on microphone array echo cancellation focuses mainly on improving time-domain adaptive filtering methods, for example the least mean square (LMS) algorithm or the recursive least squares (RLS) algorithm. The latter aims to minimize the sum of squared errors between the desired response and the filter output. Each time a new sample of the input signal arrives, the least-squares problem is solved recursively and the adaptive filter coefficients are updated, so that the filter output matches the desired response as closely as possible in the least-squares sense. LMS filtering is a transient analysis method: at each time instant the sum of squares over all the input signals is re-evaluated and minimized by adjusting the weight vector. However, the performance of this method depends on the input signal, and LMS filters suffer from gradient noise amplification.

Summary of the invention

The object of the invention is to address the above shortcomings of existing methods by proposing an echo cancellation design method based on circular microphone array beamforming, used to accurately apply gain to the signal source and suppress signals from the noise source direction.

To achieve the above object, the technical solution adopted by the invention is:

An echo cancellation design method based on circular microphone array beamforming, characterized in that the method comprises the following steps:

(1) According to the principle of adaptive filtering, the sound field impulse response of the room, i.e. the spatial echo path from the loudspeaker to the microphone array, is estimated using the image method based on the ray model, providing the loudspeaker interference signal to be cancelled for subsequent echo cancellation;

(2) The loudspeaker plays a calibration sound, and the circular microphone array identifies the direction of the loudspeaker interference source using the TDOA (time-delay-based) direction-of-arrival estimation method;

(3) The bearing of the talker is identified using the TDOA direction-of-arrival estimation method. The bearing information of the loudspeaker interference and of the talker is used to obtain the array steering vector, and the weight coefficients of the spatial filter are then designed from the steering vector. Weighting yields the optimal spatial directivity pattern of the circular microphone array, in which the main lobe of the array points toward the talker and the loudspeaker interference direction falls in a low sidelobe of the array. The purpose of spatial filtering is to have the microphone array capture the talker's speech, applying gain to the signal source while suppressing signals from the interference source direction;

(4) Spatial filtering cuts off the acoustic loop between the loudspeaker and the microphone array, and the remaining residual echo signal is handled in the time domain. Because speech is a wideband signal, a closed-loop subband adaptive filter is used in the time domain to cancel the residual loudspeaker echo.

Further, step (1) is implemented as follows: suppose the room sound field channel is described by a linear time-invariant system and the source signal is $s(t)$; the signal $x(t)$ received by the microphone array can then be expressed as the convolution of $s(t)$ with the sound field channel impulse response $h(t)$, that is:

$$x(t) = s(t) * h(t) + n(t) \tag{1}$$

where $*$ denotes convolution and $n(t)$ is noise. The purpose of channel estimation is to solve for the impulse response $h(t)$ given $s(t)$ and $x(t)$. In the ray sound field model, the channel impulse response can be regarded as determined by a series of eigenrays, in which case the impulse response is:

$$h(t) = \sum_{i=1}^{M} A_i\,\delta(t - \tau_i) \tag{2}$$

where $M$ is the number of eigenrays, and $A_i$ and $\tau_i$ are the attenuation factor and propagation delay of the $i$-th eigenray, respectively.
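
The convolution model of equation (1) and the eigenray impulse response described above can be illustrated with a minimal Python sketch. The sampling rate, attenuation factors, delays and the function name `ray_impulse_response` are assumptions chosen for illustration and are not taken from the patent; delays are rounded to the nearest sample and the noise term n(t) is omitted.

```python
import numpy as np

def ray_impulse_response(amplitudes, delays_s, fs):
    """Discrete impulse response with one tap per eigenray (delay rounded to a sample)."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    idx = np.round(np.asarray(delays_s, dtype=float) * fs).astype(int)
    h = np.zeros(idx.max() + 1)
    np.add.at(h, idx, amplitudes)   # rays landing on the same sample add up
    return h

fs = 16000                                        # assumed sampling rate in Hz
h = ray_impulse_response([1.0, 0.6, 0.3],         # assumed attenuation factors A_i
                         [0.005, 0.012, 0.020],   # assumed propagation delays tau_i in seconds
                         fs)
rng = np.random.default_rng(0)
s = rng.standard_normal(1000)                     # stand-in for the calibration sound s(t)
x = np.convolve(s, h)                             # noise-free received signal x = s * h
```

Channel estimation is the inverse problem: recovering `h` when only `s` and `x` are known.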

Further, step (2) is implemented as follows: based on the array geometry and the TDOA method (direction-of-arrival estimation based on time-delay estimation), the sound of a fixed source propagates through the room and, because of the geometry of the microphone array, reaches any pair of microphones with a time delay between the two received signals. This delay can be solved from the cross-correlation function. Solving the equations for arbitrary microphone pairs jointly then gives the source bearing, so that the bearing of the loudspeaker calibration sound can be located.
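
The pairwise delay estimate at the heart of this step can be sketched as follows, assuming plain (unweighted) time-domain cross-correlation, an integer-sample delay and a noise-free signal. The function name `estimate_delay`, the sampling rate and the test signal are illustrative assumptions.

```python
import numpy as np

def estimate_delay(x_i, x_j, fs):
    """Return the delay of x_j relative to x_i in seconds (positive if x_j lags)."""
    corr = np.correlate(x_j, x_i, mode="full")    # cross-correlation over all lags
    lag = int(np.argmax(corr)) - (len(x_i) - 1)   # lag 0 sits at index len(x_i) - 1
    return lag / fs

fs = 16000                                  # assumed sampling rate in Hz
rng = np.random.default_rng(1)
src = rng.standard_normal(512)              # stand-in for the calibration sound
true_lag = 7                                # assumed delay in samples
x_i = np.concatenate((src, np.zeros(true_lag)))
x_j = np.concatenate((np.zeros(true_lag), src))
tau_ij = estimate_delay(x_i, x_j, fs)
```

Delays measured this way for several microphone pairs are what the joint solution for the source bearing takes as input.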

Further, step (3) is implemented as follows: once the source bearing has been found, a set of spatial angles is obtained, namely the horizontal azimuth and vertical elevation $(\alpha, \beta)$, from which the spatial steering vector $A$ of the array can be obtained; $(\cdot)^H$ denotes the conjugate transpose;

where $\tau_{ij}$ is the time delay between the signals received by array elements $i$ and $j$, and $\{\omega_{ij},\ i, j = 1, 2, \ldots, N\}$ are the spatial-domain weights;

Each channel of the microphone array's speech signal is weighted in the spatial domain to obtain the optimal directivity pattern;

The directivity pattern of the array is defined as:

$$P(\theta) = s\,\lvert \omega^H A(\theta) \rvert \tag{4}$$

Different weight vectors give signals from different directions different responses and thus form spatial beams in different directions, so that the beam main lobe points toward the desired source direction and the beam nulls point toward the interfering source directions.
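
How a weight vector shapes the array response can be sketched by evaluating the magnitude $\lvert \omega^H A(\theta) \rvert$ over a grid of angles. The steering vector used here is the standard narrowband far-field form for a uniform circular array with the source in the array plane; the patent's own steering vector expression is not reproduced in the text, so this form, the element count, radius, frequency, delay-and-sum weights and all function names are assumptions.

```python
import numpy as np

def uca_steering(theta, n_elems, radius, wavelength):
    """Far-field steering vector of a uniform circular array for in-plane azimuth theta (rad)."""
    phi = 2 * np.pi * np.arange(n_elems) / n_elems    # element angles on the circle
    return np.exp(1j * 2 * np.pi * radius / wavelength * np.cos(theta - phi))

def beam_pattern(weights, thetas, n_elems, radius, wavelength):
    """Evaluate |w^H A(theta)| for every azimuth in thetas."""
    return np.array([abs(np.vdot(weights, uca_steering(t, n_elems, radius, wavelength)))
                     for t in thetas])

n_elems, radius = 6, 0.05              # assumed: 6 elements on a 5 cm circle
wavelength = 343.0 / 2000.0            # assumed: 2 kHz tone, speed of sound 343 m/s
theta_0 = np.deg2rad(60.0)             # assumed look direction
w = uca_steering(theta_0, n_elems, radius, wavelength) / n_elems   # delay-and-sum weights
thetas = np.deg2rad(np.arange(0, 360))
pattern = beam_pattern(w, thetas, n_elems, radius, wavelength)
```

With these weights the response is 1 in the look direction and no larger anywhere else; other choices of `w` move the main lobe or place nulls.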

Spatial adaptive filtering is implemented using the maximum signal-to-noise ratio criterion. The array received signal is expressed as

$$x(t) = x_s(t) + x_n(t) \tag{5}$$

where $x_s(t)$ is the desired signal component and $x_n(t)$ is the interference-plus-noise component;

The array output after beamforming is then:

$$y(t) = \omega^H x(t) = \omega^H x_s(t) + \omega^H x_n(t) \tag{6}$$

Further, step (4) is implemented as follows: processing in the time domain uses a closed-loop subband adaptive filter. In essence, the received speech signal is split into several sub-signals across the spectrum, and within each subband the normalized least mean square (NLMS) adaptive algorithm is used to minimize the subband mean square error;

The core update equation of the closed-loop subband filter algorithm is:

The input signal has a delay $\Delta$ relative to the desired signal, and compensating for this delay is critical in the closed-loop subband structure;

where $L$ is the length of the sub-filters of the analysis and synthesis filter banks, and $[\cdot]$ denotes taking the integer part.
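
The normalized LMS adaptation applied within each subband can be sketched as follows. This is a generic real-valued NLMS update shown for a single band. It is not the patent's closed-loop subband update equation, which is not reproduced in the text, and it omits the delay compensation. The step size `mu`, the regularization `eps` and the three-tap test system are assumptions.

```python
import numpy as np

def nlms(x, d, num_taps, mu=0.5, eps=1e-8):
    """Adapt an FIR filter so that its output for input x tracks the desired signal d."""
    w = np.zeros(num_taps)            # filter coefficients
    buf = np.zeros(num_taps)          # most recent input samples, newest first
    e = np.zeros(len(x))              # error signal (residual echo)
    for n in range(len(x)):
        buf = np.roll(buf, 1)
        buf[0] = x[n]
        e[n] = d[n] - w @ buf                          # a-priori error
        w += mu * e[n] * buf / (buf @ buf + eps)       # step normalized by input power
    return w, e

rng = np.random.default_rng(0)
x = rng.standard_normal(2000)                 # stand-in for the loudspeaker reference signal
h_true = np.array([0.5, -0.3, 0.1])           # assumed short echo path
d = np.convolve(x, h_true)[:len(x)]           # microphone signal containing only echo
w, e = nlms(x, d, num_taps=3)
```

In a subband system one such filter runs per band at the decimated rate, which is why each filter can be much shorter than a full-band one.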

Compared with the prior art, the echo cancellation design method based on circular microphone array beamforming proposed by the invention has the following advantages:

(1) It combines spatial-domain beamforming with time-domain adaptive subband filtering; the resulting design cancels echo effectively and improves both speech signal quality and echo cancellation speed.

(2) Time-domain filtering is implemented with subband adaptive filtering, which splits the wideband input signal into several subband signals, each carrying a small part of the source signal, thereby reducing the processing complexity of the system.

(3) Current mainstream acoustic echo cancellation systems use microphone arrays with few elements, 3 to 5 on average, which strictly speaking cannot form a circular array; their localization accuracy is low and their spatial filtering performs poorly. The present invention uses subband filtering in the time domain, which is more efficient and accurate.

(4) It can place nulls in the interference directions and apply gain in the desired source direction, and can be used in teleconferencing systems with high performance requirements.

Description of the drawings

Fig. 1 is the overall implementation flowchart of the invention;

Fig. 2 is the spatial filtering distribution diagram of the invention;

Fig. 3 is a schematic diagram of the optimal directivity pattern of the invention;

Fig. 4 is the sub-flowchart of beamforming of the invention;

Fig. 5 is the bearing diagram of the uniform circular array of the invention;

Fig. 6 is a schematic diagram of the time-domain subband filtering method of the invention.

Detailed description of the embodiments

To help those of ordinary skill in the art understand and implement the invention, the invention is described in further detail below with reference to an embodiment. It should be understood that the embodiment described here serves only to illustrate and explain the invention and is not intended to limit it.

The echo cancellation design method based on circular microphone array beamforming mainly addresses the inability of the prior art to accurately apply gain to the signal source while suppressing the noise source. It is implemented as follows:

(1) the sound field impulse response of the room is obtained with an adaptive filter;

(2) the loudspeaker plays a calibration sound and the corresponding steering vector is designed, so that the circular microphone array identifies the direction of the interference source;

(3) the weight coefficients of the spatial filter are designed from the resulting steering vector, and weighting yields the optimal directivity pattern of the circular microphone array;

(4) in the time domain, a closed-loop subband adaptive filter is used to cancel the loudspeaker echo signal.

Based on the calibration sound emitted by the loudspeaker, the method first obtains the covariance matrix of the microphone array using convex optimization and matrix decomposition methods, then obtains the angle information of the interference signal and designs the corresponding steering vector. The weight coefficients of the spatial filter are designed from this steering vector so that the weights in the interference source direction are as small as possible, suppressing the collected interference signal to the greatest extent. The impulse response of an acoustic echo path is very long and can reach 200 ms to 300 ms for an ordinary room. With an adaptive FIR filter, thousands of taps or more are needed to model the actual echo path impulse response. Time-domain filtering is therefore implemented with subband adaptive filtering, which splits the wideband input signal into several subband signals, each carrying a small part of the source signal, thereby reducing the processing complexity of the system. The specific implementation steps are as follows:

1) According to the conference room space, a uniform circular microphone array with $N$ elements is determined. The $N$ elements form a peripheral low-sampling-rate circular array used to compute the source azimuth, and a separate high-sampling-rate element is located at the center of the circular array to capture high-quality speech. Using the relationship between the array received signal and the calibration sound played by the loudspeaker, the impulse response of the room sound field model is obtained adaptively. If the room sound field channel is described by a linear time-invariant system with transmitted signal $s(t)$, the received signal $x(t)$ can be expressed as the convolution of the transmitted signal with the sound field channel impulse response $h(t)$, that is:

$$x(t) = s(t) * h(t) + n(t) \tag{1}$$

where $*$ denotes convolution and $n(t)$ is noise. The purpose of channel estimation is to solve for the specific impulse response $h(t)$ given $s(t)$ and $x(t)$, so that the sound emitted by the loudspeaker can be estimated from the signal received by the microphone array. In the ray sound field model, the channel impulse response can be regarded as determined by a series of eigenrays, in which case the impulse response is:

$$h(t) = \sum_{i=1}^{M} A_i\,\delta(t - \tau_i)$$

where $M$ is the number of eigenrays, and $A_i$ and $\tau_i$ are the attenuation factor and propagation delay of the $i$-th eigenray.

2) The minimum element spacing is set relative to $\lambda$, the wavelength corresponding to the highest source frequency, so that the spatial sampling theorem is satisfied.
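
The spacing constraint can be illustrated with a short calculation. The sketch below assumes the common half-wavelength form of the spatial sampling theorem (adjacent elements no more than λ/2 apart); the patent's exact spacing expression is not reproduced in the text. The element count, maximum frequency, speed of sound and function names are likewise assumptions.

```python
import math

def adjacent_spacing(radius, n_elems):
    """Chord length between neighbouring elements of a uniform circular array."""
    return 2.0 * radius * math.sin(math.pi / n_elems)

def max_radius(n_elems, f_max, c=343.0):
    """Largest radius for which neighbouring elements are at most half a wavelength apart."""
    half_wavelength = c / f_max / 2.0
    return half_wavelength / (2.0 * math.sin(math.pi / n_elems))

r_max = max_radius(6, 4000.0)     # assumed: 6 peripheral elements, 4 kHz maximum frequency
d = adjacent_spacing(r_max, 6)    # spacing at that radius equals half a wavelength
```

For six elements the chord equals the radius, so under these assumptions the radius limit is simply half the shortest wavelength, about 4.3 cm.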

To compute the azimuth of the sound source, a three-dimensional coordinate system is set up with the center of the circular array as the origin. The coordinates of any two elements can be written as $B_i(x_i, y_i, z_i)$ and $B_j(x_j, y_j, z_j)$. The figure shows the geometry of source angle-of-arrival estimation: in the coordinate system $Oxyz$, $s$ is the direction of arrival of the source, $\alpha$ is the horizontal azimuth and $\beta$ is the vertical elevation. All elements of the uniform circular array lie in the $xy$ plane. From one element $B_i$, a perpendicular is dropped onto $s$, meeting it at point $A$. Because the source is modeled as far-field, the wave is planar when it reaches the microphone array, so the path difference between the wave reaching element $B_i$ and reaching the origin $O$ is $OA$, from which the phase difference between $B_i$ and $O$ follows. The phase difference between the origin and any two elements can be derived from the array geometry; formula (8) gives the relationship between source direction and element geometry derived from Fig. 3:

Formulas (9) and (10) give the phase differences between the signal received by the center element and the signals received by the $i$-th and $j$-th elements of the uniform circular microphone array; converted to the time domain, they are the delays with which the signal reaches the elements.

($i, j = 1, 2, 3, \ldots, N$)

Combining formulas (9) and (10), the phase difference between the signals received by any two elements is finally determined:

By computing the phase differences with which pairs of elements receive the same source, a set of horizontal azimuth and vertical elevation angles $(\alpha, \beta)$, i.e. the spatial information of the source, is obtained. In the present invention, this step is used to compute the bearings of the loudspeakers fixed in the space. The phase difference can be solved with the generalized cross-correlation function.
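
The far-field geometry described above can be sketched numerically. The code projects each element position onto the arrival direction to get its delay relative to the array center, then forms pairwise delays. This is the standard plane-wave model written from the description; the patent's formulas (8) to (10) are not reproduced in the text, so the sign convention, the function name `uca_delays` and the numeric values are assumptions.

```python
import numpy as np

def uca_delays(alpha, beta, n_elems, radius, c=343.0):
    """Arrival time of a far-field plane wave at each element, relative to the array centre."""
    phi = 2 * np.pi * np.arange(n_elems) / n_elems
    positions = radius * np.stack((np.cos(phi), np.sin(phi), np.zeros(n_elems)), axis=1)
    # Unit vector from the array centre toward the source (azimuth alpha, elevation beta).
    u = np.array([np.cos(beta) * np.cos(alpha),
                  np.cos(beta) * np.sin(alpha),
                  np.sin(beta)])
    return -(positions @ u) / c       # elements nearer the source hear the wavefront earlier

tau = uca_delays(np.deg2rad(30.0), np.deg2rad(20.0), n_elems=6, radius=0.05)
tau_ij = tau[:, None] - tau[None, :]  # pairwise delays between elements i and j
```

Multiplying a delay by the angular frequency gives the corresponding phase difference; bearing estimation inverts this mapping from measured pairwise delays back to the angles.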

The generalized cross-correlation method first computes the cross-power spectrum of the two signals, then applies a weighting in the frequency domain that whitens the signal and noise, emphasizing the frequency components with higher signal-to-noise ratio and suppressing the influence of noise, and finally transforms back to the time domain to obtain the cross-correlation function of the two signals, that is:

$B_i$ and $B_j$ denote the speech signals received by the $i$-th and $j$-th microphones. Using the geometry shown in Fig. 5, the received signals of any three microphone pairs are solved jointly to derive the source bearing, so that the bearing of the loudspeaker reference sound can be located. The cross-correlation function is a symmetric function of $\tau$ with a unique peak; searching for the abscissa of that peak gives the delay $\tau$ between the two signals. In the formula, the cross-spectral density function is weighted by $\psi_{ij}(\omega)$, the weighting function of the generalized cross-correlation.
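
A minimal sketch of generalized cross-correlation delay estimation follows. It assumes the PHAT weighting (division by the magnitude of the cross-power spectrum) as the whitening function; the patent does not state which $\psi_{ij}(\omega)$ it uses. The function name `gcc_phat`, the sampling rate and the test signal are also assumptions.

```python
import numpy as np

def gcc_phat(x_i, x_j, fs):
    """Delay of x_j relative to x_i in seconds, estimated with PHAT-weighted GCC."""
    n = len(x_i) + len(x_j)                  # zero-pad so circular correlation is linear
    X_i = np.fft.rfft(x_i, n=n)
    X_j = np.fft.rfft(x_j, n=n)
    cross = X_j * np.conj(X_i)               # cross-power spectrum
    cross /= np.abs(cross) + 1e-12           # PHAT weighting: keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)            # back to the lag domain
    max_lag = n // 2
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))   # reorder to lags -max_lag..max_lag
    return (int(np.argmax(cc)) - max_lag) / fs

fs = 16000                                   # assumed sampling rate in Hz
rng = np.random.default_rng(2)
src = rng.standard_normal(512)               # stand-in for the calibration sound
x_i = np.concatenate((src, np.zeros(5)))
x_j = np.concatenate((np.zeros(5), src))     # same signal arriving 5 samples later
tau = gcc_phat(x_i, x_j, fs)
```

The whitening sharpens the correlation peak, which is why weighted variants are usually preferred over plain cross-correlation in reverberant rooms.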

3) To obtain the optimal beamforming pattern, the spatial-domain weight coefficients of the uniform circular array are solved as follows:

The mathematical model of the microphone array output signal data is:

$$R_x = E[x(t)\,x^H(t)] = A R_S A^H + R_n \tag{13}$$

$R_n$ is the noise covariance matrix, $R_S$ is the signal covariance matrix, and $A$ is the array steering vector. $(\cdot)^H$ denotes the conjugate transpose. The spatial-domain steering vector is:

where $\tau_{ij}$ is the time delay between the signals received by array elements $i$ and $j$, and $\{\omega_{ij},\ i, j = 1, 2, \ldots, N\}$ are the spatial-domain weights.

Next, the weighted outputs of all elements are summed to give the array output:

where $\{\cdot\}^*$ denotes conjugation.

The directivity pattern of the array is defined as:

$$P(\theta) = s\,\lvert \omega^H A(\theta) \rvert \tag{16}$$

Different weight vectors give signals from different directions different responses and thus form spatial beams in different directions.

Spatial adaptive filtering is implemented using the maximum signal-to-noise ratio criterion. The array received signal is expressed as $x(t) = x_s(t) + x_n(t)$, where $x_s(t)$ is the desired signal component and $x_n(t)$ is the interference-plus-noise component.

The array output after beamforming is then:

$$y(t) = \omega^H x(t) = \omega^H x_s(t) + \omega^H x_n(t) \tag{17}$$

The adaptive weight vector $\omega$ is obtained from the maximum signal-to-noise ratio criterion:

where $R_S$ is the signal covariance matrix and $R_n$ is the noise-plus-interference covariance matrix. The optimal weight vector $\omega_{opt}$ that maximizes the output signal-to-noise ratio above is the eigenvector corresponding to the largest generalized eigenvalue of the matrix pair $(R_S, R_n)$.

For example, if there is one far-field signal source located at angle $\theta_0$, then

the maximum signal-to-noise ratio criterion yields the optimal weight vector, where $\alpha$ is a constant independent of $\theta_0$.
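
The statement that the optimal weight vector is the principal generalized eigenvector of $(R_S, R_n)$ can be checked numerically with a small sketch. The covariance matrices below are synthetic and all names are assumptions. For a single source with steering vector $a(\theta_0)$, the standard closed form is proportional to $R_n^{-1} a(\theta_0)$, which the sketch uses as a reference.

```python
import numpy as np

def max_snr_weights(R_s, R_n):
    """Eigenvector of R_n^{-1} R_s with the largest eigenvalue (principal generalized eigenvector)."""
    vals, vecs = np.linalg.eig(np.linalg.solve(R_n, R_s))
    return vecs[:, np.argmax(vals.real)]

def output_snr(w, R_s, R_n):
    """Output signal-to-noise ratio (w^H R_s w) / (w^H R_n w)."""
    return (np.vdot(w, R_s @ w) / np.vdot(w, R_n @ w)).real

rng = np.random.default_rng(0)
n_elems = 4
a = np.exp(1j * rng.uniform(0, 2 * np.pi, n_elems))   # assumed steering vector a(theta_0)
B = rng.standard_normal((n_elems, n_elems)) + 1j * rng.standard_normal((n_elems, n_elems))
R_n = B @ B.conj().T + np.eye(n_elems)                # synthetic interference-plus-noise covariance
R_s = 2.0 * np.outer(a, a.conj())                     # single source with power 2
w_opt = max_snr_weights(R_s, R_n)
w_ref = np.linalg.solve(R_n, a)                       # closed form, up to a scale factor
```

Both vectors point in the same direction, and the output SNR depends only on that direction and not on the scale of the weight vector.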

4) Processing in the time domain uses a closed-loop subband adaptive filter. In essence, the received speech signal is split into several sub-signals across the spectrum. This raises the convergence speed in every subband and substantially lowers the computational complexity, which improves computational efficiency.

The update equation of the closed-loop subband filter algorithm is:

The input signal has a delay $\Delta$ relative to the desired signal, and compensating for this delay is critical in the closed-loop subband structure.

$L$ is the length of the sub-filters of the analysis and synthesis filter banks, and $[\cdot]$ denotes taking the integer part.

The bearing diagram of the microphone array and loudspeakers of the invention is shown in Fig. 2. The performance parameters of the microphone array of the invention are listed in the following table:

Referring to Fig. 1, the invention is implemented in the following steps:

Step 1: assume two loudspeakers (each with a secondary reflection) and two talkers. The number of elements of the uniform circular array is then determined as 7 = 2*2 + 2 + 1, and the number of beams that can be formed is 6.

Step 2: after the position of the microphone array is fixed, the loudspeakers output a specific calibration audio, and the azimuth estimation method described above is used to locate the bearings of the interference sources formed by the two loudspeakers and their secondary reflections, four in total.

Step 3: the weighting coefficients corresponding to the interference source directions are computed using the adaptive beamforming method described above, so that four nulls are produced in the directivity pattern, reducing the gain in the interference source directions.
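
Placing four nulls while keeping gain toward a talker can be sketched with a linearly constrained minimum-norm beamformer. This is a swapped-in technique for illustration, not the patent's maximum-SNR adaptive beamformer. The sketch also assumes all seven elements lie on the circle (the patent places one at the center), an in-plane narrowband model, and invented angles, radius and frequency.

```python
import numpy as np

def uca_steering(theta, n_elems, radius, wavelength):
    """Far-field steering vector of a uniform circular array for in-plane azimuth theta (rad)."""
    phi = 2 * np.pi * np.arange(n_elems) / n_elems
    return np.exp(1j * 2 * np.pi * radius / wavelength * np.cos(theta - phi))

def null_steering_weights(theta_look, theta_nulls, n_elems, radius, wavelength):
    """Minimum-norm w with w^H a(theta_look) = 1 and w^H a(theta_null) = 0 for each null."""
    angles = [theta_look, *theta_nulls]
    C = np.stack([uca_steering(t, n_elems, radius, wavelength) for t in angles], axis=1)
    f = np.zeros(len(angles), dtype=complex)
    f[0] = 1.0                                    # unit response toward the talker only
    return np.linalg.pinv(C.conj().T) @ f         # minimum-norm solution of C^H w = f

n_elems, radius = 7, 0.05                         # assumed: 7 elements on a 5 cm circle
wavelength = 343.0 / 2000.0                       # assumed: 2 kHz, speed of sound 343 m/s
look = np.deg2rad(0.0)                            # assumed talker bearing
nulls = np.deg2rad([70.0, 130.0, 200.0, 290.0])   # assumed interference bearings
w = null_steering_weights(look, nulls, n_elems, radius, wavelength)
gain_look = abs(np.vdot(w, uca_steering(look, n_elems, radius, wavelength)))
gain_nulls = [abs(np.vdot(w, uca_steering(t, n_elems, radius, wavelength))) for t in nulls]
```

Five constraints (one look direction plus four nulls) fit within the degrees of freedom of a seven-element array, consistent with the element count chosen in Step 1.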

After the system enters working mode, the microphone array starts to collect the speech of the people on site, denoted $x(t)$. As when identifying the calibration sound direction, the array estimates the talker's bearing in real time and the adaptive beamformer updates the weighting coefficients $\omega_{ij}$ so that the main lobe of the directivity pattern points toward the voice source, increasing the gain applied to the talker's speech.

It should be understood that the parts not elaborated in this specification belong to the prior art.

It should be understood that the above description of the preferred embodiments is relatively detailed and should not therefore be regarded as limiting the scope of patent protection of the invention. Under the inspiration of the invention, those of ordinary skill in the art may make substitutions or variations without departing from the scope protected by the claims, all of which fall within the protection scope of the invention. The scope of protection claimed shall be determined by the appended claims.

Claims

1. An echo cancellation design method based on circular microphone array beamforming, characterized in that the method comprises the following steps:

(1) according to the principle of adaptive filtering, the sound field impulse response of the room, i.e. the spatial echo path from the loudspeaker to the microphone array, is estimated using the image method based on the ray model, providing the loudspeaker interference signal to be cancelled for subsequent echo cancellation;

(2) the loudspeaker plays a calibration sound, and the circular microphone array identifies the direction of the loudspeaker interference source using the TDOA (time-delay-based) direction-of-arrival estimation method;

(3) the bearing of the talker is identified using the TDOA direction-of-arrival estimation method; the bearing information of the loudspeaker interference and of the talker is used to obtain the array steering vector, and the weight coefficients of the spatial filter are then designed from the steering vector; weighting yields the optimal spatial directivity pattern of the circular microphone array, in which the main lobe of the array points toward the talker and the loudspeaker interference direction falls in a low sidelobe of the array; the purpose of spatial filtering is to have the microphone array capture the talker's speech, applying gain to the signal source while suppressing signals from the interference source direction;

(4) spatial filtering cuts off the acoustic loop between the loudspeaker and the microphone array, and the remaining residual echo signal is handled in the time domain; because speech is a wideband signal, a closed-loop subband adaptive filter is used in the time domain to cancel the residual loudspeaker echo.

2. The echo cancellation design method based on circular microphone array beamforming according to claim 1, characterized in that step (1) is implemented as follows: suppose the room sound field channel is described by a linear time-invariant system and the source signal is $s(t)$; the signal $x(t)$ received by the microphone array can then be expressed as the convolution of $s(t)$ with the sound field channel impulse response $h(t)$, that is:

$$x(t) = s(t) * h(t) + n(t) \tag{1}$$

where $*$ denotes convolution and $n(t)$ is noise; the purpose of channel estimation is to solve for the impulse response $h(t)$ given $s(t)$ and $x(t)$; in the ray sound field model, the channel impulse response can be regarded as determined by a series of eigenrays, in which case the impulse response is:

3. The echo cancellation design method based on circular microphone array beamforming according to claim 1, characterized in that step (2) is implemented as follows: based on the array geometry and the TDOA method, the sound of a fixed source propagates through the room and, because of the geometry of the microphone array, reaches any pair of microphones with a time delay between the two received signals; this delay can be solved from the cross-correlation function; solving the equations for arbitrary microphone pairs jointly then gives the source bearing, so that the bearing of the loudspeaker calibration sound can be located.

4. The echo cancellation design method based on circular microphone array beamforming according to claim 1, characterized in that step (3) is implemented as follows: once the source bearing has been found, a set of spatial angles is obtained, namely the horizontal azimuth and vertical elevation $(\alpha, \beta)$, from which the spatial steering vector $A$ of the array can be obtained; $(\cdot)^H$ denotes the conjugate transpose;

The directivity pattern of the array is defined as:

$$P(\theta) = s\,\lvert \omega^H A(\theta) \rvert \tag{4}$$

Different weight vectors give signals from different directions different responses and thus form spatial beams in different directions, so that the beam main lobe points toward the desired source direction and the beam nulls point toward the interfering source directions;

$$x(t) = x_s(t) + x_n(t) \tag{5}$$

The array output after beamforming is then:

$$y(t) = \omega^H x(t) = \omega^H x_s(t) + \omega^H x_n(t) \tag{6}$$

5. The echo cancellation design method based on circular microphone array beamforming according to claim 1, characterized in that step (4) is implemented as follows: processing in the time domain uses a closed-loop subband adaptive filter; in essence, the received speech signal is split into several sub-signals across the spectrum, and within each subband the normalized least mean square (NLMS) adaptive algorithm is used to minimize the subband mean square error;

The input signal has a delay $\Delta$ relative to the desired signal, and compensating for this delay is critical in the closed-loop subband structure;