CN102147458B

CN102147458B - Method and device for estimating direction of arrival (DOA) of broadband sound source

Info

Publication number: CN102147458B
Application number: CN 201010608798
Authority: CN
Inventors: 冯大航; 鲍明; 管鲁阳
Original assignee: Institute of Acoustics CAS
Current assignee: Institute of Acoustics CAS
Priority date: 2010-12-17
Filing date: 2010-12-17
Publication date: 2013-03-13
Anticipated expiration: 2030-12-17
Also published as: CN102147458A

Abstract

The invention relates to a method and device for estimating the direction of arrival (DOA) of a broadband sound source. The method comprises the following steps: selecting one section of sound data X(n)=[x1(n),...,xP(n)]T from sound data collected from a microphone array, wherein n is a subset of Z*; evenly dividing the selected sound data X(n) into L frames; carrying out N-point quick Fourier transform on each frame of sound data Xl (n); presenting the frequency domain of the obtained sound data as follows: Xl(k)=[xl1(k),...,xlP(k)]T, wherein k=1,..., N/2; then, calculating the modulus square value eta (k) of a coherence function between two microphones in the microphone array; randomly selecting the modulus square values of q coherence functions from the eta (k); and estimating the DOA only by using a frequency band corresponding to the modulus squares of the q coherence functions. According to the method and device, the computation amount of an algorithm is greatly lowered on the premise of ensuring that the precision is hardly lowered.

Description

A method and device for estimating the direction of arrival of a broadband sound source

技术领域 technical field

本发明涉及目标测向技术领域，特别涉及一种针对宽带声源的波达方向估计方法及其装置。The invention relates to the technical field of target direction finding, in particular to a method and device for estimating the direction of arrival of a broadband sound source.

背景技术 Background technique

目标测向在声呐、雷达、导航及无线传感器网络等领域都有重要的应用。而传感器阵列是目标测向的主要方式。Target direction finding has important applications in sonar, radar, navigation and wireless sensor networks and other fields. The sensor array is the main way to find the direction of the target.

在目标测向技术中，一个重要技术问题是如何对宽带声源进行定位。对于这方面已经有很多的算法。宽带MUSIC算法是先将宽带信号通过傅里叶变换到频域上，然后对每个频带用窄带的MUSIC法处理，最后将各个频带的结果综合得到声源的方向估计。即非相干子空间方法(ISM，Incoherent Sub-space Method)。这类方法运算量小，但精度差，并且只能处理非相干信号；最大似然(ML，Maximum Likelihood)估计算法为代表的拟合类算法，以其良好的渐进性，灵活的适用性，可低的计算复杂度，得到了广泛关注。针对宽带信号，Yao Kung教授等人提出了针对宽带声源的AML(Approximated Maximum Likelihood)波达方向估计算法。这类算法精度高，能处理相干信号，但运算量很大。同时，对于相干信号源还可以用聚焦矩阵的方法处理。首先需要利用低复杂度算法对声源方向进行一个预估，通过这个预估的角度构造聚焦矩阵，通过聚焦矩阵把不同频带的信号聚焦到一个中心频率上，然后把它当成一个窄带信号来处理，这种方法的缺点是需要知道声源的预估角度，并且预估角度的选择对于算法的性能影响很大。In the target direction finding technology, an important technical problem is how to locate the broadband sound source. There are already many algorithms for this. The wideband MUSIC algorithm first transforms the wideband signal into the frequency domain through Fourier transform, then processes each frequency band with the narrowband MUSIC method, and finally combines the results of each frequency band to obtain the direction estimation of the sound source. That is, the Incoherent Sub-space Method (ISM, Incoherent Sub-space Method). This type of method has a small amount of calculation, but poor accuracy, and can only deal with incoherent signals; the fitting algorithm represented by the maximum likelihood (ML, Maximum Likelihood) estimation algorithm, with its good gradualness and flexible applicability, Its low computational complexity has attracted widespread attention. For broadband signals, Professor Yao Kung and others proposed the AML (Approximated Maximum Likelihood) direction of arrival estimation algorithm for broadband sound sources. This type of algorithm has high precision and can deal with coherent signals, but it has a large amount of computation. At the same time, the method of focusing matrix can also be used for coherent signal sources. First of all, it is necessary to use a low-complexity algorithm to estimate the direction of the sound source, construct a focusing matrix through this estimated angle, focus signals of different frequency bands on a center frequency through the focusing matrix, and then treat it as a narrow-band signal , the disadvantage of this method is that the estimated angle of the sound source needs to be known, and the selection of the estimated angle has a great influence on the performance of the algorithm.

发明内容 Contents of the invention

本发明的目的在于，在保证精度基本不降低的情况下极大降低算法的运算量。The purpose of the present invention is to greatly reduce the calculation amount of the algorithm under the condition that the accuracy is basically not reduced.

为达到上述目的，提出一种针对宽带声源的波达方向估计方法，该方法具体步骤包括：In order to achieve the above purpose, a DOA estimation method for broadband sound sources is proposed. The specific steps of the method include:

步骤1)：从传声器阵列采集的声音数据中，选取一段声音数据X(n)＝[x₁(n)，...，x_P(n)]^T，其中，P表示传声器阵列中传声器的个数，P≥2；将该段声音数据X(n)均匀分成L帧，每帧的数据X_l(n)的长度为N，X(n)的数据长度为N×L，l＝1，...，L；n∈Z^*；Step 1): From the sound data collected by the microphone array, select a piece of sound data X(n)=[x ₁ (n), ..., x _P (n)] ^T , where P represents the number of microphones in the microphone array number, P≥2; the section of sound data X (n) is evenly divided into L frames, the length of the data X _l (n) of each frame is N, and the data length of X (n) is N×L, l=1 ,...,L; n∈Z ^* ;

对每一帧声音数据X_l(n)做N点快速傅里叶变换，得到声音数据的频域表示X_l(k)＝[x_l1(k)，...，x_lP(k)]^T，其中k＝1，2，...，N/2；Perform N-point fast Fourier transform on each frame of sound data X _l (n), and obtain the frequency domain representation of sound data X _l (k)=[x _l1 (k), ..., x _lP (k)] ^T , where k=1, 2, ..., N/2;

步骤2)：根据所述的步骤1)得到的X_l(k)按照下式(1)计算传声器阵列中的两个传声器间相干函数的模平方η(k)；Step 2): X _l (k) that obtains according to described step 1) calculates the modular square η (k) of the coherence function between two microphones in the microphone array according to the following formula (1);

$η η ((k k)) = = {| | {γ γ}_{i i,, j j} ((k k)) | |}^{22} = = \frac{{| | {Φ Φ}_{i i,, j j} ((k k)) | |}^{22}}{{Φ Φ}_{i i,, i i} ((k k)) {Φ Φ}_{j j,, j j} ((k k))} - - - - - - ((11))$

其中，k＝1，2，...，N/2；i，j∈[1，2，...，P]；Φ_i，j(k)为第i个传声器和第j个传声器的互功率谱，Φ_i，i(k)为第i个传声器的自功率谱，

Φ_j，j(k)为第j个传声器的自功率谱，

Among them, k=1, 2,..., N/2; i, j∈[1, 2,..., P]; Φ _{i, j} (k) is the i-th microphone and the j-th microphone cross power spectrum, Φ _{i, i} (k) is the self-power spectrum of the i-th microphone,

Φ _{j, j} (k) is the self-power spectrum of the jth microphone,

步骤3)：根据所述的步骤2)得到的η(k)随机选出q个相干函数模的平方；Step 3): according to the η (k) that described step 2) obtains, randomly select the square of q coherent function modules;

步骤4)：根据所述的步骤3)选出的q个相干函数模的平方对应的声音数据X_l(k)进行波达方向估计，其中，k∈[k₁，...，k_q]，k_q表示频带序号，且[k₁，...，k_q]∈[1，...，N/2]。Step 4): According to the sound data X _l (k) corresponding to the squares of the q coherence function modules selected in the above step 3), the direction of arrival is estimated, where k∈[k ₁ ,...,k _q ], k _q represents the serial number of the frequency band, and [k ₁ ,...,k _q ]∈[1,...,N/2].

所述的步骤1)还包括：保存所述的传声器阵列到当前为止采集的连续L帧声音数据；并保存来自所述传声器阵列最新的1帧声音数据后，将保存的连续L帧声音数据中最早的1帧声音数据删除。The step 1) also includes: saving the continuous L frames of sound data collected so far by the microphone array; The oldest sound data of 1 frame is deleted.

所述的步骤2)中，两个传声器为传声器阵列中空间距离最大的两个传声器。In the step 2), the two microphones are the two microphones with the largest spatial distance in the microphone array.

所述的步骤2)中，两个传声器之间连续(L-1)帧的数据作为数据缓存，根据当前获得第L帧的N个数据和缓存的N×(L-1)个数据来求取相干函数的模平方。In the described step 2), the data of consecutive (L-1) frames between the two microphones is used as a data buffer, and the N × (L-1) data obtained according to the N data of the Lth frame currently obtained and the cache are calculated Take the square of the magnitude of the coherence function.

所述的步骤4)中，根据所述的步骤3)选出的q个相干函数模的平方对应的第L帧声音数据X_L(k)进行波达方向估计，其中，k∈[k₁，...，k_q]，k_q表示频带序号，且[k₁，...，k_q]∈[1，...，N/2]。In the step 4), direction of arrival estimation is performed on the L-th frame sound data X _L (k) corresponding to the squares of the q coherence function modules selected in the step 3), wherein, k∈[k ₁ ,...,k _q ], k _q represents the frequency band number, and [k ₁ ,...,k _q ]∈[1,...,N/2].

所述的步骤3)中，比较所述的步骤2)中得到的η(k)大小，从最大的开始依次挑选前q个η(k)值，即η(k₁)≥η(k₂)≥...≥η(k_q)≥η(k_m)，其中，k_q表示频带序号，[k₁，...，k_q]∈[1，2，....，N/2]，

In the step 3), compare the η(k) values obtained in the step 2), and select the first q η(k) values sequentially from the largest, that is, η(k ₁ )≥η(k ₂ )≥...≥η(k _q )≥η(k _m ), where k _q represents the frequency band number, [k ₁ ,...,k _q ]∈[1,2,...,N/ 2],

为实现上述目的，提出一种针对宽带声源的波达方向估计装置，其特征在于，该装置包括：傅里叶变换模块、相干函数的模平方模块和波达方向估计模块；In order to achieve the above object, a kind of direction-of-arrival estimation device for broadband sound source is proposed, it is characterized in that, the device comprises: a Fourier transform module, a modulus square module of a coherent function and a direction-of-arrival estimation module;

所述的傅里叶变换模块，包括：数据选择单元、数据分帧单元和傅里叶变换单元；用于所述的数据选择单元从传声器阵列采集的声音数据中，选取一段声音数据X(n)＝[x₁(n)，...，x_P(n)]^T，所述的数据分帧单元将所述的数据选择单元选择的声音数据X(n)均匀分成L帧，所述的傅里叶变换单元对每一帧声音数据X_l(n)做N点快速傅里叶变换，得到声音数据的频域表示X_l(k)＝[x_l1(k)，...，x_lP(k)]^T；并将所述频域的声音数据输出至所述的相干函数的模平方模块；其中，P表示传声器阵列中传声器的个数，P≥2；每帧的数据X_l(n)的长度为N，l＝1，...，L，X(n)的数据长度为N×L；n∈Z^*；k＝1，2，...，N/2；The Fourier transform module includes: a data selection unit, a data framing unit and a Fourier transform unit; for the data selection unit to select a section of sound data X(n) from the sound data collected by the microphone array )=[x ₁ (n), ..., x _P (n)] ^T , the data framing unit divides the sound data X(n) selected by the data selection unit into L frames evenly, and the The Fourier transform unit performs N-point fast Fourier transform on each frame of sound data X _l (n), and obtains the frequency domain representation of sound data X _l (k)=[x _l1 (k),..., x _lP (k)] ^T ; and the sound data of the frequency domain is output to the modular square module of the coherence function; wherein, P represents the number of microphones in the microphone array, and P≥2; the data x of each frame The length of _l (n) is N, l=1,...,L, the data length of X(n) is N×L; n∈Z ^* ; k=1,2,...,N/2;

所述的相干函数的模平方模块，用于根据所述的傅里叶变换模块得到的X_l(k)按照式(2)计算传声器阵列中的两个传声器间相干函数的模平方η(k)；The module square module of described coherence function is used for calculating the module square η(k) of the coherence function between two microphones in the microphone array according to the X _l (k) that described Fourier transform module obtains );

$η η ((k k)) = = {| | {γ γ}_{i i,, j j} ((k k)) | |}^{22} = = \frac{{| | {Φ Φ}_{i i,, j j} ((k k)) | |}^{22}}{{Φ Φ}_{i i,, i i} ((k k)) {Φ Φ}_{j j,, j j} ((k k))} - - - - - - ((22))$

其中，k＝1，2，...，N/2；i，j∈[1，2，...，P]；Φ_i，j(k)为第i个传声器和第j个传声器的互功率谱，

Φ_i，i(k)为第i个传声器的自功率谱，

Φ_j，j(k)为第j个传声器的自功率谱，

Among them, k=1, 2,..., N/2; i, j∈[1, 2,..., P]; Φ _{i, j} (k) is the i-th microphone and the j-th microphone cross power spectrum,

Φ _{i, i} (k) is the self-power spectrum of the i-th microphone,

Φ _{j, j} (k) is the self-power spectrum of the jth microphone,

所述的波达方向估计模块，用于根据从所述的相干函数的模平方模块得到的相干函数的模平方中随机选出q个相干函数模的平方对应的声音数据X_l(k)进行波达方向估计，其中，k∈[k₁，...，k_q]，k_q表示频带序号，且[k₁，...，k_q]∈[1，...，N/2]。The direction of arrival estimation module is used to randomly select the sound data X _l (k) corresponding to the square of the q coherent function modulus from the modulus square of the coherent function obtained from the modulus square module of the coherent function. Direction of arrival estimation, where k∈[k ₁ ,...,k _q ], k _q represents the frequency band number, and [k ₁ ,...,k _q ]∈[1,...,N/2 ].

所述的相干函数的模平方模块中，两个传声器为传声器阵列中空间距离最大的两个传声器。In the modular square module of the coherence function, the two microphones are the two microphones with the largest spatial distance in the microphone array.

所述的装置还包括声音数据缓存模块；所述的声音数据缓存模块，用于保存传声器阵列到当前为止采集的连续L帧声音数据；并保存接收到来自传声器阵列最新的1帧声音数据后，将保存的连续L帧声音数据中最早的1帧声音数据删除；并将缓存的声音数据输出至所述的傅里叶变换模块。The device also includes a sound data buffer module; the sound data buffer module is used to save the continuous L frames of sound data collected by the microphone array so far; and after saving the latest 1 frame of sound data received from the microphone array, Deleting the earliest frame of sound data among the saved consecutive L frames of sound data; and outputting the buffered sound data to the Fourier transform module.

所述的相干函数的模平方模块将选取的两个传声器之间连续(L-1)帧的数据作为数据缓存，根据当前获得第L帧的N个数据和缓存的N×(L-1)个数据来求取相干函数的模平方。The modular square module of the coherence function uses the data of consecutive (L-1) frames between the selected two microphones as a data buffer, and obtains N data of the L-th frame and the N×(L-1) of the buffer according to the current data to find the modulus square of the coherence function.

所述的波达方向估计模块根据所述的比较模块得到的q个相干函数模的平方对应的第L帧声音数据X_L(k)进行波达方向估计，其中，k∈[k₁，...，k_q]，k_q表示频带序号，且[k₁，...，k_q]∈[1，...，N/2]。The direction of arrival estimation module performs direction of arrival estimation on the L-th frame sound data X _L (k) corresponding to the squares of the q coherence function modules obtained by the comparison module, where k∈[k ₁ ,. .., k _q ], k _q represents the serial number of the frequency band, and [k ₁ , ..., k _q ]∈[1, ..., N/2].

该装置还包括一比较模块，用于根据比较所述的相干函数的模平方模块得到的η(k)大小，从最大的开始依次挑选前q个η(k)值，即η(k₁)≥η(k₂)≥...≥η(k_q)≥η(k_m)，其中，k_q表示频带序号，[k₁，...，k_q]∈[1，2，....，N/2]，

；将挑选出的相干函数的模平方输出至所述的波达方向估计模块。The device also includes a comparison module, which is used to select the first q η(k) values sequentially from the largest one, i.e. η(k ₁ ) ≥η(k ₂ )≥...≥η(k _q )≥η(k _m ), where k _q represents the frequency band number, [k ₁ ,...,k _q ]∈[1, 2, .. ..,N/2],

; Output the modulus square of the selected coherence function to the DOA estimation module.

本发明的优点在于，本发明可以大大降低运算量。同时这种简化算法可适用于背景技术部分所讨论的任何算法。以AML算法为例，由于AML算法的运算量最大，通过验证可以得出在定向效果相当的情况下，大大降低了运算量。The advantage of the present invention is that the present invention can greatly reduce the calculation amount. At the same time, this simplified algorithm can be applied to any algorithm discussed in the background art section. Taking the AML algorithm as an example, since the AML algorithm has the largest amount of calculation, it can be concluded that the calculation amount is greatly reduced when the orientation effect is equivalent.

附图说明 Description of drawings

图1为传声器阵列的空间结构示意图；Fig. 1 is a schematic diagram of the spatial structure of a microphone array;

图2为本发明提出的一种针对宽带声源的波达方向估计方法流程图；Fig. 2 is a flow chart of a method for estimating the direction of arrival of a broadband sound source proposed by the present invention;

图3为仿真试验中白噪声情况下本发明实施例方案与全频带方案进行波达方向估计的结果比较图；Fig. 3 is a comparison diagram of the results of DOA estimation between the embodiment scheme of the present invention and the full-band scheme under the white noise situation in the simulation test;

图4为仿真试验中有色噪声情况下本发明实施例方案与全频带方案进行波达方向估计的结果比较图；Fig. 4 is a comparison diagram of the results of direction of arrival estimation between the embodiment scheme of the present invention and the full-band scheme under the condition of colored noise in the simulation test;

图5为野外试验中本发明实施例方案与全频带方案进行波达方向估计的结果比较图。Fig. 5 is a comparison diagram of DOA estimation results between the solution of the embodiment of the present invention and the full-band solution in the field test.

具体实施方式 Detailed ways

下面结合附图和实施例对本发明进一步说明。The present invention will be further described below in conjunction with the accompanying drawings and embodiments.

设有R个全向(omni-directional)传声器，第p个传声器接收信号为：There are R omni-directional microphones, and the signal received by the pth microphone is:

${x x}_{p p} ((n no)) = = {Σ Σ}_{m m = = 11}^{M m} {s the s}_{00}^{((m m))} ((n no - - {t t}_{p p}^{((m m))})) + + {w w}_{p p} ((n no)) - - - - - - ((11))$

其中：n＝0，...，L-1，L为一帧信号采样点个数；p＝1，...，R，R为接收阵列中传声器的个数；m＝1，...，M，M为声源数量(M＜R)。为源信号，为用采样点数表示的时间延迟。w_p(n)为阵列接收信号中噪声分量。其中，时间延迟

根据阵型及目标声源位置的几何关系获得。对于一个远场条件下的R元圆阵，即R个传声器等距排列在一个圆周上，第p个传声器的用采样点数表示的时间延迟为：Among them: n=0,..., L-1, L is the number of sampling points of a frame signal; p=1,..., R, R is the number of microphones in the receiving array; m=1,... ., M, M is the number of sound sources (M<R). is the source signal, is the time delay expressed in number of sampling points. w _p (n) is the noise component in the received signal of the array. Among them, the time delay

It is obtained according to the geometric relationship between the formation and the position of the target sound source. For an R-element circular array under far-field conditions, that is, R microphones are arranged equidistantly on a circle, the time delay of the p-th microphone represented by the number of sampling points is:

${t t}_{p p}^{((m m))} = = \frac{{f f}_{s the s} r r}{c c} cos cos ((\frac{22 π π ((p p - - 11))}{R R} - - {θ θ}_{m m})) - - - - - - ((22))$

其中，f_s为信号采样频率，r为圆阵半径，c为各向同性介质中的波速，θ_m为第m个源的远场方位角。Among them, f _s is the signal sampling frequency, r is the radius of the circular array, c is the wave velocity in the isotropic medium, and θ _m is the far-field azimuth of the mth source.

对接收信号做N点的FFT变换，得频率域信号模型：Perform N-point FFT transformation on the received signal to obtain the frequency domain signal model:

X(k)＝D(Θ，k)S₀(k)+η(k) (3)X(k)=D(Θ,k)S ₀ (k)+η(k) (3)

式中，k＝0，...，N/2；频域阵列输出信号：X(k)＝[X₁(k)，...，X_R(k)]^T；M个声源的频谱：

方向矩阵(Steering Matrix)为：D(Θ，k)＝[d⁽¹⁾(k)，...，d^(M)(k)]；方向矢量(Steering Vector)为：

η(k)为噪声频谱。In the formula, k=0,...,N/2; frequency domain array output signal: X(k)=[X ₁ (k),...,X _R (k)] ^T ; M sound sources Spectrum:

The direction matrix (Steering Matrix) is: D(Θ, k)=[d ⁽¹⁾ (k), ..., d ^(M) (k)]; the direction vector (Steering Vector) is:

η(k) is the noise spectrum.

噪声谱向量η(k)是方差为Lσ²零均值复数高斯白噪声。注意到即使w(n)为任意独立同分布的随机变量，根据中心极限定理，都可推知η(k)服从高斯分布。The noise spectrum vector η(k) is the zero-mean complex Gaussian white noise with variance ^Lσ2 . Note that even if w(n) is any independent and identically distributed random variable, according to the central limit theorem, it can be deduced that η(k) obeys the Gaussian distribution.

定义 S(k)＝D(k)S₀(k) (4)Definition S(k)＝D(k)S ₀ (k) (4)

得到NR/2维空间-时频向量：Get NR/2-dimensional space-time-frequency vector:

X＝G(θ)+ξ₁ (5)X＝G(θ)+ξ ₁ (5)

其中，G(θ)＝[S(0)^T，...，S(N/2)^T]^T，并且有：where G(θ)=[S(0) ^T , ..., S(N/2) ^T ] ^T , and there are:

${R R}_{ξ ξ} = = E E. [[{ξξ ξξ}^{H h}]] = = L L [\begin{matrix} {σ σ}^{22} {I I}_{R R} & 00 \\ {σ σ}^{22} {I I}_{R R} \\ L L \\ 00 & {σ σ}^{22} {I I}_{R R} \end{matrix}] - - - - - - ((66))$

其中，上角标^H代表复矩阵的共轭转置。where the superscript ^H represents the conjugate transpose of a complex matrix.

假设初始未知参数空间为：Suppose the initial unknown parameter space is:

$Θ Θ = = {[[{θ θ}_{s the s}^{T T},, {S S}_{00}^{{((11))}^{T T}},, . . . . . .,, {S S}_{00}^{{((M m))}^{T T}}]]}^{T T} - - - - - - ((77))$

其中，源方位角

源频谱

where the source azimuth

source spectrum

所以有最大似然函数：So there is the maximum likelihood function:

$f f ((Θ Θ)) = = \frac{11}{{((22 π π))}^{NR NR / / 22 / / 22} {| | {R R}_{ξ ξ} | |}^{11 / / 22}} exp exp {{- - \frac{11}{22} {[[X x - - G G ((Θ Θ))]]}^{H h} {R R}_{ξ ξ}^{- - 11} [[X x - - G G ((Θ Θ))]]}} - - - - - - ((88))$

对数化该似然函数并略去常数项得：Logarithmizing this likelihood function and omitting the constant term gives:

$L L ((Θ Θ)) = = - - \frac{{| | | | X x - - G G ((Θ Θ)) | | | |}^{22}}{{R R}_{ξ ξ}} - - - - - - ((99))$

可以得到最优化准则：The optimization criterion can be obtained:

$\underset{Θ Θ}{max max} L L ((Θ Θ)) = = \underset{Θ Θ}{min min} {Σ Σ}_{k k = = 00}^{N N / / 22} \frac{{| | | | X x ((k k)) - - D D. ((k k)) {S S}_{00} ((k k)) | | | |}^{22}}{{R R}_{ξ ξ} ((k k))}$

$= = \underset{Θ Θ}{min min} {Σ Σ}_{k k = = 00}^{N N / / 22} \frac{{| | | | X x ((k k)) - - D D. ((k k)) {S S}_{00} ((k k)) | | | |}^{22}}{{σ σ}^{22}} - - - - - - ((1010))$

这等价于对所有k个函数，找到又由于σ²为常数，f(k)可简写为：This is equivalent to, for all k functions, finding And because σ ² is a constant, f(k) can be abbreviated as:

f(k)＝||X(k)-D(k)S₀(k)||² (11)f(k)＝||X(k)-D(k)S ₀ (k)|| ² (11)

使f(k)取的极小值的源信号矢量S₀(k)一定满足因此为了对任何源定位有最小的残差，源矢量的估计给定：The source signal vector S ₀ (k) that makes f(k) take the minimum value must satisfy So in order to have the smallest residual error for any source location, the source vector estimate is given by:

其中，

是方向矩阵D(k)的伪逆矩阵(pseudo-inverse)。定义正交规划那么AML算法的波达方向估计可以通过下式得到。in,

is the pseudo-inverse of the direction matrix D(k). Define an Orthogonal Program Then the direction of arrival estimation of the AML algorithm can be obtained by the following formula.

${θ θ}_{s the s} = = \underset{{θ θ}_{s the s}}{max max} {Σ Σ}_{k k = = 00}^{N N / / 22} {| | | | P P ((k k,, {θ θ}_{s the s})) X x ((k k)) | | | |}^{22} - - - - - - ((1313))$

声音信号虽然是宽带的，但信号一般都不可能是白的，它一般都存在一些谐频结构，即信号主要集中在某些频段上；或由于噪声是有色的，某些频带的信号完全被噪声湮没掉了，因此在计算时，这些频带上的信息是冗余的，甚至会影响最后的结果。所以采用一定的方法来选取某些频带是必要的，本发明方案中，将频带上的相干函数作为选择频带的标准。Although the sound signal is broadband, it is generally impossible for the signal to be white. It generally has some harmonic frequency structure, that is, the signal is mainly concentrated in certain frequency bands; or because the noise is colored, the signal in certain frequency bands is completely blocked. The noise is annihilated, so when calculating, the information on these frequency bands is redundant and may even affect the final result. Therefore, it is necessary to use a certain method to select certain frequency bands. In the solution of the present invention, the coherence function on the frequency band is used as the criterion for selecting the frequency band.

相干函数(coherence)用于描述两组数据之间的相关性。本发明方案中，相干函数的定义为：The coherence function is used to describe the correlation between two sets of data. In the scheme of the present invention, the definition of coherence function is:

${γ γ}_{xy xy} = = \frac{{G G}_{xy xy} ((f f))}{\sqrt{{G G}_{xx xxx} ((f f)) {G G}_{yy yy} ((f f))}} - - - - - - ((1414))$

其中，x和y代表两组不同的数据，G_xy(f)为x和y的互谱，G_xx(f)和G_yy(f)分别为x和y的自谱。G_xy(f)＝E{X(f)Y^*(f)}，G_xx(f)＝E{X(f)X^*(f)}，G_yy(f)＝E{Y(f)Y^*(f)}。Among them, x and y represent two different sets of data, G _xy (f) is the cross-spectrum of x and y, G _xx (f) and G _yy (f) are the auto-spectrum of x and y respectively. G _xy (f) = E{X(f)Y ^* (f)}, G _xx (f) = E{X(f)X ^* (f)}, G _yy (f) = E{Y(f) Y ^* (f)}.

幅度平方相干函数(MSC)为相干函数γ_xy的模的平方：The magnitude squared coherence function (MSC) is the square of the modulus of the coherence function γ _xy :

${| | {γ γ}_{xy xy} ((f f)) | |}^{22} = = \frac{{| | {G G}_{xy xy} ((f f)) | |}^{22}}{{G G}_{xx xxx} ((f f)) {G G}_{yy yy} ((f f))} - - - - - - ((1515))$

由此，可见相干函数指的都是频域上的运算。From this, it can be seen that the coherence function refers to operations in the frequency domain.

相干函数的取值为0到1之间，即：The value of the coherence function is between 0 and 1, namely:

0≤|γ_xy|²≤1 (16)0≤| _γxy | ^2≤1 (16)

当取值为0时表示频段x和y上的频谱完全不相关，取值为1表示频段x和y上的频谱完全相关。When the value is 0, it means that the spectrum on the frequency band x and y is completely uncorrelated, and the value is 1, which means that the spectrum on the frequency band x and y is completely correlated.

在特定的情况下有：信噪比

由此可见MSC可以用来衡量信号某个频带上的信噪比的大小。当|γ_xy(k)|²较大时，表明第k个频带上有用的信号较多；反之，则说明第k频带上噪声占主要部分。In specific cases there are: SNR

It can be seen that MSC can be used to measure the signal-to-noise ratio in a certain frequency band of the signal. When |γ _xy (k)| ² is large, it indicates that there are more useful signals in the kth frequency band; otherwise, it indicates that the noise in the kth frequency band is the main part.

传声器阵列的空间结构示意图，如图1所示。使用图1所示的十字阵型的传声器阵列进行风噪信号采集，圆圈101表示12枚B&K 4189型标准传声器，并配有直径为10cm的球形海绵风罩。102表示用于固定这些传声器的支架。十字阵上三个4元阵列的孔径分别是0.2米，0.6米，1.2米。采用SONY SIR-1000型16通道同步数字录音机，采样频率为48kHz。数据分析时，通过低通滤波、归一化、降采样到1kHz进行分析。A schematic diagram of the spatial structure of the microphone array is shown in Figure 1. Use the cross-shaped microphone array shown in Figure 1 to collect wind noise signals, circle 101 represents 12 B&K 4189 standard microphones, and is equipped with a spherical sponge windshield with a diameter of 10cm. 102 denotes brackets for fixing these microphones. The apertures of the three 4-element arrays on the cross array are 0.2m, 0.6m and 1.2m respectively. Using SONY SIR-1000 16-channel synchronous digital recorder, the sampling frequency is 48kHz. During data analysis, low-pass filtering, normalization, and downsampling to 1kHz were used for analysis.

使用实测数据，统计阵列风噪声的空间相关性。采用下式计算通道间平均相关系数：Using the measured data, the spatial correlation of the wind noise of the array is counted. The average correlation coefficient between channels was calculated using the following formula:

$\overset{&OverBar; &OverBar;}{γ γ} ((k k)) = = \frac{22}{R R ((R R - - 11))} {Σ Σ}_{i i = = 11}^{R R} {Σ Σ}_{j j = = i i + + 11}^{R R} | | {γ γ}_{ij ij} ((k k)) | | - - - - - - ((1717))$

其中，k＝0，...，N/2；R为通道数，γ_ij(k)为第i通道与第j通道的互相关系数，

为平均相关系数。Wherein, k=0,...,N/2; R is the number of channels, and γ _ij (k) is the cross-correlation coefficient between the i-th channel and the j-th channel,

is the average correlation coefficient.

表1 风噪声通道间平均相关系数Table 1 Average correlation coefficient between wind noise channels

表1说明，与低频风噪声的空间相关性不同，中高频处的风噪声通道间相关系数小，空间相关性微弱，阵列风噪声基本可以认为通道间相互独立。Table 1 shows that, unlike the spatial correlation of low-frequency wind noise, the inter-channel correlation coefficient of wind noise at medium and high frequencies is small, and the spatial correlation is weak. The array wind noise can basically be considered independent of the channels.

本发明方案是在采集了声音数据之后，根据相干系数对声音数据进行筛选，筛选出携带有用信号较多的频段的声音数据，而放弃其他频段的声音数据，这样一来，就减少了波达方向估计所需的数据量，从而减小后续的波达方向估计的运算量。The solution of the present invention is to filter the sound data according to the coherence coefficient after collecting the sound data, to filter out the sound data of the frequency bands carrying more useful signals, and to discard the sound data of other frequency bands, thus reducing the wave arrival rate. The amount of data required for direction estimation, thereby reducing the amount of computation for subsequent DOA estimation.

本发明提出的一种针对宽带声源的波达方向估计方法流程图；如图2所示，包括如下步骤：A flow chart of a method for estimating the direction of arrival of a broadband sound source proposed by the present invention; as shown in Figure 2, it includes the following steps:

步骤201：从由P个传声器组成的传声器阵列采集的声音数据中，选择一段声音数据x(n)＝[x₁(n)，...，x_P(n)]^T，其中，n∈Z^*。Step 201: Select a piece of sound data x(n)=[x ₁ (n), . . . , x _P (n)] ^T from the sound data collected by a microphone array composed of P microphones, where n Z ^* .

步骤202：将所选择的声音数据x(n)均匀分成L段x_l(n)，l＝1，...，L，每段的数据x_l(n)的长度为N，x(n)的数据数据长度为N×L。Step 202: the selected sound data x (n) is evenly divided into L segments x _l (n), l=1, ..., L, the length of each segment of data x _l (n) is N, x (n ) data data length is N×L.

步骤203：对每一段声音数据x_l(n)做N点快速傅里叶变换(FFT)，得到声音数据的频域表示为X_l(k)＝[X_l1(k)，...，X_lP(k)]^T，其中，k＝1，2，...，N/2。Step 203: do N-point Fast Fourier Transform (FFT) to each section of sound data x _l (n), and obtain the frequency domain representation of the sound data as X _l (k)=[X _l1 (k), ..., X _lP (k)] ^T , where k=1, 2, . . . , N/2.

步骤204：计算传声器阵列中的两个传声器间的相干函数模的平方(MSC)

其中，i，j∈[1，2，...，P]，Φ_i，j(k)为第i个传声器和第j个传声器的互功率谱，Φ_i，i(k)和Φ_j，j(k)分别为第i个传声器和第j个传声器的自功率谱：Step 204: Calculate the square of the coherence function modulus (MSC) between two microphones in the microphone array

Among them, i, j∈[1, 2, ..., P], Φ _{i, j} (k) is the cross-power spectrum of the i-th microphone and the j-th microphone, Φ _{i, i} (k) and Φ _{j , j} (k) are the self-power spectra of the i-th microphone and the j-th microphone respectively:

${Φ Φ}_{i i,, i i} ((k k)) = = {Σ Σ}_{l l = = 11}^{L L} {| | {X x}_{li li} ((k k)) | |}^{22};;$ ${Φ Φ}_{j j,, j j} ((k k)) = = {Σ Σ}_{l l = = 11}^{L L} {| | {X x}_{lj lj} ((k k)) | |}^{22};;$ ${Φ Φ}_{i i,, j j} ((k k)) = = {Σ Σ}_{l l = = 11}^{L L} {X x}_{li li} ((k k)) {X x}_{lj lj}^{* *} ((k k)) . .$

较佳地，所述两个传声器可以为传声器阵列中空间距离最大的两个传声器。Preferably, the two microphones may be the two microphones with the largest spatial distance in the microphone array.

步骤205：从相干函数模的平方η(k)中挑选出较大的q个相干函数模的平方值，即η(k₁)≥η(k₂)≥...≥η(k_q)≥η(k_m)，其中，k_q为频带序号，[k₁，...，k_q]∈[1，2，....，N/2]，

Step 205: Select larger q coherent function modulus square values from the coherent function modulus square η(k), that is, η(k ₁ )≥η(k ₂ )≥...≥η(k _q ) ≥η(k _m ), where k _q is the frequency band number, [k ₁ ,...,k _q ]∈[1,2,...,N/2],

步骤206：使用挑选出的q个相干函数模的平方对应的声音数据X_l(k)进行波达方向估计。其中，k∈[k₁，...，k_q]；以AML算法为例，则角度的计算公式由(18)可知，Step 206: Use the selected sound data X _l (k) corresponding to the squares of the q coherence function moduli to perform DOA estimation. Among them, k∈[k ₁ ,...,k _q ]; taking the AML algorithm as an example, the calculation formula of the angle can be known from (18),

${θ θ}_{s the s} = = \underset{{θ θ}_{s the s}}{max max} {Σ Σ}_{k k = = {k k}_{11}}^{{k k}_{q q}} {| | | | P P ((k k,, {θ θ}_{s the s})) {X x}_{l l} ((k k)) | | | |}^{22} - - - - - - ((1818))$

由于相干函数的计算需要较多数据，即N×L要较大，因此可能影响方法的实时性。在相干函数的计算中，保存了之前连续(L-1)帧的数据作为数据缓存，根据当前获得的N个数据和缓存的N×(L-1)个数据求取MSC。并且每次获得新一帧数据之后，都丢弃最前一帧的数据，更新整个数据缓存。如系统中的全部数据为x_l(n)，l＝1，...，L，其中x_L(n)为当前获得的一帧数据，而x_l(n)，l＝1，...，L-1为以前的数据。使用x_l(n)，l＝1，...，L算MSC，而仅使用x_L(n)进行角度估计。Since the calculation of the coherence function requires more data, that is, N×L is larger, it may affect the real-time performance of the method. In the calculation of the coherence function, the data of the previous consecutive (L-1) frames are saved as a data cache, and the MSC is calculated according to the currently obtained N data and the cached N×(L-1) data. And each time a new frame of data is obtained, the data of the previous frame is discarded, and the entire data cache is updated. For example, all the data in the system are x _l (n), l=1,..., L, where x _L (n) is a frame of data currently obtained, and x _l (n), l=1, .. ., L-1 is the previous data. MSC is calculated using x _l (n), l=1, . . . , L, while only x _L (n) is used for angle estimation.

本发明还提出一种针对宽带声源的波达方向估计的装置，包括：The present invention also proposes a device for estimating the direction of arrival of a broadband sound source, including:

傅立叶变换模块，用于对P个传声器组成的传声器阵列采集的声音数据进行快速傅立叶变换，得到声音数据的频域表示，并将所述频域声音数据输出至相干函数的模平方模块；The Fourier transform module is used to perform fast Fourier transform on the sound data collected by the microphone array composed of P microphones to obtain the frequency domain representation of the sound data, and output the frequency domain sound data to the modular square module of the coherence function;

相干函数的模平方模块，用于计算传声器阵列中两个传声器间的声音数据的相干函数的模平方，并将计算得到的相干函数的模平方输出至比较模块；The module square of the coherence function is used to calculate the module square of the coherence function of the sound data between two microphones in the microphone array, and the module square of the calculated coherence function is output to the comparison module;

比较模块，用于从相干函数的模平方中挑选出较大的q个值；A comparison module is used to select larger q values from the modulus square of the coherence function;

波达方向估计模块，用于根据比较模块所挑选出的相干函数的模平方对应的声音数据的频带进行波达方向估计。The direction of arrival estimation module is used to estimate the direction of arrival according to the frequency band of the sound data corresponding to the modulus square of the coherence function selected by the comparison module.

较佳地，所述相干函数的模平方模块用于计算传声器阵列中距离最远的两个传声器间声音数据的相干函数的模平方，并将计算得到的相干函数的模平方输出至比较模块。Preferably, the module square of the coherence function is used to calculate the module square of the coherence function of the sound data between the two farthest microphones in the microphone array, and output the calculated module square of the coherence function to the comparison module.

较佳地，所述傅立叶变换模块包括：Preferably, the Fourier transform module includes:

数据选择单元，用于从由P个传声器组成的传声器阵列采集的声音数据中选择长度为N×L的一段声音数据作为待处理声音数据x(n)＝[x₁(n)，...，x_P(n)]^T，其中，n∈Z^*；The data selection unit is used to select a piece of sound data with a length of N×L from the sound data collected by the microphone array composed of P microphones as the sound data to be processed x(n)=[x ₁ (n),... , x _P (n)] ^T , where n∈Z ^* ;

数据分帧单元，用于将所述数据选择单元所选择的待处理声音数据x(n)分成L段，其中任一段的声音数据为x_l(n)，l＝1，...，L，并将分段后的声音数据输出；以及A data framing unit for dividing the sound data x(n) to be processed selected by the data selection unit into L segments, wherein the sound data of any segment is x _l (n), l=1,...,L , and output the segmented sound data; and

傅立叶变换单元，用于对所述数据分帧单元输出的每一段声音数据x_l(n)做N点快速傅里叶变换，得到声音数据的频域表示为X_l(k)＝[X_l1(k)，...，X_lP(k)]^T，其中，k＝1，2，...，N/2。The Fourier transform unit is used to do N-point fast Fourier transform to each section of sound data x _l (n) output by the data framing unit, and obtain the frequency domain expression of the sound data as X _l (k)=[X _l1 (k), . . . , X _lP (k)] ^T , where k=1, 2, . . . , N/2.

较佳地，相干函数的模平方模块计算传声器阵列第i个传声器和第j个传声器之间的相干函数的模平方为：

其中，i，j∈[1，2，...，P]，Φ_i，j(k)为第i个传声器和第j个传声器的互功率谱，Φ_i，i(k)和Φ_j，j(k)分别为第i个传声器和第j个传声器的自功率谱。Preferably, the module square of the coherence function calculates the module square of the coherence function between the i-th microphone and the j-th microphone of the microphone array as:

Among them, i, j∈[1, 2, ..., P], Φ _{i, j} (k) is the cross-power spectrum of the i-th microphone and the j-th microphone, Φ _{i, i} (k) and Φ _{j , j} (k) are the self-power spectra of the i-th microphone and the j-th microphone, respectively.

较佳地，所述比较模块从N/2个相干函数的模平方η(k)中挑选出较大的q个相干函数的模平方值，即η(k₁)≥η(k₂)≥...≥η(k_q)，k_q表示频带序号，k＝1，2，...，N/2，[k₁，...，k_q]∈[1，2，....，N/2]。Preferably, the comparison module selects the larger q modulus square values of coherent functions from the modulus squares η(k) of N/2 coherent functions, that is, η(k ₁ )≥η(k ₂ )≥ ...≥η(k _q ), k _q represents the serial number of the frequency band, k=1, 2, ..., N/2, [k ₁ , ..., k _q ]∈[1, 2, ... ., N/2].

较佳地，所述DOA估计模块根据比较模块所挑选出的相干函数的模平方对应的声音数据的频带进行波达方向估计。Preferably, the DOA estimating module performs direction-of-arrival estimation according to the frequency band of the sound data corresponding to the modulus square of the coherence function selected by the comparing module.

较佳地，该装置进一步包括：Preferably, the device further includes:

声音数据缓存模块，用于保存所述传声器阵列到当前为止采集的连续L帧声音数据；并当接收到来自所述传声器阵列最新的1帧声音数据后，将保存的连续L帧声音数据中最早的1帧声音数据删除。The sound data cache module is used to save the continuous L frames of sound data collected so far by the microphone array; and after receiving the latest 1 frame of sound data from the microphone array, the earliest continuous L frames of sound data will be saved 1 frame of sound data is deleted.

所述傅立叶变换模块用于对声音数据缓存模块中存储的声音数据进行行快速傅立叶变换得到声音数据的频域表示，并将所述频域声音数据输出至相干函数的模平方模块。The Fourier transform module is used to perform row-fast Fourier transform on the sound data stored in the sound data buffer module to obtain the frequency domain representation of the sound data, and output the frequency domain sound data to the modulus square module of the coherence function.

以下通过仿真实验以及实地实验的结果对本发明方案的技术效果进行说明。在仿真试验中，传声器组阵列为圆阵，阵元数为4，圆阵的半径为0.2m，声源为白噪声，个数为1，入射角为70°。使用全频带的AML算法和频带选取的AML算法结果对比分别如图3和图4所示，其中，图3为白噪声的情况，图4为有色噪声的情况。其中有色噪声是由白噪声通过一个50阶的FIR滤波器所产生的。The technical effect of the solution of the present invention will be described below through the results of simulation experiments and field experiments. In the simulation test, the microphone group array is a circular array, the number of array elements is 4, the radius of the circular array is 0.2m, the sound source is white noise, the number is 1, and the incident angle is 70°. The comparison of the results of the AML algorithm using the full frequency band and the AML algorithm with frequency band selection is shown in Figure 3 and Figure 4, respectively, where Figure 3 is the case of white noise, and Figure 4 is the case of colored noise. The colored noise is generated by passing white noise through a 50-order FIR filter.

从图3可以看出，当背景为白噪声的条件时，低信噪比下全频带的AML算法要优于频带选取的AML算法；高信噪比下这两种算法的偏差相近都趋向于零。从图4可以看出，当背景为有色噪声时，低信噪比使用全频带的AML算法已经不是最优的，从图中可以看出在不同信噪比下应选取不同的频带数，这是由于背景为有色噪声，在某些频带上信噪比会很大，而在某些频带上信噪比会很小，因此选取过多的频带会使这些信噪比低的频带也参与运算，从而使算法的偏差增大；高信噪比下，两种算法的偏差也都趋近于0。It can be seen from Figure 3 that when the background is white noise, the full-band AML algorithm with low SNR is better than the AML algorithm with frequency band selection; the deviations of the two algorithms are similar and tend to be zero. It can be seen from Figure 4 that when the background is colored noise, it is not optimal to use the full-band AML algorithm for low SNRs. It can be seen from the figure that different numbers of frequency bands should be selected under different SNRs. Because the background is colored noise, the signal-to-noise ratio will be large in some frequency bands, and the signal-to-noise ratio will be small in some frequency bands, so selecting too many frequency bands will make these frequency bands with low signal-to-noise ratio also participate in the calculation , so that the deviation of the algorithm increases; under high signal-to-noise ratio, the deviation of the two algorithms also tends to 0.

在野外实际实验中，阵列为圆阵，阵元数为4，圆阵半径为0.2m，声源为坦克绕阵列做圆周运动。实验结果如图5所示，图标“*”是全频带搜索的结果，图标“o”是只取相干系数最大的20个频带计算所得的结果，从图中可以看出，两者定位精度基本一样，但使用相干系数选择频带后，运算量是原来的1/15。In the actual field experiment, the array is a circular array, the number of array elements is 4, the radius of the circular array is 0.2m, and the sound source is a tank moving in a circle around the array. The experimental results are shown in Figure 5. The icon "*" is the result of the full-band search, and the icon "o" is the result calculated by taking only the 20 frequency bands with the largest coherence coefficients. It can be seen from the figure that the positioning accuracy of the two is basically the same. The same, but after using the coherence coefficient to select the frequency band, the amount of calculation is 1/15 of the original.

通过以上的实施方式的描述，本领域的技术人员可以清楚地了解到本发明可借助软件加必需的硬件平台的方式来实现，当然也可以全部通过硬件来实施，但很多情况下前者是更佳的实施方式。基于这样的理解，本发明的技术方案对背景技术做出贡献的全部或者部分可以以软件产品的形式体现出来，该计算机软件产品可以存储在存储介质中，如ROM/RAM、磁碟、光盘等，包括若干指令用以使得一台计算机设备执行本发明各个实施例或者实施例的某些部分所述的方法。计算机设备如：个人计算机，服务器，或者网络设备等。Through the description of the above embodiments, those skilled in the art can clearly understand that the present invention can be realized by means of software plus a necessary hardware platform, and of course all can be implemented by hardware, but in many cases the former is better implementation. Based on this understanding, all or part of the contribution made by the technical solution of the present invention to the background technology can be embodied in the form of software products, and the computer software products can be stored in storage media, such as ROM/RAM, magnetic disks, optical disks, etc. , including several instructions for causing a computer device to execute the methods described in various embodiments or some parts of the embodiments of the present invention. Computer equipment such as: personal computer, server, or network equipment, etc.

最后所应说明的是，以上实施例仅用以说明本发明的技术方案而非限制。尽管参照实施例对本发明进行了详细说明，本领域的普通技术人员应当理解，对本发明的技术方案进行修改或者等同替换，都不脱离本发明技术方案的精神和范围，其均应涵盖在本发明的权利要求范围当中。Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention rather than limit them. Although the present invention has been described in detail with reference to the embodiments, those skilled in the art should understand that modifications or equivalent replacements to the technical solutions of the present invention do not depart from the spirit and scope of the technical solutions of the present invention, and all of them should be included in the scope of the present invention. within the scope of the claims.

Claims

1. Wave arrival direction estimating method for broadband sound source, the method concrete steps comprise:

Step 1): from the voice data that microphone array gathers, choose one section voice data X (n)=[x ₁(n) ..., x _P(n)] ^T, wherein, P represents the number of microphone in the microphone array, P 〉=2; This section voice data X (n) evenly is divided into the L frame, the data X of every frame _l(n) length is N, and the data length of X (n) is N * L, l=1 ..., L; N ∈ Z ^*

To each frame voice data X _l(n) do the conversion of N point quick Fourier, obtain the frequency domain representation X of voice data _l(k)=[x _L1(k) ..., x _LP(k)] ^T, k=1 wherein, 2 ..., N/2;

Step 2): the X that obtains according to described step 1) _l(k) calculate the mould square η (k) of coherence function between two microphones in the microphone array according to following formula (1);

η (k) = {| γ_{i, j (k)} |}^{2} = \frac{{| Φ_{i, j} (k) |}^{2}}{Φ_{i, i} (k) Φ_{j, j} (k)} - - - (1)

Wherein, k=1,2 ..., N/2; I, j ∈ [1,2 ..., P]; Φ _{I, j}(k) be the cross-power spectrum of i microphone and j microphone,

Φ _{I, i}(k) be the auto-power spectrum of i microphone,

Φ_{i, i} (k) = Σ_{l = 1}^{L} {| x_{li} (k) |}^{2};

Φ _{J, j}(k) be the auto-power spectrum of j microphone,

Φ_{j, j} (k) = Σ_{l = 1}^{L} {| x_{lj} |}^{2};

Step 3): according to described step 2) η (k) that obtains from the beginning of maximum select successively q coherence function mould square;

Step 4): the voice data X of square correspondence of q the coherence function mould of selecting according to described step 3) _l(k) carry out direction of arrival and estimate, wherein, k ∈ [k ₁..., k _q], k _qExpression band number, and [k ₁..., k _q] ∈ [1 ..., N/2].

2. the Wave arrival direction estimating method for broadband sound source according to claim 1 is characterized in that, described step 1) also comprises: preserve the continuous L frame voice data that described microphone array gathers till current; And preserve from behind the up-to-date 1 frame voice data of described microphone array, with the earliest 1 frame voice data deletion in the continuous L frame voice data of preserving.

3. the Wave arrival direction estimating method for broadband sound source according to claim 1 is characterized in that, described step 2) in, two microphones are two microphones of space length maximum in the microphone array.

4. the Wave arrival direction estimating method for broadband sound source according to claim 2, it is characterized in that, described step 2) in, between two microphones continuously the data of (L-1) frame ask for the mould square of coherence function as data buffer storage according to N data of the current acquisition L frame individual data of the N that deposits * (L-1) of easing up.

5. the Wave arrival direction estimating method for broadband sound source according to claim 4 is characterized in that, in the described step 4), and the L frame voice data X of square correspondence of q the coherence function mould of selecting according to described step 3) _L(k) carry out direction of arrival and estimate, wherein, k ∈ [k ₁..., k _q], k _qExpression band number, and [k ₁..., k _q] ∈ [1 ..., N/2].

6. the Wave arrival direction estimating method for broadband sound source according to claim 1, it is characterized in that in the described step 3), more described step 2) in η (k) size that obtains, select successively a front q η (k) value, i.e. η (k from the beginning of maximum ₁) 〉=η (k ₂) 〉=... 〉=η (k _q) 〉=η (k _m), wherein, k _qThe expression band number, [k ₁..., k _q∈ [1,2 ...., N/2],

7. the direction of arrival estimation unit for broadband sound source is characterized in that this device comprises: the mould of Fourier transform module, coherence function square module and direction of arrival estimation module;

Described Fourier transform module comprises: data selection unit, data framing unit and Fourier transform unit; Described data selection unit is chosen one section voice data X (n)=[x from the voice data that microphone array gathers ₁(n) ..., x _P(n)] ^T, the voice data X (n) that described data selection unit is selected in described data framing unit evenly is divided into the L frame, and described Fourier transform unit is to each frame voice data X _l(n) do the conversion of N point quick Fourier, obtain the frequency domain representation X of voice data _l(k)=[x _L1(k) ..., x _LP(k)] ^TAnd described frequency domain voice data is exported to the mould square module of described coherence function; Wherein, P represents the number of microphone in the microphone array, P 〉=2; The data X of every frame _l(n) length is N, l=1 ..., L, the data length of X (n) is N * L; N ∈ Z ^*K=1,2 ..., N/2;

The mould of described coherence function square module is used for the X that obtains according to described Fourier transform module _l(k) calculate the mould square η (k) of coherence function between two microphones in the microphone array according to formula (2);

η (k) = {| γ_{i, j (k)} |}^{2} = \frac{{| Φ_{i, j} (k) |}^{2}}{Φ_{i, i} (k) Φ_{j, j} (k)} - - - (2)

Φ _{I, i}(k) be the auto-power spectrum of i microphone,

Φ_{i, i} (k) = Σ_{l = 1}^{L} | {X_{lj} (k) |}_{;}^{2}

Φ _{J, j}(k) be the auto-power spectrum of j microphone,

Φ_{j, j} (k) = Σ_{l = 1}^{L} {| X_{lj} (k) |}^{2};

Described direction of arrival estimation module, the mould square that is used for the coherence function that obtains according to the mould square module from described coherence function is selected the voice data X of square correspondence of q coherence function mould successively from the beginning of maximum _l(k) carry out direction of arrival and estimate, wherein, k ∈ [k ₁..., k _q], k _qExpression band number, and [k ₁..., k _q] ∈ [1 ..., N/2].

8. the direction of arrival estimation unit for broadband sound source according to claim 7 is characterized in that, in the mould of the described coherence function square module, two microphones are two microphones of space length maximum in the microphone array.

9. the direction of arrival estimation unit for broadband sound source according to claim 7 is characterized in that described device also comprises the voice data cache module;

Described voice data cache module be used for to be preserved the continuous L frame voice data that microphone array gathers till current; And preserve and to receive from behind the up-to-date 1 frame voice data of microphone array, with the earliest 1 frame voice data deletion in the continuous L frame voice data of preserving; And export the voice data of buffer memory to described Fourier transform module.

10. the direction of arrival estimation unit for broadband sound source according to claim 9, it is characterized in that, between two microphones that the mould of described coherence function square module will be chosen continuously the data of (L-1) frame ask for the mould square of coherence function as data buffer storage according to N data of the current acquisition L frame individual data of the N that deposits * (L-1) of easing up.

11. the direction of arrival estimation unit for broadband sound source according to claim 10 is characterized in that, the L frame voice data X of square correspondence of q the coherence function mould that described direction of arrival estimation module obtains according to comparison module _L(k) carry out direction of arrival and estimate, wherein, k ∈ [k ₁..., k _q].

12. the direction of arrival estimation unit for broadband sound source according to claim 7, it is characterized in that, this device also comprises a comparison module, be used for η (k) size that the mould square module according to more described coherence function obtains, select successively a front q η (k) value, i.e. η (k from the beginning of maximum ₁) 〉=η (k ₂) 〉=... 〉=η (k _q) 〉=η (k _m), wherein, k _qThe expression band number, [k ₁..., k _q] ∈ [1,2 ...., N/2],

Square export the mould of the coherence function picked out to described direction of arrival estimation module.