CN113933779A

CN113933779A - DOA estimation method for an unknown number of sound sources based on the S-transform

Info

Publication number: CN113933779A
Application number: CN202111206753.4A
Authority: CN
Inventors: 钟舜聪; 黎昕婷; 钟剑锋; 许令鸿; 吴生源; 谢宇昆
Original assignee: Fuzhou University
Current assignee: Fuzhou University
Priority date: 2021-10-15
Filing date: 2021-10-15
Publication date: 2022-01-14

Abstract

The invention relates to a DOA (direction of arrival) estimation method for an unknown number of sound sources based on the S-transform. The method applies the S-transform to the voice signals and establishes a time-frequency domain array signal model whose form is consistent with the traditional array model. The S-transform introduces multi-resolution analysis: it retains the adaptive time-frequency window of the wavelet transform while being free of the wavelet admissibility condition. Then, exploiting the fact that the power spectrum matrices of a frequency band at different moments share a joint diagonalization structure, the model is applied to the traditional method to obtain a new spatial spectrum. Because this spatial spectrum is based on the S-transform result, it removes the need of traditional MUSIC to preset the number of signal sources, and it gives better DOA estimation precision.

Description

DOA estimation method for an unknown number of sound sources based on the S-transform

Technical Field

The invention belongs to the field of voice signal positioning, and particularly relates to a DOA (direction of arrival) estimation method for an unknown number of sound sources based on the S-transform.

Background

Direction of arrival (DOA) estimation based on sensor arrays has long been a research hotspot in sonar, radar, communication, speech processing and related fields, and the demands of modern technology mean that it is no longer limited to traditional narrowband signal processing. A speech signal is a broadband non-stationary signal, so traditional narrowband signal processing cannot locate the sound source with high precision.

Broadband DOA estimation algorithms fall into two main types in principle: the incoherent signal subspace method and the coherent signal subspace method. The former uses frequency decomposition to split a broadband source signal into a series of narrowband signals and then performs DOA estimation on each. The latter is based on frequency focusing; its most typical form is the two-sided correlation transformation method, in which the decomposed signals are transformed to a common reference frequency through a focusing matrix and DOA estimation is then carried out in narrowband fashion. The coherent signal subspace method, however, needs a preliminary estimate of the source angles, and the dependence of the focusing matrix on these estimated angles biases the final DOA estimate.

For broadband non-stationary signals, researchers have also worked in the time-frequency domain, making full use of the time-frequency information of the broadband signal to obtain more accurate DOA estimates. Common time-frequency analysis methods include the short-time Fourier transform, the S-transform and the wavelet transform; the latter two are capable of multi-resolution analysis because their window functions are variable. The S-transform lies between the short-time Fourier transform and the wavelet transform: it overcomes the inability of the short-time Fourier transform to adapt the analysis window to frequency, introduces multi-resolution analysis, and retains the adaptive time-frequency window of the wavelet transform while its basic wavelet is free of the admissibility condition.

Disclosure of Invention

The purpose of the present invention is to solve the above problems in the prior art by providing a DOA estimation method for an unknown number of sound sources based on the S-transform.

To achieve this purpose, the technical scheme of the invention is as follows: a DOA (direction of arrival) estimation method for an unknown number of sound sources based on the S-transform, comprising the following steps:

1) in a sound field, arranging a uniform linear microphone array, and synchronously acquiring a segment of voice signal with a data acquisition device to obtain the array received signal;

2) setting the S-transform parameters according to the effective frequency band of the array received signal, and then performing the S-transform on the signal to obtain the S-transform result;

3) constructing a time-frequency array data model under the S-transform according to the S-transform result obtained in step 2) to obtain the time-frequency domain steering vector;

4) performing a correlation operation on the S-transform result according to the time-frequency array data model obtained in step 3) to obtain the autocorrelation spectrum covariance matrix of the array signal;

5) constructing the target direction-of-arrival estimation under a noise condition according to the time-frequency domain steering vector of step 3) and the autocorrelation spectrum covariance matrix of the array signal obtained in step 4);

6) solving the minimization function of the target direction-of-arrival estimation obtained in step 5) to obtain the spatial spectrum estimation result;

7) setting a search range and a search step, and locating all spectral peaks through a spectrum search, the peak positions giving the DOA estimates.
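The spectrum search of step 7) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `find_spectrum_peaks`, the relative-height threshold `min_rel_height`, the search range and step, and the toy two-peak spectrum are all hypothetical and not part of the method; any spatial spectrum sampled on the search grid could be passed in. No source count is supplied, in keeping with the aim of the method.

```python
import numpy as np

def find_spectrum_peaks(angles_deg, spectrum, min_rel_height=0.5):
    """Return the grid angles at which `spectrum` has a local maximum.

    min_rel_height is a hypothetical threshold: local maxima lower than this
    fraction of the global maximum are discarded as ripple.
    """
    angles_deg = np.asarray(angles_deg, dtype=float)
    spectrum = np.asarray(spectrum, dtype=float)
    inner = np.arange(1, spectrum.size - 1)
    # A grid point is a peak if it exceeds its left neighbour and is not
    # lower than its right neighbour.
    is_peak = (spectrum[inner] > spectrum[inner - 1]) & (
        spectrum[inner] >= spectrum[inner + 1]
    )
    peak_idx = inner[is_peak]
    peak_idx = peak_idx[spectrum[peak_idx] >= min_rel_height * spectrum.max()]
    return angles_deg[peak_idx]

# Search range -90..90 degrees with a 0.5-degree search step (assumed values).
grid = np.arange(-90.0, 90.5, 0.5)
# Toy spectrum with two bumps, standing in for a real spatial spectrum.
toy = np.exp(-(grid + 20.0) ** 2 / 8.0) + 0.8 * np.exp(-(grid - 20.0) ** 2 / 8.0)
peaks = find_spectrum_peaks(grid, toy)
```

A finer search step gives finer angular quantization at the cost of more spectrum evaluations.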

In an embodiment of the present invention, in step 2), the S-transform is performed on the array received signal to obtain the S-transform result as follows:

first, the received signal of the m-th array element in the array is defined as:

x_m(t) = Σ_{p=1}^{P} s_p(t-τ_mp) + n_m(t)

where P is the number of signal sources, s_p(t) is the incident signal of the p-th of the P broadband signal sources, τ_mp is the time delay with which the p-th signal arrives at the m-th array element, and n_m(t) is the additive white Gaussian noise of the m-th array element;

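the S-transform is applied to the signal x_m(t), and the transform result ST_m can be expressed as:

ST_m(τ,f_i) = ∫_{-∞}^{+∞} x_m(t)·(|f_i|/√(2π))·exp(-(τ-t)²f_i²/2)·exp(-j2πf_i t) dt

where τ is the time factor and f_i (i = 1,2,…,I) is the frequency component. Written in terms of the Fourier spectra S_p(f) of s_p(t) and N_m(f) of n_m(t), the above equation becomes:

ST_m(τ,f_i) = Σ_{p=1}^{P} ∫_{-∞}^{+∞} S_p(α+f_i)·exp(-j2π(α+f_i)τ_mp)·exp(-2π²α²/f_i²)·exp(j2πατ) dα + ∫_{-∞}^{+∞} N_m(α+f_i)·exp(-2π²α²/f_i²)·exp(j2πατ) dα

where the first term is the signal component and the second term is the noise component.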
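A minimal discrete sketch of the S-transform of step 2), assuming a uniformly sampled signal and using the frequency-domain (FFT) form of the transform. The function name `s_transform` and the choice to index frequencies by FFT bin (bin k corresponds to k·fs/N for sampling rate fs and length N) are illustrative assumptions; the source does not specify an implementation.

```python
import numpy as np

def s_transform(x, freq_bins):
    """Discrete S-transform of a 1-D signal at the given FFT bin indices.

    Returns a complex array of shape (len(freq_bins), len(x)): one row per
    frequency f_i, one column per time shift tau.
    """
    x = np.asarray(x)
    n = x.size
    spectrum = np.fft.fft(x)
    # Signed frequency offsets alpha in bins, in FFT order: 0, 1, ..., -2, -1.
    alpha = np.fft.fftfreq(n, d=1.0 / n)
    out = np.empty((len(freq_bins), n), dtype=complex)
    for row, k in enumerate(freq_bins):
        if k == 0:
            # The zero-frequency voice is defined as the signal mean.
            out[row] = x.mean()
            continue
        # Gaussian window in the frequency domain; its width scales with k.
        window = np.exp(-2.0 * np.pi**2 * alpha**2 / k**2)
        shifted = np.roll(spectrum, -k)  # element a holds X[a + k]
        out[row] = np.fft.ifft(shifted * window)
    return out
```

For one microphone channel one might restrict `freq_bins` to the effective band of the signal, so that only the rows needed for the later covariance computation are evaluated.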

In an embodiment of the present invention, in step 3), the time-frequency array data model under the S-transform is constructed and the time-frequency domain steering vector is obtained as follows:

assuming that the center frequency and the bandwidth of the S-transform frequency window satisfy the narrowband signal condition, the time-frequency array signal model of the signals received by the M array elements is expressed as:

ST(τ,f_i) = H(θ,f_i)·X_fi(τ) + N_fi(τ)

where X_fi(τ) is the vector of S-transform results of the signals s_p(t), p = 1,2,…,P;

N_fi(τ) is the noise vector;

H(θ,f_i) = [h(θ_1,f_i),h(θ_2,f_i),…,h(θ_P,f_i)] is the array direction matrix of dimension M×P;

the steering vector of the array model data is expressed as:

h(θ_p,f_i) = [1,exp(-j2πf_iτ_1p),…,exp(-j2πf_iτ_(M-1)p)]^T.
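The steering vector and direction matrix above can be sketched in Python. The sketch assumes a far-field source and a uniform linear array with element spacing `spacing_m`, so that the delay at element m is m·d·sin(θ)/c with θ measured from broadside; this delay model, the function names and the default sound speed of 343 m/s are assumptions made for illustration, since the source only states that the array is uniform and linear.

```python
import numpy as np

def steering_vector(theta_deg, freq_hz, num_elements, spacing_m, speed=343.0):
    """h(theta, f) = [1, exp(-j*2*pi*f*tau_1), ..., exp(-j*2*pi*f*tau_{M-1})]."""
    m = np.arange(num_elements)
    # Assumed far-field delay of element m relative to element 0.
    delays = m * spacing_m * np.sin(np.deg2rad(theta_deg)) / speed
    return np.exp(-2j * np.pi * freq_hz * delays)

def direction_matrix(thetas_deg, freq_hz, num_elements, spacing_m, speed=343.0):
    """M x P matrix H(theta, f) whose columns are the steering vectors."""
    columns = [
        steering_vector(theta, freq_hz, num_elements, spacing_m, speed)
        for theta in thetas_deg
    ]
    return np.column_stack(columns)
```

The first entry of every steering vector is 1 because element 0 serves as the phase reference.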

In an embodiment of the present invention, in step 4), the correlation operation is performed on the S-transform result to obtain the autocorrelation spectrum covariance matrix of the array signal as follows:

following the principle of the classical multiple signal classification (MUSIC) algorithm, the covariance matrix of ST(τ,f_i) is:

R_Y = E{ST(τ,f_i)·ST^H(τ,f_i)} = H(θ,f_i)R_X(τ,f_i)H^H(θ,f_i) + σ_i²I

where R_X(τ,f_i) is the covariance matrix of X_fi(τ), σ_i² is the variance of the transformed noise, and σ_i is a matrix singular value;

since the received data has finite length in practice, the maximum likelihood estimate of the data covariance matrix is expressed as:
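A minimal sketch of the usual finite-sample estimate, assuming it averages the outer products ST(τ_n,f_i)·ST^H(τ_n,f_i) over N time instants τ_1,…,τ_N at a fixed frequency f_i. The symbols N and τ_n and the function name `sample_covariance` are notation introduced here for illustration and are not taken from the source.

```python
import numpy as np

def sample_covariance(st_snapshots):
    """Average outer product of the array snapshots at one frequency.

    st_snapshots: complex array of shape (M, N); column n holds the
    S-transform values ST(tau_n, f_i) of the M array elements.
    """
    st = np.asarray(st_snapshots, dtype=complex)
    num_snapshots = st.shape[1]
    return st @ st.conj().T / num_snapshots
```

The result is an M×M Hermitian matrix, matching the dimension of R_Y.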

In an embodiment of the present invention, in step 5), the target direction-of-arrival estimation under a noise condition is constructed from the time-frequency domain steering vector and the autocorrelation spectrum covariance matrix of the array signal as follows:

for the n-th (n = 1,2,…,P) signal source, a vector b_n(f) is defined that satisfies:

ignoring the noise term, there is:

wherein d_n(f) = R_Xn(f)h^H(θ_n,f)b_n(f);

under noisy conditions, the target direction-of-arrival estimation, i.e. the minimization function, is expressed as:

wherein d(f) satisfies:

In an embodiment of the present invention, in step 6), the minimization function of the target direction-of-arrival estimation is solved, and the spatial spectrum estimation result is obtained as follows:

applying the S-transform gives:

substituting the data covariance result based on the S-transform for the time-continuous processing result, the spectral estimation formula is:

Compared with the prior art, the invention has the following beneficial effects:

The invention first performs the S-transform on the voice signals to obtain a multi-resolution time-frequency spectrum matrix, and thereby establishes a time-frequency domain array signal model whose form is consistent with the traditional array model. This overcomes the inability of the short-time Fourier transform to adapt the analysis window to frequency, introduces multi-resolution analysis, and retains the adaptive time-frequency window of the wavelet transform while being free of the wavelet admissibility condition. Then, exploiting the fact that the power spectrum matrices of a frequency band at different moments share a joint diagonalization structure, the model is applied to the traditional method to obtain a new spatial spectrum. Because this spatial spectrum is based on the S-transform result, it removes the need of traditional MUSIC to preset the number of signal sources.

The invention thus addresses the problems of existing DOA estimation methods, such as high computational complexity, heavy computation and the need to estimate the number of signal sources in advance. The method can improve DOA estimation resolution, has better estimation precision, and is more practical for sound source positioning with real voice signals.

Drawings

Fig. 1 is a programming flow chart of the positioning method of the present invention.

Fig. 2 shows the positioning result of one-dimensional wideband signal DOA estimation in the embodiment of the present invention.

Fig. 3 shows the positioning result of the DOA estimation of the two-dimensional speech signal in the example of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and in detail below with reference to the accompanying drawings. As shown in fig. 1, the present invention provides a DOA estimation method for an unknown number of sound sources based on the S-transform, comprising the following steps:

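1) in a sound field, arranging a uniform linear microphone array, and synchronously acquiring a segment of voice signal with a data acquisition device to obtain the array received signal;

2) setting the S-transform parameters according to the effective frequency band of the received signal, and then performing the S-transform on the signal to obtain the S-transform result; the specific steps are as follows:

the received signal of the m-th array element is defined as:

x_m(t) = Σ_{p=1}^{P} s_p(t-τ_mp) + n_m(t)

where P is the number of signal sources, s_p(t) is the incident signal of the p-th of the P broadband signal sources, τ_mp is the time delay with which the p-th signal arrives at the m-th array element, and n_m(t) is the additive white Gaussian noise of the m-th array element;

the S-transform result ST_m(τ,f_i) of x_m(t), written as a Fourier spectrum, separates into a signal component and a noise component;

3) constructing the time-frequency array data model under the S-transform to obtain the time-frequency domain steering vector:

ST(τ,f_i) = H(θ,f_i)·X_fi(τ) + N_fi(τ)

where X_fi(τ) is the vector of S-transform results of the signals s_p(t), p = 1,2,…,P;

N_fi(τ) is the noise vector;

the steering vector of the array model data is expressed as:

h(θ_p,f_i) = [1,exp(-j2πf_iτ_1p),…,exp(-j2πf_iτ_(M-1)p)]^T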

4) performing the correlation operation on the S-transform result to obtain the autocorrelation spectrum covariance matrix of the array signal:

R_Y = E{ST(τ,f_i)·ST^H(τ,f_i)} = H(θ,f_i)R_X(τ,f_i)H^H(θ,f_i) + σ_i²I

where R_X(τ,f_i) is the covariance matrix of X_fi(τ), σ_i² is the variance of the transformed noise, and σ_i is a matrix singular value;

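5) constructing the target direction-of-arrival estimation under a noise condition;

ignoring the noise term, there is:

wherein d_n(f) = R_Xn(f)h^H(θ_n,f)b_n(f);

wherein d(f) satisfies:

6) solving the minimization function of the target direction-of-arrival estimation to obtain the spatial spectrum estimation result;

applying the S-transform gives:

7) setting a search range and a search step, and locating all spectral peaks through a spectrum search, the peak positions giving the DOA estimates.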

In this example, the following experimental conditions are constructed: a 16-element uniform linear array is used, and two incoherent linear frequency modulation (chirp) signals are generated with a signal-to-noise ratio of 0 dB and a frequency range of 165 Hz to 300 Hz. The incidence angles of the signals are set to -20 degrees and 20 degrees respectively, the sampling frequency is 4 kHz, and the total signal length is 1024 data points.
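The experimental setup can be approximated with the following Python sketch. Several details are assumptions, because the source does not state them: the element spacing (half a wavelength at 300 Hz), the sound speed (343 m/s), the use of one up-sweep and one down-sweep to keep the two chirps incoherent, and the convention that the 0 dB signal-to-noise ratio is measured against the power of a single unit-amplitude source. The function names are hypothetical.

```python
import numpy as np

def chirp(t, f_start, f_end, duration):
    """Unit-amplitude linear chirp sweeping f_start -> f_end over `duration`."""
    rate = (f_end - f_start) / duration
    return np.cos(2.0 * np.pi * (f_start * t + 0.5 * rate * t**2))

def simulate_array(angles_deg=(-20.0, 20.0), num_elements=16, fs=4000.0,
                   num_samples=1024, band=(165.0, 300.0), snr_db=0.0,
                   speed=343.0, seed=0):
    """Return (t, x) with x of shape (num_elements, num_samples)."""
    rng = np.random.default_rng(seed)
    t = np.arange(num_samples) / fs
    duration = num_samples / fs
    spacing = speed / (2.0 * band[1])        # assumed: half wavelength at 300 Hz
    m = np.arange(num_elements)[:, None]
    # Assumed: opposite sweep directions make the two sources incoherent.
    sweeps = [(band[0], band[1]), (band[1], band[0])]
    x = np.zeros((num_elements, num_samples))
    for theta, (f0, f1) in zip(angles_deg, sweeps):
        tau = m * spacing * np.sin(np.deg2rad(theta)) / speed  # far-field delays
        x += chirp(t[None, :] - tau, f0, f1, duration)
    source_power = 0.5                        # mean power of a unit cosine
    noise_std = np.sqrt(source_power / 10.0 ** (snr_db / 10.0))
    x += noise_std * rng.standard_normal(x.shape)
    return t, x

t, x = simulate_array()
```

Each row of `x` is one microphone channel and could be passed to an S-transform routine before the covariance and spectrum steps.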

As shown in fig. 2, the spatial spectrum has two peaks, from which accurate DOA estimates are obtained.

Under the same conditions, real voice signals are used to verify two-dimensional sound source positioning, without considering reverberation; the azimuth angles and pitch angles of the two voice sources are set to 40 degrees, 40 degrees and -20 degrees respectively.

As can be seen from fig. 3, the method also obtains accurate results in two-dimensional positioning.

The above examples are to be construed as merely illustrative and not limitative of the remainder of the disclosure. After reading the description of the invention, the skilled person can make various changes or modifications to the invention, and these equivalent changes and modifications also fall into the scope of the invention defined by the claims.

Claims

1. A DOA (direction of arrival) estimation method for an unknown number of sound sources based on the S-transform, characterized by comprising the following steps:

2. The DOA estimation method for an unknown number of sound sources based on the S-transform as claimed in claim 1, wherein in step 2) the S-transform is performed on the array received signal to obtain the S-transform result, the S-transform result, written as a Fourier spectrum, consisting of a signal component and a noise component.

3. The DOA estimation method for an unknown number of sound sources based on the S-transform as claimed in claim 2, wherein in step 3) the time-frequency array data model under the S-transform is constructed and the time-frequency domain steering vector is obtained as follows:

ST(τ,f_i) = H(θ,f_i)·X_fi(τ) + N_fi(τ)

where X_fi(τ) is the vector of S-transform results of the signals s_p(t), p = 1,2,…,P;

N_fi(τ) is the noise vector;

the steering vector of the array model data is expressed as:

h(θ_p,f_i) = [1,exp(-j2πf_iτ_1p),…,exp(-j2πf_iτ_(M-1)p)]^T.

4. The DOA estimation method for an unknown number of sound sources based on the S-transform as claimed in claim 3, wherein in step 4) the correlation operation is performed on the S-transform result to obtain the autocorrelation spectrum covariance matrix of the array signal as follows:

R_X(τ,f_i) is the covariance matrix of X_fi(τ), σ_i² is the variance of the transformed noise, and σ_i is a matrix singular value;

5. The DOA estimation method for an unknown number of sound sources based on the S-transform as claimed in claim 4, wherein in step 5) the target direction-of-arrival estimation under a noise condition is constructed from the time-frequency domain steering vector and the autocorrelation spectrum covariance matrix of the array signal as follows:

ignoring the noise term, there is:

wherein d_n(f) = R_Xn(f)h^H(θ_n,f)b_n(f);

wherein d(f) satisfies:

6. The DOA estimation method for an unknown number of sound sources based on the S-transform as claimed in claim 5, wherein in step 6) the minimization function of the target direction-of-arrival estimation is solved, and the spatial spectrum estimation result is obtained as follows:

applying the S-transform gives: