CN111352075B - Underwater multi-sound-source positioning method and system based on deep learning - Google Patents
- Publication number: CN111352075B (application CN201811564007.0A)
- Authority: CN (China)
- Prior art keywords: sound, sound source, signal, source, hydrophone
- Prior art date: 2018-12-20
- Legal status: Active
Classifications
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S5/00—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
- G01S5/18—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
- G01S5/22—Position of source determined by co-ordinating a plurality of position lines defined by path-difference measurements
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)
Abstract
The invention discloses a deep-learning-based underwater multi-sound-source localization method and system. The method comprises: receiving a signal to be detected through a hydrophone array and estimating the directions of the sound sources; forming sub-array beams in each direction where a sound source may exist; then computing a spatial correlation matrix of the signal to be detected to form a feature vector; inputting the feature vector into a pre-trained time-delay neural network; and outputting the range of the sound source. The method does not depend on prior knowledge of environmental parameters, and the sub-array beamforming distinguishes multiple sound sources at the feature level, so that multiple underwater targets can be localized simultaneously.
Description
Technical Field
The invention relates to the field of underwater positioning, and in particular to a deep-learning-based underwater multi-sound-source localization method and system.
Background
Sound-source localization comprises single-source and multi-source localization. Localization technology indicates the spatial position of a sound-source target and thus provides important spatial information for subsequent information acquisition and processing.
Traditional methods mainly rely on modern digital-signal-processing techniques to estimate the position of an underwater sound source, giving the source position through grid-matching search or analytical solution.
In recent years, a few methods have introduced neural networks into the underwater sound-source localization task. However, previous work has addressed only single-source localization. The multi-source task is considerably more complex because the sources interfere with one another, and multi-source localization in real environments remains an open problem.
Disclosure of Invention
The invention aims to overcome the above technical defects and provides a deep-learning-based underwater multi-sound-source localization method.
To achieve the above object, the deep-learning-based underwater multi-sound-source localization method includes:
receiving a signal to be detected through a hydrophone array and estimating the directions of the sound sources; forming sub-array beams in each direction where a sound source may exist; then computing a spatial correlation matrix of the signal to be detected to form a feature vector; inputting the feature vector into a pre-trained time-delay neural network; and outputting the range of the sound source.
As an improvement of the above method, the training step of the time-delay neural network includes:
step 1) forming sub-array beams at each frequency in the signal bandwidth to focus the sound-source signal;
step 2) computing, at each frequency in the signal bandwidth, a spatial correlation matrix of the signals focused by all sub-arrays on each sound-source direction, to form a feature vector;
step 3) taking the feature vector as input and the range of the known sound source as label, training the time-delay neural network under the minimum mean-square-error criterion to obtain the trained time-delay neural network.
As an improvement of the above method, the step 1) is specifically:
dividing the hydrophone array into B sub-arrays {Ω_1, …, Ω_B} and beamforming toward the known sound source on each sub-array; the focused signal of the b-th sub-array is then expressed as:
g_b(f_i) = Σ_{k∈Ω_b} Y_k(f_i)·e^{j2πf_i·τ_k},  with τ_k = l_k·e(β)/c,
wherein τ_k denotes the delay of the sound source at the k-th hydrophone relative to the first hydrophone; l_k and e(β) denote, respectively, the distance between the k-th hydrophone and the first hydrophone and the unit direction vector; Ω_b is the set of hydrophone indices contained in sub-array b; c is the sound speed; j is the imaginary unit; f_i is the frequency and i the frequency index; Y_k(f_i) is the Fourier transform of the digital signal converted from the sound-source signal received by the k-th hydrophone; and β is the azimuth of the known sound source. Performing sub-array beamforming on all B sub-arrays yields:
G(f_i) = [g_1(f_i), …, g_B(f_i)]^T.
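The focusing step above can be sketched in code. This is a hedged illustration, not the patent's implementation: it assumes a uniform line array, a plane-wave delay model τ_k = l_k·cos(β)/c, and contiguous equal-size sub-arrays; all numbers are made up for the demo.

```python
import numpy as np

def subarray_focus(Y, positions, beta, f, c=1500.0, B=4):
    """Phase-align the hydrophone spectra Y toward azimuth beta (radians)
    and return one focused value per sub-array: G(f) = [g_1, ..., g_B]^T.

    Y         : complex array (K,), Fourier coefficients at frequency f
    positions : float array (K,), element offsets from the first hydrophone (m)
    """
    tau = positions * np.cos(beta) / c          # delay of element k vs. element 1
    aligned = Y * np.exp(2j * np.pi * f * tau)  # compensate the delays ("focus")
    subarrays = np.array_split(aligned, B)      # contiguous sub-arrays (assumption)
    return np.array([s.sum() for s in subarrays])

# A plane wave arriving from direction beta is summed coherently, so each
# sub-array output has magnitude equal to its element count.
K, f, c = 16, 300.0, 1500.0
beta = np.deg2rad(40.0)
pos = np.arange(K) * 2.0
Y = np.exp(-2j * np.pi * f * pos * np.cos(beta) / c)
G = subarray_focus(Y, pos, beta, f, c=c, B=4)
```

With 16 elements split into 4 sub-arrays, each focused value is the coherent sum of 4 aligned channels.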
as an improvement of the above method, the step 2) is specifically:
computing the spatial correlation matrix R(f_i) of the sound source:
R(f_i) = G(f_i)·G^H(f_i);
the real and imaginary parts of each element of R(f_i), over every frequency in the signal bandwidth, are concatenated to form the feature vector.
As an improvement of the above method, the step of training the time-delay neural network by using the minimum mean square error criterion is as follows:
E = (1/L)·Σ_{l=1}^{L} (r_l − r′_l)²,
wherein r_l is the sound-source range value output by the time-delay neural network, r′_l is the known sound-source range value, and L is the number of samples; the cost function E is minimized by iterating with stochastic-gradient-descent back-propagation to obtain the weight matrix of the time-delay neural network.
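The minimum mean-square-error criterion with stochastic-gradient-descent updates can be demonstrated on a stand-in model. A single linear layer replaces the time-delay neural network here purely for brevity, and all data are synthetic; this is a sketch of the training criterion, not the patent's network.

```python
import numpy as np

rng = np.random.default_rng(1)
L, dim = 200, 8
X = rng.standard_normal((L, dim))   # feature vectors (synthetic)
w_true = rng.standard_normal(dim)
r_ref = X @ w_true                  # "known" source ranges r'_l (labels)

w = np.zeros(dim)                   # model weights to learn
lr = 0.05
for epoch in range(200):
    for l in rng.permutation(L):    # stochastic gradient descent over samples
        err = X[l] @ w - r_ref[l]   # residual r_l - r'_l
        w -= lr * err * X[l]        # gradient step on the squared error

E = np.mean((X @ w - r_ref) ** 2)   # cost function E after training
```

On noiseless linear data the cost drives to (numerically) zero, which is the point of the demonstration.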
As an improvement of the above method, the estimating the azimuth of the sound source specifically includes:
step S1) computing the spatial correlation matrix E[Y(f_i)Y^H(f_i)] of the signal to be detected Y(f_i):
E[Y(f_i)Y^H(f_i)] = U_S·Λ_S·U_S^H + U_N·Λ_N·U_N^H,
wherein E(·) denotes the expectation operation and (·)^H the conjugate transpose; Λ_S and U_S are the eigenvalue and eigenvector matrices of the signal subspace, and Λ_N and U_N are the eigenvalue and eigenvector matrices of the noise subspace;
step S2) the angles θ at which the P_MUSIC function attains its maxima give the direction estimates {α_1, …, α_D}:
P_MUSIC(θ) = Σ_{i=1}^{F} 1 / ( H^H(θ, f_i)·U_N·U_N^H·H(θ, f_i) ),
wherein H(θ, f_i) is the steering vector of the sound source, F is the number of frequency points, and D is the number of sound sources.
As an improvement of the above method, the sub-array beamforming in the direction where the sound source may exist is specifically:
dividing the hydrophone array into B sub-arrays {Ω_1, …, Ω_B} and beamforming toward each possible sound source on every sub-array; the focused signal of the d-th sound source at the b-th sub-array is then expressed as:
g_b^d(f_i) = Σ_{k∈Ω_b} Y_k(f_i)·e^{j2πf_i·τ_{k,d}},  with τ_{k,d} = l_k·e(α_d)/c,
wherein τ_{k,d} denotes the delay of the d-th sound source at the k-th hydrophone relative to the first hydrophone, 1 ≤ d ≤ D; l_k and e(α_d) denote, respectively, the distance between the k-th hydrophone and the first hydrophone and the unit direction vector; Ω_b is the set of hydrophone indices contained in sub-array b; c is the sound speed; j is the imaginary unit; f_i is the frequency and i the frequency index. Performing sub-array beamforming on all B sub-arrays for the d-th sound source yields:
G_d(f_i) = [g_1^d(f_i), …, g_B^d(f_i)]^T.
as an improvement of the above method, the calculating a spatial correlation matrix of the signal to be detected to form the eigenvector specifically includes:
the spatial correlation matrix S_d(f_i) of the d-th possible sound source of the signal to be detected is:
S_d(f_i) = G_d(f_i)·G_d^H(f_i);
the real and imaginary parts of each element of S_d(f_i) are concatenated to form the feature vector.
An underwater multi-sound source localization system based on deep learning, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the above method when executing the program.
The invention has the advantages that:
1. The underwater multi-sound-source localization method uses a deep neural network and does not depend on prior knowledge of environmental parameters; sub-array beamforming distinguishes multiple sound sources at the feature level, so that multiple underwater targets can be localized simultaneously.
2. The method needs only single-sound-source data in the training stage yet accomplishes localization in multi-source scenes, which greatly reduces the complexity of the model.
Drawings
Fig. 1 is a flow chart of an underwater multi-sound-source localization method based on deep learning according to the present invention.
Detailed Description
The invention will now be further described with reference to the accompanying drawings.
Referring to fig. 1, the invention provides an underwater multi-sound-source positioning method based on deep learning, which comprises the following steps:
step 1) converting the sound-source signals received by the hydrophone array into digital sound signals, wherein the hydrophone array comprises K hydrophones;
step 2) performing a Fourier transform on the digital sound signals;
step 3) forming sub-array beams at each frequency in the signal bandwidth to focus the sound-source signals in their different directions, specifically:
3-1) dividing the hydrophone array into B sub-arrays {Ω_1, …, Ω_B} and beamforming toward each sound source on every sub-array; assuming there are D sound sources, the focused signal of the d-th sound source at the b-th sub-array can be expressed as:
g_b^d(f_i) = Σ_{k∈Ω_b} Y_k(f_i)·e^{j2πf_i·τ_{k,d}},  with τ_{k,d} = l_k·e(α_d)/c,
wherein τ_{k,d} denotes the delay of the d-th sound source at the k-th hydrophone relative to the first hydrophone; l_k and e(α_d) denote, respectively, the distance between the k-th hydrophone and the first hydrophone and the unit direction vector; Ω_b is the set of hydrophone indices contained in sub-array b; c is the sound speed; j is the imaginary unit; f_i is the frequency and i the frequency index. Performing sub-array beamforming on all B sub-arrays for the d-th sound source gives:
G_d(f_i) = [g_1^d(f_i), …, g_B^d(f_i)]^T.
Step 4) solving a spatial correlation matrix for the signals focused by all the subarrays at each sound source position at each frequency in the signal broadband to obtain a characteristic vector, wherein the specific steps are as follows:
and (3) solving a covariance matrix of each sound source, wherein a spatial correlation matrix of the d sound source can be expressed as:
whereinWill be on the effective frequency bandThe real part and the imaginary part of the neural network are connected in series to be used as input characteristic vectors of the neural network;
step 5) in the training stage, learning the training samples with the time-delay neural network to obtain a model mapping the feature vector to the sound-source range; the training criterion is the minimum mean-square error:
E = (1/L)·Σ_{l=1}^{L} (r_l − r′_l)²,
wherein r_l denotes the estimated sound-source range, r′_l the reference sound-source range, and L the number of samples; the cost function E is minimized by the stochastic-gradient-descent back-propagation algorithm to obtain the weight matrix of the neural network.
step 6) in the testing stage, inputting a test sample, performing direction estimation, forming sub-array beams in each direction where a sound source may exist, then computing the spatial correlation matrix to obtain the feature vector of the test signal, and inputting the feature vector into the trained model to obtain the range estimate of each sound source, specifically:
step 6-1) performing direction estimation on the test sample to find the directions of possible signals; based on the MUSIC (multiple signal classification) method, first computing the spatial correlation matrix of the observed signal, expressed as:
E[Y(f_i)Y^H(f_i)] = U_S·Λ_S·U_S^H + U_N·Λ_N·U_N^H,
where E(·) denotes the expectation operation and (·)^H the conjugate transpose; Λ_S and U_S are the eigenvalue and eigenvector matrices of the signal subspace, and Λ_N and U_N those of the noise subspace. The signal directions are obtained by maximizing the function:
P_MUSIC(θ) = Σ_{i=1}^{F} 1 / ( H^H(θ, f_i)·U_N·U_N^H·H(θ, f_i) );
the final target directions are estimated as Θ = {θ_1, …, θ_D};
Step 6-2), extracting characteristics of all directions of possible sound sources in the theta according to the steps 3-1) and 4-1), and inputting a model to obtain distance information of a target;
The feature vectors extracted in steps 3) and 4) distinguish multiple sound sources at the feature level, so the neural network model can be trained on single-sound-source data to learn the correspondence between features and target range. During testing, the directions where sound sources may exist are first estimated by the direction-estimation module; the features of the different sound sources are then extracted and fed into the neural network model, which outputs a range estimate for each source, thereby achieving multi-sound-source localization.
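The test-stage flow just described (direction estimation → per-direction focusing → feature extraction → range regression) can be sketched end-to-end. The trained time-delay neural network is replaced by a placeholder callable, and the array model repeats the illustrative plane-wave assumptions used earlier; none of the names below come from the patent.

```python
import numpy as np

def focus(Y, pos, theta, f, c=1500.0, B=4):
    """Delay-compensate toward theta and sum each of B contiguous sub-arrays."""
    tau = pos * np.cos(theta) / c
    sub = np.array_split(Y * np.exp(2j * np.pi * f * tau), B)
    return np.array([s.sum() for s in sub])         # G_d(f)

def feature(Y, pos, theta, f):
    G = focus(Y, pos, theta, f)
    S = np.outer(G, G.conj())                       # S_d(f), spatial correlation
    return np.concatenate([S.real.ravel(), S.imag.ravel()])

def localize(Y, pos, f, directions, range_model):
    """One (direction, range) pair per direction from the MUSIC stage."""
    return [(th, range_model(feature(Y, pos, th, f))) for th in directions]

K, f = 16, 300.0
pos = np.arange(K) * 2.0
Y = np.ones(K, dtype=complex)                       # dummy observation
range_model = lambda v: float(np.linalg.norm(v))    # placeholder for the TDNN
estimates = localize(Y, pos, f,
                     [np.deg2rad(40.0), np.deg2rad(120.0)], range_model)
```

Because each detected direction is processed independently, a network trained only on single-source features can be queried once per source.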
Finally, it should be noted that the above embodiments merely illustrate the technical solution of the present invention and are not limiting. Although the invention has been described in detail with reference to the embodiments, those skilled in the art will understand that changes and equivalent substitutions may be made without departing from the spirit and scope of the invention as defined in the appended claims.
Claims (7)
1. An underwater multi-sound-source positioning method based on deep learning comprises the following steps:
receiving a signal to be detected through a hydrophone array and estimating the directions of the sound sources; forming sub-array beams in each direction where a sound source may exist; then computing a spatial correlation matrix of the signal to be detected to form a feature vector; inputting the feature vector into a pre-trained time-delay neural network; and outputting the range of the sound source;
wherein forming the sub-array beams in each direction where a sound source may exist specifically comprises:
dividing the hydrophone array into B sub-arrays {Ω_1, …, Ω_B} and beamforming toward each possible sound source on every sub-array; the focused signal of the d-th sound source at the b-th sub-array is then expressed as:
g_b^d(f_i) = Σ_{k∈Ω_b} Y_k(f_i)·e^{j2πf_i·τ_{k,d}},  with τ_{k,d} = l_k·e(α_d)/c,
wherein τ_{k,d} denotes the delay of the d-th sound source at the k-th hydrophone relative to the first hydrophone, 1 ≤ d ≤ D; l_k and e(α_d) denote, respectively, the distance between the k-th hydrophone and the first hydrophone and the unit direction vector; Ω_b is the set of hydrophone indices contained in sub-array b; c is the sound speed; j is the imaginary unit; f_i is the frequency and i the frequency index; performing sub-array beamforming on all B sub-arrays for the d-th sound source yields:
G_d(f_i) = [g_1^d(f_i), …, g_B^d(f_i)]^T;
and wherein calculating the spatial correlation matrix of the signal to be detected to form the feature vector specifically comprises:
the spatial correlation matrix S_d(f_i) of the d-th possible sound source of the signal to be detected is:
S_d(f_i) = G_d(f_i)·G_d^H(f_i);
the real and imaginary parts of each element of S_d(f_i) are concatenated to form the feature vector.
2. The deep learning-based underwater multi-sound-source positioning method according to claim 1, wherein the training step of the time-delay neural network comprises:
step 1) forming sub-array beams at each frequency in the signal bandwidth to focus the sound-source signal;
step 2) computing, at each frequency in the signal bandwidth, a spatial correlation matrix of the signals focused by all sub-arrays on each sound-source direction, to form a feature vector;
step 3) taking the feature vector as input and the range of the known sound source as label, training the time-delay neural network under the minimum mean-square-error criterion to obtain the trained time-delay neural network.
3. The deep learning-based underwater multi-sound-source positioning method according to claim 2, wherein the step 1) is specifically as follows:
dividing the hydrophone array into B sub-arrays {Ω_1, …, Ω_B} and beamforming toward the known sound source on each sub-array; the focused signal of the b-th sub-array is then expressed as:
g_b(f_i) = Σ_{k∈Ω_b} Y_k(f_i)·e^{j2πf_i·τ_k},  with τ_k = l_k·e(β)/c,
wherein τ_k denotes the delay of the sound source at the k-th hydrophone relative to the first hydrophone; l_k and e(β) denote, respectively, the distance between the k-th hydrophone and the first hydrophone and the unit direction vector; Ω_b is the set of hydrophone indices contained in sub-array b; c is the sound speed; j is the imaginary unit; f_i is the frequency and i the frequency index; Y_k(f_i) is the Fourier transform of the digital signal converted from the sound-source signal received by the k-th hydrophone; and β is the azimuth of the known sound source. Performing sub-array beamforming on all B sub-arrays yields:
G(f_i) = [g_1(f_i), …, g_B(f_i)]^T.
4. the deep learning-based underwater multi-sound-source positioning method according to claim 3, wherein the step 2) is specifically:
computing the spatial correlation matrix R(f_i) of the sound source:
R(f_i) = G(f_i)·G^H(f_i);
the real and imaginary parts of each element of R(f_i) are concatenated to form the feature vector.
5. The deep learning-based underwater multi-sound-source positioning method according to claim 4, wherein the step of training the time-delay neural network by using the minimum mean square error criterion is as follows:
E = (1/L)·Σ_{l=1}^{L} (r_l − r′_l)²,
wherein r_l is the sound-source range value output by the time-delay neural network, r′_l is the known sound-source range value, and L is the number of samples; the cost function E is minimized by iterating with stochastic-gradient-descent back-propagation to obtain the weight matrix of the time-delay neural network.
6. The deep learning-based underwater multi-sound-source localization method according to claim 1, wherein the estimating the azimuth of the sound source specifically comprises:
step S1) computing the spatial correlation matrix E[Y(f_i)Y^H(f_i)] of the signal to be detected Y(f_i):
E[Y(f_i)Y^H(f_i)] = U_S·Λ_S·U_S^H + U_N·Λ_N·U_N^H,
wherein E(·) denotes the expectation operation and (·)^H the conjugate transpose; Λ_S and U_S are the eigenvalue and eigenvector matrices of the signal subspace, and Λ_N and U_N are the eigenvalue and eigenvector matrices of the noise subspace;
step S2) the angles θ at which the P_MUSIC function attains its maxima give the direction estimates {α_1, …, α_D}:
P_MUSIC(θ) = Σ_{i=1}^{F} 1 / ( H^H(θ, f_i)·U_N·U_N^H·H(θ, f_i) ),
wherein H(θ, f_i) is the steering vector of the sound source, F is the number of frequency points, and D is the number of sound sources.
7. An underwater multi-sound source localization system based on deep learning, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, carries out the steps of the method according to one of claims 1 to 6.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201811564007.0A (CN111352075B) | 2018-12-20 | 2018-12-20 | Underwater multi-sound-source positioning method and system based on deep learning |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN111352075A | 2020-06-30 |
| CN111352075B | 2022-01-25 |
Family
ID=71195256
Families Citing this family (2)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113419216B | 2021-06-21 | 2023-10-31 | Nanjing University of Information Science and Technology | Multi-sound source positioning method suitable for reverberant environment |
| CN115047408B | 2022-06-13 | 2023-08-15 | Tianjin University | Underwater multi-sound-source positioning method based on single-layer large convolution kernel neural network |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5621858A (en) * | 1992-05-26 | 1997-04-15 | Ricoh Corporation | Neural network acoustic and visual speech recognition system training method and apparatus |
WO2006107230A1 (en) * | 2005-03-30 | 2006-10-12 | Intel Corporation | Multiple-input multiple-output multicarrier communication system with joint transmitter and receiver adaptive beamforming for enhanced signal-to-noise ratio |
CN105005026A (en) * | 2015-06-08 | 2015-10-28 | 中国船舶重工集团公司第七二六研究所 | Near-field target sound source three-dimensional passive positioning method |
CN105609113A (en) * | 2015-12-15 | 2016-05-25 | 中国科学院自动化研究所 | Bispectrum weighted spatial correlation matrix-based speech sound source localization method |
CN108828566A (en) * | 2018-06-08 | 2018-11-16 | 苏州桑泰海洋仪器研发有限责任公司 | Underwater pulse signal recognition methods based on towing line array |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6463904B2 (en) * | 2014-05-26 | 2019-02-06 | キヤノン株式会社 | Signal processing apparatus, sound source separation method, and program |
JP6567832B2 (en) * | 2015-01-29 | 2019-08-28 | 日本電産株式会社 | Radar system, radar signal processing apparatus, vehicle travel control apparatus and method, and computer program |
Non-Patent Citations (3)

| Title |
|---|
| Multiple Source Localization in a Shallow Water Waveguide Exploiting Subarray Beamforming and Deep Neural Networks; Zhaoqiong Huang et al.; Sensors; 2019-11-02; pp. 1-22 |
| Pattern recognition methods for underwater acoustic signal processing II: experimental study; Gong Xianyi et al.; Acoustics and Electronics Engineering; 1992-12-31; pp. 1-6 |
| Progress in the application of deep learning to passive underwater target recognition; Xu Ji et al.; Journal of Signal Processing; 2019-09-30; pp. 1460-1475 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |