CN113657416A - Deep sea sound source ranging method and system based on improved deep neural network - Google Patents


Info

Publication number
CN113657416A
CN113657416A (application CN202010396396.1A)
Authority
CN
China
Prior art keywords
layer
neural network
distance
deep neural
array
Prior art date
Legal status
Granted
Application number
CN202010396396.1A
Other languages
Chinese (zh)
Other versions
CN113657416B (en)
Inventor
王文博
肖旭
苏林
任群言
马力
Current Assignee
Institute of Acoustics CAS
Original Assignee
Institute of Acoustics CAS
Priority date
Filing date
Publication date
Application filed by Institute of Acoustics CAS
Priority to CN202010396396.1A
Publication of CN113657416A
Application granted
Publication of CN113657416B
Current legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01S: RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S 15/00: Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems
    • G01S 15/02: Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems using reflection of acoustic waves
    • G01S 15/06: Systems determining the position data of a target
    • G01S 15/08: Systems for measuring distance only
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A 90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A 90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A 90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A 90/30: Assessment of water resources

Abstract

The invention discloses a deep-sea sound source ranging method and system based on an improved deep neural network, used for measuring the distance between a vertical array and a sound source, wherein the vertical array comprises N array elements. The method comprises the following steps: performing an FFT on the complex sound pressure values measured in real time by each array element of the vertical array to obtain M discrete frequency values, thereby forming a sequence of data dimension 2×M×N, and applying frequency-domain normalization to the sequence; and inputting the normalized sequence into a pre-trained deep neural network, outputting an L×1 array, and obtaining the index of the output-layer neuron corresponding to the maximum value in the array, thereby estimating the sound source distance. The method achieves range estimation of an underwater sound source in a typical deep-sea environment and significantly improves ranging precision and stability.

Description

Deep sea sound source ranging method and system based on improved deep neural network
Technical Field
The invention relates to the field of underwater acoustics, and in particular to a deep-sea sound source ranging method and system based on an improved deep neural network.
Background
Passive underwater acoustic ranging is a primary function of sonar systems and a problem that underwater acousticians have worked on for many years. Because the ocean is a time-varying, space-varying and complex acoustic channel, traditional localization methods often face problems such as environmental mismatch and excessive computational load.
In recent years deep learning, as a new data-driven branch, has provided a fresh approach to passive underwater acoustic ranging through its strong feature-extraction ability and its unique advantages in handling complex, high-dimensional and nonlinear data. A deep neural network establishes a complex nonlinear mapping between high-dimensional underwater acoustic physical quantities from a large number of data samples, avoiding the difficulty of physical modeling, and can effectively identify the sound source position.
However, when current deep learning methods perform the ranging task, the range estimates are scattered over the estimation interval and their stability is poor.
Disclosure of Invention
The invention aims to overcome the above technical defects and provides a deep-sea sound source ranging method and system based on an improved deep neural network.
To achieve the above object, Embodiment 1 of the present invention provides a deep-sea sound source ranging method based on an improved deep neural network, for measuring the distance between a vertical array and a sound source, where the vertical array comprises N array elements. The method comprises the following steps:
performing an FFT on the complex sound pressure values measured in real time by each array element of the vertical array to obtain M discrete frequency values, thereby forming a sequence of data dimension 2×M×N, and applying frequency-domain normalization to the sequence; and
inputting the normalized sequence into a pre-trained deep neural network, which outputs an L×1 array, and obtaining the index of the output-layer neuron corresponding to the maximum value in the array, thereby estimating the sound source distance.
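A minimal Python/NumPy sketch of this preprocessing step may help to fix ideas. The helper name build_input, the sampling-rate argument fs and the FFT-bin selection are illustrative assumptions; the patent itself only fixes the 2×M×N real/imaginary layout and the frequency-domain normalization:

```python
import numpy as np

def build_input(x, fs, freqs):
    """Form the 2 x M x N network input from N element time series.

    x     : (N, T) array, one real-valued time series per array element
    fs    : sampling rate in Hz (assumed; not given in the patent)
    freqs : the M analysis frequencies in Hz
    """
    spectra = np.fft.rfft(x, axis=1)                      # (N, T//2+1) complex spectra
    f_axis = np.fft.rfftfreq(x.shape[1], d=1.0 / fs)
    bins = [int(np.argmin(np.abs(f_axis - f))) for f in freqs]
    p = spectra[:, bins].T                                # (M, N) complex pressures
    # Frequency-domain normalization: unit 2-norm over the N elements at
    # each frequency is one common choice for removing the source amplitude.
    p = p / np.maximum(np.linalg.norm(p, axis=1, keepdims=True), 1e-12)
    return np.stack([p.real, p.imag])                     # (2, M, N) real tensor
```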
As an improvement of the above method, the deep neural network comprises an input layer, a first convolutional layer, a second convolutional layer, a third convolutional layer, a fully connected layer and an output layer connected in sequence; a dropout layer with coefficient 0.3 is introduced between the third convolutional layer and the fully connected layer, and a dropout layer with coefficient 0.5 between the fully connected layer and the output layer;
the input layer takes a sequence of data dimension 2×M×N; the data collected by each array element are discrete complex sound pressure values at M frequencies, and the 2 refers to the real and imaginary parts of the complex sound pressure in the frequency domain after the Fourier transform;
the first convolutional layer consists of 128 convolution kernels of size 4×4, followed by a normalization layer and a max-pooling layer with stride 2×2;
the second convolutional layer consists of 128 convolution kernels of size 3×3, followed by a normalization layer and a max-pooling layer with stride 2×2;
the third convolutional layer consists of 256 convolution kernels of size 3×3 and is followed by a normalization layer;
the fully connected layer consists of four layers of neurons and forms a 2048-neuron fully connected network;
the output layer is a Gaussian regression layer with L neurons, where L is the number of discrete distance points; the receiving distance range is divided at a fixed interval Δr into L equally spaced distance points with corresponding distance values Δr, 2Δr, 3Δr, …, LΔr, and the vector produced by the output layer contains the probabilities of the L distance values.
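A PyTorch sketch of one plausible realization of this architecture follows. The patent leaves several details open, so the following are assumptions rather than its specification: batch normalization is used for the normalization layers, ReLU activations are assumed, the convolutions are padded so small M×N inputs survive two rounds of pooling, a softmax produces the probability vector, and the "four layers, 2048 neurons" fully connected part is read as four 512-unit layers. The class name is illustrative:

```python
import torch
from torch import nn

class GaussianRegressionCNN(nn.Module):
    """Sketch: three conv blocks, dropout 0.3, four FC layers,
    dropout 0.5, and an L-way Gaussian regression output layer."""

    def __init__(self, L: int):
        super().__init__()
        self.features = nn.Sequential(
            # conv block 1: 128 kernels of 4x4, normalization, 2x2 max pooling
            nn.Conv2d(2, 128, kernel_size=4, padding=2),
            nn.BatchNorm2d(128), nn.ReLU(), nn.MaxPool2d(2),
            # conv block 2: 128 kernels of 3x3, normalization, 2x2 max pooling
            nn.Conv2d(128, 128, kernel_size=3, padding=1),
            nn.BatchNorm2d(128), nn.ReLU(), nn.MaxPool2d(2),
            # conv block 3: 256 kernels of 3x3, normalization only
            nn.Conv2d(128, 256, kernel_size=3, padding=1),
            nn.BatchNorm2d(256), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(0.3),                 # dropout 0.3 between conv and FC parts
            nn.LazyLinear(512), nn.ReLU(),   # four FC layers of 512 units each is
            nn.Linear(512, 512), nn.ReLU(),  # one reading of "four layers, 2048 neurons"
            nn.Linear(512, 512), nn.ReLU(),
            nn.Linear(512, 512), nn.ReLU(),
            nn.Dropout(0.5),                 # dropout 0.5 before the output layer
            nn.Linear(512, L),
            nn.Softmax(dim=1),               # probabilities over the L distance points
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 2, M, N) real/imaginary complex-pressure tensor
        return self.head(self.features(x))
```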
As an improvement of the above method, the method further comprises training the deep neural network, which specifically comprises:
constructing a training set with Gaussian labels:
inputting the environmental parameters into the sound-field calculation program KRAKENC to compute a frequency-domain simulated sound pressure field data set, combining it with experimentally measured data to form an input data set, and then applying frequency normalization;
for the measured distance r_GPS derived from GPS, its Gaussian label L_Gauss is expressed as:

L_Gauss(i) = exp( -(i·Δr - r_GPS)² / (2σ²) ), i = 1, 2, …, L

where σ represents the deviation between the true distance and the GPS-measured distance, L is the number of discrete distance points, and the Gaussian label L_Gauss serves as the expected output of the deep neural network; and
training the deep neural network with the training set to obtain the trained deep neural network.
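A sketch of the Gaussian-label construction, following the formula above (whether the label is further normalized is not stated in the patent, so no normalization is applied here; the function name is illustrative):

```python
import numpy as np

def gaussian_label(r_gps: float, L: int, dr: float, sigma: float) -> np.ndarray:
    """Expected network output for a GPS-measured range r_gps.

    L     : number of discrete distance points
    dr    : distance grid spacing (delta r)
    sigma : deviation between the true and GPS-measured distance
    """
    grid = np.arange(1, L + 1) * dr               # distance values dr, 2*dr, ..., L*dr
    return np.exp(-(grid - r_gps) ** 2 / (2.0 * sigma ** 2))

# e.g. L = 600 points at dr = 0.02 km covering 12 km, sigma = 0.1 (values from the text)
label = gaussian_label(r_gps=5.0, L=600, dr=0.02, sigma=0.1)
```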
As an improvement of the above method, inputting the normalized sequence into the pre-trained deep neural network, outputting an L×1 array, and obtaining the index of the output-layer neuron corresponding to the maximum value in the array, thereby estimating the sound source distance, specifically comprises:
inputting the normalized sequence into the pre-trained deep neural network and outputting an L×1 array;
letting t_position be the index of the output-layer neuron corresponding to the maximum value in the array;
then the sound source distance r_estimate is:
r_estimate = t_position · Δr.
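The estimation step reduces to an argmax over the L×1 output. A sketch, assuming the output neurons are numbered 1 to L to match the distances Δr to LΔr, so the 0-based argmax is shifted by one:

```python
import numpy as np

def estimate_range(output: np.ndarray, dr: float) -> float:
    """Map the L x 1 network output to a range estimate r = t_position * dr."""
    t_position = int(np.argmax(output)) + 1   # 1-based neuron index (assumption)
    return t_position * dr
```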
Embodiment 2 of the present invention provides a deep-sea sound source ranging system based on an improved deep neural network. The system comprises a vertical array of N array elements, a trained deep neural network, a data preprocessing module and a sound source distance estimation module;
the data preprocessing module is configured to perform an FFT on the complex sound pressure values measured in real time by each array element of the vertical array to obtain M discrete frequency values, thereby forming a sequence of data dimension 2×M×N, and to apply frequency-domain normalization to the sequence; and
the sound source distance estimation module is configured to input the normalized sequence into the trained deep neural network, output an L×1 array, and obtain the index of the output-layer neuron corresponding to the maximum value in the array, thereby estimating the sound source distance.
As an improvement of the above system, the deep neural network comprises an input layer, a first convolutional layer, a second convolutional layer, a third convolutional layer, a fully connected layer and an output layer connected in sequence; a dropout layer with coefficient 0.3 is introduced between the third convolutional layer and the fully connected layer, and a dropout layer with coefficient 0.5 between the fully connected layer and the output layer;
the input layer takes a sequence of data dimension 2×M×N; the data collected by each array element are discrete complex sound pressure values at M frequencies, and the 2 refers to the real and imaginary parts of the complex sound pressure in the frequency domain after the Fourier transform;
the first convolutional layer consists of 128 convolution kernels of size 4×4, followed by a normalization layer and a max-pooling layer with stride 2×2;
the second convolutional layer consists of 128 convolution kernels of size 3×3, followed by a normalization layer and a max-pooling layer with stride 2×2;
the third convolutional layer consists of 256 convolution kernels of size 3×3 and is followed by a normalization layer;
the fully connected layer consists of four layers of neurons and forms a 2048-neuron fully connected network;
the output layer is a Gaussian regression layer with L neurons, where L is the number of discrete distance points; the receiving distance range is divided at a fixed interval Δr into L equally spaced distance points with corresponding distance values Δr, 2Δr, 3Δr, …, LΔr, and the vector produced by the output layer contains the probabilities of the L distance values.
As an improvement of the above system, the training step of the deep neural network specifically comprises:
constructing a training set with Gaussian labels:
inputting the environmental parameters into the sound-field calculation program KRAKENC to compute a frequency-domain simulated sound pressure field data set, combining it with experimentally measured data to form an input data set, and then applying frequency normalization;
for the measured distance r_GPS derived from GPS, its Gaussian label L_Gauss is expressed as:

L_Gauss(i) = exp( -(i·Δr - r_GPS)² / (2σ²) ), i = 1, 2, …, L

where σ represents the deviation between the true distance and the GPS-measured distance, L is the number of discrete distance points, and the Gaussian label L_Gauss serves as the expected output of the deep neural network; and
training the deep neural network with the training set to obtain the trained deep neural network.
As an improvement of the above system, the sound source distance estimation module is implemented by the following steps:
inputting the normalized sequence into the pre-trained deep neural network and outputting an L×1 array;
letting t_position be the index of the output-layer neuron corresponding to the maximum value in the array;
then the sound source distance r_estimate is:
r_estimate = t_position · Δr.
The invention has the following advantages:
1. The method achieves range estimation of an underwater sound source in a typical deep-sea environment and significantly improves ranging precision and stability.
2. The effectiveness of the method has been verified with deep-sea experimental measurements; compared with conventional deep-neural-network estimation methods, it attains the lowest relative error.
3. The estimates of the improved network fluctuate within a small range, comprehensively improving ranging performance and stability. Because the model is built from data, theoretical sound-field modeling of an unknown environment can be avoided, the influence of environmental mismatch is minimized to the greatest extent, and the generality of the model is improved. The trained model performs only lightweight computation at the prediction stage, which facilitates real-time data processing: a trained model completes one ranging task in only milliseconds.
Drawings
FIG. 1 is a flow chart of the deep-sea sound source ranging method based on an improved deep neural network according to the present invention;
FIG. 2 is a schematic diagram of the improved convolutional neural network (CNN) structure of the present invention;
FIG. 3(a) is a map of the experimental area and the sound velocity profile (SSP); track 1 denotes the first route, track 2 the second route, and the circle marks the location of the VLA;
FIG. 3(b) shows the SSPs measured in situ with a CTD; the dashed lines are the 3 individual CTD measurements and the solid line their average;
FIG. 4 is a schematic diagram of the simulated sound field;
FIG. 5 compares the ranging errors of the different CNNs: (a) MAPE of the ranging results for route 1; (c) accuracy within a 10% error bound for route 1; (e) accuracy within a 5% error bound for route 1; (b) MAPE for route 2; (d) accuracy within a 10% error bound for route 2; (f) accuracy within a 5% error bound for route 2.
Detailed Description
The technical solution of the present invention will be described in detail below with reference to the accompanying drawings.
Example 1
As shown in FIG. 1, Embodiment 1 of the present invention provides a deep-sea sound source ranging method based on an improved deep neural network, which comprises three main steps: preprocessing and labeling of the data, generation of Gaussian labels and training of the deep neural network, and preprocessing of the measured data with testing of the network output.
Step 1: preprocessing of the simulation and experimental data. The environmental parameters are input to the sound-field calculation program KRAKENC to compute a frequency-domain simulated sound pressure field data set, which is combined with the experimentally measured data into one data set, and the influence of the source amplitude is removed by frequency normalization. The input data (both the simulated pressure-field data and the actually received data) must be preprocessed to weaken the influence of the source amplitude spectrum. Each frequency point over all array elements is normalized using formula (1):

p_n(f_m) ← p_n(f_m) / sqrt( Σ_{k=1}^{N} |p_k(f_m)|² ), m = 1, …, M    (1)
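A sketch of formula (1) as reconstructed above. Since the original image of the formula is not reproduced in the text, the per-frequency 2-norm over the array elements is an assumption consistent with the stated goal of removing the source amplitude:

```python
import numpy as np

def normalize_per_frequency(p: np.ndarray) -> np.ndarray:
    """Normalize complex pressures so each frequency has unit norm across elements.

    p : (M, N) complex array, M frequency points x N array elements
    """
    norms = np.sqrt(np.sum(np.abs(p) ** 2, axis=1, keepdims=True))  # (M, 1)
    return p / np.maximum(norms, 1e-12)  # guard against silent frequency bins
```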
Step 2: the frequency-normalized data are fed to the deep convolutional neural network for training. Training uses the CNN structure shown in FIG. 2. The input sequence is formed from the measurements of the N array elements of the vertical array, and the data collected by each element are M discrete frequency values, so the input layer has data dimension 2×M×N, where 2 refers to the real and imaginary parts of the complex sound pressure in the frequency domain after the Fourier transform. Convolutional layer 1 consists of 128 convolution kernels of size 4×4, followed by a normalization layer and a max-pooling layer with stride 2×2; convolutional layer 2 consists of 128 kernels of size 3×3, followed by a normalization layer and a max-pooling layer with stride 2×2; convolutional layer 3 consists of 256 kernels of size 3×3 and is followed by a normalization layer. A dropout layer with coefficient 0.3 is introduced between the convolutional layers and the fully connected layer. The fully connected layer, composed of four layers of neurons, forms a 2048-neuron fully connected network. A dropout layer with coefficient 0.5 is introduced between the fully connected layer and the output layer. The output layer is a Gaussian regression layer: with L the number of discrete distance points, the receiving distance range is divided at a fixed interval Δr into L equally spaced points with corresponding distance values

r = [Δr, 2Δr, 3Δr, …, LΔr],

and the vector produced by the output layer contains the probabilities of the L distance values, the highest probability corresponding to the most likely distance.
For the measured distance r_GPS derived from GPS, its Gaussian label is expressed as

L_Gauss(i) = exp( -(i·Δr - r_GPS)² / (2σ²) ), i = 1, 2, …, L

where σ represents the deviation between the true distance and the GPS-measured distance and L is the number of discrete distance points. The Gaussian label L_Gauss obtained from this formula serves as the expected output of the CNN. The output layer of the CNN produces an L×1 vector, and the CNN model is trained with the training-set data and the Gaussian labels.
Step three: and (5) testing the network. Carrying out FFT processing on data obtained by actual measurement to obtain a frequency data set, and then carrying out frequency domain normalization processing to obtain a test data set; and then taking the data set after the frequency domain normalization processing as a test data set. After training is completed, the data of the test set is input into the trained CNN model, and a sequence with the value of Nx 1 is output. Then, the position t of the output layer neuron where the maximum value in the output character string is foundposition. Finally, the sound source distance estimated by the CNN model can be expressed as
restimate=tposition·Δr
In a deep-sea experiment in a sea area 2170 m deep, the equipment layout and the environment of the test area are shown in FIG. 3(a). In the experiment, a towed sound source simulated an underwater target; the transmitted signal contained several narrow bands (center frequencies of 63, 79, 105, 126, 160, 203, 260 and 315 Hz) and was transmitted at different signal-to-noise ratios. To test the ranging performance at low SNR, the SNR was reduced by 0, 5, 10 and 15 dB in turn every 10 seconds. The sonar receiving equipment was one equally spaced 8-element vertical line array (VLA) deployed 20 m above the seabed.
The experiment comprised two routes, as shown in FIG. 3(a). The seabed along route 1 is relatively flat and lies farthest from the VLA. The water depth along route 2 varies with latitude at a slope of about 9.65 degrees; the test vessel carrying the source traveled back and forth east-west relative to the VLA, with the farthest point about 8 km to the west of the VLA and about 12 km to the east.
FIG. 3(b) shows the SSPs calculated from 3 CTD (conductivity, temperature, depth) casts taken before, after and during the vessel's run; the solid line is the average of the 3 CTD measurements. The vessel's speed was about 4 knots, and the towed source depth varied between 50 and 100 m with changes in the vessel's state (turning, etc.).
The simulation was performed with the KRAKENC sound-field program; the simulation environment is shown in FIG. 4. The SSPs used in the simulation were the three measured SSPs together with their average, chosen to obtain a smooth sound-field structure. The water depth at the VLA was about 2170 m; the seabed was modeled as a single-layer bottom with density 1.5 g/cm³ and attenuation coefficient 0.2 dB per wavelength. Because the bottom sound speed strongly influences the sound field, it was varied from 1500 to 1800 m/s in steps of 50 m/s. The source depths were set to 20, 40, 60, 80 and 100 m in turn. The depth of the first VLA element was set to 1885, 1905 and 1925 m to account for uncertainty in the VLA depth. The horizontal range interval was 0.02 km with a maximum range of 12 km. With these parameters, 252,000 replica sound-field samples were generated (4 SSPs × 7 bottom sound speeds × 5 source depths × 601 ranges × 3 VLA depths).
Using the CNN structure shown in FIG. 2, CNNs with 3 different types of output layer were trained on the replica field data: the first is the conventional classification CNN, whose output layer divides the range into separate classes every 0.02 km; the second is the conventional regression CNN, whose output layer produces a continuous range value; the third is the Gaussian regression CNN proposed by the invention, whose output layer adopts a one-dimensional Gaussian distribution with standard deviation 0.1 and distance interval 0.02 km (the flow of the method is shown in FIG. 2). The networks were optimized with the Adam adaptive learning-rate algorithm and trained for 100 epochs in total, with an initial learning rate of 1e-4 and a regularization factor of 5e-4; the fast Fourier transform (FFT) used a window length of 1 s and a step of 1/4 window. The signals were normalized and input to the three CNNs. Routes 1 and 2 in FIG. 3(a) contain 4800 and 14400 data samples per SNR, respectively.
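A minimal training-loop sketch with the hyperparameters stated here (Adam, 100 epochs, learning rate 1e-4, regularization factor 5e-4, mapped to Adam's weight decay). The loss function is not named in the patent; mean squared error against the Gaussian labels is one plausible choice, and the random tensors below merely stand in for the 252,000 replica samples:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

L, M, N = 600, 8, 8                      # distance points, frequencies, elements (illustrative)
model = GaussianRegressionCNN(L)         # the architecture sketched after the network description
model(torch.zeros(1, 2, M, N))           # dummy forward to materialize the lazy FC layer
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=5e-4)
loss_fn = nn.MSELoss()                   # assumption: the patent does not name the loss

# Stand-in data; real training uses the KRAKENC replica fields with Gaussian labels.
inputs = torch.randn(64, 2, M, N)
targets = torch.rand(64, L)
loader = DataLoader(TensorDataset(inputs, targets), batch_size=16, shuffle=True)

for epoch in range(100):                 # 100 epochs, as stated in the text
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
```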
The comparison of the experimental results in FIG. 5 supports the following conclusions. The Gaussian regression CNN attains the best mean relative error and accuracy: at an SNR of 0 dB, its localization accuracy within a 10% error bound reaches 99.56%, and its ranging accuracy within a 5% error bound reaches 90.14%. As the SNR decreases, the ranging accuracy of all three methods drops, but the Gaussian regression CNN degrades less than the other two. In addition, the classification CNN has the largest mean absolute percentage error (MAPE) yet higher accuracy than the regression CNN; this is because the classification formulation cannot effectively discriminate outputs at different ranges, whereas the regression CNN outputs a continuous result whose error shrinks as the estimate approaches the true range. The MAPE of the regression CNN is small, yet its ranging accuracy is the worst; this shows that the regression form fully exploits the fitting ability of the deep neural network and yields a very small mean relative error, but the corresponding localization accuracy is low, possibly because the actual environment mismatches the simulation model. In summary, setting the output layer of the CNN to a Gaussian-distribution regression form significantly improves ranging accuracy and stability.
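The metrics compared in FIG. 5 (MAPE and the percentage of estimates within a 10% or 5% relative-error bound) are straightforward to compute; a sketch with illustrative function names:

```python
import numpy as np

def mape(r_est: np.ndarray, r_true: np.ndarray) -> float:
    """Mean absolute percentage error of the range estimates."""
    return 100.0 * float(np.mean(np.abs(r_est - r_true) / r_true))

def accuracy_within(r_est: np.ndarray, r_true: np.ndarray, tol: float) -> float:
    """Percentage of estimates whose relative error is at most tol (e.g. 0.10 or 0.05)."""
    return 100.0 * float(np.mean(np.abs(r_est - r_true) / r_true <= tol))
```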
Example 2
Embodiment 2 of the present invention provides a deep-sea sound source ranging system based on an improved deep neural network. The system comprises a vertical array of N array elements, a trained deep neural network, a data preprocessing module and a sound source distance estimation module;
the data preprocessing module is configured to perform an FFT on the complex sound pressure values measured in real time by each array element of the vertical array to obtain M discrete frequency values, thereby forming a sequence of data dimension 2×M×N, and to apply frequency-domain normalization to the sequence; and
the sound source distance estimation module is configured to input the normalized sequence into the trained deep neural network, output an L×1 array, and obtain the index of the output-layer neuron corresponding to the maximum value in the array, thereby estimating the sound source distance.
Finally, it should be noted that the above embodiments are merely intended to illustrate, not to limit, the technical solution of the present invention. Although the invention has been described in detail with reference to the embodiments, those skilled in the art will understand that various changes and equivalent substitutions may be made without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (8)

1. A deep-sea sound source ranging method based on an improved deep neural network, for measuring the distance between a vertical array and a sound source, wherein the vertical array comprises N array elements, the method comprising the following steps:
performing an FFT on the complex sound pressure values measured in real time by each array element of the vertical array to obtain M discrete frequency values, thereby forming a sequence of data dimension 2×M×N, and applying frequency-domain normalization to the sequence; and
inputting the normalized sequence into a pre-trained deep neural network, outputting an L×1 array, and obtaining the index of the output-layer neuron corresponding to the maximum value in the array, thereby estimating the sound source distance.
2. The deep-sea sound source ranging method based on an improved deep neural network according to claim 1, wherein the deep neural network comprises an input layer, a first convolutional layer, a second convolutional layer, a third convolutional layer, a fully connected layer and an output layer connected in sequence; a dropout layer with coefficient 0.3 is introduced between the third convolutional layer and the fully connected layer; and a dropout layer with coefficient 0.5 is introduced between the fully connected layer and the output layer;
the input layer takes a sequence of data dimension 2×M×N; the data collected by each array element are discrete complex sound pressure values at M frequencies; the 2 refers to the real and imaginary parts of the complex sound pressure in the frequency domain after the Fourier transform;
the first convolutional layer consists of 128 convolution kernels of size 4×4, followed by a normalization layer and a max-pooling layer with stride 2×2;
the second convolutional layer consists of 128 convolution kernels of size 3×3, followed by a normalization layer and a max-pooling layer with stride 2×2;
the third convolutional layer consists of 256 convolution kernels of size 3×3 and is followed by a normalization layer;
the fully connected layer consists of four layers of neurons and forms a 2048-neuron fully connected network;
the output layer is a Gaussian regression layer with L neurons, where L is the number of discrete distance points; the receiving distance range is divided at a fixed interval Δr into L equally spaced distance points with corresponding distance values Δr, 2Δr, 3Δr, …, LΔr, and the vector produced by the output layer contains the probabilities of the L distance values.
3. The deep-sea sound source ranging method based on an improved deep neural network according to claim 2, further comprising training the deep neural network, which specifically comprises:
constructing a training set with Gaussian labels:
inputting the environmental parameters into the sound-field calculation program KRAKENC to compute a frequency-domain simulated sound pressure field data set, combining it with experimentally measured data to form an input data set, and then applying frequency normalization;
for the measured distance r_GPS derived from GPS, its Gaussian label L_Gauss is expressed as:

L_Gauss(i) = exp( -(i·Δr - r_GPS)² / (2σ²) ), i = 1, 2, …, L

where σ represents the deviation between the true distance and the GPS-measured distance, L is the number of discrete distance points, and the Gaussian label L_Gauss serves as the expected output of the deep neural network; and
training the deep neural network with the training set to obtain the trained deep neural network.
4. The deep-sea sound source ranging method based on an improved deep neural network according to claim 3, wherein inputting the normalized sequence into the pre-trained deep neural network, outputting an L×1 array, and obtaining the index of the output-layer neuron corresponding to the maximum value in the array, thereby estimating the sound source distance, specifically comprises:
inputting the normalized sequence into the pre-trained deep neural network and outputting an L×1 array;
letting t_position be the index of the output-layer neuron corresponding to the maximum value in the array;
then the sound source distance r_estimate is:
r_estimate = t_position · Δr.
5. a deep sea sound source ranging system based on an improved deep neural network, the system comprising; the system comprises a vertical array comprising N array elements, a trained deep neural network, a data preprocessing module and a sound source distance estimation module;
the data preprocessing module is used for carrying out FFT processing on complex sound pressure values measured by each array element of the vertical array in real time to obtain discrete M frequency values, so that a sequence with data dimension of 2 multiplied by M multiplied by N is formed, and frequency domain normalization processing is carried out on the sequence;
and the sound source distance estimation module is used for inputting the sequence subjected to the frequency domain normalization processing into the trained deep neural network, outputting an array L multiplied by 1, and acquiring the sequence number of the neuron of the output layer corresponding to the maximum value in the array, thereby estimating the sound source distance.
6. The deep-sea sound source ranging system based on an improved deep neural network according to claim 5, wherein the deep neural network comprises an input layer, a first convolutional layer, a second convolutional layer, a third convolutional layer, a fully connected layer and an output layer connected in sequence; a dropout layer with coefficient 0.3 is introduced between the third convolutional layer and the fully connected layer; and a dropout layer with coefficient 0.5 is introduced between the fully connected layer and the output layer;
the input layer takes a sequence of data dimension 2×M×N; the data collected by each array element are discrete complex sound pressure values at M frequencies; the 2 refers to the real and imaginary parts of the complex sound pressure in the frequency domain after the Fourier transform;
the first convolutional layer consists of 128 convolution kernels of size 4×4, followed by a normalization layer and a max-pooling layer with stride 2×2;
the second convolutional layer consists of 128 convolution kernels of size 3×3, followed by a normalization layer and a max-pooling layer with stride 2×2;
the third convolutional layer consists of 256 convolution kernels of size 3×3 and is followed by a normalization layer;
the fully connected layer consists of four layers of neurons and forms a 2048-neuron fully connected network;
the output layer is a Gaussian regression layer with L neurons, where L is the number of discrete distance points; the receiving distance range is divided at a fixed interval Δr into L equally spaced distance points with corresponding distance values Δr, 2Δr, 3Δr, …, LΔr, and the vector produced by the output layer contains the probabilities of the L distance values.
7. The deep-sea sound source ranging system based on an improved deep neural network according to claim 6, wherein the training step of the deep neural network specifically comprises:
constructing a training set with Gaussian labels:
inputting the environmental parameters into the sound-field calculation program KRAKENC to compute a frequency-domain simulated sound pressure field data set, combining it with experimentally measured data to form an input data set, and then applying frequency normalization;
for the measured distance r_GPS derived from GPS, its Gaussian label L_Gauss is expressed as:

L_Gauss(i) = exp( -(i·Δr - r_GPS)² / (2σ²) ), i = 1, 2, …, L

where σ represents the deviation between the true distance and the GPS-measured distance, L is the number of discrete distance points, and the Gaussian label L_Gauss serves as the expected output of the deep neural network; and
training the deep neural network with the training set to obtain the trained deep neural network.
8. The deep-sea sound source ranging system based on an improved deep neural network according to claim 7, wherein the sound source distance estimation module is implemented by the following steps:
inputting the normalized sequence into the pre-trained deep neural network and outputting an L×1 array;
letting t_position be the index of the output-layer neuron corresponding to the maximum value in the array;
then the sound source distance r_estimate is:
r_estimate = t_position · Δr.
CN202010396396.1A 2020-05-12 2020-05-12 Deep sea sound source ranging method and system based on improved deep neural network Active CN113657416B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010396396.1A CN113657416B (en) 2020-05-12 2020-05-12 Deep sea sound source ranging method and system based on improved deep neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010396396.1A CN113657416B (en) 2020-05-12 2020-05-12 Deep sea sound source ranging method and system based on improved deep neural network

Publications (2)

Publication Number Publication Date
CN113657416A 2021-11-16
CN113657416B CN113657416B (en) 2023-07-18

Family

ID=78488667

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010396396.1A Active CN113657416B (en) 2020-05-12 2020-05-12 Deep sea sound source ranging method and system based on improved deep neural network

Country Status (1)

Country Link
CN (1) CN113657416B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10042038B1 (en) * 2015-09-01 2018-08-07 Digimarc Corporation Mobile devices and methods employing acoustic vector sensors
CN108647556A (en) * 2018-03-02 2018-10-12 重庆邮电大学 Sound localization method based on frequency dividing and deep neural network
CN109932708A (en) * 2019-03-25 2019-06-25 西北工业大学 A method of the underwater surface class object based on interference fringe and deep learning
CN109993280A (en) * 2019-03-27 2019-07-09 东南大学 A kind of underwater sound source localization method based on deep learning
CN110824429A (en) * 2019-10-28 2020-02-21 西北工业大学 Broadband sound source passive positioning method using asynchronous vertical array in deep sea environment

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10042038B1 (en) * 2015-09-01 2018-08-07 Digimarc Corporation Mobile devices and methods employing acoustic vector sensors
CN108647556A (en) * 2018-03-02 2018-10-12 重庆邮电大学 Sound localization method based on frequency dividing and deep neural network
CN109932708A (en) * 2019-03-25 2019-06-25 西北工业大学 A method of the underwater surface class object based on interference fringe and deep learning
CN109993280A (en) * 2019-03-27 2019-07-09 东南大学 A kind of underwater sound source localization method based on deep learning
CN110824429A (en) * 2019-10-28 2020-02-21 西北工业大学 Broadband sound source passive positioning method using asynchronous vertical array in deep sea environment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
WENXU LIU et al.: "Source localization in the deep ocean using a convolutional neural network", JASA, pp. 314-319 *
WENG Jinbao et al.: "Passive ranging and depth estimation of shadow-zone sound sources in deep water using a single hydrophone", Acta Acustica (声学学报), vol. 43, no. 6, pp. 905-914 *

Also Published As

Publication number Publication date
CN113657416B (en) 2023-07-18

Similar Documents

Publication Publication Date Title
CN109993280B (en) Underwater sound source positioning method based on deep learning
CN111639747B (en) GNSS-R sea surface wind speed inversion method and system based on BP neural network
CN109932708B (en) Method for classifying targets on water surface and underwater based on interference fringes and deep learning
CN111639746B (en) GNSS-R sea surface wind speed inversion method and system based on CNN neural network
CN108960421B (en) Improved online forecasting method for speed of unmanned surface vehicle based on BP neural network
CN113486574A (en) Sound velocity profile completion method and device based on historical data and machine learning
CN112180369B (en) Depth learning-based sea surface wind speed inversion method for one-dimensional synthetic aperture radiometer
CN111950438B (en) Depth learning-based effective wave height inversion method for Tiangong No. two imaging altimeter
CN113126099B (en) Single-beam substrate classification method based on audio feature extraction
CN113253248B (en) Small sample vertical array target distance estimation method based on transfer learning
CN113901927B (en) Underwater object shape recognition method based on flow field pressure time course
CN113109794B (en) Deep sea sound source depth setting method based on deep neural network in strong noise environment
CN112100906B (en) Data-driven large-scale density modeling method, computing device and storage medium
CN111273349B (en) Transverse wave velocity extraction method and processing terminal for seabed shallow sediment layer
CN111951204A (en) Sea surface wind speed inversion method for Tiangong No. two detection data based on deep learning
CN115114949A (en) Intelligent ship target identification method and system based on underwater acoustic signals
CN113657416B (en) Deep sea sound source ranging method and system based on improved deep neural network
CN117368877A (en) Radar image clutter suppression and target detection method based on generation countermeasure learning
CN109632942B (en) Inversion method of pipeline defect size based on ensemble learning
CN109188527B (en) Method for rapidly establishing three-dimensional offshore bottom speed model in beach and shallow sea area
CN113109795B (en) Deep sea direct sound zone target depth estimation method based on deep neural network
CN113221651A (en) Seafloor sediment classification method using acoustic propagation data and unsupervised machine learning
CN115630326A (en) Method and device for monitoring health state of marine ecosystem by using hydrophone
CN113639805B (en) Flow measurement method based on channel section flow velocity field reconstruction
CN115359197A (en) Geological curved surface reconstruction method based on spatial autocorrelation neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant