CN113537113B - Underwater sound target identification method based on composite neural network

Underwater sound target identification method based on composite neural network

Info

Publication number
CN113537113B
CN113537113B (application CN202110844909.5A)
Authority
CN
China
Prior art keywords
layer
network
data
input
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110844909.5A
Other languages
Chinese (zh)
Other versions
CN113537113A (en)
Inventor
徐丽
钱婧捷
李柏宽
申林山
闫鑫
娄茹珍
贾我欢
李悦齐
张立国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Engineering University
Priority to CN202110844909.5A
Publication of CN113537113A
Application granted
Publication of CN113537113B
Legal status: Active
Anticipated expiration


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00Aspects of pattern recognition specially adapted for signal processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Signal Processing (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
  • Image Analysis (AREA)

Abstract

An underwater sound target identification method based on a composite neural network belongs to the technical field of underwater acoustic signal identification. The invention aims to solve the problem of low underwater acoustic target identification accuracy with existing methods. The invention designs a base-layer network structure based on a composite neural network: the time-sequence characteristics of the input audio sample data are first learned through an LSTM algorithm, and the state information updated by the algorithm is obtained as an intermediate vector; this state information is then propagated onward within the layer through a CNN network, whose convolution and pooling operations yield the spatial characteristics of the input audio sample data; finally, an underwater acoustic target recognition result is obtained through a softmax function in the last layer of the CNN network. The invention can be applied to underwater acoustic signal identification.

Description

Underwater sound target identification method based on composite neural network
Technical Field
The invention belongs to the technical field of underwater acoustic signal identification, and particularly relates to an underwater acoustic target identification method based on a composite neural network.
Background
In recent years, with the development of machine learning, deep learning and related technologies, underwater acoustic target recognition has achieved new progress and research results. The detection and identification of underwater acoustic targets play a key role in underwater operations and underwater target perception, and with the informatization and intellectualization of naval equipment, underwater acoustic target identification is a prerequisite for future underwater and surface operations; whether underwater acoustic targets can be identified and analyzed timely and accurately is therefore an important factor in seizing the initiative in naval warfare. Because the purity of the data acquired from marine audio is not high, models trained with some conventional algorithms do not predict the data accurately enough and cannot recognize the sample data accurately. Although the CNN algorithm can identify data information relatively well, its structure causes it to omit some time-related data information; the LSTM identifies the temporal characteristics of data information well but, unlike the CNN, does not handle data with spatial characteristics well.
In summary, no existing method achieves a good processing effect on both time-related data information and spatial-feature data, and the accuracy of underwater acoustic target identification with existing methods therefore remains low.
Disclosure of Invention
The invention aims to solve the problem of low underwater acoustic target identification accuracy with existing methods, and provides an underwater acoustic target identification method based on a composite neural network.
The technical scheme adopted by the invention for solving the technical problems is as follows:
an underwater acoustic target identification method based on a composite neural network specifically comprises the following steps:
step 1, segmenting an input sound signal through a window function to obtain a plurality of signals with the same length, and then respectively carrying out short-time Fourier transform on each signal to obtain a short-time Fourier transform result;
after the short-time Fourier transform result is converted into an energy spectrum, performing Mel filtering on the energy spectrum to obtain a Mel filtering result;
then, discrete cosine transform is carried out on the Mel filtering result to obtain the MFCC characteristics of the input sound signal;
step 2, inputting the MFCC characteristics obtained in the step 1 into an LSTM network to obtain an output result of the LSTM network;
step 3, inputting the output result of the LSTM network into the CNN network, and outputting the target identification result through the CNN network.
The beneficial effects of the invention are as follows: the invention designs a base-layer network structure based on a composite neural network: the time-sequence characteristics of the input audio sample data are first learned through an LSTM algorithm, and the state information updated by the algorithm is obtained as an intermediate vector; this state information is then propagated onward within the layer through a CNN network, whose convolution and pooling operations yield the spatial characteristics of the input audio sample data; finally, an underwater acoustic target recognition result is obtained through a softmax function in the last layer of the CNN network.
The target identification accuracy of the composite neural network algorithm reaches 73%, higher than that of the LSTM or CNN algorithm used alone, and the initial identification accuracy and convergence speed of the composite neural network are also better than those of the LSTM and CNN algorithms.
Drawings
FIG. 1 is a diagram of sound waveforms;
FIG. 2 is a feature extraction flow diagram;
FIG. 3 is a composite network identification flow diagram;
FIG. 4 is a comparison graph of the recognition accuracy of the three network models based on the deep learning method.
Detailed Description
First embodiment: this embodiment is described with reference to FIG. 3. The underwater acoustic target identification method based on the composite neural network specifically comprises the following steps:
step 1, segmenting an input sound signal through a window function to obtain a plurality of signal segments of the same length, and then respectively carrying out short-time Fourier transform on each segment to obtain a short-time Fourier transform result;
after converting the short-time Fourier transform result into an energy spectrum, performing Mel filtering on the energy spectrum to obtain a Mel filtering result;
then, discrete cosine transform is carried out on the Mel filtering result to obtain the MFCC (Mel-frequency cepstral coefficient) characteristics of the input sound signal;
step 2, inputting the MFCC characteristics obtained in the step 1 into an LSTM network to obtain an output result of the LSTM network;
the LSTM network learns the time sequence characteristics of the audio sample data to obtain an intermediate vector with the time sequence characteristics; the specific process comprises the following steps:
step 2.1: the input data interacts with a sigmoid function, which judges the degree to which the data is retained, so that only data meeting the requirement enters the network;
step 2.2: the retention coefficient, obtained by passing the hidden-layer input and the previous layer's output through a sigmoid, is multiplied with the data obtained by passing the hidden-layer input and the previous layer's output through tanh, yielding the state information of the current data;
step 2.3: the previous layer's data-information state is combined with the current input signal and passed through a sigmoid to obtain a weight; the current data-information state is passed through tanh to obtain a value, which is multiplied with that weight; the result is a 256-dimensional intermediate vector describing the original audio sample data.
Step 3, inputting the output result of the LSTM network into the CNN network, and outputting the target identification result through the CNN network.
The method analyzes the noise characteristics of the underwater target, applies the auditory-perception characteristics of the Mel-frequency cepstral coefficient (MFCC) to underwater acoustic target recognition, studies it in combination with machine learning, obtains the LOFAR spectrogram from the time-domain signal of the data, and thereby performs feature preprocessing on the data. Based on research into two deep-learning algorithms, the Convolutional Neural Network (CNN) and the Long Short-Term Memory network (LSTM), the invention proposes an algorithm based on a composite neural network: the LSTM first learns the time-sequence characteristics of the sound to obtain an intermediate vector; the CNN then learns the spatial characteristics of the samples on the basis of this intermediate vector; finally, a softmax function in the last layer of the CNN network outputs the target identification result, which is either a ship target or marine sound.
The second embodiment is as follows: the difference between this embodiment and the first embodiment is that, in step 1, a short-time Fourier transform is performed on each segment of the signal to obtain a short-time Fourier transform result; the specific process comprises the following steps:
$$\mathrm{STFT}(t,f)=\int_{-\infty}^{+\infty}s(\tau)\,h^{*}(\tau-t)\,e^{-j2\pi f\tau}\,d\tau$$
in the formula: STFT(t, f) is the short-time Fourier transform result, t is time, s(τ) is the input sound signal, h(·) represents the window function, * denotes the complex conjugate, f is frequency in Hz, e is the base of the natural logarithm, j is the imaginary unit, and τ is the integration variable.
For window functions of different lengths, the time-frequency resolution varies inversely with the coverage of the time window: a longer window improves frequency resolution at the cost of time resolution.
Other steps and parameters are the same as those in the first embodiment.
The third concrete implementation mode: the difference between this embodiment and the first or second embodiment is that, in step 1, the short-time fourier transform result is converted into an energy spectrum, and the specific process is as follows:
$$\mathrm{SPEC}(t,f)=\left|\mathrm{STFT}(t,f)\right|^{2}$$
where SPEC (t, f) is the energy spectrum.
Other steps and parameters are the same as those in the first or second embodiment.
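A short numerical sketch of the two formulas above, not part of the patent, may help: the test tone, Hann window, frame length and hop size below are illustrative assumptions.

```python
import numpy as np

def stft(s, win_len=256, hop=128):
    """Discrete STFT: slide the window h over s, then take a DFT per frame."""
    h = np.hanning(win_len)                        # window function h(.)
    n_frames = 1 + (len(s) - win_len) // hop
    frames = np.stack([s[i * hop : i * hop + win_len] * h
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)             # one spectrum per windowed frame

fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 440 * t)                 # a 440 Hz test tone as s(tau)
S = stft(tone)                                     # STFT(t, f)
spec = np.abs(S) ** 2                              # SPEC(t, f) = |STFT(t, f)|^2
print(S.shape, spec.shape)                         # (61, 129): frames x frequency bins
```

Lengthening win_len in this sketch narrows the frequency bins while coarsening the time frames, illustrating the inverse time-frequency trade-off noted above.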
The fourth concrete implementation mode is as follows: the difference between this embodiment and one of the first to third embodiments is that the structure of the CNN network specifically includes:
from the input layer, the CNN network sequentially includes the input layer, a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, a third pooling layer, a fourth convolutional layer, a fully-connected layer, and a softmax classification layer.
The invention designs a ten-layer network structure: the more layers the convolutional neural network has, the stronger its nonlinear fitting capability and the more complex the features it can recognize; the more convolution neurons within a layer, the richer the details of the extracted target. In this network structure, ReLU is used as the activation function, and the final model is obtained by superimposing the different characteristics of the data. The softmax function in the last layer expresses the classification result intuitively, making the result more convincing.
Other steps and parameters are the same as those in one of the first to third embodiments.
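To make the ten-layer sequence concrete, the following is a minimal PyTorch sketch. It is an illustration under stated assumptions rather than the patented implementation: the channel counts, 3x3 kernels, 2x2 pooling, the two output classes (ship target or marine sound) and the reshape of the 256-dimensional intermediate vector into a 16x16 map are all assumptions, since the patent's Table 1 is available only as an image.

```python
import torch
import torch.nn as nn

class CompositeCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),   # first convolutional layer
            nn.MaxPool2d(2),                             # first pooling layer
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),  # second convolutional layer
            nn.MaxPool2d(2),                             # second pooling layer
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),  # third convolutional layer
            nn.MaxPool2d(2),                             # third pooling layer
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),  # fourth convolutional layer
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 2 * 2, n_classes),            # fully-connected layer
            nn.Softmax(dim=1),                           # softmax classification layer
        )

    def forward(self, x):          # x: (batch, 1, 16, 16)
        return self.classifier(self.features(x))

v = torch.randn(8, 256).view(8, 1, 16, 16)  # 256-dim intermediate vector as a 2-D map
print(CompositeCNN()(v).shape)              # torch.Size([8, 2])
```

For training one would normally drop the Softmax module and feed the fully-connected logits to a cross-entropy loss; the softmax is kept here to mirror the layer sequence described above.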
The fifth concrete implementation mode is as follows: this embodiment differs from the first to fourth embodiments in that the activation function adopted by the CNN network is ReLU.
Other steps and parameters are the same as in one of the first to fourth embodiments.
Examples
The invention is further described below with reference to the accompanying drawings.
The invention provides an underwater acoustic target identification method based on a composite neural network. A base-layer network structure based on the composite neural network is designed: the time-sequence characteristics of the audio sample data are first learned through the LSTM algorithm, and the state information updated by the algorithm is obtained as an intermediate vector; this state information is then propagated onward within the layer through the CNN network, whose convolution, pooling and related operations yield the spatial characteristics of the audio sample data.
The invention specifically comprises the following steps:
step 1: performing MFCC feature extraction on original sample data, and inputting an obtained result into a designed LSTM network;
step 1.1: in sound recognition methods, Mel-frequency cepstral coefficients, the cepstral parameters extracted in the Mel-scale frequency domain, are a commonly used feature, and the relation between the Mel scale and frequency is expressed as:
$$\mathrm{Mel}(f)=2595\log_{10}\left(1+\frac{f}{700}\right)\tag{1}$$
in the formula: f is frequency in Hz;
step 1.2: the input data signal is divided by a window function to obtain a number of short signals of the same length, and the data signal is then analyzed by Fourier transform to obtain a spectrogram meeting the requirements;
within each small portion of the data information, the signal appears stationary, so the Fourier transform is expressed as:
$$S(f)=\int_{-\infty}^{+\infty}s(t)\,e^{-j2\pi ft}\,dt\tag{2}$$
the short-time fourier expression is:
$$\mathrm{STFT}(t,f)=\int_{-\infty}^{+\infty}s(\tau)\,h^{*}(\tau-t)\,e^{-j2\pi f\tau}\,d\tau\tag{3}$$
in the formula: h(t) represents the window function, and * denotes the complex conjugate;
transforming the time-frequency function into an energy spectrum:
$$\mathrm{SPEC}(t,f)=\left|\mathrm{STFT}(t,f)\right|^{2}\tag{4}$$
for window functions of different lengths, the time-frequency resolution varies inversely with the coverage of the time window: a longer window improves frequency resolution at the cost of time resolution;
step 1.3: the data is then passed onward through Mel filtering;
step 1.4: the MFCC features of the original sample data are obtained through further computations such as the discrete cosine transform; a minimal sketch of this pipeline is given below.
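The following is a hedged sketch of this step-1 pipeline (windowing and STFT, energy spectrum, Mel filtering, discrete cosine transform) using librosa; the bundled example clip, FFT size, hop length, number of Mel bands and number of coefficients are illustrative assumptions, not values specified by the invention.

```python
import numpy as np
import librosa

# Stand-in audio clip (downloaded on first use); a real application would
# load hydrophone recordings instead.
y, sr = librosa.load(librosa.ex('trumpet'), sr=None)

S = librosa.stft(y, n_fft=512, hop_length=256)                  # windowed STFT
spec = np.abs(S) ** 2                                           # energy spectrum |STFT|^2
mel = librosa.feature.melspectrogram(S=spec, sr=sr, n_mels=40)  # Mel filtering
mfcc = librosa.feature.mfcc(S=librosa.power_to_db(mel), n_mfcc=13)  # DCT -> MFCC
print(mfcc.shape)                                               # (13, number of frames)
```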
Step 2: the time sequence characteristics of the audio sample data are learned on the basis of basic characteristics by passing the input sample data information through an LSTM network, and an intermediate vector with the time sequence characteristics is obtained;
step 2.1: the input data interacts with a sigmoid function, which judges the degree to which the data is retained, so that only data meeting the requirement enters the network;
step 2.2: the retention coefficient, obtained by passing the hidden-layer input and the previous layer's output through a sigmoid, is multiplied with the data obtained by passing the hidden-layer input and the previous layer's output through tanh, yielding the state information of the current data;
step 2.3: the previous layer's data-information state is combined with the current input signal and passed through a sigmoid to obtain a weight; the current data-information state is passed through tanh to obtain a value, which is multiplied with that weight; the result is a 256-dimensional intermediate vector describing the original audio sample data (a minimal sketch of this step is given below).
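A minimal PyTorch sketch of this step follows; the sigmoid and tanh gate interactions of steps 2.1 to 2.3 are what torch.nn.LSTM implements internally. The MFCC dimension of 13 and the 100-frame sequence length are assumptions; the 256-unit hidden state matches the intermediate vector described above.

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=13, hidden_size=256, batch_first=True)

mfcc_seq = torch.randn(8, 100, 13)    # (batch, frames, MFCC coefficients)
outputs, (h_n, c_n) = lstm(mfcc_seq)  # the gates of steps 2.1-2.3 run inside each LSTM cell
intermediate = h_n[-1]                # final hidden state: the 256-dim intermediate vector
print(intermediate.shape)             # torch.Size([8, 256])
```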
And step 3: on the basis of the intermediate vector, the CNN network applies convolution, pooling and related operations, combining the intermediate vector of the audio data with the convolutional neural network to learn the spatial characteristics of the samples and obtain the final training model;
step 3.1: by virtue of its structure, the CNN captures spatial characteristics well; convolution operations on the input data yield the corresponding data features, and pooling operations retain the main features of the data;
step 3.2: the invention designs a ten-layer network structure comprising, from the input layer, four convolutional layers, three pooling layers and one fully-connected layer, with a softmax function as the last layer; the design of the CNN in the composite network structure is shown in Table 1:
table 1 design of CNN in composite network architecture
(Table 1 is reproduced as images in the original publication. The recoverable structure is the ten-layer sequence: input layer; first convolutional layer; first pooling layer; second convolutional layer; second pooling layer; third convolutional layer; third pooling layer; fourth convolutional layer; fully-connected layer; softmax classification layer. The per-layer kernel sizes and channel counts appear only in the image.)
Step 3.3: in this network structure, ReLU is used as the activation function, and the final model is obtained by superimposing the different characteristics of the data.
Experimental verification is carried out respectively on a traditional machine learning method (using MFCC features and an SVM classifier), a convolutional neural network-based method (using MFCC features and a traditional CNN network), a long short-term memory network-based method (using MFCC features and an RNN network) and the composite neural network-based method, and the experimental results are compared;
experiments based on the traditional machine learning method: MFCC features and an SVM classifier are used as the traditional machine learning method, the SVM adopting a one-versus-one classification mode;
the Mel-frequency cepstral coefficient is a commonly applied feature extraction technique for audio signal identification: filtering and related operations on the original audio data yield the corresponding feature output, the waveform data of the audio is converted into tensor data containing the time sequence, and the cepstral parameters extracted in the Mel-scale frequency domain combine analysis in both the time and frequency domains;
from the finite, countable sample information data, the SVM learns the model that can accurately recognize new information data and seeks the solution with the best generalization; in the one-versus-one classification mode, the accuracy of the model is obtained by counting all correctly predicted data; a minimal sketch of this baseline follows;
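The baseline can be sketched with scikit-learn; the synthetic feature vectors, three classes and RBF kernel below are placeholders for illustration. Note that sklearn's SVC trains its multiclass model one-versus-one internally, matching the classification mode described above.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

X = np.random.randn(200, 13)            # stand-in MFCC feature vectors
y = np.random.randint(0, 3, size=200)   # stand-in class labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = SVC(kernel='rbf', decision_function_shape='ovo').fit(X_tr, y_tr)
print('accuracy:', clf.score(X_te, y_te))  # fraction of correctly predicted data
```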
experiments based on the convolutional neural network method: in the traditional CNN experiment, the network structure has nine layers in total: the initial layer is the input layer, followed by three convolutional layers, three pooling layers and one fully-connected layer, and the last layer ends with the softmax function; in this network structure, ReLU is used as the activation function, and the softmax function expresses the classification result intuitively;
experiment based on the long short-term memory network (LSTM) method: after MFCC feature extraction is performed on the input original audio data, a gate structure is added, on the basis of an RNN, to the neurons between the layers of the network's hidden layers, giving the neurons new relations; a sigmoid operation on the current-stage hidden-layer input data and the previous-stage hidden-layer output data yields a parameter that discriminates what is retained; this parameter is multiplied with the value obtained by passing the same two items of data through a tanh function, giving the state information of the current data; weighting the current input signal and the previous-stage data-information state through the sigmoid function gives a corresponding result; combining the current data-information state with the tanh function gives another value, and the product of this value with the previous-stage weighted result gives the output data;
experiment based on the composite neural network method: MFCC feature extraction is first performed on the original audio data information; a 256-dimensional intermediate vector describing the original audio sample data is then obtained through the LSTM network structure, and this intermediate vector is spatially trained through the designed CNN network;
the network structure of the CNN algorithm has ten layers in total: starting from the input layer, there follow four convolutional layers, three pooling layers and one fully-connected layer, and finally a layer ending with the softmax function; in this network structure, ReLU is used as the activation function, and the softmax function expresses the classification result intuitively;
experiments on these different algorithms yield the accuracy of each method in identifying the underwater acoustic target, and expressing the accuracy of the models in the form of an image allows the results to be analyzed more clearly and intuitively.
Referring to FIG. 1, a sound waveform diagram is shown; the flowchart of MFCC feature extraction in the traditional machine-learning underwater acoustic target recognition method is shown in FIG. 2.
Referring to FIG. 3, the composite-network identification flow of the invention is as follows: MFCC feature extraction is first performed on the original audio data information; the time-sequence characteristics of the audio sample data are then learned through the LSTM algorithm within the LSTM network structure to obtain a 256-dimensional intermediate vector describing the original audio sample data; this intermediate vector is then spatially trained through the designed CNN network, whose convolution, pooling and related operations yield the spatial characteristics of the audio sample data; finally, the softmax function in the CNN algorithm outputs the classification result. The network structure of the CNN algorithm has ten layers in total: starting from the input layer, there follow four convolutional layers, three pooling layers and one fully-connected layer, and finally a layer ending with the softmax function; in this network structure, ReLU is used as the activation function, and the softmax function expresses the classification result intuitively. A hedged end-to-end sketch of this flow is given below.
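The following minimal sketch chains the step-2 LSTM into the step-3 CNN, reusing the CompositeCNN class from the earlier sketch; the MFCC dimension, sequence length, two output classes and the reshape of the 256-dimensional intermediate vector into a single-channel 16x16 map are illustrative assumptions, not details fixed by the patent.

```python
import torch
import torch.nn as nn

class CompositeNet(nn.Module):
    """LSTM temporal features -> CNN spatial features -> softmax, per FIG. 3."""
    def __init__(self, n_mfcc=13, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(n_mfcc, 256, batch_first=True)  # step 2: temporal features
        self.cnn = CompositeCNN(n_classes)          # step 3: ten-layer CNN sketched earlier

    def forward(self, mfcc_seq):                    # mfcc_seq: (batch, frames, n_mfcc)
        _, (h_n, _) = self.lstm(mfcc_seq)
        v = h_n[-1].view(-1, 1, 16, 16)             # 256-dim intermediate vector as a 2-D map
        return self.cnn(v)                          # class probabilities from the softmax layer

probs = CompositeNet()(torch.randn(4, 100, 13))
print(probs.shape)                                  # torch.Size([4, 2])
```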
The experimental performance of the CNN and LSTM algorithms and the composite neural network algorithm is compared using the identification accuracy (the accuracy after model training), the initial identification accuracy (the identification accuracy of the untrained model) and the convergence speed. The identification-accuracy curves of the three deep-learning network models are shown in FIG. 4; representing the accuracy of the models in the form of an image allows the results to be analyzed more clearly and intuitively:
The images show that the recognition accuracy of the CNN network model is about 63%, that of the LSTM network model about 67%, and that of the composite neural network model about 73%. In accuracy, the LSTM algorithm is superior to the CNN algorithm, and the composite neural network algorithm is higher than the LSTM algorithm; in convergence, the composite neural network algorithm converges faster than the LSTM algorithm, while the CNN algorithm converges slightly faster than the composite neural network. The composite neural network model exhibits a good recognition effect from the start, better than the LSTM and CNN models.
The above calculation examples of the invention merely describe its calculation model and calculation flow in detail and are not intended to limit its embodiments. Other variations and modifications can be made by those skilled in the art on the basis of the above description; it is not possible to exhaust all embodiments here, and all obvious variations and modifications falling within the scope of the invention are intended to be included within that scope.

Claims (4)

1. An underwater acoustic target identification method based on a composite neural network is characterized by comprising the following steps:
step 1, segmenting an input sound signal through a window function to obtain a plurality of signals with the same length, and then respectively carrying out short-time Fourier transform on each signal to obtain a short-time Fourier transform result;
after the short-time Fourier transform result is converted into an energy spectrum, performing Mel filtering on the energy spectrum to obtain a Mel filtering result;
then, discrete cosine transform is carried out on the Mel filtering result to obtain the MFCC characteristics of the input sound signal;
step 2, inputting the MFCC characteristics obtained in the step 1 into an LSTM network to obtain an output result of the LSTM network;
the input sample data information is processed through an LSTM network, and the time sequence characteristics of the audio sample data are learned on the basis of the basic characteristics to obtain an intermediate vector with the time sequence characteristics; the specific process comprises the following steps:
step 2.1: the input data interacts with a sigmoid function, which judges the degree to which the data is retained, so that only data meeting the requirement enters the network;
step 2.2: the retention coefficient, obtained by passing the hidden-layer input and the previous layer's output through a sigmoid, is multiplied with the data obtained by passing the hidden-layer input and the previous layer's output through tanh, yielding the state information of the current data;
step 2.3: the previous layer's data-information state is combined with the current input signal and passed through a sigmoid to obtain a weight; the current data-information state is passed through tanh to obtain a value, which is multiplied with that weight; the result is a 256-dimensional intermediate vector describing the original audio sample data;
step 3, inputting the output result of the LSTM network into the CNN network, and outputting a target identification result through the CNN network;
the structure of the CNN network is specifically as follows:
from the input layer, the CNN network sequentially comprises the input layer, a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, a third pooling layer, a fourth convolutional layer, a fully-connected layer and a softmax classification layer.
2. The underwater sound target identification method based on the composite neural network as claimed in claim 1, wherein in step 1 a short-time Fourier transform is performed on each segment of the signal to obtain a short-time Fourier transform result; the specific process comprises the following steps:
$$\mathrm{STFT}(t,f)=\int_{-\infty}^{+\infty}s(\tau)\,h^{*}(\tau-t)\,e^{-j2\pi f\tau}\,d\tau$$
in the formula: STFT(t, f) is the short-time Fourier transform result, t is time, s(τ) is the input sound signal, h(·) represents the window function, * denotes the complex conjugate, f is frequency in Hz, e is the base of the natural logarithm, j is the imaginary unit, and τ is the integration variable.
3. The underwater acoustic target identification method based on the composite neural network as claimed in claim 2, wherein in the step 1, the short-time fourier transform result is converted into an energy spectrum, and the specific process is as follows:
$$\mathrm{SPEC}(t,f)=\left|\mathrm{STFT}(t,f)\right|^{2}$$
where SPEC (t, f) is the energy spectrum.
4. The underwater acoustic target recognition method based on the composite neural network as claimed in claim 3, wherein the activation function adopted by the CNN network is ReLU.
CN202110844909.5A 2021-07-26 2021-07-26 Underwater sound target identification method based on composite neural network Active CN113537113B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110844909.5A CN113537113B (en) 2021-07-26 2021-07-26 Underwater sound target identification method based on composite neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110844909.5A CN113537113B (en) 2021-07-26 2021-07-26 Underwater sound target identification method based on composite neural network

Publications (2)

Publication Number Publication Date
CN113537113A CN113537113A (en) 2021-10-22
CN113537113B true CN113537113B (en) 2022-10-25

Family

ID=78088998

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110844909.5A Active CN113537113B (en) 2021-07-26 2021-07-26 Underwater sound target identification method based on composite neural network

Country Status (1)

Country Link
CN (1) CN113537113B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114636995A (en) * 2022-03-16 2022-06-17 中国水产科学研究院珠江水产研究所 Underwater sound signal detection method and system based on deep learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108846323A (en) * 2018-05-28 2018-11-20 哈尔滨工程大学 A kind of convolutional neural networks optimization method towards Underwater Targets Recognition
CN112329524A (en) * 2020-09-25 2021-02-05 泰山学院 Signal classification and identification method, system and equipment based on deep time sequence neural network
CN112615804A (en) * 2020-12-12 2021-04-06 中国人民解放军战略支援部队信息工程大学 Short burst underwater acoustic communication signal modulation identification method based on deep learning
CN112887239A (en) * 2021-02-15 2021-06-01 青岛科技大学 Method for rapidly and accurately identifying underwater sound signal modulation mode based on deep hybrid neural network

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200348662A1 (en) * 2016-05-09 2020-11-05 Strong Force Iot Portfolio 2016, Llc Platform for facilitating development of intelligence in an industrial internet of things system
CN110807365B (en) * 2019-09-29 2022-02-11 浙江大学 Underwater target identification method based on fusion of GRU and one-dimensional CNN neural network
US11885907B2 (en) * 2019-11-21 2024-01-30 Nvidia Corporation Deep neural network for detecting obstacle instances using radar sensors in autonomous machine applications
CN112329819A (en) * 2020-10-20 2021-02-05 中国海洋大学 Underwater target identification method based on multi-network fusion
CN112364779B (en) * 2020-11-12 2022-10-21 中国电子科技集团公司第五十四研究所 Underwater sound target identification method based on signal processing and deep-shallow network multi-model fusion
CN112927723A (en) * 2021-04-20 2021-06-08 东南大学 High-performance anti-noise speech emotion recognition method based on deep neural network

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108846323A (en) * 2018-05-28 2018-11-20 哈尔滨工程大学 A kind of convolutional neural networks optimization method towards Underwater Targets Recognition
CN112329524A (en) * 2020-09-25 2021-02-05 泰山学院 Signal classification and identification method, system and equipment based on deep time sequence neural network
CN112615804A (en) * 2020-12-12 2021-04-06 中国人民解放军战略支援部队信息工程大学 Short burst underwater acoustic communication signal modulation identification method based on deep learning
CN112887239A (en) * 2021-02-15 2021-06-01 青岛科技大学 Method for rapidly and accurately identifying underwater sound signal modulation mode based on deep hybrid neural network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Underwater Acoustic Target Recognition with ResNet18 on ShipsEar Dataset;Feng Hong 等;《2021 IEEE 4th International Conference on Electronics Technology》;20210616;第1240-1244页 *
Research on Voiceprint Recognition Based on a CNN-LSTM Network; 闫河 et al.; Computer Applications and Software; 2019-04-12; Vol. 36, No. 4; pp. 166-170 *
Research on Feature Extraction and Classification Recognition Technology for Underwater Acoustic Targets; 连梓旭; China Master's Theses Full-text Database, Basic Sciences; 2020-02-15 (No. 2); A005-102 *

Also Published As

Publication number Publication date
CN113537113A (en) 2021-10-22

Similar Documents

Publication Publication Date Title
JP7337953B2 (en) Speech recognition method and device, neural network training method and device, and computer program
US11908455B2 (en) Speech separation model training method and apparatus, storage medium and computer device
CN110400579B (en) Speech emotion recognition based on direction self-attention mechanism and bidirectional long-time and short-time network
Ariav et al. An end-to-end multimodal voice activity detection using wavenet encoder and residual networks
CN110827804B (en) Sound event labeling method from audio frame sequence to event label sequence
CN109637522B (en) Speech emotion recognition method for extracting depth space attention features based on spectrogram
CN112581979B (en) Speech emotion recognition method based on spectrogram
CN110853656B (en) Audio tampering identification method based on improved neural network
CN113571067A (en) Voiceprint recognition countermeasure sample generation method based on boundary attack
CN115862684A (en) Audio-based depression state auxiliary detection method for dual-mode fusion type neural network
Zhou et al. A denoising representation framework for underwater acoustic signal recognition
CN112541533A (en) Modified vehicle identification method based on neural network and feature fusion
CN113537113B (en) Underwater sound target identification method based on composite neural network
Zhao et al. A survey on automatic emotion recognition using audio big data and deep learning architectures
Cheng et al. DNN-based speech enhancement with self-attention on feature dimension
Yechuri et al. A nested U-net with efficient channel attention and D3Net for speech enhancement
CN112329819A (en) Underwater target identification method based on multi-network fusion
Hu et al. Speech Emotion Recognition Based on Attention MCNN Combined With Gender Information
Parekh et al. Tackling interpretability in audio classification networks with non-negative matrix factorization
CN111785262B (en) Speaker age and gender classification method based on residual error network and fusion characteristics
CN115171878A (en) Depression detection method based on BiGRU and BiLSTM
Qiu et al. Adversarial Latent Representation Learning for Speech Enhancement.
Li et al. MPAF-CNN: Multiperspective aware and fine-grained fusion strategy for speech emotion recognition
Li et al. Multi-layer attention mechanism based speech separation model
Sushma et al. Emotion analysis using signal and image processing approach by implementing deep neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant