CN116106880A - Underwater sound source ranging method and device based on attention mechanism and multi-scale fusion

Publication number
CN116106880A
Authority
CN
China
Prior art keywords
feature, layer, network, module, sound source
Legal status
Granted
Application number
CN202310390544.2A
Other languages
Chinese (zh)
Other versions
CN116106880B (en)
Inventor
徐立军
Current Assignee
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT filed Critical Beijing Institute of Technology BIT
Priority to CN202310390544.2A priority Critical patent/CN116106880B/en
Publication of CN116106880A publication Critical patent/CN116106880A/en
Application granted granted Critical
Publication of CN116106880B publication Critical patent/CN116106880B/en
Current legal status: Active

Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01S: RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S11/00: Systems for determining distance or velocity not using reflection or reradiation
    • G01S11/14: Systems for determining distance or velocity not using reflection or reradiation using ultrasonic, sonic, or infrasonic waves
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/30: Assessment of water resources

Abstract

The application provides an underwater sound source ranging method and device based on an attention mechanism and multi-scale fusion, relating to the technical field of underwater communication. The method comprises the following steps: preprocessing a received underwater signal using signal processing techniques to obtain a sample covariance matrix corresponding to the received signal; and inputting the sample covariance matrix into an underwater sound source ranging network for feature extraction, taking the output result as the predicted distance, wherein the underwater sound source ranging network uses a residual network as its backbone network and comprises an adaptive feature fusion module and at least one feature subspace channel attention module. This scheme achieves accurate ranging of underwater sound sources.

Description

Underwater sound source ranging method and device based on attention mechanism and multi-scale fusion
Technical Field
The application relates to the technical field of underwater communication, in particular to an underwater sound source ranging method and device based on an attention mechanism and multi-scale fusion.
Background
The ocean is an important space and resource guarantee for sustainable development. Underwater sound source ranging based on acoustic waves is an important means in marine application fields such as environment perception, ocean monitoring, and information collection, and is one of the basic technologies for improving the ability to respond to marine emergencies and for strengthening strategy and tactics. Underwater sound source ranging is essentially a feature engineering problem comprising two parts: feature extraction and position prediction. Designing an efficient feature extraction module is therefore key to accurate ranging of underwater sound sources.
Underwater sound source ranging methods fall into two categories. The first is model-driven methods, in which the target position is predicted from manually designed features that are closely tied to the physics of acoustic wave propagation. A typical representative is matched field processing (MFP), which uses marine environment parameters and an acoustic propagation model to simulate the sound field within a limited range, matches the simulated field against the real field, and thereby estimates the source distance. Model-driven methods have the following problem: manually designed features cannot truly and comprehensively reflect actual deep-sea conditions, which limits practical application, and incorrectly designed features directly degrade ranging performance. Data-driven underwater sound source ranging, i.e. a deep neural network (DNN) that learns feature patterns through data analysis and interpretation, is therefore an effective alternative.
In recent years, DNNs have been widely applied in ocean engineering, for example in underwater target detection, direction-of-arrival estimation, and seabed classification. Given input acoustic data, a DNN learns features related to the sound source position through multiple nonlinear layers. Compared with model-driven methods, DNNs have stronger feature representation capability and have achieved state-of-the-art performance in underwater sound source ranging. As a data-driven approach, however, the performance of a DNN depends largely on the amount and quality of training data, and for ocean engineering the acquisition of real data is quite difficult, involving budget constraints, time-consuming experiments, and regulatory and confidentiality issues. Sparse training data causes models to overfit, generalize poorly, and predict with low accuracy.
Disclosure of Invention
The present application aims to solve, at least to some extent, one of the technical problems in the related art.
Therefore, the first object of the application is to provide an underwater sound source ranging method based on an attention mechanism and multi-scale fusion, which solves the technical problems of low prediction precision and difficult application of the existing method and realizes accurate ranging of the underwater sound source.
A second object of the present application is to propose an underwater sound source ranging device based on an attention mechanism and multi-scale fusion.
To achieve the above objective, an embodiment of the first aspect of the present application provides an underwater sound source ranging method based on an attention mechanism and multi-scale fusion, including: preprocessing a received underwater signal using signal processing techniques to obtain a sample covariance matrix corresponding to the received signal; and inputting the sample covariance matrix into an underwater sound source ranging network for feature extraction, taking the output result as the predicted distance, wherein the underwater sound source ranging network uses a residual network as its backbone network and comprises an adaptive feature fusion module and at least one feature subspace channel attention module.
According to the underwater sound source ranging method based on the attention mechanism and multi-scale fusion of the embodiments of the present application, the received signal is preprocessed using signal processing techniques, and the resulting sample covariance matrix effectively represents the relationship between signal frequency and the receiving array. The conventional attention mechanism and multi-scale fusion module are then improved and added to a conventional DNN to obtain the underwater sound source ranging network based on the attention mechanism and multi-scale fusion; finally, the obtained sample covariance matrix is input into the network to output the predicted distance.
Optionally, in an embodiment of the present application, preprocessing the received underwater signal by using a signal processing technology to obtain a sample covariance matrix corresponding to the received signal, including:
normalizing the received underwater signal, and calculating to obtain an initial sample covariance matrix according to the normalized signal;
separating the real part and the imaginary part of the initial sample covariance matrix, and stacking the separated sample covariance matrices of different frequencies along a first dimension to obtain a sample covariance matrix;
wherein the initial sample covariance matrix is expressed as:

    C = \tilde{p} \, \tilde{p}^{H}

where C denotes the sample covariance matrix of shape L × L, L denotes the number of array elements of the receiving array, \tilde{p} denotes the normalized signal, and (\cdot)^{H} denotes the complex conjugate transpose.
Optionally, in an embodiment of the present application, the underwater sound source ranging network further includes at least one pooling layer and at least one fully connected layer; the residual network is a multi-layer network in which each layer is composed of at least one residual block; the layers other than the last layer of the residual network are intermediate layers; each intermediate layer corresponds to a feature subspace channel attention module; and each layer of the residual network corresponds to one pooling layer and one fully connected layer. Inputting the sample covariance matrix into the underwater sound source ranging network for feature extraction includes:
inputting the sample covariance matrix into the residual network and passing it sequentially through each layer to obtain the output data of each intermediate layer, the output of the last layer being taken as the final output of the residual network;
passing the final output through its corresponding pooling layer and fully connected layer to obtain an initial prediction result;
passing the output data of each intermediate layer through its corresponding feature subspace channel attention module to obtain a feature map for each intermediate layer, inputting all the feature maps into the adaptive feature fusion module to obtain integrated features for each intermediate layer, and passing the integrated features of each intermediate layer through their corresponding pooling layers and fully connected layers to obtain the prediction results of all intermediate layers;
and adding and averaging the initial prediction result and the prediction results of all intermediate layers to obtain the final prediction result.
Optionally, in one embodiment of the present application, the feature subspace channel attention module includes a feature subspace module and at least one compressed excitation attention module, the compressed excitation attention module including a compression module and an excitation module, inputting output data of the intermediate layer to the feature subspace channel attention module, comprising:
dividing output data along a channel dimension by using a feature subspace module to obtain at least one group of feature graphs, wherein each group of feature graphs corresponds to one compressed excitation attention module;
inputting each group of feature graphs into a corresponding compression excitation attention module, coding the whole spatial features on the channels of the feature subgroups into global features by using global average pooling through the corresponding compression modules, obtaining the weight of each channel according to the global features through the corresponding excitation modules, and multiplying the weight of each channel by the feature graphs of the corresponding groups to obtain updated feature graphs of each group;
and splicing each updated group of feature images along the channel dimension to obtain corresponding feature images.
Alternatively, in one embodiment of the present application, the global feature is expressed as:

    z = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} x(i, j)

where z denotes the global feature and x denotes the group of feature maps currently being processed, with spatial dimensions H × W;

the weight of each channel is expressed as:

    w = \sigma( W_{2} \, \delta( W_{1} z ) )

where w denotes the weight of each channel, z denotes the global feature, \delta and \sigma denote the ReLU activation function and the sigmoid activation function respectively, and W_{1} and W_{2} denote the convolution layers of the excitation module, W_{1} compressing the channel features and W_{2} restoring the channel dimension;

each updated group of feature maps is expressed as:

    \tilde{x} = w \cdot x

where x denotes the group of feature maps currently being processed and w denotes the weight of each channel;

the feature map is expressed as:

    X' = \mathrm{Concat}(\tilde{x}_{1}, \tilde{x}_{2}, \ldots, \tilde{x}_{G})

where \tilde{x}_{g} denotes the g-th updated group of feature maps and Concat denotes the concatenation operation along the channel dimension.
Optionally, in an embodiment of the present application, inputting all feature maps into the adaptive feature fusion module to obtain the integrated features corresponding to each intermediate layer includes:
selecting the intermediate value of the spatial sizes of all feature maps as the standard size, adjusting all feature maps to the standard size by interpolation and max pooling, and fusing the adjusted feature maps to obtain an initial fusion feature;
passing the initial fusion feature through a convolution and a softmax function to obtain spatially adaptive weights, and splitting the weights along the channel dimension so that they correspond in sequence to the adjusted feature maps, obtaining the split weights;
multiplying each adjusted feature map by its corresponding split weight and summing to obtain an updated fusion feature;
and, following the FPN structure, scaling or expanding the updated fusion feature to the spatial size of each feature map, and adding each feature map to the correspondingly resized updated fusion feature through a skip connection to obtain the integrated features of the feature maps corresponding to each intermediate layer.
Optionally, in an embodiment of the present application, the adjusted feature maps are expressed as:

    X_{i}^{r} = \mathrm{Resize}(X_{i}), \quad i = 1, 2, 3

where X_{1}, X_{2}, X_{3} denote all the feature maps;

the initial fusion feature is expressed as:

    \bar{X} = (X_{1}^{r} + X_{2}^{r} + X_{3}^{r}) / 3

where \bar{X} denotes the initial fusion feature;

the split weights are expressed as:

    [w_{1}, w_{2}, w_{3}] = \mathrm{Softmax}(\mathrm{Conv}_{1 \times 1}(\bar{X}))

where w_{i} denotes the weight corresponding to each adjusted feature map;

the updated fusion feature is expressed as:

    \hat{X} = w_{1} \odot X_{1}^{r} + w_{2} \odot X_{2}^{r} + w_{3} \odot X_{3}^{r}

where \hat{X} denotes the updated fusion feature;

the integrated features of the feature maps corresponding to each intermediate layer are expressed as:

    Y_{1} = X_{1} + \mathrm{Resize}_{1}(\hat{X})
    Y_{2} = X_{2} + \mathrm{Resize}_{2}(\hat{X})
    Y_{3} = X_{3} + \mathrm{Resize}_{3}(\hat{X})

where Y_{1}, Y_{2}, Y_{3} denote the integrated features of the feature maps corresponding to all intermediate layers.
In order to achieve the above object, an embodiment of the second aspect of the present application provides an underwater sound source ranging device based on an attention mechanism and multi-scale fusion, which includes a preprocessing module and a ranging module, wherein,
the preprocessing module is used for preprocessing the received underwater signal by utilizing a signal processing technology to obtain a sample covariance matrix corresponding to the received signal;
the ranging module is used for inputting the sample covariance matrix into the underwater sound source ranging network for feature extraction and taking the output result as the predicted distance, wherein the underwater sound source ranging network uses a residual network as its backbone network and comprises an adaptive feature fusion module and at least one feature subspace channel attention module.
Optionally, in an embodiment of the present application, the preprocessing module is specifically configured to:
normalizing the received underwater signal, and calculating to obtain an initial sample covariance matrix according to the normalized signal;
and separating the real part and the imaginary part of the initial sample covariance matrix, and stacking the separated sample covariance matrices of different frequencies along the first dimension to obtain the sample covariance matrix.
Optionally, in an embodiment of the present application, the underwater sound source ranging network further includes at least one pooling layer and at least one fully connected layer; the residual network is a multi-layer network in which each layer is composed of at least one residual block; the layers other than the last layer of the residual network are intermediate layers; each intermediate layer corresponds to a feature subspace channel attention module; and each layer of the residual network corresponds to one pooling layer and one fully connected layer. The ranging module is specifically configured to:
input the sample covariance matrix into the residual network and pass it sequentially through each layer to obtain the output data of each intermediate layer, the output of the last layer being taken as the final output of the residual network;
pass the final output through its corresponding pooling layer and fully connected layer to obtain an initial prediction result;
pass the output data of each intermediate layer through its corresponding feature subspace channel attention module to obtain a feature map for each intermediate layer, input all the feature maps into the adaptive feature fusion module to obtain integrated features for each intermediate layer, and pass the integrated features through their corresponding pooling layers and fully connected layers to obtain the prediction results of all intermediate layers;
and add and average the initial prediction result and the prediction results of all intermediate layers to obtain the final prediction result.
Additional aspects and advantages of the application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
fig. 1 is a schematic flow chart of an underwater sound source ranging method based on an attention mechanism and multi-scale fusion according to an embodiment of the present application;
FIG. 2 is a diagram illustrating an exemplary structure of a multi-scale fusion underwater sound source ranging network based on an attention mechanism according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a feature subspace channel attention architecture of an embodiment of the present application;
fig. 4 is a schematic structural diagram of an adaptive feature fusion module according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of an underwater sound source ranging device based on an attention mechanism and multi-scale fusion according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the drawings are exemplary and intended for the purpose of explaining the present application and are not to be construed as limiting the present application.
The following describes an underwater sound source ranging method and device based on an attention mechanism and multi-scale fusion according to the embodiments of the present application with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of an underwater sound source ranging method based on an attention mechanism and multi-scale fusion according to an embodiment of the present application.
As shown in fig. 1, the underwater sound source ranging method based on the attention mechanism and the multi-scale fusion comprises the following steps:
step 101, preprocessing a received underwater signal by using a signal processing technology to obtain a sample covariance matrix corresponding to the received signal;
step 102, inputting the sample covariance matrix into an underwater sound source ranging network for feature extraction and taking the output result as the predicted distance, wherein the underwater sound source ranging network uses a residual network as its backbone network and comprises an adaptive feature fusion module and at least one feature subspace channel attention module.
According to the underwater sound source ranging method based on the attention mechanism and multi-scale fusion of the embodiments of the present application, the received signal is preprocessed using signal processing techniques, and the resulting sample covariance matrix effectively represents the relationship between signal frequency and the receiving array. The conventional attention mechanism and multi-scale fusion module are then improved and added to a conventional DNN to obtain the underwater sound source ranging network based on the attention mechanism and multi-scale fusion; finally, the obtained sample covariance matrix is input into the network to output the predicted distance.
Optionally, in an embodiment of the present application, preprocessing the received underwater signal by using a signal processing technology to obtain a sample covariance matrix corresponding to the received signal, including:
for the pretreatment of the received signal, namely by a signal processing technology, the received signal is normalized firstly, and then a sample covariance matrix is calculated, wherein the formula is as follows:
Figure SMS_40
wherein ,
Figure SMS_41
representing the normalized signal, ++>
Figure SMS_42
Representing the complex conjugate transpose>
Figure SMS_43
Representing a sample covariance matrix in the shape of l×l, L representing the number of receive array elements. Due to->
Figure SMS_44
Each data is in complex form and cannot be directly input into the neural network. Thus, the real part and the imaginary part thereof are separated, resulting in input data having a shape of 2×l×l.
Finally, at different frequencies
Figure SMS_45
Stacking along a first dimension to obtain final input data in the shape of 2FXLXL, wherein +.>
Figure SMS_46
Indicating the number of frequencies.
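A minimal sketch of this preprocessing in NumPy follows; the single-snapshot covariance estimate, the unit-norm normalization, and the array shapes are illustrative assumptions, not values fixed by this description:

    import numpy as np

    def preprocess(received: np.ndarray) -> np.ndarray:
        """Build the stacked sample-covariance input described above.

        received: complex array of shape (F, L) -- one snapshot per frequency,
                  for L receiving array elements (illustrative shapes).
        returns:  real array of shape (2F, L, L).
        """
        stacked = []
        for p in received:                  # p: (L,) snapshot at one frequency
            p = p / np.linalg.norm(p)       # normalize (one simple choice of normalization)
            C = np.outer(p, p.conj())       # sample covariance matrix C, shape (L, L)
            stacked.append(C.real)          # separate the real part ...
            stacked.append(C.imag)          # ... and the imaginary part
        return np.stack(stacked, axis=0)    # stack along the first dimension

    # Example: F = 10 frequencies, L = 16 array elements -> input of shape (20, 16, 16)
    rng = np.random.default_rng(0)
    received = rng.standard_normal((10, 16)) + 1j * rng.standard_normal((10, 16))
    print(preprocess(received).shape)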
Optionally, in an embodiment of the present application, the underwater sound source ranging network further includes at least one pooling layer and at least one fully connected layer; the residual network is a multi-layer network in which each layer is composed of at least one residual block; the layers other than the last layer of the residual network are intermediate layers; each intermediate layer corresponds to a feature subspace channel attention module; and each layer of the residual network corresponds to one pooling layer and one fully connected layer. Inputting the sample covariance matrix into the underwater sound source ranging network for feature extraction includes:
inputting the sample covariance matrix into the residual network and passing it sequentially through each layer to obtain the output data of each intermediate layer, the output of the last layer being taken as the final output of the residual network;
passing the final output through its corresponding pooling layer and fully connected layer to obtain an initial prediction result;
passing the output data of each intermediate layer through its corresponding feature subspace channel attention module to obtain a feature map for each intermediate layer, inputting all the feature maps into the adaptive feature fusion module to obtain integrated features for each intermediate layer, and passing the integrated features through their corresponding pooling layers and fully connected layers to obtain the prediction results of all intermediate layers;
and adding and averaging the initial prediction result and the prediction results of all intermediate layers to obtain the final prediction result.
Optionally, in one embodiment of the present application, the feature subspace channel attention module includes a feature subspace module and at least one compressed excitation attention module, the compressed excitation attention module including a compression module and an excitation module, inputting output data of the intermediate layer to the feature subspace channel attention module, comprising:
dividing output data along a channel dimension by using a feature subspace module to obtain at least one group of feature graphs, wherein each group of feature graphs corresponds to one compressed excitation attention module;
inputting each group of feature graphs into a corresponding compression excitation attention module, coding the whole spatial features on the channels of the feature subgroups into global features by using global average pooling through the corresponding compression modules, obtaining the weight of each channel according to the global features through the corresponding excitation modules, and multiplying the weight of each channel by the feature graphs of the corresponding groups to obtain updated feature graphs of each group;
and splicing each updated group of feature images along the channel dimension to obtain corresponding feature images.
Alternatively, in one embodiment of the present application, the global feature is expressed as:

    z = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} x(i, j)

where z denotes the global feature and x denotes the group of feature maps currently being processed, with spatial dimensions H × W;

the weight of each channel is expressed as:

    w = \sigma( W_{2} \, \delta( W_{1} z ) )

where w denotes the weight of each channel, z denotes the global feature, \delta and \sigma denote the ReLU activation function and the sigmoid activation function respectively, and W_{1} and W_{2} denote the convolution layers of the excitation module, W_{1} compressing the channel features and W_{2} restoring the channel dimension;

each updated group of feature maps is expressed as:

    \tilde{x} = w \cdot x

where x denotes the group of feature maps currently being processed and w denotes the weight of each channel;

the feature map is expressed as:

    X' = \mathrm{Concat}(\tilde{x}_{1}, \tilde{x}_{2}, \ldots, \tilde{x}_{G})

where \tilde{x}_{g} denotes the g-th updated group of feature maps and Concat denotes the concatenation operation along the channel dimension.
Optionally, in an embodiment of the present application, inputting all feature maps into the adaptive feature fusion module to obtain the integrated features corresponding to each intermediate layer includes:
selecting the intermediate value of the spatial sizes of all feature maps as the standard size, adjusting all feature maps to the standard size by interpolation and max pooling, and fusing the adjusted feature maps to obtain an initial fusion feature;
passing the initial fusion feature through a convolution and a softmax function to obtain spatially adaptive weights, and splitting the weights along the channel dimension so that they correspond in sequence to the adjusted feature maps, obtaining the split weights;
multiplying each adjusted feature map by its corresponding split weight and summing to obtain an updated fusion feature;
and, following the FPN structure, scaling or expanding the updated fusion feature to the spatial size of each feature map, and adding each feature map to the correspondingly resized updated fusion feature through a skip connection to obtain the integrated features of the feature maps corresponding to each intermediate layer.
Optionally, in an embodiment of the present application, the adjusted feature maps are expressed as:

    X_{i}^{r} = \mathrm{Resize}(X_{i}), \quad i = 1, 2, 3

where X_{1}, X_{2}, X_{3} denote all the feature maps;

the initial fusion feature is expressed as:

    \bar{X} = (X_{1}^{r} + X_{2}^{r} + X_{3}^{r}) / 3

where \bar{X} denotes the initial fusion feature;

the split weights are expressed as:

    [w_{1}, w_{2}, w_{3}] = \mathrm{Softmax}(\mathrm{Conv}_{1 \times 1}(\bar{X}))

where w_{i} denotes the weight corresponding to each adjusted feature map;

the updated fusion feature is expressed as:

    \hat{X} = w_{1} \odot X_{1}^{r} + w_{2} \odot X_{2}^{r} + w_{3} \odot X_{3}^{r}

where \hat{X} denotes the updated fusion feature;

the integrated features of the feature maps corresponding to each intermediate layer are expressed as:

    Y_{1} = X_{1} + \mathrm{Resize}_{1}(\hat{X})
    Y_{2} = X_{2} + \mathrm{Resize}_{2}(\hat{X})
    Y_{3} = X_{3} + \mathrm{Resize}_{3}(\hat{X})

where Y_{1}, Y_{2}, Y_{3} denote the integrated features of the feature maps corresponding to all intermediate layers.
FIG. 2 shows an example structure of the underwater sound source ranging network based on the attention mechanism and multi-scale fusion of the present application. As shown in FIG. 2, this embodiment first selects ResNet-50 as the backbone network. It contains four layers in total, each composed of several residual blocks (ResBlock), producing the outputs X_{1}, X_{2}, X_{3}, X_{4}. On the one hand, these outputs are transmitted sequentially from bottom to top, and the final output yields a prediction d_{4} through a pooling layer and a fully connected layer. On the other hand, the outputs of the first three layers first pass through three feature subspace channel attention modules to extract channel features and multi-frequency features, then pass through the adaptive feature fusion module to fuse semantic features and detail features, and finally yield three predictions d_{1}, d_{2}, d_{3} through three pooling layers and fully connected layers. Finally, the four predictions are added and averaged to obtain the final prediction, defined by the following formula:

    \hat{d} = (d_{1} + d_{2} + d_{3} + d_{4}) / 4
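A minimal PyTorch sketch of this overall structure follows; the 2F = 20 input channel count, the torchvision ResNet-50 backbone, and the Identity stand-ins for the attention and fusion modules (sketched in the following subsections) are assumptions for the example:

    import torch
    import torch.nn as nn
    from torchvision.models import resnet50

    class RangingNet(nn.Module):
        """Sketch: ResNet-50 backbone, one prediction head per layer, averaged."""

        def __init__(self, in_channels: int = 20):  # 2F channels, e.g. F = 10
            super().__init__()
            backbone = resnet50(weights=None)
            # Accept the 2F x L x L covariance input instead of 3-channel images.
            backbone.conv1 = nn.Conv2d(in_channels, 64, 7, stride=2, padding=3, bias=False)
            self.stem = nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu, backbone.maxpool)
            self.layers = nn.ModuleList([backbone.layer1, backbone.layer2, backbone.layer3, backbone.layer4])
            # Stand-ins for the feature subspace channel attention and adaptive
            # feature fusion modules sketched later in this description.
            self.attention = nn.ModuleList(nn.Identity() for _ in range(3))
            self.fuse = nn.Identity()  # would return the integrated features Y_1..Y_3
            dims = (256, 512, 1024, 2048)  # output channels of ResNet-50 layers 1-4
            self.heads = nn.ModuleList(
                nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(d, 1)) for d in dims
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            x = self.stem(x)
            feats = []
            for layer in self.layers:                # bottom-up pass producing X_1..X_4
                x = layer(x)
                feats.append(x)
            maps = [att(f) for att, f in zip(self.attention, feats[:3])]
            fused = self.fuse(maps)                  # integrated features for layers 1-3
            preds = [head(f) for head, f in zip(self.heads[:3], fused)]
            preds.append(self.heads[3](feats[3]))    # d_4 from the last layer
            return torch.stack(preds).mean(dim=0)    # (d_1 + d_2 + d_3 + d_4) / 4

    # Example: batch of 2 inputs with 2F = 20 channels on a 64 x 64 covariance grid.
    print(RangingNet()(torch.randn(2, 20, 64, 64)).shape)  # torch.Size([2, 1])

In the full model, the two Identity stand-ins would be replaced by the modules sketched in the following subsections.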
feature subspace channel attention
Because the channel dimension of the sample covariance matrix represents frequency, the number of channels is larger and the channel features are richer than in a conventional image. A channel attention mechanism is therefore needed to efficiently represent and learn the channel features and assign them varying importance. Based on this observation, this embodiment designs a feature subspace channel attention. FIG. 3 is a schematic diagram of the feature subspace channel attention of this embodiment.
Let the input feature map be X ∈ R^{M × H × W}, where M is the number of feature channels and H, W are the spatial dimensions. The input feature map is first passed through a feature subspace module, which divides it along the channel dimension into G groups, each group containing M/G feature channels. This subspace division can effectively learn multi-domain features, and extracting such features helps address the large intra-class variation in the sample covariance matrix.
Next, each group of feature maps is passed through a compressed excitation attention module, i.e. a squeeze-and-excitation (SE) module. By weighting the channel features, SE emphasizes effective information and suppresses ineffective information, so the complex channel features in the sample covariance matrix are better extracted. SE mainly includes two operations, squeeze and excitation. Taking one group of feature maps x as an example, the group is first input into the squeeze module. The squeeze encodes the entire spatial feature on each channel as a global feature using global average pooling, defined by the following formula:

    z = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} x(i, j)

where z denotes the global feature. Then, in the excitation module, the excitation captures the dependency among channels from the global feature extracted by the squeeze, defined by the following formula:

    w = \sigma( W_{2} \, \delta( W_{1} z ) )

where w denotes the weight of each channel, \delta and \sigma denote the ReLU activation function and the sigmoid activation function respectively, and W_{1} and W_{2} denote convolution layers: the first compresses the channel features to fully capture the relationships between channels, and the second restores the channel dimension. Finally, the weight of each channel is multiplied onto the input features to recalibrate them in the channel dimension. The process is defined by the following formula:

    \tilde{x} = w \cdot x

where x denotes the group of feature maps currently being processed and w denotes the weight of each channel.

Finally, the groups of feature maps output by the SE modules are concatenated along the channel dimension. The process is defined by the following formula:

    X' = \mathrm{Concat}(\tilde{x}_{1}, \tilde{x}_{2}, \ldots, \tilde{x}_{G})

where X' denotes the output of the feature subspace channel attention, \tilde{x}_{g} denotes each group of feature maps after SE, and Concat denotes the concatenation operation along the channel dimension.
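A minimal PyTorch sketch of this feature subspace channel attention follows; the group count G = 4 and the SE reduction ratio of 16 are illustrative assumptions, not values fixed by this description:

    import torch
    import torch.nn as nn

    class FeatureSubspaceChannelAttention(nn.Module):
        """Split channels into G groups and apply squeeze-and-excitation per group."""

        def __init__(self, channels: int, groups: int = 4, reduction: int = 16):
            super().__init__()
            assert channels % groups == 0
            self.groups = groups
            g = channels // groups
            self.se = nn.ModuleList(
                nn.Sequential(
                    nn.AdaptiveAvgPool2d(1),          # squeeze: global average pooling
                    nn.Conv2d(g, g // reduction, 1),  # W_1: compress channel features
                    nn.ReLU(inplace=True),            # delta
                    nn.Conv2d(g // reduction, g, 1),  # W_2: restore channel dimension
                    nn.Sigmoid(),                     # sigma -> per-channel weights w
                )
                for _ in range(groups)
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            chunks = torch.chunk(x, self.groups, dim=1)             # feature subspace split
            out = [xg * se(xg) for xg, se in zip(chunks, self.se)]  # recalibrate each group
            return torch.cat(out, dim=1)                            # concat along channels

    # Example: a 256-channel map keeps its shape while channels are re-weighted.
    fsca = FeatureSubspaceChannelAttention(256)
    print(fsca(torch.randn(1, 256, 32, 32)).shape)  # torch.Size([1, 256, 32, 32])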
Adaptive feature fusion module
Because of the high intra-class variation in the sample covariance matrix, a single semantic feature cannot accurately predict data at similar distances; low-level detail features must be combined. The feature pyramid network (FPN) integrates features by fusing deep semantic information with shallow detail features and is an effective solution. However, the bottom-up and top-down paths in FPN are sequential: each layer's features attend mainly to themselves and to adjacent layers, while cross-layer features receive little attention. Based on this observation, the invention designs an adaptive feature fusion module. Fig. 4 is a schematic structural diagram of the adaptive feature fusion module of this embodiment.
The three input feature maps of different scales are X_{1}, X_{2}, X_{3}, with successively decreasing scales. In the first step, the three feature maps are adjusted to the intermediate size, i.e. the size of X_{2}; after scaling, the same-size feature maps are averaged to obtain the initial fusion feature \bar{X}. The process is defined by the following formulas:

    X_{i}^{r} = \mathrm{Resize}(X_{i}), \quad i = 1, 2, 3

    \bar{X} = (X_{1}^{r} + X_{2}^{r} + X_{3}^{r}) / 3

In the second step, a spatially adaptive weight at the scale of X_{2} is obtained through a 1 × 1 convolution and a softmax function, and is then split along the channel dimension so that the weights correspond in sequence to the three adjusted features. The process is defined by the following formula:

    [w_{1}, w_{2}, w_{3}] = \mathrm{Softmax}(\mathrm{Conv}_{1 \times 1}(\bar{X}))

where w_{i} denotes the weight corresponding to each adjusted feature map.

In the third step, each X_{i}^{r} is multiplied by its weight w_{i}, and the context information is aggregated by computing the weighted sum. The process is defined by the following formula:

    \hat{X} = w_{1} \odot X_{1}^{r} + w_{2} \odot X_{2}^{r} + w_{3} \odot X_{3}^{r}

In the fourth step, following the structure in FPN, the inverse operations are used to scale or expand \hat{X} back to the size of each feature map, which is then added to the corresponding input feature map through a skip connection, outputting the integrated features Y_{1}, Y_{2}, Y_{3}. The process is defined by the following formulas:

    Y_{1} = X_{1} + \mathrm{Resize}_{1}(\hat{X})
    Y_{2} = X_{2} + \mathrm{Resize}_{2}(\hat{X})
    Y_{3} = X_{3} + \mathrm{Resize}_{3}(\hat{X})
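A minimal PyTorch sketch of this adaptive feature fusion follows, assuming the three inputs share one channel count and halve in spatial size from one to the next, with bilinear interpolation and adaptive max pooling standing in for the resize operations (all illustrative assumptions):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AdaptiveFeatureFusion(nn.Module):
        """Fuse three feature maps of decreasing scale with spatially adaptive weights."""

        def __init__(self, channels: int):
            super().__init__()
            # 1x1 conv producing one weight map per input; softmax makes them compete.
            self.weight_conv = nn.Conv2d(channels, 3, kernel_size=1)

        def forward(self, x1, x2, x3):
            # Step 1: resize everything to the intermediate size (that of x2).
            mid = x2.shape[-2:]
            r1 = F.adaptive_max_pool2d(x1, mid)     # downscale the large map
            r2 = x2
            r3 = F.interpolate(x3, size=mid, mode="bilinear", align_corners=False)
            fused = (r1 + r2 + r3) / 3              # initial fusion feature

            # Step 2: spatially adaptive weights, split along the channel dimension.
            w = torch.softmax(self.weight_conv(fused), dim=1)  # (N, 3, H, W)
            w1, w2, w3 = w[:, 0:1], w[:, 1:2], w[:, 2:3]

            # Step 3: weighted sum aggregates the context information.
            agg = w1 * r1 + w2 * r2 + w3 * r3

            # Step 4: scale back to each original size and add via skip connections.
            y1 = x1 + F.interpolate(agg, size=x1.shape[-2:], mode="bilinear", align_corners=False)
            y2 = x2 + agg
            y3 = x3 + F.adaptive_max_pool2d(agg, x3.shape[-2:])
            return y1, y2, y3

    # Example: three maps of decreasing scale keep their shapes after fusion.
    aff = AdaptiveFeatureFusion(channels=64)
    x1, x2, x3 = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 16, 16), torch.randn(1, 64, 8, 8)
    y1, y2, y3 = aff(x1, x2, x3)
    print(y1.shape, y2.shape, y3.shape)

Because the softmax is taken across the three weight channels at every spatial position, the three scales compete pointwise, which is what lets the module attend across layers rather than only to adjacent ones.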
in order to achieve the above embodiment, the present application further provides an underwater sound source ranging device based on an attention mechanism and multi-scale fusion.
Fig. 5 is a schematic structural diagram of an underwater sound source ranging device based on an attention mechanism and multi-scale fusion according to an embodiment of the present application.
As shown in fig. 5, the underwater sound source ranging device based on the attention mechanism and the multi-scale fusion comprises a preprocessing module and a ranging module, wherein,
the preprocessing module is used for preprocessing the received underwater signal by utilizing a signal processing technology to obtain a sample covariance matrix corresponding to the received signal;
the ranging module is used for inputting the sample covariance matrix into the underwater sound source ranging network for feature extraction and taking the output result as the predicted distance, wherein the underwater sound source ranging network uses a residual network as its backbone network and comprises an adaptive feature fusion module and at least one feature subspace channel attention module.
Optionally, in an embodiment of the present application, the preprocessing module is specifically configured to:
normalizing the received underwater signal, and calculating to obtain an initial sample covariance matrix according to the normalized signal;
and separating the real part and the imaginary part of the initial sample covariance matrix, and stacking the separated sample covariance matrices of different frequencies along the first dimension to obtain the sample covariance matrix.
Optionally, in an embodiment of the present application, the underwater sound source ranging network further includes at least one pooling layer and at least one fully connected layer; the residual network is a multi-layer network in which each layer is composed of at least one residual block; the layers other than the last layer of the residual network are intermediate layers; each intermediate layer corresponds to a feature subspace channel attention module; and each layer of the residual network corresponds to one pooling layer and one fully connected layer. The ranging module is specifically configured to:
input the sample covariance matrix into the residual network and pass it sequentially through each layer to obtain the output data of each intermediate layer, the output of the last layer being taken as the final output of the residual network;
pass the final output through its corresponding pooling layer and fully connected layer to obtain an initial prediction result;
pass the output data of each intermediate layer through its corresponding feature subspace channel attention module to obtain a feature map for each intermediate layer, input all the feature maps into the adaptive feature fusion module to obtain integrated features for each intermediate layer, and pass the integrated features through their corresponding pooling layers and fully connected layers to obtain the prediction results of all intermediate layers;
and add and average the initial prediction result and the prediction results of all intermediate layers to obtain the final prediction result.
It should be noted that the foregoing explanation of the embodiment of the underwater sound source ranging method based on the attention mechanism and the multi-scale fusion is also applicable to the underwater sound source ranging device based on the attention mechanism and the multi-scale fusion of the embodiment, and will not be repeated herein.
In the description of the present specification, a description referring to the terms "one embodiment," "some embodiments," "examples," "particular examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "plurality" is at least two, such as two, three, etc., unless explicitly defined otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and additional implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present application.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer readable medium may even be paper or other suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. Alternatively, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.
The above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like. Although embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the application, and that variations, modifications, substitutions, and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the application.

Claims (10)

1. An underwater sound source ranging method based on an attention mechanism and multi-scale fusion is characterized by comprising the following steps:
preprocessing a received underwater signal by using a signal processing technology to obtain a sample covariance matrix corresponding to the received signal;
and inputting the sample covariance matrix into an underwater sound source ranging network for feature extraction, and taking an output result as a predicted distance, wherein the underwater sound source ranging network uses a residual network as its backbone network, and the underwater sound source ranging network comprises an adaptive feature fusion module and at least one feature subspace channel attention module.
2. The method for ranging underwater sound sources based on an attention mechanism and multi-scale fusion according to claim 1, wherein the preprocessing of the received underwater signals by using a signal processing technology to obtain a sample covariance matrix corresponding to the received signals comprises:
normalizing the received underwater signal, and calculating to obtain an initial sample covariance matrix according to the normalized signal;
separating the real part and the imaginary part of the initial sample covariance matrix, and stacking the separated sample covariance matrices of different frequencies along a first dimension to obtain the sample covariance matrix;
wherein the initial sample covariance matrix is expressed as:

    C = \tilde{p} \, \tilde{p}^{H}

where C denotes the sample covariance matrix of shape L × L, L denotes the number of array elements of the receiving array, \tilde{p} denotes the normalized signal, and (\cdot)^{H} denotes the complex conjugate transpose.
3. The underwater sound source ranging method based on the attention mechanism and multi-scale fusion according to claim 1, wherein the underwater sound source ranging network further comprises at least one pooling layer and at least one fully connected layer, the residual network is a multi-layer network, each layer is composed of at least one residual block, the layers other than the last layer of the residual network are intermediate layers, each intermediate layer corresponds to a feature subspace channel attention module, and each layer of the residual network corresponds to one pooling layer and one fully connected layer, and wherein inputting the sample covariance matrix into the underwater sound source ranging network for feature extraction comprises:
inputting the sample covariance matrix into the residual network and passing it sequentially through each layer to obtain output data of each intermediate layer, an output result of the last layer being taken as a final output result of the residual network;
passing the final output result through a corresponding pooling layer and fully connected layer to obtain an initial prediction result;
passing the output data of each intermediate layer through a corresponding feature subspace channel attention module to obtain feature maps corresponding to each intermediate layer, inputting all the feature maps into the adaptive feature fusion module to obtain integrated features corresponding to each intermediate layer, and passing the integrated features corresponding to each intermediate layer through corresponding pooling layers and fully connected layers to obtain prediction results of all the intermediate layers;
and adding and averaging the initial prediction result and the prediction results of all the intermediate layers to obtain a final prediction result.
4. An attention mechanism and multiscale fusion based underwater sound source ranging method according to claim 3 wherein the feature subspace channel attention module comprises a feature subspace module and at least one compressed excitation attention module comprising a compression module and an excitation module, the input of the intermediate layer output data to the feature subspace channel attention module comprising:
dividing the output data along a channel dimension by using the feature subspace module to obtain at least one group of feature graphs, wherein each group of feature graphs corresponds to one compression excitation attention module;
inputting each group of feature graphs into a corresponding compression excitation attention module, coding the whole spatial feature on the channels of the feature subgroups into global features by using global average pooling through the corresponding compression modules, obtaining the weight of each channel according to the global features through the corresponding excitation modules, and multiplying the weight of each channel by the feature graphs of the corresponding groups to obtain updated feature graphs of each group;
and splicing the updated feature graphs of each group along the channel dimension to obtain corresponding feature graphs.
5. The underwater sound source ranging method based on the attention mechanism and multi-scale fusion according to claim 4, wherein the global feature is expressed as:

    z = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} x(i, j)

where z denotes the global feature and x denotes the group of feature maps currently being processed, with spatial dimensions H × W;

the weight of each channel is expressed as:

    w = \sigma( W_{2} \, \delta( W_{1} z ) )

where w denotes the weight of each channel, z denotes the global feature, \delta and \sigma denote the ReLU activation function and the sigmoid activation function respectively, and W_{1} and W_{2} denote the convolution layers of the excitation module, W_{1} compressing the channel features and W_{2} restoring the channel dimension;

each updated group of feature maps is expressed as:

    \tilde{x} = w \cdot x

where x denotes the group of feature maps currently being processed and w denotes the weight of each channel;

the feature map is expressed as:

    X' = \mathrm{Concat}(\tilde{x}_{1}, \tilde{x}_{2}, \ldots, \tilde{x}_{G})

where \tilde{x}_{g} denotes the g-th updated group of feature maps and Concat denotes the concatenation operation along the channel dimension.
6. The underwater sound source ranging method based on the attention mechanism and multi-scale fusion according to claim 3, wherein said inputting all feature maps into the adaptive feature fusion module to obtain the integrated features corresponding to each intermediate layer comprises:
selecting the intermediate value of the spatial sizes of all feature maps as a standard size, adjusting all feature maps to the standard size by interpolation and max pooling, and fusing the adjusted feature maps to obtain an initial fusion feature;
passing the initial fusion feature through a convolution and a softmax function to obtain spatially adaptive weights, and splitting the weights along the channel dimension so that they correspond in sequence to the adjusted feature maps, obtaining split weights;
multiplying each adjusted feature map by its corresponding split weight and summing to obtain an updated fusion feature;
and, following the FPN structure, scaling or expanding the updated fusion feature to the spatial size of each feature map, and adding each feature map to the correspondingly resized updated fusion feature through a skip connection to obtain the integrated features of the feature maps corresponding to each intermediate layer.
7. The underwater sound source ranging method based on the attention mechanism and multi-scale fusion according to claim 6, wherein the adjusted feature maps are expressed as:

    X_{i}^{r} = \mathrm{Resize}(X_{i}), \quad i = 1, 2, 3

where X_{1}, X_{2}, X_{3} denote all the feature maps;

the initial fusion feature is expressed as:

    \bar{X} = (X_{1}^{r} + X_{2}^{r} + X_{3}^{r}) / 3

where \bar{X} denotes the initial fusion feature;

the split weights are expressed as:

    [w_{1}, w_{2}, w_{3}] = \mathrm{Softmax}(\mathrm{Conv}_{1 \times 1}(\bar{X}))

where w_{i} denotes the weight corresponding to each adjusted feature map;

the updated fusion feature is expressed as:

    \hat{X} = w_{1} \odot X_{1}^{r} + w_{2} \odot X_{2}^{r} + w_{3} \odot X_{3}^{r}

where \hat{X} denotes the updated fusion feature;

the integrated features of the feature maps corresponding to each intermediate layer are expressed as:

    Y_{1} = X_{1} + \mathrm{Resize}_{1}(\hat{X})
    Y_{2} = X_{2} + \mathrm{Resize}_{2}(\hat{X})
    Y_{3} = X_{3} + \mathrm{Resize}_{3}(\hat{X})

where Y_{1}, Y_{2}, Y_{3} denote the integrated features of the feature maps corresponding to all intermediate layers.
8. An underwater sound source ranging device based on an attention mechanism and multi-scale fusion is characterized by comprising a preprocessing module and a ranging module, wherein,
the preprocessing module is used for preprocessing the received underwater signal by utilizing a signal processing technology to obtain a sample covariance matrix corresponding to the received signal;
the ranging module is used for inputting the sample covariance matrix into an underwater sound source ranging network for feature extraction and taking an output result as a predicted distance, wherein the underwater sound source ranging network uses a residual network as its backbone network, and the underwater sound source ranging network comprises an adaptive feature fusion module and at least one feature subspace channel attention module.
9. The underwater sound source ranging device based on the attention mechanism and the multi-scale fusion as claimed in claim 8, wherein the preprocessing module is specifically configured to:
normalizing the received underwater signal, and computing an initial sample covariance matrix from the normalized signal;

and separating the real part and the imaginary part of the initial sample covariance matrix, and stacking the separated matrices of the different frequencies along a first dimension to obtain the sample covariance matrix.
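A minimal NumPy sketch of this preprocessing, assuming the received data is organized as complex snapshots per frequency bin of shape (frequencies, sensors, snapshots); the exact normalization (per-snapshot L2 here) and the real/imaginary stacking order are assumptions, since the claim fixes neither.

```python
import numpy as np

def build_input_scm(signals: np.ndarray) -> np.ndarray:
    """signals: complex array of shape (n_freqs, n_sensors, n_snapshots).
    Returns a real array of shape (2 * n_freqs, n_sensors, n_sensors)."""
    planes = []
    for s in signals:  # s: (n_sensors, n_snapshots) at one frequency
        # Normalize each snapshot vector to unit L2 norm (assumed normalization).
        s = s / (np.linalg.norm(s, axis=0, keepdims=True) + 1e-12)
        scm = s @ s.conj().T / s.shape[1]  # initial sample covariance matrix
        planes.append(scm.real)            # separated real part
        planes.append(scm.imag)            # separated imaginary part
    # Stack the per-frequency real/imaginary planes along the first dimension.
    return np.stack(planes, axis=0)
```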
10. The underwater sound source ranging device based on the attention mechanism and the multi-scale fusion according to claim 8, wherein the underwater sound source ranging network further comprises at least one pooling layer and at least one fully-connected layer; the residual network is a multi-layer network in which each layer consists of at least one residual block; the layers other than the last layer of the residual network are intermediate layers; each intermediate layer corresponds to a feature subspace channel attention module, and each layer of the residual network corresponds to one pooling layer and one fully-connected layer; and the ranging module is specifically configured to:

inputting the sample covariance matrix into the residual network and passing it through each layer in sequence to obtain the output data of each intermediate layer, the output of the last layer being taken as the final output of the residual network;

passing the final output through the corresponding pooling layer and fully-connected layer to obtain an initial prediction result;

passing the output data of each intermediate layer through the corresponding feature subspace channel attention module to obtain the feature map corresponding to each intermediate layer, inputting all the feature maps into the adaptive feature fusion module to obtain the integrated feature corresponding to each intermediate layer, and passing each integrated feature through the corresponding pooling layer and fully-connected layer to obtain the prediction results of all the intermediate layers;

and averaging the initial prediction result and the prediction results of all the intermediate layers to obtain the final prediction result.
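The claimed data flow can be summarized in a short forward-pass sketch; `net.layers`, `net.attn`, `net.fusion`, `net.pool`, and `net.fc` are hypothetical attribute names standing in for the residual stages, the per-layer feature subspace channel attention modules, the adaptive feature fusion module, and the per-branch pooling and fully-connected heads.

```python
import torch

def predict_range(net, scm: torch.Tensor) -> torch.Tensor:
    """scm: (batch, channels, H, W) input built from the sample covariance matrix."""
    feats, x = [], scm
    for layer in net.layers:            # residual stages, applied in sequence
        x = layer(x)
        feats.append(x)
    *intermediate, final = feats        # intermediate outputs + final output

    # Initial prediction from the backbone's final output.
    preds = [net.fc[-1](net.pool[-1](final).flatten(1))]

    # Channel attention per intermediate layer, then adaptive fusion across scales.
    maps = [attn(f) for attn, f in zip(net.attn, intermediate)]
    for i, feat in enumerate(net.fusion(maps)):
        preds.append(net.fc[i](net.pool[i](feat).flatten(1)))

    # Final prediction: the average of all branch predictions.
    return torch.stack(preds, dim=0).mean(dim=0)
```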
CN202310390544.2A 2023-04-13 2023-04-13 Underwater sound source ranging method and device based on attention mechanism and multi-scale fusion Active CN116106880B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310390544.2A CN116106880B (en) 2023-04-13 2023-04-13 Underwater sound source ranging method and device based on attention mechanism and multi-scale fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310390544.2A CN116106880B (en) 2023-04-13 2023-04-13 Underwater sound source ranging method and device based on attention mechanism and multi-scale fusion

Publications (2)

Publication Number Publication Date
CN116106880A true CN116106880A (en) 2023-05-12
CN116106880B (en) 2023-06-30

Family

ID=86267671

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310390544.2A Active CN116106880B (en) 2023-04-13 2023-04-13 Underwater sound source ranging method and device based on attention mechanism and multi-scale fusion

Country Status (1)

Country Link
CN (1) CN116106880B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116559778A (en) * 2023-07-11 2023-08-08 海纳科德(湖北)科技有限公司 Vehicle whistle positioning method and system based on deep learning


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2014035328A (en) * 2012-08-10 2014-02-24 Tokyo Univ Of Marine Science & Technology Underwater positional relation information acquisition system and underwater positional relation information acquisition method
CN109975762A (en) * 2017-12-28 2019-07-05 中国科学院声学研究所 A kind of underwater sound source localization method
CN109993280A (en) * 2019-03-27 2019-07-09 东南大学 A kind of underwater sound source localization method based on deep learning
CN112329658A (en) * 2020-11-10 2021-02-05 江苏科技大学 Method for improving detection algorithm of YOLOV3 network
CN113269077A (en) * 2021-05-19 2021-08-17 青岛科技大学 Underwater acoustic communication signal modulation mode identification method based on improved gating network and residual error network
CN114332592A (en) * 2022-03-11 2022-04-12 中国海洋大学 Ocean environment data fusion method and system based on attention mechanism
CN115239974A (en) * 2022-06-27 2022-10-25 重庆邮电大学 Vision synchronous positioning and map construction closed-loop detection method integrating attention mechanism

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GUO Qifan; LIU Lei; ZHANG Cheng; XU Wenjuan; JING Wenfeng: "Multi-scale feature fusion network based on feature pyramid", Chinese Journal of Engineering Mathematics, no. 05 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116559778A (en) * 2023-07-11 2023-08-08 海纳科德(湖北)科技有限公司 Vehicle whistle positioning method and system based on deep learning
CN116559778B (en) * 2023-07-11 2023-09-29 海纳科德(湖北)科技有限公司 Vehicle whistle positioning method and system based on deep learning

Also Published As

Publication number Publication date
CN116106880B (en) 2023-06-30

Similar Documents

Publication Publication Date Title
CN110472627B (en) End-to-end SAR image recognition method, device and storage medium
CN110532932B (en) Method for identifying multi-component radar signal intra-pulse modulation mode
CN111736125B (en) Radar target identification method based on attention mechanism and bidirectional stacking cyclic neural network
CN110751044A (en) Urban noise identification method based on deep network migration characteristics and augmented self-coding
CN110555841B (en) SAR image change detection method based on self-attention image fusion and DEC
CN116106880B (en) Underwater sound source ranging method and device based on attention mechanism and multi-scale fusion
CN111062450B (en) Image classification device and method based on FPGA and SCNN architecture
Wei et al. A method of underwater acoustic signal classification based on deep neural network
Rasmussen et al. Automatic detection and classification of baleen whale social calls using convolutional neural networks
CN116206185A (en) Lightweight small target detection method based on improved YOLOv7
Li et al. Learning deep models from synthetic data for extracting dolphin whistle contours
CN110782458A (en) Object image 3D semantic prediction segmentation method of asymmetric coding network
CN114742985A (en) Hyperspectral feature extraction method and device and storage medium
CN106251375A (en) A kind of degree of depth study stacking-type automatic coding of general steganalysis
CN116502174A (en) Multi-mode deep learning-based environment recognition method and device
CN113222824B (en) Infrared image super-resolution and small target detection method
CN109508639B (en) Road scene semantic segmentation method based on multi-scale porous convolutional neural network
White et al. More than a whistle: Automated detection of marine sound sources with a convolutional neural network
CN117310668A (en) Underwater sound target identification method integrating attention mechanism and depth residual error shrinkage network
CN112966815A (en) Target detection method, system and equipment based on impulse neural network
CN116884435A (en) Voice event detection method and device based on audio prompt learning
Islam et al. Convolutional neural network based marine cetaceans detection around the Swatch of No Ground in the Bay of Bengal
CN113284150B (en) Industrial quality inspection method and industrial quality inspection device based on unpaired industrial data
CN115147727A (en) Method and system for extracting impervious surface of remote sensing image
CN114065822A (en) Electromagnetic identification method and system for ocean tide fluctuation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant