CN117173556A - Small sample SAR target recognition method based on twin neural network - Google Patents
Small sample SAR target recognition method based on twin neural network

- Publication number: CN117173556A
- Application number: CN202310942393.7A
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The application discloses a small sample SAR target recognition method based on a twin (Siamese) neural network. The network comprises two symmetrical branches; each branch adopts DenseNet as the backbone feature extraction network, and Inception units replace the conventional convolution layers in the backbone. Each branch contains 3 DenseBlocks, each DenseBlock contains 4 Inception units, and, as shown in FIG. 2, each Inception unit outputs a 12-channel feature map; the numbers of feature-map channels output by the different convolution kernels are in the ratio 1:3:8. The method effectively improves recognition accuracy when the data set is small, the categories are limited, and the images come from complex scenes. By sharing local connection weights, different feature maps are generated to express features, which greatly reduces the computational complexity of the network.
Description
Technical Field
The application relates to radar technology, in particular to a small sample SAR target recognition method based on a twin neural network.
Background
Synthetic aperture radar (Synthetic Aperture Radar, SAR) is an all-day, all-weather imaging radar with high resolution and strong penetration capability, and it plays an important role in both military and civilian fields. Owing to its ground-penetrating observation capability, SAR has developed rapidly in recent decades and is widely applied in agriculture and forestry, hydrological survey, disaster early warning, resource exploration, ocean monitoring, and other fields.
Traditional SAR target recognition methods rely on a large number of hand-crafted features. In addition, traditional methods generally have a large computational cost and poor generalization to new categories. With the rapid development of machine learning techniques, convolutional neural networks (Convolutional Neural Network, CNN) have become a highly popular and indispensable architecture in SAR target recognition, and they largely outperform traditional approaches.
In recent years, CNNs have become the mainstream feature extraction method in SAR automatic target recognition, and their variants greatly improve recognition performance by removing fully connected layers, constructing multi-channel structures, adopting batch normalization, adding information recorders, building cascaded networks, introducing attention mechanisms, and the like. Meanwhile, multi-feature fusion networks have been designed to exploit important features such as geometry, space, and time. However, these methods have limited feature representation capability because of the simplicity of the network architecture, and they fail to extract geometry when the input data is distorted. Deep CNNs typically require a large amount of training data to avoid overfitting, but owing to limited observation conditions and the high cost and time of manual labeling, training data is often insufficient for automatic SAR target recognition. Since SAR data acquisition is more difficult than for natural scene images and annotating SAR images is very time-consuming and laborious, obtaining and annotating large-scale SAR target samples remains a challenge.
Most existing SAR target recognition methods directly apply target recognition techniques designed for natural images to SAR images, ignoring the characteristics of SAR data: an SAR image is a complex-valued image that contains both amplitude and phase information. In traditional feature extraction, the SAR image (or feature vector) is sensitive to changes in the target's attitude angle; in addition, changes in the target's structure, occlusion, concealment, background, parameters, and so on can alter the target SAR image or SAR feature vector, so the recognition effect may be unsatisfactory. Because acquiring SAR data and labeling samples can be very expensive and laborious, a large number of samples cannot be obtained, and some target classes may have only a few or tens of labeled samples. The severe reliance of deep learning on large data sets limits its application in automatic target recognition (ATR) for synthetic aperture radar (SAR), where the target sample set is typically small. In this case, the performance of CNNs degrades significantly, which gives rise to the small sample SAR target recognition problem.
Disclosure of Invention
The main purpose of the application is to provide a small sample SAR target recognition method based on a twin neural network. The method comprises two symmetrical branches, each of which adopts DenseNet as the backbone feature extraction network. To address the vanishing-gradient problem of the DenseNet model in small sample learning, Inception units replace the conventional convolution layers in the backbone, which strengthens feature propagation and allows features at different scales to be exploited. Skip connections are added to the network to enhance its ability to use multi-scale features in the final prediction; this structure reduces the number of learnable parameters in the model while alleviating overfitting. Then, the similarity of the two inputs is computed after feature extraction to judge whether they belong to the same category; finally, the output is compressed into the [0,1] interval by a sigmoid function.
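The description only states that a similarity between the two branch outputs is computed and then squashed into [0,1] by a sigmoid; the exact distance is not specified. The following plain-Python sketch uses the L1 distance, a common Siamese-network choice, purely as an illustrative assumption (the names `sigmoid` and `similarity` are likewise hypothetical).

```python
import math

def sigmoid(x: float) -> float:
    """Squash a real value into the (0, 1) interval."""
    return 1.0 / (1.0 + math.exp(-x))

def similarity(emb_a, emb_b, alpha: float = 1.0) -> float:
    """Hypothetical similarity head: sigmoid of the negative weighted
    L1 distance between the two branch embeddings. A larger distance
    yields a score closer to 0; identical embeddings yield 0.5 here
    (a learned bias would normally shift this)."""
    dist = sum(abs(a - b) for a, b in zip(emb_a, emb_b))
    return sigmoid(-alpha * dist)

e1, e2 = [1.0, 2.0], [3.0, 0.0]
print(similarity(e1, e1))  # 0.5
```

In the actual model the distance weighting and bias would be learned jointly with the branches; this sketch only shows the shape of the computation.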
The technical scheme adopted by the application is as follows: a small sample SAR target recognition method based on a twin neural network comprises two symmetrical branches, each branch adopts DenseNet as the backbone feature extraction network, and Inception units replace the conventional convolution layers in the backbone;
each branch contains 3 DenseBlocks, each DenseBlock contains 4 Inception units, and, as shown in FIG. 2, each Inception unit outputs a 12-channel feature map; the numbers of feature-map channels output by the different convolution kernels are in the ratio 1:3:8.
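The 1:3:8 split of each Inception unit's 12 output channels can be sketched as follows. Which kernel size receives which share of the ratio is an assumption here, and `split_channels` is an illustrative helper, not part of the patent.

```python
def split_channels(total: int, ratio=(1, 3, 8)):
    """Split `total` output channels among the parallel convolution
    branches according to the stated 1:3:8 ratio."""
    assert total % sum(ratio) == 0, "total must be divisible by the ratio sum"
    unit = total // sum(ratio)
    return [r * unit for r in ratio]

# Each Inception unit outputs 12 channels, so the three branches get:
print(split_channels(12))  # [1, 3, 8]
```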
Further, let Hl denote the output of the l-th layer in a DenseBlock; the output Hl+1 of the (l+1)-th layer can then be expressed as formula (1):

Hl+1 = [[f7(Hl), f5(Hl), f3(Hl)], Hl-1, ..., H1]   (1)

where fn(·) denotes a convolution layer with kernel size n and [·] denotes concatenation along the channel dimension; the convolution layer consists of four basic operations, namely batch normalization, unbiased convolution, a ReLU activation function, and dropout with probability 0.2;
the Inception unit consists of several convolution layers, the first of which is a 1×1 convolution introduced as a bottleneck layer that outputs a feature map with 4 times as many channels;
then the feature map is processed by convolution layers with different kernel sizes;
the feature maps generated by the first and second DenseBlocks are added to the final fully connected layer through a 1×1 convolution layer and global average pooling.
Further, the feature extraction module mainly comprises 3 DenseBlock modules;
a transition layer is inserted after each of the first two DenseBlocks, and between adjacent DenseBlocks the transition layer reduces the grid size and the channel number of the feature map to half;
the transition layer mainly contains a 1×1 convolution and an average pooling with stride 2.
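A minimal numpy sketch of such a transition layer: a 1×1 convolution (implemented as a channel-mixing matrix multiply) halves the channel count, then 2×2 average pooling with stride 2 halves the spatial grid. The random weights are stand-ins for the learned 1×1 kernel, and even spatial dimensions are assumed.

```python
import numpy as np

def transition(x: np.ndarray, rng=np.random.default_rng(0)) -> np.ndarray:
    """Sketch of a transition layer on a (C, H, W) feature map:
    1x1 conv halves C, then 2x2 average pooling (stride 2) halves H and W."""
    c, h, w = x.shape
    w1x1 = rng.standard_normal((c // 2, c))   # random stand-in for learned weights
    x = np.einsum('oc,chw->ohw', w1x1, x)     # 1x1 convolution = channel mixing
    # 2x2 average pooling with stride 2 via a reshape trick
    return x.reshape(c // 2, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

print(transition(np.ones((16, 8, 8))).shape)  # (8, 4, 4)
```

Both halvings match the stated purposes: compressing the feature count and downsampling the resolution.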
Further, during training, a binary cross entropy loss function is used to train the model and measure the similarity between two input samples;
the binary cross entropy is a commonly used loss function; let p denote the true label distribution and q the predicted label distribution of the trained model; the cross-entropy loss measures the similarity of p and q, and the binary cross entropy is shown in formula (2):

Loss = -[p·log(q) + (1-p)·log(1-q)]   (2)
the application has the advantages that:
by utilizing the features of different scales, the application strengthens the feature propagation and effectively relieves the gradient vanishing problem in the learning of small samples;
jump connection is added in the network to enhance the capability of the network to utilize different scale features in final prediction, and the network structure can reduce the number of learning parameters in the model and simultaneously alleviate the problem of overfitting;
under the condition that training samples are limited, the generalization capability of the model is further improved;
when the data sets are fewer, the categories are single and the images in the complex scene are identified, the identification accuracy is effectively improved;
different feature graphs are generated to express the features in a mode of sharing the local connection weight, so that the complexity of network calculation is reduced to a greater extent.
In addition to the objects, features and advantages described above, the present application has other objects, features and advantages. The present application will be described in further detail with reference to the drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application.
FIG. 1 is a network structure diagram of a small sample SAR target recognition method based on a twin neural network of the present application;
FIG. 2 is a diagram of a DenseBlock module of the present application;
FIG. 3 is a block diagram of an Inception unit of the present application;
fig. 4 is a transition layer diagram of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
(1) In the small sample SAR target recognition method based on a multi-scale densely connected model of a twin neural network (Multi-scale Dense connection model based on Siamese neural network for few-shot SAR target recognition, MDSN), the overfitting problem is alleviated by the multi-scale densely connected network structure. Following the idea of the twin neural network, the MDSN network takes two inputs, as shown in FIG. 1; the dashed part shares weights, and the loss is computed after the two inputs yield two outputs. For a picture, a convolution kernel scans each pixel neighbourhood in turn, and the kernel weights remain unchanged as it moves; this is exactly weight sharing. The network output obtained by such a convolution is called a feature map of the image, i.e., one convolution kernel corresponds to one feature map. To extract more features from the raw data, multiple convolution kernels can be applied to the input to obtain correspondingly many distinct feature maps. Thus, after introducing weight sharing, the network no longer expresses features through a large number of connection weights, but by generating different feature maps with shared local connection weights. This greatly reduces the computational complexity of the network and facilitates its practical implementation; at the same time, diverse features of the local image structure are mined, which benefits recognition of image targets.
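The weight-sharing idea above can be illustrated with a tiny valid-mode 2D convolution in numpy: one kernel's few weights scan every neighbourhood of the image and yield one feature map, regardless of image size. The function name `conv2d_single` is illustrative.

```python
import numpy as np

def conv2d_single(img: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Valid-mode 2D cross-correlation with ONE shared kernel: the same
    few weights are applied at every position, producing one feature map."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

img = np.arange(36.0).reshape(6, 6)
fmap = conv2d_single(img, np.ones((3, 3)) / 9.0)  # 3x3 averaging kernel
print(fmap.shape)  # (4, 4) -- one kernel -> one feature map
# Weight sharing: 9 parameters for any image size, versus 36*16 = 576
# weights for a dense layer mapping a 6x6 input to a 4x4 output.
```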
(2) The small sample recognition method based on the multi-scale densely connected model of the twin neural network comprises two symmetrical branches, each of which adopts DenseNet as the backbone feature extraction network. To address the vanishing-gradient problem of the DenseNet model in small sample learning, Inception units replace the conventional convolution layers in the backbone. Each branch contains 3 DenseBlocks, each containing 4 Inception units, and each unit outputs a 12-channel feature map, as shown in FIG. 2. The numbers of feature-map channels output by the different convolution kernels (3×3, 5×5, 7×7) are in the ratio 1:3:8.
(3) Let Hl denote the output of the l-th layer in a DenseBlock; the output Hl+1 of the (l+1)-th layer can then be expressed as formula (1):

Hl+1 = [[f7(Hl), f5(Hl), f3(Hl)], Hl-1, ..., H1]   (1)

where fn(·) denotes a convolution layer with kernel size n and [·] denotes concatenation along the channel dimension. The convolution layer consists of four basic operations: batch normalization, unbiased convolution, a ReLU activation function, and dropout with probability 0.2. The Inception unit consists of several convolution layers, the first of which is a 1×1 convolution introduced as a bottleneck layer that outputs a feature map with 4 times as many channels. The feature map is then processed by convolution layers with different kernel sizes. For a large convolution kernel, e.g., 7×7, two convolutions with kernel sizes 7×1 and 1×7 can replace the 7×7 kernel, as shown in FIG. 3. The bottleneck layer and this replacement operation help reduce model complexity and alleviate overfitting. In addition, to exploit features of different grid sizes, the feature maps generated by the first and second DenseBlocks are added to the final fully connected layer through a 1×1 convolution layer and global average pooling.
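The saving from replacing a 7×7 kernel with a 7×1 and a 1×7 convolution can be sanity-checked numerically. Note the equivalence of the stacked pair to a full 7×7 convolution is exact only for separable (rank-1) kernels; for general kernels the factorization trades some expressiveness for fewer parameters.

```python
import numpy as np

# Parameter cost per input/output channel pair:
full, factored = 7 * 7, 7 + 7
print(full, factored)  # 49 14

# A separable 7x7 kernel is the outer product of a 7x1 column kernel
# and a 1x7 row kernel, so applying the two in sequence reproduces it:
col = np.ones((7, 1)) / 7.0
row = np.ones((1, 7)) / 7.0
k7x7 = col @ row            # the full 7x7 averaging kernel
print(k7x7.shape)           # (7, 7)
```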
(4) The feature extraction module mainly comprises 3 DenseBlock modules. A transition layer is inserted after each of the first two DenseBlocks; between adjacent DenseBlocks, the transition layer reduces the grid size and the channel number of the feature map to half. The transition layer mainly contains a 1×1 convolution and an average pooling with stride 2, as shown in fig. 4, and serves two main purposes:
1) preventing the number of feature maps from growing without bound, further compressing the data;
2) downsampling to reduce the resolution of the feature map.
(5) During training, a binary cross entropy (Binary Cross Entropy) loss function is used to train the model and measure the similarity between two input samples. The binary cross entropy is a commonly used loss function. Let p denote the true label distribution and q the predicted label distribution of the trained model; the cross-entropy loss measures the similarity of p and q. The binary cross entropy is given by formula (2):

Loss = -[p·log(q) + (1-p)·log(1-q)]   (2)
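A minimal Python sketch of the binary cross entropy used for training, with the standard clamping of the prediction to avoid log(0) (the clamping constant is an implementation assumption, not part of the patent):

```python
import math

def binary_cross_entropy(p: float, q: float, eps: float = 1e-12) -> float:
    """Binary cross entropy between the true label p (0 or 1) and the
    predicted probability q:  L = -[p*log(q) + (1-p)*log(1-q)]."""
    q = min(max(q, eps), 1.0 - eps)   # clamp to avoid log(0)
    return -(p * math.log(q) + (1.0 - p) * math.log(1.0 - q))

print(round(binary_cross_entropy(1.0, 0.9), 4))  # 0.1054
```

The loss is near zero for a confident correct prediction and grows without bound as the prediction approaches the wrong extreme.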
the foregoing description of the preferred embodiments of the application is not intended to limit the application to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the application are intended to be included within the scope of the application.
Claims (4)
1. A small sample SAR target recognition method based on a twin neural network, characterized by comprising two symmetrical branches, wherein each branch adopts DenseNet as the backbone feature extraction network, and Inception units replace the conventional convolution layers in the backbone;
each branch contains 3 DenseBlocks, each DenseBlock contains 4 Inception units, and, as shown in FIG. 2, each Inception unit outputs a 12-channel feature map; the numbers of feature-map channels output by the different convolution kernels are in the ratio 1:3:8.
2. The small sample SAR target recognition method based on the twin neural network as set forth in claim 1, wherein Hl denotes the output of the l-th layer in a DenseBlock, and the output Hl+1 of the (l+1)-th layer can be expressed as formula (1):

Hl+1 = [[f7(Hl), f5(Hl), f3(Hl)], Hl-1, ..., H1]   (1)

where fn(·) denotes a convolution layer with kernel size n and [·] denotes concatenation along the channel dimension; the convolution layer consists of four basic operations, namely batch normalization, unbiased convolution, a ReLU activation function, and dropout with probability 0.2;
the Inception unit consists of several convolution layers, the first of which is a 1×1 convolution introduced as a bottleneck layer that outputs a feature map with 4 times as many channels;
then the feature map is processed by convolution layers with different kernel sizes;
the feature maps generated by the first and second DenseBlocks are added to the final fully connected layer through a 1×1 convolution layer and global average pooling.
3. The small sample SAR target recognition method based on the twin neural network according to claim 1, wherein the feature extraction module mainly comprises 3 DenseBlock modules;
a transition layer is inserted after each of the first two DenseBlocks, and between adjacent DenseBlocks the transition layer reduces the grid size and the channel number of the feature map to half;
the transition layer mainly contains a 1×1 convolution and an average pooling with stride 2.
4. The small sample SAR target recognition method based on the twin neural network according to claim 1, wherein during training, a binary cross entropy loss function is used to train the model and measure the similarity between two input samples;
the binary cross entropy is a commonly used loss function; let p denote the true label distribution and q the predicted label distribution of the trained model; the cross-entropy loss measures the similarity of p and q, and the binary cross entropy is shown in formula (2):

Loss = -[p·log(q) + (1-p)·log(1-q)]   (2)
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310942393.7A | 2023-07-30 | 2023-07-30 | Small sample SAR target recognition method based on twin neural network |
Publications (1)

Publication Number | Publication Date |
---|---|
CN117173556A | 2023-12-05 |

Family ID: 88930773

Family Applications (1)

Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310942393.7A | Small sample SAR target recognition method based on twin neural network | 2023-07-30 | 2023-07-30 |

Country Status (1)

Country | Link |
---|---|
CN | CN117173556A |
Cited By (1)

Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117876699A | 2024-03-13 | 2024-04-12 | PLA Rocket Force Engineering University | Simulation image fidelity assessment method based on twin neural network |
Legal Events

Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |