CN109375186B - Radar target identification method based on depth residual error multi-scale one-dimensional convolution neural network - Google Patents
- Publication number
- CN109375186B (application CN201811405815.2A)
- Authority
- CN
- China
- Prior art keywords
- scale
- layer
- neural network
- radar target
- center
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S7/00—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
- G01S7/02—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00
- G01S7/41—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section
- G01S7/417—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section involving the use of neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Computer Networks & Wireless Communication (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a radar target identification method based on a deep residual multi-scale one-dimensional convolutional neural network. Because the radar target HRRP exhibits translation sensitivity and attitude-angle sensitivity, identification is difficult; the method automatically extracts the invariant features of the radar target HRRP. Exploiting the properties of convolution kernels and down-sampling operations, the method designs two one-dimensional residual multi-scale blocks and two one-dimensional multi-scale down-sampling layers, constructs a neural network model with these two blocks as its core, and, on this basis, proposes a new loss function to improve the separability of the features. The method effectively extracts deep invariant features of the target, achieves high recognition accuracy, and shows good robustness and generalization performance.
Description
Technical Field
The invention belongs to the field of radar automatic target recognition, and provides a radar target identification method based on a deep-learning framework, which solves the problems of target HRRP (high-resolution range profile) feature extraction and classification identification under the full-angle-domain condition.
Background
Existing deep-learning-based radar target HRRP feature extraction methods are essentially based on the autoencoder model and its variants. An autoencoder consists of an encoder and a decoder. The encoding process maps the input to a feature layer, and the output of the encoder is the extracted feature. When the number of neurons in the feature layer (hidden layer) is smaller than the dimension of the input data, the encoding can be regarded as a dimension-reduction operation, similar to Principal Component Analysis (PCA).
At present, the difficulties of radar target HRRP identification fall into two aspects: first, the attitude-angle sensitivity of HRRP, which requires the angular domain to be divided during identification; and second, the poor generalization performance of the identification model. Identification with an autoencoder therefore requires increasing the number of hidden layers and their neurons, but with a limited amount of training data the model easily overfits. Autoencoder-based radar target HRRP identification has consequently been directed mainly at smaller angular domains.
The method utilizes a one-dimensional deep residual multi-scale convolutional neural network model to extract full-angle-domain complex features of the HRRP. The local-connection property of the convolution kernel gives the model good generalization performance, and the cosine center loss function proposed by the method improves the feature separability of different target classes, so the method improves target identification accuracy.
Disclosure of Invention
The invention aims to provide a radar target identification method based on a deep residual multi-scale one-dimensional convolutional neural network that addresses the difficulty of identifying full-angle-domain HRRP and improves the identification accuracy of HRRP targets under the full-angle-domain condition.
The technical solution of the invention is as follows: construct a one-dimensional deep residual multi-scale convolutional neural network, design a loss function more conducive to fine discrimination of targets, and train the model with the training samples to obtain a model usable for end-to-end identification.
In order to achieve this purpose, the invention comprises the following steps:
Step 1: construct the deep one-dimensional residual convolutional neural network model and initialize the model parameters.
Step 2: forward propagation; compute the loss function L for the current iteration.
Step 3: backward propagation; update the parameters in the model using the chain rule.
Step 4: repeat steps 2 and 3 until the loss function converges, obtaining a model for radar target identification.
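The four steps above can be sketched as an ordinary gradient-descent training loop. The sketch below assumes PyTorch (the patent does not name a framework) and takes the model and loss function as parameters, so any network and loss built later in the description can be plugged in:

```python
import torch

def train(model, loader, loss_fn, epochs=10, tol=1e-4):
    """Steps 1-4 of the method as a generic training loop (PyTorch assumed)."""
    opt = torch.optim.SGD(model.parameters(), lr=1e-2)  # step 1: params initialized
    prev = float("inf")
    for _ in range(epochs):
        total = 0.0
        for x, y in loader:
            out = model(x)              # step 2: forward propagation
            loss = loss_fn(out, y)      # compute the loss L for this iteration
            opt.zero_grad()
            loss.backward()             # step 3: backward propagation (chain rule)
            opt.step()                  # update the parameters in the model
            total += loss.item()
        if abs(prev - total) < tol:     # step 4: repeat until the loss converges
            break
        prev = total
    return model
```

The convergence test here (change in epoch loss below a tolerance) is one simple reading of "until the loss function is converged"; the patent does not specify a criterion.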
Compared with the prior art, the invention has the following advantages:
(1) The deep residual multi-scale convolutional neural network takes the one-dimensional convolution kernel as its core. Convolution kernels have local receptive fields and shared weights, and kernels of different scales extract features of different fineness. Exploiting these two properties allows the model to extract the full-angle-domain complex features of the HRRP with fewer parameters and improves the generalization ability of the model.
(2) A new loss function, the cosine center loss, is designed. Training the model with this loss function increases the inter-class distance of the features, reduces the intra-class distance of the features, and improves the identification accuracy of the target.
Drawings
FIG. 1: multi-scale convolutional layer I.
FIG. 2: multi-scale convolutional layer II.
FIG. 3: residual multi-scale block schematic.
FIG. 4: layer I is multi-scale down sampled.
FIG. 5: layer II is multi-scale down sampled.
FIG. 6: the model provided by the method is a schematic diagram.
Detailed Description
The invention is described in further detail below with reference to the drawings. The model constructed by the invention is explained with reference to the accompanying drawings as follows:
1. Construction of residual multi-scale convolution blocks
Two multi-scale convolutional layers are designed; their structures are shown in FIGS. 1 and 2. Except for the modules labeled as down-sampling layers, the other modules represent convolutional layers.
Each of the multi-scale convolutional layers I and II includes four branches. From left to right, branch one and branch two are the same in both layers: branch one contains one convolutional layer with kernel size 1 x 1, and branch two contains two convolutional layers with kernel sizes (from top to bottom) of 1 x 1 and 3 x 1. The difference between multi-scale convolutional layers I and II lies mainly in branches three and four. Branch three contains three convolutional layers, with kernel sizes from top to bottom of 1 x 1, 3 x 1, 5 x 1 in layer I and 5 x 1, 3 x 1, 1 x 1 in layer II. Branch four contains one down-sampling layer and one convolutional layer, whose kernel size is 3 x 1 in layer I and 1 x 1 in layer II; the down-sampling layers of branch four are the same in both layers, and their function is to remove redundant information while better retaining the effective information of the features. Finally, the features of the branches are concatenated directly: taking multi-scale convolutional layer I as an example, an input feature passed through the layer yields an output feature with 16+16+16+16 = 64 channels. The output of every convolutional and down-sampling layer uses zero padding, so the multi-scale convolutional layer leaves the input dimension unchanged and changes only the number of channels.
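As a rough illustration, multi-scale convolutional layer I might be implemented as follows. PyTorch, the per-branch width of 16 channels (matching the 16+16+16+16 = 64 channel count), and max pooling as the branch-four down-sampling operation are all assumptions, not specified by the patent:

```python
import torch
import torch.nn as nn

class MultiScaleConvI(nn.Module):
    """Sketch of multi-scale convolutional layer I: four parallel branches,
    zero-padded so the feature length is preserved, concatenated on channels."""
    def __init__(self, in_ch, width=16):
        super().__init__()
        # branch 1: a single 1x1 convolution
        self.b1 = nn.Conv1d(in_ch, width, 1)
        # branch 2: 1x1 followed by 3x1
        self.b2 = nn.Sequential(nn.Conv1d(in_ch, width, 1),
                                nn.Conv1d(width, width, 3, padding=1))
        # branch 3 (layer I ordering): 1x1 -> 3x1 -> 5x1
        self.b3 = nn.Sequential(nn.Conv1d(in_ch, width, 1),
                                nn.Conv1d(width, width, 3, padding=1),
                                nn.Conv1d(width, width, 5, padding=2))
        # branch 4: down-sampling layer (length-preserving pool) then 3x1 conv
        self.b4 = nn.Sequential(nn.MaxPool1d(3, stride=1, padding=1),
                                nn.Conv1d(in_ch, width, 3, padding=1))

    def forward(self, x):
        # splice the branch features together along the channel axis
        return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)
```

Layer II would differ only in the kernel orderings of branches three and four, per the description above.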
The residual block is composed of convolutional layers and may contain one or more of them; to ensure that the network does not degrade, the number of convolutional layers in a residual block should not be too large and is usually two. The output of the residual block is the sum of the convolutional-layer output and the residual-block input; using residual blocks prevents exploding or vanishing gradients in the model. Because the convolution operation reduces the length of the feature vector, zero padding is used to keep the number and dimension of the output feature vectors the same as those of the input. The plus sign in the figure indicates element-wise addition of the corresponding feature vectors. The expression of the residual block is
x_{l+1} = F(x_l, θ_r) + W_r · x_l

where x_l ∈ R^{k×m} denotes the input feature vectors and x_{l+1} ∈ R^{h×m} the output feature vectors; k and h are the numbers of input and output feature vectors, respectively. θ_r = {k_1, k_2, b_1, b_2} denotes the parameter set of the residual block, and W_r ∈ R^{h×k} adjusts the dimension of x_l so that it matches that of x_{l+1}. Here W_r is set to a constant matrix, so it does not need to be updated during back-propagation.
The residual multi-scale convolution block is obtained by replacing the convolutional layers in the residual block with multi-scale convolutional layers, as shown in FIG. 3.
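A minimal sketch of the residual block around an arbitrary body F, assuming PyTorch. The frozen 1x1 projection plays the role of the constant matrix W_r, which the patent excludes from back-propagation:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """x_{l+1} = F(x_l, theta_r) + W_r * x_l. `body` is F (e.g. a multi-scale
    convolutional layer); `proj` is W_r, a fixed 1x1 channel projection."""
    def __init__(self, in_ch, out_ch, body):
        super().__init__()
        self.body = body
        self.proj = nn.Conv1d(in_ch, out_ch, 1, bias=False)  # W_r
        self.proj.weight.requires_grad_(False)  # constant matrix, not updated

    def forward(self, x):
        # element-wise sum of the body output and the projected input
        return self.body(x) + self.proj(x)
```

Passing a `MultiScaleConvI`-style layer as `body` would yield the residual multi-scale block of FIG. 3.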
2. Construction of multiscale downsampling layers
Two multi-scale down-sampling layers are designed, as shown in FIGS. 4 and 5. Except for the modules labeled as down-sampling layers, the other modules represent convolutional layers.
Each of the multi-scale down-sampling layers I and II includes three branches. From left to right, branch one contains one convolutional layer, with kernel size 3 x 1 in layer I and 1 x 1 in layer II. Branch two contains two convolutional layers, with kernel sizes 5 x 1, 3 x 1 in layer I and 3 x 1, 1 x 1 in layer II. Branch three consists of one convolutional layer and one down-sampling layer, with kernel size 3 x 1 in layer I and 1 x 1 in layer II, and the window length of the down-sampling layer is 3 x 1. The last layer of each branch has stride 2, so the output dimension is half the input feature dimension; the multi-scale down-sampling layer thus not only reduces the data dimension but also extracts deeper features.
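A sketch of multi-scale down-sampling layer I under the same assumptions as before (PyTorch, 16 channels per branch, max pooling as the down-sampling operation); each branch's last layer uses stride 2, so the output length is half the input length:

```python
import torch
import torch.nn as nn

class MultiScaleDownI(nn.Module):
    """Sketch of multi-scale down-sampling layer I: three branches whose final
    layer has stride 2, halving the feature length (width=16 is an assumption)."""
    def __init__(self, in_ch, width=16):
        super().__init__()
        # branch 1: one 3x1 convolution, stride 2
        self.b1 = nn.Conv1d(in_ch, width, 3, stride=2, padding=1)
        # branch 2: 5x1 then 3x1, the second with stride 2
        self.b2 = nn.Sequential(nn.Conv1d(in_ch, width, 5, padding=2),
                                nn.Conv1d(width, width, 3, stride=2, padding=1))
        # branch 3: 3x1 conv, then a window-3 down-sampling layer with stride 2
        self.b3 = nn.Sequential(nn.Conv1d(in_ch, width, 3, padding=1),
                                nn.MaxPool1d(3, stride=2, padding=1))

    def forward(self, x):
        return torch.cat([self.b1(x), self.b2(x), self.b3(x)], dim=1)
```

With this padding, an input of length L produces an output of length L/2 (for even L), matching the halving described above.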
3. Loss function design
The loss function design draws on the Large Margin Cosine Loss used in face recognition methods and, on that basis, adds a constraint on the intra-class distance of the features, yielding a cosine center loss function based on the cosine distance and the feature centers. The loss function is

L = L_LMCL + λ · L_center

L_LMCL = -(1/m) Σ_{i=1..m} log( e^{s(cosθ_{y_i} - a)} / ( e^{s(cosθ_{y_i} - a)} + Σ_{j≠y_i} e^{s·cosθ_j} ) )

L_center = α · (1/m) Σ_{i=1..m} ( 1 - cos<x_i, c_{y_i}> )

where m is the number of training data; x_i denotes the output feature of the fully connected layer preceding the output layer for the i-th sample; y_i, the true label of the i-th sample, denotes the class to which it belongs; W_j ∈ R^d is the j-th column of the weight matrix W ∈ R^{d×n} of the fully connected layer, i.e., the weight vector corresponding to the j-th class; cosθ_j denotes the cosine between W_j and x_i; s and α control the amplitudes of the normalized features in L_LMCL and L_center, respectively, avoiding the non-convergence of the loss function after feature normalization; a is a constant that strengthens the constraint on the angle between the feature x_i and the weight W_j; c_{y_i} is the feature center of the class corresponding to label y_i; λ is the weight of the center-loss term, and the larger λ is, the more tightly the within-class features cluster.
The loss function is divided into two parts, L_LMCL and L_center. L_LMCL is the Large Margin Cosine Loss (LMCL). The traditional softmax loss is

L_softmax = -(1/m) Σ_{i=1..m} log( e^{W_{y_i}^T x_i} / Σ_{j=1..n} e^{W_j^T x_i} )

where W_j^T x_i = ||W_j|| · ||x_i|| · cosθ_j, so cosθ_j can be regarded as a similarity measure between the weight W_j and the feature x_i. To make full use of the angle information when training the model, the norms of the weight matrix W and the feature vector x should both be 1, so a normalization operation is performed on W and x: W = W*/||W*||, x = x*/||x*||, which gives

L = -(1/m) Σ_{i=1..m} log( e^{cosθ_{y_i}} / Σ_{j=1..n} e^{cosθ_j} ).

Because cosθ is monotonically decreasing in θ, the smaller the angle between the feature x_i and the weight W_j, the larger cosθ_j, and the greater the probability that the corresponding sample i belongs to class j: sample i is assigned to class y_i when cosθ_{y_i} > cosθ_j for all j ≠ y_i. In the proposed method, sample i is assigned to class y_i when the stricter condition cosθ_{y_i} - a > cosθ_j holds for all j ≠ y_i.
The effect of the hyperparameter a > 0 is to strengthen the constraint on the angle between the feature x_i and the weight W_j. Under ideal conditions the value range of a is 0 < a ≤ 1 - cos(2π/n), where n is the number of categories: in the feature space, a = 1 - cos(2π/n) if and only if the angles between adjacent feature centers are all equal and the angle between each class's features and its corresponding feature center is 0.
The parameter s addresses the non-convergence of the loss function. Its value must satisfy a lower bound determined by m, the number of training samples in a mini-batch, and ε, a small positive constant that should be as close to 0 as possible.
L_center is the cosine center loss term; c is the normalized feature center, initialized as

c_j = (1/m_j) Σ_{i=1..m} δ(y_i, j) · x_i

where δ(y_i, j) equals 1 when y_i belongs to class j and 0 otherwise, and m_j denotes the number of samples of the j-th class target in the mini-batch. cos<x_i, c_{y_i}> denotes the cosine between the feature x_i of sample i and its corresponding feature center c_{y_i}; the term penalizes samples whose feature vectors form a larger angle with their feature center.
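A sketch of the cosine center loss, assuming PyTorch. The LMCL term follows CosFace as described above; the exact form of the center term and the placement of α are assumptions based on the description, since the patent's printed expression did not survive extraction:

```python
import torch
import torch.nn.functional as F

def cosine_center_loss(feats, labels, W, centers, s=30.0, a=0.35,
                       alpha=1.0, lam=0.01):
    """LMCL term plus a cosine-distance penalty pulling each feature toward
    its class center. s, a, alpha, lam are the s, a, alpha, lambda hyper-
    parameters of the description; the default values are illustrative."""
    x = F.normalize(feats, dim=1)            # x = x*/||x*||
    w = F.normalize(W, dim=0)                # W = W*/||W*|| (columns = classes)
    cos = x @ w                              # cos(theta_j) for every class j
    # subtract the margin a from the true-class cosine, scale by s
    margin_cos = cos - a * F.one_hot(labels, cos.size(1))
    l_lmcl = F.cross_entropy(s * margin_cos, labels)
    # center term: penalize a large angle to the normalized class center
    c = F.normalize(centers, dim=1)[labels]
    l_center = alpha * (1.0 - (x * c).sum(dim=1)).mean()
    return l_lmcl + lam * l_center
```

`centers` would be initialized from per-class feature means as described above and updated during training; that bookkeeping is omitted here.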
Claims (2)
1. A radar target identification method based on a depth residual error multi-scale one-dimensional convolution neural network is characterized by comprising the following steps:
step 1, constructing a depth one-dimensional residual convolution neural network model, and initializing model parameters;
step 2, forward propagation is carried out, and a loss function L in the iterative process is calculated;
the calculation method of the loss function specifically comprises the following steps:
wherein L isLMCLIs LargeMargin Cosine Loss, LcenterFor the angle center loss function, m is the number of training data in one min-batch, xiRepresenting the output characteristics, y, of the fully connected layer preceding the output layer for the ith sample dataiThe real label of the ith sample data represents the category of the ith sample data; w is a group ofj∈RdIs the weight matrix W of the full connection layer belongs to Rd×nThe jth column of (1) represents a weight vector corresponding to the jth class target;represents WjAnd xiThe cosine value between s and alpha controls LLMCLAnd LcenterDetermining cosine boundaries between the categories by taking the amplitude of the normalized features as a constant;is the label y of the ith sample dataiThe feature center of the corresponding class, λ is the weight of the center loss term;feature x representing sample iiFeature center corresponding theretoCosine value of the included angle of (a);
step 3, backward propagation, and updating parameters in the model by adopting a chain rule;
and 4, repeating the steps 2 and 3 until the loss function is converged to obtain a model for radar target identification.
2. The radar target identification method of claim 1, wherein the neural network model in step 1 is composed of residual multi-scale convolutional layers, multi-scale down-sampling layers, convolutional layers, down-sampling layers, and fully connected layers.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811405815.2A CN109375186B (en) | 2018-11-22 | 2018-11-22 | Radar target identification method based on depth residual error multi-scale one-dimensional convolution neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109375186A CN109375186A (en) | 2019-02-22 |
CN109375186B true CN109375186B (en) | 2022-05-31 |
Family
ID=65383391
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811405815.2A Active CN109375186B (en) | 2018-11-22 | 2018-11-22 | Radar target identification method based on depth residual error multi-scale one-dimensional convolution neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109375186B (en) |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109901129A (en) * | 2019-03-06 | 2019-06-18 | 中国人民解放军海军航空大学 | Object detection method and system in a kind of sea clutter |
CN109828251B (en) * | 2019-03-07 | 2022-07-12 | 中国人民解放军海军航空大学 | Radar target identification method based on characteristic pyramid light-weight convolution neural network |
CN110069985B (en) * | 2019-03-12 | 2020-08-28 | 北京三快在线科技有限公司 | Image-based target point position detection method and device and electronic equipment |
CN109977871B (en) * | 2019-03-27 | 2021-01-29 | 中国人民解放军战略支援部队航天工程大学 | Satellite target identification method based on broadband radar data and GRU neural network |
CN110837777A (en) * | 2019-10-10 | 2020-02-25 | 天津大学 | Partial occlusion facial expression recognition method based on improved VGG-Net |
CN110942777B (en) * | 2019-12-05 | 2022-03-08 | 出门问问信息科技有限公司 | Training method and device for voiceprint neural network model and storage medium |
CN110929697B (en) * | 2019-12-17 | 2021-04-13 | 中国人民解放军海军航空大学 | Neural network target identification method and system based on residual error structure |
CN112712126B (en) * | 2021-01-05 | 2024-03-19 | 南京大学 | Picture identification method |
CN112346056B (en) * | 2021-01-11 | 2021-03-26 | 长沙理工大学 | Resolution characteristic fusion extraction method and identification method of multi-pulse radar signals |
CN113050077B (en) * | 2021-03-18 | 2022-07-01 | 电子科技大学长三角研究院(衢州) | MIMO radar waveform optimization method based on iterative optimization network |
CN113486917B (en) * | 2021-05-17 | 2023-06-02 | 西安电子科技大学 | Radar HRRP small sample target recognition method based on metric learning |
CN114636995A (en) * | 2022-03-16 | 2022-06-17 | 中国水产科学研究院珠江水产研究所 | Underwater sound signal detection method and system based on deep learning |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015176305A1 (en) * | 2014-05-23 | 2015-11-26 | 中国科学院自动化研究所 | Human-shaped image segmentation method |
CN108776835A (en) * | 2018-05-28 | 2018-11-09 | 嘉兴善索智能科技有限公司 | A kind of deep neural network training method |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103544296B (en) * | 2013-10-22 | 2017-02-15 | 中国人民解放军海军航空工程学院 | Adaptive intelligent integration detection method of radar range extension target |
CN106097355A (en) * | 2016-06-14 | 2016-11-09 | 山东大学 | The micro-Hyperspectral imagery processing method of gastroenteric tumor based on convolutional neural networks |
US10007865B1 (en) * | 2017-10-16 | 2018-06-26 | StradVision, Inc. | Learning method and learning device for adjusting parameters of CNN by using multi-scale feature maps and testing method and testing device using the same |
CN108169745A (en) * | 2017-12-18 | 2018-06-15 | 电子科技大学 | A kind of borehole radar target identification method based on convolutional neural networks |
CN108062754B (en) * | 2018-01-19 | 2020-08-25 | 深圳大学 | Segmentation and identification method and device based on dense network image |
CN108416378B (en) * | 2018-02-28 | 2020-04-14 | 电子科技大学 | Large-scene SAR target recognition method based on deep neural network |
CN108734660A (en) * | 2018-05-25 | 2018-11-02 | 上海通途半导体科技有限公司 | A kind of image super-resolution rebuilding method and device based on deep learning |
Non-Patent Citations (1)
Title |
---|
CosFace: Large Margin Cosine Loss for Deep Face Recognition; H. Wang et al.; 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2018-07-23; pp. 5265-5274 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109375186B (en) | Radar target identification method based on depth residual error multi-scale one-dimensional convolution neural network | |
Leng et al. | Local feature descriptor for image matching: A survey | |
Song et al. | A survey of remote sensing image classification based on CNNs | |
Yuan et al. | Remote sensing image scene classification using rearranged local features | |
Song et al. | Hyperspectral image classification with deep feature fusion network | |
CN108009559B (en) | Hyperspectral data classification method based on space-spectrum combined information | |
CN107316013B (en) | Hyperspectral image classification method based on NSCT (non-subsampled Contourlet transform) and DCNN (data-to-neural network) | |
CN111191583B (en) | Space target recognition system and method based on convolutional neural network | |
Wang et al. | Remote sensing landslide recognition based on convolutional neural network | |
CN103971123B (en) | Hyperspectral image classification method based on linear regression Fisher discrimination dictionary learning (LRFDDL) | |
Qayyum et al. | Scene classification for aerial images based on CNN using sparse coding technique | |
CN108229551B (en) | Hyperspectral remote sensing image classification method based on compact dictionary sparse representation | |
CN108460400B (en) | Hyperspectral image classification method combining various characteristic information | |
CN113139512B (en) | Depth network hyperspectral image classification method based on residual error and attention | |
CN112115806B (en) | Remote sensing image scene accurate classification method based on Dual-ResNet small sample learning | |
CN111652273A (en) | Deep learning-based RGB-D image classification method | |
Zhao et al. | PCA dimensionality reduction method for image classification | |
CN112836671A (en) | Data dimension reduction method based on maximization ratio and linear discriminant analysis | |
CN115019145A (en) | Method for monitoring settlement deformation of road surface of Qinghai-Tibet plateau highway | |
CN109034213B (en) | Hyperspectral image classification method and system based on correlation entropy principle | |
CN115131580A (en) | Space target small sample identification method based on attention mechanism | |
CN104809471A (en) | Hyperspectral image residual error fusion classification method based on space spectrum information | |
Ni et al. | Scene classification from remote sensing images using mid-level deep feature learning | |
Ding et al. | Efficient vanishing point detection method in unstructured road environments based on dark channel prior | |
Ergun | Segmentation of wood cell in cross-section using deep convolutional neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||