Disclosure of Invention
Aiming at the problem that existing methods achieve a low recognition rate for space target ISAR images under small-sample conditions, the invention provides a multi-learner-optimized XGBoost method for classifying space target ISAR images under small-sample conditions.
The multi-learner-optimized XGBoost method for classifying space target ISAR images under small-sample conditions takes ISAR images of spinning space targets as input and classifies them with an XGBoost model based on multi-learner optimization;
the XGBoost model based on multi-learner optimization is an XGBoost network model formed by integrating a stacked network with the XGBoost algorithm;
the stacked network is based on meta-learner integration, and its parallel basic learners comprise a graph convolution network, a capsule network, and an Alexnet network augmented with a rotation-invariance-based attention mechanism module; the latter is a network model formed by adding the rotation-invariance-based attention mechanism module after the second convolutional layer of the Alexnet network; the capsule network comprises a basic feature extraction module, a vector feature extraction module and a dynamic routing layer, and contains no pooling layer.
Further, the method is characterized in that the ISAR spinning space target image to be classified is preprocessed before space target ISAR image classification is carried out with the XGBoost model based on multi-learner optimization, the preprocessing comprising the following steps:
firstly, carrying out enhanced Lee filtering on the space target ISAR image;
and then carrying out contrast enhancement and energy normalization processing.
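A minimal sketch of this preprocessing chain is given below. The window size, the percentile-stretch bounds, and the use of a basic Lee filter as a stand-in for the enhanced Lee filter are illustrative assumptions, not the patent's exact parameters:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def lee_filter(img, win=5):
    """Basic Lee speckle filter (simplified stand-in for the enhanced Lee
    filter named above); win=5 is an assumed window size."""
    mean = uniform_filter(img, win)
    sq_mean = uniform_filter(img ** 2, win)
    var = np.maximum(sq_mean - mean ** 2, 0.0)
    noise_var = np.mean(var)                    # crude global noise estimate
    gain = var / (var + noise_var + 1e-12)      # Wiener-style adaptive gain
    return mean + gain * (img - mean)

def preprocess(img):
    img = lee_filter(img)                                 # speckle suppression
    lo, hi = np.percentile(img, (2, 98))                  # contrast enhancement
    img = np.clip((img - lo) / (hi - lo + 1e-12), 0, 1)   # percentile stretch
    return img / (np.linalg.norm(img) + 1e-12)            # energy normalization
```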
Further, the specific structure of the capsule network is as follows:
(A) basic feature extraction module:
the basic feature extraction module comprises three convolution units, wherein each convolution unit comprises a convolution layer, a batch normalization layer and a ReLU activation layer;
(B) the vector feature extraction module:
the vector feature extraction module comprises 8 parallel convolutional layers that use no activation function; the convolutional layers are of identical size and each convolves the input feature map;
the results are then stacked into a tensor;
(C) dynamic routing layer
The dynamic routing layer is realized by adopting a dynamic routing method.
Further, the Alexnet network incorporating the attention mechanism module based on rotation invariance is as follows:
the attention mechanism module based on the rotation invariance comprises a rotation processing module and a space attention mechanism module;
a rotation processing module: based on the Alexnet network, rotation operations are performed on the feature map output by the ReLU activation of the second convolutional layer to obtain rotated feature maps;
the spatial attention mechanism module comprises three convolution units, namely Conv1, Conv2 and Conv3;
Conv1: the rotated feature map first passes through the convolutional layer and is then activated by ReLU;
Conv2 comprises the subunits Conv2-1 and Conv2-2; the feature map processed by Conv1 is sent into Conv2-1 and Conv2-2 respectively, where it first passes through the convolutional layer and is then activated by Softmax, yielding the output feature maps F_{2-1} and F_{2-2};
the Conv3 unit comprises the subunits Conv3-1 and Conv3-2; the rotated feature map obtained by the rotation processing module is sent directly into Conv3-1 and Conv3-2, where it first passes through the convolutional layer and then batch normalization, yielding G_{3-j}, j = 1, 2;
The feature map F_{2-j} is transposed to obtain F'_{2-j}; F'_{2-j} is multiplied with the feature map G_{3-j} and a randomly initialized coefficient α_j, and the results are added to obtain the output feature map.
Finally, the output feature maps obtained for all rotation angles are averaged to obtain the output feature map of the RIAM module, where K denotes the number of rotation operations performed by the rotation processing module.
Further, the graph convolution network is as follows:
the graph convolution network comprises a feature extraction module and a graph neural network processing module;
(a) a feature extraction module:
the feature extraction module comprises three convolution units, each convolution unit comprising: the system comprises a convolution layer, a batch normalization layer, a maximum pooling layer and a ReLU activation function layer;
the last convolution unit outputs the feature expression of the image in a vector form, and combines the feature expression with the one-hot coding formed by the image category as the input of the graph neural network module;
(b) the graph neural network module:
the graph neural network module comprises two adjacency-matrix units and two graph convolution units; the first adjacency-matrix unit is connected to the first graph convolution unit, followed by the second adjacency-matrix unit and the second graph convolution unit;
the input of the first adjacency-matrix unit is the two-dimensional feature map obtained by pairwise subtraction of the node representation vectors from the feature extraction module; the input of the second adjacency-matrix unit is the feature map obtained by pairwise subtraction of the node representation vectors output by the first graph convolution unit; each adjacency-matrix unit comprises three convolution modules, of which the first two each contain a convolutional layer, a batch normalization layer and a ReLU activation layer, while the last contains only a convolutional layer;
the first graph convolution unit comprises a fully connected layer and a batch normalization layer; the second graph convolution unit contains two fully connected layers and one batch normalization layer, and its output gives the classification probabilities of the five target classes.
Further, the process of forming the XGBoost network model by integrating the stacked network with the XGBoost algorithm includes the following steps:
2.4.1 computing the weights of the basic learners
The input of the XGBoost network is D = {x_{i,α}, l_i, e_{i,α}}, where i = 1, 2, …, n indexes the meta-features, x_{i,α} is a meta-feature, l_i is a label, e_{i,α} is the weight of the meta-feature, and α = 1, 2, 3 is the index of the basic learner; a weight q_α corresponding to each learner is introduced, and the weight of a meta-feature is expressed in terms of the elements x_{i,α,1}, …, x_{i,α,N} of x_{i,α}, their mean, and q_α, where n is the total number of meta-features and N is the number of elements in a meta-feature;
2.4.2 computing the partial derivative of the loss function with respect to the weights
The loss function L comprises the cross-entropy loss l between the predicted value ŷ_i and the label l_i, and the regularization loss Ω(f_k) corresponding to the tree structure f_k; the loss function L is defined as:

L = Σ_i l(ŷ_i, l_i) + Σ_k Ω(f_k)

Ω is defined as follows:

Ω(f) = γT + (1/2)λ Σ_j w_j²

where γ is the regularization parameter for the number of leaves, T is the number of leaf nodes, λ is the regularization parameter for the leaf weights, and w represents the weights of the leaves; the partial derivative of L with respect to q_α is then computed;
2.4.3 adaptively updating weights
The learner weights are updated using the partial derivatives computed above and the update rate η;
the weight corresponding to each meta-feature is updated through the change in its learner's weight, realizing adaptive updating of the weights;
2.4.4 calculation of network gain
The gain of the network is calculated from the first and second derivatives of the loss with respect to the predicted values; g_{i,α} and h_{i,α} are the first and second derivatives for the meta-features, and L and R denote the left and right nodes of the decision tree. The gain takes the standard XGBoost form:

Gain = (1/2)[G_L²/(H_L+λ) + G_R²/(H_R+λ) − (G_L+G_R)²/(H_L+H_R+λ)] − γ

where G_L = Σ_{i∈L} g_{i,α}, H_L = Σ_{i∈L} h_{i,α}, and likewise for R. The tree is split at the maximum gain to obtain the final classification result.
Further, the training process of the XGBoost network model includes the following steps:
meta-feature output of the graph convolution network: the classification probabilities of the data in the validation set are taken as training meta-features;
meta-feature output of the Alexnet network with the rotation-invariance-based attention mechanism module: the classification confidences on the validation sample set are combined to form training meta-features;
meta-feature output of the capsule network: the L2 norm of the output vector is taken to obtain the classification probability, and the classification probabilities of the data in the validation sample set are taken as training meta-features;
the multi-learner-optimized XGBoost network is then trained with the training meta-features.
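The stacking step can be sketched as follows; `p_alex`, `p_gcn`, `p_caps` and `fit_meta_learner` are assumed names, and the plain XGBoost classifier is used as a stand-in for the multi-learner-optimized variant of section 2.4:

```python
import numpy as np
import xgboost as xgb

def fit_meta_learner(p_alex, p_gcn, p_caps, y_val):
    """p_alex, p_gcn, p_caps: (n_val, 5) validation-set class probabilities
    of the three basic learners; y_val: (n_val,) integer labels."""
    meta_X = np.hstack([p_alex, p_gcn, p_caps])     # training meta-features
    clf = xgb.XGBClassifier(objective="multi:softprob",
                            max_depth=20, n_estimators=2000)  # values from the embodiment
    clf.fit(meta_X, y_val)
    return clf
```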
Further, the training process of the graph convolution network, the capsule network and the Alexnet network is as follows:
the Alexnet network model training process comprises the following steps:
in the training process, each class of the training sample library is equally divided into K parts, of which K−1 parts form the training sample set and the remaining part forms the validation sample set; K-fold cross-validation is performed so that every one of the K parts is validated;
training an Alexnet network model by utilizing a training sample set and a verification sample set, and combining the classification confidence degrees of the verification sample set to form training meta-features;
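The K-fold protocol above can be sketched as follows; `train_model` and `predict_proba` are hypothetical placeholders for the Alexnet+RIAM training and inference code:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def out_of_fold_meta_features(X, y, K=5):
    """Every sample is validated exactly once; the out-of-fold class
    probabilities become the training meta-features."""
    meta = np.zeros((len(y), 5))                      # 5 target classes
    skf = StratifiedKFold(n_splits=K, shuffle=True, random_state=0)
    for tr_idx, va_idx in skf.split(X, y):
        model = train_model(X[tr_idx], y[tr_idx])     # hypothetical helper
        meta[va_idx] = predict_proba(model, X[va_idx])
    return meta
```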
the graph convolution network model training process comprises the following steps:
dividing the training sample library into a training set and a validation set, where each class of the training set has q_shot images and each class of the validation set has K−q_shot images; in the training process, 5×q_shot+1 images are input to the network in each iteration, where the "+1" is one image taken in sequence from the validation set for verification; its label is known and is used for supervised learning;
training the graph convolution network model by using a training set and a verification set, and taking the classification probability of data in the verification set as training meta-features;
the capsule network model training process comprises the following steps:
dividing a training sample library into a training set and a verification set, training a capsule network model by using the training set and the verification set, obtaining an L2 norm of an output vector to obtain a classification probability, and taking the classification probability of data in the verification sample set as a training meta-feature;
the ISAR spin space target sample is stored in the training sample library.
Further, the ISAR spin space target sample is obtained based on a three-dimensional space target model and a range-Doppler algorithm simulation.
Further, the ISAR spin space target samples stored in the training sample library are preprocessed samples, and the preprocessing process includes the following steps:
firstly, carrying out enhanced Lee filtering on the space target ISAR image;
and then carrying out contrast enhancement, energy normalization and data expansion, the data expansion comprising left-right and up-down flipping.
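The data expansion named above amounts to adding flipped copies of each sample, e.g.:

```python
import numpy as np

def expand(img):
    """Data expansion: the original image plus its left-right and
    up-down flipped copies."""
    return [img, np.fliplr(img), np.flipud(img)]
```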
The invention has the beneficial effects that:
the method of the invention uses the thought of an integration network as a reference, uses three complementary small sample learning networks, namely a network of Alexenet + RIAM module which focuses on the ISAR image target position and semantic information, a graph convolution network which focuses on the deep characteristic relation of the image and a capsule network which focuses on the relative position relation of key parts in a space target, and finally uses an innovative XGboost decision tree optimized by a multi-learner to integrate the network.
The invention comprises three parts, namely an ISAR spinning space target sample, a stacked network architecture and target class estimation; the ISAR spinning space target sample part is responsible for simulating and generating a large number of ISAR images under the space target motion state to form a training sample library and a testing sample library, the stacking network architecture provides a design structure of the network, and the target type estimation part is based on the stacking network and conducts training test on the ISAR spinning space target sample library to achieve a target type estimation function.
By adopting the stacked-network model trained on the training sample library, target class estimation can be carried out on ISAR space target images when only a small number of labelled samples are available in the ISAR observation scene; the correct classification probability of basic learners such as the graph convolution network is improved, actual requirements are met, and implementation is convenient.
According to the invention, the scattering point information required for ISAR imaging is obtained from the three-dimensional model of the space target and electromagnetic parameter simulation; the ISAR imaging results agree closely with measured ISAR images, and a small number of simulated ISAR images of the five target classes are obtained, which meets the sample requirements of the stacked-network method described herein and ensures smooth application of the algorithm.
The invention innovatively provides a stacked network for ISAR spinning space target type estimation; its input is ISAR space target images with a small number of labelled samples, and its output is the class to which the target belongs. The introduction of the rotation-invariance-based attention mechanism module and the proposed multi-learner-optimized XGBoost method underpin the improvement in recognition rate and give the method a clear advantage.
The invention uses ISAR imaging and deep learning to better meet the requirements of complex ISAR target class estimation in different imaging scenes, and the stacked network remains stable under viewing-angle changes, affine transformations and noise in the input image.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention is further described with reference to the following drawings and specific examples, which are not intended to be limiting.
First embodiment: this embodiment is described with reference to Figs. 1 to 5.
The method of this embodiment for classifying space target ISAR images under small-sample conditions with the multi-learner-optimized XGBoost comprises the following steps:
s1, obtaining an ISAR spin space target sample sequence based on the three-dimensional space target model and the range-Doppler algorithm simulation, and grouping the ISAR spin space target samples into a training sample library and a test sample library;
after the ISAR spin space target sample sequence is obtained, the image data are preprocessed: first, enhanced Lee filtering is applied to the space target ISAR image to remove speckle noise; then contrast enhancement, energy normalization and data expansion are carried out, the data expansion comprising left-right and up-down flipping;
in the process of grouping ISAR spinning space target samples into a training sample library and a testing sample library, a K/N value is required to be set, the value represents the proportion of the number of images in the training sample library to the number of images in the testing sample library, and when the K/N is equal to 1, the number of images in the training sample library is equal to the number of images in the testing sample library;
S2, constructing a stacked network based on meta-learner integration, whose basic learners comprise an Alexnet network augmented with a rotation-invariance-based attention mechanism module, a graph convolution network and a capsule network; the basic learners of the stacked network work in parallel;
finally, the meta-learner integrates the three basic learners with the multi-learner-optimized XGBoost algorithm;
the Alexnet network with the rotation-invariance-based attention mechanism module is built as follows: the rotation-invariance-based attention mechanism module is added after the second convolutional layer of the Alexnet network; because the ISAR imaging mechanism causes the same target to appear at different orientations in the image, this strengthens the rotation invariance of feature extraction and improves recognition; the construction process comprises the following steps:
2.1 constructing Alexnet network model with attention mechanism module based on rotation invariance:
2.1.1 in the training process, each class of the training sample library is equally divided into K parts, of which K−1 parts form the training sample set and the remaining part forms the validation sample set; K-fold cross-validation is performed so that every one of the K parts is validated;
2.1.2 attention mechanism module RIAM based on rotation invariance comprises a rotation processing module and a space attention mechanism module;
a rotation processing module: based on the Alexnet network, a series of rotation operations is performed on the feature map output by the ReLU activation of the second convolutional layer to obtain rotated feature maps;
R_λ denotes the series of rotation operations: the R_λ function rotates the input feature map by angle λ_i, i = 1, …, K, giving the rotated feature map x, x ∈ R^{C×W×H}, where R denotes the set of real numbers, W×H is the size of the feature map, and C is the number of feature channels, consistent with the number of channels of the input feature map;
the spatial attention mechanism module comprises three convolution units, namely Conv1, Conv2 and Conv 3;
First, the convolution kernel size and channel number of Conv1 are set; the rotated feature map passes through the convolutional layer and is then activated by ReLU, as shown in the following formula:
F(x)=ReLU(Conv1(Rλ(x))) (1)
Conv2 comprises the subunits Conv2-1 and Conv2-2; the feature map processed by Conv1 is sent into Conv2-1 and Conv2-2 respectively. For Conv2-1 and Conv2-2, the input feature map is F(x); the convolution kernel size is set, and the number of channels is kept consistent with the number of input feature channels. In Conv2-1 and Conv2-2, the feature map first passes through the convolutional layer and is then activated by Softmax, yielding the output feature maps F_{2-1} and F_{2-2}, expressed as follows:
F2-j=Softmax(Conv2-j(F(x))),j=1,2 (2)
The Conv3 unit comprises the subunits Conv3-1 and Conv3-2; the rotated feature map obtained by the rotation processing module is sent directly into Conv3-1 and Conv3-2, where it first passes through the convolutional layer. Dilated convolution is used with a 3×3 kernel, the dilation set to 2, and padding set so that the output feature map keeps the size W×H (so the RIAM module can be added without changing the Alexnet structure); batch normalization is then applied as follows:
G3-j=BN(Conv3-j(Rλ(x))),j=1,2 (3)
The feature map F_{2-j} is transposed to obtain F'_{2-j}; F'_{2-j} is multiplied with the feature map G_{3-j} and a randomly initialized coefficient α_j, and the results are added to obtain the output feature map.
Finally, the output feature maps obtained for all rotation angles are averaged to give the output feature map of the RIAM module.
The constructed rotation-invariance-based attention mechanism module is placed after the second convolutional layer of Alexnet: layers that are too deep have little need for rotation-invariant features, while the features extracted by the first convolutional layer are too sensitive to rotation, so placing the module after the first convolutional layer leads to poor final classification results;
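A PyTorch sketch of the RIAM module follows. The rotation angles (multiples of 90°), the softmax dimension, and the exact contraction realizing "transpose F_{2-j}, multiply with G_{3-j} and α_j, then add" are assumptions (here a batched matrix product over flattened spatial positions with a residual add), since the patent's formula images are not reproduced:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RIAM(nn.Module):
    """Sketch of the rotation-invariance-based attention module."""
    def __init__(self, channels, angles=(0, 1, 2, 3)):    # K quarter-turns (assumed)
        super().__init__()
        self.angles = angles
        self.conv1 = nn.Conv2d(channels, channels, 1)
        self.conv2 = nn.ModuleList(nn.Conv2d(channels, channels, 1) for _ in range(2))
        # Conv3-j: 3x3 dilated conv, dilation 2, padding 2 keeps W x H
        self.conv3 = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=2, dilation=2) for _ in range(2))
        self.bn = nn.ModuleList(nn.BatchNorm2d(channels) for _ in range(2))
        self.alpha = nn.Parameter(torch.rand(2))          # randomly initialized coefficients

    def forward(self, x):
        outs = []
        for k in self.angles:
            r = torch.rot90(x, k, dims=(2, 3))            # rotation processing module
            f = F.relu(self.conv1(r))                     # eq. (1)
            y = r
            for j in range(2):
                f2 = torch.softmax(self.conv2[j](f).flatten(2), dim=-1)   # eq. (2)
                g3 = self.bn[j](self.conv3[j](r)).flatten(2)              # eq. (3)
                att = torch.bmm(f2.transpose(1, 2), g3)   # F'_{2-j} x G_{3-j}
                y = y + self.alpha[j] * torch.bmm(g3, att).view_as(r)
            outs.append(torch.rot90(y, -k, dims=(2, 3)))  # undo rotation before averaging
        return torch.stack(outs).mean(0)                  # average over all K angles
```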
2.1.3 the classification confidences on the validation sample set are combined to form the training meta-features; the test sample library, serving as a test set, is input into the constructed Alexnet network to obtain the test meta-features.
2.2 construct graph convolution network:
2.2.1 data processing:
the training sample library is divided into a training set and a validation set, where each class of the training set has q_shot images and each class of the validation set has K−q_shot images; in the training process, 5×q_shot+1 images are input to the network in each iteration, where the "+1" is one image taken in sequence from the validation set for verification; its label is known and is used for supervised learning.
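The episode construction above can be sketched as follows; `train_by_class`, `val_list` and `build_episode` are assumed names:

```python
def build_episode(train_by_class, val_list, val_cursor, q_shot):
    """One iteration's input: 5 classes x q_shot labelled support images,
    plus one validation image taken in sequence (the '+1')."""
    support = [img for cls in range(5)
               for img in train_by_class[cls][:q_shot]]   # 5 x q_shot images
    query = val_list[val_cursor % len(val_list)]          # labelled, for supervision
    return support, query
```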
2.2.2 the graph convolution network comprises a feature extraction module and a graph neural network processing module;
Overall, the graph convolution network has 9 convolutional layers: 3 in the feature extraction module and 3 in each of the two adjacency-matrix units of the graph neural network module. The first convolutional layer has 9×9-pixel kernels, stride 2 pixels, 32 kernels in total; the second has 5×5-pixel kernels, stride 1, 64 kernels; the third has 3×3-pixel kernels, stride 1, 128 kernels; the fourth has 3×3-pixel kernels, stride 1, 64 kernels; the fifth has 3×3-pixel kernels, stride 1, 32 kernels; the sixth has 3×3-pixel kernels, stride 1, 1 kernel; the seventh has 3×3-pixel kernels, stride 1, 64 kernels; the eighth has 3×3-pixel kernels, stride 1, 32 kernels; the ninth has 3×3-pixel kernels, stride 1, 1 kernel. Only the last convolutional layer of each adjacency-matrix unit has no activation function; every other convolutional layer uses the ReLU activation function.
The graph convolution network contains 3 pooling layers in total, all max pooling with size 2×2 pixels and stride 2 pixels.
(a) A feature extraction module:
the feature extraction module comprises three convolution units, each convolution unit comprising: a convolution layer, a batch normalization layer, a maximum pooling layer and a ReLU activation function layer;
the convolution layer input in the first convolution unit is a training set image and a verification set image, and the input of the next two convolution units is a feature map output by the previous convolution unit;
processing the weight coefficient by batch normalization;
performing pooling operation by using the maximum pooling layer;
activating the weight coefficient by utilizing a ReLU function;
The last convolution unit outputs the feature expression of the image in vector form; this is combined with the one-hot encoding of the image class and recorded as the node representation vector of the feature extraction module, serving as the input of the graph neural network module.
(b) The graph neural network module:
the graph neural network module comprises two adjacency-matrix units and two graph convolution units; the first adjacency-matrix unit is connected to the first graph convolution unit, followed by the second adjacency-matrix unit and the second graph convolution unit;
the input of the first adjacency-matrix unit is the two-dimensional feature map obtained by pairwise subtraction of the node representation vectors from the feature extraction module (the initial input is a series of images; after processing, each image yields a node representation vector, and the vectors are subtracted pairwise); the input of the second adjacency-matrix unit is the feature map obtained by pairwise subtraction of the node representation vectors output by the first graph convolution unit (analogous to the first adjacency-matrix unit); each adjacency-matrix unit comprises three convolution modules, of which the first two each contain a convolutional layer, a batch normalization layer and a ReLU activation layer, while the last contains only a convolutional layer;
the first graph convolution unit comprises a fully connected layer and a batch normalization layer; the second graph convolution unit contains two fully connected layers and one batch normalization layer, and its output gives the classification probabilities of the five target classes.
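A PyTorch sketch of one adjacency-matrix unit and one graph convolution unit follows; the channel widths (64, 32, 1) match the embodiment below, while the row-softmax normalization of the adjacency matrix and the neighbour-aggregation form are assumptions:

```python
import torch
import torch.nn as nn

class AdjacencyUnit(nn.Module):
    """Pairwise differences of node vectors form a 2-D feature map that
    three conv modules reduce to one edge weight per node pair."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(dim, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.Conv2d(64, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1))            # last conv: no activation
    def forward(self, nodes):                          # nodes: (B, V, dim)
        diff = nodes.unsqueeze(2) - nodes.unsqueeze(1) # (B, V, V, dim), pairwise
        return self.net(diff.permute(0, 3, 1, 2)).squeeze(1)  # (B, V, V)

class GraphConvUnit(nn.Module):
    """One graph convolution: aggregate neighbours with A, then FC + BN."""
    def __init__(self, dim_in, dim_out):
        super().__init__()
        self.fc = nn.Linear(dim_in, dim_out)
        self.bn = nn.BatchNorm1d(dim_out)
    def forward(self, nodes, adj):                     # adj: (B, V, V)
        h = self.fc(torch.bmm(adj.softmax(-1), nodes)) # (B, V, dim_out)
        return self.bn(h.transpose(1, 2)).transpose(1, 2)
```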
2.2.4 output of meta-features:
and taking the classification probability of the data in the verification set as a training meta-feature, and taking the classification probability of the data in the test sample library as a test meta-feature.
2.3 constructing a capsule network:
2.3.1 data processing:
the training sample library is subdivided into a training set and a validation set.
2.3.2 the capsule network comprises a basic feature extraction module, a vector feature extraction module and a dynamic routing layer;
Overall, the capsule network has 11 convolutional layers: 3 in the basic feature extraction module and 8 in the vector feature extraction module. The first convolutional layer has 9×9-pixel kernels, stride 2 pixels, 64 kernels in total; the second has 9×9-pixel kernels, stride 2, 128 kernels; the third has 7×7-pixel kernels, stride 1, 128 kernels; the fourth through eleventh all have 9×9-pixel kernels, stride 2, and 32 kernels each; these 8 convolutional layers are parallel and form vector features along a new dimension. The first three convolutional layers use ReLU activation; the remaining convolutional layers use no activation function. The capsule network contains no pooling layer.
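The convolutional part can be sketched as follows; single-channel 128×128 input is assumed, and the commented shapes match the walkthrough later in this embodiment:

```python
import torch
import torch.nn as nn

class CapsuleBackbone(nn.Module):
    """Sketch of the capsule network's convolutional layers; no pooling."""
    def __init__(self):
        super().__init__()
        self.base = nn.Sequential(                       # basic feature extraction
            nn.Conv2d(1, 64, 9, stride=2), nn.BatchNorm2d(64), nn.ReLU(),
            nn.Conv2d(64, 128, 9, stride=2), nn.BatchNorm2d(128), nn.ReLU(),
            nn.Conv2d(128, 128, 7, stride=1), nn.BatchNorm2d(128), nn.ReLU())
        # 8 parallel 9x9 stride-2 convs, 32 kernels each, no activation
        self.caps = nn.ModuleList(nn.Conv2d(128, 32, 9, stride=2) for _ in range(8))

    def forward(self, x):                                # x: (B, 1, 128, 128)
        f = self.base(x)                                 # (B, 128, 20, 20)
        u = torch.stack([c(f) for c in self.caps], -1)   # (B, 32, 6, 6, 8)
        return u.flatten(1, 3)                           # (B, 1152, 8) capsules
```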
(A) Basic feature extraction module:
the basic feature extraction module aims to convert the image into a feature map form through a convolution layer; the basic feature extraction module comprises three convolution units, wherein each convolution unit comprises a convolution layer, a batch normalization layer and a ReLU activation layer;
the input of the convolutional layer is a training set image and a test set image;
processing the weight coefficient by batch normalization;
activating the weight coefficient by utilizing a ReLU function;
does not contain a pooling layer;
(B) the vector feature extraction module:
the vector feature extraction module comprises 8 convolutional layers of identical size; each convolves the input feature map, and the results are stacked into a tensor;
the input of the convolutional layer is a training set image and a test set image;
setting the size of convolution kernel, the number of channels and the step length of the convolution layer respectively;
combining the output results in a new dimension, and outputting a feature vector;
the length of the feature vector changes after it passes through the pose transformation matrix;
(C) dynamic routing layer
The dynamic routing layer plays a role similar to the fully connected layer in a convolutional neural network (dynamic routing and full connection work on different principles but serve the same function); it is realized in the capsule network with the dynamic routing method;
the number of output vectors is consistent with the number of space target classes;
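A sketch of routing-by-agreement (in the standard form of Sabour et al., which the patent's dynamic routing layer is assumed to follow) is given below; `u_hat` holds the prediction vectors produced from the capsules after the pose transformation matrices:

```python
import torch
import torch.nn.functional as F

def squash(s, dim=-1):
    """Capsule non-linearity: keeps direction, maps length into [0, 1)."""
    n2 = (s * s).sum(dim, keepdim=True)
    return (n2 / (1 + n2)) * s / torch.sqrt(n2 + 1e-9)

def dynamic_routing(u_hat, iters=3):
    """u_hat: (B, in_caps, out_caps, dim) prediction vectors;
    3 routing iterations is the usual choice."""
    b = torch.zeros(u_hat.shape[:3], device=u_hat.device)   # routing logits
    for _ in range(iters):
        c = F.softmax(b, dim=2)                             # coupling coefficients
        v = squash((c.unsqueeze(-1) * u_hat).sum(1))        # (B, out_caps, dim)
        b = b + (u_hat * v.unsqueeze(1)).sum(-1)            # agreement update
    return v

# The class probability is then the L2 norm of each output capsule:
# probs = dynamic_routing(u_hat).norm(dim=-1)    # (B, 5) for the five targets
```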
2.3.3 output of meta-features
And taking the L2 norm of the output vector to obtain the classification probability, taking the classification probability of the data in the verification sample set as the training meta-feature, and taking the classification probability of the data in the test sample set as the test meta-feature.
2.4 constructing the multi-learner-optimized XGBoost network:
the multi-learner-optimized XGBoost network involves computing the weights of the basic learners, computing the partial derivatives of the loss function with respect to the weights, adaptively updating the weights, and computing the network gain;
2.4.1 computing the weights of the basic learners
The input of the network is D = {x_{i,α}, l_i, e_{i,α}}, where i = 1, 2, …, n indexes the training meta-features, x_{i,α} is a training meta-feature, l_i is a label, e_{i,α} is the weight of the meta-feature, and α = 1, 2, 3 is the index of the basic learner; a weight q_α corresponding to each learner is introduced, and the weight of a meta-feature is expressed in terms of the elements x_{i,α,1}, …, x_{i,α,N} of x_{i,α}, their mean, and q_α, where n is the total number of meta-features and N is the number of elements in a meta-feature;
2.4.2 computing the partial derivative of the loss function with respect to the weights
The loss function L comprises the cross-entropy loss l between the predicted value ŷ_i and the label l_i, and the regularization loss Ω(f_k) corresponding to the tree structure f_k; the loss function L is defined as:

L = Σ_i l(ŷ_i, l_i) + Σ_k Ω(f_k)

Ω is defined as follows:

Ω(f) = γT + (1/2)λ Σ_j w_j²

where γ is the regularization parameter for the number of leaves, T is the number of leaf nodes, λ is the regularization parameter for the leaf weights, and w represents the weights of the leaves; the partial derivative of L with respect to q_α is then computed;
2.4.3 adaptively updating weights
The learner weights are updated using the partial derivatives computed above and the update rate η;
the weight corresponding to each meta-feature is updated through the change in its learner's weight, realizing adaptive updating of the weights;
2.4.4 calculation of network gain
The gain of the network is calculated from the first and second derivatives of the loss with respect to the predicted values; g_{i,α} and h_{i,α} are the first and second derivatives for the meta-features, and L and R denote the left and right nodes of the decision tree. The gain takes the standard XGBoost form:

Gain = (1/2)[G_L²/(H_L+λ) + G_R²/(H_R+λ) − (G_L+G_R)²/(H_L+H_R+λ)] − γ

where G_L = Σ_{i∈L} g_{i,α}, H_L = Σ_{i∈L} h_{i,α}, and likewise for R. The tree is split at the maximum gain to obtain the final classification result.
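The split gain and the adaptive weight update can be sketched as follows. The gain is the standard XGBoost expression; the normalization of the learner weights and the name `dL_dq` (standing in for the partial derivatives of step 2.4.2, whose closed form the patent gives in an unreproduced formula) are assumptions:

```python
import numpy as np

def split_gain(g, h, left, right, lam=1.0, gamma=0.0):
    """Standard XGBoost split gain from per-sample first/second
    derivatives g, h; left/right are boolean index masks."""
    def term(mask):
        return g[mask].sum() ** 2 / (h[mask].sum() + lam)
    return 0.5 * (term(left) + term(right) - term(left | right)) - gamma

def update_learner_weights(q, dL_dq, eta=0.1):
    """Step 2.4.3: gradient step on the learner weights q_alpha with
    update rate eta, then renormalization (assumed)."""
    q = q - eta * dL_dq
    return q / q.sum()
```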
S3, training the basic learner by adopting a training sample library and a testing sample library respectively to obtain training element characteristics and testing element characteristics;
the multi-learner-optimized XGBoost network is trained with the training meta-features to obtain the XGBoost network model, and the test meta-features are used for testing, yielding results with a higher recognition rate;
the training sample and the test sample are preprocessed ISAR images;
in the training process, the Alexnet network of the stacked network takes input ISAR samples of 224×224 pixels, the graph convolution network takes input ISAR samples of 32×32 pixels, and the capsule network takes input ISAR samples of 128×128 pixels; five classes of space targets are included.
For the graph convolution network, the input image size is 32 × 32 pixels;
the ISAR image is processed by the characteristic extraction layer and then transmitted to the first convolution layer;
the first convolution layer processes the input image and outputs 32 feature maps with the size of 32 multiplied by 32 pixels;
after batch normalization and ReLU processing, the first pooling layer processes the feature map output by the first convolution layer and outputs a three-dimensional feature matrix of 16 × 16 × 32 pixels;
the second convolution layer processes the three-dimensional feature matrix output by the first pooling layer and outputs 64 feature maps with the size of 16 x 16 pixels;
after batch normalization and ReLU processing, the second pooling layer processes the feature map output by the second convolution layer and outputs a three-dimensional feature matrix of 8 multiplied by 64 pixels;
the third convolution layer processes the three-dimensional feature matrix output by the second pooling layer and outputs 128 feature maps with the size of 8 x 8 pixels;
the third pooling layer processes the feature map output by the third convolution layer after batch normalization and ReLU processing, and outputs a three-dimensional feature matrix of 4 x 128 pixels;
the feature maps are converted into vectors and the one-hot codes are appended to give the vertex feature representations, of size 21×1×128; pairwise subtraction of the vertex representations yields the adjacency feature matrix, of size 21×21×128 pixels;
the fourth convolution layer processes the adjacent feature matrix and outputs 64 feature maps with the size of 21 multiplied by 21 pixels;
processing the feature map output by the fourth convolutional layer by the fifth convolutional layer after batch normalization and ReLU processing, and outputting 32 feature maps with the size of 21 x 21 pixels;
processing the feature map output by the fifth convolutional layer by the sixth convolutional layer after batch normalization and ReLU processing, and outputting 1 feature map with the size of 21 multiplied by 21 pixels;
the first full-connection layer comprises 16 neurons and is fully connected with one-dimensional vectors mapped by the three-dimensional characteristic matrix output by the sixth convolutional layer;
vertex representation vectors of size 16×1 are output after processing by the fully connected layer and the batch normalization layer; pairwise subtraction of the vertex representations yields the adjacency feature matrix;
the seventh convolution layer processes the adjacent feature matrix and outputs 64 feature maps with the size of 16 multiplied by 16 pixels;
processing the feature map output by the seventh convolutional layer by the eighth convolutional layer after batch normalization and ReLU processing, and outputting 32 feature maps with the size of 16 multiplied by 16 pixels;
the ninth convolutional layer after batch normalization and ReLU processing processes the feature map output by the eighth convolutional layer and outputs 1 feature map with the size of 16 multiplied by 16 pixels;
the first fully connected layer of the second graph convolution unit comprises 16 neurons and is fully connected to the one-dimensional vector mapped from the three-dimensional feature matrix output by the ninth convolutional layer;
the second fully connected layer comprises 5 neurons and is fully connected with the 16 neurons of the preceding fully connected layer;
for the capsule network, the input image size is 128 × 128 pixels;
the first convolution layer processes the input image and outputs 64 feature maps with the size of 60 multiplied by 60 pixels;
batch normalization and ReLU processing;
the second convolution layer processes the three-dimensional feature matrix and outputs 128 feature maps with the size of 26 multiplied by 26 pixels;
batch normalization and ReLU processing;
the third convolution layer processes the three-dimensional characteristic matrix and outputs 128 characteristic graphs with the size of 20 multiplied by 20 pixels;
batch normalization and ReLU processing;
processing the three-dimensional feature matrix by the fourth to eleventh convolutional layers to obtain a feature vector with the size of 6 × 6 × 8 × 32 formed by the feature map on a new dimension;
after processing by the 6×6×32 transformation matrices of size 8×16, 1152 feature vectors of size 1×16 are obtained;
5 characteristic vectors of 1 multiplied by 16 are obtained by utilizing a dynamic routing algorithm;
the L2 norm of the feature vector is computed to obtain the classification probability.
During training, the Alexnet network uses 32 training samples per batch for 2800 batches in total, with a learning rate of 0.0001 before the fully connected layers and 0.001 for the fully connected layers, optimized by SGD; the GCN uses 16 training samples per batch for 6000 batches in total, with a learning rate of 0.01; the CapsNet uses 64 training samples per batch for 100 batches in total, with a learning rate of 0.001; the multi-learner-optimized XGBoost has depth 20, the number of trees set to 2000, and 5 classes.
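The split learning rates for Alexnet can be set with per-parameter-group options; the sketch below uses torchvision's AlexNet as a stand-in for the Alexnet+RIAM model:

```python
import torch
import torchvision

alexnet = torchvision.models.alexnet(num_classes=5)   # stand-in backbone
optimizer = torch.optim.SGD([
    {"params": alexnet.features.parameters(), "lr": 1e-4},    # before the FC layers
    {"params": alexnet.classifier.parameters(), "lr": 1e-3},  # FC layers
])
```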
Although the invention herein has been described with reference to particular embodiments, it is to be understood that these embodiments are merely illustrative of the principles and applications of the present invention. It is therefore to be understood that numerous modifications may be made to the illustrative embodiments and that other arrangements may be devised without departing from the spirit and scope of the present invention as defined by the appended claims. It should be understood that features described in different dependent claims and herein may be combined in ways different from those described in the original claims. It is also to be understood that features described in connection with individual embodiments may be used in other described embodiments.