CN111274893A - Aircraft image fine-grained identification method based on component segmentation and feature fusion

Aircraft image fine-grained identification method based on component segmentation and feature fusion

Info

Publication number
CN111274893A
Authority
CN
China
Prior art keywords
feature
aircraft
image
key point
component
Prior art date
Legal status
Granted
Application number
CN202010038491.4A
Other languages
Chinese (zh)
Other versions
CN111274893B (en)
Inventor
熊运生
牛新
窦勇
姜晶菲
王康
郄航
Current Assignee
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN202010038491.4A
Publication of CN111274893A
Application granted
Publication of CN111274893B
Legal status: Active
Anticipated expiration

Classifications

    • G06V 20/13 Satellite images
    • G06F 18/2415 Classification techniques relating to the classification model, based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F 18/253 Fusion techniques of extracted features
    • G06N 3/045 Combinations of networks
    • G06N 3/084 Backpropagation, e.g. using gradient descent


Abstract

The invention discloses an aircraft image fine-grained identification method based on component segmentation and feature fusion, and aims to solve the problem that existing image identification systems cannot extract features at the component level and therefore can distinguish only a small number of categories. The technical scheme is to construct an aircraft remote sensing image fine-grained identification system based on component segmentation and feature fusion, composed of a key point detection subsystem, a shared feature extractor, a component feature generator, a feature fusion subsystem and a loss function module; to construct a data set; to train the identification system; and to use the trained system for identification. During identification the key point detection subsystem accurately locates 6 key points, the shared feature extractor completely extracts the overall features of the aircraft, the component feature generator quickly and accurately obtains the feature sub-maps of 4 components, and the feature fusion subsystem extracts the internal features of all components and fuses them. With the method, features can be extracted at the component level, so that more categories can be identified and identification is more accurate.

Description

Aircraft image fine-grained identification method based on component segmentation and feature fusion
Technical Field
The invention relates to the field of image recognition, in particular to an aircraft image (mainly a remote sensing image) fine-grained recognition method based on component segmentation and feature fusion.
Background
With the development of space technology, remote sensing images have become an effective means for surveying and monitoring resources, the environment, city layout, traffic facilities and the like. Aircraft type identification, as a subtask of remote sensing image recognition, has substantial practical demand and application value. Aircraft type identification methods mainly include manual-feature-based methods and deep-learning-based methods.
Traditional aircraft identification methods are mainly based on manual features: they extract features of the image such as texture, colour and geometric shape with an algorithm and perform a certain amount of reasoning to classify the aircraft. These methods extract rotation-invariant features of the target, such as Hu moments, Zernike moments, wavelet moments, Fourier descriptors and SIFT, segment the overall shape of the target with a certain threshold, and then match the extracted features against a parameterised shape template. To make better use of external shape features such as the symmetry of the aircraft, the images may be pose-aligned before template matching. However, these methods require a large amount of manual feature design, the quality of the feature design or the template design directly affects the accuracy of recognition, and their practical application value is limited.
Deep-learning-based methods perform type recognition by building a neural network. The neural network consists of convolutional layers, pooling layers, fully connected layers, activation layers and other structures; low-level texture, intermediate features and high-level semantic information in the image are extracted step by step through repeated stacking of network layers, and finally the class probability corresponding to the image is output. Compared with recognition methods based on manual features, this approach has better fitting ability, robustness and accuracy, and more and more aircraft remote sensing images are processed and identified with deep neural networks. At present, neural networks such as the multilayer perceptron, the BP neural network, the convolutional neural network (CNN) and the generative adversarial network (GAN) are applied to the identification of aircraft remote sensing images.
However, the existing deep-learning-based aircraft identification methods share the problem that the aircraft is treated as a whole when extracting features, and detailed features are not considered at the component level. The aircraft types distinguished by such methods are few, and the differences between those types are obvious, so the images can be correctly classified by considering only their overall features. In the real world, however, there are many aircraft types (more than 47), and the differences between sub-types are slight, so the aircraft components must be located and examined deeply and carefully to tell them apart. The existing deep-learning-based aircraft identification methods cannot distinguish different aircraft through component differences, so the number of recognisable types is small. For example, the method of "Fu, K.; Dai, W.; Zhang, Y.; Wang, Z.; Yan, M.; Sun, X. MultiCAM: Multiple Class Activation Mapping for Aircraft Recognition in Remote Sensing Images. Remote Sensing 2019, 11, 544" is a deep-learning-based aircraft type identification method that uses class activation maps to locate the position of the aircraft in an image, segments the entire aircraft from the background image, and then inputs the segmented aircraft image into another deep neural network for classification and identification. This method only segments the whole aircraft rather than its individual components, is suited to cases where the differences between aircraft types are large, and achieves classification and identification of only 17 aircraft types.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: aiming at the problem that existing aircraft remote sensing image recognition systems cannot extract features at the component level and can therefore recognize only a small number of categories, an aircraft remote sensing image fine-grained recognition method based on component segmentation and feature fusion is provided, which effectively improves the number of categories and the accuracy of fine-grained aircraft recognition.
The technical scheme of the invention is as follows:
the method comprises the following steps of firstly, constructing an aircraft image fine-grained identification system based on component segmentation and feature fusion, wherein the system is composed of a key point detection subsystem, a shared feature extractor, a component feature generator, a feature fusion subsystem and a loss function module.
The key point detection subsystem is connected with the shared feature extractor and the component feature generator. It reads the original aircraft remote sensing image from the constructed data set, detects 6 key points in the image to obtain their coordinate values, and sends the coordinate values to the shared feature extractor and the component feature generator. The key point detection subsystem is a neural network adopting the structure described in "Xiao, B.; Wu, H.; Wei, Y. Simple Baselines for Human Pose Estimation and Tracking. Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 466-481" (the network is mainly formed by connecting ResNet50 with three deconvolution layers in sequence, and can detect key points in an image to obtain their coordinates). The 6 key points are the nose K1, the junction K2 of the fuselage and the left wing leading edge, the junction K3 of the fuselage and the right wing leading edge, the left wing distal end K4, the right wing distal end K5, and the tail midpoint K6. The fuselage length H_obj can be obtained from K1 and K6, the fuselage width W_backbone from K2 and K3, and the aircraft width W_obj from K4 and K5. Therefore the component frame P1 of the fuselage backbone can be delimited by K1, K2, K3 and K6; a rectangular frame with K2 and K4 as vertices delimits the left wing component frame P2; a rectangular frame with K3 and K5 as vertices delimits the right wing component frame P3; and the tail component frame P4 is a rectangular frame constructed with K6 as a reference point, its length being half the fuselage length and its width half the aircraft width.
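For illustration only, a minimal sketch of such a key point detection network is given below. It assumes PyTorch and torchvision and follows the Simple Baselines layout named above (ResNet50 followed by three deconvolution layers and a 1 × 1 heatmap head); the class name, layer sizes and the heatmap-argmax decoding are illustrative assumptions, not details taken from the patent.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class KeypointDetector(nn.Module):
    """Sketch of a Simple-Baselines-style detector for the 6 aircraft key points."""
    def __init__(self, num_keypoints=6):
        super().__init__()
        backbone = models.resnet50(weights=None)
        # Keep ResNet50 up to its last residual stage (drop avgpool and fc).
        self.backbone = nn.Sequential(*list(backbone.children())[:-2])
        layers, in_ch = [], 2048
        for _ in range(3):  # three deconvolution layers, each doubling the resolution
            layers += [nn.ConvTranspose2d(in_ch, 256, kernel_size=4, stride=2, padding=1),
                       nn.BatchNorm2d(256), nn.ReLU(inplace=True)]
            in_ch = 256
        self.deconv = nn.Sequential(*layers)
        self.head = nn.Conv2d(256, num_keypoints, kernel_size=1)  # one heatmap per key point

    def forward(self, x):
        heatmaps = self.head(self.deconv(self.backbone(x)))   # (B, 6, 56, 56) for 224x224 input
        b, k, h, w = heatmaps.shape
        flat = heatmaps.view(b, k, -1).argmax(dim=-1)          # peak of each heatmap
        coords = torch.stack([flat % w, torch.div(flat, w, rounding_mode="floor")], dim=-1)
        return heatmaps, coords                                # coords: (B, 6, 2) as (x, y)

# Usage: heatmaps, keypoints = KeypointDetector()(torch.rand(1, 3, 224, 224))
```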
The shared feature extractor is connected with the key point detection subsystem and the component feature generator, adopts a feature pyramid structure, takes ResNet50 as a backbone network, takes an original remote sensing image and key point coordinates output by the key point detection subsystem as input, performs upward correction on the original remote sensing image, performs feature extraction on the upward corrected original remote sensing image, and outputs an extracted feature map to the component feature generator.
The part feature generator is connected with the key point detection subsystem, the shared feature extractor and the feature fusion subsystem, receives 6 key point coordinates from the key point detection subsystem, receives an extracted feature map from the shared feature extractor, generates four part frames of the aircraft by using the 6 key point coordinates, wherein the four part frames are respectively a fuselage P1, a left wing P2, a right wing P3 and an empennage P4, maps the four part frames to the extracted feature map, obtains feature sub-maps corresponding to 4 parts of the aircraft, and are respectively named as T1, T2, T3 and T4, and outputs T1, T2, T3 and T4 to the feature fusion subsystem.
The feature fusion subsystem is connected with the component feature generator and the loss function module, and comprises 4 component feature fully connected layers PFC1, PFC2, PFC3 and PFC4, a combined fully connected layer CFC, a fully connected layer FC, and a Softmax layer. PFC1, PFC2, PFC3 and PFC4 are connected with CFC; PFC1 takes feature sub-map T1 as input, PFC2 takes T2, PFC3 takes T3 and PFC4 takes T4; PFC1, PFC2, PFC3 and PFC4 respectively extract the internal features of T1, T2, T3 and T4 to obtain 4 feature maps carrying internal features, named TE1, TE2, TE3 and TE4, which are output to CFC. CFC is connected with PFC1, PFC2, PFC3, PFC4 and the FC layer, and superimposes TE1, TE2, TE3 and TE4 into a new feature map TE5. The FC layer is connected with the CFC and Softmax layers, receives TE5 from CFC, performs feature fusion on TE5 to generate a feature vector V, and outputs V to the Softmax layer. The Softmax layer is connected with the FC layer, receives the fused feature vector V from FC, and performs a probability calculation on V to obtain the predicted probabilities of the image over the various aircraft types, i.e. the predicted class probabilities. When training the aircraft remote sensing image fine-grained identification system based on component segmentation and feature fusion, the predicted class probabilities are sent to the loss function module, and when the cross-entropy loss value returned by the loss function module is received, a back propagation algorithm is used to adjust the network parameters of the system, so that the system is gradually tuned to its optimal working state. During actual identification, the type corresponding to the maximum value among the predicted class probabilities is selected as the aircraft type identified by the system;
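The fusion path (PFC1 to PFC4, CFC, FC, Softmax) can be pictured with the following minimal PyTorch sketch. The patent describes PFC and FC as fully connected layers while keeping the 7 × 7 × 128 shape of the component sub-maps, so the sketch approximates each PFC with a 1 × 1 convolution and FC with a flatten-plus-linear layer; these are interpretive assumptions, not the claimed structure itself.

```python
import torch
import torch.nn as nn

class FeatureFusion(nn.Module):
    """Sketch of the feature fusion subsystem: 4 per-component layers, CFC, FC, Softmax."""
    def __init__(self, num_classes=47, channels=128):
        super().__init__()
        # PFC1..PFC4: one layer per component (fuselage, left wing, right wing, tail).
        self.pfc = nn.ModuleList(nn.Conv2d(channels, channels, kernel_size=1) for _ in range(4))
        self.fc = nn.Linear(4 * channels * 7 * 7, num_classes)  # produces the vector V of length Q
        self.softmax = nn.Softmax(dim=1)

    def forward(self, t1, t2, t3, t4):
        te = [pfc(t) for pfc, t in zip(self.pfc, (t1, t2, t3, t4))]  # TE1..TE4
        te5 = torch.cat(te, dim=1)      # CFC: superimpose along the channel dimension -> 7x7x512
        v = self.fc(te5.flatten(1))     # FC: fused feature vector V
        return self.softmax(v)          # predicted class probabilities

# Usage: probs = FeatureFusion()(*[torch.rand(1, 128, 7, 7) for _ in range(4)])
```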
the loss function module is connected with the feature fusion subsystem, receives the prediction category probability of the images from the feature fusion subsystem when the aircraft remote sensing image fine-grained identification system based on component segmentation and feature fusion is trained, obtains the real category probability corresponding to the original remote sensing image from the data set, calculates the cross entropy loss value between the prediction category probability and the real category probability, and returns the calculated cross entropy loss value to the feature fusion subsystem.
Secondly, constructing a data set, the method being: 2.1 A data set containing remote sensing images of Q types of aircraft is acquired using Google Earth software (version 7.3.0 or later), with 100 images acquired per aircraft type, and each image is scaled to 224 × 224 × 3 (in pixels; the first number is the image length, the second the image width, and the third the number of channels). Q is the number of aircraft categories, a positive integer with Q ≥ 47.
2.2 The aircraft remote sensing image data set acquired in step 2.1 is divided: α% of the images form the training set and β% form the test set, ensuring that images in the test set never appear in the training set, where α + β = 100 and α > β.
2.3 Data augmentation is applied to each image in the test and training sets: the image is rotated every 20 degrees to form a new image, generating 18 augmented images per image, so that the test and training sets together contain Q × 100 × 18 images.
2.4 Numbers 0 to Q-1 are used as the class codes of the Q aircraft types. The true class probability of each augmented image is labelled with a one-hot code (Q bits of which exactly one is 1 and the rest are 0; if the q-th bit is 1, the augmented image belongs to class q), the class codes and the corresponding one-hot codes are stored in a text file, and the 6 true key points of each image are labelled with labelme (image annotation software developed by the Computer Science and Artificial Intelligence Laboratory (CSAIL) of the Massachusetts Institute of Technology (MIT), version 3.11.2 or later).
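The labelling and augmentation conventions of the second step can be illustrated with the short sketch below; it assumes PIL for the 20-degree rotations, and the file name in the usage comment is hypothetical.

```python
import numpy as np
from PIL import Image

Q = 47  # number of aircraft classes

def one_hot(class_code, num_classes=Q):
    """True class probability: Q bits, only the bit at class_code is 1."""
    vec = np.zeros(num_classes, dtype=np.float32)
    vec[class_code] = 1.0
    return vec

def augment_by_rotation(img, step_deg=20):
    """Rotate the image every 20 degrees, yielding 18 augmented images."""
    return [img.rotate(angle) for angle in range(0, 360, step_deg)]

# Usage (hypothetical file name):
# img = Image.open("class03_0001.png").resize((224, 224))
# samples = [(aug, one_hot(3)) for aug in augment_by_rotation(img)]
```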
And thirdly, training a key point detection subsystem.
3.1 Setting training parameters: the optimization algorithm is the Adam algorithm ("Kingma, D. P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980, 2014"), the loss function is the mean square error loss, the number of samples selected in one training step (the batch size) is 32, and the number of training rounds is 140 (by observing the loss function, it essentially stops changing after 140 rounds, so 140 rounds are used). The initial learning rate is 0.001.
3.2 Initialize the training round variable N = 1.
3.3 The key point detection subsystem reads 32 images from the training set in sequence as a mini-batch and performs key point detection on the 32 images (the key point detection method is described in "Xiao, B.; Wu, H.; Wei, Y. Simple Baselines for Human Pose Estimation and Tracking. Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 466-481"), obtaining the 6 predicted key points of each of the 32 images.
3.4 The predicted key points of the 32 images are compared with the true key points, the mean square error loss between them is calculated, and the network parameters are then updated with the back propagation algorithm ("Rumelhart, D. E.; Hinton, G. E.; Williams, R. J. Learning representations by back-propagating errors. Nature, 323, 533-536, 1986") according to the set learning rate.
3.5 Let N = N + 1. If N ≤ 100, go to 3.3; if 100 < N ≤ 120, set the learning rate to 0.0001 and go to 3.3; if 120 < N ≤ 140, set the learning rate to 0.00001 and go to 3.3; if N > 140, the training is finished, go to the fourth step.
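The training schedule of steps 3.1 to 3.5 can be summarised by the sketch below. It assumes PyTorch, that the mean square error is computed on predicted heatmaps, and that a data loader yielding mini-batches of 32 images with their target heatmaps is available; none of these implementation details are prescribed by the patent.

```python
import torch
import torch.nn as nn

def train_keypoint_detector(model, train_loader, device="cpu"):
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam, initial learning rate 0.001
    criterion = nn.MSELoss()                                   # mean square error loss
    model.to(device).train()
    for epoch in range(1, 141):                                # 140 training rounds
        # Learning-rate schedule of step 3.5: 0.001 -> 0.0001 after 100 -> 0.00001 after 120.
        lr = 1e-3 if epoch <= 100 else (1e-4 if epoch <= 120 else 1e-5)
        for group in optimizer.param_groups:
            group["lr"] = lr
        for images, target_heatmaps in train_loader:           # mini-batches of 32 images
            optimizer.zero_grad()
            pred_heatmaps, _ = model(images.to(device))
            loss = criterion(pred_heatmaps, target_heatmaps.to(device))
            loss.backward()                                    # back propagation
            optimizer.step()
```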
And fourthly, training the shared feature extractor, the component feature generator and the feature fusion subsystem as a whole.
4.1 Setting training parameters: the optimization algorithm is the Adam algorithm, the loss function is the cross entropy (CE) loss, the number of training rounds is 140, and the initial learning rate is 0.001.
4.2 Initialize the training round variable N = 1.
4.3 The shared feature extractor reads 32 images from the training set in sequence and rectifies each image upward. The rectification method is: a straight line, named L1, is drawn between the true key points K1 and K6 of the image, and the image is rotated so that L1 is perpendicular to the X axis (the X axis of a Cartesian coordinate system), which completes the upward rectification. Subsequent steps operate on the upward-rectified aircraft remote sensing image.
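The upward rectification of step 4.3 amounts to rotating the image until the nose-to-tail line is vertical. A small sketch is given below; it assumes PIL, rotation about the image centre, and image coordinates with the Y axis pointing downwards, so the sign of the rotation may need to be flipped for other conventions.

```python
import math
from PIL import Image

def rectify_upward(img, k1, k6):
    """Rotate img so the line L1 from the nose K1 to the tail midpoint K6 becomes vertical."""
    dx, dy = k6[0] - k1[0], k6[1] - k1[1]
    # Angle of L1 measured from the vertical axis, in degrees.
    angle_from_vertical = math.degrees(math.atan2(dx, dy))
    # PIL rotates counter-clockwise as displayed; with Y growing downward, rotating by
    # the negative angle brings L1 onto the vertical axis (flip the sign for Y-up data).
    return img.rotate(-angle_from_vertical)
```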
4.4 The shared feature extractor takes the 32 rectified images as a mini-batch; let these 32 images be I_1, …, I_n, …, I_32, where 1 ≤ n ≤ 32. For each image a feature map of size 56 × 56 × 128 is generated; let the 32 feature maps be TF_1, …, TF_n, …, TF_32.
4.5 let the variable n be 1.
4.6 The component feature generator takes the coordinates of the 6 true key points of image I_n as a reference and generates 4 component frames for I_n. The method is as follows:
4.6.1 According to the 6 key points of I_n, the height and width of the aircraft are estimated: the maximum width W_obj of the aircraft, the length H_obj of the aircraft and the fuselage width W_backbone are calculated by Formula I, Formula II and Formula III:

W_obj = |x_lwing - x_rwing|   (Formula I)

H_obj = |y_nose - y_empennage|   (Formula II)

W_backbone = |x_fuselwing - x_fuserwing|   (Formula III)

where x_lwing and x_rwing are the X-axis coordinates of the left wing distal end K4 and the right wing distal end K5, y_nose and y_empennage are the Y-axis coordinates of the nose K1 and the tail midpoint K6, and x_fuselwing and x_fuserwing are the X-axis coordinates of the junction K2 of the fuselage and the left wing leading edge and the junction K3 of the fuselage and the right wing leading edge, respectively.
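For clarity, Formulas I to III can be written as the following small helper (key points are assumed to be (x, y) pairs in image coordinates; the function name is illustrative).

```python
def aircraft_dimensions(k1, k2, k3, k4, k5, k6):
    """Formulas I-III of step 4.6.1, computed from the six key points."""
    w_obj = abs(k4[0] - k5[0])        # Formula I:   W_obj = |x_lwing - x_rwing|
    h_obj = abs(k1[1] - k6[1])        # Formula II:  H_obj = |y_nose - y_empennage|
    w_backbone = abs(k2[0] - k3[0])   # Formula III: W_backbone = |x_fuselwing - x_fuserwing|
    return w_obj, h_obj, w_backbone
```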
4.6.2 The component frame P1 of the fuselage backbone is generated as follows: key point K1 is shifted upward by 5 pixels and used as the midpoint of the top edge of P1, and a rectangular frame with width 2 × W_backbone and height H_obj is generated as P1. (The width of P1 is set to twice W_backbone to account for engines or other auxiliary structures that may be present on both sides of the fuselage.)
4.6.3 The tail component frame P4 is generated as follows: W_obj/2 and H_obj/2 are taken as the width and height of P4; the X coordinate of key point K6 is taken as the midpoint of P4 along the X axis; with the Y coordinate of K6 as the reference point, the upper boundary of P4 is obtained by moving up by H_obj × 3/8 and the lower boundary by moving down by H_obj × 1/8, which yields P4. (This is done because most of the area of the tail's left and right horizontal stabilizers lies above K6; in this way all tail features can be included without adding more background area.)
4.6.4 The left wing component frame P2 is generated; the procedure distinguishes two cases according to the left wing angle. The method is as follows:
4.6.4.1 Calculate the left wing angle: a line, named L2, is drawn between key points K2 and K4, and the angle θ1 between L2 and L1 is calculated; θ1 is the left wing angle.
4.6.4.2 When the left wing angle is less than or equal to 60 degrees, a rectangular frame is made with the line from K2 to K4 as its diagonal; this is the left wing component frame P2; go to 4.6.5. When the left wing angle is greater than 60 degrees, the difference of the Y coordinates of key points K2 and K4 is taken as the wing height H_Wing1; the Y coordinate of K2 is shifted up by H_Wing1/2 to generate key point K2', and the Y coordinate of K4 is shifted down by H_Wing1/2 to generate key point K4'; a rectangular frame with K2' and K4' as its diagonal is made, which is P2 (in this way the engines and loads under the wing are still captured when the wing is spread horizontally); go to 4.6.5.
4.6.5 The right wing component frame P3 is generated; the procedure distinguishes two cases according to the right wing angle. The method is as follows:
4.6.5.1 Calculate the right wing angle: a line, named L3, is drawn between key points K3 and K5, and the angle θ2 between L3 and L1 is calculated; θ2 is the right wing angle.
4.6.5.2 When the right wing angle is less than or equal to 60 degrees, a rectangular frame is made with the line from K3 to K5 as its diagonal; this is the right wing component frame P3; go to 4.7. When the right wing angle is greater than 60 degrees, the difference of the Y coordinates of key points K3 and K5 is taken as the wing height H_Wing2; the Y coordinate of K3 is shifted up by H_Wing2/2 to generate key point K3', and the Y coordinate of K5 is shifted down by H_Wing2/2 to generate key point K5'; a rectangular frame with K3' and K5' as its diagonal is made, which is P3 (so that the engines and loads under the wing are still captured when the wing is spread horizontally); go to 4.7. (A combined sketch of steps 4.6.2 to 4.6.5 is given below.)
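The combined sketch below restates steps 4.6.2 to 4.6.5 as a single helper. It assumes the upward-rectified image, image coordinates with Y growing downward, boxes given as (left, top, right, bottom), and that the left and right wing angles θ1 and θ2 have already been computed; function and variable names are illustrative only.

```python
def component_frames(k1, k2, k3, k4, k5, k6, w_obj, h_obj, w_backbone, theta1, theta2):
    # P1, fuselage backbone (4.6.2): K1 shifted up 5 pixels is the midpoint of the
    # top edge; width 2*W_backbone, height H_obj.
    top = k1[1] - 5
    p1 = (k1[0] - w_backbone, top, k1[0] + w_backbone, top + h_obj)

    # P4, tail (4.6.3): width W_obj/2, height H_obj/2, centred on K6 along X,
    # extending 3/8*H_obj above and 1/8*H_obj below K6.
    p4 = (k6[0] - w_obj / 4, k6[1] - 3 * h_obj / 8,
          k6[0] + w_obj / 4, k6[1] + h_obj / 8)

    def wing_box(k_root, k_tip, angle_deg):
        if angle_deg <= 60:
            # Rectangle with the root-to-tip line as its diagonal.
            xs, ys = (k_root[0], k_tip[0]), (k_root[1], k_tip[1])
            return (min(xs), min(ys), max(xs), max(ys))
        # Nearly horizontal wing: move the root point up and the tip point down by half
        # the wing height so under-wing engines and loads are still captured.
        h_wing = abs(k_root[1] - k_tip[1])
        root = (k_root[0], k_root[1] - h_wing / 2)
        tip = (k_tip[0], k_tip[1] + h_wing / 2)
        xs, ys = (root[0], tip[0]), (root[1], tip[1])
        return (min(xs), min(ys), max(xs), max(ys))

    p2 = wing_box(k2, k4, theta1)   # left wing (4.6.4)
    p3 = wing_box(k3, k5, theta2)   # right wing (4.6.5)
    return p1, p2, p3, p4
```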
4.7 The positions of the component frames P1, P2, P3 and P4 are mapped onto TF_n at a 4:1 size ratio (the component frames are generated on the original image, whose size is 224 × 224 × 3, while the feature map TF_n is 56 × 56 × 128, so the mapping must be scaled in the length and width dimensions). The feature sub-maps P1', P2', P3' and P4' corresponding to the 4 components are first segmented out, then P1', P2', P3' and P4' are all resized to 7 × 7 × 128, generating feature sub-maps T1, T2, T3 and T4 of equal size;
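The 4:1 mapping and the resize to 7 × 7 of step 4.7 can be sketched as follows; the use of adaptive average pooling for the resize is an assumption, since the patent does not name a specific resampling operator.

```python
import torch
import torch.nn.functional as F

def crop_component_features(feature_map, frame, scale=4, out_size=7):
    """feature_map: (128, 56, 56) tensor; frame: (left, top, right, bottom) on the 224x224 image."""
    h, w = feature_map.shape[-2:]
    left, top, right, bottom = [int(round(v / scale)) for v in frame]  # 4:1 mapping
    left, top = min(max(0, left), w - 1), min(max(0, top), h - 1)      # clamp to the map
    right, bottom = min(max(left + 1, right), w), min(max(top + 1, bottom), h)
    sub = feature_map[:, top:bottom, left:right]                       # feature sub-map Pk'
    return F.adaptive_avg_pool2d(sub.unsqueeze(0), out_size)[0]        # resized to 7x7 -> Tk

# Usage: t1 = crop_component_features(torch.rand(128, 56, 56), (40, 10, 180, 210))
```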
4.8 PFC1 further extracts internal features from T1 to generate feature map TE1; meanwhile PFC2 further extracts internal features from T2 to generate TE2, PFC3 extracts internal features from T3 to generate TE3, and PFC4 extracts internal features from T4 to generate TE4. The sizes of TE1, TE2, TE3 and TE4 are all 7 × 7 × 128.
4.9 The combined fully connected layer CFC takes TE1, TE2, TE3 and TE4 as input, concatenates the four 7 × 7 × 128 maps along the third dimension (the channels) to form a feature map TE5 of size 7 × 7 × 512, and then inputs TE5 into the fully connected layer FC.
4.10 The fully connected layer FC receives the feature map TE5 from the CFC layer, multiplies each pixel in TE5 by a scaling factor (in the range 0 to 1) to realize feature fusion, generates a feature vector V of length Q, and outputs V to the Softmax layer.
4.11 The Softmax layer receives the feature vector V from FC, performs a probability calculation on V to obtain the predicted probability of the image for each aircraft category, and sends the predicted probabilities to the loss function module. The calculation method is as follows:
4.11.1 let variable i equal to 1.
4.11.2 Let the i-th value in V be v_i; the prediction probability s_i corresponding to v_i is calculated by Formula IV:

s_i = e^(v_i) / (e^(v_1) + e^(v_2) + … + e^(v_Q))   (Formula IV)

where e is the natural constant.
4.11.3 If i < Q, let i = i + 1 and go to 4.11.2; otherwise, the predicted class probabilities s_1, …, s_i, …, s_Q of image I_n over the Q aircraft classes have been obtained; send s_1, …, s_Q to the loss function module and go to 4.12.
4.12 The loss function module receives from the Softmax layer the predicted probabilities s_1, …, s_i, …, s_Q of image I_n over the Q aircraft classes, obtains the true class probability of I_n from the data set, and calculates the cross-entropy loss value L_n between the predicted class probabilities and the true class probabilities:

L_n = -( p_1·log(s_1) + p_2·log(s_2) + … + p_Q·log(s_Q) )

where p_i is the true class probability of image I_n for the i-th of the Q aircraft classes.
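Written out in plain NumPy, Formula IV and the cross-entropy loss of step 4.12 read as follows (the max-shift in the softmax is a standard numerical-stability detail added here, not part of the patent text).

```python
import numpy as np

def softmax(v):
    """Formula IV: s_i = e^(v_i) / sum_j e^(v_j)."""
    e = np.exp(v - v.max())          # shifted for numerical stability
    return e / e.sum()

def cross_entropy(p_true, s_pred):
    """L_n = -sum_i p_i * log(s_i), with true class probabilities p and predictions s."""
    return float(-(p_true * np.log(s_pred + 1e-12)).sum())

# Usage: s = softmax(np.random.randn(47)); loss = cross_entropy(np.eye(47)[3], s)
```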
4.13 If n < 32, let n = n + 1 and go to 4.6; otherwise, go to 4.14.
4.14 The total loss function value L = L_1 + … + L_n + … + L_32 is computed, and then the parameters of the shared feature extractor, the component feature generator and the feature fusion subsystem are updated with the back propagation algorithm according to the set learning rate.
4.15 Let N = N + 1. If N ≤ 100, go to 4.3; if 100 < N ≤ 120, set the learning rate to 0.0001 and go to 4.3; if 120 < N ≤ 140, set the learning rate to 0.00001 and go to 4.3; if N > 140, the training is complete.
And fifthly, carrying out fine-grained identification on the aircraft image by using the trained aircraft identification system.
And 5.1, randomly selecting a remote sensing image from the test set to perform fine-grained identification on the aircraft, wherein the selected image is I.
5.2, inputting the image I into a key point detection subsystem, and obtaining the coordinates of 6 key points K1, K2, K3, K4, K5 and K6 of the aircraft in the I.
5.3 shared feature extractor reads image I from the test set, receives coordinates of 6 keypoints K1, K2, K3, K4, K5, K6 from the keypoint detection subsystem, rectifies image I upward using keypoints K1 and K6, and then extracts features of the image, obtaining a feature map I' with pixel size 56 × 56 × 128.
5.4 the part feature generator receives coordinates for 6 keypoints K1, K2, K3, K4, K5, K6 from the keypoint detection subsystem, receives a 56x56x128 feature map I' from the shared feature extractor, and generates 4 aircraft part frames P1, P2, P3, P4. The method comprises the following specific steps:
5.4.1 Calculate W_obj, H_obj and W_backbone of the aircraft through Formula I, Formula II and Formula III.
5.4.2 Generate the fuselage backbone frame P1: the position of key point K1 is shifted upward by 5 pixels and used as the midpoint of the top edge of P1, and a rectangular frame with width 2 × W_backbone and height H_obj is generated as P1.
5.4.3 Generate the tail component frame P4: W_obj/2 and H_obj/2 are taken as the width and height of P4; the X coordinate of key point K6 is taken as the midpoint of P4 along the X axis; with the Y coordinate of K6 as the reference point, move up by H_obj × 3/8 for the upper boundary of P4 and down by H_obj × 1/8 for the lower boundary; this generates P4.
5.4.4 The left wing component frame P2 is generated; the procedure distinguishes two cases according to the left wing angle. The method is as follows:
5.4.4.1 Calculate the left wing angle: a line, named L2, is drawn between key points K2 and K4, and the angle between L2 and L1 is calculated; this is the left wing angle.
5.4.4.2 When the left wing angle is less than or equal to 60 degrees, a rectangular frame is made with the line from K2 to K4 as its diagonal; this is the left wing component frame P2; go to 5.4.5. When the left wing angle is greater than 60 degrees, the difference of the Y coordinates of key points K2 and K4 is taken as the wing height H_Wing1; the Y coordinate of K2 is shifted up by H_Wing1/2 to generate key point K2', and the Y coordinate of K4 is shifted down by H_Wing1/2 to generate key point K4'; a rectangular frame with K2' and K4' as its diagonal is made, which is P2; go to 5.4.5;
5.4.5 The right wing component frame P3 is generated; the procedure distinguishes two cases according to the right wing angle. The method is as follows:
5.4.5.1 Calculate the right wing angle: a line, named L3, is drawn between key points K3 and K5, and the angle between L3 and L1 is calculated; this is the right wing angle.
5.4.5.2 When the right wing angle is less than or equal to 60 degrees, a rectangular frame is made with the line from K3 to K5 as its diagonal; this is the right wing component frame P3; go to 5.5. When the right wing angle is greater than 60 degrees, the difference of the Y coordinates of key points K3 and K5 is taken as the wing height H_Wing2; the Y coordinate of K3 is shifted up by H_Wing2/2 to generate key point K3', and the Y coordinate of K5 is shifted down by H_Wing2/2 to generate key point K5'; a rectangular frame with K3' and K5' as its diagonal is made, which is P3; go to 5.5.
5.5 the part feature generator maps the positions of the part frames P1, P2, P3 and P4 to the feature map I' generated in the step 5.3, and divides feature sub-maps T1, T2, T3 and T4 corresponding to 4 parts;
5.6 PFC1 further extracts internal features from T1 to generate feature map TE1; meanwhile PFC2 further extracts internal features from T2 to generate TE2, PFC3 extracts internal features from T3 to generate TE3, and PFC4 extracts internal features from T4 to generate TE4.
5.7 The combined fully connected layer CFC takes TE1, TE2, TE3 and TE4 as input, concatenates the four 7 × 7 × 128 maps along the third dimension (the channels) to form a feature map TE5 of size 7 × 7 × 512, and then inputs TE5 into the fully connected layer FC.
5.8 The fully connected layer FC receives the feature map TE5 from the CFC layer, multiplies each pixel in TE5 by a scaling factor (in the range 0 to 1) to realize feature fusion, generates a feature vector V of length Q, and outputs V to the Softmax layer.
And 5.9, receiving the output feature vector V from the FC by the softmax layer, and performing probability calculation on the feature vector V to obtain the prediction probability values of the images corresponding to various categories of aircrafts. The calculation method is as follows:
5.9.1 let variable i equal to 1.
5.9.2 Let the i-th value in V be v_i; the prediction probability s_i corresponding to v_i is calculated by Formula IV.
5.9.3 If i < Q, let i = i + 1 and go to 5.9.2; otherwise, go to 5.10.
5.10 Among the Q prediction probabilities s_1, …, s_i, …, s_Q output by the Softmax layer, the maximum probability is selected, and the type corresponding to this maximum probability is taken as the aircraft class identified by the system.
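Step 5.10 is a simple argmax over the Q predicted probabilities, as in the sketch below (the class-name lookup is an assumed convenience, not part of the patent).

```python
import numpy as np

def identify(probabilities, class_names):
    """Return the aircraft type with the highest predicted probability and that probability."""
    idx = int(np.argmax(probabilities))
    return class_names[idx], float(probabilities[idx])

# Usage: identify(softmax_outputs, class_names)  # class_names maps codes 0..Q-1 to type names
```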
According to the method, the 6 key points of the aircraft are obtained by the key point detection subsystem, the image is rectified upward based on the key points, and the feature map of the whole aircraft is obtained by the shared feature extractor. On this basis, the component feature generator generates 4 component frames from the key points, maps the 4 component frames onto the feature map output by the shared feature extractor, and cuts out the feature sub-maps corresponding to each component; the fully connected layers PFC1, PFC2, PFC3 and PFC4 then further extract the detailed features inside each component, the combined fully connected layer CFC and the fully connected layer FC fuse the features, and the result is finally input to the Softmax layer to realize type identification.
The invention can achieve the following beneficial effects:
1. in the first step of the invention, an efficient remote sensing image aircraft fine-grained identification system is constructed through a plurality of neural networks. The key point detection subsystem can realize accurate positioning of 6 key points, the shared feature extractor can perform complete extraction on the overall features of the aircraft, the component feature generator can quickly and accurately obtain feature sub-graphs of 4 components, and the feature fusion subsystem can extract the internal features of all the components and fuse the features together to realize accurate identification.
2. The second step of the invention constructs a high-quality data set containing Q aircraft types with category and key point annotations, and the third and fourth steps use this data set to train the remote sensing image aircraft fine-grained identification system efficiently, with the back propagation algorithm tuning the parameters of each neural network to their optimal values, so that the system can accurately identify aircraft types.
3. In step 5.2 of the invention, the key point detection subsystem is used to accurately locate the 6 key points. Step 5.4.2 sets the width of component frame P1 to 2 × W_backbone, so that P1 can include engines or other auxiliary structures that may be present on both sides of the fuselage. The P4 generation method of step 5.4.3 enables P4 to include all tail features even though most of the area of the tail's left and right horizontal stabilizers lies above key point K6, without adding more background area. The generation methods of component frames P2 and P3 in steps 5.4.4 and 5.4.5 enable P2 and P3 to still capture the engines and loads under the wings when the wings are spread horizontally, further ensuring the accuracy of the identification of the invention.
4. The invention completely extracts the overall features of the aircraft in step 5.3, carefully extracts the internal features of each component in step 5.6, and effectively fuses the internal features of the components in steps 5.7 to 5.8, so that the identification system has the ability to recognize subtle differences.
Drawings
FIG. 1 is a logic structure diagram of an aircraft remote sensing image fine-grained identification system constructed in the first step of the invention;
FIG. 2 is a schematic diagram of the location of key points and component frames of the present invention;
FIG. 3 is a schematic diagram of the generation of a wing component frame at an included angle of 60 degrees or less according to steps 4.6.4 and 4.6.5 of the present invention;
FIG. 4 is a schematic diagram of the generation of a wing component frame at an angle greater than 60 degrees according to steps 4.6.4 and 4.6.5 of the present invention;
fig. 5 is an overall flow chart of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
In the fine-grained identification method for the aircraft remote sensing image based on the deep learning convolutional neural network in the embodiment, the image to be detected is subjected to key point detection, feature extraction and feature fusion through a plurality of subsystems based on the neural network, so that the category of the aircraft in the image to be detected is identified.
Fig. 5 is an overall flow chart of the present invention. As shown in fig. 5, the present invention includes the steps of:
firstly, an aircraft image fine-grained identification system based on component segmentation and feature fusion is constructed, and as shown in fig. 1, the system is composed of a key point detection subsystem, a shared feature extractor, a component feature generator, a feature fusion subsystem and a loss function module.
The key point detection subsystem is connected with the shared feature extractor and the component feature generator, reads the original remote sensing image of the aircraft from the constructed data set, detects 6 key points in the original remote sensing image of the aircraft to obtain coordinate values of the 6 key points, and sends the coordinate values to the shared feature extractor and the component feature generator. The key point detection subsystem is a neural network and can detect key points in the image to obtain coordinates of the key points. As shown in fig. 2, the 6 key points are respectively a nose K1, a fuselage and left wing leading edge junction K2, a fuselage and right wing leading edge junction K3, a left wing distal end K4, a right wing distal end K5, and a tail wing midpoint K6. A part frame P1 of the fuselage backbone can be divided by K1, K2, K3 and K6; taking K2 and K4 as vertexes as rectangular frames, and dividing a left wing part frame P2; taking K3 and K5 as vertexes as rectangular frames, a right wing frame P3 can be divided; the empennage component frame P4 can be divided by taking K6 as a reference point to form a rectangular frame, wherein the length of the rectangular frame is half of the length of a fuselage, and the width of the rectangular frame is half of the width of an airplane.
The shared feature extractor is connected with the key point detection subsystem and the component feature generator, adopts a feature pyramid structure, takes ResNet50 as a backbone network, takes an original remote sensing image and key point coordinates output by the key point detection subsystem as input, performs upward correction on the original remote sensing image, performs feature extraction on the upward corrected original remote sensing image, and outputs an extracted feature map to the component feature generator.
The part feature generator is connected with the key point detection subsystem, the shared feature extractor and the feature fusion subsystem, receives 6 key point coordinates from the key point detection subsystem, receives an extracted feature map from the shared feature extractor, generates four part frames of the aircraft by using the 6 key point coordinates, wherein the four part frames are respectively a fuselage P1, a left wing P2, a right wing P3 and an empennage P4, maps the four part frames to the extracted feature map, obtains feature sub-maps corresponding to 4 parts of the aircraft, and are respectively named as T1, T2, T3 and T4, and outputs T1, T2, T3 and T4 to the feature fusion subsystem.
The feature fusion subsystem is connected with the component feature generator and the loss function module, and comprises 4 component feature fully connected layers PFC1, PFC2, PFC3 and PFC4, a combined fully connected layer CFC, a fully connected layer FC, and a Softmax layer. PFC1, PFC2, PFC3 and PFC4 are connected with CFC; PFC1 takes feature sub-map T1 as input, PFC2 takes T2, PFC3 takes T3 and PFC4 takes T4; PFC1, PFC2, PFC3 and PFC4 respectively extract the internal features of T1, T2, T3 and T4 to obtain 4 feature maps carrying internal features, named TE1, TE2, TE3 and TE4, which are output to CFC. CFC is connected with PFC1, PFC2, PFC3, PFC4 and the FC layer, and superimposes TE1, TE2, TE3 and TE4 into a new feature map TE5. The FC layer is connected with the CFC and Softmax layers, receives TE5 from CFC, performs feature fusion on TE5 to generate a feature vector V, and outputs V to the Softmax layer. The Softmax layer is connected with the FC layer, receives the fused feature vector V from FC, and performs a probability calculation on V to obtain the predicted probabilities of the image over the various aircraft types, i.e. the predicted class probabilities. When training the aircraft remote sensing image fine-grained identification system based on component segmentation and feature fusion, the predicted class probabilities are sent to the loss function module, and when the cross-entropy loss value returned by the loss function module is received, a back propagation algorithm is used to adjust the network parameters of the system, so that the system is gradually tuned to its optimal working state. During actual identification, the type corresponding to the maximum value among the predicted class probabilities is selected as the aircraft type identified by the system;
the loss function module is connected with the feature fusion subsystem, receives the prediction category probability of the images from the feature fusion subsystem when the aircraft remote sensing image fine-grained identification system based on component segmentation and feature fusion is trained, obtains the real category probability corresponding to the original remote sensing image from the data set, calculates the cross entropy loss value between the prediction category probability and the real category probability, and returns the calculated cross entropy loss value to the feature fusion subsystem.
And secondly, constructing a data set.
2.1 A data set containing remote sensing images of Q types of aircraft (Q = 47) was acquired using Google Earth software (version 7.3.0 or later), with 100 images acquired per aircraft type, and each image scaled to 224 × 224 × 3.
2.2 The aircraft remote sensing image data set acquired in step 2.1 is divided: 60% (α = 60) of the images form the training set and 40% (β = 40) form the test set, ensuring that the images in the test set never appear in the training set.
2.3 Data augmentation is applied to each image in the test and training sets: the image is rotated every 20 degrees to form a new image, giving 18 augmented images per image, so that the test and training sets together contain 47 × 100 × 18 images.
2.4 Numbers 0 to Q-1 are used as the class codes of the Q aircraft types, the true class probability of each augmented image is labelled, and the 6 true key points of each image are labelled with labelme (version 3.11.2 or later).
And thirdly, training a key point detection subsystem.
3.1 setting training parameters: the optimization algorithm is selected to be Adam optimization algorithm, the loss function is mean square error loss, the number of samples selected in one training, namely the batch size, is 32, and the training times are 140 rounds. The initial learning rate is 0.001.
3.2 Initialize the training round variable N = 1.
3.3 the key point detection subsystem reads 32 images from the training set in turn as a small batch, namely mini-batch, and the key point detection subsystem performs key point detection on the 32 images to obtain 6 predicted key points of the 32 images.
And 3.4, comparing the predicted key points and the real key points of the 32 images, calculating the mean square error loss between the predicted key points and the real key points, and then updating network parameters by adopting a back propagation algorithm according to the set learning rate.
3.5 Let N = N + 1. If N ≤ 100, go to 3.3; if 100 < N ≤ 120, set the learning rate to 0.0001 and go to 3.3; if 120 < N ≤ 140, set the learning rate to 0.00001 and go to 3.3; if N > 140, the training is finished, go to the fourth step.
And fourthly, training the shared feature extractor, the component feature generator and the feature fusion subsystem as a whole.
4.1 Setting training parameters: the optimization algorithm is the Adam algorithm, the loss function is the cross entropy (CE) loss, the number of training rounds is 140, and the initial learning rate is 0.001.
4.2 Initialize the training round variable N = 1.
4.3 The shared feature extractor reads 32 images from the training set in sequence and rectifies each image upward. The rectification method is: a straight line, named L1, is drawn between the true key points K1 and K6 of the image, and the image is rotated so that L1 is perpendicular to the X axis, which completes the upward rectification. Subsequent steps operate on the upward-rectified aircraft remote sensing image.
4.4 The shared feature extractor takes the 32 rectified images as a mini-batch; let these 32 images be I_1, …, I_n, …, I_32, where 1 ≤ n ≤ 32. For each image a feature map of size 56 × 56 × 128 is generated; let the 32 feature maps be TF_1, …, TF_n, …, TF_32.
4.5 let the variable n be 1.
4.6 The component feature generator takes the coordinates of the 6 true key points of image I_n as a reference and generates 4 component frames for I_n. The method is as follows:
4.6.1 According to the 6 key points of I_n, the height and width of the aircraft are estimated: the maximum width W_obj of the aircraft, the length H_obj of the aircraft and the fuselage width W_backbone are calculated by Formula I, Formula II and Formula III:

W_obj = |x_lwing - x_rwing|   (Formula I)

H_obj = |y_nose - y_empennage|   (Formula II)

W_backbone = |x_fuselwing - x_fuserwing|   (Formula III)

where x_lwing and x_rwing are the X-axis coordinates of the left wing distal end K4 and the right wing distal end K5, y_nose and y_empennage are the Y-axis coordinates of the nose K1 and the tail midpoint K6, and x_fuselwing and x_fuserwing are the X-axis coordinates of the junction K2 of the fuselage and the left wing leading edge and the junction K3 of the fuselage and the right wing leading edge, respectively.
4.6.2 The component frame P1 of the fuselage backbone is generated as follows: key point K1 is shifted upward by 5 pixels and used as the midpoint of the top edge of P1, and a rectangular frame with width 2 × W_backbone and height H_obj is generated as P1.
4.6.3 The tail component frame P4 is generated as follows: W_obj/2 and H_obj/2 are taken as the width and height of P4; the X coordinate of key point K6 is taken as the midpoint of P4 along the X axis; with the Y coordinate of K6 as the reference point, the upper boundary of P4 is obtained by moving up by H_obj × 3/8 and the lower boundary by moving down by H_obj × 1/8, which yields P4.
4.6.4 The left wing component frame P2 is generated; the procedure distinguishes two cases according to the left wing angle. The method is as follows:
4.6.4.1 Calculate the left wing angle. As shown in FIG. 3, a line, named L2, is drawn between key points K2 and K4, and the angle θ1 between L2 and L1 is calculated; θ1 is the left wing angle.
4.6.4.2 As shown in FIG. 3, when the left wing angle is less than or equal to 60 degrees, a rectangular frame is made with the line from K2 to K4 as its diagonal; this is the left wing component frame P2; go to 4.6.5. As shown in FIG. 4, when the left wing angle is greater than 60 degrees, the difference of the Y coordinates of key points K2 and K4 is taken as the wing height H_Wing1; the Y coordinate of K2 is shifted up by H_Wing1/2 to generate key point K2', and the Y coordinate of K4 is shifted down by H_Wing1/2 to generate key point K4'; a rectangular frame with K2' and K4' as its diagonal, i.e. P2, is made; go to 4.6.5.
4.6.5 The right wing component frame P3 is generated; the procedure distinguishes two cases according to the right wing angle. The method is as follows:
4.6.5.1 Calculate the right wing angle. As shown in FIG. 3, a line, named L3, is drawn between key points K3 and K5, and the angle θ2 between L3 and L1 is calculated; θ2 is the right wing angle.
4.6.5.2 As shown in FIG. 3, when the right wing angle is less than or equal to 60 degrees, a rectangular frame is made with the line from K3 to K5 as its diagonal; this is the right wing component frame P3; go to 4.7. As shown in FIG. 4, when the right wing angle is greater than 60 degrees, the difference of the Y coordinates of key points K3 and K5 is taken as the wing height H_Wing2; the Y coordinate of K3 is shifted up by H_Wing2/2 to generate key point K3', and the Y coordinate of K5 is shifted down by H_Wing2/2 to generate key point K5'; a rectangular frame with K3' and K5' as its diagonal, i.e. P3, is made; go to 4.7.
4.7 The positions of the component frames P1, P2, P3 and P4 are mapped onto TF_n at a 4:1 size ratio; the feature sub-maps P1', P2', P3' and P4' corresponding to the 4 components are first segmented out, then P1', P2', P3' and P4' are all resized to 7 × 7 × 128, generating feature sub-maps T1, T2, T3 and T4 of equal size;
4.8 PFC1 further extracts internal features from T1 to generate feature map TE1; meanwhile PFC2 further extracts internal features from T2 to generate TE2, PFC3 extracts internal features from T3 to generate TE3, and PFC4 extracts internal features from T4 to generate TE4. The sizes of TE1, TE2, TE3 and TE4 are all 7 × 7 × 128.
4.9 The combined fully connected layer CFC takes TE1, TE2, TE3 and TE4 as input, concatenates the four 7 × 7 × 128 maps along the third dimension (the channels) to form a feature map TE5 of size 7 × 7 × 512, and then inputs TE5 into the fully connected layer FC.
4.10 The fully connected layer FC receives the feature map TE5 from the CFC layer, multiplies each pixel in TE5 by a scaling factor (in the range 0 to 1) to realize feature fusion, generates a feature vector V of length 47, and outputs V to the Softmax layer.
4.11 The Softmax layer receives the feature vector V from FC, performs a probability calculation on V to obtain the predicted probability of the image for each aircraft category, and sends the predicted probabilities to the loss function module. The calculation method is as follows:
4.11.1 let variable i equal to 1.
4.11.2 Let the i-th value in V be v_i; the prediction probability s_i corresponding to v_i is calculated by Formula IV:

s_i = e^(v_i) / (e^(v_1) + e^(v_2) + … + e^(v_Q))   (Formula IV)

where e is the natural constant.
4.11.3 If i < Q, let i = i + 1 and go to 4.11.2; otherwise, the predicted class probabilities s_1, …, s_i, …, s_Q of image I_n over the Q aircraft classes have been obtained; send s_1, …, s_Q to the loss function module and go to 4.12.
4.12 The loss function module receives from the Softmax layer the predicted probabilities s_1, …, s_i, …, s_Q of image I_n over the Q aircraft classes, obtains the true class probability of I_n from the data set, and calculates the cross-entropy loss value L_n between the predicted class probabilities and the true class probabilities:

L_n = -( p_1·log(s_1) + p_2·log(s_2) + … + p_Q·log(s_Q) )

where p_i is the true class probability of image I_n for the i-th of the Q aircraft classes.
4.13 If n < 32, let n = n + 1 and go to 4.6; otherwise, go to 4.14.
4.14 The total loss function value L = L_1 + … + L_n + … + L_32 is computed, and then the parameters of the shared feature extractor, the component feature generator and the feature fusion subsystem are updated with the back propagation algorithm according to the set learning rate.
4.15 Let N = N + 1. If N is less than or equal to 100, go to 4.3; if N is more than 100 and less than or equal to 120, set the learning rate to 0.0001 and go to 4.3; if N is more than 120 and less than or equal to 140, set the learning rate to 0.00001 and go to 4.3; if 140 < N, training is complete.
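The learning-rate schedule of step 4.15 (and of the key point subsystem training in the claims) is a simple step decay over 140 rounds. A direct transcription of the stated schedule follows; the function name is illustrative.

def learning_rate_for_round(n):
    # 140 training rounds in total; the Adam learning rate is stepped down twice.
    if n <= 100:
        return 0.001      # initial learning rate
    elif n <= 120:
        return 0.0001
    else:                 # 120 < n <= 140
        return 0.00001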
Fifthly, fine-grained identification is carried out on the aircraft image by using the trained aircraft identification system.
5.1 A remote sensing image is randomly selected from the test set for fine-grained aircraft identification; let the selected image be I.
5.2 The image I is input into the key point detection subsystem to obtain the coordinates of the 6 key points K1, K2, K3, K4, K5 and K6 of the aircraft in I.
5.3 The shared feature extractor reads image I from the test set, receives the coordinates of the 6 key points K1, K2, K3, K4, K5 and K6 from the key point detection subsystem, corrects image I upward using key points K1 and K6, and then extracts features of the image, obtaining a feature map I' of size 56 × 56 × 128.
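The upward correction used here and in step 4.3 rotates the image so that the nose-to-tail line L1 (from K1 to K6) becomes vertical. A minimal sketch with Pillow is given below; the sign and coordinate conventions (image Y axis growing downwards, counter-clockwise positive rotation in PIL) are assumptions, since the patent only states the goal that L1 end up perpendicular to the X axis.

import math
from PIL import Image

def rectify_upward(img, k1, k6):
    # k1, k6: (x, y) pixel coordinates of the nose and the tail midpoint.
    (x1, y1), (x6, y6) = k1, k6
    # Angle of the nose-to-tail vector relative to the downward vertical axis.
    angle = math.degrees(math.atan2(x6 - x1, y6 - y1))
    # Rotate the image content by the opposite angle so that L1 becomes vertical.
    return img.rotate(-angle)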
5.4 The component feature generator receives the coordinates of the 6 key points K1, K2, K3, K4, K5 and K6 from the key point detection subsystem, receives the 56 × 56 × 128 feature map I' from the shared feature extractor, and generates the 4 aircraft part frames P1, P2, P3 and P4. The specific steps are as follows:
5.4.1 Calculate the aircraft's W_obj, H_obj and W_backbone by formula I, formula II and formula III.
5.4.2 The fuselage backbone frame P1 is generated by shifting the position of key point K1 upward by 5 pixels to serve as the coordinate of the midpoint of the top edge of P1, taking 2 × W_backbone as the width of P1 and H_obj as the height of P1, and generating a rectangular box.
5.4.3 The tail part frame P4 is generated by taking W_obj/2 and H_obj/2 as the width and height of P4, taking the X coordinate of key point K6 as the midpoint of P4 on the X axis, and, with the Y coordinate of K6 as the reference point, moving up by H_obj × 3/8 to form the upper boundary of P4 and down by H_obj × 1/8 to form the lower boundary of P4, generating P4.
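Formulas I to III of claim 9 and the P1/P4 rules of 5.4.2/5.4.3 translate directly into a few lines of code. The sketch below assumes (x, y) pixel coordinates after upward correction with the image Y axis growing downwards; the helper name and the box format are illustrative.

def part_frames_fuselage_tail(k1, k2, k3, k4, k5, k6):
    w_obj = abs(k4[0] - k5[0])        # formula I: max width from the wing far ends
    h_obj = abs(k1[1] - k6[1])        # formula II: length from nose to tail midpoint
    w_backbone = abs(k2[0] - k3[0])   # formula III: fuselage width

    # P1 (fuselage backbone): K1 moved up 5 px is the midpoint of the top edge,
    # width 2 * w_backbone, height h_obj.
    top_x, top_y = k1[0], k1[1] - 5
    p1 = (top_x - w_backbone, top_y, top_x + w_backbone, top_y + h_obj)

    # P4 (tail): width w_obj / 2 centred on K6's x coordinate, extending
    # h_obj * 3/8 above and h_obj / 8 below K6's y coordinate (height h_obj / 2).
    p4 = (k6[0] - w_obj / 4, k6[1] - h_obj * 3 / 8,
          k6[0] + w_obj / 4, k6[1] + h_obj / 8)
    return p1, p4, (w_obj, h_obj, w_backbone)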
5.4.4 The left wing component frame P2 is generated by a method that handles two cases according to the left wing included angle. The steps are as follows:
5.4.4.1 Calculate the left wing included angle: a connecting line L2 is drawn between key points K2 and K4, and the included angle between L2 and L1 is calculated; this is the left wing included angle.
5.4.4.2 When the left wing included angle is less than or equal to 60 degrees, a rectangular frame is made with the line connecting K2 and K4 as its diagonal, namely the left wing component frame P2, and the method goes to 5.4.5. When the left wing included angle is larger than 60 degrees, the difference between the Y coordinates of key points K2 and K4 is taken as the wing height H_Wing1, the Y coordinate of K2 is shifted up by H_Wing1/2 to generate key point K2', the Y coordinate of K4 is shifted down by H_Wing1/2 to generate key point K4', a rectangular frame is made with K2' and K4' as its diagonal, namely P2, and the method goes to 5.4.5;
5.4.5 The right wing component frame P3 is generated by a method that handles two cases according to the right wing included angle. The steps are as follows:
5.4.5.1 Calculate the right wing included angle: a connecting line L3 is drawn between key points K3 and K5, and the included angle between L3 and L1 is calculated; this is the right wing included angle.
5.4.5.2 When the right wing included angle is less than or equal to 60 degrees, a rectangular frame is made with the line connecting K3 and K5 as its diagonal, namely the right wing component frame P3, and the method goes to 5.5. When the right wing included angle is larger than 60 degrees, the difference between the Y coordinates of key points K3 and K5 is taken as the wing height H_Wing2, the Y coordinate of K3 is shifted up by H_Wing2/2 to generate key point K3', the Y coordinate of K5 is shifted down by H_Wing2/2 to generate key point K5', a rectangular frame is made with K3' and K5' as its diagonal, namely P3, and the method goes to 5.5.
5.5 The component feature generator maps the positions of the part frames P1, P2, P3 and P4 onto the feature map I' generated in step 5.3, and segments out the feature sub-maps T1, T2, T3 and T4 corresponding to the 4 parts;
5.6 the PFC1 further extracts internal features from the T1 to generate a feature map TE 1; meanwhile, the PFC2 further extracts internal features from the T2 to generate a feature map TE 2; meanwhile, the PFC3 further extracts internal features from the T3 to generate a feature map TE 3; meanwhile, the PFC4 further extracts internal features from the T4, and generates a feature map TE 4.
5.7 The combined fully-connected layer CFC takes TE1, TE2, TE3 and TE4 as input, concatenates the four 7 × 7 × 128 maps along the third dimension (channel) to form a feature map TE5 of size 7 × 7 × 512, and then inputs TE5 into the fully-connected layer FC.
5.8 The fully-connected layer FC receives the feature map TE5 from the CFC layer, multiplies each pixel in TE5 by a scale factor (with a value range of 0 to 1) to realize feature fusion, generates a feature vector V of length Q, and outputs the generated feature vector V to the softmax layer.
5.9 The softmax layer receives the feature vector V output by FC and performs probability calculation on V to obtain the prediction probability of the image for each aircraft category. The calculation method is as follows:
5.9.1 let variable i equal to 1.
5.9.2 Let the i-th value in V be v_i, and calculate the prediction probability s_i corresponding to v_i by formula IV.
5.9.3 If i < Q, let i = i + 1 and go to 5.9.2; otherwise, go to 5.10.
5.10 Among the Q prediction probabilities s_1, …, s_i, …, s_Q output by the softmax layer, the maximum probability is selected, and the category corresponding to the maximum probability is taken as the aircraft type identified by the system.
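Steps 5.9 and 5.10 together are a softmax followed by an argmax. A small NumPy sketch is given below; class_names is a hypothetical list of the Q fine-grained aircraft type labels, not something defined in the patent.

import numpy as np

def identify_aircraft(v, class_names):
    # v: length-Q feature vector V produced by the FC layer.
    s = np.exp(v) / np.exp(v).sum()   # formula IV
    idx = int(np.argmax(s))           # step 5.10: pick the most probable category
    return class_names[idx], float(s[idx])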
The aircraft remote sensing image fine-grained identification system and method based on component segmentation and feature fusion have been described in detail above. A specific example is used in this description to explain the principle and embodiments of the invention, and the description of the embodiment is only intended to help understand the method and its core idea. Meanwhile, for a person skilled in the art, there may be variations in the specific embodiments and the application scope according to the idea of the present invention. In summary, the content of this specification should not be construed as a limitation of the present invention.

Claims (10)

1. An aircraft image fine-grained identification method based on component segmentation and feature fusion is characterized by comprising the following steps:
the method comprises the following steps that firstly, an aircraft remote sensing image fine-grained identification system based on component segmentation and feature fusion is built, and the system is composed of a key point detection subsystem, a shared feature extractor, a component feature generator, a feature fusion subsystem and a loss function module;
the key point detection subsystem is connected with the shared feature extractor and the component feature generator, reads an original remote sensing image of the aircraft from the data set, detects 6 key points in the original remote sensing image of the aircraft to obtain coordinate values of the 6 key points, and sends the coordinate values to the shared feature extractor and the component feature generator; the key point detection subsystem is a neural network and is used for detecting key points in the image to obtain coordinates of the key points; the 6 key points are respectively a nose K1, a junction K2 of a fuselage and a left wing leading edge, a junction K3 of the fuselage and a right wing leading edge, a left wing far end K4, a right wing far end K5 and a tail wing midpoint K6;
the shared characteristic extractor is connected with the key point detection subsystem and the component characteristic generator, takes the original remote sensing image of the aircraft and the key point coordinates output by the key point detection subsystem as input, upwards corrects the original remote sensing image, extracts the characteristics of the upwards corrected original remote sensing image, and outputs the extracted characteristic graph to the component characteristic generator;
the part feature generator is connected with the key point detection subsystem, the shared feature extractor and the feature fusion subsystem, receives 6 key point coordinates from the key point detection subsystem, receives an extracted feature map from the shared feature extractor, generates four part frames of the aircraft by using the 6 key point coordinates, wherein the four part frames are respectively a fuselage P1, a left wing P2, a right wing P3 and an empennage P4, maps the four part frames to the extracted feature map, obtains feature sub-images corresponding to 4 parts of the aircraft, are respectively named as T1, T2, T3 and T4, and outputs T1, T2, T3 and T4 to the feature fusion subsystem;
the feature fusion subsystem is connected with the component feature generator and the loss function module and comprises 4 component feature full-connection layers, namely PFC1, PFC2, PFC3 and PFC4, a combined full-connection layer CFC, a full-connection layer FC and a Softmax layer; PFC1, PFC2, PFC3 and PFC4 are connected with CFC, PFC1 takes the feature sub-graph T1 as input, PFC2 takes the feature sub-graph T2 as input, PFC3 takes the feature sub-graph T3 as input, PFC4 takes the feature sub-graph T4 as input, PFC1, PFC2, PFC3 and PFC4 respectively extract the internal features of T1, T2, T3 and T4 to obtain 4 feature maps with internal features, which are respectively named TE1, TE2, TE3 and TE4, and TE1, TE2, TE3 and TE4 are output to CFC; the CFC is connected with PFC1, PFC2, PFC3, PFC4 and the FC layer, receives TE1, TE2, TE3 and TE4, and superposes TE1, TE2, TE3 and TE4 to form a new feature map TE5; the FC layer is connected with the CFC layer and the Softmax layer, receives TE5 from the CFC, performs feature fusion on TE5, generates a feature vector V, and outputs the generated feature vector V to the Softmax layer; the Softmax layer is connected with the FC layer, receives the fused feature vector V from the FC, and performs probability calculation on the feature vector V to obtain the prediction probabilities, namely prediction category probabilities, of the image corresponding to the various aircraft categories; when the aircraft remote sensing image fine-grained identification system based on component segmentation and feature fusion is trained, the prediction category probabilities are sent to the loss function module, and when the cross entropy loss value returned by the loss function module is received, a back propagation algorithm is adopted to adjust the network parameters of the system; during actual identification, the category corresponding to the maximum value of the prediction category probabilities is selected as the aircraft type identified by the system;
the loss function module is connected with the feature fusion subsystem, receives the prediction category probability of the image from the feature fusion subsystem when training the aircraft remote sensing image fine-grained identification system based on component segmentation and feature fusion, acquires the real category probability corresponding to the original remote sensing image from the data set, calculates the cross entropy loss value between the prediction category probability and the real category probability, and returns the calculated cross entropy loss value to the feature fusion subsystem;
secondly, constructing a data set, wherein the method comprises the following steps:
2.1 using Google Earth software to collect a data set containing Q types of aircraft remote sensing images, wherein 100 images are collected for each type of aircraft and each image is scaled to a size of 224 × 224 × 3 in pixels, where the first number represents the length of the image, the second number represents the width of the image, and the third number represents the number of channels of the image; Q is the number of aircraft categories and is a positive integer;
2.2, dividing the aircraft remote sensing image data set acquired in step 2.1, wherein α% of the images are assigned to a training set and β% of the images are taken as a test set, and it is ensured that images in the test set never appear in the training set, wherein α + β = 100 and α > β;
2.3, performing data augmentation on each image in the test set and the training set: each image is rotated every 20 degrees to form a new image, so that 18 augmented images are generated for each image, and the test set and the training set contain Q × 100 × 18 images in total;
2.4, using the numbers 0 to Q-1 as the category codes of the Q types of aircraft, labeling the true class probability of each augmented image, and labeling the 6 real key points of each image with labelme;
thirdly, training a key point detection subsystem;
fourthly, training the shared feature extractor, the component feature generator and the feature fusion subsystem as a whole, wherein the method comprises the following steps:
4.1 setting training parameters: selecting an optimization algorithm as an Adam optimization algorithm, wherein a loss function is cross entropy, namely CE loss, the training times are 140 rounds, and the initial learning rate is 0.001;
4.2 initializing a training round number variable N as 1;
4.3 the shared characteristic extractor reads 32 pictures from the training set in sequence, and corrects each picture upwards;
4.4 the shared feature extractor takes the corrected 32 images as a mini-batch; let these 32 images be I_1, …, I_n, …, I_32, where 1 ≤ n ≤ 32; for each image, a feature map of size 56 × 56 × 128 is generated; let the 32 feature maps be TF_1, …, TF_n, …, TF_32;
4.5 letting the variable n be 1;
4.6 the component feature generator takes the coordinates of the 6 real key points of image I_n as a reference and generates 4 component frames for image I_n, and the method comprises the following steps:
4.6.1 according to I_n, estimating the height and width of the aircraft and calculating the dimensions of the aircraft, including the maximum width W_obj, the length H_obj and the fuselage width W_backbone of the aircraft;
4.6.2 generating the fuselage backbone component frame P1 by shifting the position of key point K1 upwards by 5 pixel points to serve as the coordinate of the midpoint of the top edge of P1, taking 2 × W_backbone as the width of P1 and H_obj as the height of P1, and generating a rectangular box;
4.6.3 generating the empennage part frame P4 by taking W_obj/2 and H_obj/2 as the width and height of P4, taking the X coordinate of key point K6 as the midpoint of P4 on the X axis, and, with the Y coordinate of K6 as the reference point, moving up by H_obj × 3/8 as the upper boundary of P4 and down by H_obj × 1/8 as the lower boundary of P4, generating P4;
4.6.4, a left wing frame P2 is generated by:
4.6.4.1 calculating left wing angle: a connecting line is made between key points K2 and K4, named as L2, and an included angle theta 1 between L2 and L1 is calculated and is a left wing included angle;
4.6.4.2 when the left wing included angle is less than or equal to 60 degrees, a rectangular frame is made with the line connecting K2 and K4 as its diagonal, namely the left wing component frame P2, and the method goes to 4.6.5; when the left wing included angle is larger than 60 degrees, the difference between the Y coordinates of key points K2 and K4 is taken as the wing height H_Wing1, the Y coordinate of K2 is shifted up by H_Wing1/2 to generate key point K2', the Y coordinate of K4 is shifted down by H_Wing1/2 to generate key point K4', a rectangular frame is made with K2' and K4' as its diagonal, namely P2, and the method goes to 4.6.5;
4.6.5 right wing frame P3 is created by:
4.6.5.1 calculate right wing angle: a connecting line is made between key points K3 and K5, named as L3, and an included angle theta 2 between L3 and L1 is calculated, wherein the included angle is a right wing included angle;
4.6.5.2 when the right wing included angle is less than or equal to 60 degrees, a rectangular frame is made with the line connecting K3 and K5 as its diagonal, namely the right wing part frame P3, and the method goes to 4.7; when the right wing included angle is larger than 60 degrees, the difference between the Y coordinates of key points K3 and K5 is taken as the wing height H_Wing2, the Y coordinate of K3 is shifted up by H_Wing2/2 to generate key point K3', the Y coordinate of K5 is shifted down by H_Wing2/2 to generate key point K5', a rectangular frame is made with K3' and K5' as its diagonal, namely P3, and the method goes to 4.7;
4.7 mapping the positions of the part frames P1, P2, P3 and P4 onto TF_n at a 4:1 size ratio, first segmenting out the feature sub-maps P1', P2', P3' and P4' corresponding to the 4 parts, then resizing P1', P2', P3' and P4' to 7 × 7 × 128, and generating feature sub-maps T1, T2, T3 and T4 of identical size;
4.8 the PFC1 further extracts internal features from T1 to generate a feature map TE1; meanwhile, the PFC2 further extracts internal features from T2 to generate a feature map TE2; meanwhile, the PFC3 further extracts internal features from T3 to generate a feature map TE3; meanwhile, the PFC4 further extracts internal features from T4 to generate a feature map TE4; the sizes of TE1, TE2, TE3 and TE4 are all 7 × 7 × 128;
4.9 the combined full connection layer CFC takes TE1, TE2, TE3 and TE4 as input, concatenates TE1, TE2, TE3 and TE4, each of size 7 × 7 × 128, along the third dimension, namely the channel dimension, to form a feature map TE5 of size 7 × 7 × 512, and then inputs TE5 into the full connection layer FC;
4.10 the full connection layer FC receives the characteristic diagram TE5 from the CFC layer, and multiplies each pixel in TE5 by a scale factor to realize characteristic fusion, generates a characteristic vector V with the length of Q, and outputs V to the softmax layer;
4.11 the softmax layer receives the characteristic vector V from the FC, carries out probability calculation on the V to obtain prediction probability of each category of the various aircrafts corresponding to the image, namely prediction category probability, and sends the prediction category probability to the loss function module; the calculation method is as follows:
4.11.1 let variable i equal to 1;
4.11.2 letting the i-th value in V be v_i, the prediction probability s_i corresponding to v_i is given by formula IV:

s_i = e^(v_i) / Σ_{j=1}^{Q} e^(v_j)    (formula IV)

wherein e is the natural constant;
4.11.3 if i < Q, let i = i + 1 and go to 4.11.2; otherwise, the predicted class probabilities s_1, …, s_i, …, s_Q of image I_n for the Q categories of aircraft are sent to the loss function module, and the method goes to 4.12;
4.12 the loss function module receives s_1, …, s_i, …, s_Q from the softmax layer, obtains the true class probabilities of image I_n from the data set, and calculates the cross entropy loss value L_n between the predicted class probabilities and the true class probabilities by the following formula:

L_n = − Σ_{i=1}^{Q} p_i · log(s_i)

wherein p_i is the true class probability of image I_n corresponding to the i-th of the Q aircraft categories;
4.13 if n < 32, let n = n + 1 and go to 4.6; otherwise, go to 4.14;
4.14 calculating the total loss function value L = L_1 + … + L_n + … + L_32, and then updating the parameters of the shared feature extractor, the component feature generator and the feature fusion subsystem with a back propagation algorithm according to the set learning rate;
4.15 let N = N + 1; if N is less than or equal to 100, go to 4.3; if N is more than 100 and less than or equal to 120, set the learning rate to 0.0001 and go to 4.3; if N is more than 120 and less than or equal to 140, set the learning rate to 0.00001 and go to 4.3; if 140 < N, training is finished;
fifthly, fine-grained identification is carried out on the aircraft image by using the trained aircraft identification system, and the method comprises the following steps:
5.1 randomly selecting an image from the test set for fine-grained aircraft identification, and letting the selected image be I;
5.2, inputting the image I into a key point detection subsystem to obtain coordinates of 6 key points K1, K2, K3, K4, K5 and K6 of the aircraft in the I;
5.3 shared feature extractor reads image I from test set, receives coordinates of 6 key points K1, K2, K3, K4, K5, K6 from key point detection subsystem, corrects image I upwards by using key points K1 and K6, then extracts features of image, obtains feature map I' with pixel size 56 × 56 × 128;
5.4 the part feature generator receives coordinates of 6 keypoints K1, K2, K3, K4, K5, K6 from the keypoint detection subsystem, receives the feature map I' from the shared feature extractor, generates 4 aircraft part frames P1, P2, P3, P4; the method comprises the following specific steps:
5.4.1 calculating the aircraft's W_obj, H_obj and W_backbone;
5.4.2 generating the fuselage backbone component frame P1 by shifting the position of key point K1 upwards by 5 pixels to serve as the coordinate of the midpoint of the top edge of P1, taking 2 × W_backbone as the width of P1 and H_obj as the height of P1, and generating a rectangular box;
5.4.3 generating the tail part frame P4 by taking W_obj/2 and H_obj/2 as the width and height of P4, taking the X coordinate of key point K6 as the midpoint of P4 on the X axis, and, with the Y coordinate of K6 as the reference point, moving up by H_obj × 3/8 as the upper boundary of P4 and down by H_obj × 1/8 as the lower boundary of P4, generating P4;
5.4.4 the left wing component frame P2 is generated as follows:
5.4.4.1 calculating left wing angle: a connecting line is made between key points K2 and K4, named as L2, and an included angle theta 1 between L2 and L1 is calculated and is a left wing included angle;
5.4.4.2 when the left wing included angle is less than or equal to 60 degrees, a rectangular frame is made with the line connecting K2 and K4 as its diagonal, namely the left wing component frame P2, and the method goes to 5.4.5; when the left wing included angle is larger than 60 degrees, the difference between the Y coordinates of key points K2 and K4 is taken as the wing height H_Wing1, the Y coordinate of K2 is shifted up by H_Wing1/2 to generate key point K2', the Y coordinate of K4 is shifted down by H_Wing1/2 to generate key point K4', a rectangular frame is made with K2' and K4' as its diagonal, namely P2, and the method goes to 5.4.5;
5.4.5 generating the right wing component frame P3, wherein the generation method handles two cases according to the right wing included angle, and the specific steps are as follows:
5.4.5.1 calculate Right Airfoil Angle: a connecting line is made between key points K3 and K5, named as L3, and an included angle theta 2 between L3 and L1 is calculated, wherein the included angle is a right wing included angle;
5.4.5.2 when the right wing included angle is less than or equal to 60 degrees, a rectangular frame is made with the line connecting K3 and K5 as its diagonal, namely the right wing component frame P3, and the method goes to 5.5; when the right wing included angle is larger than 60 degrees, the difference between the Y coordinates of key points K3 and K5 is taken as the wing height H_Wing2, the Y coordinate of K3 is shifted up by H_Wing2/2 to generate key point K3', the Y coordinate of K5 is shifted down by H_Wing2/2 to generate key point K5', a rectangular frame is made with K3' and K5' as its diagonal, namely P3, and the method goes to 5.5;
5.5 the part feature generator maps the positions of the part frames P1, P2, P3 and P4 to the feature map I' generated in the step 5.3, and divides feature sub-maps T1, T2, T3 and T4 corresponding to 4 parts;
5.6 the PFC1 further extracts internal features from the T1 to generate a feature map TE 1; meanwhile, the PFC2 further extracts internal features from the T2 to generate a feature map TE 2; meanwhile, the PFC3 further extracts internal features from the T3 to generate a feature map TE 3; meanwhile, the PFC4 further extracts internal features from the T4 to generate a feature map TE 4;
5.7 the combined full connection layer CFC takes TE1, TE2, TE3 and TE4 as input, concatenates TE1, TE2, TE3 and TE4, each of size 7 × 7 × 128, along the third dimension, namely the channel dimension, to form a feature map TE5 of size 7 × 7 × 512, and then inputs TE5 into the full connection layer FC;
5.8 the full connection layer FC receives the characteristic diagram TE5 from the CFC layer, multiplies each pixel in TE5 by a scale factor respectively to realize characteristic fusion, generates a characteristic vector V with the length of Q, and outputs the generated characteristic vector V to the softmax layer;
5.9 the softmax layer receives the output feature vector V from the FC, and performs probability calculation on the feature vector V to obtain the prediction probability of the image corresponding to each category of various aircrafts, wherein the calculation method is as follows:
5.9.1 let variable i equal to 1;
5.9.2 letting the i-th value in V be v_i, and calculating the prediction probability s_i corresponding to v_i by formula IV;
5.9.3 if i < Q, let i = i + 1 and go to 5.9.2; otherwise, go to 5.10;
5.10 among the Q prediction probabilities s_1, …, s_i, …, s_Q output from the softmax layer, the category corresponding to the maximum probability is taken as the aircraft type identified by the system.
2. The method for fine-grained identification of aircraft images based on component segmentation and feature fusion as claimed in claim 1, wherein the keypoint detection subsystem is formed by sequentially connecting ResNet50 and three deconvolution layers.
3. The method for identifying the fine granularity of the aircraft image based on the component segmentation and the feature fusion as claimed in claim 1, wherein the shared feature extractor adopts a feature pyramid structure and takes ResNet50 as a backbone network.
4. The fine-grained identification method for aircraft images based on component segmentation and feature fusion as claimed in claim 1, wherein the Google Earth software requires version 7.3.0 or more, and the labelme requires version 3.11.2 or more.
5. The aircraft image fine-grained identification method based on component segmentation and feature fusion as claimed in claim 1, characterized in that Q is more than or equal to 47.
6. The method for fine-grained identification of aircraft images based on component segmentation and feature fusion as claimed in claim 1, wherein the method for labeling the true class probability of each augmented picture in step 2.4 is to generate a one-hot code for each picture to represent the class probability of the picture, and to store the class code of the picture and the corresponding one-hot code in a text file.
7. The method for fine-grained identification of aircraft images based on component segmentation and feature fusion as claimed in claim 1, wherein the third step of training the keypoint detection subsystem is:
3.1 setting training parameters: selecting an optimization algorithm as an Adam optimization algorithm, wherein a loss function is mean square error loss, the number of samples selected in one training is Batchsize 32, the training times are 140 rounds, and the initial learning rate is 0.001;
3.2 initializing a training round variable N as 1;
3.3 the key point detection subsystem reads 32 images from the training set in turn as a small batch, namely mini-batch, and the key point detection subsystem performs key point detection on the 32 images to obtain 6 predicted key points of the 32 images;
3.4 comparing the predicted key points and the real key points of the 32 images, calculating the mean square error loss between the predicted key points and the real key points, and then updating network parameters by adopting a back propagation algorithm according to the set learning rate;
3.5 making N equal to N +1, if N is less than or equal to 100, rotating to 3.3; if N is more than 100 and less than or equal to 120, the learning rate is enabled to be 0.0001, and then the conversion is carried out to 3.3; if N is more than 120 and less than or equal to 140, the learning rate is enabled to be 0.00001, and then the rotation is 3.3; if N >140, training is complete.
8. The method for fine-grained identification of aircraft images based on component segmentation and feature fusion as claimed in claim 1, wherein the method for correcting the image upwards in step 4.3 is as follows: a straight line, named L1, is drawn between the real key points K1 and K6 of the image, and the image is rotated so that L1 is perpendicular to the X axis, thereby completing the upward correction of the image.
9. The method for fine-grained identification of an aircraft image based on component segmentation and feature fusion as claimed in claim 1, wherein the maximum width W_obj, the length H_obj and the fuselage width W_backbone of the aircraft in step 4.6.1 are calculated as follows:

W_obj = |x_lwing − x_rwing|    (formula I)

H_obj = |y_nose − y_empennage|    (formula II)

W_backbone = |x_fuselwing − x_fuserwing|    (formula III)

wherein x_lwing and x_rwing are the X-axis coordinate values of the left wing far end K4 and the right wing far end K5 respectively, y_nose and y_empennage are the Y-axis coordinate values of the nose K1 and the tail midpoint K6 respectively, and x_fuselwing and x_fuserwing are the X-axis coordinates of the junction K2 of the fuselage and the left wing leading edge and the junction K3 of the fuselage and the right wing leading edge respectively.
10. The aircraft image fine-grained identification method based on component segmentation and feature fusion as claimed in claim 1, wherein the scale factor in steps 4.10 and 5.8 has a value range of 0 to 1.
CN202010038491.4A 2020-01-14 2020-01-14 Aircraft image fine-grained identification method based on part segmentation and feature fusion Active CN111274893B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010038491.4A CN111274893B (en) 2020-01-14 2020-01-14 Aircraft image fine-grained identification method based on part segmentation and feature fusion

Publications (2)

Publication Number Publication Date
CN111274893A true CN111274893A (en) 2020-06-12
CN111274893B CN111274893B (en) 2022-11-08

Family

ID=71001667

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010038491.4A Active CN111274893B (en) 2020-01-14 2020-01-14 Aircraft image fine-grained identification method based on part segmentation and feature fusion

Country Status (1)

Country Link
CN (1) CN111274893B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113705488A (en) * 2021-08-31 2021-11-26 中国电子科技集团公司第二十八研究所 Remote sensing image fine-grained airplane identification method based on local segmentation and feature fusion

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109214441A (en) * 2018-08-23 2019-01-15 桂林电子科技大学 A kind of fine granularity model recognition system and method
CN109858486A (en) * 2019-01-27 2019-06-07 中国人民解放军国防科技大学 Deep learning-based data center cloud target identification method
CN110097091A (en) * 2019-04-10 2019-08-06 东南大学 It is trained be distributed with inference data it is inconsistent under the conditions of image fine granularity recognition methods

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
于丽 et al., "Remote sensing aircraft recognition based on dense convolutional neural networks", Computer Engineering and Applications *
张文达 et al., "Image object recognition algorithm based on multi-scale block convolutional neural networks", Journal of Computer Applications *
王鑫 et al., "Remote sensing image classification method based on deep convolutional neural networks and multiple kernel learning", Journal of Electronics & Information Technology *

Also Published As

Publication number Publication date
CN111274893B (en) 2022-11-08


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant