CN110766058B - Battlefield target detection method based on an optimized RPN (Region Proposal Network) - Google Patents

Battlefield target detection method based on an optimized RPN (Region Proposal Network)

Info

Publication number
CN110766058B
CN110766058B (application CN201910965047.4A)
Authority
CN
China
Prior art keywords
target
network
layer
frame
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910965047.4A
Other languages
Chinese (zh)
Other versions
CN110766058A (en)
Inventor
肖秦琨
邓雪亚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an Keduoduo Information Technology Co.,Ltd.
Original Assignee
Xian Technological University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Technological University filed Critical Xian Technological University
Priority to CN201910965047.4A
Publication of CN110766058A
Application granted
Publication of CN110766058B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The invention relates to a battlefield target detection method based on an optimized RPN (Region Proposal Network), which comprises the following steps: 1. constructing a tank and armored-vehicle target data set and labeling the targets in the training data set and the test data set; 2. initializing the VGG-16 network with a model pre-trained on the ImageNet data set; 3. generating shared feature maps; 4. obtaining target candidate regions of different sizes and aspect ratios; 5. obtaining candidate regions through the RPN, computing the errors between the candidate regions obtained on the two convolutional feature maps and the ground-truth box, selecting the candidate boxes with the smallest error, and finally selecting the most accurate candidate region among them as the optimized target candidate region; 6. completing target-class discrimination and regression correction of the target bounding box. The invention effectively improves candidate-region extraction for small targets and targets affected by occlusion, thereby improving the accuracy of battlefield target detection.

Description

Battlefield target detection method based on an optimized RPN (Region Proposal Network)
Technical Field
The invention belongs to the technical field of battlefield target detection, and particularly relates to a battlefield target detection method based on an optimized RPN (Region Proposal Network).
Background
At present, the detection of tank and armored-vehicle targets on the battlefield still relies on manual discovery and aiming to calibrate the target, followed by target tracking to achieve precision strike. Accurate detection of battlefield targets is a necessary premise for a precise attack on those targets. The complexity of the battlefield environment and the automatic detection of battlefield targets are the key difficulties in making tank and armored-vehicle combat systems intelligent. Research methods for battlefield target detection in recent years fall mainly into four categories: detection algorithms based on hand-crafted models; detection methods based on R-CNN; detection methods based on Fast R-CNN; and detection methods based on Faster R-CNN.
Hand-crafted models capture only information such as the color histogram and texture features of the image; they have no capacity for deep feature abstraction and do not depict the target accurately, so the generated candidate boxes are imprecise and the detection results fall short of the ideal. The R-CNN detection method extracts candidate boxes with Selective Search (SS), which is computationally expensive and time-consuming; the extracted candidate boxes are all scaled to a uniform size, so feature extraction easily loses original image information. Fast R-CNN extracts candidate boxes in the same way as R-CNN, so region-proposal extraction remains time-consuming; training is not end-to-end, and backpropagation cannot improve the quality of the candidate boxes. Faster R-CNN extracts candidate boxes with an RPN, which solves the time-consumption problem, but in battlefield target detection its accuracy on small and occluded targets is low: feature extraction is not comprehensive enough and degrades the candidate boxes, causing false detections or low accuracy on such targets. Differences in field of view produce targets at different scales, and the variation in target size strongly affects detection accuracy, with small targets particularly prone to low accuracy. Factors such as target occlusion, dust, and smoke also impair candidate-box extraction and lower the detection accuracy of the target.
Disclosure of Invention
The invention aims to provide a battlefield target detection method based on an optimized RPN network, which addresses inaccurate candidate-box extraction and the low detection accuracy of small and occluded targets.
The technical scheme adopted by the invention is as follows:
a battlefield target detection method based on an optimized RPN network is implemented according to the following steps:
step 1: constructing a tank and armored-vehicle target data set conforming to the PASCAL VOC data set format, and labeling the targets in the training data set and the test data set;
step 2: initializing the VGG-16 network with a model pre-trained on the ImageNet data set;
step 3: inputting the training data set of target images into a convolutional neural network and extracting target features on convolutional layers conv3-3 and conv5-3 to generate shared feature maps;
step 4: using the candidate-region extraction network to slide windows of different sizes over the shared feature maps generated by the two convolutional layers, obtaining target candidate regions of different sizes and aspect ratios;
step 5: obtaining candidate regions through the RPN network, computing the errors between the candidate regions obtained on the two convolutional feature maps and the ground-truth box, selecting the candidate boxes with the smallest error, and finally selecting the most accurate candidate region among them as the optimized target candidate region;
step 6: sending the optimized target candidate region obtained in step 5 and the shared feature map of the corresponding convolutional layer into a detection network to complete target-class discrimination and regression correction of the target bounding box.
Further, step 1 is specifically implemented according to the following steps:
(11) First, the size of the target object is determined. Let the larger of the target's width and height, in pixels, be denoted P_max; targets are divided into three classes according to their size within the field of view, with the classification standard:
[Equation (1) is rendered as an image in the original: it gives the P_max thresholds dividing targets into small, medium, and large classes.]    (1)
(12) Targets of different sizes are labeled: the training data set and test data set of tanks and armored vehicles are labeled separately.
Further, step 3 is specifically implemented according to the following steps:
(31) The convolutional neural network is the backbone of the target detection network; it extracts the target features and generates the shared convolutional layers. The input of the l-th convolutional layer is:
Z^l = W^l X^{l-1} + b^l    (2)
(32) The output of the l-th layer convolution is:
X^l = f(Z^l) = f(W^l X^{l-1} + b^l)    (3)
(33) The total error of the convolutional layers is:
J(W, b) = (1/2) ||X^L - y||_2^2    (4)
where X^L is the network output and y the ground-truth label;
continuously optimizing parameters W and b of the neural network by a gradient descent method;
(34) Taking the gradients of equation (5) with respect to the parameters W and b gives:
∂J/∂W^l = [(X^l - y) ⊙ f′(Z^l)] (X^{l-1})^T    (6)
∂J/∂b^l = (X^l - y) ⊙ f′(Z^l)    (7)
where ⊙ denotes the element-wise product of two vectors and ∂ the partial-derivative symbol; from these the values of the parameters W and b are computed;
(35) The parameters of the network are adjusted continuously so that the extraction of target features becomes more accurate. Applying the activation function of equation (3) to the convolution result yields the output of the convolutional layer, i.e., the target features; these are connected through the fully connected layer to form a shared feature map, with convolutional layers conv3-3 and conv5-3 each producing one.
Further, step 4 is specifically implemented according to the following steps:
(41) Different sliding windows are set on the shared feature maps of the conv3-3 and conv5-3 convolutional layers: the RPN network sets a 5×5 sliding window on the conv3-3 layer, and 7×7 and 9×9 sliding windows on the conv5-3 layer;
(42) Anchor boxes of different scales and aspect ratios are set on the sliding windows, yielding W×H×k anchor boxes;
(43) Region candidate boxes are generated through the sliding windows; the candidate regions generated by the conv3-3 and conv5-3 layers are denoted proposal region 1 and proposal region 2, respectively. The features in each sliding window are mapped to a corresponding low-dimensional feature, which is passed through a ReLU activation function to obtain a vector. This vector is fed into two convolutional layers: a candidate-region classification layer (cls) and a candidate-region position-regression layer (reg). The cls layer gives the probability that each candidate region is a target, with 2k outputs; the reg layer gives the position-regression coordinates of the k boxes, with 4k outputs.
Further, step 5 is specifically implemented according to the following steps:
(51) The error values between all target candidate boxes and the ground-truth boxes, i.e., the minimum of the loss function, are computed separately for the conv3-3 and conv5-3 layers, specifically as follows:
(511) The bounding-box regression from the prediction box to the anchor box over the 4 coordinates is:
t_x = (x - x_a)/w_a,  t_y = (y - y_a)/h_a,  t_w = log(w/w_a),  t_h = log(h/h_a)    (15)
and from the anchor box to the ground-truth box:
t*_x = (x* - x_a)/w_a,  t*_y = (y* - y_a)/h_a,  t*_w = log(w*/w_a),  t*_h = log(h*/h_a)    (16)
where x, y, w, and h denote the center coordinates, width, and height of a box; x, x_a, and x* denote the x-coordinates of the prediction box, the anchor box, and the ground-truth box, respectively; y, y_a, and y* the corresponding y-coordinates; w, w_a, and w* the corresponding widths; and h, h_a, and h* the corresponding heights;
(512) The loss function measures the error between the candidate box and the ground-truth box; the network parameters are adjusted by continued training with gradient descent. The conv3-3 and conv5-3 layers share the same form of loss function, defined as:
L_j({p_i}, {t_i}) = (1/N_cls) Σ_i L_cls(p_i, p*_i) + λ (1/N_reg) Σ_i p*_i L_reg(t_i, t*_i)    (17)
where p*_i is the ground-truth label of anchor i (1 for a positive sample, 0 for a negative sample); {p_i} and {t_i} are the outputs of the cls layer and the reg layer, respectively; i denotes the index of the anchor box; j denotes the index of the convolutional layer; t_i denotes the predicted offset; and t*_i denotes the offset between the anchor box and the ground-truth box;
(513) The RPN network is trained continuously, optimizing the value of the loss function to its minimum to obtain high-accuracy target candidate regions; the minimum loss function L_3({p_i}, {t_i}) of the conv3-3 layer and the corresponding minimum loss function L_5({p_i}, {t_i}) of the conv5-3 layer are computed according to equation (17);
(52) The minima of the two layers' loss functions are compared and the smaller is selected; denoting the smaller of the two by L:
L = min{ L_3({p_i}, {t_i}), L_5({p_i}, {t_i}) }    (21)
Obtaining the L value finds the minimum loss function over the two convolutional layers, i.e., the candidate region with the highest accuracy, which serves as the final optimized candidate region.
Further, step 6 is specifically implemented according to the following steps:
(61) The target candidate regions optimized by the RPN network and the shared feature map of the corresponding convolutional layer are input into the detection network, and region features are extracted in the ROI pooling layer;
(62) The region features are input into the subsequent softmax layer, and class discrimination and regression correction of the target boundary are performed for each target candidate region. The detection network also has target loss functions, comprising a target-classification loss function and a position-boundary loss function. The loss function of the network is:
L′ = L′_cls + λ L′_reg    (22)
where L′ denotes the detection-network loss function; the target-classification loss function L′_cls is:
L′_cls = -(1/|S+|) Σ_{i∈S+} log p_i - (1/|S-|) Σ_{i∈S-} log(1 - p_i)    (23)
where |S+| denotes the number of positive samples and |S-| the number of negative samples; the boundary-regression loss function is the same as the RPN network's boundary loss function. The loss function is continuously optimized so that target-class discrimination and bounding-box regression are continuously corrected, yielding optimized target-classification and boundary-regression values;
(63) Through training on a large number of data sets, the parameters of the network are continuously adjusted by gradient descent, finally minimizing the total loss of the network. The total loss function, denoted L*, i.e., the total loss function of the network, is:
L* = L + L′    (24)
The network is trained until L* reaches its minimum;
(64) And carrying out target detection on the test data set by using the trained detection network.
Compared with the prior art, the invention has the advantages that:
the improved RPN is used for screening the candidate regions, the optimized candidate regions are selected, the generation of invalid candidate regions by the RPN is reduced, the effectiveness of extracting the candidate regions from small targets and targets with shielding influence is effectively improved, and the precision of battlefield target detection is further improved. Because the convolution conv3-3 and conv5-3 layers respectively generate the shared characteristic graph and are combined with the optimized RPN network for use, the invention has high detection precision on small targets and occlusion targets.
Description of the drawings:
FIG. 1 is an overall framework of the present invention;
fig. 2 is an RPN network structure of the present invention.
Detailed description of the embodiments:
the present invention will be described in detail below with reference to the drawings and examples.
Referring to fig. 1, a battlefield target detection method based on an optimized RPN network is specifically implemented according to the following steps:
step 1: a tank and armored-vehicle target data set conforming to the PASCAL VOC data set format is constructed, and the targets in the training data set and the test data set are labeled, specifically as follows:
(11) First, the size of the target object is determined. Let the larger of the target's width and height, in pixels, be denoted P_max; targets can be divided into three classes according to their size within the field of view, with the classification standard:
[Equation (1) is rendered as an image in the original: it gives the P_max thresholds dividing targets into small, medium, and large classes.]    (1)
(12) Targets of different sizes are labeled: the training data set and test data set of tanks and armored vehicles are labeled separately;
step 2: the VGG-16 network is initialized with a model pre-trained on the ImageNet data set;
step 3: the training data set of target images is input into the convolutional neural network, and target features are extracted on convolutional layers conv3-3 and conv5-3 to generate the shared feature maps, specifically as follows:
(31) The convolutional neural network is the backbone of the target detection network; it extracts the target features and generates the shared convolutional layers. The input of the l-th convolutional layer is:
Z^l = W^l X^{l-1} + b^l    (2)
where Z, W, and X are all matrices; l denotes the l-th convolutional layer; Z^l is the input of the l-th convolution; W^l is the weight from layer l-1 to layer l; X^{l-1} is the output of the (l-1)-th convolution; and b^l is the bias of the l-th layer;
(32) The output of the l-th convolutional layer is:
X^l = f(Z^l) = f(W^l X^{l-1} + b^l)    (3)
where f(·) denotes the activation function;
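For illustration only (not part of the original disclosure), a minimal NumPy sketch of the forward pass in equations (2) and (3), assuming a ReLU activation and arbitrary layer sizes:

```python
import numpy as np

def relu(z):
    # Activation function f(.) of equation (3); ReLU is assumed here.
    return np.maximum(0.0, z)

def forward(x, weights, biases):
    """Layer-by-layer forward pass of equations (2)-(3)."""
    activations = [x]
    for W, b in zip(weights, biases):
        z = W @ activations[-1] + b      # Z^l = W^l X^{l-1} + b^l   (2)
        activations.append(relu(z))      # X^l = f(Z^l)              (3)
    return activations

# Toy example with arbitrary layer sizes.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((8, 4)), rng.standard_normal((3, 8))]
biases = [np.zeros(8), np.zeros(3)]
print(forward(rng.standard_normal(4), weights, biases)[-1].shape)  # (3,)
```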
(33) The total error of the convolutional layers is:
J(W, b) = (1/2) ||X^L - y||_2^2    (4)
where ||x||_2 denotes the 2-norm of x, i.e.
||x||_2 = (Σ_i x_i^2)^{1/2}
The parameters W and b of the neural network are continuously optimized by gradient descent; substituting equation (3) into equation (4) gives:
J(W, b) = (1/2) ||f(W^l X^{l-1} + b^l) - y||_2^2    (5)
(34) Taking the gradients of equation (5) with respect to the parameters W and b gives:
∂J/∂W^l = [(X^l - y) ⊙ f′(Z^l)] (X^{l-1})^T    (6)
∂J/∂b^l = (X^l - y) ⊙ f′(Z^l)    (7)
where ⊙ denotes the element-wise product of two vectors and ∂ the partial-derivative symbol. The values of the parameters W and b are obtained by the following derivation:
(341) Let:
δ^l = (X^l - y) ⊙ f′(Z^l)    (8)
Substituting equation (8) into equations (6) and (7), respectively, gives:
∂J/∂W^l = δ^l (X^{l-1})^T    (9)
∂J/∂b^l = δ^l    (10)
(342) To obtain the parameters W and b, only δ^l needs to be found; by the chain rule of derivation:
δ^l = ∂J/∂Z^l = (∂Z^{l+1}/∂Z^l)^T δ^{l+1}    (11)
According to equation (2):
Z^{l+1} = W^{l+1} X^l + b^{l+1} = W^{l+1} f(Z^l) + b^{l+1}    (12)
Differentiating equation (12) gives:
∂Z^{l+1}/∂Z^l = W^{l+1} diag(f′(Z^l))    (13)
Substituting equation (13) into equation (11) yields:
δ^l = (W^{l+1})^T δ^{l+1} ⊙ f′(Z^l)    (14)
Once δ^l is found, substituting it into equations (9) and (10) gives the gradients of the parameters W and b.
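Again as an illustrative sketch rather than the patent's implementation, the backpropagation recursion of equations (8)-(14), under the simplifying assumption of fully connected layers and a ReLU activation (so f′(z) is 1 for z > 0 and 0 otherwise):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def forward(x, weights, biases):
    """Forward pass storing pre-activations Z^l and activations X^l."""
    activations, zs = [x], []
    for W, b in zip(weights, biases):
        zs.append(W @ activations[-1] + b)
        activations.append(relu(zs[-1]))
    return activations, zs

def backward(y, activations, zs, weights):
    """Backpropagation per equations (8)-(14): returns dJ/dW^l and dJ/db^l."""
    grads_W, grads_b = [], []
    delta = (activations[-1] - y) * (zs[-1] > 0)            # delta^L, eq. (8)
    for l in range(len(weights) - 1, -1, -1):
        grads_W.insert(0, np.outer(delta, activations[l]))  # eq. (9)
        grads_b.insert(0, delta.copy())                     # eq. (10)
        if l > 0:
            delta = (weights[l].T @ delta) * (zs[l - 1] > 0)  # eq. (14)
    return grads_W, grads_b

# One gradient-descent update of W and b, as in step (33).
rng = np.random.default_rng(0)
weights = [rng.standard_normal((8, 4)) * 0.1, rng.standard_normal((3, 8)) * 0.1]
biases = [np.zeros(8), np.zeros(3)]
x, y = rng.standard_normal(4), np.array([1.0, 0.0, 0.0])
acts, zs = forward(x, weights, biases)
gW, gb = backward(y, acts, zs, weights)
lr = 0.1
weights = [W - lr * g for W, g in zip(weights, gW)]
biases = [b - lr * g for b, g in zip(biases, gb)]
```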
(35) The parameters of the network are adjusted continuously so that the extraction of target features becomes more accurate. Applying the activation function of equation (3) to the convolution result yields the output of the convolutional layer, i.e., the target features; these are connected through the fully connected layer to form a shared feature map, with convolutional layers conv3-3 and conv5-3 each producing one shared feature map.
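A hedged PyTorch sketch of obtaining the two shared feature maps from a pretrained VGG-16; slicing torchvision's vgg16().features at indices 16 and 30 to reach the conv3-3 and conv5-3 outputs (after their ReLUs) is an assumption about how the backbone would be implemented, not code from the patent:

```python
import torch
import torchvision

vgg = torchvision.models.vgg16(weights="IMAGENET1K_V1").features.eval()
conv3_3 = vgg[:16]   # through conv3-3 + ReLU (before the third max-pool)
conv5_3 = vgg[:30]   # through conv5-3 + ReLU (before the fifth max-pool)

image = torch.randn(1, 3, 600, 800)      # stand-in for a battlefield image
with torch.no_grad():
    feat3 = conv3_3(image)               # shared feature map 1: (1, 256, 150, 200)
    feat5 = conv5_3(image)               # shared feature map 2: (1, 512, 37, 50)
print(feat3.shape, feat5.shape)
```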
Referring to fig. 2, in the improved RPN network structure, different sliding windows are set on the feature maps of the conv3-3 and conv5-3 convolutional layers to obtain candidate regions; the errors between each layer's candidates and the ground-truth box are computed separately and the smaller values are selected, and a high-accuracy candidate region is then selected from among them as the optimized candidate region. Steps 4 and 5 implement this as follows:
step 4: using the candidate-region extraction network (RPN), windows of different sizes are slid over the shared feature maps generated by the two convolutional layers to obtain target candidate regions of different sizes and aspect ratios, specifically as follows:
(41) Different sliding windows are set on the shared feature maps of the conv3-3 and conv5-3 convolutional layers: a 5×5 sliding window on the conv3-3 layer, and 7×7 and 9×9 sliding windows on the conv5-3 layer. The conv3-3 and conv5-3 convolutional layers are selected to obtain the convolutional features of the target image because layer 3 mainly learns the texture features of the target, while layer 5 learns the characteristic features and the overall features of the target object; setting a 5×5 sliding window on the lower convolutional layer benefits feature extraction for small and occluded targets;
(42) Anchor boxes of different scales and aspect ratios are set on the sliding windows. The invention designs anchor boxes of sizes 512², 256², 128², and 64², each with three aspect ratios, namely 1:2, 2:1, and 1:1, so 12 anchor boxes are obtained at each pixel position. With k denoting the number of anchors per pixel, a convolutional feature map of size W×H yields W×H×k anchor boxes, as sketched below;
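An illustrative sketch (not from the patent) of generating the k = 12 anchor boxes of step (42) at every position of a W×H feature map, with the four scales and three aspect ratios named above; the feature stride of 16 is an assumed value:

```python
import numpy as np

def make_anchors(feat_w, feat_h, stride,
                 scales=(64, 128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (W*H*k, 4) anchor boxes (x1, y1, x2, y2); k = 4 scales * 3 ratios = 12."""
    base = []
    for s in scales:
        for r in ratios:
            w, h = s * np.sqrt(r), s / np.sqrt(r)   # area ~ s^2, aspect ratio w:h = r
            base.append([-w / 2, -h / 2, w / 2, h / 2])
    base = np.array(base)                            # (12, 4) anchors around the origin
    cx = (np.arange(feat_w) + 0.5) * stride          # map feature cells back to image coords
    cy = (np.arange(feat_h) + 0.5) * stride
    cxx, cyy = np.meshgrid(cx, cy)
    centers = np.stack([cxx, cyy, cxx, cyy], axis=-1).reshape(-1, 1, 4)
    return (centers + base).reshape(-1, 4)           # (W*H*12, 4)

anchors = make_anchors(50, 37, stride=16)            # e.g. a conv5-3-sized grid
print(anchors.shape)                                 # (22200, 4)
```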
(43) Region candidate boxes are generated through the sliding windows; the candidate regions generated by the conv3-3 and conv5-3 layers are denoted proposal region 1 and proposal region 2, respectively. The features in each sliding window are mapped to a corresponding low-dimensional feature, which is passed through a ReLU activation function to obtain a vector. This vector is fed into two convolutional layers: a candidate-region classification layer (cls) and a candidate-region position-regression layer (reg). The cls layer gives the probability that each candidate region is a target, with 2k outputs; the reg layer gives the position-regression coordinates of the k boxes, with 4k outputs, yielding target candidate regions that carry position-regression coordinates and class discrimination; a sketch of these two heads follows.
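A hedged PyTorch sketch of the cls/reg heads of step (43): the sliding window is realized as a convolution whose kernel matches the window size, followed by 1×1 convolutions producing the 2k scores and 4k regression coordinates; the 512-channel intermediate width is an assumption:

```python
import torch
import torch.nn as nn

class RPNHead(nn.Module):
    """Sliding window + cls/reg layers of step (43); k anchors per position."""
    def __init__(self, in_channels, k=12, window=5):
        super().__init__()
        # The sliding window (5x5 on conv3-3; 7x7 or 9x9 on conv5-3) as a conv layer.
        self.window = nn.Conv2d(in_channels, 512, window, padding=window // 2)
        self.relu = nn.ReLU(inplace=True)
        self.cls = nn.Conv2d(512, 2 * k, 1)   # target / not-target scores: 2k outputs
        self.reg = nn.Conv2d(512, 4 * k, 1)   # box regression coordinates: 4k outputs

    def forward(self, feat):
        x = self.relu(self.window(feat))      # low-dimensional feature per position
        return self.cls(x), self.reg(x)

head3 = RPNHead(in_channels=256, window=5)    # proposal region 1, on conv3-3
scores, deltas = head3(torch.randn(1, 256, 150, 200))
print(scores.shape, deltas.shape)             # (1, 24, 150, 200) (1, 48, 150, 200)
```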
step 5: candidate regions are obtained through the RPN network; the errors between the candidate regions obtained on the two convolutional feature maps and the ground-truth box are computed separately, the candidate boxes with the smallest error are selected, and finally the most accurate candidate region among them is selected as the optimized target candidate region, specifically as follows:
(51) The error values between all target candidate boxes and the ground-truth boxes, i.e., the minimum of the loss function, are computed separately for the conv3-3 and conv5-3 layers;
the step (51) is specifically implemented according to the following steps:
(511) The bounding-box regression from the prediction box to the anchor box over the 4 coordinates is:
t_x = (x - x_a)/w_a,  t_y = (y - y_a)/h_a,  t_w = log(w/w_a),  t_h = log(h/h_a)    (15)
and from the anchor box to the ground-truth box:
t*_x = (x* - x_a)/w_a,  t*_y = (y* - y_a)/h_a,  t*_w = log(w*/w_a),  t*_h = log(h*/h_a)    (16)
where x, y, w, and h denote the center coordinates, width, and height of a box; x, x_a, and x* denote the x-coordinates of the prediction box, the anchor box, and the ground-truth box, respectively; y, y_a, and y* the corresponding y-coordinates; w, w_a, and w* the corresponding widths; and h, h_a, and h* the corresponding heights;
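An illustrative NumPy sketch of the parameterization in equations (15) and (16); the (x, y, w, h) center-format box layout is a convention chosen for this sketch:

```python
import numpy as np

def box_to_deltas(box, anchor):
    """Offsets t = (t_x, t_y, t_w, t_h) of a box relative to an anchor, per eqs. (15)/(16)."""
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    return np.array([(x - xa) / wa, (y - ya) / ha, np.log(w / wa), np.log(h / ha)])

anchor = np.array([100.0, 100.0, 64.0, 128.0])   # (x_a, y_a, w_a, h_a)
gt     = np.array([110.0,  90.0, 80.0, 100.0])   # ground-truth box (x*, y*, w*, h*)
t_star = box_to_deltas(gt, anchor)               # regression target, eq. (16)
print(t_star)
```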
(512) The loss function measures the error between the candidate box and the ground-truth box; the network parameters are adjusted by continued training with gradient descent. The conv3-3 and conv5-3 layers share the same form of loss function, defined as:
L_j({p_i}, {t_i}) = (1/N_cls) Σ_i L_cls(p_i, p*_i) + λ (1/N_reg) Σ_i p*_i L_reg(t_i, t*_i)    (17)
where p*_i is the ground-truth label of anchor i (1 for a positive sample, 0 for a negative sample); {p_i} and {t_i} are the outputs of the cls layer and the reg layer, respectively; i denotes the index of the anchor box; j denotes the index of the convolutional layer; t_i denotes the predicted offset; and t*_i denotes the offset between the anchor box and the ground-truth box. The classification loss function is:
L_cls(p_i, p*_i) = -log[ p_i p*_i + (1 - p_i)(1 - p*_i) ]    (18)
The regression loss function is:
L_reg(t_i, t*_i) = R(t_i - t*_i)    (19)
where the R function is defined as:
R(x) = 0.5 x^2, if |x| < 1;  |x| - 0.5, otherwise    (20)
two moieties of formula (17) are represented by N cls And N reg Normalizing the classification loss term and the regression loss term and weighting by a balance parameter lambda, defaulting to N cls Is set to 512,N reg Setting the number of the anchor frames;
(513) The RPN network is trained continuously, and the value of the loss function is optimized to its minimum to obtain high-accuracy target candidate regions; the minimum loss function L_3({p_i}, {t_i}) of the conv3-3 layer and the corresponding minimum loss function L_5({p_i}, {t_i}) of the conv5-3 layer are computed according to equation (17);
(52) The minima of the two layers' loss functions are compared, and the smaller loss function, corresponding to the final optimized candidate region, is selected; denoting the smaller of the two by L:
L = min{ L_3({p_i}, {t_i}), L_5({p_i}, {t_i}) }    (21)
Obtaining the L value finds the minimum loss function over the two convolutional layers, i.e., the candidate region with the highest accuracy, which is taken as the final optimized candidate region, as sketched below.
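A sketch of the branch selection in equation (21); the numeric loss values are placeholders standing in for L_3 and L_5 computed as in the previous sketch:

```python
import torch

# Placeholder values standing in for L_3 and L_5 from the previous sketch.
loss_3 = torch.tensor(0.42)   # loss of the conv3-3 branch
loss_5 = torch.tensor(0.35)   # loss of the conv5-3 branch
L = torch.min(loss_3, loss_5)                           # eq. (21)
chosen = "conv5-3" if loss_5 <= loss_3 else "conv3-3"   # take that layer's proposals
print(L.item(), "-> proposals taken from", chosen)
```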
step 6: the optimized target candidate regions obtained in step 5 and the shared feature map of the corresponding convolutional layer are sent into the detection network, which consists of an ROI (region of interest) pooling layer and a target classification-and-regression layer. Region features are extracted from the optimized candidate regions, the parameters of the network are continuously adjusted, and the test data set is input into the detection network to complete target-class discrimination and regression correction of the target bounding box, specifically as follows:
(61) The target candidate regions optimized by the RPN network and the shared feature map of the corresponding convolutional layer are input into the detection network, and region features are extracted in the ROI pooling layer, as sketched below;
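A hedged sketch of the ROI pooling of step (61) using torchvision's roi_pool; the 7×7 output size and the 1/16 spatial scale (the conv5-3 stride) are assumptions, and each ROI row is (batch index, x1, y1, x2, y2):

```python
import torch
from torchvision.ops import roi_pool

feat5 = torch.randn(1, 512, 37, 50)                     # shared feature map from conv5-3
rois = torch.tensor([[0, 64.0, 64.0, 256.0, 192.0],     # (batch_idx, x1, y1, x2, y2)
                     [0, 320.0, 128.0, 480.0, 320.0]])  # in image coordinates
region_feats = roi_pool(feat5, rois, output_size=(7, 7), spatial_scale=1.0 / 16)
print(region_feats.shape)                               # (2, 512, 7, 7)
```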
(62) The region features are input into the subsequent softmax layer, and class discrimination and regression correction of the target boundary are performed for each target candidate region. Target loss functions also exist in the detection network, likewise comprising a target-classification loss function and a position-boundary loss function. The loss function of the detection network is:
L′ = L′_cls + λ L′_reg    (22)
where L′ denotes the detection-network loss function; the target-classification loss function L′_cls is:
L′_cls = -(1/|S+|) Σ_{i∈S+} log p_i - (1/|S-|) Σ_{i∈S-} log(1 - p_i)    (23)
where |S+| denotes the number of positive samples and |S-| the number of negative samples. The boundary-regression loss function is the same as the RPN network's boundary loss function; the loss function is continuously optimized so that target-class discrimination and bounding-box regression are continuously corrected, yielding optimized target-classification and boundary-regression values;
(63) Through training on a large number of data sets, the parameters of the network are continuously adjusted by gradient descent, finally minimizing the total loss of the network. The total loss function, denoted L*, i.e., the total loss function of the network, is:
L* = L + L′    (24)
The network is trained until L* reaches its minimum, as sketched below;
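A schematic sketch of one gradient-descent step on the joint objective of equation (24); the placeholder parameter and stand-in losses are assumptions that merely make the snippet runnable:

```python
import torch

# Placeholder parameter standing in for all trainable weights of the
# backbone, RPN, and detection head.
params = [torch.nn.Parameter(torch.randn(4, 4))]
optimizer = torch.optim.SGD(params, lr=1e-3, momentum=0.9)

def train_step(rpn_loss, det_loss):
    """One gradient-descent step minimizing L* = L + L' (eq. (24))."""
    total = rpn_loss + det_loss
    optimizer.zero_grad()
    total.backward()
    optimizer.step()
    return total.item()

# Stand-in losses that depend on the parameters, so backward() works.
L_rpn = (params[0] ** 2).sum()
L_det = (params[0] - 1.0).abs().sum()
print(train_step(L_rpn, L_det))
```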
(64) Target detection is performed on the test data set using the trained detection network.
It will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principles of the invention and these are intended to be within the scope of the invention.

Claims (6)

1. A battlefield target detection method based on an optimized RPN network is characterized by being implemented according to the following steps:
step 1: constructing a tank and armored-vehicle target data set conforming to the PASCAL VOC data set format, and labeling the targets in the training data set and the test data set;
step 2: initializing the VGG-16 network with a model pre-trained on the ImageNet data set;
step 3: inputting the training data set of target images into a convolutional neural network and extracting target features on convolutional layers conv3-3 and conv5-3 to generate shared feature maps;
step 4: using the candidate-region extraction network to slide windows of different sizes over the shared feature maps generated by the two convolutional layers, obtaining target candidate regions of different sizes and aspect ratios;
step 5: obtaining candidate regions through the RPN network, computing the errors between the candidate regions obtained on the two convolutional feature maps and the ground-truth box, selecting the candidate boxes with the smallest error, and finally selecting the most accurate candidate region among them as the optimized target candidate region;
step 6: sending the optimized target candidate region obtained in step 5 and the shared feature map of the corresponding convolutional layer into a detection network to complete target-class discrimination and regression correction of the target bounding box.
2. The optimized RPN network-based battlefield target detection method as claimed in claim 1, wherein step 1 is specifically implemented according to the following steps:
(11) First, the size of the target object is determined. Let the larger of the target's width and height, in pixels, be denoted P_max; targets are divided into three classes according to their size within the field of view, with the classification standard:
[Equation (1) is rendered as an image in the original: it gives the P_max thresholds dividing targets into small, medium, and large classes.]    (1)
(12) Targets of different sizes are labeled: the training data set and test data set of tanks and armored vehicles are labeled separately.
3. The battlefield target detection method based on the optimized RPN network as claimed in claim 1 or 2, wherein step 3 is implemented specifically according to the following steps:
(31) The convolutional neural network is the backbone of the target detection network; it extracts the target features and generates the shared convolutional layers. The input of the l-th convolutional layer is:
Z^l = W^l X^{l-1} + b^l    (2)
(32) The output of the l-th layer convolution is:
X^l = f(Z^l) = f(W^l X^{l-1} + b^l)    (3)
(33) The total error of the convolutional layer is:
J(W, b) = (1/2) ||X^L - y||_2^2    (4)
continuously optimizing parameters W and b of the neural network by a gradient descent method;
(34) Taking the gradients of equation (5) with respect to the parameters W and b gives:
∂J/∂W^l = [(X^l - y) ⊙ f′(Z^l)] (X^{l-1})^T    (6)
∂J/∂b^l = (X^l - y) ⊙ f′(Z^l)    (7)
where ⊙ denotes the element-wise product of two vectors and ∂ the partial-derivative symbol; from these the values of the parameters W and b are computed;
(35) The parameters of the network are adjusted continuously so that the extraction of target features becomes more accurate. Applying the activation function of equation (3) to the convolution result yields the output of the convolutional layer, i.e., the target features; these are connected through the fully connected layer to form a shared feature map, with convolutional layers conv3-3 and conv5-3 each producing one shared feature map.
4. The optimized RPN network-based battlefield target detection method as claimed in claim 3, wherein step 4 is specifically implemented according to the following steps:
(41) Different sliding windows are set on the shared feature maps of the conv3-3 and conv5-3 convolutional layers: the RPN network sets a 5×5 sliding window on the conv3-3 layer, and 7×7 and 9×9 sliding windows on the conv5-3 layer;
(42) Anchor boxes of different scales and aspect ratios are set on the sliding windows, yielding W×H×k anchor boxes;
(43) Region candidate boxes are generated through the sliding windows; the candidate regions generated by the conv3-3 and conv5-3 layers are denoted proposal region 1 and proposal region 2, respectively. The features in each sliding window are mapped to a corresponding low-dimensional feature, which is passed through a ReLU activation function to obtain a vector. This vector is fed into two convolutional layers: a candidate-region classification layer (cls) and a candidate-region position-regression layer (reg). The cls layer gives the probability that each candidate region is a target, with 2k outputs; the reg layer gives the position-regression coordinates of the k boxes, with 4k outputs.
5. The optimized RPN network-based battlefield target detection method as claimed in claim 4, wherein step 5 is specifically implemented according to the following steps:
(51) The error values between all target candidate boxes and the ground-truth boxes, i.e., the minimum of the loss function, are computed separately for the conv3-3 and conv5-3 layers, specifically as follows:
(511) The bounding-box regression from the prediction box to the anchor box over the 4 coordinates is:
t_x = (x - x_a)/w_a,  t_y = (y - y_a)/h_a,  t_w = log(w/w_a),  t_h = log(h/h_a)    (15)
and from the anchor box to the ground-truth box:
t*_x = (x* - x_a)/w_a,  t*_y = (y* - y_a)/h_a,  t*_w = log(w*/w_a),  t*_h = log(h*/h_a)    (16)
where x, y, w, and h denote the center coordinates, width, and height of a box; x, x_a, and x* denote the x-coordinates of the prediction box, the anchor box, and the ground-truth box, respectively; y, y_a, and y* the corresponding y-coordinates; w, w_a, and w* the corresponding widths; and h, h_a, and h* the corresponding heights;
(512) The loss function measures the error between the candidate box and the ground-truth box; the network parameters are adjusted by continued training with gradient descent. The conv3-3 and conv5-3 layers share the same form of loss function, defined as:
L_j({p_i}, {t_i}) = (1/N_cls) Σ_i L_cls(p_i, p*_i) + λ (1/N_reg) Σ_i p*_i L_reg(t_i, t*_i)    (17)
where p*_i is the ground-truth label of anchor i (1 for a positive sample, 0 for a negative sample); {p_i} and {t_i} are the outputs of the cls layer and the reg layer, respectively; i denotes the index of the anchor box; j denotes the index of the convolutional layer; t_i denotes the predicted offset; and t*_i denotes the offset between the anchor box and the ground-truth box;
(513) The RPN network is trained continuously, and the value of the loss function is optimized to its minimum to obtain high-accuracy target candidate regions; the minimum loss function L_3({p_i}, {t_i}) of the conv3-3 layer and the corresponding minimum loss function L_5({p_i}, {t_i}) of the conv5-3 layer are computed according to equation (17);
(52) The minima of the two layers' loss functions are compared and the smaller loss function is selected; denoting the smaller of the two by L:
L = min{ L_3({p_i}, {t_i}), L_5({p_i}, {t_i}) }    (21)
Obtaining the L value finds the minimum loss function over the two convolutional layers, i.e., the candidate region with the highest accuracy, which is taken as the final optimized candidate region.
6. The optimized RPN network-based battlefield target detection method as claimed in claim 5, wherein step 6 is implemented according to the following steps:
(61) The target candidate regions optimized by the RPN network and the shared feature map of the corresponding convolutional layer are input into the detection network, and region features are extracted in the ROI pooling layer;
(62) The region features are input into the subsequent softmax layer, and class discrimination and regression correction of the target boundary are performed for each target candidate region; the target loss functions in the detection network comprise a target-classification loss function and a position-boundary loss function, the loss function of the network being:
L′ = L′_cls + λ L′_reg    (22)
where L′ denotes the detection-network loss function; the target-classification loss function L′_cls is:
L′_cls = -(1/|S+|) Σ_{i∈S+} log p_i - (1/|S-|) Σ_{i∈S-} log(1 - p_i)    (23)
where |S+| denotes the number of positive samples and |S-| the number of negative samples. The boundary-regression loss function is the same as the RPN network's boundary loss function; the loss function is continuously optimized so that target-class discrimination and bounding-box regression are continuously corrected, yielding optimized target-classification and boundary-regression values;
(63) Through training on a large number of data sets, the parameters of the network are continuously adjusted by gradient descent, finally minimizing the total loss of the network. The total loss function, denoted L*, i.e., the total loss function of the network, is:
L* = L + L′    (24)
The network is trained until L* reaches its minimum;
(64) Target detection is performed on the test data set using the trained detection network.
CN201910965047.4A 2019-10-11 2019-10-11 Battlefield target detection method based on optimized RPN (Region Proposal Network) Active CN110766058B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910965047.4A CN110766058B (en) 2019-10-11 2019-10-11 Battlefield target detection method based on optimized RPN (Region Proposal Network)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910965047.4A CN110766058B (en) 2019-10-11 2019-10-11 Battlefield target detection method based on optimized RPN (Region Proposal Network)

Publications (2)

Publication Number Publication Date
CN110766058A CN110766058A (en) 2020-02-07
CN110766058B true CN110766058B (en) 2023-04-18

Family

ID=69331874

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910965047.4A Active CN110766058B (en) 2019-10-11 2019-10-11 Battlefield target detection method based on optimized RPN (resilient packet network)

Country Status (1)

Country Link
CN (1) CN110766058B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111898560B (en) * 2020-08-03 2023-08-01 华南理工大学 Classification regression feature decoupling method in target detection
CN111985367A (en) * 2020-08-07 2020-11-24 湖南大学 Pedestrian re-recognition feature extraction method based on multi-scale feature fusion
CN112287977B (en) * 2020-10-06 2024-02-09 武汉大学 Target detection method based on bounding box key point distance
CN112417981B (en) * 2020-10-28 2024-04-26 大连交通大学 Efficient recognition method for complex battlefield environment targets based on improved FasterR-CNN
CN112347895A (en) * 2020-11-02 2021-02-09 北京观微科技有限公司 Ship remote sensing target detection method based on boundary optimization neural network
CN112419310B (en) * 2020-12-08 2023-07-07 中国电子科技集团公司第二十研究所 Target detection method based on cross fusion frame optimization
CN113780270A (en) * 2021-03-23 2021-12-10 京东鲲鹏(江苏)科技有限公司 Target detection method and device
CN112927229B (en) * 2021-04-21 2022-09-30 中国人民解放军陆军装甲兵学院 Tank armored vehicle and three-dimensional detection method for elastic hole damage thereof

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107368845B (en) * 2017-06-15 2020-09-22 华南理工大学 Optimized candidate region-based Faster R-CNN target detection method
CN107451602A (en) * 2017-07-06 2017-12-08 浙江工业大学 A kind of fruits and vegetables detection method based on deep learning
CN110096933B (en) * 2018-01-30 2023-07-18 华为技术有限公司 Target detection method, device and system

Also Published As

Publication number Publication date
CN110766058A (en) 2020-02-07

Similar Documents

Publication Publication Date Title
CN110766058B (en) Battlefield target detection method based on optimized RPN (Region Proposal Network)
CN113065558B (en) Lightweight small target detection method combined with attention mechanism
CN109241913B (en) Ship detection method and system combining significance detection and deep learning
CN110222215B (en) Crop pest detection method based on F-SSD-IV3
CN109086799A (en) A kind of crop leaf disease recognition method based on improvement convolutional neural networks model AlexNet
CN108564085B (en) Method for automatically reading of pointer type instrument
CN111652321A (en) Offshore ship detection method based on improved YOLOV3 algorithm
CN111242878B (en) Mine image enhancement method based on cuckoo search
CN110569796A (en) Method for dynamically detecting lane line and fitting lane boundary
CN108428220A (en) Satellite sequence remote sensing image sea island reef region automatic geometric correction method
CN111242026B (en) Remote sensing image target detection method based on spatial hierarchy perception module and metric learning
CN112396619B (en) Small particle segmentation method based on semantic segmentation and internally complex composition
CN111144234A (en) Video SAR target detection method based on deep learning
Sun et al. Wheat head counting in the wild by an augmented feature pyramid networks-based convolutional neural network
CN109543585A (en) Underwater optics object detection and recognition method based on convolutional neural networks
CN112258490A (en) Low-emissivity coating intelligent damage detection method based on optical and infrared image fusion
CN109558803B (en) SAR target identification method based on convolutional neural network and NP criterion
CN114549909A (en) Pseudo label remote sensing image scene classification method based on self-adaptive threshold
CN112417981A (en) Complex battlefield environment target efficient identification method based on improved FasterR-CNN
CN108615240B (en) Non-parametric Bayesian over-segmentation method combining neighborhood information and distance weight
CN113902044B (en) Image target extraction method based on lightweight YOLOV3
CN116129292A (en) Infrared vehicle target detection method and system based on few sample augmentation
CN116189160A (en) Infrared dim target detection method based on local contrast mechanism
CN113627240B (en) Unmanned aerial vehicle tree species identification method based on improved SSD learning model
CN115100068A (en) Infrared image correction method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20230823

Address after: Room 8448, 2nd Floor, Building 4, Free Trade Industrial Park, No. 2168 Zhenghe Fourth Road, Fengdong New City, Xixian New District, Xi'an City, Shaanxi Province, 710086

Patentee after: Xi'an Keduoduo Information Technology Co.,Ltd.

Address before: 710032 No. 2 Xuefu Middle Road, Weiyang District, Xi'an City, Shaanxi Province

Patentee before: XI'AN TECHNOLOGICAL University