CN110766058A - Battlefield target detection method based on optimized RPN (Region Proposal Network) - Google Patents
Battlefield target detection method based on optimized RPN (Region Proposal Network)
- Publication number
- CN110766058A CN201910965047.4A
- Authority
- CN
- China
- Prior art keywords
- target
- network
- layer
- frame
- candidate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to a battlefield target detection method based on an optimized RPN (Region Proposal Network), which comprises the following steps: 1. constructing a tank armor target data set and labeling the tank armor targets in the training data set and the test data set respectively; 2. initializing the VGG-16 network with a model pre-trained on the ImageNet data set; 3. generating shared feature maps; 4. obtaining target candidate regions of different sizes and proportions; 5. obtaining candidate regions through the RPN, calculating the errors between the candidate regions obtained on the two convolutional layer feature maps and the ground-truth box respectively, selecting the candidate boxes with the minimum error, and finally selecting the candidate regions of higher accuracy from them as the optimized target candidate regions; 6. completing the judgment of the target class and the regression correction of the target bounding box. The invention effectively improves the effectiveness of candidate-region extraction for small targets and occluded targets, and thereby improves the precision of battlefield target detection.
Description
Technical Field
The invention belongs to the technical field of battlefield target detection, and particularly relates to a battlefield target detection method based on an optimized RPN (Region Proposal Network).
Background
At present, the detection of tank armor targets on the battlefield still relies on manual discovery and aiming to calibrate the target, followed by target tracking to achieve accurate striking. Accurate detection of battlefield targets is a necessary premise for realizing precise attack on them. The complexity of the battlefield environment and the automatic detection of battlefield targets are the difficult problems in realizing an intelligent tank and armored-vehicle combat system. In battlefield target detection research, the methods of recent years mainly fall into: detection algorithms based on hand-crafted models; detection methods based on R-CNN; detection methods based on Fast R-CNN; and detection methods based on Faster R-CNN.
Hand-crafted model methods only use information such as the color histogram and texture features of the image; they lack deep feature abstraction, do not describe the target accurately, and therefore generate inaccurate candidate boxes and fail to reach ideal detection results. The R-CNN detection method uses the Selective Search (SS) method to extract candidate boxes, which is computationally expensive and time-consuming, and the extracted candidate boxes are all scaled to a uniform size, so the original image information is easily lost during feature extraction. The Fast R-CNN algorithm extracts candidate boxes in the same way as R-CNN, so proposal extraction is still time-consuming, network training is not end-to-end, and the quality of the candidate boxes cannot be improved by back-propagation. The Faster R-CNN algorithm uses the RPN to extract candidate boxes, which solves the time-consumption problem, but in battlefield target detection its accuracy on small targets and occluded targets is low: feature extraction is not comprehensive enough and affects the candidate boxes, causing missed or inaccurate detection of small and occluded targets. Because of differences in the field of view, targets of different scales and sizes appear, which greatly affects detection precision, and small targets in particular suffer from low accuracy. Factors such as target occlusion, dust and smoke also affect candidate-box extraction, resulting in low target detection accuracy.
Disclosure of Invention
The invention aims to provide a battlefield target detection method based on an optimized RPN (Region Proposal Network), which solves the problems that candidate-box extraction is inaccurate and that detection of small targets and occluded targets is inaccurate.
The technical scheme adopted by the invention is as follows:
a battlefield target detection method based on an optimized RPN network is implemented according to the following steps:
step 1: constructing a tank armor target data set conforming to the PASCAL VOC data set format, and labeling the tank armor targets in the training data set and the test data set respectively;
step 2: initializing the VGG-16 network with a model pre-trained on the ImageNet data set;
step 3: inputting the training data set of target images into the convolutional neural network, extracting target features on the conv3-3 and conv5-3 convolutional layers respectively, and generating shared feature maps;
step 4: sliding windows of different sizes over the shared feature maps generated by the two convolutional layers using the candidate-region extraction network to obtain target candidate regions of different sizes and proportions;
step 5: obtaining candidate regions through the RPN, calculating the errors between the candidate regions obtained on the two convolutional layer feature maps and the ground-truth box respectively, selecting the candidate boxes with the minimum error, and finally selecting candidate regions of higher accuracy from them as the optimized target candidate regions;
step 6: sending the optimized target candidate regions obtained in step 5 and the shared feature map of the corresponding convolutional layer into the detection network to complete the target class judgment and the regression correction of the target bounding box.
Further, step 1 is specifically implemented according to the following steps:
(11) firstly, the size of the target object is judged. Let the larger of the width and height of the target object be denoted as P_max pixels; the targets are divided into three classes according to their size in the field of view, with the classification criteria being:
(12) labeling the targets of different sizes: the tanks and armored vehicles in the training data set and the test data set are labeled respectively.
Further, step 3 is specifically implemented according to the following steps:
(31) the convolutional neural network is the backbone of the target detection network; it extracts the target features and generates the shared convolutional layers, where the input of the l-th convolutional layer is:
Z^l = W^l X^(l-1) + b^l   (2)
(32) the output of the l-th layer convolution is:
X^l = f(Z^l) = f(W^l X^(l-1) + b^l)   (3)
(33) the total error of the convolutional layer is:
continuously optimizing parameters W and b of the neural network by a gradient descent method;
(34) taking the gradients of equation (5) with respect to the parameters W and b respectively gives:
where ⊙ represents the inner product of two vectors and ∇ represents the derivation symbol; the values of the parameters W and b are then calculated;
(35) the parameters of the network are continuously adjusted so that the target features are extracted more accurately; the output of the convolutional layer, i.e. the target features, is obtained by applying the activation function of equation (3) to the convolution result; the target features are connected through a fully connected layer to form a shared feature map, and shared feature maps are obtained from the conv3-3 and conv5-3 convolutional layers respectively.
Further, step 4 is specifically implemented according to the following steps:
(41) setting different sliding windows on the shared feature maps of the conv3-3 and conv5-3 convolutional layers respectively: the RPN sets a sliding window of size 5 × 5 on the conv3-3 layer, and sliding windows of sizes 7 × 7 and 9 × 9 on the conv5-3 layer;
(42) placing anchor boxes of different scales and proportions on the sliding windows, so that W × H × k anchor boxes are obtained;
(43) generating region candidate boxes through the sliding windows, wherein the candidate regions generated by the conv3-3 and conv5-3 layers are denoted proposal region 1 and proposal region 2 respectively; the features in each sliding window are mapped to corresponding low-dimensional features, which are passed through a ReLU activation function to obtain a vector; the vector is fed into two convolutional layers, namely a candidate-region classification layer (cls) and a candidate-region position regression layer (reg); the cls layer outputs the probability that each candidate region is a target, with 2k outputs, and the reg layer outputs the position regression coordinates of the k boxes, with 4k outputs.
Further, step 5 is specifically implemented according to the following steps:
(51) respectively calculating the error values between all target candidate boxes and the ground-truth boxes for the conv3-3 and conv5-3 convolutional layers, i.e. the minimum value of the loss function, which is specifically implemented as follows:
(511) the bounding-box regression between the prediction box and the anchor box over the 4 coordinates:
the bounding-box regression between the anchor box and the ground-truth box over the 4 coordinates:
wherein: x, y, w and h represent the center coordinates and the width and height of a box; x, x_a and x* represent the x-coordinates of the prediction box, the anchor box and the ground-truth box respectively; y, y_a and y* represent the corresponding y-coordinates; w, w_a and w* represent the widths of the prediction box, the anchor box and the ground-truth box respectively; and h, h_a and h* represent their heights;
(512) the loss function is used to judge the error between the candidate box and the ground-truth box; the network parameters are continuously adjusted during training by gradient descent, and with the same loss function for the conv3-3 and conv5-3 layers, the minimum loss function is defined as:
where {p_i} and {t_i} are the outputs of the cls layer and the reg layer respectively; i represents the index of the anchor box; j represents the index of the convolutional layer; t_i represents the predicted offset; t_i* represents the offset between the anchor box and the ground-truth box;
(513) continuously training the RPN (Region Proposal Network) and optimizing the value of the loss function to its minimum to obtain target candidate regions of high accuracy; the minimum loss function L_3({p_i},{t_i}) of the conv3-3 layer and the corresponding minimum loss function L_5({p_i},{t_i}) of the conv5-3 layer are calculated according to equation (17);
(52) comparing the minimum values of the loss functions of the two layers and selecting the smaller one; the minimum of the two loss functions is denoted by L and expressed as:
L = min{ L_3({p_i},{t_i}), L_5({p_i},{t_i}) }   (21)
the candidate region corresponding to L, i.e. the minimum of the loss functions on the two convolutional layers, is the finally optimized candidate region, namely the candidate region with the highest accuracy.
Further, step 6 is specifically implemented according to the following steps:
(61) inputting the target candidate region after RPN network optimization and the shared feature map of the corresponding convolution layer into a detection network, and extracting region features at the ROI pooling layer;
(62) inputting the region features into the subsequent softmax layer, and performing class judgment and regression correction of the target boundary on each target candidate region; the detection network also has a target loss function, comprising a target classification loss function and a position boundary loss function, and the loss function of the detection network is:
where L′ represents the detection network loss function, and the target classification loss function L′_cls is:
where |S+| represents the number of positive samples and |S-| represents the number of negative samples; the bounding-box regression loss function is the same as the RPN boundary loss function; the loss function is continuously optimized so that the target classification judgment and the bounding-box regression are continuously corrected, giving the optimized target classification and boundary regression values;
(63) through training on a large number of data sets, the parameters of the network are continuously adjusted by gradient descent until the total loss of the network is minimized; the total loss function is denoted by L*, i.e. the total loss function of the network is:
L* = L + L′   (24)
the network is trained until L* reaches its minimum value;
(64) carrying out target detection on the test data set using the trained detection network.
Compared with the prior art, the invention has the advantages that:
the improved RPN is used for screening the candidate regions, the optimized candidate regions are selected, the generation of invalid candidate regions by the RPN is reduced, the effectiveness of extracting the candidate regions from small targets and targets with shielding influence is effectively improved, and the precision of battlefield target detection is further improved. Because the convolution conv3-3 and conv5-3 layers respectively generate the shared characteristic graph and are combined with the optimized RPN network for use, the invention has high detection precision on small targets and occlusion targets.
Description of the drawings:
FIG. 1 is an overall framework of the present invention;
fig. 2 is an RPN network structure of the present invention.
The specific implementation mode is as follows:
the present invention will be described in detail below with reference to the drawings and examples.
Referring to fig. 1, a battlefield target detection method based on an optimized RPN network is specifically implemented according to the following steps:
step 1: constructing a tank armor target data set conforming to the PASCAL VOC data set format, and respectively labeling the tank armor targets on the training data set and the test data set, wherein the method specifically comprises the following steps:
(11) first, the size of the target object is determined. Let the larger of the width and height of the target object be denoted as P_max pixels; the targets can then be divided into three classes according to their size in the field of view, with the classification criteria being:
(12) the targets of different sizes are labeled: the tanks and armored vehicles in the training data set and the test data set are labeled respectively;
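As an illustration of this size-based grouping, the following sketch classifies a labeled box by P_max = max(width, height). The patent's classification table (equation (1)) is not reproduced in the text above, so the pixel thresholds SMALL_MAX and MEDIUM_MAX below are assumed placeholders, not values from the invention.

```python
# Hypothetical thresholds: the patent defines three size classes via P_max,
# but the concrete pixel limits are not given in the text above.
SMALL_MAX = 32    # assumed upper bound (pixels) for "small" targets
MEDIUM_MAX = 96   # assumed upper bound (pixels) for "medium" targets

def classify_target(width: int, height: int) -> str:
    """Classify a labeled tank/armored-vehicle box by P_max = max(width, height)."""
    p_max = max(width, height)
    if p_max <= SMALL_MAX:
        return "small"
    if p_max <= MEDIUM_MAX:
        return "medium"
    return "large"

print(classify_target(28, 20), classify_target(150, 90))  # small large
```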
step 2: the VGG-16 network is initialized with a model pre-trained on the ImageNet data set;
step 3: inputting the training data set of target images into the convolutional neural network, extracting target features on the conv3-3 and conv5-3 convolutional layers respectively, and generating shared feature maps, which is specifically implemented as follows:
(31) the convolutional neural network is the backbone of the target detection network; it extracts the target features and generates the shared convolutional layers, where the input of the l-th convolutional layer is:
Z^l = W^l X^(l-1) + b^l   (2)
where Z, W and X are all matrices; l denotes the l-th convolutional layer; Z^l denotes the input of the l-th convolution; W^l denotes the weights from layer l-1 to layer l; X^(l-1) denotes the output of the (l-1)-th convolution; and b^l denotes the bias of the l-th layer;
(32) the output of the l-th layer convolution is:
X^l = f(Z^l) = f(W^l X^(l-1) + b^l)   (3)
where f(·) denotes the activation function;
(33) the total error of the convolutional layer is:
where ||x||_2 denotes the 2-norm of x; the parameters W and b of the neural network are continuously optimized by gradient descent, and substituting equation (3) into equation (4) gives:
(34) taking the gradients of equation (5) with respect to the parameters W and b respectively gives:
where ⊙ represents the inner product of two vectors and ∇ represents the derivation symbol; the values of the parameters W and b are solved as follows:
(341) let:
then substituting equation (8) into equations (6) and (7) respectively gives:
(342) to obtain the parameters W and b, only δ^l needs to be found; according to the chain rule of differentiation:
according to the formula (2):
differentiating equation (12) gives:
substituting equation (13) into equation (11) gives:
having found δ^l, the gradients of the parameters W and b are obtained by substituting it into equations (9) and (10).
(35) The parameters of the network are continuously adjusted so that the target features are extracted more accurately; the output of the convolutional layer, i.e. the target features, is obtained by applying the activation function of equation (3) to the convolution result; the target features are connected through a fully connected layer to form a shared feature map, and shared feature maps are obtained from the conv3-3 and conv5-3 convolutional layers respectively.
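A minimal sketch of steps 2 and 3, assuming a PyTorch/torchvision environment: an ImageNet-pretrained VGG-16 is loaded and the conv3-3 and conv5-3 feature maps that the two RPN branches share are read out. The slicing indices follow torchvision's VGG-16 `features` layout and are not part of the patent.

```python
import torch
import torchvision

# Load VGG-16 initialized with ImageNet-pretrained weights (step 2).
vgg = torchvision.models.vgg16(pretrained=True).features.eval()

# In torchvision's layout, index 15 is the ReLU after conv3_3 and
# index 29 is the ReLU after conv5_3.
conv_to_3_3 = vgg[:16]    # conv1_1 ... conv3_3 + ReLU
conv_to_5_3 = vgg[16:30]  # pool3 ... conv5_3 + ReLU

image = torch.randn(1, 3, 600, 800)   # placeholder input image tensor
with torch.no_grad():
    feat3 = conv_to_3_3(image)        # shared feature map from conv3-3
    feat5 = conv_to_5_3(feat3)        # shared feature map from conv5-3
print(feat3.shape, feat5.shape)       # (1, 256, 150, 200) and (1, 512, 37, 50)
```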
Referring to fig. 2, in the improved RPN structure, different sliding windows are set on the feature maps of the conv3-3 and conv5-3 convolutional layers to obtain candidate regions; the error between each layer's candidate regions and the ground-truth box is calculated, the smaller value is selected, and a candidate region of high accuracy is then chosen from it as the optimized candidate region. This is specifically implemented in steps 4 and 5 below:
step 4: sliding windows of different sizes over the shared feature maps generated by the two convolutional layers using the region proposal network (RPN) to obtain target candidate regions of different sizes and proportions, which is specifically implemented as follows:
(41) different sliding windows are set on the shared feature maps of the conv3-3 and conv5-3 convolutional layers: a 5 × 5 sliding window on the conv3-3 layer, and 7 × 7 and 9 × 9 sliding windows on the conv5-3 layer. The conv3-3 and conv5-3 layers are chosen to obtain the convolutional features of the target image because layer 3 mainly learns the texture features of the target while layer 5 learns the distinctive features and the overall features of the target object; setting a 5 × 5 sliding window on the lower convolutional layer facilitates feature extraction for small targets and occluded targets;
(42) anchor boxes of different scales and proportions are placed on the sliding windows. The invention designs anchor boxes of sizes 512², 256², 128² and 64², each with three aspect ratios, namely 1:2, 2:1 and 1:1, giving 12 anchor boxes at each pixel position. Denoting the number of anchor boxes per pixel by k, a convolutional feature map of size W × H yields W × H × k anchor boxes;
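The anchor layout of item (42) can be sketched as follows: four scales (512, 256, 128 and 64) times three aspect ratios (1:2, 2:1 and 1:1) give k = 12 anchors per feature-map position, so a W × H feature map yields W × H × k anchors. Centering the anchors on the feature-map stride is an assumption of this sketch, not something stated in the text.

```python
import numpy as np

SCALES = [512, 256, 128, 64]          # anchor side lengths from the patent
RATIOS = [(1, 2), (2, 1), (1, 1)]     # width:height ratios 1:2, 2:1, 1:1

def anchors_at(cx: float, cy: float) -> np.ndarray:
    """Return the 12 anchors (x1, y1, x2, y2) centered at (cx, cy)."""
    boxes = []
    for s in SCALES:
        for rw, rh in RATIOS:
            w = s * np.sqrt(rw / rh)  # keep the area near s*s while applying the ratio
            h = s * np.sqrt(rh / rw)
            boxes.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return np.asarray(boxes)

def all_anchors(feat_w: int, feat_h: int, stride: int) -> np.ndarray:
    """Anchors for every position of a W x H feature map (stride in image pixels)."""
    return np.concatenate([anchors_at((x + 0.5) * stride, (y + 0.5) * stride)
                           for y in range(feat_h) for x in range(feat_w)])

print(all_anchors(50, 37, 16).shape)  # (50 * 37 * 12, 4)
```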
(43) generating region candidate boxes through the sliding windows, wherein the candidate regions generated by the conv3-3 layer and the conv5-3 layer are denoted proposal region 1 and proposal region 2 respectively; the features in each sliding window are mapped to corresponding low-dimensional features, which are passed through a ReLU activation function to obtain a vector; the vector is fed into two convolutional layers, namely a candidate-region classification layer (cls) and a candidate-region position regression layer (reg); the cls layer outputs the probability that each candidate region is a target, with 2k outputs, and the reg layer outputs the position regression coordinates of the k boxes, with 4k outputs, yielding target candidate regions containing position regression coordinates and class judgments.
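Item (43) corresponds to a small convolutional head per shared feature map: an n × n sliding window followed by two 1 × 1 convolutions producing 2k classification outputs and 4k regression outputs per position. The PyTorch sketch below is illustrative; the intermediate channel width of 512 is an assumption, while the 2k/4k output sizes and the window sizes come from the text.

```python
import torch
import torch.nn as nn

class RPNHead(nn.Module):
    """Sliding-window RPN head with cls (2k) and reg (4k) outputs per position."""
    def __init__(self, in_channels: int = 512, k: int = 12, window: int = 5):
        super().__init__()
        self.sliding = nn.Conv2d(in_channels, 512, window, padding=window // 2)
        self.cls = nn.Conv2d(512, 2 * k, 1)   # objectness scores, 2k per position
        self.reg = nn.Conv2d(512, 4 * k, 1)   # box regression coordinates, 4k per position

    def forward(self, feat: torch.Tensor):
        x = torch.relu(self.sliding(feat))    # low-dimensional window feature
        return self.cls(x), self.reg(x)

# One head per branch: a 5x5 window on conv3-3 (256 channels in VGG-16),
# and 7x7 / 9x9 windows on conv5-3 (512 channels).
head3 = RPNHead(in_channels=256, window=5)
head5 = RPNHead(in_channels=512, window=7)
```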
step 5: obtaining candidate regions through the RPN, calculating the errors between the candidate regions obtained on the two convolutional layer feature maps and the ground-truth box respectively, selecting the candidate boxes with the minimum error, and finally selecting candidate regions of higher accuracy from them as the optimized target candidate regions, which is specifically implemented as follows:
(51) calculating the error values between all target candidate boxes and the ground-truth boxes for the conv3-3 and conv5-3 convolutional layers respectively, i.e. the minimum value of the loss function;
the step (51) is specifically implemented according to the following steps:
(511) the bounding-box regression between the prediction box and the anchor box over the 4 coordinates:
the bounding-box regression between the anchor box and the ground-truth box over the 4 coordinates:
wherein: x, y, w and h represent the center coordinates and the width and height of a box; x, x_a and x* represent the x-coordinates of the prediction box, the anchor box and the ground-truth box respectively; y, y_a and y* represent the corresponding y-coordinates; w, w_a and w* represent the widths of the prediction box, the anchor box and the ground-truth box respectively; and h, h_a and h* represent their heights;
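Equations (15) and (16) are not reproduced in the text above; the sketch below uses the standard Faster R-CNN offset parameterisation, which is consistent with the symbol definitions just given but is an assumption rather than a verbatim copy of the patent's formulas.

```python
import numpy as np

def box_to_offsets(box, anchor):
    """Offsets (tx, ty, tw, th) of a box relative to an anchor.

    box and anchor are (center_x, center_y, width, height); applying this to the
    prediction box gives t_i, and to the ground-truth box gives t_i*.
    """
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    return np.array([(x - xa) / wa,
                     (y - ya) / ha,
                     np.log(w / wa),
                     np.log(h / ha)])

print(box_to_offsets((110, 100, 64, 32), (100, 100, 64, 64)))
```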
(512) the loss function is used to judge the error between the candidate box and the ground-truth box; the network parameters are continuously adjusted during training by gradient descent, and with the same loss function for the conv3-3 and conv5-3 layers, the minimum loss function is defined as:
where {p_i} and {t_i} are the outputs of the cls layer and the reg layer respectively; i represents the index of the anchor box; j represents the index of the convolutional layer; t_i represents the predicted offset; t_i* represents the offset between the anchor box and the ground-truth box; the classification loss function is:
the regression loss function is:
wherein the R function is defined as:
The two terms of equation (17) are normalized by N_cls for the classification loss term and N_reg for the regression loss term, and weighted by a balance parameter λ; by default, N_cls is set to 512 and N_reg is set to the number of anchor boxes;
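A sketch of the per-layer RPN loss of item (512): a classification term and a smooth-L1 regression term, normalized by N_cls and N_reg and balanced by λ, evaluated once per convolutional layer j so that L_3 and L_5 can be compared. The exact forms of equations (17)-(20) are not reproduced in the text above, so this follows the usual Faster R-CNN formulation and is an assumption.

```python
import torch
import torch.nn.functional as F

def rpn_loss(cls_scores, labels, reg_pred, reg_target, n_cls=512, lam=1.0):
    """RPN loss for one convolutional layer.

    cls_scores: (N, 2) objectness logits; labels: (N,) long tensor of 0/1;
    reg_pred / reg_target: (N, 4) offsets; regression uses positive anchors only.
    """
    cls_loss = F.cross_entropy(cls_scores, labels, reduction="sum") / n_cls
    pos = labels == 1
    n_reg = max(int(labels.numel()), 1)   # N_reg: number of anchor boxes
    reg_loss = F.smooth_l1_loss(reg_pred[pos], reg_target[pos],
                                reduction="sum") / n_reg
    return cls_loss + lam * reg_loss
```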
(513) continuously training the RPN (Region Proposal Network) and optimizing the value of the loss function to its minimum to obtain target candidate regions of high accuracy; the minimum loss function L_3({p_i},{t_i}) of the conv3-3 layer and the corresponding minimum loss function L_5({p_i},{t_i}) of the conv5-3 layer are calculated according to equation (17);
(52) comparing the minimum values of the loss functions of the two layers and selecting the smaller one, which corresponds to the finally optimized candidate region; the minimum of the two is denoted by L and expressed as:
L = min{ L_3({p_i},{t_i}), L_5({p_i},{t_i}) }   (21)
the candidate region corresponding to L, i.e. the minimum of the loss functions on the two convolutional layers, is the finally optimized candidate region, namely the candidate region with the highest accuracy.
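Equation (21) then amounts to keeping the proposals of whichever layer achieves the smaller RPN loss, e.g.:

```python
def select_optimized_proposals(loss3, proposals3, loss5, proposals5):
    """Return (L, proposals) from the layer with the smaller RPN loss, per equation (21)."""
    if loss3 <= loss5:
        return loss3, proposals3
    return loss5, proposals5
```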
step 6: sending the optimized target candidate regions obtained in step 5 and the shared feature map of the corresponding convolutional layer into the detection network, which consists of an ROI pooling layer and a target classification and regression layer; region features are extracted for the optimized candidate regions, the network parameters are continuously adjusted, and the test data set is input into the detection network to complete the target class judgment and the regression correction of the target bounding box, which is specifically implemented as follows:
(61) inputting the target candidate region after RPN network optimization and the shared feature map of the corresponding convolution layer into a detection network, and extracting region features at the ROI pooling layer;
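A minimal sketch of the ROI pooling in (61), using torchvision's `roi_pool` as a stand-in for the patent's ROI pooling layer; the feature-map shape, the example region and the stride-16 `spatial_scale` are illustrative assumptions.

```python
import torch
import torchvision

feat5 = torch.randn(1, 512, 37, 50)                    # shared feature map (assumed shape)
rois = torch.tensor([[0, 48.0, 64.0, 320.0, 240.0]])   # (batch_index, x1, y1, x2, y2) in image coords
region_feats = torchvision.ops.roi_pool(feat5, rois, output_size=(7, 7),
                                        spatial_scale=1.0 / 16)
print(region_feats.shape)  # (1, 512, 7, 7), fed to the classification/regression head
```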
(62) inputting the region features into the subsequent softmax layer, and performing class judgment and regression correction of the target boundary on each target candidate region. The detection network also has a target loss function, which likewise comprises a target classification loss function and a position boundary loss function. The loss function of the detection network is:
where L′ represents the detection network loss function, and the target classification loss function L′_cls is:
where |S+| represents the number of positive samples and |S-| represents the number of negative samples. The bounding-box regression loss function is the same as the RPN boundary loss function; the loss function is continuously optimized so that the target classification judgment and the bounding-box regression are continuously corrected, giving the optimized target classification and boundary regression values;
(63) through training on a large number of data sets, the parameters of the network are continuously adjusted by gradient descent until the total loss of the network is minimized; the total loss function is denoted by L*, i.e. the total loss function of the network is:
L* = L + L′   (24)
the network is trained until L* reaches its minimum value;
(64) carrying out target detection on the test data set using the trained detection network.
It will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principles of the invention and these are intended to be within the scope of the invention.
Claims (6)
1. A battlefield target detection method based on an optimized RPN network is characterized by being implemented according to the following steps:
step 1: constructing a tank armor target data set conforming to the PASCAL VOC data set format, and labeling the tank armor targets in the training data set and the test data set respectively;
step 2: initializing the VGG-16 network with a model pre-trained on the ImageNet data set;
step 3: inputting the training data set of target images into the convolutional neural network, extracting target features on the conv3-3 and conv5-3 convolutional layers respectively, and generating shared feature maps;
step 4: sliding windows of different sizes over the shared feature maps generated by the two convolutional layers using the candidate-region extraction network to obtain target candidate regions of different sizes and proportions;
step 5: obtaining candidate regions through the RPN, calculating the errors between the candidate regions obtained on the two convolutional layer feature maps and the ground-truth box respectively, selecting the candidate boxes with the minimum error, and finally selecting candidate regions of higher accuracy from them as the optimized target candidate regions;
step 6: sending the optimized target candidate regions obtained in step 5 and the shared feature map of the corresponding convolutional layer into the detection network to complete the target class judgment and the regression correction of the target bounding box.
2. The optimized RPN network-based battlefield target detection method as claimed in claim 1, wherein step 1 is specifically implemented according to the following steps:
(11) firstly, the size of the target object is judged. Let the larger of the width and height of the target object be denoted as P_max pixels; the targets are divided into three classes according to their size in the field of view, with the classification criteria being:
(12) labeling the targets of different sizes: the tanks and armored vehicles in the training data set and the test data set are labeled respectively.
3. The method for detecting the battlefield target based on the optimized RPN network as claimed in claim 1 or 2, wherein step 3 is implemented according to the following steps:
(31) the convolutional neural network is the backbone of the target detection network; it extracts the target features and generates the shared convolutional layers, where the input of the l-th convolutional layer is:
Z^l = W^l X^(l-1) + b^l   (2)
(32) the output of the l-th layer convolution is:
X^l = f(Z^l) = f(W^l X^(l-1) + b^l)   (3)
(33) the total error of the convolutional layer is:
continuously optimizing parameters W and b of the neural network by a gradient descent method;
(34) taking the gradients of equation (5) with respect to the parameters W and b respectively gives:
where ⊙ represents the inner product of two vectors and ∇ represents the derivation symbol; the values of the parameters W and b are then calculated;
(35) the parameters of the network are continuously adjusted so that the target features are extracted more accurately; the output of the convolutional layer, i.e. the target features, is obtained by applying the activation function of equation (3) to the convolution result; the target features are connected through a fully connected layer to form a shared feature map, and shared feature maps are obtained from the conv3-3 and conv5-3 convolutional layers respectively.
4. The optimized RPN network-based battlefield target detection method as claimed in claim 3, wherein step 4 is specifically implemented according to the following steps:
(41) setting different sliding windows on the shared feature maps of the conv3-3 and conv5-3 convolutional layers respectively: the RPN sets a sliding window of size 5 × 5 on the conv3-3 layer, and sliding windows of sizes 7 × 7 and 9 × 9 on the conv5-3 layer;
(42) placing anchor boxes of different scales and proportions on the sliding windows, so that W × H × k anchor boxes are obtained;
(43) generating region candidate boxes through the sliding windows, wherein the candidate regions generated by the conv3-3 and conv5-3 layers are denoted proposal region 1 and proposal region 2 respectively; the features in each sliding window are mapped to corresponding low-dimensional features, which are passed through a ReLU activation function to obtain a vector; the vector is fed into two convolutional layers, namely a candidate-region classification layer (cls) and a candidate-region position regression layer (reg); the cls layer outputs the probability that each candidate region is a target, with 2k outputs, and the reg layer outputs the position regression coordinates of the k boxes, with 4k outputs.
5. The optimized RPN network-based battlefield target detection method as claimed in claim 4, wherein step 5 is specifically implemented according to the following steps:
(51) respectively calculating the error values between all target candidate boxes and the ground-truth boxes for the conv3-3 and conv5-3 convolutional layers, i.e. the minimum value of the loss function, which is specifically implemented as follows:
(511) the bounding-box regression between the prediction box and the anchor box over the 4 coordinates:
the bounding-box regression between the anchor box and the ground-truth box over the 4 coordinates:
wherein: x, y, w and h represent the center coordinates and the width and height of a box; x, x_a and x* represent the x-coordinates of the prediction box, the anchor box and the ground-truth box respectively; y, y_a and y* represent the corresponding y-coordinates; w, w_a and w* represent the widths of the prediction box, the anchor box and the ground-truth box respectively; and h, h_a and h* represent their heights;
(512) the loss function is used to judge the error between the candidate box and the ground-truth box; the network parameters are continuously adjusted during training by gradient descent, and with the same loss function for the conv3-3 and conv5-3 layers, the minimum loss function is defined as:
where {p_i} and {t_i} are the outputs of the cls layer and the reg layer respectively; i represents the index of the anchor box; j represents the index of the convolutional layer; t_i represents the predicted offset; t_i* represents the offset between the anchor box and the ground-truth box;
(513) continuously training the RPN (Region Proposal Network) and optimizing the value of the loss function to its minimum to obtain target candidate regions of high accuracy; the minimum loss function L_3({p_i},{t_i}) of the conv3-3 layer and the corresponding minimum loss function L_5({p_i},{t_i}) of the conv5-3 layer are calculated according to equation (17);
(52) comparing the minimum values of the loss functions of the two layers and selecting the smaller one; the minimum of the two loss functions is denoted by L and expressed as:
L = min{ L_3({p_i},{t_i}), L_5({p_i},{t_i}) }   (21)
the candidate region corresponding to L, i.e. the minimum of the loss functions on the two convolutional layers, is the finally optimized candidate region, namely the candidate region with the highest accuracy.
6. The optimized RPN network-based battlefield target detection method as claimed in claim 5, wherein step 6 is implemented according to the following steps:
(61) inputting the target candidate region after RPN network optimization and the shared feature map of the corresponding convolution layer into a detection network, and extracting region features at the ROI pooling layer;
(62) inputting the region features into the subsequent softmax layer, and performing class judgment and regression correction of the target boundary on each target candidate region; the detection network also has a target loss function, comprising a target classification loss function and a position boundary loss function, and the loss function of the detection network is:
where L′ represents the detection network loss function, and the target classification loss function L′_cls is:
where |S+| represents the number of positive samples and |S-| represents the number of negative samples; the bounding-box regression loss function is the same as the RPN boundary loss function; the loss function is continuously optimized so that the target classification judgment and the bounding-box regression are continuously corrected, giving the optimized target classification and boundary regression values;
(63) through training on a large number of data sets, the parameters of the network are continuously adjusted by gradient descent until the total loss of the network is minimized; the total loss function is denoted by L*, i.e. the total loss function of the network is:
L* = L + L′   (24)
the network is trained until L* reaches its minimum value;
(64) carrying out target detection on the test data set using the trained detection network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910965047.4A CN110766058B (en) | 2019-10-11 | 2019-10-11 | Battlefield target detection method based on optimized RPN (Region Proposal Network) |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910965047.4A CN110766058B (en) | 2019-10-11 | 2019-10-11 | Battlefield target detection method based on optimized RPN (Region Proposal Network) |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110766058A true CN110766058A (en) | 2020-02-07 |
CN110766058B CN110766058B (en) | 2023-04-18 |
Family
ID=69331874
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910965047.4A Active CN110766058B (en) | 2019-10-11 | 2019-10-11 | Battlefield target detection method based on optimized RPN (resilient packet network) |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110766058B (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111898560A (en) * | 2020-08-03 | 2020-11-06 | 华南理工大学 | Classification regression feature decoupling method in target detection |
CN111985367A (en) * | 2020-08-07 | 2020-11-24 | 湖南大学 | Pedestrian re-recognition feature extraction method based on multi-scale feature fusion |
CN112287977A (en) * | 2020-10-06 | 2021-01-29 | 武汉大学 | Target detection method based on key point distance of bounding box |
CN112347895A (en) * | 2020-11-02 | 2021-02-09 | 北京观微科技有限公司 | Ship remote sensing target detection method based on boundary optimization neural network |
CN112419310A (en) * | 2020-12-08 | 2021-02-26 | 中国电子科技集团公司第二十研究所 | Target detection method based on intersection and fusion frame optimization |
CN112417981A (en) * | 2020-10-28 | 2021-02-26 | 大连交通大学 | Complex battlefield environment target efficient identification method based on improved Faster R-CNN |
CN112927229A (en) * | 2021-04-21 | 2021-06-08 | 中国人民解放军陆军装甲兵学院 | Tank armored vehicle and three-dimensional detection method for elastic hole damage thereof |
CN113780270A (en) * | 2021-03-23 | 2021-12-10 | 京东鲲鹏(江苏)科技有限公司 | Target detection method and device |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107368845A (en) * | 2017-06-15 | 2017-11-21 | 华南理工大学 | A kind of Faster R CNN object detection methods based on optimization candidate region |
CN107451602A (en) * | 2017-07-06 | 2017-12-08 | 浙江工业大学 | A kind of fruits and vegetables detection method based on deep learning |
WO2019149071A1 (en) * | 2018-01-30 | 2019-08-08 | 华为技术有限公司 | Target detection method, device, and system |
-
2019
- 2019-10-11 CN CN201910965047.4A patent/CN110766058B/en active Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107368845A (en) * | 2017-06-15 | 2017-11-21 | 华南理工大学 | A kind of Faster R CNN object detection methods based on optimization candidate region |
CN107451602A (en) * | 2017-07-06 | 2017-12-08 | 浙江工业大学 | A kind of fruits and vegetables detection method based on deep learning |
WO2019149071A1 (en) * | 2018-01-30 | 2019-08-08 | 华为技术有限公司 | Target detection method, device, and system |
Non-Patent Citations (1)
Title |
---|
WANG Quandong et al.: "Improved Faster R-CNN algorithm for multi-scale tank and armored vehicle target detection", Journal of Computer-Aided Design & Computer Graphics *
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111898560A (en) * | 2020-08-03 | 2020-11-06 | 华南理工大学 | Classification regression feature decoupling method in target detection |
CN111898560B (en) * | 2020-08-03 | 2023-08-01 | 华南理工大学 | Classification regression feature decoupling method in target detection |
CN111985367A (en) * | 2020-08-07 | 2020-11-24 | 湖南大学 | Pedestrian re-recognition feature extraction method based on multi-scale feature fusion |
CN112287977A (en) * | 2020-10-06 | 2021-01-29 | 武汉大学 | Target detection method based on key point distance of bounding box |
CN112287977B (en) * | 2020-10-06 | 2024-02-09 | 武汉大学 | Target detection method based on bounding box key point distance |
CN112417981A (en) * | 2020-10-28 | 2021-02-26 | 大连交通大学 | Complex battlefield environment target efficient identification method based on improved Faster R-CNN |
CN112417981B (en) * | 2020-10-28 | 2024-04-26 | 大连交通大学 | Efficient recognition method for complex battlefield environment targets based on improved Faster R-CNN |
CN112347895A (en) * | 2020-11-02 | 2021-02-09 | 北京观微科技有限公司 | Ship remote sensing target detection method based on boundary optimization neural network |
CN112419310A (en) * | 2020-12-08 | 2021-02-26 | 中国电子科技集团公司第二十研究所 | Target detection method based on intersection and fusion frame optimization |
CN112419310B (en) * | 2020-12-08 | 2023-07-07 | 中国电子科技集团公司第二十研究所 | Target detection method based on cross fusion frame optimization |
CN113780270A (en) * | 2021-03-23 | 2021-12-10 | 京东鲲鹏(江苏)科技有限公司 | Target detection method and device |
CN112927229A (en) * | 2021-04-21 | 2021-06-08 | 中国人民解放军陆军装甲兵学院 | Tank armored vehicle and three-dimensional detection method for elastic hole damage thereof |
Also Published As
Publication number | Publication date |
---|---|
CN110766058B (en) | 2023-04-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110766058B (en) | Battlefield target detection method based on optimized RPN (Region Proposal Network) | |
CN113065558B (en) | Lightweight small target detection method combined with attention mechanism | |
CN107563433B (en) | Infrared small target detection method based on convolutional neural network | |
CN111310862A (en) | Deep neural network license plate positioning method based on image enhancement in complex environment | |
CN111563893B (en) | Grading ring defect detection method, device, medium and equipment based on aerial image | |
CN109086799A (en) | A kind of crop leaf disease recognition method based on improvement convolutional neural networks model AlexNet | |
CN110222215B (en) | Crop pest detection method based on F-SSD-IV3 | |
CN108564085B (en) | Method for automatically reading of pointer type instrument | |
CN110569796A (en) | Method for dynamically detecting lane line and fitting lane boundary | |
CN108428220A (en) | Satellite sequence remote sensing image sea island reef region automatic geometric correction method | |
CN111242026B (en) | Remote sensing image target detection method based on spatial hierarchy perception module and metric learning | |
CN109993052B (en) | Scale-adaptive target tracking method and system under complex scene | |
CN112396619B (en) | Small particle segmentation method based on semantic segmentation and internally complex composition | |
CN107909053B (en) | Face detection method based on hierarchical learning cascade convolution neural network | |
CN110245587B (en) | Optical remote sensing image target detection method based on Bayesian transfer learning | |
CN109543585A (en) | Underwater optics object detection and recognition method based on convolutional neural networks | |
CN111144234A (en) | Video SAR target detection method based on deep learning | |
CN112258490A (en) | Low-emissivity coating intelligent damage detection method based on optical and infrared image fusion | |
CN112417981A (en) | Complex battlefield environment target efficient identification method based on improved Faster R-CNN | |
CN117152503A (en) | Remote sensing image cross-domain small sample classification method based on false tag uncertainty perception | |
CN109558803B (en) | SAR target identification method based on convolutional neural network and NP criterion | |
CN113627240B (en) | Unmanned aerial vehicle tree species identification method based on improved SSD learning model | |
CN114549909A (en) | Pseudo label remote sensing image scene classification method based on self-adaptive threshold | |
CN108615240B (en) | Non-parametric Bayesian over-segmentation method combining neighborhood information and distance weight | |
US20230386023A1 (en) | Method for detecting medical images, electronic device, and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | GR01 | Patent grant | |
 | TR01 | Transfer of patent right | Effective date of registration: 20230823; Address after: Room 8448, 2nd Floor, Building 4, Free Trade Industrial Park, No. 2168 Zhenghe Fourth Road, Fengdong New City, Xixian New District, Xi'an City, Shaanxi Province, 710086; Patentee after: Xi'an Keduoduo Information Technology Co.,Ltd.; Address before: 710032 No. 2 Xuefu Middle Road, Weiyang District, Xi'an City, Shaanxi Province; Patentee before: XI'AN TECHNOLOGICAL University |