CN108647573A - A military target recognition method based on deep learning
- Publication number
- CN108647573A, CN201810297353.0A
- Authority
- CN
- China
- Prior art keywords
- algorithm
- anchor point
- drpn
- dfcn
- point frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
Abstract
The invention discloses a military target recognition method based on deep learning, belonging to the field of image-based automatic target recognition. The invention can be applied in military fields such as large-scale military target retrieval, intelligent weaponry, and battlefield situation awareness. In conventional deep-learning target recognition networks each layer is connected only to the next, which limits feature representation capability; to address this, the method redesigns the algorithm model so that layers such as convolutions are densely connected. With dense connections the model reuses the features of every layer, which improves its average target recognition accuracy; the trained model is also smaller; and the model alleviates the vanishing-gradient and exploding-gradient problems.
Description
Technical field
The invention belongs to the field of image-based automatic target recognition, and in particular relates to a military target recognition method based on deep learning.
Background art
Under current conditions, networked joint operations are characterized by an integrated, multi-dimensional battlefield spanning land, sea, air, space, the electromagnetic spectrum, and cyberspace. Massive volumes of image and video data can be acquired through multiple platforms (ground, airborne, UAV-borne, shipborne, vehicle-mounted, spaceborne, surface, underwater, ship-towed, and database systems) and multi-source sensors (SAR/ISAR, infrared cameras, hyperspectral/multispectral/low-light/EO/visible-light sensors, sonar, laser, millimeter wave). These data sources exhibit the "5V+1C" characteristics: Volume, Variety, Velocity, Veracity, Value, and Complexity. It is therefore especially important to extract the required military target class and location information from this body of image and video data, which varies in type, acquisition time, and resolution, in order to provide intelligence support for command decisions.
Faced with observation imagery and video at the TB/PB scale, analysts are left "looking for a needle in a haystack": on the one hand there is too much data to process, and on the other hand the required targets cannot be found, so accurate judgments cannot be delivered in time and opportunities in battle are missed. Military applications urgently need an intelligent target recognition technology that analyzes massive image and video resources automatically and thereby provides an important basis for tactical decisions.
Military target recognition based on deep learning uses automated data processing to identify and classify target data in multi-source detection information. In recent years, the rapid development of big data, cloud computing, and artificial intelligence, the appearance of large-scale annotated datasets, and in particular breakthroughs in intelligent target recognition based on deep learning have strongly advanced image-based target recognition. Thanks to its powerful feature representation capability, deep learning has developed rapidly in pattern recognition and computer vision and has quickly replaced hand-crafted features built on prior knowledge. In particular, the successful application of convolutional neural networks (Convolutional Neural Network, CNN) to object recognition has greatly improved the precision of target classification. Under complex conditions such as varying scenes and varying resolutions, this approach still achieves higher accuracy and robustness than conventional methods.
Therefore, in view of the technical problems in the prior art, it is necessary to propose a technical solution that overcomes these defects.
Summary of the invention
In view of this, it is necessary to provide a military target recognition method based on deep learning that processes large amounts of video and image data more efficiently, thereby supporting military applications such as intelligent weaponry and battlefield situation awareness.
To solve the technical problems in the prior art, the technical solution of the present invention is as follows:
A military target recognition method based on deep learning, comprising the following steps:
Step (1): Generate as few sampling regions as possible, of high quality, with a region sampling algorithm based on dense convolutional neural networks (Dense connected Region Proposal Network, DRPN).
The DRPN algorithm has the following steps:
Step (1-1): The sampling algorithm model is named DRPN. Its input is an infrared or visible-light picture of arbitrary size, and its output is a set of sampling regions for each class. The model is a bottom-up network structure built by stacking multiple dense convolution blocks (Dense Block). Each layer of the model is a 4-dimensional matrix described by the tuple (n, c, h, w), where n is the number of pictures in a training batch, c is the number of channels of the layer, h is the height of the feature map (at the input, the height of the input picture), and w is the width of the feature map. The 4-dimensional matrix is transformed successively by operations such as convolution, pooling, normalization, and the rectified linear unit (Rectified Linear Units, ReLU) activation function. In single-scale training, input pictures of size (h, w) are uniformly resized to w = 600, h = 1000.
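The dense connectivity of step (1-1) can be illustrated with a minimal NumPy sketch. It is not the patent's network: the 1 × 1 convolution, the random weights, and the names `dense_block` and `growth_rate` are assumptions made for illustration. It only shows how every layer receives the concatenation of all earlier feature maps in the (n, c, h, w) layout.

```python
import numpy as np

def conv1x1_relu(x, weight):
    """1x1 convolution followed by ReLU; x is (n, c_in, h, w), weight is (c_out, c_in)."""
    y = np.einsum("oc,nchw->nohw", weight, x)
    return np.maximum(y, 0.0)

def dense_block(x, num_layers, growth_rate, rng):
    """Each layer sees the block input plus the outputs of all earlier layers."""
    features = [x]
    for _ in range(num_layers):
        stacked = np.concatenate(features, axis=1)  # reuse every earlier feature map
        # Random weights stand in for learned ones (assumption for illustration).
        weight = rng.standard_normal((growth_rate, stacked.shape[1])) * 0.1
        features.append(conv1x1_relu(stacked, weight))
    return np.concatenate(features, axis=1)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 16, 8, 8))  # (n, c, h, w)
out = dense_block(x, num_layers=4, growth_rate=12, rng=rng)
print(out.shape)
```

The channel count grows by `growth_rate` per layer while the spatial size stays fixed, which is why the block input is still present unchanged in the first channels of the output.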
Step (1-2): After the input picture passes through the dense convolutional network layers, multiple feature maps of size W × H are generated. Each pixel (neuron) of a feature map has a very wide receptive field (about 196 × 196). A feature map of size W × H (e.g. 60 × 40) is divided pixel by pixel into a W × H grid. At each grid pixel, k boxes of different sizes, called anchor boxes, are placed on the feature map, so a W × H feature map yields W × H × k anchor boxes. For a 60 × 40 feature map the algorithm generates 21600 sampling regions (k = 9). These sampling regions contain many foreground regions (containing a target) and background regions (containing no target), and most of them overlap heavily, so it is important to select from the W × H × k samples the anchor boxes that best represent the sample features. When training DRPN, anchor boxes are refined with non-maximum suppression (Non Maximum Suppression, NMS) and labeled as follows: (1) positive sample: an anchor box that has the largest overlap with a foreground region, or whose overlap with a foreground region exceeds 70%; (2) negative sample: an anchor box that is not a positive sample and whose overlap with every foreground region is below 30%. For each anchor box the algorithm outputs a confidence score that reflects the probability that the anchor box is a foreground region or a background region (the probability of a positive or negative sample). At the same time, for each anchor the algorithm predicts k regressors that correct the position coordinates; negative samples do not take part in regression. On the deep feature map, DRPN thus acts as a nonlinear regression from anchor boxes to ground-truth boxes.
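The anchor box construction and the positive/negative labeling rule of step (1-2) can be sketched as follows. The feature stride of 16 and the three scales and three aspect ratios (giving k = 9) are assumptions borrowed from common region proposal networks; the text only fixes the counts and the 70% / 30% overlap thresholds, and overlap is taken here to mean intersection over union.

```python
import numpy as np

def generate_anchors(feat_w, feat_h, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (feat_w * feat_h * k, 4) boxes as (x1, y1, x2, y2) in picture pixels."""
    shapes = []
    for s in scales:
        for r in ratios:
            # r is height / width; the area stays s * s.
            shapes.append((s * np.sqrt(1.0 / r), s * np.sqrt(r)))
    shapes = np.array(shapes)                                # (k, 2)
    cx = (np.arange(feat_w) + 0.5) * stride
    cy = (np.arange(feat_h) + 0.5) * stride
    cx, cy = np.meshgrid(cx, cy)                             # (feat_h, feat_w)
    centers = np.stack([cx.ravel(), cy.ravel()], axis=1)     # (W*H, 2)
    half = shapes[None, :, :] / 2.0                          # (1, k, 2)
    c = centers[:, None, :]                                  # (W*H, 1, 2)
    return np.concatenate([c - half, c + half], axis=2).reshape(-1, 4)

def iou_matrix(boxes, gt):
    """Pairwise intersection over union between boxes (N, 4) and gt (M, 4)."""
    x1 = np.maximum(boxes[:, None, 0], gt[None, :, 0])
    y1 = np.maximum(boxes[:, None, 1], gt[None, :, 1])
    x2 = np.minimum(boxes[:, None, 2], gt[None, :, 2])
    y2 = np.minimum(boxes[:, None, 3], gt[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    area_g = (gt[:, 2] - gt[:, 0]) * (gt[:, 3] - gt[:, 1])
    return inter / (area_b[:, None] + area_g[None, :] - inter)

def label_anchors(anchors, gt, pos_thresh=0.7, neg_thresh=0.3):
    """1 = positive, 0 = negative, -1 = neither (ignored in training)."""
    iou = iou_matrix(anchors, gt)
    best = iou.max(axis=1)
    labels = np.full(len(anchors), -1, dtype=np.int64)
    labels[best < neg_thresh] = 0
    labels[best > pos_thresh] = 1
    labels[iou.argmax(axis=0)] = 1  # the best anchor for each ground-truth box
    return labels

anchors = generate_anchors(60, 40)
gt = np.array([[100.0, 100.0, 356.0, 356.0]])  # one hypothetical ground-truth box
labels = label_anchors(anchors, gt)
print(anchors.shape, (labels == 1).sum(), (labels == 0).sum())
```

Anchors whose overlap falls between the two thresholds get neither label, which matches the rule that a negative sample must both fail the positive test and stay below 30% overlap.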
Step (1-3): To share the computation and storage of the dense convolutional network and to train and test end to end, DRPN is trained with a joint cost function. For each anchor box the dense region sampling algorithm must output the probability that it is a positive or a negative sample, so a multi-class softmax cost function is used. In DRPN the softmax reduces to a two-class cost function (it degenerates to the logistic regression cost function): for n anchor boxes the algorithm outputs 2 × n confidence scores. In the softmax loss, m denotes the batch size (e.g. 256) and k the number of softmax output units; here k = 2 for two-class classification. p_i denotes the predicted confidence score of anchor box i, and p_i* is its label: p_i* = 0 if the anchor box is a negative sample and p_i* = 1 if it is a positive sample. p_i* also gates the coordinate regression cost function: anchor boxes in background regions are excluded from coordinate regression during training (only foreground regions have coordinates worth correcting).
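The statement that a two-unit softmax degenerates to logistic regression can be checked numerically. The sketch below assumes the standard cross-entropy form for the classification cost, since the original formula images are not reproduced in this text; the logits and labels are made-up values.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cls_loss(p_fg, label):
    """Binary cross-entropy per anchor; label is 1 (positive) or 0 (negative)."""
    return -np.log(label * p_fg + (1 - label) * (1 - p_fg))

# Two scores per anchor (background, foreground): 2 x n outputs for n anchors.
logits = np.array([[0.2, 1.5], [2.0, -1.0], [0.0, 0.0]])
probs = softmax(logits)
p_fg = probs[:, 1]

# With k = 2 the foreground probability depends only on the score difference.
via_sigmoid = sigmoid(logits[:, 1] - logits[:, 0])

labels = np.array([1, 0, 1])
mean_loss = cls_loss(p_fg, labels).mean()
print(np.allclose(p_fg, via_sigmoid), mean_loss)
```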
A foreground anchor box only roughly indicates the coordinate position of a foreground region in the picture, so the algorithm performs coordinate regression on foreground regions as follows. G denotes the ground-truth box, P the anchor box, and the function F the mapping from an anchor box to the ground-truth box. The ground-truth box G is expressed by the tuple (G_x, G_y, G_w, G_h), where (G_x, G_y) are the coordinates of its center and (G_w, G_h) are its width and height; the anchor box P is expressed in the same way as (P_x, P_y, P_w, P_h). Thanks to the function approximation capability of deep learning, F need not be set by hand; it is learned through repeated training iterations, here by the DRPN algorithm. F_x(P), F_y(P), F_w(P), and F_h(P) are learned by the algorithm:
G = F(P)
G_x = P_w F_x(P) + P_x
G_y = P_h F_y(P) + P_y
G_w = P_w exp(F_w(P))
G_h = P_h exp(F_h(P))
F_*(P) denotes the corresponding mapping (* stands for x, y, w, h). With φ(P) denoting the feature map matrix learned by an intermediate layer of the convolutional neural network and w_* denoting the weights learned by the algorithm:
F_*(P) = w_*^T φ(P)
w_* is obtained by minimizing a regularized cost function, where λ is the regularization parameter (λ = 1000) and t_* is the regression target:
w_* = argmin_w Σ_i (t_*^i − w^T φ(P^i))^2 + λ ||w||^2
t_x = (G_x − P_x) / P_w
t_y = (G_y − P_y) / P_h
t_w = log(G_w / P_w)
t_h = log(G_h / P_h)
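The regression targets and the box mapping are inverses of each other, which a short sketch makes concrete. Boxes are given as (center x, center y, width, height); the function names `encode` and `decode` and the sample numbers are illustrative assumptions.

```python
import numpy as np

def encode(P, G):
    """Regression targets (t_x, t_y, t_w, t_h) for anchor boxes P and ground-truth boxes G."""
    tx = (G[:, 0] - P[:, 0]) / P[:, 2]
    ty = (G[:, 1] - P[:, 1]) / P[:, 3]
    tw = np.log(G[:, 2] / P[:, 2])
    th = np.log(G[:, 3] / P[:, 3])
    return np.stack([tx, ty, tw, th], axis=1)

def decode(P, F):
    """Apply predicted offsets F to anchor boxes P to obtain corrected boxes."""
    gx = P[:, 2] * F[:, 0] + P[:, 0]
    gy = P[:, 3] * F[:, 1] + P[:, 1]
    gw = P[:, 2] * np.exp(F[:, 2])
    gh = P[:, 3] * np.exp(F[:, 3])
    return np.stack([gx, gy, gw, gh], axis=1)

P = np.array([[100.0, 100.0, 50.0, 80.0]])   # anchor box
G = np.array([[110.0, 90.0, 60.0, 60.0]])    # ground-truth box
t = encode(P, G)
recovered = decode(P, t)
print(t, recovered)
```

Dividing the center shift by the anchor size and taking the logarithm of the size ratio keeps the targets scale-invariant, so a single regressor can serve anchors of very different sizes.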
Step (1-4): After the cost functions for classification and for region sampling have been set separately, a joint cost function computes the classification loss and the position loss of the sampling regions together, so that the algorithm can be trained end to end. L_cls and L_reg denote the cost functions for classification and for anchor box regression respectively, N_cls is the number of anchor boxes chosen in one training step (e.g. 256), N_reg is the size of the feature map from which anchor boxes are chosen (e.g. 2400), and λ is set to 10. With t_i denoting the predicted coordinate offsets of anchor box i and t_i* the corresponding regression targets, the formula is as follows:
L({p_i}, {t_i}) = (1 / N_cls) Σ_i L_cls(p_i, p_i*) + λ (1 / N_reg) Σ_i p_i* L_reg(t_i, t_i*)
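A minimal sketch of the joint cost follows. The text does not state the form of L_reg, so the smooth L1 penalty used here is an assumption, as is the small `eps` added for numerical safety; N_cls = 256, N_reg = 2400, and λ = 10 are the example values from the text.

```python
import numpy as np

def smooth_l1(x):
    """Assumed regression penalty: quadratic near zero, linear elsewhere."""
    ax = np.abs(x)
    return np.where(ax < 1.0, 0.5 * x ** 2, ax - 0.5)

def joint_loss(p, p_star, t, t_star, n_cls=256, n_reg=2400, lam=10.0):
    """Classification loss over all anchors plus regression loss over positives only."""
    eps = 1e-12
    l_cls = -np.log(p_star * p + (1 - p_star) * (1 - p) + eps)
    l_reg = smooth_l1(t - t_star).sum(axis=1)
    # p_star gates the regression term: negative anchors contribute nothing.
    return l_cls.sum() / n_cls + lam * (p_star * l_reg).sum() / n_reg

p = np.array([0.9, 0.1])            # predicted foreground probability
p_star = np.array([1.0, 0.0])       # labels: one positive, one negative anchor
t_star = np.zeros((2, 4))
t_good = np.zeros((2, 4))
t_bad_negative = t_good.copy()
t_bad_negative[1] = 5.0             # large error, but on the negative anchor

loss_a = joint_loss(p, p_star, t_good, t_star)
loss_b = joint_loss(p, p_star, t_bad_negative, t_star)
print(loss_a, loss_b)
```

The two losses are equal because the regression error sits on a negative anchor, which is exactly the gating behavior described for background regions.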
When training DRPN, positive and negative samples are imbalanced: negative anchor boxes (negative samples) far outnumber positive anchor boxes (positive samples), so the learned weights would be biased toward negative samples. When DRPN is trained with mini-batch stochastic gradient descent, 256 anchor boxes are chosen for each training step with the ratio of foreground to background anchor boxes held at 1:1; when there are too few positive anchor boxes, the batch is filled with background anchor boxes. For a feature map of size W × H (e.g. 60 × 40), DRPN generates W × H × k anchor boxes, but a considerable number of them extend beyond the boundary of the original picture when mapped back to it. Experiments show that DRPN does not converge if these anchor boxes are left unhandled, so in the sample processing stage the algorithm discards every anchor box that crosses the picture boundary. Compared with the 2000 sampling regions generated by the earlier selective-search sampling algorithm, DRPN produces more sampling regions (≈ 6000 × 9), and these are highly redundant: for the same military target the algorithm generates many sampling regions that overlap heavily in space. Running classification and position correction on all of these redundant regions would be time-consuming and laborious, so refining them is particularly important. A simple and efficient method is adopted: only the foreground and background sampling regions whose confidence score exceeds a threshold are kept (at recognition time the background is treated as a separate class). Experiments show that a threshold of 0.7 yields about 2000 sampling regions. These 2000 high-confidence sampling regions are then processed further by the DFCN algorithm, as described in the next section.
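The three filtering rules in this paragraph (discard cross-boundary anchor boxes, draw a 256-anchor batch at a 1:1 ratio with background fill, keep regions scoring above 0.7) can be sketched as below. The function names and the toy inputs are assumptions for illustration.

```python
import numpy as np

def inside_image(anchors, img_w, img_h):
    """Mask of anchors (x1, y1, x2, y2) that lie entirely inside the picture."""
    return ((anchors[:, 0] >= 0) & (anchors[:, 1] >= 0) &
            (anchors[:, 2] <= img_w) & (anchors[:, 3] <= img_h))

def sample_minibatch(labels, batch_size=256, rng=None):
    """Pick up to batch_size / 2 positives and fill the rest with negatives."""
    rng = np.random.default_rng() if rng is None else rng
    pos = np.flatnonzero(labels == 1)
    neg = np.flatnonzero(labels == 0)
    n_pos = min(len(pos), batch_size // 2)
    n_neg = min(len(neg), batch_size - n_pos)  # background fills any shortfall
    pos = rng.choice(pos, size=n_pos, replace=False)
    neg = rng.choice(neg, size=n_neg, replace=False)
    return pos, neg

def keep_confident(scores, threshold=0.7):
    """Indices of sampling regions whose confidence score exceeds the threshold."""
    return np.flatnonzero(scores > threshold)

boxes = np.array([[-1.0, 0.0, 10.0, 10.0], [0.0, 0.0, 10.0, 10.0]])
mask = inside_image(boxes, img_w=10, img_h=10)

labels = np.zeros(1010, dtype=np.int64)
labels[:10] = 1                      # only 10 positives are available
pos, neg = sample_minibatch(labels, rng=np.random.default_rng(0))

kept = keep_confident(np.array([0.95, 0.5, 0.71, 0.7]))
print(mask, len(pos), len(neg), kept)
```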
Step (2): Classify the sampling regions generated by DRPN with a fast region classification algorithm based on dense convolutional neural networks (Dense connected Fast Classification Network, DFCN).
The DFCN algorithm is as follows:
Step (2-1): The convolutional layers of the deep-learning-based DFCN classification algorithm are composed of densely connected convolution blocks: (1) the input of DFCN is a picture containing military targets of various classes (e.g. soldiers, tanks); (2) DFCN extracts target features on the deep feature map for classification; (3) DFCN has a special region-of-interest pooling layer (Region of Interest Pooling Layer, ROIP) that normalizes the input features to a uniform scale; (4) while training a classifier, DFCN also performs regression correction on the position coordinates. Military targets such as soldiers and tanks differ in size in the input picture, so after being mapped proportionally onto the feature map their feature matrices differ in dimension, whereas the fully connected layer requires input matrices of the same dimension. DFCN therefore needs a transformation that normalizes feature matrices of different dimensions to the same dimension, and it uses a normalization algorithm named ROIP for this purpose. For a feature map of size m × n, assume that the fully connected layer requires an input feature map of dimension w × h; the formula is as follows:
In the formula, K_w and K_h denote the computed pooling kernel sizes, S_w and S_h denote the strides of the pooling operation, ⌈·⌉ denotes rounding up, ⌊·⌋ denotes rounding down, and pad is the edge-padding term. In theory, feature matrices of arbitrary dimension can be produced by setting different kernel sizes and strides (multiple windows); this normalization method is known as spatial pyramid pooling (Spatial Pyramid Pooling, SPP). In practice ROIP uses only one group of windows, and the algorithm normalizes feature map matrices of differing dimensions to a single dimension (7 × 7).
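The effect of ROIP, mapping a region of any size m × n to a fixed 7 × 7 grid, can be sketched with max pooling over bins whose edges are found by rounding down and up. This bin rule is an assumption standing in for the kernel, stride, and padding formula, which is not reproduced in this text.

```python
import numpy as np

def roi_max_pool(feature, out_h=7, out_w=7):
    """Max-pool a (c, m, n) region of interest to a fixed (c, out_h, out_w) grid."""
    c, m, n = feature.shape
    out = np.empty((c, out_h, out_w), dtype=feature.dtype)
    for i in range(out_h):
        r0 = int(np.floor(i * m / out_h))
        r1 = max(int(np.ceil((i + 1) * m / out_h)), r0 + 1)  # at least one row
        for j in range(out_w):
            c0 = int(np.floor(j * n / out_w))
            c1 = max(int(np.ceil((j + 1) * n / out_w)), c0 + 1)
            out[:, i, j] = feature[:, r0:r1, c0:c1].max(axis=(1, 2))
    return out

rng = np.random.default_rng(0)
tank = rng.standard_normal((8, 13, 21))     # hypothetical large target region
soldier = rng.standard_normal((8, 5, 3))    # hypothetical small target region
pooled_tank = roi_max_pool(tank)
pooled_soldier = roi_max_pool(soldier)
print(pooled_tank.shape, pooled_soldier.shape)
```

Both regions come out as 8 × 7 × 7, so the fully connected layer always receives an input of the same dimension regardless of target size.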
Step (3): To let DRPN and DFCN share convolutional-layer features, this patent provides two joint training methods:
Step (3-1): End-to-end training. DRPN and DFCN are treated as one unified whole and trained with mini-batch stochastic gradient descent (Mini-batch Stochastic Gradient Descent, MSGD). In the forward pass, the sampling regions generated by DRPN are used directly to train DFCN; in the backward pass, the gradients of DRPN and DFCN are back-propagated layer by layer. This is repeated for many iterations until the algorithm converges.
Step (3-2): Stepwise training of DRFCN, as follows:
Step (3-2-1): Train DRPN with the MSGD algorithm; the densely connected convolution modules are initialized with pretrained weights.
Step (3-2-2): Train DFCN on the sampling regions generated by the DRPN of step (3-2-1); the densely connected convolution modules are initialized with pretrained weights.
Step (3-2-3): Initialize DRPN with the densely connected convolutional layers of the DFCN from step (3-2-2), keep the weights of the densely connected part fixed, and fine-tune only the layers exclusive to DRPN. At this point DRPN and DFCN share convolutional layers.
Step (3-2-4): Keep the weights of the densely connected convolution blocks fixed and train DFCN on the sampling regions generated by the DRPN of step (3-2-3); this step fine-tunes only the layers exclusive to DFCN.
Step (3-2-5): The algorithm converges and training ends.
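The stepwise schedule of step (3-2) can be written down as plain data to make explicit which parameters are updated in each step. The group names `backbone`, `drpn_head`, and `dfcn_head` are hypothetical labels for the densely connected convolution part and the layers exclusive to each network; this is a bookkeeping sketch, not a training loop.

```python
# Hypothetical parameter-group names: "backbone" is the shared densely connected
# convolution part; "drpn_head" and "dfcn_head" are the layers exclusive to each network.
STAGES = [
    {"step": "3-2-1", "train": "DRPN", "backbone_init": "pretrained",
     "trainable": {"backbone", "drpn_head"}},
    {"step": "3-2-2", "train": "DFCN", "backbone_init": "pretrained",
     "trainable": {"backbone", "dfcn_head"}},
    {"step": "3-2-3", "train": "DRPN", "backbone_init": "DFCN from 3-2-2",
     "trainable": {"drpn_head"}},
    {"step": "3-2-4", "train": "DFCN", "backbone_init": "shared, unchanged",
     "trainable": {"dfcn_head"}},
]

def steps_with_frozen_backbone(stages):
    """Steps in which the densely connected convolution weights are kept fixed."""
    return [s["step"] for s in stages if "backbone" not in s["trainable"]]

frozen = steps_with_frozen_backbone(STAGES)
for stage in STAGES:
    print(stage["step"], stage["train"], sorted(stage["trainable"]))
```

From step (3-2-3) onward the backbone is frozen, which is what makes the two networks end up sharing the same convolutional layers.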
Compared with the prior art, the present invention has the following beneficial effects. It detects targets in video in real time and replaces manual processing of video data in military target recognition tasks. Unlike conventional deep network models, in which layers are connected in one direction only, the algorithm densely connects its convolution modules and thereby reuses the features of every layer in the deep network model, which greatly improves the model's feature representation capability. Experimental results show that DRFCN is significantly better than existing deep-learning-based target recognition algorithms in both average target recognition accuracy and model size. DRFCN is also markedly effective against vanishing and exploding gradients.
Description of the drawings
Fig. 1: Overall structure of the DRFCN algorithm.
Fig. 2: Overall structure of the DRPN algorithm.
Fig. 3: Data flow of the algorithm.
Fig. 4: Transformation between an anchor box and a ground-truth box.
Fig. 5: Structure of the DFCN algorithm.
The following specific embodiments further illustrate the present invention with reference to the above drawings.
Detailed description
The military target recognition method based on deep learning provided by the present invention is further described below with reference to the drawings.
To address the technical problems in the prior art, the present invention starts from the theory of intelligent military target recognition, combines it with the most advanced deep learning techniques for target detection, and proposes a target recognition method based on dense convolutional neural networks. The method can accurately detect military targets such as aircraft, tanks, warships, missiles, submarines, artillery, helicopters, rifles, and soldiers.
To solve the technical problems in the prior art, the present invention proposes a military target recognition method based on deep learning, DRFCN, shown in Fig. 1, which includes the following steps.
Step (1): Generate as few sampling regions as possible, of high quality, with a region sampling algorithm based on dense convolutional neural networks (Dense connected Region Proposal Network, DRPN).
The DRPN algorithm has the following steps:
(1-1): The DRPN algorithm model is shown in Fig. 2. Its input is an infrared or visible-light picture of arbitrary size, and its output is a set of sampling regions for each class. As shown in Fig. 3, the model is a bottom-up network structure built by stacking multiple dense convolution blocks (Dense Block). Each layer of the model is a 4-dimensional matrix described by the tuple (n, c, h, w), where n is the number of pictures in a training batch, c is the number of channels of the layer, h is the height of the feature map (at the input, the height of the input picture), and w is the width of the feature map. The 4-dimensional matrix is transformed successively by operations such as convolution, pooling, normalization, and the rectified linear unit (Rectified Linear Units, ReLU) activation function. In single-scale training, input pictures of size (h, w) are uniformly resized to w = 600, h = 1000.
(1-2): After the input picture passes through the dense convolutional network layers, multiple feature maps of size W × H are generated. Each pixel (neuron) of a feature map has a very wide receptive field (about 196 × 196). As shown in Fig. 4, a feature map of size W × H (e.g. 60 × 40) is divided pixel by pixel into a W × H grid. At each grid pixel, k boxes of different sizes, called anchor boxes, are placed on the feature map, so a W × H feature map yields W × H × k anchor boxes. For a 60 × 40 feature map the algorithm generates 21600 sampling regions (k = 9). These sampling regions contain many foreground regions (containing a target) and background regions (containing no target), and most of them overlap heavily, so it is important to select from the W × H × k samples the anchor boxes that best represent the sample features. When training DRPN, anchor boxes are refined with non-maximum suppression (Non Maximum Suppression, NMS) and labeled as follows: (1) positive sample: an anchor box that has the largest overlap with a foreground region, or whose overlap with a foreground region exceeds 70%; (2) negative sample: an anchor box that is not a positive sample and whose overlap with every foreground region is below 30%. For each anchor box the algorithm outputs a confidence score that reflects the probability that the anchor box is a foreground region or a background region (the probability of a positive or negative sample). At the same time, for each anchor the algorithm predicts k regressors that correct the position coordinates; negative samples do not take part in regression. On the deep feature map, DRPN thus acts as a nonlinear regression from anchor boxes to ground-truth boxes.
(1-3): To share the computation and storage of the dense convolutional network and to train and test end to end, DRPN is trained with a joint cost function. For each anchor box the dense region sampling algorithm must output the probability that it is a positive or a negative sample, so a multi-class softmax cost function is used. In DRPN the softmax reduces to a two-class cost function (it degenerates to the logistic regression cost function): for n anchor boxes the algorithm outputs 2 × n confidence scores. In the softmax loss, m denotes the batch size (e.g. 256) and k the number of softmax output units; here k = 2 for two-class classification. p_i denotes the predicted confidence score of anchor box i, and p_i* is its label: p_i* = 0 if the anchor box is a negative sample and p_i* = 1 if it is a positive sample. p_i* also gates the coordinate regression cost function: anchor boxes in background regions are excluded from coordinate regression during training (only foreground regions have coordinates worth correcting).
A foreground anchor box only roughly indicates the coordinate position of a foreground region in the picture, so the algorithm performs coordinate regression on foreground regions as follows. G denotes the ground-truth box, P the anchor box, and the function F the mapping from an anchor box to the ground-truth box. The ground-truth box G is expressed by the tuple (G_x, G_y, G_w, G_h), where (G_x, G_y) are the coordinates of its center and (G_w, G_h) are its width and height; the anchor box P is expressed in the same way as (P_x, P_y, P_w, P_h). Thanks to the function approximation capability of deep learning, F need not be set by hand; it is learned through repeated training iterations, here by the DRPN algorithm. F_x(P), F_y(P), F_w(P), and F_h(P) are learned by the algorithm:
G = F(P)
G_x = P_w F_x(P) + P_x
G_y = P_h F_y(P) + P_y
G_w = P_w exp(F_w(P))
G_h = P_h exp(F_h(P))
F_*(P) denotes the corresponding mapping (* stands for x, y, w, h). With φ(P) denoting the feature map matrix learned by an intermediate layer of the convolutional neural network and w_* denoting the weights learned by the algorithm:
F_*(P) = w_*^T φ(P)
w_* is obtained by minimizing a regularized cost function, where λ is the regularization parameter (λ = 1000) and t_* is the regression target:
w_* = argmin_w Σ_i (t_*^i − w^T φ(P^i))^2 + λ ||w||^2
t_x = (G_x − P_x) / P_w
t_y = (G_y − P_y) / P_h
t_w = log(G_w / P_w)
t_h = log(G_h / P_h)
(1-4): After the cost functions for classification and for region sampling have been set separately, a joint cost function computes the classification loss and the position loss of the sampling regions together, so that the algorithm can be trained end to end. L_cls and L_reg denote the cost functions for classification and for anchor box regression respectively, N_cls is the number of anchor boxes chosen in one training step (e.g. 256), N_reg is the size of the feature map from which anchor boxes are chosen (e.g. 2400), and λ is set to 10. With t_i denoting the predicted coordinate offsets of anchor box i and t_i* the corresponding regression targets, the formula is as follows:
L({p_i}, {t_i}) = (1 / N_cls) Σ_i L_cls(p_i, p_i*) + λ (1 / N_reg) Σ_i p_i* L_reg(t_i, t_i*)
When training DRPN, positive and negative samples are imbalanced: negative anchor boxes (negative samples) far outnumber positive anchor boxes (positive samples), so the learned weights would be biased toward negative samples. When DRPN is trained with mini-batch stochastic gradient descent, 256 anchor boxes are chosen for each training step with the ratio of foreground to background anchor boxes held at 1:1; when there are too few positive anchor boxes, the batch is filled with background anchor boxes. For a feature map of size W × H (e.g. 60 × 40), DRPN generates W × H × k anchor boxes, but a considerable number of them extend beyond the boundary of the original picture when mapped back to it. Experiments show that DRPN does not converge if these anchor boxes are left unhandled, so in the sample processing stage the algorithm discards every anchor box that crosses the picture boundary. Compared with the 2000 sampling regions generated by the earlier selective-search sampling algorithm, DRPN produces more sampling regions (≈ 6000 × 9), and these are highly redundant: for the same military target the algorithm generates many sampling regions that overlap heavily in space. Running classification and position correction on all of these redundant regions would be time-consuming and laborious, so refining them is particularly important. A simple and efficient method is adopted: only the foreground and background sampling regions whose confidence score exceeds a threshold are kept (at recognition time the background is treated as a separate class). Experiments show that a threshold of 0.7 yields about 2000 sampling regions. These 2000 high-confidence sampling regions are then processed further by the DFCN algorithm, as described in the next section.
Step (2): Classify the sampling regions generated by DRPN with a fast region classification algorithm based on dense convolutional neural networks (Dense connected Fast Classification Network, DFCN).
The DFCN algorithm is as follows:
(2-1), such as Fig. 5, the DFCN sorting algorithms convolutional layer based on deep learning is made of the convolution block of dense connection:
(1) DFCN inputs are a picture (such as soldier, tank) for containing all kinds of military targets;(2) DFCN is in profound spy
Target signature is extracted on sign figure to classify;(3) there are one special sampling area pond layer (Region for DFCN tools
OfInterest Pooling Layer, ROIP) for the feature normalization that will input to a uniformly a scale;(4) DFCN is being instructed
Also simultaneously position coordinates return while having practiced a grader and corrected.Input the military affairs such as soldier's tank in picture
Target sizes differ,
Proportional is mapped on characteristic pattern, and eigenmatrix dimension can be different, and full articulamentum needs identical dimensional
Input matrix.Therefore, DFCN algorithms need a kind of transformation that the eigenmatrix of different dimensions is normalized to identical dimensional.
DFCN uses a kind of normalization algorithm of entitled ROIP, for the characteristic pattern of m × n size, it is assumed that full articulamentum needs
The matrix dimensionality of input feature vector figure is w × h, formula is as follows::
K in above formulawAnd KhCalculative pond convolution kernel size, S are indicated respectivelywAnd ShIndicate the step-length of pondization operation,Expression rounds up,Indicate that downward rounding, pad are edge filling item.It theoretically can be by the way that different convolution kernels be arranged
This method for normalizing of eigenmatrix that size and step-length (multiple windows) generate arbitrary dimension is referred to as spatial pyramid pond
(Spatial Pyramid Pooling,SPP).Actually ROIP has only used a group window, the unified spy that dimension differs of algorithm
Figure matrix normalization is levied to a dimension size (7 × 7).
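The normalization to a fixed 7 × 7 output can be illustrated with a short sketch. The exact kernel, stride and padding formula of ROIP is not available in this text, so the code below assumes the bin layout commonly used for region-of-interest max pooling (bin start rounded down, bin end rounded up); the function name `roi_max_pool` and the toy feature maps are hypothetical.

```python
import math
import numpy as np

def roi_max_pool(feature, out_h=7, out_w=7):
    """Max-pool a 2-D feature map of any size (at least 1 x 1) to out_h x out_w.

    Assumed bin layout: bin i covers rows floor(i*m/out_h) .. ceil((i+1)*m/out_h),
    and likewise for columns, so every bin is non-empty.
    """
    feature = np.asarray(feature, dtype=float)
    m, n = feature.shape
    pooled = np.empty((out_h, out_w), dtype=float)
    for i in range(out_h):
        r0 = math.floor(i * m / out_h)
        r1 = max(math.ceil((i + 1) * m / out_h), r0 + 1)
        for j in range(out_w):
            c0 = math.floor(j * n / out_w)
            c1 = max(math.ceil((j + 1) * n / out_w), c0 + 1)
            pooled[i, j] = feature[r0:r1, c0:c1].max()
    return pooled

# Two regions of different size are normalized to the same 7 x 7 shape.
large = roi_max_pool(np.arange(13 * 20).reshape(13, 20))
small = roi_max_pool(np.arange(3 * 5).reshape(3, 5))
```

Both outputs have the same shape regardless of the input size, which is the property the fully connected layer relies on.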
Step (3): To let DRPN and DFCN share convolutional-layer features, this patent provides two joint training methods:
(3-1) End-to-end training. DRPN and DFCN are regarded as one unified whole. While the algorithm is trained with the mini-batch stochastic gradient descent algorithm (Mini-batch Stochastic Gradient Descent, MSGD), the sampling regions generated by DRPN in the forward pass are used directly to train DFCN, and in the backward pass the gradients of DRPN and DFCN are back-propagated in turn, until the algorithm converges after multiple iterations.
(3-2) The DRFCN stepwise training algorithm is as follows:
(3-2-1) Train DRPN with the MSGD algorithm; the densely connected convolution modules are initialized with pre-trained weights;
(3-2-2) Train DFCN using the sampling regions generated by the DRPN of step (3-2-1); the densely connected convolution modules are initialized with pre-trained weights;
(3-2-3) Initialize DRPN with the densely connected convolutional layers of the DFCN of step (3-2-2), keep the weights of the densely connected part fixed, and fine-tune only the layers exclusive to DRPN; at this point DRPN and DFCN share the convolutional layers;
(3-2-4) Keep the weights of the densely connected convolution blocks fixed and train DFCN using the sampling regions generated by the DRPN of step (3-2-3); this step fine-tunes only the layers exclusive to DFCN;
(3-2-5) The algorithm converges and training ends.
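The stepwise schedule of (3-2) can be summarized as data. The sketch below is only an illustrative summary under stated assumptions: the `Stage` class, its field names and the group labels are hypothetical, and it reads the references to "step (1)", "step (2)" and "step (3)" inside (3-2) as the preceding sub-steps of the stepwise algorithm.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Stage:
    name: str              # sub-step label from the description
    trains: str            # network optimized in this stage
    backbone_init: str     # where the dense-block weights come from
    backbone_frozen: bool  # True: only network-specific layers are fine-tuned
    proposals_from: str    # sub-step whose DRPN supplies sampling regions ("" if none)

SCHEDULE = [
    Stage("3-2-1", "DRPN", "pre-trained weights", False, ""),
    Stage("3-2-2", "DFCN", "pre-trained weights", False, "3-2-1"),
    Stage("3-2-3", "DRPN", "DFCN of 3-2-2", True, ""),
    Stage("3-2-4", "DFCN", "shared, unchanged", True, "3-2-3"),
]

def trainable_groups(stage):
    """List the parameter groups that receive gradient updates in a stage."""
    groups = [f"{stage.trains}-specific layers"]
    if not stage.backbone_frozen:
        groups.insert(0, "dense backbone")
    return groups

for stage in SCHEDULE:
    print(stage.name, stage.trains, trainable_groups(stage))
```

From the third stage onward the dense backbone is frozen, which is what leaves DRPN and DFCN with identical shared convolutional layers at the end.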
Compared with the prior art, the present invention has the following technical effects: (1) based on densely connected convolutional neural networks, an improved target recognition model is redesigned, which reduces the model size while maintaining recognition accuracy; (2) the algorithm can be applied to military target recognition; (3) the problems of vanishing gradients and exploding gradients are solved.
The above description of the embodiments is only intended to help understand the method of the present invention and its core idea. It should be noted that those of ordinary skill in the art may make improvements and modifications to the present invention without departing from its principle, and these improvements and modifications also fall within the protection scope of the claims of the present invention.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (3)
1. A military target recognition method based on deep learning, characterized by comprising the following steps:
Step (1): generating as few and as high-quality sampling regions as possible by a region sampling algorithm based on densely connected convolutional neural networks (Dense connected Region Proposal Network, DRPN);
Step (2): classifying and locating the sampling regions generated by DRPN by a fast region classification algorithm based on densely connected convolutional neural networks (Dense connected Fast Classification Network, DFCN);
Step (3): designing an algorithm model training method so that DRPN and DFCN share convolutional-layer features;
wherein step (1) redesigns the sampling algorithm model, with the following specific steps:
Step (1-1): the sampling algorithm model is named DRPN; the input of the DRPN algorithm is an infrared or visible-light picture of arbitrary size, and the output is multiple sampling regions corresponding to each class; the algorithm model is a bottom-up network structure formed by stacking multiple dense convolution blocks (Dense Block); each layer of the algorithm model is a 4-dimensional matrix represented by the tuple (n, c, h, w), where n denotes the number of pictures in a training batch, c denotes the number of channels of each layer, h denotes the height of the feature map (at the input, the height of the input picture), and w denotes the width of the feature map; the 4-dimensional matrix is transformed successively by operations such as convolution, pooling, normalization, and the rectified linear unit activation function (Rectified Linear Units, RELU); for single-scale training, the input picture of size (h, w) is uniformly resized to w=600, h=1000;
Step (1-2): after the input picture passes through the dense convolutional network layers, multiple feature maps of size W × H are generated; a feature map of size W × H is divided by pixel into W × H grid cells, and for each pixel of the grid, k anchor boxes of different sizes are taken on the feature map, so that one W × H feature map generates W × H × k anchor boxes; these sampling regions contain a large number of foreground regions (containing a target) and background regions (containing no target), and the anchor boxes that best represent the sample features are selected from the W × H × k samples; when training the DRPN algorithm, the anchor boxes are refined with the non-maximum suppression algorithm (Non Maximum Suppression, NMS); for each anchor box, the algorithm outputs a corresponding confidence score, which reflects the probability that the anchor box is a foreground region or a background region (the probability of a positive or negative sample); for each anchor point, the algorithm predicts k regressors for correcting the position coordinates, and negative samples do not take part in regression prediction; the DRPN algorithm thus amounts to a nonlinear regression, on the deep feature map, from anchor boxes to ground-truth boxes (the real target bounding boxes);
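The anchor generation and refinement of step (1-2) can be sketched as follows. This is a minimal illustration under stated assumptions: the anchor sizes, the feature stride of 16 pixels and the IoU threshold are hypothetical values not given in the text, and the greedy procedure shown is the standard form of non-maximum suppression rather than a confirmed detail of DRPN.

```python
import numpy as np

def make_anchors(W, H, sizes=(32, 64, 128), stride=16):
    """Return (W*H*k, 4) square anchor boxes [x1, y1, x2, y2], k = len(sizes)."""
    xs = (np.arange(W) + 0.5) * stride          # cell centres in image pixels
    ys = (np.arange(H) + 0.5) * stride
    cx, cy = np.meshgrid(xs, ys)                # each of shape (H, W)
    centres = np.stack([cx.ravel(), cy.ravel()], axis=1)[:, None, :]  # (W*H, 1, 2)
    half = np.asarray(sizes, dtype=float)[None, :, None] / 2.0        # (1, k, 1)
    boxes = np.concatenate([centres - half, centres + half], axis=2)  # (W*H, k, 4)
    return boxes.reshape(-1, 4)

def nms(boxes, scores, iou_threshold=0.7):
    """Greedy non-maximum suppression; returns indices of the kept boxes."""
    boxes = np.asarray(boxes, dtype=float)
    order = np.argsort(scores)[::-1]            # highest score first
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(int(i))
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= iou_threshold]      # drop boxes overlapping box i too much
    return keep

anchors = make_anchors(W=4, H=3)                # 4 * 3 * 3 = 36 anchor boxes
demo_boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
kept = nms(demo_boxes, scores=[0.9, 0.8, 0.7], iou_threshold=0.5)
```

In the toy example the second box overlaps the first with an IoU of about 0.68, so it is suppressed at a threshold of 0.5 and only the first and third boxes remain.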
Step (1-3): to share the computation and storage of the dense convolutional network and to achieve end-to-end training and testing, DRPN is trained with a joint cost function; for each anchor box, the dense region sampling algorithm needs to output the probability that the anchor box is a positive or negative sample, using the multi-class softmax cost function; in the DRPN algorithm, softmax takes the form of a two-class cost function (degenerating to the logistic regression cost function); for n anchor boxes the softmax loss cost function is as follows, where m denotes the batch size and k denotes the number of softmax output units, here k=2 for two-class classification; in the following formula, $p_i$ denotes the predicted confidence score of the anchor box; the ground-truth label $p_i^*$ is 0 if the anchor box is a negative sample and 1 if it is a positive sample; $p_i^*$ controls whether the coordinate regression cost function is executed: if the anchor box is a background region, no coordinate regression is performed during training (only foreground regions have coordinates to be corrected); the formula is as follows:
A foreground anchor box roughly indicates the coordinate position of a foreground region in a picture, and the algorithm performs coordinate regression on the foreground region, as shown in the following formulas: G denotes the ground-truth box, P denotes the anchor box, and the function F denotes the mapping from an anchor box to the ground-truth box; the ground-truth box G is represented by the tuple $(G_x, G_y, G_w, G_h)$, where $(G_x, G_y)$ denotes the center coordinates of the ground-truth box and $(G_w, G_h)$ denotes its width and height; thanks to the strong function-approximation capability of deep learning, F is learned through repeated training iterations of the deep learning algorithm and is obtained by the DRPN algorithm, as shown below: $F_x(P)$, $F_y(P)$, $F_w(P)$ and $F_h(P)$ must be learned by the algorithm, and $F_*(P)$ denotes the corresponding mapping (* stands for x, y, w, h); in the following formulas, $\phi(P)$ denotes the feature map matrix learned by an intermediate layer of the convolutional neural network, and $w_*$ denotes the weights learned by the algorithm; the formulas are as follows:
$$G = F(P)$$
$$G_x = P_w F_x(P) + P_x$$
$$G_y = P_h F_y(P) + P_y$$
$$G_w = P_w \exp(F_w(P))$$
$$G_h = P_h \exp(F_h(P))$$
$w_*$ is obtained by minimizing the cost function, λ is the regularization parameter, λ=1000, and $t_*$ is the regression target; the formulas are as follows:
$$t_x = (G_x - P_x)/P_w$$
$$t_y = (G_y - P_y)/P_h$$
$$t_w = \log(G_w/P_w)$$
$$t_h = \log(G_h/P_h)$$
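The relation between the regression targets t and the mapping from an anchor box to the ground-truth box can be checked numerically. The sketch below is illustrative only: the function names `encode` and `decode` and the sample boxes are hypothetical, and boxes are assumed to be given as (center x, center y, width, height) as in the tuple notation above.

```python
import numpy as np

def encode(P, G):
    """Regression targets (t_x, t_y, t_w, t_h) for anchor box P and ground-truth box G."""
    Px, Py, Pw, Ph = P
    Gx, Gy, Gw, Gh = G
    return np.array([(Gx - Px) / Pw,
                     (Gy - Py) / Ph,
                     np.log(Gw / Pw),
                     np.log(Gh / Ph)])

def decode(P, F):
    """Apply predicted offsets F = (F_x, F_y, F_w, F_h) to anchor box P."""
    Px, Py, Pw, Ph = P
    Fx, Fy, Fw, Fh = F
    return np.array([Pw * Fx + Px,
                     Ph * Fy + Py,
                     Pw * np.exp(Fw),
                     Ph * np.exp(Fh)])

P = (50.0, 40.0, 20.0, 10.0)   # anchor box: centre (50, 40), width 20, height 10
G = (56.0, 38.0, 30.0, 12.0)   # ground-truth box
t = encode(P, G)               # targets the regressors should output
G_rec = decode(P, t)           # decoding the targets recovers G
```

When the predicted offsets equal the targets, decoding reproduces the ground-truth box exactly, which is why $t_*$ serves as the object to be regressed.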
Step (1-4): after the cost functions of classification and of the region sampling algorithm have been set separately, a cost function is designed that jointly computes the classification loss (LOSS) and the position loss of the sampling regions, so that the algorithm is trained end to end; as shown in the following formula, a joint cost function is designed, in which $L_{cls}$ and $L_{reg}$ denote the cost functions of classification and of anchor-box regression respectively, $N_{cls}$ denotes the number of anchor boxes selected in one training pass, $N_{reg}$ denotes the size of the feature map from which anchor boxes are selected, and λ is set to 10; the formula is as follows:
When training the DRPN algorithm, only the foreground and background sampling regions whose confidence score exceeds a threshold are retained (during recognition, the background region is treated as a separate class).
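The joint cost function of step (1-4) can be illustrated with a short sketch. Since the formula itself is not available in this text, the code assumes the widely used form $L = \frac{1}{N_{cls}}\sum_i L_{cls}(p_i, p_i^*) + \lambda\,\frac{1}{N_{reg}}\sum_i p_i^*\,L_{reg}(t_i, t_i^*)$, with a two-class softmax loss for $L_{cls}$ and a smooth L1 loss for $L_{reg}$; the choice of smooth L1, the function names and the toy inputs are assumptions, while λ = 10 is taken from the claim.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Per-anchor two-class softmax loss; logits (N, 2), labels in {0, 1}."""
    z = logits - logits.max(axis=1, keepdims=True)             # numerical stability
    log_prob = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels]

def smooth_l1(x):
    """Smooth L1 penalty (assumed form of the regression cost)."""
    ax = np.abs(x)
    return np.where(ax < 1.0, 0.5 * x ** 2, ax - 0.5)

def joint_loss(logits, labels, t_pred, t_true, n_reg, lam=10.0):
    """Classification loss plus lam times the regression loss of positive anchors."""
    labels = np.asarray(labels)
    logits = np.asarray(logits, dtype=float)
    diff = np.asarray(t_pred, dtype=float) - np.asarray(t_true, dtype=float)
    l_cls = softmax_cross_entropy(logits, labels).sum() / len(labels)  # N_cls = batch size
    l_reg = (labels * smooth_l1(diff).sum(axis=1)).sum() / n_reg       # label 0 switches it off
    return l_cls + lam * l_reg

logits = np.zeros((2, 2))          # undecided classifier: each anchor costs log 2
labels = [1, 0]                    # one foreground anchor, one background anchor
t_true = np.zeros((2, 4))
loss_exact = joint_loss(logits, labels, t_true, t_true, n_reg=2)
# A regression error on the background anchor must not change the loss.
t_bad_bg = np.array([[0.0, 0.0, 0.0, 0.0], [5.0, 5.0, 5.0, 5.0]])
loss_bg_err = joint_loss(logits, labels, t_bad_bg, t_true, n_reg=2)
```

The second evaluation shows the gating role of the label: an arbitrarily wrong box prediction on a background anchor leaves the loss unchanged.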
2. The military target recognition method based on deep learning according to claim 1, characterized in that the classification algorithm model is redesigned: the sampling regions generated by DRPN are classified and located by a fast region classification algorithm based on densely connected convolutional neural networks (Dense connected Fast Classification Network, DFCN);
the DFCN algorithm is as follows:
Step (2-1): the convolutional layers of the deep-learning-based DFCN classification algorithm are composed of densely connected convolution blocks: (1) the input of DFCN is a picture containing various military targets; (2) DFCN extracts target features on the deep feature map for classification; (3) DFCN has a special region-of-interest pooling layer (Region Of Interest Pooling Layer, ROIP) that normalizes the input features to a single uniform scale; (4) while training a classifier, DFCN also corrects the position coordinates by regression; military targets in the input picture are mapped proportionally onto the feature map, and the DFCN algorithm applies a transformation that normalizes feature matrices of different dimensions to the same dimension; DFCN uses the ROIP normalization algorithm: for a feature map of size m × n, assuming the fully connected layer requires an input feature map of dimension w × h, the formula is as follows:
In the above formula, $K_w$ and $K_h$ denote the computed pooling kernel sizes, $S_w$ and $S_h$ denote the strides of the pooling operation, $\lceil\cdot\rceil$ denotes rounding up, $\lfloor\cdot\rfloor$ denotes rounding down, and pad is the edge padding term.
3. The military target recognition method based on deep learning according to claim 1, characterized in that a joint training method for DRPN and DFCN is proposed, and step (3) further comprises the following steps:
Step (3-1): end-to-end training, in which DRPN and DFCN are regarded as one unified whole; while the algorithm is trained with the mini-batch stochastic gradient descent algorithm (Mini-batch Stochastic Gradient Descent, MSGD), the sampling regions generated by DRPN in the forward pass are used directly to train DFCN, and in the backward pass the gradients of DRPN and DFCN are back-propagated in turn, until the algorithm converges after multiple iterations;
Step (3-2): the DRFCN stepwise training algorithm, specifically as follows:
Step (3-2-1): train DRPN with the MSGD algorithm; the densely connected convolution modules are initialized with pre-trained weights;
Step (3-2-2): train DFCN using the sampling regions generated by the DRPN of step (3-2-1); the densely connected convolution modules are initialized with pre-trained weights;
Step (3-2-3): initialize DRPN with the densely connected convolutional layers of the DFCN of step (3-2-2), keep the weights of the densely connected part fixed, and fine-tune only the layers exclusive to DRPN; at this point DRPN and DFCN share the convolutional layers;
Step (3-2-4): keep the weights of the densely connected convolution blocks fixed and train DFCN using the sampling regions generated by the DRPN of step (3-2-3); this step fine-tunes only the layers exclusive to DFCN;
Step (3-2-5): the algorithm converges and training ends.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810297353.0A CN108647573A (en) | 2018-04-04 | 2018-04-04 | A kind of military target recognition methods based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108647573A (en) | 2018-10-12 |
Family
ID=63745642
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810297353.0A Pending CN108647573A (en) | 2018-04-04 | 2018-04-04 | A kind of military target recognition methods based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108647573A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150117760A1 (en) * | 2013-10-30 | 2015-04-30 | Nec Laboratories America, Inc. | Regionlets with Shift Invariant Neural Patterns for Object Detection |
US20170124432A1 (en) * | 2015-11-03 | 2017-05-04 | Baidu Usa Llc | Systems and methods for attention-based configurable convolutional neural networks (abc-cnn) for visual question answering |
CN107688850A (en) * | 2017-08-08 | 2018-02-13 | 北京深鉴科技有限公司 | A kind of deep neural network compression method |
CN107609193A (en) * | 2017-10-16 | 2018-01-19 | 杭州时间线信息科技有限公司 | The intelligent automatic processing method and system of picture in a kind of suitable commodity details page |
CN107818302A (en) * | 2017-10-20 | 2018-03-20 | 中国科学院光电技术研究所 | Non-rigid multiple dimensioned object detecting method based on convolutional neural networks |
Non-Patent Citations (4)
Title |
---|
HUANG, GAO et al.: "Densely Connected Convolutional Networks", 《IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION,2017》 * |
JIFENG DAI et al.: "R-FCN: Object Detection via Region-based Fully Convolutional Networks", 《30TH CONFERENCE ON NEURAL INFORMATION PROCESSING SYSTEMS (NIPS 2016)》 * |
TSUNG-YI LIN et al.: "Feature Pyramid Networks for Object Detection", 《THE IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)》 * |
曹诗雨 et al.: "Vehicle target detection based on Fast R-CNN" (基于Fast R-CNN的车辆目标检测), 《中国图象图形学报》 (Journal of Image and Graphics) * |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109492576A (en) * | 2018-11-07 | 2019-03-19 | 北京旷视科技有限公司 | Image-recognizing method, device and electronic equipment |
CN111311673A (en) * | 2018-12-12 | 2020-06-19 | 北京京东尚科信息技术有限公司 | Positioning method and device and storage medium |
CN111311673B (en) * | 2018-12-12 | 2023-11-03 | 北京京东乾石科技有限公司 | Positioning method and device and storage medium |
CN109903331A (en) * | 2019-01-08 | 2019-06-18 | 杭州电子科技大学 | A kind of convolutional neural networks object detection method based on RGB-D camera |
CN109903331B (en) * | 2019-01-08 | 2020-12-22 | 杭州电子科技大学 | Convolutional neural network target detection method based on RGB-D camera |
CN111639660A (en) * | 2019-03-01 | 2020-09-08 | 中科院微电子研究所昆山分所 | Image training method, device, equipment and medium based on convolutional network |
CN111639660B (en) * | 2019-03-01 | 2024-01-12 | 中科微至科技股份有限公司 | Image training method, device, equipment and medium based on convolution network |
CN111768449A (en) * | 2019-03-30 | 2020-10-13 | 北京伟景智能科技有限公司 | Object grabbing method combining binocular vision with deep learning |
CN110378231A (en) * | 2019-06-19 | 2019-10-25 | 广东工业大学 | Nut recognition positioning method based on deep learning |
CN111027399B (en) * | 2019-11-14 | 2023-08-22 | 武汉兴图新科电子股份有限公司 | Remote sensing image water surface submarine recognition method based on deep learning |
CN111027399A (en) * | 2019-11-14 | 2020-04-17 | 武汉兴图新科电子股份有限公司 | Remote sensing image surface submarine identification method based on deep learning |
CN111027602A (en) * | 2019-11-25 | 2020-04-17 | 清华大学深圳国际研究生院 | Method and system for detecting target with multi-level structure |
CN111027602B (en) * | 2019-11-25 | 2023-04-07 | 清华大学深圳国际研究生院 | Method and system for detecting target with multi-level structure |
CN110852314A (en) * | 2020-01-16 | 2020-02-28 | 江西高创保安服务技术有限公司 | Article detection network method based on camera projection model |
CN110852314B (en) * | 2020-01-16 | 2020-05-22 | 江西高创保安服务技术有限公司 | Article detection network method based on camera projection model |
CN111553337A (en) * | 2020-04-27 | 2020-08-18 | 南通智能感知研究院 | Hyperspectral multi-target detection method based on improved anchor frame |
CN114510078A (en) * | 2022-02-16 | 2022-05-17 | 南通大学 | Unmanned aerial vehicle maneuver evasion decision-making method based on deep reinforcement learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20181012 |