CN109583456B - Infrared surface target detection method based on feature fusion and dense connection

Infrared surface target detection method based on feature fusion and dense connection

Info

Publication number
CN109583456B
CN109583456B (application CN201811386234.9A)
Authority
CN
China
Prior art keywords
feature
image
target
network
boundary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811386234.9A
Other languages
Chinese (zh)
Other versions
CN109583456A (en)
Inventor
周慧鑫
施元斌
赵东
郭立新
张嘉嘉
秦翰林
王炳健
赖睿
李欢
宋江鲁奇
姚博
于跃
贾秀萍
周峻
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN201811386234.9A priority Critical patent/CN109583456B/en
Publication of CN109583456A publication Critical patent/CN109583456A/en
Application granted granted Critical
Publication of CN109583456B publication Critical patent/CN109583456B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an infrared target detection method based on feature fusion and dense connection. The method comprises: constructing an infrared image data set containing the targets to be identified, and calibrating the position and type of each target to obtain original known label images; dividing the data set into a training set and a verification set; preprocessing the images in the training set, performing feature extraction and feature fusion, and obtaining classification results and bounding boxes through a regression network; calculating a loss function from the classification results, the bounding boxes and the original known label images, and updating the parameter values of the convolutional neural network; iterating these updates until the error is small enough or the number of iterations reaches a set upper limit; and processing the images in the verification set with the trained network parameters to obtain the detection accuracy, the required time and the final target detection result diagram.

Description

Infrared surface target detection method based on feature fusion and dense connection
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to an infrared surface target detection method based on feature fusion and dense connection.
Background
At present, mainstream target detection methods can be roughly divided into two types: methods based on background modeling and methods based on foreground modeling. Background-modeling methods construct a model of the background and judge regions of the image that differ greatly from it to be targets; because real backgrounds are complex, their detection performance is often unsatisfactory. Foreground-modeling methods extract feature information of the target and judge regions that match this information to be targets; the most representative of these are deep-learning-based detection methods. A deep-learning detector automatically extracts target features with a deep convolutional neural network and predicts the class and position of each target. The extracted features are then compared with the calibration information in the training set, a loss function is calculated, and gradient descent is used to improve the features extracted by the network so that they better match the actual targets; the parameters of the subsequent detection stage are updated at the same time, making the detection results more accurate. Training is repeated until the expected detection performance is reached.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a target detection method based on feature fusion and dense blocks.
The technical scheme adopted by the invention is as follows:
the embodiment of the invention provides an infrared target detection method based on feature fusion and dense connection, which is realized by the following steps:
step 1, constructing an infrared image data set containing a required identification target, and calibrating the position and the type of the required identification target in the infrared image data set to obtain an original known label image;
step 2, dividing the infrared image data set into a training set and a verification set;
step 3, performing image-enhancement preprocessing on the images in the training set;
step 4, carrying out feature extraction and feature fusion on the preprocessed image, and obtaining a classification result and a bounding box through a regression network; calculating a loss function from the classification result, the bounding box and the original known label image, back-propagating the prediction error through the convolutional neural network using the stochastic gradient descent method with momentum, and updating the parameter values of the convolutional neural network;
step 5, repeating the steps 3 and 4 to update the convolutional neural network parameters in an iterative manner until the error is small enough or the iteration times reach a set upper limit;
and 6, processing the image in the verification set through the trained convolutional neural network parameters to obtain the accuracy and the required time of target detection and a final target detection result diagram.
In the above scheme, in the step 4, feature extraction and feature fusion are performed on the preprocessed image, and a classification result and a bounding box are obtained through a regression network, specifically by the following steps:
step 401, randomly extracting a fixed number of images from the training set, and dividing each image into 10×10 regions;
step 402, inputting the image divided in the step 401 into a dense connection network for feature extraction;
step 403, performing feature fusion on the extracted feature map to obtain a fused feature map;
step 404, generating a fixed number of suggestion boxes for each region in the fused feature map;
and step 405, sending the fused feature map and the suggestion frame into a regression network to carry out classification and bounding box regression, and removing redundancy by using a non-maximum suppression method to obtain a classification result and a bounding box.
In the above scheme, the calculation method of the dense connection network in step 402 includes the following formula:
d_l = H_l([d_0, d_1, ..., d_{l-1}])
where d_l denotes the output of the l-th convolution layer in the densely connected network (if the network contains B convolution layers, l takes values from 0 to B), H_l is the combined operation of regularization, convolution and the linear rectification (ReLU) activation function, d_0 is the input image, and d_{l-1} is the output of layer l-1.
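As a concrete illustration, the formula above can be sketched as a small PyTorch module. This is a minimal sketch only: it assumes H_l is batch normalization followed by a 3×3 convolution and ReLU, and the channel parameters (`growth`, `num_layers`) are free choices rather than the exact configuration used in the patent.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Minimal dense block: layer l receives the concatenation [d_0, d_1, ..., d_{l-1}]."""

    def __init__(self, in_channels, growth, num_layers):
        super().__init__()
        self.layers = nn.ModuleList()
        channels = in_channels
        for _ in range(num_layers):
            # H_l: regularization (batch norm) + convolution + linear rectification (ReLU)
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(channels),
                nn.Conv2d(channels, growth, kernel_size=3, padding=1, bias=False),
                nn.ReLU(inplace=True),
            ))
            channels += growth  # the next layer sees all previous outputs as extra channels

    def forward(self, d0):
        outputs = [d0]
        for layer in self.layers:
            outputs.append(layer(torch.cat(outputs, dim=1)))  # d_l = H_l([d_0, ..., d_{l-1}])
        return outputs[-1]
```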
In the above scheme, in the step 403, feature fusion is performed on the extracted feature images, which is to directly fuse the extracted feature images with different scales through a pooling method.
In the above scheme, in the step 403, feature fusion is performed on the extracted feature map, which is specifically implemented by the following steps:
step 4031, a first set of feature maps F_1 is converted into a new, smaller set of feature maps by a pooling operation and then fused with a second set of feature maps F_2 to obtain a new set of feature maps F_2';
step 4032, the new feature maps F_2' are pooled and then fused with a third set of feature maps F_3 to obtain a new set of feature maps F_3';
step 4033, the new feature maps F_2' and F_3' replace the second set F_2 and the third set F_3 and enter the regression network.
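A minimal sketch of steps 4031 to 4033 follows. It assumes that consecutive feature-map groups differ by a factor of 2 in spatial resolution, that "pooling" means 2×2 max pooling, and that fusion is concatenation along the channel axis; none of these details is fixed by the text.

```python
import torch
import torch.nn.functional as F

def fuse_feature_maps(f1, f2, f3):
    """Pooling-based fusion sketch: shallower maps are shrunk by pooling and
    concatenated (fused) with the next deeper map along the channel axis."""
    f2_new = torch.cat([F.max_pool2d(f1, kernel_size=2, stride=2), f2], dim=1)      # F_2'
    f3_new = torch.cat([F.max_pool2d(f2_new, kernel_size=2, stride=2), f3], dim=1)  # F_3'
    return f2_new, f3_new  # these replace F_2 and F_3 before the regression network
```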
In the above scheme, in step 405, the fused feature map and the suggestion box are sent to a regression network to perform classification and bounding box regression, and redundancy is removed by using a non-maximum suppression method, so as to obtain a classification result and a bounding box, which is specifically implemented through the following steps:
step 4051, dividing the feature map into 10×10 areas, and inputting the areas into a regression detection network;
step 4052, for each region, the regression detection network outputs the positions and types of 7 possible targets, wherein the total number of target types is A, which is determined by the setting of the training set, so that A class likelihoods are output for each candidate; the position parameters comprise the center coordinates, width and height of the target bounding box;
step 4053, the non-maximum suppression method calculates the intersection-over-union of bounding boxes of the same class using the following formula:
S = |M ∩ N| / |M ∪ N|
where S is the calculated intersection-over-union, M and N denote two bounding boxes of the same target class, M ∩ N denotes the intersection of M and N, and M ∪ N denotes their union. For any two bounding boxes with S greater than 0.75, the bounding box with the smaller classification score is eliminated.
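The per-class suppression rule above can be sketched as follows. The sketch assumes boxes are given in corner format (x1, y1, x2, y2), whereas the network predicts center, width and height, so a conversion would be needed; the 0.75 threshold is the one quoted above, and the caller is assumed to group boxes by class before calling `nms`.

```python
def iou(box_m, box_n):
    """Intersection-over-union S = |M ∩ N| / |M ∪ N| for boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_m[0], box_n[0]), max(box_m[1], box_n[1])
    ix2, iy2 = min(box_m[2], box_n[2]), min(box_m[3], box_n[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_m = (box_m[2] - box_m[0]) * (box_m[3] - box_m[1])
    area_n = (box_n[2] - box_n[0]) * (box_n[3] - box_n[1])
    return inter / (area_m + area_n - inter + 1e-9)

def nms(boxes, scores, thresh=0.75):
    """Whenever two same-class boxes overlap with S > thresh, keep the higher-scoring one."""
    order = sorted(range(len(boxes)), key=lambda k: scores[k], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [k for k in order if iou(boxes[best], boxes[k]) <= thresh]
    return keep
```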
In the above scheme, in the step 4, a loss function is calculated from the classification result, the bounding box and the original known label image, the prediction error is back-propagated through the convolutional neural network using the stochastic gradient descent method with momentum, and the parameter values of the convolutional neural network are updated, specifically through the following steps:
step 401, calculating a loss function according to the classification result, the position and the type of the target in the bounding box, and the position and the type of the target to be identified calibrated in the training set, wherein the calculation formula of the loss function is as follows:
Loss = [loss-function formula, reproduced as an image in the original publication]
where 100 is the number of regions, 7 is the number of suggestion boxes and finally generated bounding boxes to be predicted for each region, i is the region index, j is the suggestion-box/bounding-box index, Loss is the error value, obj denotes that a target is present and noobj that no target is present; x and y are the predicted horizontal and vertical coordinates of the center of a suggestion box or bounding box, w and h are its predicted width and height, and C is the prediction of whether a suggestion box or bounding box contains a target, consisting of A values that give the likelihood of each of the A target classes; the corresponding hatted symbols in the formula are the labeled values, and the two indicator terms denote that the j-th suggestion box or bounding box of region i does or does not contain a target, respectively;
step 402, updating the weights according to the loss-function result using the stochastic gradient descent method with momentum.
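The momentum-SGD update loop can be sketched as below. Because the concrete loss is given only as an image in the original publication, a dummy network and a mean-squared-error loss stand in for the patent's detector and detection loss; the learning rate, momentum and weight decay values are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins: a tiny network and a dummy regression loss, used only to
# make the momentum-SGD update concrete; they are not the patent's network or loss.
model = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 5, 1))
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9, weight_decay=5e-4)

for _ in range(10):                              # a few illustrative iterations
    images = torch.randn(4, 1, 80, 80)           # toy batch standing in for infrared images
    targets = torch.randn(4, 5, 80, 80)          # toy targets standing in for label maps
    loss = criterion(model(images), targets)     # stands in for the detection loss
    optimizer.zero_grad()
    loss.backward()                              # back-propagate the prediction error
    optimizer.step()                             # momentum SGD update of the parameters
```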
In the above scheme, the preprocessing in step 3 is to expand the training set by random rotation, mirroring, flipping, scaling, translation, scale transformation, contrast transformation, noise disturbance and color change.
Compared with the prior art, the method has the advantages that the infrared image is learned, so that the target detection network can acquire the recognition capability for visible light and infrared targets, and meanwhile, the method has a better detection effect compared with the traditional deep learning method by improving the network structure.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a network architecture diagram of the present invention;
FIG. 3 is a graph showing the results of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The embodiment of the invention provides an infrared target detection method based on feature fusion and dense connection, which is realized by the following steps as shown in fig. 1:
step 1, constructing a data set
If the detection algorithm is required to have the ability to identify the infrared image, the infrared image needs to be added to the data set. The invention constructs a data set by using the infrared image and manually marks the image in the data set by using the boundary box.
Step 2, expanding training set
The training set is expanded by random rotation, mirroring, flipping, scaling, translation, scale transformation, contrast transformation, noise disturbance, color change, and the like. This compensates for the difficulty of acquiring infrared data sets and improves training on small data sets.
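A sketch of such an augmentation pipeline is shown below using torchvision; the parameter ranges are assumptions, not values taken from the patent, and torchvision is only one possible way to implement these operations.

```python
import torch
import torchvision.transforms as T

# Illustrative augmentation pipeline covering the operations listed above.
augment = T.Compose([
    T.RandomRotation(degrees=10),                        # random rotation
    T.RandomHorizontalFlip(p=0.5),                       # mirroring / flipping
    T.RandomAffine(degrees=0, translate=(0.1, 0.1),
                   scale=(0.8, 1.2)),                    # translation and scale transformation
    T.ColorJitter(brightness=0.2, contrast=0.3),         # contrast and intensity change
    T.ToTensor(),
    T.Lambda(lambda x: x + 0.01 * torch.randn_like(x)),  # noise disturbance
])
```

Note that for detection training, the calibrated bounding boxes must be transformed consistently with the geometric operations; this image-only pipeline does not handle the box annotations.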
Step 3, dividing 10×10 regions
The original image is divided into 10×10 regions, and each region is responsible for detecting targets whose centers fall inside it, which greatly increases detection speed.
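The assignment of a target to its responsible region reduces to a simple index computation, sketched here for pixel-coordinate box centers:

```python
def responsible_cell(cx, cy, img_w, img_h, grid=10):
    """Return the (row, col) of the grid cell responsible for a target whose
    bounding-box centre is at pixel coordinates (cx, cy)."""
    col = min(int(cx / img_w * grid), grid - 1)
    row = min(int(cy / img_h * grid), grid - 1)
    return row, col

# Example: in a 640x480 image, a target centred at (320, 240) is assigned
# to cell (5, 5) of the 10x10 grid.
print(responsible_cell(320, 240, img_w=640, img_h=480))  # -> (5, 5)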
Step 4, extracting features using the dense network
The feature extraction process comprises the following steps:
In the first step, the input image is processed by a convolution layer with 32 kernels of size 3×3, followed by a 2×2 pooling operation, to obtain feature map F_1.
In the second step, a dense block containing 64 3×3 convolution kernels and 64 1×1 convolution kernels is applied to F_1 to extract features; the residual is calculated and a 2×2 pooling operation is performed to obtain feature map F_2.
In the third step, a dense block containing 64 1×1 convolution kernels and 64 3×3 convolution kernels is applied to F_2 to extract features; the residual is calculated and a 2×2 pooling operation is performed to obtain feature map F_3.
In the fourth step, a dense block containing 64 1×1 convolution kernels and 64 3×3 convolution kernels is applied to F_3 to extract features; a 1×1 convolution is applied, the residual is calculated, and a 2×2 pooling operation is performed to obtain feature map F_4.
In the fifth step, a dense block containing 256 1×1 convolution kernels and 256 3×3 convolution kernels is applied to F_4 to extract features; a 1×1 convolution is applied, the residual is calculated, and a 2×2 pooling operation is performed to obtain feature map F_5.
In the sixth step, a dense block containing 1024 1×1 convolution kernels, 1024 3×3 convolution kernels and 1024 1×1 convolution kernels is applied to F_5 to extract features; a 1×1 convolution is applied and the residual is calculated to obtain feature map F_6.
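For orientation, the six steps above can be collapsed into a rough skeleton. This sketch flattens the dense blocks into plain convolution stacks and omits the residual computations; the single-channel infrared input is an assumption, and the channel widths simply follow the kernel counts quoted in the text.

```python
import torch.nn as nn

def conv_bn_relu(cin, cout, k):
    """1x1 or 3x3 convolution followed by batch norm and ReLU."""
    return nn.Sequential(
        nn.Conv2d(cin, cout, kernel_size=k, padding=k // 2, bias=False),
        nn.BatchNorm2d(cout),
        nn.ReLU(inplace=True),
    )

class FeatureExtractor(nn.Module):
    """Rough skeleton of the six-stage extraction path F_1..F_6 described above."""

    def __init__(self):
        super().__init__()
        self.stage1 = nn.Sequential(conv_bn_relu(1, 32, 3), nn.MaxPool2d(2))     # -> F_1
        self.stage2 = nn.Sequential(conv_bn_relu(32, 64, 3),
                                    conv_bn_relu(64, 64, 1), nn.MaxPool2d(2))    # -> F_2
        self.stage3 = nn.Sequential(conv_bn_relu(64, 64, 1),
                                    conv_bn_relu(64, 64, 3), nn.MaxPool2d(2))    # -> F_3
        self.stage4 = nn.Sequential(conv_bn_relu(64, 64, 1), conv_bn_relu(64, 64, 3),
                                    conv_bn_relu(64, 64, 1), nn.MaxPool2d(2))    # -> F_4
        self.stage5 = nn.Sequential(conv_bn_relu(64, 256, 1), conv_bn_relu(256, 256, 3),
                                    conv_bn_relu(256, 256, 1), nn.MaxPool2d(2))  # -> F_5
        self.stage6 = nn.Sequential(conv_bn_relu(256, 1024, 1), conv_bn_relu(1024, 1024, 3),
                                    conv_bn_relu(1024, 1024, 1))                 # -> F_6

    def forward(self, x):
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        f3 = self.stage3(f2)
        f4 = self.stage4(f3)
        f5 = self.stage5(f4)
        f6 = self.stage6(f5)
        return f4, f5, f6  # the three maps later used for feature fusion
```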
Step 5, carrying out feature fusion on the feature extraction result
The feature fusion method comprises the following steps:
First, the feature maps F_4, F_5 and F_6 obtained in step 4 are taken.
Second, F_4 is pooled four times with a 2×2 window, taking respectively the upper-left, upper-right, lower-left and lower-right points of each 2×2 neighborhood, to form a new feature map F_4'; F_4' and feature map F_5 are combined into feature map group F_7.
Third, F_7 is pooled four times with a 2×2 window in the same way, taking respectively the upper-left, upper-right, lower-left and lower-right points of each 2×2 neighborhood, to form a new feature map F_7'; F_7' and feature map F_6 are combined into feature map group F_8.
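Taking the four corner points of every 2×2 neighborhood is a space-to-depth rearrangement: the spatial size is halved and the channel count is quadrupled, so the shallower map can be stacked with the deeper one. A sketch, with toy tensor shapes that are illustrative only:

```python
import torch

def reorg_2x2(x):
    """Take the top-left, top-right, bottom-left and bottom-right point of every
    2x2 neighbourhood as four separate maps, halving height/width and
    quadrupling the number of channels (a space-to-depth style step)."""
    tl = x[..., 0::2, 0::2]
    tr = x[..., 0::2, 1::2]
    bl = x[..., 1::2, 0::2]
    br = x[..., 1::2, 1::2]
    return torch.cat([tl, tr, bl, br], dim=1)

# Toy tensors standing in for F_4 and F_5 (channel counts and sizes are assumptions):
f4 = torch.randn(1, 64, 40, 40)
f5 = torch.randn(1, 256, 20, 20)
f7 = torch.cat([reorg_2x2(f4), f5], dim=1)   # feature map group F_7
print(f7.shape)                              # torch.Size([1, 512, 20, 20])
```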
Step 6, regression detection to obtain classification result and boundary frame
The method for obtaining the classification result and the bounding box is as follows: for each region, the classification and regression detection network outputs the positions and types of 7 possible targets. The total number of target types is A, which is determined by the setting of the training set, so A class likelihoods are output for each candidate; the position parameters comprise the center coordinates, width and height of the target bounding box.
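One way to realize such an output is a 1×1-convolution head whose output tensor is reshaped to 7 candidates per grid cell, each carrying A class likelihoods plus the box center, width, height and a confidence value. This exact layout is an assumption; the text only fixes the 10×10 grid and the 7 candidates per region.

```python
import torch
import torch.nn as nn

class RegressionHead(nn.Module):
    """Illustrative detection head: for each of the 10x10 regions it predicts 7
    candidates, each described by A class likelihoods, the box centre x, y,
    width, height and a confidence value."""

    def __init__(self, in_channels, num_classes, boxes_per_cell=7):
        super().__init__()
        self.boxes = boxes_per_cell
        self.values = num_classes + 5               # A class scores + (x, y, w, h, confidence)
        self.conv = nn.Conv2d(in_channels, boxes_per_cell * self.values, kernel_size=1)

    def forward(self, fused):                       # fused: (batch, C, 10, 10)
        out = self.conv(fused)                      # (batch, 7 * (A + 5), 10, 10)
        return out.view(out.shape[0], self.boxes, self.values, 10, 10)

head = RegressionHead(in_channels=512, num_classes=2)     # e.g. A = 2: pedestrian, bicycle
print(head(torch.randn(1, 512, 10, 10)).shape)            # torch.Size([1, 7, 7, 10, 10])
```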
step 7, calculating the loss function and updating parameters
A loss function is calculated from the positions and types of the targets output in step 6 and the positions and types of the targets to be identified calibrated in the training set; this step is performed only during training. The loss function is calculated as follows:
Loss = [loss-function formula, reproduced as an image in the original publication]
where 100 is the number of regions, 7 is the number of suggestion boxes and finally generated bounding boxes to be predicted for each region, i is the region index, j is the suggestion-box/bounding-box index, Loss is the error value, obj denotes that a target is present and noobj that no target is present. x and y are the predicted horizontal and vertical coordinates of the center of a suggestion box or bounding box, w and h are its predicted width and height, and C is the prediction of whether a suggestion box or bounding box contains a target, consisting of A values that give the likelihood of each of the A target classes; the corresponding hatted symbols in the formula are the labeled values, and the two indicator terms denote that the j-th suggestion box or bounding box of region i does or does not contain a target, respectively. The weights are then updated from the loss-function result using the stochastic gradient descent method with momentum.
Repeating the steps 3-7 until the error meets the requirement or the iteration number reaches the set upper limit.
Step 8, testing with the verification set
The images in the verification set are processed with the target detection network trained in step 7 to obtain the detection accuracy, the required time and the final target detection result diagram.
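A toy timing harness for this step is sketched below; `detector` and `val_images` are hypothetical stand-ins for the trained network and the verification images. Computing the accuracy would additionally require matching the returned boxes against the calibrated labels (for example by an IoU threshold), which is omitted here because the patent does not specify the matching rule.

```python
import time

def evaluate(detector, val_images):
    """Run the trained detector over the verification set and report the
    average detection time per image; collected results can then be compared
    against the labels to compute accuracy."""
    results = []
    start = time.perf_counter()
    for image in val_images:
        results.append(detector(image))   # classification results and bounding boxes
    elapsed = time.perf_counter() - start
    print(f"average detection time per image: {elapsed / max(len(val_images), 1):.4f} s")
    return results
```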
The network structure of the present invention will be further described with reference to FIG. 2.
1. Network layer number setting
The neural network used in the invention is divided into two parts. The first part is the feature extraction network, which consists of 5 dense blocks and contains 25 convolutional layers in total. The second part is the feature fusion and regression detection network, which comprises 8 convolutional layers and 1 fully convolutional layer.
2. Dense block arrangement
The feature extraction network portion uses dense block settings as follows:
(1) Dense block 1 contains 2 convolutional layers: the first layer uses 64 convolution kernels of size 1×1 with stride 1; the second layer uses 64 convolution kernels of size 3×3 with stride 1. Dense block 1 is used 1 time.
(2) Dense block 2 contains 2 convolutional layers: the first layer uses 64 convolution kernels of size 3×3 with stride 1; the second layer uses 64 convolution kernels of size 1×1 with stride 1. Dense block 2 is used 1 time.
(3) Dense block 3 contains 2 convolutional layers: the first layer uses 64 convolution kernels of size 1×1 with stride 1; the second layer uses 64 convolution kernels of size 3×3 with stride 1. Dense block 3 is used 2 times.
(4) Dense block 4 contains 2 convolutional layers: the first layer uses 256 convolution kernels of size 1×1 with stride 1; the second layer uses 256 convolution kernels of size 3×3 with stride 1. Dense block 4 is used 4 times.
(5) Dense block 5 contains 3 convolutional layers: the first layer uses 1024 convolution kernels of size 1×1 with stride 1; the second layer uses 1024 convolution kernels of size 3×3 with stride 1; the third layer uses 1024 convolution kernels of size 1×1 with stride 1. Dense block 5 is used 2 times.
3. Feature fusion settings
The 3 sets of feature maps used for feature fusion are taken from the outputs of layer 9, layer 18 and layer 25 of the feature extraction network. The deeper feature maps are combined with the shallower feature maps by convolution and upsampling; the results are then processed by a 3×3 convolution layer and a 1×1 convolution layer, and the resulting three new feature maps are fused.
The simulation effect of the present invention will be further described with reference to fig. 3.
1. Simulation conditions:
The images to be detected in the simulation are 480×640 in size and contain pedestrians and bicycles.
2. Simulation results and analysis:
FIG. 3 is a graph showing the results of the present invention, wherein FIG. 3 (a) is the image to be tested; FIG. 3 (b) is an extracted feature map; FIG. 3 (c) is the detection result diagram.
Feature extraction is performed on FIG. 3 (a) using the dense network to obtain a series of feature maps; because there are too many intermediate feature maps, only two of them are shown, namely FIG. 3 (b) and FIG. 3 (c). FIG. 3 (b) is a feature map from a shallower layer of the network: the image is larger, with more detail information and less semantic information. FIG. 3 (c) is a feature map from a deeper layer: the image is smaller, with less detail information and more semantic information.
After the feature maps are fused and passed through regression detection, the positions of the pedestrians and bicycles are obtained and marked on the original image, giving the final result shown in FIG. 3 (c).
The foregoing description is only of the preferred embodiments of the present invention, and is not intended to limit the scope of the present invention.

Claims (2)

1. The infrared target detection method based on feature fusion and dense connection is characterized by comprising the following steps of:
step 1, constructing an infrared image data set containing a required identification target, and calibrating the position and the type of the required identification target in the infrared image data set to obtain an original known label image;
step 2, dividing the infrared image data set into a training set and a verification set;
step 3, performing image-enhancement preprocessing on the images in the training set;
step 4, carrying out feature extraction and feature fusion on the preprocessed image, and obtaining a classification result and a bounding box through a regression network; calculating a loss function from the classification result, the bounding box and the original known label image, back-propagating the prediction error through the convolutional neural network using the stochastic gradient descent method with momentum, and updating the parameter values of the convolutional neural network;
step 5, repeating the steps 3 and 4 to update the convolutional neural network parameters in an iterative manner until the error is small enough or the iteration times reach a set upper limit;
step 6, processing the image in the verification set through the trained convolutional neural network parameters to obtain the accuracy and the required time of target detection and a final target detection result diagram;
in the step 4, feature extraction and feature fusion are performed on the preprocessed image, and a classification result and a bounding box are obtained through a regression network, specifically through the following steps:
step 401, randomly extracting a fixed number of images from the training set, and dividing each image into 10×10 regions;
step 402, inputting the image divided in the step 401 into a dense connection network for feature extraction;
step 403, performing feature fusion on the extracted feature map to obtain a fused feature map;
step 404, generating a fixed number of suggestion boxes for each region in the fused feature map;
step 405, sending the fused feature map and the suggestion frame into a regression network to perform classification and bounding box regression, and removing redundancy by using a non-maximum suppression method to obtain a classification result and a bounding box;
the calculation method of the dense connection network in step 402 is as follows:
d_l = H_l([d_0, d_1, ..., d_{l-1}])
where d_l denotes the output of the l-th convolution layer in the densely connected network (if the network contains B convolution layers, l takes values from 0 to B), H_l is the combined operation of regularization, convolution and the linear rectification (ReLU) activation function, d_0 is the input image, and d_{l-1} is the output of layer l-1;
in the step 403, feature fusion is performed on the extracted feature images, which is to directly fuse the extracted feature images with different scales through a pooling method;
in the step 403, feature fusion is performed on the extracted feature map, which is specifically implemented through the following steps:
step 4031, a first set of feature maps F_1 is converted into a new, smaller set of feature maps by a pooling operation and then fused with a second set of feature maps F_2 to obtain a new set of feature maps F_2';
step 4032, the new feature maps F_2' are pooled and then fused with a third set of feature maps F_3 to obtain a new set of feature maps F_3';
step 4033, the new feature maps F_2' and F_3' replace the second set F_2 and the third set F_3 and enter the regression network;
in the step 405, the fused feature map and the suggestion frame are sent to a regression network to perform classification and bounding box regression, and redundancy is removed by using a non-maximum suppression method, so as to obtain a classification result and a bounding box, which is realized specifically through the following steps:
step 4051, dividing the feature map into 10×10 areas, and inputting the areas into a regression detection network;
step 4052, for each region, the regression detection network outputs the positions and types of 7 possible targets, wherein the total number of target types is A, which is determined by the setting of the training set, so that A class likelihoods are output for each candidate; the position parameters comprise the center coordinates, width and height of the target bounding box;
step 4053, the non-maximum suppression method calculates the intersection-over-union of bounding boxes of the same class using the following formula:
S = |M ∩ N| / |M ∪ N|
where S is the calculated intersection-over-union, M and N denote two bounding boxes of the same target class, M ∩ N denotes the intersection of M and N, and M ∪ N denotes their union; for any two bounding boxes with S greater than 0.75, the bounding box with the smaller classification score is eliminated;
in the step 4, a loss function is calculated from the classification result, the bounding box and the original known label image, the prediction error is back-propagated through the convolutional neural network using the stochastic gradient descent method with momentum, and the parameter values of the convolutional neural network are updated, specifically through the following steps:
step 401, calculating a loss function according to the classification result, the position and the type of the target in the bounding box, and the position and the type of the target to be identified calibrated in the training set, wherein the calculation formula of the loss function is as follows:
Loss = [loss-function formula, reproduced as an image in the original publication]
where 100 is the number of regions, 7 is the number of suggestion boxes and finally generated bounding boxes to be predicted for each region, i is the region index, j is the suggestion-box/bounding-box index, Loss is the error value, obj denotes that a target is present and noobj that no target is present; x and y are the predicted horizontal and vertical coordinates of the center of a suggestion box or bounding box, w and h are its predicted width and height, and C is the prediction of whether a suggestion box or bounding box contains a target, consisting of A values that give the likelihood of each of the A target classes; the corresponding hatted symbols in the formula are the labeled values, and the two indicator terms denote that the j-th suggestion box or bounding box of region i does or does not contain a target, respectively;
step 402, updating the weights according to the loss-function result using the stochastic gradient descent method with momentum.
2. The method of claim 1, wherein the preprocessing of step 3 is to augment the training set by random rotation, mirroring, flipping, scaling, translation, scale transformation, contrast transformation, noise perturbation, and color change.
CN201811386234.9A 2018-11-20 2018-11-20 Infrared surface target detection method based on feature fusion and dense connection Active CN109583456B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811386234.9A CN109583456B (en) 2018-11-20 2018-11-20 Infrared surface target detection method based on feature fusion and dense connection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811386234.9A CN109583456B (en) 2018-11-20 2018-11-20 Infrared surface target detection method based on feature fusion and dense connection

Publications (2)

Publication Number Publication Date
CN109583456A CN109583456A (en) 2019-04-05
CN109583456B true CN109583456B (en) 2023-04-28

Family

ID=65923459

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811386234.9A Active CN109583456B (en) 2018-11-20 2018-11-20 Infrared surface target detection method based on feature fusion and dense connection

Country Status (1)

Country Link
CN (1) CN109583456B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110197152B (en) * 2019-05-28 2022-08-26 南京邮电大学 Road target identification method for automatic driving system
CN110532914A (en) * 2019-08-20 2019-12-03 西安电子科技大学 Building analyte detection method based on fine-feature study
CN111461145B (en) * 2020-03-31 2023-04-18 中国科学院计算技术研究所 Method for detecting target based on convolutional neural network


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107609525A (en) * 2017-09-19 2018-01-19 吉林大学 Remote Sensing Target detection method based on Pruning strategy structure convolutional neural networks
CN107818302A (en) * 2017-10-20 2018-03-20 中国科学院光电技术研究所 Non-rigid multiple dimensioned object detecting method based on convolutional neural networks
CN107808143A (en) * 2017-11-10 2018-03-16 西安电子科技大学 Dynamic gesture identification method based on computer vision
CN108182456A (en) * 2018-01-23 2018-06-19 哈工大机器人(合肥)国际创新研究院 A kind of target detection model and its training method based on deep learning
CN108038519A (en) * 2018-01-30 2018-05-15 浙江大学 A kind of uterine neck image processing method and device based on dense feature pyramid network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Densely Connected Convolutional Networks; Gao Huang et al.; 2017 IEEE Conference on Computer Vision and Pattern Recognition; 2017-11-09; full text *
Infrared dim and small target tracking under complex sky background; Zhao Dong et al.; High Power Laser and Particle Beams; 2018-06-30; Vol. 30, No. 6; full text *

Also Published As

Publication number Publication date
CN109583456A (en) 2019-04-05

Similar Documents

Publication Publication Date Title
WO2020102988A1 (en) Feature fusion and dense connection based infrared plane target detection method
CN109886066B (en) Rapid target detection method based on multi-scale and multi-layer feature fusion
CN110348319B (en) Face anti-counterfeiting method based on face depth information and edge image fusion
CN104598885B (en) The detection of word label and localization method in street view image
CN106408030B (en) SAR image classification method based on middle layer semantic attribute and convolutional neural networks
CN107862261A (en) Image people counting method based on multiple dimensioned convolutional neural networks
CN114092832B (en) High-resolution remote sensing image classification method based on parallel hybrid convolutional network
CN109583456B (en) Infrared surface target detection method based on feature fusion and dense connection
CN111160249A (en) Multi-class target detection method of optical remote sensing image based on cross-scale feature fusion
CN106023154B (en) Multidate SAR image change detection based on binary channels convolutional neural networks
CN113920107A (en) Insulator damage detection method based on improved yolov5 algorithm
CN109284779A (en) Object detecting method based on the full convolutional network of depth
CN105160400A (en) L21 norm based method for improving convolutional neural network generalization capability
CN108960404B (en) Image-based crowd counting method and device
WO2015176305A1 (en) Human-shaped image segmentation method
CN110543906B (en) Automatic skin recognition method based on Mask R-CNN model
CN107967474A (en) A kind of sea-surface target conspicuousness detection method based on convolutional neural networks
CN106991411B (en) Remote Sensing Target based on depth shape priori refines extracting method
CN110716792B (en) Target detector and construction method and application thereof
CN111914902B (en) Traditional Chinese medicine identification and surface defect detection method based on deep neural network
CN108447057A (en) SAR image change detection based on conspicuousness and depth convolutional network
CN112766283B (en) Two-phase flow pattern identification method based on multi-scale convolution network
CN111161224A (en) Casting internal defect grading evaluation system and method based on deep learning
CN114897816A (en) Mask R-CNN mineral particle identification and particle size detection method based on improved Mask
CN108776777A (en) The recognition methods of spatial relationship between a kind of remote sensing image object based on Faster RCNN

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant