CN111814662B - Visible light image airplane rapid detection method based on miniature convolutional neural network - Google Patents

Visible light image airplane rapid detection method based on miniature convolutional neural network

Info

Publication number
CN111814662B
CN111814662B (application CN202010646717.9A)
Authority
CN
China
Prior art keywords
neural network
miniature
convolutional neural
data set
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010646717.9A
Other languages
Chinese (zh)
Other versions
CN111814662A
Inventor
晏焕钱
李波
韦星星
王越
赖汝锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN202010646717.9A
Publication of CN111814662A
Application granted
Publication of CN111814662B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/13 Satellite images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24317 Piecewise classification, i.e. whereby each classification requires several discriminant rules
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a visible light image airplane rapid detection method based on a miniature convolutional neural network, comprising the following steps: (1) count the aircraft scale information in the training data set to obtain a scale data set and calculate the final sliding-window size; (2) from the channel features of the training data set, calculate the λ_Ω corresponding to each given channel feature type Ω, completing the construction of a fast feature pyramid, and train a fast candidate-box generation algorithm with the Adaboost algorithm; (3) correct the parameters of the candidate-box generation algorithm with the linear search algorithm Search-δ; (4) re-judge each candidate region with a miniature convolutional neural network: if the network classifies the current region as true, the candidate box is considered to contain an airplane; otherwise the current candidate box is treated as background and discarded. The invention has the advantages of high detection speed, high precision, a small footprint for the whole algorithm model, and low requirements on the hardware of the operating platform.

Description

Visible light image airplane rapid detection method based on miniature convolutional neural network
Technical Field
The invention belongs to the technical field of digital image processing, and particularly relates to a visible light image airplane rapid detection method based on a miniature convolutional neural network.
Background
Target detection is one of the core problems of machine vision and one of the fastest-growing artificial intelligence techniques of recent years. Stated informally, the task is to find the targets to be identified in a given picture and give their locations. Target detection methods have been studied and developed for several decades and can be roughly divided into the VJ (Viola-Jones) era, in which hand-crafted features were combined with machine learning, and the deep learning era. The former mainly adopts dense sliding windows and judges whether a target exists in the current window, a process that involves extracting features of the current window and running a classifier on it, and is usually slow; the latter completes the detection task by learning to fit target positions and categories with complex network models and large amounts of data, and can be roughly divided into two branches, candidate-box-based deep learning methods and regression-prediction-based deep learning methods.
Target detection in remote sensing images involves locating regions of interest within the satellite-imaged area and identifying those regions. It differs from detection in natural images in that the targets are imaged from above, appear in many orientations, vary greatly in brightness, and sit in relatively complex background environments. Detecting airport aircraft in remote sensing images plays an important role in military reconnaissance and airport monitoring: a target detection algorithm can automatically mark and locate aircraft targets in an airport, an operation that is extremely useful for recording and describing the airport's subsequent state and that saves manpower and material resources. Although target detection in natural images has seen a series of breakthroughs, research on detecting aircraft in remote sensing images of airports is comparatively scarce, and the task still has unsolved problems, such as the high hardware requirements of practical algorithms and their insufficient handling of complex, changeable detection environments.
Therefore, to work under modest hardware environments while retaining high detection accuracy and high detection speed, and in light of analysis of current visible light airport aircraft detection algorithms for remote sensing images, what is urgently needed by those skilled in the art is an airport aircraft detection algorithm for visible light images with low hardware requirements, high detection accuracy and high detection speed.
Disclosure of Invention
The technical problem that this application will solve lies in:
(1) an efficient target detection method is provided and applied to detection of an airport airplane under visible light in a remote sensing image;
(2) a novel method for calculating the size of a sliding window is provided, the method obtains the length and width priori knowledge of an airplane by counting the proportional information of an airplane target in a remote sensing picture, and the size of the sliding window is calculated by adopting a square window. The sliding window calculated by the method can avoid the influence of human marking errors and shooting angles to a certain extent, and can effectively solve the problem of high missing rate caused by multi-directionality of the target angle;
(3) in order to avoid the defects of low precision, low speed, large calculated amount and the like of a candidate frame extraction algorithm at the early stage of target detection, a rapid candidate frame generation algorithm is realized by combining the characteristics of an aggregation channel and an Adaboost classification algorithm;
(4) a simple and efficient linear search algorithm is provided, which can quickly fine-tune the fast candidate-box generation algorithm and thereby improve its effectiveness;
(5) an efficient and lightweight convolutional neural network is designed for classifying target and background regions. The network model has few layers, few parameters and high classification accuracy, in contrast to most currently available network models, which suffer from large model capacity, low speed and heavy computation.
In order to achieve the above object, the present application adopts the following technical solutions:
a visible light image airplane rapid detection method based on a miniature convolution neural network comprises the following steps:
(1) Count the aircraft scale information in the training data set to obtain a scale data set; next, calculate the minimum side length S_min of the aircraft targets; finally, calculate the average length-width ratio R_target of the aircraft to obtain the final sliding-window size (S_min × R_target, S_min × R_target);
(2) Calculate three types of channel features for each visible light airport picture: the standard gradient magnitude channel feature, the gradient direction channel feature, and the LUV color channel feature; from the channel features of the training data set, estimate by least squares the λ_Ω corresponding to each given channel feature type Ω, completing the construction of the fast feature pyramid, and train the fast candidate-box generation algorithm with the Adaboost algorithm, where λ_Ω denotes the information-loss coefficient;
(3) Re-detect the training data set with the fast candidate-box generation algorithm under different parameters δ via the linear search algorithm Search-δ, and compute the corresponding detection precision and recall; obtain a high-precision, high-recall fast candidate-box generation model by adjusting the parameter δ;
(4) Use the miniature convolutional neural network to re-judge each candidate region: if the network classifies the current region as true, the candidate box is considered to contain an airplane; otherwise the current candidate box is treated as background and discarded.
Preferably, the data set is B = {(h_i, w_i) | i = 1, …, N}, where N denotes the number of aircraft targets, h_i denotes the number of pixels occupied by the length of the i-th aircraft, w_i denotes the number of pixels occupied by its width, and i denotes the aircraft index.
Preferably, the minimum side length of the aircraft targets is calculated as S_min = min{min(h), min(w)}, and the average length-width ratio of the aircraft as
R_target = (1/N) · Σ_{i=1}^{N} r_i.
Preferably, for the Adaboost algorithm the following decision rule is defined:
label(x) = 1 if SCLF_x > thr, 0 otherwise,
wherein x denotes the aggregated channel features of the current window, the threshold thr is used to judge whether the current region contains a target, and SCLF_x denotes the output value of the strong classifier SCLF on x, representing the probability that x is a target;
SCLF_x is composed of a series of weak classifiers and is expressed as
SCLF_x = Σ_{m=1}^{M} (weight_m + δ) · clf(x; θ_m),
wherein clf denotes a weak classifier built as a tree of depth 2; weight_m is the weight of each weak classifier; θ_m is the parameter of each weak classifier; and δ denotes the weight-correction coefficient of the weak classifiers, set to 0 during the training phase.
Preferably, the fast candidate-box generation algorithm in step (3) is fine-tuned as follows: the IoU value of the current candidate box is defined as IoU = (GB ∩ DB)/(GB ∪ DB), wherein GB denotes the ground-truth labeled target box and DB denotes a target box generated by the fast candidate-box algorithm; the precision and the recall are obtained indirectly through the IoU values; the index used to fine-tune the fast candidate-box generation algorithm is defined as
Fidx_i = γ × Recall_i + Precision_i,
wherein γ is a hyperparameter; Recall_i denotes the recall after i adjustments of the parameter δ, and Precision_i denotes the detection precision after i adjustments of the parameter δ.
Preferably, the miniature convolutional neural network comprises 4 convolutional layers, 2 mean-pooling layers and 2 fully connected layers; the input of the network is a picture of size 32 × 32 × 3; the network adopts the ReLU function as the activation function, with an activation following each convolution operation; the network is divided into a feature extraction part and a fully connected part, and the whole convolutional neural network is a binary classification model.
Preferably, the feature extraction part performs a convolution operation on the input image by using 18 convolution kernels of 5 × 5, performs a convolution operation by using 24 convolution kernels of 3 × 3 and performs a mean pooling operation by using kernels of 2 × 2, performs a convolution operation by using 32 convolution kernels of 3 × 3 and performs a mean pooling operation by using kernels of 2 × 2, and performs a convolution operation by using 32 convolution kernels of 3 × 3 again, thereby completing the feature extraction of the candidate region.
Preferably, the fully connected part is mainly used for classification judgment and has two layers in total; the first layer receives the 3 × 3 × 32 feature map obtained by the feature extraction part, outputs a vector of size 128, and is followed by a Dropout layer with probability P = 0.1 and a ReLU activation layer; the second layer takes the 128-dimensional vector as input and outputs a 2-dimensional vector.
Preferably, the miniature convolutional neural network adopts the cross-entropy function as the loss function of the network, defined as follows:
Loss = −(1/BZ) · Σ_{i=1}^{BZ} [ y_i · log(ŷ_i) + (1 − y_i) · log(1 − ŷ_i) ],
wherein BZ denotes the batch size, set to 64 in network training, y_i denotes the true label of the current candidate box, equal to 1 if it contains a target and 0 otherwise, and ŷ_i is the prediction of the network; the loss function is optimized with an Adam optimizer, iterating 50 times over the training data set with learning rate lr = 0.001 for the first 40 iterations and lr = 0.0003 for the last 10; when the miniature convolutional neural network is trained, part of the training data set comes from the ground-truth labels in the original training data set and part from the candidate boxes generated by running the fast candidate-box generation algorithm on the training data set; all pictures input to the network are normalized, with mean = [0.485, 0.456, 0.406] and standard deviation std = [0.229, 0.224, 0.225], and if the current input pixel is X, the normalized pixel is (X − mean)/std.
The invention has the beneficial effects that:
1. the detection speed is high.
2. The detection precision is high.
3. The whole algorithm model occupies less space and has low requirements on the hardware conditions of the operating platform.
4. The algorithm trains quickly and does not need hundreds of iterations.
5. The algorithm does not need large amounts of training data, which suits the typically small volume of remote sensing data.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a schematic diagram of the algorithm flow of the present invention.
FIG. 2 is a schematic diagram of a miniature convolutional neural network according to the present invention.
FIG. 3 is a schematic view of the sliding window of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a visible light image airplane rapid detection method based on a miniature convolutional neural network, and the method is shown in the attached figure 1, wherein a yellow frame represents a training process, a green frame represents a detection process, and a blue frame represents a detection result. The detection method comprises the following steps:
s1: calculation of the sliding window:
Because feature extraction at different scales is completed with the fast feature pyramid, and the pyramid is built by down-sampling, the initial sliding window is the window in which the smaller targets lie. To avoid the influence of manual labeling and shooting angles and to adapt to the multi-directionality of aircraft orientations, a square sliding window is adopted for aircraft detection. The sliding window size calculated this way is more representative and universal. The computation consists of the following three steps:
Firstly, count and sort the aircraft scales in the training data set to obtain the aircraft scale data set B = {(h_i, w_i) | i = 1, …, N}, where N denotes the number of aircraft, h_i denotes the number of pixels occupied by the length of the i-th aircraft, w_i the number of pixels occupied by its width, and i the aircraft index;
then, calculate the minimum side length of the aircraft:
S_min = min{min(h), min(w)}   (1)
secondly, count and sort the length-width ratio r_i of each aircraft:
r_i = max(h_i, w_i) / min(h_i, w_i)   (2)
so that the average length-width ratio R_target can be estimated:
R_target = (1/N) · Σ_{i=1}^{N} r_i   (3)
In summary, the size of the final sliding window is (S_min × R_target, S_min × R_target). Referring to fig. 3, the blue dots represent aircraft scale information and the red dots represent the calculated sliding-window scale information.
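To make step S1 concrete, the following Python sketch computes the sliding-window size from the labeled aircraft scales. The ratio r_i = max(h_i, w_i)/min(h_i, w_i) is an assumption, since formulas (2) and (3) survive in the source only as images.

```python
import numpy as np

def sliding_window_size(boxes):
    """Compute the square sliding-window size from aircraft scale statistics.

    boxes: array of shape (N, 2) with rows (h_i, w_i), the pixel length and
    width of each labeled aircraft in the training data set B.
    """
    h, w = boxes[:, 0], boxes[:, 1]
    s_min = min(h.min(), w.min())              # Eq. (1): minimum side length S_min
    r = np.maximum(h, w) / np.minimum(h, w)    # assumed form of the ratio r_i, Eq. (2)
    r_target = r.mean()                        # Eq. (3): average length-width ratio
    side = int(round(s_min * r_target))
    return (side, side)                        # square window (S_min*R_target, S_min*R_target)

# Example: three labeled aircraft of 40x30, 50x48 and 36x36 pixels.
print(sliding_window_size(np.array([[40, 30], [50, 48], [36, 36]])))
```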
S2: fast candidate box generation based on aggregated channel features
Most candidate-box generation algorithms are based on segmentation or clustering, and such algorithms suffer from low running speed, low precision, and high miss rates. In traditional target detection of the VJ era, because target sizes differ, sliding-window judgments must be made over a pyramid of many scales to guarantee a high detection rate. The features must be recomputed at every scale, so the whole detection process is inefficient. The fast feature pyramid method instead computes the feature maps at intermediate scales by interpolating from feature maps at a few computed scales. Combined with integral images, this operation allows the corresponding targets to be detected rapidly. The fast feature pyramid, combined with aggregated channel features and the Adaboost algorithm, completes target detection in real time, and this strategy is adopted here to realize a candidate-box generation algorithm with high recall and high reliability. The specific steps are as follows:
Aggregated channel features: a feature channel is a mapping of the input picture, which may be point-to-point or region-to-region, and the transformed picture constitutes a feature. For an input picture I, its channel features are C = Ω(I), and its aggregated channel features are obtained by concatenating and smoothing the feature channels in C. The channel features used in the algorithm comprise three classes: the standard gradient magnitude channel feature, the gradient direction channel feature (6 directions), and the LUV color channel features;
Fast feature pyramid: for an input picture I, the standard feature pyramid can be denoted C_s = Ω(R(I, s)), where s denotes the scale and the function R(I, s) denotes resampling picture I to scale s. The fast feature pyramid differs from the standard one in that only part of the scales are sampled (s' ∈ {1, 1/2, 1/4, …}), and the features at the other, intermediate scales are computed by linear interpolation:
C_s ≈ R(C_{s'}, s/s') · (s/s')^(−λ_Ω)   (4)
For a given channel feature type Ω, the corresponding λ_Ω can be estimated by least squares as follows:
λ_Ω = argmin_λ Σ_{k=1}^{N} ( log f_Ω(R(I_k, s)) − log f_Ω(I_k) + λ · log(s) )²   (5)
where N denotes the number of pictures and f_Ω is defined as:
f_Ω(I_s) = (1/(h_s · w_s)) · Σ_{i=1}^{h_s} Σ_{j=1}^{w_s} C_s(i, j)   (6)
where h_s × w_s denotes the picture dimensions at the current scale s and i and j index the pixel position, so the function f_Ω(I_s) is the mean value of C_s; combining the information-loss coefficient λ_Ω with formula (4) makes the feature maps constructed by linear interpolation more accurate.
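As an illustration of the least-squares fit in formulas (5) and (6), the sketch below estimates λ_Ω from the mean channel value at a few sampled scales. The `channel_fn` callable stands in for any of the three channel types, and the closed-form fit through the origin is one reasonable reading of formula (5), which is only partly legible in the source.

```python
import numpy as np
import cv2

def estimate_lambda(images, channel_fn, scales=(0.5, 0.25)):
    """Least-squares estimate of the information-loss coefficient lambda_Omega.

    channel_fn(img) -> 2-D channel map C for one channel type Omega.
    Uses f_Omega(I_s), the mean of C_s over all pixels (Eq. 6), and fits
    log(f_Omega(I_s) / f_Omega(I)) ~= -lambda * log(s) over all images.
    """
    xs, ys = [], []
    for img in images:
        mu_1 = channel_fn(img).mean()                  # f_Omega(I) at scale s = 1
        for s in scales:
            small = cv2.resize(img, None, fx=s, fy=s)  # R(I, s): resample to scale s
            mu_s = channel_fn(small).mean()            # f_Omega(I_s)
            xs.append(np.log(s))
            ys.append(np.log(mu_s / mu_1))
    xs, ys = np.asarray(xs), np.asarray(ys)
    return -float(xs @ ys) / float(xs @ xs)            # minimizes sum (y + lambda*x)^2
```

With λ_Ω estimated, a feature map at an intermediate scale s is approximated from the nearest computed scale s' via formula (4) instead of being recomputed.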
Candidate box generation: a soft-cascade Adaboost algorithm is used to complete the selection of candidate boxes, defined as follows:
label(x) = 1 if SCLF_x > thr, 0 otherwise   (7)
where x denotes the aggregated channel features of the current window and the threshold thr, usually set to 0, is used to judge whether the current region contains a target. SCLF_x denotes the output value of the strong classifier SCLF on x, which indicates the likelihood that a target exists in the current window. SCLF_x is composed of a series of weak classifiers and is expressed as
SCLF_x = Σ_{m=1}^{M} (weight_m + δ) · clf(x; θ_m),
where clf denotes a weak classifier built as a tree of depth 2, weight_m is the weight of each weak classifier, θ_m is the parameter of each weak classifier, and δ denotes the weight-correction coefficient of the weak classifiers, which is 0 in the training phase.
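The following sketch shows how the soft cascade of formula (7) might be evaluated on a window. The additive form (weight_m + δ) of the correction, the `predict` interface of the depth-2 trees, and the early-rejection threshold are all assumptions, since the strong-classifier formula appears in the source only as an image.

```python
def sclf(x, weak_clfs, weights, delta=0.0, thr=0.0, reject_at=-1.0):
    """Soft-cascade evaluation of the strong classifier SCLF on window features x.

    weak_clfs: depth-2 decision trees, each exposing predict(x) in {-1, +1};
    weights:   Adaboost weight of each weak classifier;
    delta:     weight-correction coefficient (0 while training), later tuned
               by the Search-delta algorithm;
    reject_at: running-score threshold below which the window is rejected early.
    """
    score = 0.0
    for clf, weight in zip(weak_clfs, weights):
        score += (weight + delta) * clf.predict(x)  # assumed form of the delta correction
        if score < reject_at:                       # soft cascade: drop clear background early
            return 0, score
    return (1 if score > thr else 0), score         # Eq. (7): 1 means "target present"
```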
S3: fine tuning of fast candidate box generation algorithms
The goal of the fast candidate-box generation algorithm is to generate candidate target regions with high recall and high accuracy, so the Adaboost parameters must be adjusted after training. Correcting the parameter δ on the one hand preserves the accuracy of the classification results and on the other hand raises the recall as far as possible. Usually, if the IoU value between a candidate box and any one of the manually labeled boxes is greater than 0.5, the candidate box is considered correct; the IoU value is defined as follows:
IoU=(GB∩DB)/(GB∪DB) (8)
wherein GB represents a target frame of a real label, and DB represents a target frame generated by a quick candidate frame. Thus, the Precision (Precision) and Recall (Recall) can be determined:
Precision = TP / (TP + FP),  Recall = TP / (TP + FN)   (9)
TP (true positive) means the candidate box judges that a target is present and the current region indeed contains a target. FP (false positive) means the candidate box judges that a target is present but the current region actually contains none. FN (false negative) means the candidate box judges that no target is present but the current region actually does contain one. By running the fast candidate-box generation algorithm on the training data set under different parameters δ, the corresponding detection precision and recall can be calculated, and a high-precision, high-recall fast candidate-box generation model is obtained by adjusting the parameter δ.
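The computations in formulas (8) and (9) are straightforward; a minimal Python sketch, assuming boxes are given as (x1, y1, x2, y2) tuples and ignoring one-to-one matching for brevity:

```python
def iou(gb, db):
    """Eq. (8): intersection over union of two (x1, y1, x2, y2) boxes."""
    ix = max(0, min(gb[2], db[2]) - max(gb[0], db[0]))
    iy = max(0, min(gb[3], db[3]) - max(gb[1], db[1]))
    inter = ix * iy
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(gb) + area(db) - inter
    return inter / union if union > 0 else 0.0

def precision_recall(gt_boxes, det_boxes, thr=0.5):
    """Eq. (9): a detection is a TP if it overlaps some ground-truth box with IoU > thr."""
    tp = sum(1 for d in det_boxes if any(iou(g, d) > thr for g in gt_boxes))
    fp = len(det_boxes) - tp
    fn = sum(1 for g in gt_boxes if not any(iou(g, d) > thr for d in det_boxes))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```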
Since the purpose of the fast candidate-box generation algorithm is to obtain high-precision, high-recall candidate boxes, and the recall directly affects the classification performance of the micro convolutional neural network in the later stage, the index used to fine-tune the fast candidate-box generation algorithm is defined as:
Fidx_i = γ × Recall_i + Precision_i   (11)
where γ is a hyperparameter that emphasizes the importance of the recall value; in addition, larger values of γ make the curve formed by the Fidx values smoother. The parameter δ is selected with the Search-δ algorithm, sketched below.
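The Search-δ listing is reproduced in the original only as an image, so the following Python sketch reconstructs the linear search from the surrounding description; `set_delta`, the `evaluate` callback, and the search range are assumed names, not the patent's actual interface.

```python
import numpy as np

def search_delta(detector, train_set, evaluate, gamma=2.0,
                 deltas=np.arange(-2.0, 2.0001, 0.05)):
    """Linear search over the weight-correction coefficient delta.

    For each candidate delta, re-run the fast candidate-box generator on the
    training set, score it with Fidx = gamma * recall + precision (Eq. 11),
    and keep the delta with the highest Fidx.

    evaluate(detector, train_set) -> (precision, recall), e.g. built on
    precision_recall() above.
    """
    best_delta, best_fidx = None, float("-inf")
    for delta in deltas:
        detector.set_delta(delta)                       # hypothetical detector API
        precision, recall = evaluate(detector, train_set)
        fidx = gamma * recall + precision
        if fidx > best_fidx:
            best_delta, best_fidx = float(delta), fidx
    return best_delta
```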
s4: design of miniature convolutional neural network
The micro convolutional neural network judges whether a target exists in a candidate region and comprises 4 convolutional layers, 2 average-pooling layers and 2 fully connected layers. The specific structure of the network model is shown in fig. 2: the model is divided into a feature extraction part and a classification part; the first row gives the coarse structure, and each block in the second row details the corresponding component of the first row. Each candidate region is resampled to a size of 32 × 32 × 3 and then input to the network. The network adopts the ReLU function as activation, with an activation following each convolution operation. The feature extraction part first applies 18 convolution kernels of 5 × 5 to the input picture, then 24 convolution kernels of 3 × 3 followed by 2 × 2 mean pooling, then 32 convolution kernels of 3 × 3 followed by 2 × 2 mean pooling, and finally 32 convolution kernels of 3 × 3 again, completing the feature extraction of the candidate region. The fully connected part is mainly used for classification judgment and has two layers in total. The first layer receives the 3 × 3 × 32 feature map obtained by the feature extraction part, outputs a vector of size 128, and is followed by a Dropout layer with probability P = 0.1 and a ReLU activation layer. The second layer takes the 128-dimensional vector as input and outputs a 2-dimensional vector; since the micro convolutional neural network only has to recognize target regions, a two-way judgment suffices.
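Interpreting the description with unpadded (valid) convolutions reproduces the stated sizes: 32 → 28 → 26 → 13 → 11 → 5 → 3, so the first fully connected layer sees a 3 × 3 × 32 map (288 values). A PyTorch sketch under that assumption:

```python
import torch.nn as nn

class MicroCNN(nn.Module):
    """Sketch of the micro CNN: 4 conv layers, 2 average-pooling layers,
    2 fully connected layers; input 32x32x3, output 2 classes.
    Unpadded convolutions are assumed, making the last feature map 3x3x32."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 18, kernel_size=5), nn.ReLU(),   # 32 -> 28
            nn.Conv2d(18, 24, kernel_size=3), nn.ReLU(),  # 28 -> 26
            nn.AvgPool2d(2),                              # 26 -> 13
            nn.Conv2d(24, 32, kernel_size=3), nn.ReLU(),  # 13 -> 11
            nn.AvgPool2d(2),                              # 11 -> 5
            nn.Conv2d(32, 32, kernel_size=3), nn.ReLU(),  # 5 -> 3
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                  # 3 * 3 * 32 = 288 features
            nn.Linear(288, 128),           # first fully connected layer
            nn.Dropout(p=0.1), nn.ReLU(),  # Dropout P = 0.1, then activation
            nn.Linear(128, 2),             # second layer: target vs. background
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```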
The miniature convolutional neural network adopts a cross entropy function as a loss function of the network, which is defined as follows:
Loss = −(1/BZ) · Σ_{i=1}^{BZ} [ y_i · log(ŷ_i) + (1 − y_i) · log(1 − ŷ_i) ]
where BZ denotes the batch size, set to 64 in network training, y_i denotes the true label of the current candidate box, equal to 1 if it contains an airplane and 0 otherwise, and ŷ_i is the prediction of the network. The loss function is optimized with an Adam optimizer, iterating 50 times over the training data set with learning rate lr = 0.001 for the first 40 iterations and lr = 0.0003 for the last 10. When the miniature convolutional neural network is trained, part of the training data set comes from the ground-truth labels in the original training data set and part from the candidate boxes generated by running the fast candidate-box generation algorithm on the training data set. All pictures input to the network are normalized, with mean = [0.485, 0.456, 0.406] and standard deviation std = [0.229, 0.224, 0.225]; if the current input pixel is X, the normalized pixel is (X − mean)/std. In addition, because the aircraft targets are multi-directional and the amount of training data is small, the current data are flipped vertically and horizontally to enlarge the data volume and thereby improve the classification performance of the network.
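A sketch of the training recipe just described (Adam, 50 passes with the learning-rate drop at pass 40, batch size 64, normalization and flips); the `train_loader` and the use of `torchvision.transforms` are assumptions. `nn.CrossEntropyLoss` over the 2-way output is PyTorch's standard equivalent of the two-class cross entropy above.

```python
import torch
import torch.nn as nn
import torchvision.transforms as T

transform = T.Compose([
    T.ToTensor(),
    T.RandomHorizontalFlip(),                 # left-right flip augmentation
    T.RandomVerticalFlip(),                   # up-down flip augmentation
    T.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]),   # (X - mean) / std, per channel
])

model = MicroCNN()                            # the network sketched above
criterion = nn.CrossEntropyLoss()             # cross-entropy over the 2 classes
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

for epoch in range(50):                       # 50 passes over the training data set
    if epoch == 40:                           # lr = 0.0003 for the last 10 passes
        for group in optimizer.param_groups:
            group["lr"] = 0.0003
    for images, labels in train_loader:       # assumed DataLoader, batch_size=64
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```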
S5: detection of aircraft in visible light airport based on fast candidate frame and miniature convolutional neural network
Firstly, count the target scales in the training data set and calculate the sliding-window size with the scheme of the first step; then train the fast candidate-box generation model with this sliding window and the scheme of the second step, during which the λ_Ω corresponding to the different channel feature types Ω are estimated, and record the trained fast candidate-box generation model. Then correct the parameter δ on the training data set so that the current detection model generates candidate boxes of high reliability and high recall, and replace the stored candidate-box generation model with the corrected one.
Next, train the miniature convolutional neural network, where part of the training data set comes from the ground-truth labels in the original training data set and part from the candidate boxes generated by running the fast candidate-box generation algorithm on the training data set; all pictures input to the network are normalized. In addition, because the targets are multi-directional and the training data volume is small, the current data are flipped vertically and horizontally to enlarge the data volume and improve the classification performance of the network.
Finally, for an input remote sensing picture, generate candidate regions with the fast candidate-box generation algorithm and re-judge these regions with the miniature convolutional neural network: when the network classifies a region as true, a target exists in the current region; otherwise the current candidate box is discarded as background.
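Putting the pieces together, a sketch of this detection pass; `generate_candidates` stands for the trained fast candidate-box generator and is a hypothetical name, as is the convention that class 1 means "aircraft".

```python
import cv2
import torch

def detect_aircraft(image, generate_candidates, model, transform):
    """Run the two-stage detector on one remote sensing picture.

    generate_candidates(image) -> list of (x1, y1, x2, y2) candidate boxes;
    model is the trained MicroCNN; transform is the normalization pipeline above.
    """
    model.eval()
    detections = []
    with torch.no_grad():
        for (x1, y1, x2, y2) in generate_candidates(image):
            patch = cv2.resize(image[y1:y2, x1:x2], (32, 32))  # resample region to 32x32x3
            logits = model(transform(patch).unsqueeze(0))
            if logits.argmax(dim=1).item() == 1:               # network judged region true
                detections.append((x1, y1, x2, y2))            # keep: aircraft present
    return detections                                          # everything else is background
```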
The method is mainly aimed at detecting airport aircraft in visible light imagery at a resolution of 2 to 5 meters, but it is also suitable for target detection in visible light remote sensing images at other resolutions; for airports at other resolutions, one only needs to recalculate the sliding-window size, retrain the fast candidate-box generation model and the micro convolutional neural network model, and fine-tune the candidate-box generation model.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (9)

1. A visible light image airplane rapid detection method based on a miniature convolutional neural network is characterized by comprising the following steps:
(1) count the aircraft scale information in the training data set to obtain a scale data set; next, calculate the minimum side length S_min of the aircraft targets; finally, calculate the average length-width ratio R_target of the aircraft to obtain the final sliding-window size (S_min × R_target, S_min × R_target);
(2) calculate three types of channel features for each visible light airport picture: the standard gradient magnitude channel feature, the gradient direction channel feature, and the LUV color channel feature; from the channel features of the training data set, estimate by least squares the λ_Ω corresponding to each given channel feature type Ω, completing the construction of the fast feature pyramid, and train the fast candidate-box generation algorithm with the Adaboost algorithm, where λ_Ω denotes the information-loss coefficient;
(3) re-detect the training data set with the fast candidate-box generation algorithm under different parameters δ via the linear search algorithm Search-δ, and compute the corresponding detection precision and recall; obtain a high-precision, high-recall fast candidate-box generation model by adjusting the parameter δ;
(4) use the miniature convolutional neural network to re-judge each candidate region: if the network classifies the current region as true, the candidate box is considered to contain an airplane; otherwise the current candidate box is treated as background and discarded.
2. The method for rapidly detecting the visible light image airplane based on the miniature convolutional neural network as claimed in claim 1, wherein the data set is B = {(h_i, w_i) | i = 1, …, N}, where N denotes the number of aircraft targets, h_i denotes the number of pixels occupied by the length of the i-th aircraft, and w_i denotes the number of pixels occupied by the width of the i-th aircraft.
3. The method for rapidly detecting the visible light image airplane based on the miniature convolutional neural network as claimed in claim 2, wherein the minimum side length of the aircraft targets is calculated as S_min = min{min(h), min(w)}, and the average length-width ratio of the aircraft as
R_target = (1/N) · Σ_{i=1}^{N} r_i.
4. The method for rapidly detecting the visible light image airplane based on the miniature convolutional neural network as claimed in claim 1, wherein for the Adaboost algorithm the following decision rule is defined:
label(x) = 1 if SCLF_x > thr, 0 otherwise,
wherein x denotes the aggregated channel features of the current window, the threshold thr is used to judge whether the current region contains a target, and SCLF_x denotes the output value of the strong classifier SCLF on x, representing the probability that x is a target;
SCLF_x is composed of a series of weak classifiers and is expressed as
SCLF_x = Σ_{m=1}^{M} (weight_m + δ) · clf(x; θ_m),
wherein clf denotes a weak classifier built as a tree of depth 2; weight_m is the weight of each weak classifier; θ_m is the parameter of each weak classifier; and δ denotes the weight-correction coefficient of the weak classifiers, which is set to 0 during the training phase.
5. The method for rapidly detecting the visible light image airplane based on the miniature convolutional neural network as claimed in claim 1, wherein the fast candidate-box generation algorithm in step (3) is fine-tuned as follows: the IoU value of the current candidate box is defined as IoU = (GB ∩ DB)/(GB ∪ DB), wherein GB denotes the ground-truth labeled target box and DB denotes a target box generated by the fast candidate-box algorithm; the precision and the recall are obtained indirectly through the IoU values; the index used to fine-tune the fast candidate-box generation algorithm is defined as
Fidx_i = γ × Recall_i + Precision_i,
wherein γ is a hyperparameter; Recall_i denotes the recall after i adjustments of the parameter δ, and Precision_i denotes the detection precision after i adjustments of the parameter δ.
6. The method for rapidly detecting the visible light image airplane based on the miniature convolutional neural network as claimed in claim 1, wherein the miniature convolutional neural network has 4 convolutional layers, 2 mean-pooling layers and 2 fully connected layers; the input of the network is a picture of size 32 × 32 × 3; the network adopts the ReLU function as the activation function, with an activation following each convolution operation; the network is divided into a feature extraction part and a fully connected part, and the whole convolutional neural network is a binary classification model.
7. The method as claimed in claim 6, wherein the feature extraction part performs a convolution operation on the input image by using 18 convolution kernels with 5 × 5, performs a convolution operation by using 24 convolution kernels with 3 × 3 and performs a mean pooling operation by using kernels with 2 × 2, performs a convolution operation by using 32 convolution kernels with 3 × 3 and performs a mean pooling operation by using kernels with 2 × 2, and performs a convolution operation by using 32 convolution kernels with 3 × 3 again, thereby completing the feature extraction of the candidate region.
8. The method for rapidly detecting the visible light image airplane based on the miniature convolutional neural network as claimed in claim 7, wherein the fully connected part is mainly used for classification judgment and has two layers; the first layer receives the 3 × 3 × 32 feature map obtained by the feature extraction part, outputs a vector of size 128, and is followed by a Dropout layer with probability P = 0.1 and a ReLU activation layer; the second layer takes the 128-dimensional vector as input and outputs a 2-dimensional vector.
9. The method for rapidly detecting the visible light image airplane based on the miniature convolutional neural network as claimed in claim 8, wherein the miniature convolutional neural network adopts the cross-entropy function as the loss function of the network, defined as follows:
Loss = −(1/BZ) · Σ_{i=1}^{BZ} [ y_i · log(ŷ_i) + (1 − y_i) · log(1 − ŷ_i) ],
wherein BZ denotes the batch size, set to 64 in network training, y_i denotes the true label of the current candidate box, equal to 1 if it contains a target and 0 otherwise, and ŷ_i is the prediction of the network; the loss function is optimized with an Adam optimizer, iterating 50 times over the training data set with learning rate lr = 0.001 for the first 40 iterations and lr = 0.0003 for the last 10; when the miniature convolutional neural network is trained, part of the training data set comes from the ground-truth labels in the original training data set and part from the candidate boxes generated by running the fast candidate-box generation algorithm on the training data set; all pictures input to the network are normalized, with mean = [0.485, 0.456, 0.406] and standard deviation std = [0.229, 0.224, 0.225], and if the current input pixel is X, the normalized pixel is (X − mean)/std.
CN202010646717.9A 2020-07-07 2020-07-07 Visible light image airplane rapid detection method based on miniature convolutional neural network Active CN111814662B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010646717.9A CN111814662B (en) 2020-07-07 2020-07-07 Visible light image airplane rapid detection method based on miniature convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010646717.9A CN111814662B (en) 2020-07-07 2020-07-07 Visible light image airplane rapid detection method based on miniature convolutional neural network

Publications (2)

Publication Number Publication Date
CN111814662A 2020-10-23
CN111814662B 2022-06-24

Family

ID=72842813

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010646717.9A Active CN111814662B (en) 2020-07-07 2020-07-07 Visible light image airplane rapid detection method based on miniature convolutional neural network

Country Status (1)

Country Link
CN (1) CN111814662B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116882461B (en) * 2023-09-01 2023-11-21 北京航空航天大学 Neural network evaluation optimization method and system based on neuron plasticity

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105975929A (en) * 2016-05-04 2016-09-28 北京大学深圳研究生院 Fast pedestrian detection method based on aggregated channel features
CN108460341A (en) * 2018-02-05 2018-08-28 西安电子科技大学 Remote sensing image object detection method based on integrated depth convolutional network
CN109919108A (en) * 2019-03-11 2019-06-21 西安电子科技大学 Remote sensing images fast target detection method based on depth Hash auxiliary network
CN110543837A (en) * 2019-08-16 2019-12-06 北京航空航天大学 visible light airport airplane detection method based on potential target point

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3029606A3 (en) * 2014-11-14 2016-09-14 Thomson Licensing Method and apparatus for image classification with joint feature adaptation and classifier learning

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105975929A (en) * 2016-05-04 2016-09-28 北京大学深圳研究生院 Fast pedestrian detection method based on aggregated channel features
CN108460341A (en) * 2018-02-05 2018-08-28 西安电子科技大学 Remote sensing image object detection method based on integrated depth convolutional network
CN109919108A (en) * 2019-03-11 2019-06-21 西安电子科技大学 Remote sensing images fast target detection method based on depth Hash auxiliary network
CN110543837A (en) * 2019-08-16 2019-12-06 北京航空航天大学 visible light airport airplane detection method based on potential target point

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Shaoqing Ren et al.; "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks"; Computer Vision and Pattern Recognition; 2016-01-06; full text *

Also Published As

Publication number Publication date
CN111814662A (en) 2020-10-23

Similar Documents

Publication Publication Date Title
CN111126472B (en) SSD (solid State disk) -based improved target detection method
CN109241913B (en) Ship detection method and system combining significance detection and deep learning
US9824294B2 (en) Saliency information acquisition device and saliency information acquisition method
CN108596053B (en) Vehicle detection method and system based on SSD and vehicle posture classification
CN109800692B (en) Visual SLAM loop detection method based on pre-training convolutional neural network
CN106023257A (en) Target tracking method based on rotor UAV platform
CN106960195A (en) A kind of people counting method and device based on deep learning
JP7263216B2 (en) Object Shape Regression Using Wasserstein Distance
CN111160407B (en) Deep learning target detection method and system
CN110443279B (en) Unmanned aerial vehicle image vehicle detection method based on lightweight neural network
CN111539422B (en) Flight target cooperative identification method based on fast RCNN
JP2019016114A (en) Image processing device, learning device, focus controlling device, exposure controlling device, image processing method, learning method and program
CN107944437B (en) A kind of Face detection method based on neural network and integral image
CN108734200B (en) Human target visual detection method and device based on BING (building information network) features
CN113221956B (en) Target identification method and device based on improved multi-scale depth model
CN109919246A (en) Pedestrian's recognition methods again based on self-adaptive features cluster and multiple risks fusion
CN111199245A (en) Rape pest identification method
CN111814662B (en) Visible light image airplane rapid detection method based on miniature convolutional neural network
CN114973014A (en) Airplane target fine-grained detection method and system based on multi-network cascade
CN109919215B (en) Target detection method for improving characteristic pyramid network based on clustering algorithm
CN114140485A (en) Method and system for generating cutting track of main root of panax notoginseng
CN109344758B (en) Face recognition method based on improved local binary pattern
Lou et al. Research on edge detection method based on improved HED network
CN117689995A (en) Unknown spacecraft level detection method based on monocular image
CN110348311B (en) Deep learning-based road intersection identification system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant